Reading words in C#

Problem description

Input

string afterIN = "Text Field= Assignee AND Ticket Status != Deleted";

I tried to process it with the code below:

char[] delimiterChars = { ' ', ',', '.', ':', '\t' };

string text = afterIN;

string[] words = text.Split(delimiterChars);
string str = "";

foreach (var word in words)
{
    if (word != "")
    {
        string strDelimit = "\"";

        str += strDelimit + word + strDelimit + ",";
    }
}

I want this output:

"Text Field",
"=",
"Assignee",
"AND",
"Ticket Status",
"!=",
"Deleted"

Another type of input is an SQL query, for example

SELECT # Tickets WHERE Ticket Status=Open OR Ticket Status=Pending

The desired output is the part after WHERE, split into tokens:

"Ticket Status",
"=",
"Open",
"OR",
"Ticket Status",
"=",
"Pending"

Tags: c#, asp.net, c#-4.0, webforms

Solution


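In the general case you need a parser. However, if you can guarantee that the source string contains no comments, string literals, or other complex syntax constructs such as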

  // only the last AND is an operator; the first two (inside a comment and inside a string) must not be split on
  Text Field = /* And is commented*/ "A \"AND B" /* String */ AND Ticket Status != Deleted

you can try splitting with the help of regular expressions:

  using System.Text.RegularExpressions;

  ...

  string source = "Text Field = Assignee AND Ticket Status != Deleted";

  // Split on =, !=, AND, OR; the capturing group keeps the operators in the result.
  // Trim() each item if you want to get rid of leading / trailing spaces
  string[] items = Regex.Split(
      source,
      @"(!=|\band\b|=|\bor\b)",
      RegexOptions.IgnoreCase);
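
As a minimal sketch of how the split above could produce the exact output the question asks for, the helper below trims every item, drops empty ones, and wraps the rest in double quotes. The names `ConditionSplitter` and `SplitCondition` are hypothetical and not part of the original answer; the same restriction applies (no comments or string literals in the input).

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class ConditionSplitter
{
    // Hypothetical helper: splits on =, !=, AND, OR (case-insensitive).
    // The operators stay in the result because the pattern is a capturing group.
    public static string[] SplitCondition(string source)
    {
        return Regex
            .Split(source, @"(!=|\band\b|=|\bor\b)", RegexOptions.IgnoreCase)
            .Select(item => item.Trim())        // remove surrounding spaces
            .Where(item => item.Length > 0)     // drop empty fragments
            .ToArray();
    }

    public static void Main()
    {
        string source = "Text Field= Assignee AND Ticket Status != Deleted";
        string[] items = SplitCondition(source);

        // Quote each item and print one item per line, separated by commas
        Console.WriteLine(string.Join("," + Environment.NewLine,
            items.Select(item => "\"" + item + "\"")));
    }
}
```

Note that `!=` is listed before `=` in the pattern, so the longer operator is tried first at each position.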

To handle (very) simple SQL (no comments, strings, etc.) we can add some Linq: Skip the initial part of the query and Take only the WHERE part:

using System;
using System.Linq;
using System.Text.RegularExpressions;
...
string source = 
  @"SELECT # Tickets 
     WHERE Ticket Status <> Open OR Ticket Status > Pending 
  GROUP BY x 
  ORDER BY y";

string[] delimiters = new string[] {
  "where",
  "order",
  "group",

  //TODO: put all delimiters here
  ">", "<", "<>", "=", "!=", ">=", "<=",
  "and", "or", "not"
};

string pattern = string.Join("|", delimiters
  .OrderByDescending(item => item.Length)
  .Select(item => item.All(c => char.IsLetter(c)) 
      ? $@"\b{item}\b" 
      : Regex.Escape(item))); 

string[] items = Regex
  .Split(source, $"({pattern})", RegexOptions.IgnoreCase)
  .Select(item => item.Trim())
  .SkipWhile(item => !"where".Equals(item, StringComparison.OrdinalIgnoreCase))
  .Skip(1) 
  .TakeWhile(item => !"order".Equals(item, StringComparison.OrdinalIgnoreCase) &&
                     !"group".Equals(item, StringComparison.OrdinalIgnoreCase))
  .ToArray();
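
The `OrderByDescending(item => item.Length)` step matters because regex alternation tries its options from left to right and takes the first one that matches. The sketch below illustrates this under the same simplifying assumptions; `AlternationOrderDemo` and `SplitOn` are hypothetical names used only for the demonstration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class AlternationOrderDemo
{
    // Hypothetical helper: builds "(op1|op2|...)" from the operators
    // in exactly the order given, then splits and trims.
    public static string[] SplitOn(string source, params string[] operators)
    {
        string pattern =
            "(" + string.Join("|", operators.Select(op => Regex.Escape(op))) + ")";

        return Regex
            .Split(source, pattern)
            .Select(item => item.Trim())
            .ToArray();
    }

    public static void Main()
    {
        // Shorter operators first: "<>" is broken into "<" and ">",
        // with an empty item between the two adjacent matches
        Console.WriteLine(string.Join("|", SplitOn("a <> b", "<", ">", "<>")));

        // Longer operator first: "<>" is kept as a single token
        Console.WriteLine(string.Join("|", SplitOn("a <> b", "<>", "<", ">")));
    }
}
```

Sorting the delimiters by descending length before joining them is a simple way to get the second behaviour without ordering the list by hand.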
