Using regular expressions to extract specific data from many files

Problem description

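I'm having trouble extracting specific data from raw data (some odd-looking log files). Basically, I need to get the SerialNumber plus the Measurement result for each file, like this: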

Output file: results.txt. File contents: 123456789124,1.521978E-04 (and the same kind of line for each of the 100+ files).

{@BTEST|123456789124|06|191016115628|000045|0|all||n|n|191016115713||002|32145698712
{@BLOCK|2%c407|01
{@A-CAP|1|+1.521978E-04{@LIM3|+1.267300E-04|+1.520760E-04|+1.013840E-04}}
{@RPT|DEVICES IN PARALLEL}
{@RPT|c436 47.0u}
{@RPT|c408 10.0u}
{@RPT|c409 10.0u}
{@RPT|c412 10.0u}
{@RPT|c420 10.0u}
{@RPT|c500 10.0u}
{@RPT|c502 10.0u}
{@RPT|c503 10.0u}
{@RPT|c580 10.0u}
{@RPT|c427 4.70u}
{@RPT|c702d 4.70u}
{@RPT|c712d 4.70u}
{@RPT|c750 4.70u}
{@RPT|c515 1.00u}
{@RPT|c516 1.00u}
{@RPT|c560 1.00u}
{@RPT|c1091 1.00u}
{@RPT|c701b 330n}
{@RPT|c114 100n}
{@RPT|c405 100n}
{@RPT|c501 100n}
{@RPT|c504 100n}
{@RPT|c581 100n}
{@RPT|c700a 100n}
{@RPT|c700c 100n}
{@RPT|c701d 100n}
{@RPT|c703d 100n}
{@RPT|c704d 100n}
{@RPT|c711d 100n}
{@RPT|c751 100n}
{@RPT|c752 100n}
{@RPT|c1090 100n}
{@RPT|c1502 100n}
{@RPT|c1503 100n}
{@RPT|q380 q380%r1 57.0k}
{@RPT|q400 q400%q1 500, 100}

At this point I can get the serial number, but I'm having trouble getting the measurement result. Here is my code:

static void Main(string[] args)
    {
        Regex regex = new Regex(@"(?<={@BTEST\|)(.*?)(?=\|)");

        string LogPath = @"C:\Data\FailedLogs";

        var LogDBSNResult = @"C:\results1.txt";


        List<string> fileNames = new List<string>(Directory.GetFiles(LogPath));
        List<string> dbSerialNumbers = new List<string>();

        using (System.IO.StreamWriter file = new System.IO.StreamWriter(LogDBSNResult))
        {
            foreach (var filename in fileNames)
            {
                string fileContent = File.ReadAllText(filename);

                if (regex.IsMatch(fileContent))
                {
                    var result = regex.Match(fileContent).ToString();

                    // Adding the results in the list
                    dbSerialNumbers.Add(result);
                    Console.WriteLine("Found: " + result);
                    file.WriteLine(result);
                }
            }

            //List<string> removeDuplicated = dbSerialNumbers.Distinct().ToList();
            //foreach (var value in removeDuplicated)
            //{
            //    file.WriteLine(value);
            //}

        }     
    }

Tags: c#, regex

Solution


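I suggest splitting the search into two patterns: one to extract the serial number and another to extract the measurement result. Try this: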

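// Requires: using System.IO; using System.Text.RegularExpressions;

// Serial number: the first field after "{@BTEST|", up to the next "|".
string patSerial = @"(?<={@BTEST\|)(.*?)(?=\|)";
// Measurement: the text after "{@A-CAP|<digit>|" and one sign character, up to the next "{".
string patMeasur = @"(?<={@A-CAP\|\d\|.)(.*?)(?=\{)";
string LogPath = @"C:\Data\FailedLogs";
string LogDBSNResult = @"C:\results1.txt";

using (StreamWriter log = new StreamWriter(LogDBSNResult))
{
    foreach (string fileName in Directory.GetFiles(LogPath))
    {
        string fileContent = File.ReadAllText(fileName);

        // Run each pattern once and reuse the match.
        Match serial = Regex.Match(fileContent, patSerial);
        Match measurement = Regex.Match(fileContent, patMeasur);

        // Write a line only when both values were found.
        if (serial.Success && measurement.Success)
        {
            // No space after the comma, to match the requested "serial,measurement" format.
            log.WriteLine($"{serial.Value},{measurement.Value}");
        }
    }
}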
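
As a variation on the two-pattern idea, the sketch below uses named capture groups instead of lookarounds and keeps a leading minus sign if one is present. The names LogExtractor, TryExtractRecord and WriteResults are hypothetical, and the measurement pattern assumes the value is a plain decimal or scientific-notation number, as in the sample log; treat it as a starting point rather than a definitive implementation.

```csharp
using System.IO;
using System.Text.RegularExpressions;

// Hypothetical helper class; the names are illustrative only.
public static class LogExtractor
{
    // Serial number: the first field after "{@BTEST|".
    private static readonly Regex SerialPattern =
        new Regex(@"\{@BTEST\|(?<serial>[^|]*)\|");

    // Measurement: the number after "{@A-CAP|<digits>|".
    // A leading "+" is skipped; a leading "-" is kept as part of the value.
    private static readonly Regex MeasurementPattern =
        new Regex(@"\{@A-CAP\|\d+\|\+?(?<value>-?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)");

    // Builds "serial,measurement" from the first match of each pattern.
    // Returns false (and an empty record) when either value is missing.
    public static bool TryExtractRecord(string content, out string record)
    {
        Match serial = SerialPattern.Match(content);
        Match value = MeasurementPattern.Match(content);

        if (!serial.Success || !value.Success)
        {
            record = string.Empty;
            return false;
        }

        record = serial.Groups["serial"].Value + "," + value.Groups["value"].Value;
        return true;
    }

    // Writes one record per log file in logDirectory to outputPath.
    public static void WriteResults(string logDirectory, string outputPath)
    {
        using (var writer = new StreamWriter(outputPath))
        {
            foreach (string fileName in Directory.GetFiles(logDirectory))
            {
                string record;
                if (TryExtractRecord(File.ReadAllText(fileName), out record))
                {
                    writer.WriteLine(record);
                }
            }
        }
    }
}
```

With the paths from the question, this could be called as LogExtractor.WriteResults(@"C:\Data\FailedLogs", @"C:\results1.txt"). Like the code above, it only takes the first {@A-CAP record in each file.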

Good luck.

