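c# - Parsing a log file with an ambiguous delimiter
Problem description
I have to parse a log file, but I'm not sure how best to extract the different parts of each line. The problem is that the original developer used ':' to separate the tokens, which is a bit silly because each line contains timestamps that themselves contain ':'!
A header line and a sample line look like this: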
transaction_date_time:[systemid]:sending_system:receiving_system:data_length:data:[ws_name]
2019-05-08 15:03:13:494|2019-05-08 15:03:13:398:[192.168.1.2]:ABC:DEF:67:cd71f7d9a546ec2b32b,AACN90012001000012,OPNG:[WebService.SomeName.WebServiceModule::WebServiceName]
I have no problem reading the log file and accessing each line, but I don't know how to split a line into its fields.
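To illustrate the problem, here is a minimal sketch (not part of the original post; it assumes C# 9 or later for top-level statements) of what a plain Split(':') does with the sample line:

```csharp
using System;

// The sample line from the question, copied verbatim.
string line = "2019-05-08 15:03:13:494|2019-05-08 15:03:13:398:[192.168.1.2]:ABC:DEF:67:"
            + "cd71f7d9a546ec2b32b,AACN90012001000012,OPNG:"
            + "[WebService.SomeName.WebServiceModule::WebServiceName]";

// Splitting on every colon also cuts through the timestamps and the "::" in ws_name.
string[] pieces = line.Split(':');

// The header names 7 fields, but the naive split yields 15 pieces.
Console.WriteLine(pieces.Length);
```

The extra pieces come from the six colons inside the two timestamps and the "::" inside the web service name, so a fixed index into the split result cannot be trusted.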
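Solution
I was able to parse everything with a regular expression. It looks like the data came from Excel, because the fractional seconds are separated by a colon instead of a period. C#'s DateTime.Parse does not like that colon, so I had to replace it with a period. I also match from right to left, so that the colons inside the timestamps do not interfere with the other fields.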
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.RegularExpressions;

namespace ConsoleApplication3
{
    class Program1
    {
        const string FILENAME = @"c:\temp\test.txt";

        static void Main(string[] args)
        {
            // Fields are matched right to left, so whatever remains on the left,
            // colons included, lands in the 'time' group.
            string pattern = @"^(?'time'.*):\[(?'systemid'[^\]]+)\]:(?'sending'[^:]+):(?'receiving'[^:]+):(?'length'[^:]+):(?'data'[^:]+):\[(?'ws_name'[^\]]+)\]";
            // Matches the last colon of a timestamp (the one before the milliseconds).
            string replacePattern = @"(?'leader'.+):(?'trailer'\d+)";
            string line;
            int rowCount = 0;

            using (StreamReader reader = new StreamReader(FILENAME))
            {
                while ((line = reader.ReadLine()) != null)
                {
                    line = line.Trim();
                    if (line.Length == 0) continue;
                    if (++rowCount == 1) continue; // skip header row

                    Match match = Regex.Match(line, pattern, RegexOptions.RightToLeft);
                    if (!match.Success) continue; // skip lines that do not fit the layout

                    Log_Data newRow = new Log_Data();
                    newRow.ws_name = match.Groups["ws_name"].Value;
                    newRow.data = match.Groups["data"].Value;
                    newRow.length = int.Parse(match.Groups["length"].Value);
                    newRow.receiving_system = match.Groups["receiving"].Value;
                    newRow.sending_system = match.Groups["sending"].Value;
                    newRow.systemid = match.Groups["systemid"].Value;

                    // The end date comes first, the start date second.
                    string[] date = match.Groups["time"].Value.Split('|');
                    string stringDate = Regex.Replace(date[1], replacePattern, "${leader}.${trailer}", RegexOptions.RightToLeft);
                    newRow.startDate = DateTime.Parse(stringDate);
                    stringDate = Regex.Replace(date[0], replacePattern, "${leader}.${trailer}", RegexOptions.RightToLeft);
                    newRow.endDate = DateTime.Parse(stringDate);

                    Log_Data.logData.Add(newRow);
                }
            }
        }
    }

    // Columns: transaction_date_time:[systemid]:sending_system:receiving_system:data_length:data:[ws_name]
    public class Log_Data
    {
        public static List<Log_Data> logData = new List<Log_Data>();

        public DateTime startDate { get; set; }
        public DateTime endDate { get; set; }
        public string systemid { get; set; }
        public string sending_system { get; set; }
        public string receiving_system { get; set; }
        public int length { get; set; }
        public string data { get; set; }
        public string ws_name { get; set; }
    }
}
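As a possible alternative to replacing the colon with a period, the timestamps could be parsed directly with an explicit format string. This is only a sketch, not part of the original answer: the helper name ParseLogTimestamp and the constant LogTimestampFormat are made up for illustration, it assumes the fraction always has exactly three digits, and it uses top-level statements (C# 9 or later).

```csharp
using System;
using System.Globalization;

// Assumption: every timestamp has the shape "2019-05-08 15:03:13:494",
// with a colon before exactly three millisecond digits.
const string LogTimestampFormat = "yyyy-MM-dd HH:mm:ss:fff";

// Hypothetical helper: parses one timestamp without rewriting it first.
// The invariant culture keeps the result independent of the machine's locale.
DateTime ParseLogTimestamp(string text) =>
    DateTime.ParseExact(text, LogTimestampFormat, CultureInfo.InvariantCulture);

// The 'time' group of the sample line: end date first, start date second.
string times = "2019-05-08 15:03:13:494|2019-05-08 15:03:13:398";
string[] parts = times.Split('|');

DateTime endDate = ParseLogTimestamp(parts[0]);
DateTime startDate = ParseLogTimestamp(parts[1]);

// Elapsed time between the two timestamps, in milliseconds.
Console.WriteLine((endDate - startDate).TotalMilliseconds); // prints 96
```

ParseExact throws a FormatException for any timestamp that does not fit the format, so DateTime.TryParseExact may be the better choice if the log can contain malformed lines.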