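elasticsearch - How to get field values from a query using NEST and Elasticsearch
Problem description
I am new to Elasticsearch and to using (or trying to use) the NEST library. I am using the Serilog Elasticsearch sink to write logs to an index. So the first thing to note is that I have no control over the document structure the sink uses, only over the structured logging properties I choose to log.
Anyway, I am just trying to run a basic search that returns the first X documents from the index. I can get some property values back from the query, but nothing for any of the fields.
The query is as follows: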
var searchResponse = await _elasticClient.SearchAsync<LogsViewModel>(s => s
.Index("webapp-razor-*")
.From(0)
.Size(5)
.Query(q => q.MatchAll()));
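I suspect the reason I get null for the fields is that my model class is not structured correctly.
I ran a simple GET request in the console tool of the Elastic portal:
A sample document returned by this query is shown below: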
{
"_index" : "webapp-razor-2021.05",
"_type" : "_doc",
"_id" : "34v3t43kBwE34t3vJowGRgl",
"_score" : 1.0,
"_source" : {
"@timestamp" : "2021-05-03T20:19:46.9329848+01:00",
"level" : "Information",
"messageTemplate" : "{@LogEventCategory}{@LogEventType}{@LogEventSource}{@LogCountry}{@LogRegion}{@LogCity}{@LogZip}{@LogLatitude}{@LogLongitude}{@LogIsp}{@LogIpAddress}{@LogMobile}{@LogUserId}{@LogUsername}{@LogForename}{@LogSurname}{@LogData}",
"message" : "\"Open Id Connect\"\"User Sign In\"\"WebApp-RAZOR\"\"United Kingdom\"\"England\"\"MyTown\"\"PX27\"\"54.8951\"\"-9.1585\"\"My ISP\"\"123.345.789.180\"\"False\"\"a8vce3vc-8e61-44fc-b142-93ck396ad91ce\"\"joe@email.net\"\"joe@email.net\"\"Bloggs\"\"User with username [joe@email.net] forename [joe@email.net] surname [Bloggs] from IP Address [123.345.789.180] signed into the application [WebApp_RAZOR] Succesfully\"",
"fields" : {
"LogEventCategory" : "Open Id Connect",
"LogEventType" : "User Sign In",
"LogEventSource" : "WebApp-RAZOR",
"LogCountry" : "United Kingdom",
"LogRegion" : "England",
"LogCity" : "MyTown",
"LogZip" : "PX27",
"LogLatitude" : "54.8951",
"LogLongitude" : "-9.1585",
"LogIsp" : "My ISP",
"LogIpAddress" : "123.345.789.180",
"LogMobile" : "False",
"LogUserId" : "a8vce3vc-8e61-44fc-b142-93ck396ad91ce",
"LogUsername" : "joe@email.net",
"LogForename" : "joe@email.net",
"LogSurname" : "Bloggs",
"LogData" : "User with username [joe@email.net] forename [Joe] surname [Bloggs] from IP Address [123.345.789.180] signed into the application [WebApp_RAZOR] Succesfully",
"RequestId" : "0HM8ED1IRB7AK:00000001",
"RequestPath" : "/signin-oidc",
"ConnectionId" : "0HM8ED1IRB7AK",
"MachineName" : "DESKTOP-OS52032",
"MemoryUsage" : 23688592,
"ProcessId" : 26212,
"ProcessName" : "WebApp-RAZOR",
"ThreadId" : 6
}
}
}
Sample model class (or part of it):
public class LogsViewModel
{
[JsonProperty("@timestamp")]
public string Timestamp { get; set; }
[JsonProperty("level")]
public string Level { get; set; }
[JsonProperty("fields")]
public Fields Fields { get; set; }
}
public class Fields
{
[JsonProperty("LogEventCategory")]
public string LogEventCategory { get; set; }
// Not all properties are shown here, but the rest follow the same principle...
}
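Can anyone give me an idea? Once I know how to get the value of a field such as "LogEventCategory", I should be able to move on and work out the rest. None of the examples in the Elastic documentation worked for me. Thanks.
Solution
After a few days of trial and error, I finally arrived at a solution that extracts selected fields from the _source object of an Elastic document. There may well be more optimal ways to do this, so any feedback on the topic is welcome.
My first step was to look at the structure of a sample document in the index written by Serilog. Note that in my case I do not necessarily include every structured log event property in every log event written to Elastic; at system startup, for example, I do not need user or location details at all.
Using DevTools in the Elastic portal, I ran a simple GET request:
User Russ Cam gave a great tip in the comments above: he suggested taking advantage of the Elastic Common Schema .NET NuGet package, which brings some standardization to logging to Elastic with Serilog from a variety of different applications and sources. From reading the forums, it seems Elastic strongly encourages using the common schema, because it works better when it comes to creating charts, metrics, dashboards, and so on.
My web app targets .NET 5, and the Program.cs code below shows where I added the reference to the Elastic Common Schema .NET library mentioned above. Because I am connecting to Elastic Cloud, I have to include authentication details when building the Elastic client, and it took me a few attempts to work out how to combine this package reference with the other Elastic client options:
Program.cs file: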
public static void Main(string[] args)
{
var configuration = new ConfigurationBuilder()
.SetBasePath(Directory.GetCurrentDirectory())
.AddJsonFile(path: "appsettings.json", optional: false, reloadOnChange: true)
.Build();
// Credentials used for the Elastic Cloud logging sink.
var elkUri = configuration.GetSection("ElasticCloud").GetValue<string>("Uri");
var elkUsername = configuration.GetSection("ElasticCloud").GetValue<string>("Username");
var elkPassword = configuration.GetSection("ElasticCloud").GetValue<string>("Password");
var elkApplicationName = configuration.GetSection("ElasticCloud").GetValue<string>("ApplicationName");
Log.Logger = new LoggerConfiguration()
.ReadFrom.Configuration(configuration)
.WriteTo.Elasticsearch(new ElasticsearchSinkOptions(new Uri(elkUri))
{
ModifyConnectionSettings = x => x.BasicAuthentication(elkUsername, elkPassword),
IndexFormat = "webapp-razor-{0:yyyy.MM}",
AutoRegisterTemplate = true,
CustomFormatter = new EcsTextFormatter() // *Elastic Common Schema .NET package ref HERE*
})
.CreateLogger();
var host = CreateHostBuilder(args).Build();
using var scope = host.Services.CreateScope();
var services = scope.ServiceProvider;
string logEventCategory = "WebApp-RAZOR";
string logEventType = "Application Startup";
string logEventSource = "System";
string logData = "";
try
{
// Tested OK 1.5.2021
//throw new Exception(); // Testing only..
logData = "Application Starting Up";
Log.Information(
"{@LogEventCategory}" +
"{@LogEventType}" +
"{@LogEventSource}" +
"{@LogData}",
logEventCategory,
logEventType,
logEventSource,
logData);
host.Run(); // Run the WebHostBuilder.
}
catch (Exception ex)
{
logData = "The Application failed to start correctly.";
// Tested on 08/07/2020
Log.Fatal(ex,
"{@LogEventCategory}" +
"{@LogEventType}" +
"{@LogEventSource}" +
"{@LogData}",
logEventCategory,
logEventType,
logEventSource,
logData);
}
finally // Cleanup code.
{
Log.CloseAndFlush();
}
}
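My approach was to use a dynamic type with the NEST client method so that I could avoid a strongly typed model. This made debugging much easier, because I could pause on the result and inspect the structure of the data returned by the query.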
var searchResponse = await _elasticClient.SearchAsync<dynamic>(s => s
//.AllIndices()
.Index("webapp-razor-*")
.Query(q => q
.MatchAll()
)
);
// Once the searchResponse data is returned from the query,
// I then map the results to a View Model
// (which I use for rendering the list of results to my Razor page)
LogsViewModel = new LogsViewModel
{
ScannedEventCount = searchResponse.Hits.Count,
LogEventProperties = new List<LogEventProperties>()
};
foreach (var doc in searchResponse.Documents)
{
    // Each dynamic document is treated as a dictionary of top-level _source keys.
    var source = (IDictionary<string, object>)doc;
    var lep = new LogEventProperties();
    lep.Timestamp = DateTime.Parse(doc["@timestamp"].ToString());
    lep.Level = doc["log.level"];
    // Properties (only present when the log event carried them)
    if (source.TryGetValue("_metadata", out object metadataObj) && metadataObj is IDictionary<string, object> metadata)
    {
        if (metadata.TryGetValue("log_event_category", out object value1)) { lep.LogEventCategory = value1.ToString(); }
        if (metadata.TryGetValue("log_event_type", out object value2)) { lep.LogEventType = value2.ToString(); }
        if (metadata.TryGetValue("log_event_source", out object value3)) { lep.LogEventSource = value3.ToString(); }
        if (metadata.TryGetValue("log_device_id", out object value4)) { lep.LogDeviceId = value4.ToString(); }
        if (metadata.TryGetValue("log_country", out object value5)) { lep.LogCountry = value5.ToString(); }
        if (metadata.TryGetValue("log_region", out object value6)) { lep.LogRegion = value6.ToString(); }
        // The original read value5 (the country) for city, zip and ISP; each now uses its own value.
        if (metadata.TryGetValue("log_city", out object value7)) { lep.LogCity = value7.ToString(); }
        if (metadata.TryGetValue("log_zip", out object value8)) { lep.LogZip = value8.ToString(); }
        if (metadata.TryGetValue("log_latitude", out object value9)) { lep.LogLatitude = value9.ToString(); }
        if (metadata.TryGetValue("log_longitude", out object value10)) { lep.LogLongitude = value10.ToString(); }
        if (metadata.TryGetValue("log_isp", out object value11)) { lep.LogIsp = value11.ToString(); }
        if (metadata.TryGetValue("log_ip_address", out object value12)) { lep.LogIpAddress = value12.ToString(); }
        if (metadata.TryGetValue("log_mobile", out object value13)) { lep.LogMobile = value13.ToString(); }
        if (metadata.TryGetValue("log_user_id", out object value14)) { lep.LogUserId = value14.ToString(); }
        if (metadata.TryGetValue("log_username", out object value15)) { lep.LogUsername = value15.ToString(); }
        if (metadata.TryGetValue("log_forename", out object value16)) { lep.LogForename = value16.ToString(); }
        if (metadata.TryGetValue("log_surname", out object value17)) { lep.LogSurname = value17.ToString(); }
        if (metadata.TryGetValue("log_data", out object value18)) { lep.LogData = value18.ToString(); }
        if (metadata.TryGetValue("request_id", out object value19)) { lep.RequestId = value19.ToString(); }
        if (metadata.TryGetValue("request_path", out object value20)) { lep.RequestPath = value20.ToString(); }
        if (metadata.TryGetValue("connection_id", out object value21)) { lep.ConnectionId = value21.ToString(); }
        // Convert.ToInt64 accepts any boxed numeric type; a direct (Int64) cast throws unless the boxed value is exactly a long.
        if (metadata.TryGetValue("memory_usage", out object value22)) { lep.MemoryUsage = Convert.ToInt64(value22); }
    }
    // Exception
    if (source.TryGetValue("error", out object errorObj) && errorObj is IDictionary<string, object> error)
    {
        if (error.TryGetValue("message", out object value23)) { lep.ErrorMessage = value23.ToString(); }
        if (error.TryGetValue("type", out object value24)) { lep.ErrorType = value24.ToString(); }
        if (error.TryGetValue("stack_trace", out object value25)) { lep.ErrorStackTrace = value25.ToString(); }
    }
    // Machine Name
    if (source.TryGetValue("host", out object hostObj) && hostObj is IDictionary<string, object> host)
    {
        if (host.TryGetValue("name", out object value26)) { lep.MachineName = value26.ToString(); }
    }
    // Process
    if (source.TryGetValue("process", out object processObj) && processObj is IDictionary<string, object> process)
    {
        // Check that "thread" exists before reading its "id", so a missing key cannot throw.
        if (process.TryGetValue("thread", out object threadObj) && threadObj is IDictionary<string, object> thread
            && thread.TryGetValue("id", out object value27)) { lep.ThreadId = Convert.ToInt64(value27); }
        if (process.TryGetValue("pid", out object value28)) { lep.ProcessId = Convert.ToInt64(value28); }
        if (process.TryGetValue("name", out object value29)) { lep.ProcessName = value29.ToString(); }
    }
    LogsViewModel.LogEventProperties.Add(lep);
}
}
return View(LogsViewModel);
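The underlying reason for the approach above is that some documents do not contain all of the structured logging event properties. I had to find a way to check that a dictionary key exists before trying to read its value; otherwise I got an exception whenever a key was missing. One example is the difference between a log event generated during an exception and the information event generated when a user signs in to the application.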
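The key-checking idea can also be pulled into a small helper. This is only a minimal sketch, written without NEST so it runs on its own: `SourceReader`, `TryGetPath`, `GetString` and `GetInt64` are hypothetical names, not part of NEST or Serilog, and it assumes each document is exposed as nested `IDictionary<string, object>` values, which is what the casts in the mapping code imply.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not part of NEST): walks nested dictionaries key by key
// and reports failure instead of throwing when any key on the path is missing.
public static class SourceReader
{
    public static bool TryGetPath(object node, out object value, params string[] path)
    {
        value = null;
        foreach (var key in path)
        {
            // Stop as soon as the current node is not a dictionary or lacks the key.
            if (node is IDictionary<string, object> dict && dict.TryGetValue(key, out var next))
            {
                node = next;
            }
            else
            {
                return false;
            }
        }
        value = node;
        return value != null;
    }

    // Returns the value as a string, or null when the path does not exist.
    public static string GetString(object node, params string[] path)
        => TryGetPath(node, out var v, path) ? v.ToString() : null;

    // Returns the value as a long, or null when the path does not exist.
    // Convert.ToInt64 accepts any boxed numeric type, but throws for non-numeric values.
    public static long? GetInt64(object node, params string[] path)
        => TryGetPath(node, out var v, path) ? Convert.ToInt64(v) : (long?)null;
}
```

With a helper like this, a lookup such as `SourceReader.GetString(source, "host", "name")` returns null rather than throwing when `host` or `name` is absent, and `SourceReader.GetInt64(source, "process", "thread", "id")` handles the nested case in one call.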
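The two documents below show slightly different JSON structures, which underlines my decision to fetch the results using a dynamic type. In general, for any document I create in Elastic myself, I map items to an appropriate model, because I always know the full structure in advance.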
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 70,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "webapp-razor-2021.05",
"_type" : "_doc",
"_id" : "_2sOPnkBwE4YgJownxnP",
"_score" : 1.0,
"_source" : {
"@timestamp" : "2021-05-05T20:43:34.6041763+01:00",
"log.level" : "Information",
"message" : "\"WebApp-RAZOR\"\"Application Startup\"\"System\"\"Application Starting Up\"",
"_metadata" : {
"message_template" : "{@LogEventCategory}{@LogEventType}{@LogEventSource}{@LogData}",
"log_event_category" : "WebApp-RAZOR",
"log_event_type" : "Application Startup",
"log_event_source" : "System",
"log_data" : "Application Starting Up",
"memory_usage" : 4680920
},
"ecs" : {
"version" : "1.5.0"
},
"event" : {
"severity" : 2,
"timezone" : "GMT Standard Time",
"created" : "2021-05-05T20:43:34.6041763+01:00"
},
"host" : {
"name" : "DESKTOP-OS52032"
},
"log" : {
"logger" : "Elastic.CommonSchema.Serilog",
"original" : null
},
"process" : {
"thread" : {
"id" : 9
},
"pid" : 3868,
"name" : "WebApp-RAZOR",
"executable" : "WebApp-RAZOR"
}
}
},
{
"_index" : "webapp-razor-2021.05",
"_type" : "_doc",
"_id" : "AGsOPnkBwE4YgJowyBrP",
"_score" : 1.0,
"_source" : {
"@timestamp" : "2021-05-05T20:43:44.3936344+01:00",
"log.level" : "Information",
"message" : "\"Open Id Connect\"\"User Sign In\"\"WebApp-RAZOR\"\"United Kingdom\"\"England\"\"MyTown\"\"OX26\"\"51.8951\"\"-1.1585\"\"My ISP\"\"123.456.789.101\"\"False\"\"34vc34-34v34534-44fc-b142-32223ad91ce\"\"joe.bloggs@email.net\"\"joe.bloggs@email.net\"\"Bloggs\"\"User with username [joe.bloggs@email.net] forename [Jose] surname [Bloggs] from IP Address [123.345.789.101] signed into the application [WebApp_RAZOR] Succesfully\"",
"_metadata" : {
"message_template" : "{@LogEventCategory}{@LogEventType}{@LogEventSource}{@LogCountry}{@LogRegion}{@LogCity}{@LogZip}{@LogLatitude}{@LogLongitude}{@LogIsp}{@LogIpAddress}{@LogMobile}{@LogUserId}{@LogUsername}{@LogForename}{@LogSurname}{@LogData}",
"log_event_category" : "Open Id Connect",
"log_event_type" : "User Sign In",
"log_event_source" : "WebApp-RAZOR",
"log_country" : "United Kingdom",
"log_region" : "England",
"log_city" : "MyTown",
"log_zip" : "OX26",
"log_latitude" : "55.1234",
"log_longitude" : "-10.1585",
"log_isp" : "My ISP",
"log_ip_address" : "123.456.789.101",
"log_mobile" : "False",
"log_user_id" : "34vc34-34v3434-44fc-b142-32223ad91ce",
"log_username" : "joe.bloggs@email.net",
"log_forename" : "joe.bloggs@email.net",
"log_surname" : "Bloggs",
"log_data" : "User with username [joe.bloggs@email.net] forename [Joe] surname [Bloggs] from IP Address [123.456.789.101] signed into the application [WebApp_RAZOR] Succesfully",
"request_id" : "0HM8FVO9FFHDD:00000001",
"request_path" : "/signin-oidc",
"connection_id" : "0HM8FVO9FFHDD",
"memory_usage" : 23954480
},
"ecs" : {
"version" : "1.5.0"
},
"event" : {
"severity" : 2,
"timezone" : "GMT Standard Time",
"created" : "2021-05-05T20:43:44.3936344+01:00"
},
"host" : {
"name" : "DESKTOP-OS52032"
},
"log" : {
"logger" : "Elastic.CommonSchema.Serilog",
"original" : null
},
"process" : {
"thread" : {
"id" : 16
},
"pid" : 3868,
"name" : "WebApp-RAZOR",
"executable" : "WebApp-RAZOR"
}
}
},