Better LINQ parsing of large XML data

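Problem description

I have an application that reads many XML files and performs lookups to create a CSV file. I have noticed that the output is not always complete, i.e. a result or two goes missing, so I suspect the way I process the data is incorrect and inefficient. I would really appreciate some help from the experts here.

Small XML sample: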

<?xml version="1.0" encoding="utf-8"?>
<lookupdb xmlns:xsd="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns="urn:sample:lookupdb:0.1">
    <References>
          <Reference id="3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4">
            <Vehicle>Beach_Buggy_01</Vehicle>
            <Engineers>
              <Engineer>Joe Bloggs</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Bill Bloggs</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Bill</OwnerName>
            <CostID>ABCDEF123456</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address</Address>
          </Reference>
          <Reference id="d1053bd3-a1cb-4fb4-a7d5-ffee3e10ffdb">
            <Vehicle>Transit</Vehicle>
            <Engineers>
              <Engineer>Joe Bloggs2</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Andy Bloggs</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Andy</OwnerName>
            <CostID>9345089</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address4</Address>
          </Reference>
          <Reference id="30f8cfe8-40fd-4c99-9c7d-5ab98f8e5620">
            <Vehicle>Ford Fiesta</Vehicle>
            <Engineers>
              <Engineer>Steve Bloggs</Engineer>
            </Engineers>
            <IsActive>true</IsActive>
            <Owner>Sarah H</Owner>
            <Serviced>True</Serviced>
            <OwnerName>Bill</OwnerName>
            <CostID>834hsdfgs</CostID>
            <FuelType>Petrol</FuelType>
            <Phone>1234567890</Phone>
            <Address>Some Address3</Address>
          </Reference>
    </References>
    <Sessions>
        <RentalSession id="cc5d9960-3a80-4fd9-b7d6-0963198567c3">
              <VehicleRefId>3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4</VehicleRefId>
              <RentalPeriod startDate="2018-10-02T07:46:34Z" endDate="2018-10-02T08:27:36Z" />
              <HiringInfo HireId="2e428f42-f8f1-4603-9570-fed1fa78e470" customerId="1929936734" customerRefId="6da73407-f443-491d-9cad-c4fed9bfb71f" />
              <Notes>Vehicle Broke Down Recovery ordered</Notes>
              <VehicleGroup>ATV</VehicleGroup>
        </RentalSession>
        <RentalSession id="829221a2-196e-403a-bdcb-9759959cfa70">
              <VehicleRefId>3cb7ceb0-43c7-4c67-a7fb-fffb32fc71c4</VehicleRefId>
              <RentalPeriod startDate="2018-10-03T07:46:34Z" endDate="2018-10-04T08:27:36Z" />
              <HiringInfo HireId="4fb2cd21-9f48-44de-ae72-01ce4eeccdf9" customerId="2929936735" customerRefId="0a2d3d8b-ab06-4cd1-9ec5-aea4ac3f6da3" />
              <Notes>Returned on Time no Damage</Notes>
              <VehicleGroup>ATV</VehicleGroup>
        </RentalSession>
        <RentalSession id="68a6b485-d30a-439a-8081-8c09f724d23b">
              <VehicleRefId>d1053bd3-a1cb-4fb4-a7d5-ffee3e10ffdb</VehicleRefId>
              <RentalPeriod startDate="2018-10-05T07:46:34Z" endDate="2018-10-05T08:27:36Z" />
              <HiringInfo HireId="c4022764-7fc2-4415-97bf-57d616e3b8bd" customerId="3929936736" customerRefId="cb260bfc-34c1-4ac5-befa-17f69b2406bb" />
              <Notes>Scratch to Door Charges applied</Notes>
              <VehicleGroup>VANS</VehicleGroup>
        </RentalSession>
        <RentalSession id="c4083f9a-65ee-4693-8488-e299271064b1">
              <VehicleRefId>30f8cfe8-40fd-4c99-9c7d-5ab98f8e5620</VehicleRefId>
              <RentalPeriod startDate="2018-10-09T07:46:34Z" endDate="2018-10-09T08:27:36Z" />
              <HiringInfo HireId="cb260bfc-34c1-4ac5-befa-17f69b2406bb" customerId="4929936737" customerRefId="c4022764-7fc2-4415-97bf-57d616e3b8bd" />
              <Notes>Generally a rubbish vehicle</Notes>
              <VehicleGroup>Small Cars</VehicleGroup>
        </RentalSession>
    </Sessions>
</lookupdb>

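The owner name is the program's main lookup, along with the engineer. The VehicleRefId in a session matches a Reference id, and most of the data comes from the rental sessions. From some local testing, fetching the session data first seems to work better, but I am not entirely sure about this approach. Here is the code that I think needs looking at:

1: Get the rental data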

 var result = xDoc.Descendants().Descendants(ns + "RentalSession")
                            .Where(x => x.Element(ns + "VehicleRefId").Value != null)
                            .Select(x => new
                            {
                                _VehicleRefId = GetResultValue(true, x, "VehicleRefId", "VehicleRefId", "Vehicle Reference ID"),
                                _RentalSessionId = GetResultValue(false, x, "RentalSession", "id", "Session ID"),
                                _startDate = GetResultValue(false, x, "RentalPeriod", "startDate", "Start date"),
                                _endDate = GetResultValue(false, x, "RentalPeriod", "endDate", "End date"),
                                _VehicleGroup = GetResultValue(true, x, "VehicleGroup", "VehicleGroup", "Vehicle Group"),
                                _Notes = GetResultValue(true, x, "Notes", "Notes", "Event Notes")
                            }).ToList().Distinct();

2: The method used in the rental data query:

private string GetResultValue(bool isNode, XElement atrr_value,string nodeName, string xattr_Name, string value_text)
{
    string retValue = "";
    try
    {
        switch(isNode)
        {
            case true:
                    retValue = !string.IsNullOrEmpty((string)atrr_value.Element(ns + nodeName).Value)
                                       ? (string)atrr_value.Element(ns + nodeName).Value
                                          : $"No {value_text} Found.";
                    break;
            default:
                    if(nodeName == "RentalSession")
                    {
                        retValue = !string.IsNullOrEmpty((string)atrr_value.Attribute(xattr_Name).Value)
                                       ? (string)atrr_value.Attribute(xattr_Name).Value
                                          : $"No {value_text} Found.";
                    }
                    else
                    {
                        retValue = !string.IsNullOrEmpty((string)atrr_value.Element(ns + nodeName).Attribute(xattr_Name).Value)
                                       ? (string)atrr_value.Element(ns + nodeName).Attribute(xattr_Name).Value
                                          : $"No {value_text} Found.";
                    }
                    break;
        }
    }
    catch(Exception rex)
    {
        retValue = "null";
    }

    return retValue;
}

3: Get the Owner and Engineer data:

foreach(var itemData in result)
{
    try
    {
        var references = xDoc.Descendants().Descendants(ns + "Reference")
                         .Where(
                                a => a.Attribute("id").Value == itemData._VehicleRefId
                               )
                         .Select(a => new
                         {
                                _OwnerName = a.Element(ns + "OwnerName").Value,
                                _Engineer = a.Elements(ns + "Engineers").Descendants(ns + "Engineer").Select(e => e.Value).Single()
                         }).FirstOrDefault();

                         ... Further parsing 
    }
    catch (Exception xEx)
    {
        //some error handling stuff
    }
}

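I would really appreciate your help in understanding where I am going wrong and how to simplify this part of the code.

Thanks in advance.

Edit: the XML above shows only a fragment of the data; there will be multiple references and sessions, and some sessions will match the same reference.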

Tags: c#, xml, linq

Solution


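Do not use the Value property: it throws a NullReferenceException when the element is missing (null). Instead, use an explicit cast, as in the code below. Casting a null XElement or XAttribute to string (or to a nullable type such as DateTime?) returns null instead of throwing.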

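// A single Descendants(name) call is enough; chaining Descendants().Descendants(name)
// returns each RentalSession once per ancestor element.
var result = xDoc.Descendants(ns + "RentalSession")
                            .Where(x => x.Element(ns + "VehicleRefId") != null)
                            .Select(x => new
                            {
                                _VehicleRefId = (string)x.Element(ns + "VehicleRefId"),
                                // The session id is an attribute of the RentalSession element itself
                                _RentalSessionId = (string)x.Attribute("id"),
                                // RentalPeriod keeps its dates in attributes; DateTime? is null when one is missing
                                _startDate = (DateTime?)x.Element(ns + "RentalPeriod")?.Attribute("startDate"),
                                _endDate = (DateTime?)x.Element(ns + "RentalPeriod")?.Attribute("endDate"),
                                _VehicleGroup = (string)x.Element(ns + "VehicleGroup"),
                                _Notes = (string)x.Element(ns + "Notes")
                            }).ToList().Distinct();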
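
For the Owner and Engineer step, the question re-scans the whole document for every session and calls Single() on the engineers. Single() throws when a Reference has no Engineer or more than one, and if the surrounding catch block neither logs nor re-throws, that row may silently disappear, which could explain the missing results. The following is only a sketch of one alternative, not the original answer's method: it indexes the Reference elements once with ToLookup and then joins each session against that index. The names SessionRow, RentalLookupSketch, BuildRows and referencesById are made up for illustration, and the namespace is the one from the sample XML.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical row type for illustration; the real CSV columns may differ.
public sealed class SessionRow
{
    public string SessionId { get; set; }
    public string VehicleRefId { get; set; }
    public DateTime? StartDate { get; set; }
    public DateTime? EndDate { get; set; }
    public string OwnerName { get; set; }
    public string Engineers { get; set; }
}

public static class RentalLookupSketch
{
    // Namespace copied from the xmlns declaration of the sample document.
    private static readonly XNamespace Ns = "urn:sample:lookupdb:0.1";

    public static List<SessionRow> BuildRows(XDocument xDoc)
    {
        // Index the Reference elements by id once, instead of searching the
        // document again for every session. ToLookup tolerates duplicate and null keys.
        var referencesById = xDoc.Descendants(Ns + "Reference")
            .ToLookup(r => (string)r.Attribute("id"));

        return xDoc.Descendants(Ns + "RentalSession")
            .Select(s =>
            {
                // Casts return null when the element or attribute is missing.
                var refId = (string)s.Element(Ns + "VehicleRefId");
                var period = s.Element(Ns + "RentalPeriod");

                // An unknown id yields an empty sequence, so reference may be null.
                var reference = referencesById[refId].FirstOrDefault();

                return new SessionRow
                {
                    SessionId = (string)s.Attribute("id"),
                    VehicleRefId = refId,
                    StartDate = (DateTime?)period?.Attribute("startDate"),
                    EndDate = (DateTime?)period?.Attribute("endDate"),
                    OwnerName = (string)reference?.Element(Ns + "OwnerName"),
                    // Join all engineers rather than assuming exactly one.
                    Engineers = reference == null
                        ? null
                        : string.Join("; ", reference.Descendants(Ns + "Engineer")
                                                     .Select(e => (string)e))
                };
            })
            .ToList();
    }
}
```

With this shape, a session whose reference or rental period is missing still produces a row with null fields, so gaps show up in the output instead of being dropped. A date attribute that is present but malformed would still make the DateTime? cast throw a FormatException.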
