c# - Writing a huge string in XDocument.SetAttributeValue
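Question
I have a large StringBuilder (~140MB) that I need to write into an XML attribute. I am using XDocument to handle the XML manipulation.
When I try to write the string to the XAttribute, I get a System.OutOfMemoryException (because I need to call StringBuilder.ToString(), which I assume loads the entire string into memory).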
var length = value.RawArtifact.Content.Length;
StringBuilder b = new StringBuilder();
int pos = 0;
while (pos < length - 1000)
{
b.Append(BitConverter.ToString(value.RawArtifact.Content, pos, 1000).Replace("-", ""));
pos += 1000;
}
b.Append(BitConverter.ToString(value.RawArtifact.Content, pos)).Replace("-", "");
var buffer = b.ToString(); // This throws an exception
myAttribute.SetAttributeValue("my-attribute", buffer);
I can't find any overload of SetAttributeValue that takes something like a StreamReader or anything similar, so I'm feeling a bit stuck right now.
Any suggestions?
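Answer
If you check the reference source for XAttribute, you will see that XAttribute has internal string value; so there is no way to use a StringBuilder or a StreamReader as the value.
Instead, you might consider a streaming approach in which you inject the attribute value while you are writing out the XDocument. If you do that, you can use XmlWriter.WriteStartAttribute() in combination with XmlWriter.WriteChars() to write your huge attribute value in chunks. According to the documentation, the WriteChars() method:
can be used to write large amounts of text one buffer at a time.
So it is designed for exactly this situation. There are two basic ways to implement streaming injection of an attribute value: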
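1. Use the algorithm from Mark Fussell's Combining the XmlReader and XmlWriter classes for simple streaming transformations, and stream from the XmlReader returned by XDocument.CreateReader() to an XmlWriter. For some examples, see File size restriction or limitation in C#, Edit a large XML file, and Automating replacing tables from external files.
2. Subclass XmlWriter itself and inject the attribute when the target element is written. For an example, see Custom xmlWriter to skip a certain element?.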
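Before looking at either approach in detail, the WriteStartAttribute() / WriteChars() combination can be sketched in isolation. This is only a minimal illustration, assuming a hypothetical ChunkedAttributeDemo class, made-up element and attribute names (Root, Big), and a StringWriter as the target so that the result can be printed; only the XmlWriter calls themselves come from the framework.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ChunkedAttributeDemo
{
    // Hypothetical helper: writes a Root element whose Big attribute is
    // 'repeat' copies of 'chunk', emitted one buffer at a time, so the
    // complete attribute value is never built as a single string here.
    public static string WriteRepeated(char[] chunk, int repeat)
    {
        var target = new StringWriter();
        var settings = new XmlWriterSettings { OmitXmlDeclaration = true };
        using (var writer = XmlWriter.Create(target, settings))
        {
            writer.WriteStartElement("Root");
            writer.WriteStartAttribute("Big");
            for (int i = 0; i < repeat; i++)
                writer.WriteChars(chunk, 0, chunk.Length);
            writer.WriteEndAttribute();
            writer.WriteEndElement();
        }
        return target.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(WriteRepeated("0A1B".ToCharArray(), 3));
    }
}
```

For a value of the size discussed in the question, the StringWriter would be replaced by a StreamWriter over a file, since the StringWriter holds the whole output in memory and is used here only to keep the sketch short.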
Taking the second approach, first create the following extension methods:
// Namespaces needed by the code in this answer.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;
using System.Xml.Linq;

public static partial class XmlExtensions
{
    public static void WriteAttribute(this XmlWriter writer, string localName, IEnumerable<(char [] Buffer, int Length)> valueSegments) =>
        WriteAttribute(writer, null, localName, null, valueSegments);

    public static void WriteAttribute(this XmlWriter writer, string localName, string namespaceUri, IEnumerable<(char [] Buffer, int Length)> valueSegments) =>
        WriteAttribute(writer, null, localName, namespaceUri, valueSegments);

    public static void WriteAttribute(this XmlWriter writer, string prefix, string localName, string namespaceUri, IEnumerable<(char [] Buffer, int Length)> valueSegments)
    {
        writer.WriteStartAttribute(prefix, localName, namespaceUri);
        char [] surrogateBuffer = null;
        // According to the docs, surrogate pairs cannot be split across calls to WriteChars():
        // https://docs.microsoft.com/en-us/dotnet/api/system.xml.xmlwriter.writechars?view=net-5.0#remarks
        // So if the last character of a segment is a high surrogate, buffer it and write it with the first character of the next buffer.
        foreach (var segment in valueSegments)
        {
            if (segment.Length < 1)
                continue;
            int start = 0;
            if (surrogateBuffer != null && surrogateBuffer[0] != '\0')
            {
                // Complete the pair left over from the previous segment.
                surrogateBuffer[1] = segment.Buffer[start++];
                writer.WriteChars(surrogateBuffer, 0, 2);
                surrogateBuffer[0] = surrogateBuffer[1] = '\0';
            }
            int count = segment.Length - start;
            if (count > 0 && char.IsHighSurrogate(segment.Buffer[segment.Length-1]))
            {
                // Hold back a trailing high surrogate until the next segment arrives.
                (surrogateBuffer = surrogateBuffer ?? new char[2])[0] = segment.Buffer[segment.Length-1];
                count--;
            }
            writer.WriteChars(segment.Buffer, start, count);
        }
        writer.WriteEndAttribute();
        if (surrogateBuffer != null && surrogateBuffer[0] != '\0')
            throw new XmlException(string.Format("Unterminated surrogate pair {0}", surrogateBuffer[0]));
    }
}

public static class ByteExtensions
{
    // Copied from this answer https://stackoverflow.com/a/14333437
    // By https://stackoverflow.com/users/445517/codesinchaos
    // To https://stackoverflow.com/questions/311165/how-do-you-convert-a-byte-array-to-a-hexadecimal-string-and-vice-versa
    // And modified to populate a char span rather than return a string.
    public static void ByteToHexBitFiddle(ReadOnlySpan<byte> bytes, Span<char> c)
    {
        if (c.Length < 2* bytes.Length)
            throw new ArgumentException("c.Length < 2* bytes.Length");
        int b;
        for (int i = 0; i < bytes.Length; i++)
        {
            b = bytes[i] >> 4;
            c[i * 2] = (char)(55 + b + (((b-10)>>31)&-7));
            b = bytes[i] & 0xF;
            c[i * 2 + 1] = (char)(55 + b + (((b-10)>>31)&-7));
        }
    }

    public static IEnumerable<(char [] segment, int length)> GetHexCharSegments(ReadOnlyMemory<byte> bytes, int chunkSize = 1000)
    {
        var buffer = new char[2*chunkSize];
        var length = bytes.Length;
        int pos = 0;
        while (pos < length - chunkSize)
        {
            ByteExtensions.ByteToHexBitFiddle(bytes.Span.Slice(pos, chunkSize), buffer);
            yield return (buffer, buffer.Length);
            pos += chunkSize;
        }
        ByteExtensions.ByteToHexBitFiddle(bytes.Span.Slice(pos), buffer);
        yield return (buffer, 2*(length - pos));
    }
}
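The expression 55 + b + (((b-10)>>31)&-7) in ByteToHexBitFiddle() is terse, so here is a small worked sketch of what it computes for a single nibble. The HexDigitDemo class and its method names are hypothetical and exist only for this illustration; the arithmetic is the same as in the code above.

```csharp
using System;

public static class HexDigitDemo
{
    // Maps a nibble (0..15) to its uppercase hex digit without branching.
    // For b < 10, (b - 10) is negative, so the arithmetic shift by 31 gives -1
    // (all bits set) and the mask yields -7: the result is 48 + b, i.e. '0'..'9'.
    // For b >= 10, the shift gives 0 and the result is 55 + b, i.e. 'A'..'F'.
    public static char ToHexDigit(int b) => (char)(55 + b + (((b - 10) >> 31) & -7));

    // Converts a whole byte array, two hex digits per byte (high nibble first).
    public static string ToHex(byte[] bytes)
    {
        var chars = new char[bytes.Length * 2];
        for (int i = 0; i < bytes.Length; i++)
        {
            chars[i * 2] = ToHexDigit(bytes[i] >> 4);
            chars[i * 2 + 1] = ToHexDigit(bytes[i] & 0xF);
        }
        return new string(chars);
    }

    public static void Main()
    {
        Console.WriteLine(ToHex(new byte[] { 0x00, 0x9A, 0xFF })); // prints 009AFF
    }
}
```

The output matches what BitConverter.ToString(bytes).Replace("-", "") from the question produces, but without allocating an intermediate string per chunk.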
Next, subclass XmlWriter as follows:
public class ElementEventArgs : EventArgs
{
    public XName Element { get; init; }
    public Stack<XName> ElementStack { get; init; }
}

public class NotifyingXmlWriter : XmlWriterProxy
{
    readonly Stack<XName> elements = new Stack<XName>();

    public NotifyingXmlWriter(XmlWriter baseWriter) : base(baseWriter) { }

    public event EventHandler<ElementEventArgs> OnElementStarted;
    public event EventHandler<ElementEventArgs> OnElementEnded;

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        base.WriteStartElement(prefix, localName, ns);
        var name = XName.Get(localName, ns);
        elements.Push(name);
        OnElementStarted?.Invoke(this, new ElementEventArgs { Element = name, ElementStack = elements });
    }

    public override void WriteEndElement()
    {
        base.WriteEndElement();
        var name = elements.Pop(); // Pop after base.WriteEndElement() lets the base class throw an exception on a stack error.
        OnElementEnded?.Invoke(this, new ElementEventArgs { Element = name, ElementStack = elements });
    }

    public override void WriteFullEndElement()
    {
        // Elements can also be closed via WriteFullEndElement(); keep the stack in sync in that case too.
        base.WriteFullEndElement();
        var name = elements.Pop();
        OnElementEnded?.Invoke(this, new ElementEventArgs { Element = name, ElementStack = elements });
    }
}

public class XmlWriterProxy : XmlWriter
{
    // Taken from this answer https://stackoverflow.com/a/32150990/3744182
    // by https://stackoverflow.com/users/3744182/dbc
    // To https://stackoverflow.com/questions/32149676/custom-xmlwriter-to-skip-a-certain-element
    // NOTE: async methods not implemented
    readonly XmlWriter baseWriter;

    public XmlWriterProxy(XmlWriter baseWriter) => this.baseWriter = baseWriter ?? throw new ArgumentNullException();

    protected virtual bool IsSuspended { get { return false; } }

    public override void Close() => baseWriter.Close();
    public override void Flush() => baseWriter.Flush();
    public override string LookupPrefix(string ns) => baseWriter.LookupPrefix(ns);

    public override void WriteBase64(byte[] buffer, int index, int count)
    {
        if (IsSuspended) return;
        baseWriter.WriteBase64(buffer, index, count);
    }

    public override void WriteCData(string text)
    {
        if (IsSuspended) return;
        baseWriter.WriteCData(text);
    }

    public override void WriteCharEntity(char ch)
    {
        if (IsSuspended) return;
        baseWriter.WriteCharEntity(ch);
    }

    public override void WriteChars(char[] buffer, int index, int count)
    {
        if (IsSuspended) return;
        baseWriter.WriteChars(buffer, index, count);
    }

    public override void WriteComment(string text)
    {
        if (IsSuspended) return;
        baseWriter.WriteComment(text);
    }

    public override void WriteDocType(string name, string pubid, string sysid, string subset)
    {
        if (IsSuspended) return;
        baseWriter.WriteDocType(name, pubid, sysid, subset);
    }

    public override void WriteEndAttribute()
    {
        if (IsSuspended) return;
        baseWriter.WriteEndAttribute();
    }

    public override void WriteEndDocument()
    {
        if (IsSuspended) return;
        baseWriter.WriteEndDocument();
    }

    public override void WriteEndElement()
    {
        if (IsSuspended) return;
        baseWriter.WriteEndElement();
    }

    public override void WriteEntityRef(string name)
    {
        if (IsSuspended) return;
        baseWriter.WriteEntityRef(name);
    }

    public override void WriteFullEndElement()
    {
        if (IsSuspended) return;
        baseWriter.WriteFullEndElement();
    }

    public override void WriteProcessingInstruction(string name, string text)
    {
        if (IsSuspended) return;
        baseWriter.WriteProcessingInstruction(name, text);
    }

    public override void WriteRaw(string data)
    {
        if (IsSuspended) return;
        baseWriter.WriteRaw(data);
    }

    public override void WriteRaw(char[] buffer, int index, int count)
    {
        if (IsSuspended) return;
        baseWriter.WriteRaw(buffer, index, count);
    }

    public override void WriteStartAttribute(string prefix, string localName, string ns)
    {
        if (IsSuspended) return;
        baseWriter.WriteStartAttribute(prefix, localName, ns);
    }

    public override void WriteStartDocument(bool standalone) => baseWriter.WriteStartDocument(standalone);
    public override void WriteStartDocument() => baseWriter.WriteStartDocument();

    public override void WriteStartElement(string prefix, string localName, string ns)
    {
        if (IsSuspended) return;
        baseWriter.WriteStartElement(prefix, localName, ns);
    }

    public override WriteState WriteState => baseWriter.WriteState;

    public override void WriteString(string text)
    {
        if (IsSuspended) return;
        baseWriter.WriteString(text);
    }

    public override void WriteSurrogateCharEntity(char lowChar, char highChar)
    {
        if (IsSuspended) return;
        baseWriter.WriteSurrogateCharEntity(lowChar, highChar);
    }

    public override void WriteWhitespace(string ws)
    {
        if (IsSuspended) return;
        baseWriter.WriteWhitespace(ws);
    }
}
Now you will be able to do the following:
string fileName = @"Question68941254.xml"; // or whatever
XNamespace targetNamespace = "";
XName targetName = targetNamespace + "TheNode";
using (var textWriter = new StreamWriter(fileName))
using (var innerXmlWriter = XmlWriter.Create(textWriter, new XmlWriterSettings { Indent = true }))
using (var xmlWriter = new NotifyingXmlWriter(innerXmlWriter))
{
    xmlWriter.OnElementStarted += (o, e) =>
    {
        if (e.Element == targetName)
        {
            // Add the attribute with the byte hex value to the target element.
            ((XmlWriter)o).WriteAttribute("TheAttribute", ByteExtensions.GetHexCharSegments(value.RawArtifact.Content.AsMemory()));
        }
    };
    xdocument.WriteTo(xmlWriter);
}
Here xdocument is, of course, the XDocument you are trying to populate, and TheNode is the element to which the attribute TheAttribute is added.
Notes:
Since your code shows that you are populating the StringBuilder by converting a large byte array to a large hex string buffer, I eliminated the intermediate StringBuilder and wrote the byte array directly in chunks.
If you really do need to write some StringBuilder b in chunks, use
public static partial class StringBuilderExtensions
{
    public static IEnumerable<(char [] segment, int length)> GetSegments(this StringBuilder sb, int bufferSize = 1024)
    {
        var buffer = new char[bufferSize];
        for (int i = 0; i < sb.Length; i += buffer.Length)
        {
            int length = Math.Min(buffer.Length, sb.Length - i);
            // Four-argument overload: copy 'length' chars starting at sb[i] into buffer[0..].
            sb.CopyTo(i, buffer, 0, length);
            yield return (buffer, length);
        }
    }
}
and pass b.GetSegments() to XmlExtensions.WriteAttribute().
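One consequence of the segment helpers above is worth spelling out: they yield the same char array on every iteration, so each segment has to be consumed before the iterator advances, which is what XmlExtensions.WriteAttribute() does. The following sketch shows what goes wrong otherwise. The SegmentReuseDemo class and its methods are hypothetical, and the result is collected into a second StringBuilder purely so that it can be printed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

public static class SegmentReuseDemo
{
    // Same shape as the GetSegments() helper: one buffer, reused for every segment.
    public static IEnumerable<(char[] Buffer, int Length)> GetSegments(StringBuilder sb, int bufferSize)
    {
        var buffer = new char[bufferSize];
        for (int i = 0; i < sb.Length; i += bufferSize)
        {
            int length = Math.Min(bufferSize, sb.Length - i);
            sb.CopyTo(i, buffer, 0, length);
            yield return (buffer, length);
        }
    }

    // Correct: each segment is consumed before the iterator moves on.
    public static string ConsumeImmediately(StringBuilder sb, int bufferSize)
    {
        var result = new StringBuilder();
        foreach (var (buffer, length) in GetSegments(sb, bufferSize))
            result.Append(buffer, 0, length);
        return result.ToString();
    }

    // Incorrect: ToList() runs the iterator to the end first, so every entry
    // refers to the same array, which by then holds only the last segment.
    public static string ConsumeAfterMaterializing(StringBuilder sb, int bufferSize)
    {
        var result = new StringBuilder();
        foreach (var (buffer, length) in GetSegments(sb, bufferSize).ToList())
            result.Append(buffer, 0, length);
        return result.ToString();
    }

    public static void Main()
    {
        var sb = new StringBuilder("ABCDEFGH");
        Console.WriteLine(ConsumeImmediately(sb, 4));        // prints ABCDEFGH
        Console.WriteLine(ConsumeAfterMaterializing(sb, 4)); // prints EFGHEFGH
    }
}
```

Reusing the buffer is what keeps memory use flat for a 140MB value, so the trade-off is deliberate; it only means the segments should not be cached with ToList(), ToArray(), or similar.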
Result from the demo fiddle:
<?xml version="1.0" encoding="utf-8"?>
<Root>
<SomeOtherNode>some value</SomeOtherNode>
<TheNode TheAttribute="000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F303132333435363738393A3B3C3D3E3F404142434445464748494A4B4C4D4E4F505152535455565758595A5B5C5D5E5F606162636465666768696A6B6C6D6E6F707172737475767778797A7B7C7D7E7F808182838485868788898A8B8C8D8E8F909192939495969798999A9B9C9D9E9FA0A1A2A3A4A5A6A7A8A9AAABACADAEAFB0B1B2B3B4B5B6B7B8B9BABBBCBDBEBFC0C1C2C3C4C5C6C7C8C9CACBCCCDCECFD0D1D2D3D4D5D6D7D8D9DADBDCDDDEDFE0E1E2E3E4E5E6E7E8E9EAEBECEDEEEFF0F1F2F3F4F5F6F7F8F9FAFBFCFDFEFF000102030405060708090A0B0C0D0E0F101112131415161718191A1B1C1D1E1F202122232425262728292A2B">
<foo></foo>the node value</TheNode>
<AnotherNode>another value</AnotherNode>
</Root>