c# - Encode only accented characters in an HTML string
问题描述
I have the following function that accepts an HTML string, for example "<p>áêö</p>"
:
public string EncodeString(string input)
{
// ...
return System.Net.WebUtility.HtmlEncode(input);
}
I'd like to modify that function to output the same string, but with the accented characters as HTML entities. Using System.Net.WebUtility.HtmlEncode()
encodes the entire string, including the HTML tags. I'd like to preserve the HTML tags if possible, since the string is parsed and rendered elsewhere in the application. Is this something that is better solved with a regex?
解决方案
您可以使用AngleSharp之类的库来替换 html 元素的内容:
public static async Task<string> EncodeString(string input)
{
var context = BrowsingContext.New(Configuration.Default);
var document = await context.OpenAsync(req => req.Content(input));
var pElement = document.QuerySelector("p");
pElement.TextContent = System.Net.WebUtility.HtmlEncode(pElement.TextContent);
return pItem.ToHtml();
}
在此处查看实际操作:.NET Fiddle
对于嵌套元素的更一般情况,以下是修改后的代码:
public static async Task<string> EncodeString(string input)
{
var context = BrowsingContext.New(Configuration.Default);
var document = await context.OpenAsync(req => req.Content(input));
return await EncodeString(document.Body.FirstChild);
}
private static async Task<string> EncodeString(INode content)
{
foreach(var node in content.ChildNodes)
{
node.NodeValue = node.NodeType == NodeType.Text ?
System.Net.WebUtility.HtmlEncode(node.NodeValue) :
await EncodeString(node);
}
return content.ToHtml();
}
推荐阅读
- php - 如何使用 txt 文件的文件列表作为数组变量?
- python-3.x - 期望值:第 1 行第 1 列(字符 0)
- ansible - 带双引号的 ansible playbook 变量
- c++ - 尽管有范围块,std::lock_guard 似乎提供了线程安全
- android - 使用滚动视图时,Edittext 被键盘隐藏
- joomla - 如何在文章中显示导航链接
- docker - Docker 在同一个 Docker Compose 堆栈中运行时不会向 fluentd 发送日志
- java - Java:如何打开不同类型的 url,如 mailto 和 http?
- reactjs - 如何使用 React hook useEffect 添加事件处理程序取决于状态?
- groovy - 如何获取运行 Jenkins 管道作业的文件夹名称