c# - Compare two Word documents in C#
Problem description
I have a problem: I need to compare two Word documents in C#, both the text and the formatting. I found a third-party library for viewing and processing documents, DevExpress, so I downloaded the trial to check whether it can solve the problem.
For example, I have two Word documents:
1: This is a text example
2: This is not a text example
In the text above, the only difference is the word "not".
My problem is: how can I check the differences, including the formatting?
So far, this is my code for iterating over the contents of the document:
using System;
using System.Collections.Generic;
using System.Diagnostics;
using DevExpress.XtraRichEdit.API.Native;

public void CompareEpub(string word)
{
    try
    {
        using (DevExpress.XtraRichEdit.RichEditDocumentServer srv = new DevExpress.XtraRichEdit.RichEditDocumentServer())
        {
            // Load the document from the given path.
            srv.LoadDocument(word);
            MyIterator visitor = new MyIterator();
            DocumentIterator iterator = new DocumentIterator(srv.Document, true);
            // Visit every element so the visitor can collect the text runs.
            while (iterator.MoveNext())
            {
                iterator.Current.Accept(visitor);
            }
            foreach (var item in visitor.ListOfText)
            {
                Debug.WriteLine("text: " + item.Text + " b: " + item.IsBold + " u: " + item.IsUnderline + " i: " + item.IsItalic);
            }
        }
    }
    catch (Exception ex)
    {
        Debug.WriteLine(ex.Message);
        Debug.WriteLine(ex.StackTrace);
        // Rethrow without resetting the stack trace.
        throw;
    }
}
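public class MyIterator : DocumentVisitorBase
{
    // Text runs collected in document order, each with its formatting flags.
    public List<Model.HtmlContent> ListOfText { get; }
    public MyIterator()
    {
        ListOfText = new List<Model.HtmlContent>();
    }
    public override void Visit(DocumentText text)
    {
        var m = new Model.HtmlContent
        {
            Text = text.Text,
            IsBold = text.TextProperties.FontBold,
            IsItalic = text.TextProperties.FontItalic,
            // Note: UnderlineWordsOnly appears to describe whether underlining skips spaces,
            // not whether the run is underlined at all, so this flag may need a different property.
            IsUnderline = text.TextProperties.UnderlineWordsOnly
        };
        ListOfText.Add(m);
    }
}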
With the code above I can walk through the text and its formatting. But how can I use this to compare the text?
Suppose I build two lists, one for each document. How can I compare them?
If I compare the text in one list against the other list in a loop, all I learn is whether two words are equal.
Can anyone help me with this, or just give me an idea of how I can make it work?
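One idea for the two-list approach, as a rough sketch only: a run returned by the visitor can hold several words, so it may help to split each run into single words that carry the run's formatting before comparing anything. The tuple below stands in for Model.HtmlContent (whose definition is not shown here), and RunTokenizer and ToWords are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class RunTokenizer
{
    // Splits each formatted run into whitespace-separated words and copies
    // the run's formatting flags onto every word it contains.
    public static List<(string Text, bool Bold, bool Italic, bool Underline)> ToWords(
        IEnumerable<(string Text, bool Bold, bool Italic, bool Underline)> runs)
    {
        var words = new List<(string Text, bool Bold, bool Italic, bool Underline)>();
        foreach (var run in runs)
        {
            // A null separator array makes Split break on any whitespace.
            var parts = run.Text.Split((char[])null, StringSplitOptions.RemoveEmptyEntries);
            foreach (var part in parts)
            {
                words.Add((part, run.Bold, run.Italic, run.Underline));
            }
        }
        return words;
    }
}
```

Because value tuples compare element by element, two words from different documents are equal only when both the text and all three flags match. One limitation of this sketch: a word whose formatting changes in the middle ends up as two separate tokens.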
I didn't post on the DevExpress forum because I feel this is a problem with my approach rather than with the trial or the control I've been using. I also found out that the control has no functionality for comparing text like the one in Microsoft Word.
Thank you.
Update:
Desired output
This is (not) a text example
The text inside the () is text that is not found in the first document. The output I want is like the output of Diff Match Patch: https://github.com/pocketberserker/Diff.Match.Patch
But I can't work out how to implement the check for the formatting.
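A minimal sketch of how output of that shape could be produced without a library, assuming each document has already been reduced to a list of word tokens whose equality covers both text and formatting (a value tuple is used here as a stand-in). It uses a plain longest-common-subsequence table, which is a simpler technique than what Diff Match Patch does internally. TokenDiff and Render are hypothetical names, and the square brackets for words missing from the second document are an added convention.

```csharp
using System;
using System.Collections.Generic;

public static class TokenDiff
{
    // Renders a word-level diff: unchanged words as-is, words only in the
    // second list as (word), words only in the first list as [word].
    public static string Render<T>(IList<T> first, IList<T> second, Func<T, string> text)
    {
        var eq = EqualityComparer<T>.Default;

        // lcs[i, j] = length of the longest common subsequence of first[i..] and second[j..].
        var lcs = new int[first.Count + 1, second.Count + 1];
        for (int i = first.Count - 1; i >= 0; i--)
        {
            for (int j = second.Count - 1; j >= 0; j--)
            {
                lcs[i, j] = eq.Equals(first[i], second[j])
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);
            }
        }

        // Walk both lists, following the table to decide what to emit next.
        var parts = new List<string>();
        int x = 0, y = 0;
        while (x < first.Count && y < second.Count)
        {
            if (eq.Equals(first[x], second[y])) { parts.Add(text(first[x])); x++; y++; }
            else if (lcs[x + 1, y] >= lcs[x, y + 1]) { parts.Add("[" + text(first[x]) + "]"); x++; }
            else { parts.Add("(" + text(second[y]) + ")"); y++; }
        }
        while (x < first.Count) parts.Add("[" + text(first[x++]) + "]");
        while (y < second.Count) parts.Add("(" + text(second[y++]) + ")");
        return string.Join(" ", parts);
    }
}
```

For the two example sentences, with every word unformatted, Render(first, second, t => t.Text) gives: This is (not) a text example. If a word keeps its text but changes format, the tokens no longer compare equal, so it shows up as a removal followed by an addition, for example [text] (text).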
Solution