.NET: Compare Word documents programmatically

o2g1uqev  posted on 2023-01-22 in .NET

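I need to compare two Office documents, in this case two Word documents, and produce a diff somewhat like the one SVN shows. It does not have to go that far, but it should at least highlight the differences.
I tried using the Office COM DLL and got this far: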

using WRD = Microsoft.Office.Interop.Word;

object fileToOpen = @"D:\doc1.docx";
string fileToCompare = @"D:\doc2.docx";

WRD.Application WA = new WRD.Application();

// From C# 4 on, optional arguments of COM interop calls can be omitted,
// so the long runs of Type.Missing are not needed.
WRD.Document wordDoc = WA.Documents.Open(ref fileToOpen);
wordDoc.Compare(fileToCompare);

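Any suggestions on how to proceed from here? This will be a web application with a lot of hits. Is using the Office COM objects the right approach, or is there something else I should look at?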


ki0zmccv1#

You should use the Document class to compare the files and open the result in a Word document.

using OfficeWord = Microsoft.Office.Interop.Word;

object fileToOpen = @"D:\doc1.docx";
string fileToCompare = @"D:\doc2.docx";

// The original answer took the instance from an application-specific helper
// (Global.OfficeFile.WordApp); a new instance is created here instead.
var app = new OfficeWord.Application();

object missing = Type.Missing;
object readOnly = false;
object AddToRecent = false;
object Visible = false;

OfficeWord.Document docZero = app.Documents.Open(fileToOpen, ref missing, ref readOnly, ref AddToRecent, Visible: ref Visible);

docZero.Final = false;
docZero.TrackRevisions = true;
docZero.ShowRevisions = true;
docZero.PrintRevisions = true;

// wdCompareTargetNew puts the comparison result in a new document;
// choose another WdCompareTarget value to change where Word shows it.
docZero.Compare(fileToCompare, missing, OfficeWord.WdCompareTarget.wdCompareTargetNew, true, false, false, false, false);

r3i60tvu2#

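My requirements were that I had to use a .NET library, and I wanted to avoid working with actual files and work with streams instead.
ZipArchive lives in System.IO.Compression.
What I did, and it worked out quite nicely, was to use the ZipArchive from .NET and compare the contents while skipping the .rels files, because they seem to be generated randomly each time a file is created. Here is my snippet: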

// Requires: using System.IO; using System.IO.Compression;
private static bool AreWordFilesSame(byte[] wordA, byte[] wordB)
{
    using (var streamA = new MemoryStream(wordA))
    using (var streamB = new MemoryStream(wordB))
    using (var zipA = new ZipArchive(streamA))
    using (var zipB = new ZipArchive(streamB))
    {
        // A different number of parts means the packages differ
        // (and would make the indexed loop below go out of range).
        if (zipA.Entries.Count != zipB.Entries.Count)
        {
            return false;
        }

        for (int i = 0; i < zipA.Entries.Count; ++i)
        {
            // Entries are compared by position, so both packages must list their parts in the same order.
            if (zipA.Entries[i].FullName != zipB.Entries[i].FullName)
            {
                return false;
            }

            if (zipA.Entries[i].Name.EndsWith(".rels")) // These are some weird Word files with autogenerated hashes
            {
                continue;
            }

            // Parts are read as text, which suits the XML parts; binary parts such as images are only compared approximately.
            using (var readerA = new StreamReader(zipA.Entries[i].Open()))
            using (var readerB = new StreamReader(zipB.Entries[i].Open()))
            {
                var textA = readerA.ReadToEnd();
                var textB = readerB.ReadToEnd();

                // As in the original snippet, an empty part is treated as a mismatch.
                if (textA != textB || textA.Length == 0)
                {
                    return false;
                }
            }
        }

        return true;
    }
}

v09wglhw3#

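I agree with Joseph about diffing the strings. I would also recommend a purpose-built diffing engine (several are listed here: Any decent text diff/merge engine for .NET?), which can help you avoid some of the common pitfalls in diffing.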


h6my8fg24#

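For a solution that runs on a server, or without Word installed and without the COM tools, you can use the WmlComparer component of XmlPowerTools.
The documentation is a bit limited, but here is an example usage: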

var expected = File.ReadAllBytes(@"c:\expected.docx");
var actual = File.ReadAllBytes(@"c:\result.docx");
var expectedresult = new WmlDocument("expected.docx", expected);
var actualDocument = new WmlDocument("result.docx", actual);
var comparisonSettings = new WmlComparerSettings();

var comparisonResults = WmlComparer.Compare(expectedresult, actualDocument, comparisonSettings);
var revisions = WmlComparer.GetRevisions(comparisonResults, comparisonSettings);

This will show you the differences between the two documents.


mgdq6dx15#

You really should extract the document into a string and diff that.
You only care about the text changes and not the formatting, right?
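One way to get at the text without Word installed is to read it straight out of the .docx package. The sketch below is only an illustration under some assumptions: the `DocxText` class and its method names are made up for this example, it reads only the main body part (`word/document.xml`), and it ignores headers, footnotes, tabs, line breaks and all formatting. Paragraphs nested inside text boxes would appear twice.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml.Linq;

// Hypothetical helper, not part of any library.
public static class DocxText
{
    // Main WordprocessingML namespace used by word/document.xml.
    private static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Returns one string per paragraph (w:p), concatenating its text runs (w:t).
    public static List<string> ReadParagraphs(byte[] docx)
    {
        using (var stream = new MemoryStream(docx))
        using (var zip = new ZipArchive(stream, ZipArchiveMode.Read))
        {
            var entry = zip.GetEntry("word/document.xml");
            if (entry == null)
            {
                throw new InvalidDataException("word/document.xml not found; is this a .docx file?");
            }

            using (var part = entry.Open())
            {
                var xml = XDocument.Load(part);
                return xml.Descendants(W + "p")
                          .Select(p => string.Concat(p.Descendants(W + "t").Select(t => t.Value)))
                          .ToList();
            }
        }
    }

    // True when both documents contain the same paragraph text, whatever the formatting.
    public static bool HaveSameText(byte[] docxA, byte[] docxB)
    {
        return ReadParagraphs(docxA).SequenceEqual(ReadParagraphs(docxB));
    }
}
```

Because the text is joined per paragraph, two documents whose runs are split differently (for example after a formatting change) still compare as equal.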


b1zrtrql6#

This function lets you compare two documents, or two versions of the same document, in C#.

// Assumes: using Word = Microsoft.Office.Interop.Word; and an ASP.NET controller that provides Ok().
// Nothing is awaited here, so the method is synchronous.
public object compare()
{
    Word.Application wordApp = new Word.Application();
    wordApp.Visible = false;
    object wordTrue = (object)true;
    object wordFalse = (object)false;
    object fileToOpen = @"Give your file path here";
    object missing = Type.Missing;
    Word.Document doc1 = wordApp.Documents.Open(ref fileToOpen,
           ref missing, ref wordFalse, ref wordFalse, ref missing,
           ref missing, ref missing, ref missing, ref missing,
           ref missing, ref missing, ref wordTrue, ref missing,
           ref missing, ref missing, ref missing);

    object fileToOpen1 = @"Give your file path here";
    Word.Document doc2 = wordApp.Documents.Open(ref fileToOpen1,
           ref missing, ref wordFalse, ref wordFalse, ref missing,
           ref missing, ref missing, ref missing, ref missing,
           ref missing, ref missing, ref missing, ref missing,
           ref missing, ref missing, ref missing);

    // Word-level comparison written to a new document, with every comparison option enabled.
    Word.Document doc = wordApp.CompareDocuments(doc1, doc2, Word.WdCompareDestination.wdCompareDestinationNew,
                        Word.WdGranularity.wdGranularityWordLevel,
                        true, true, true, true, true, true, true, true, true, true, "", true);

    doc1.Close(ref missing, ref missing, ref missing);
    doc2.Close(ref missing, ref missing, ref missing);

    // This hides both the original and the revised document; change it to suit your use case.
    wordApp.ActiveWindow.ShowSourceDocuments = Word.WdShowSourceDocuments.wdShowSourceDocumentsNone;

    wordApp.Visible = true;
    doc.Activate();

    return Ok("Compared Successfully");
}

xxhby3vn7#

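To compare Word documents, you need:
1. A library for manipulating Word documents, for example to read paragraphs, text, tables and so on from a Word file. You can try Office Interop, OpenXML, or Aspose.Words for .NET.
2. An algorithm or library that does the actual comparison of the text retrieved from the two Word documents. You can write your own or use DiffMatchPatch or a similar library.
This question is old, and more solutions such as GroupDocs Compare are available now.
Document Comparison by Aspose.Words for .NET is an open-source showcase project that uses Aspose.Words and DiffMatchPatch for the comparison.
I work as a developer evangelist at Aspose.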
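To give an idea of what the second step involves if you write it yourself, here is a minimal sketch of a paragraph-level diff. It is a plain longest-common-subsequence diff, not DiffMatchPatch, and the `ParagraphDiff` class is a made-up name for this example. It assumes the first step has already produced the text of each document as a list of paragraph strings.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of any library.
public static class ParagraphDiff
{
    // Returns one line per paragraph, prefixed with
    // "  " (unchanged), "- " (only in a) or "+ " (only in b).
    public static List<string> Diff(IReadOnlyList<string> a, IReadOnlyList<string> b)
    {
        // lcs[i, j] = length of the longest common subsequence of a[i..] and b[j..].
        var lcs = new int[a.Count + 1, b.Count + 1];
        for (int i = a.Count - 1; i >= 0; i--)
        {
            for (int j = b.Count - 1; j >= 0; j--)
            {
                lcs[i, j] = a[i] == b[j]
                    ? lcs[i + 1, j + 1] + 1
                    : Math.Max(lcs[i + 1, j], lcs[i, j + 1]);
            }
        }

        // Walk both lists, following the table to keep as many common paragraphs as possible.
        var result = new List<string>();
        int x = 0, y = 0;
        while (x < a.Count && y < b.Count)
        {
            if (a[x] == b[y])
            {
                result.Add("  " + a[x]);
                x++;
                y++;
            }
            else if (lcs[x + 1, y] >= lcs[x, y + 1])
            {
                result.Add("- " + a[x]);
                x++;
            }
            else
            {
                result.Add("+ " + b[y]);
                y++;
            }
        }
        while (x < a.Count) result.Add("- " + a[x++]);
        while (y < b.Count) result.Add("+ " + b[y++]);
        return result;
    }
}
```

The table takes memory proportional to the product of the two paragraph counts, which is fine for ordinary documents. For very large inputs or character-level diffs, a dedicated library is the better choice.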
