Java PDFBox Preflight parser fails to detect a PDF/A-1b file

I am using the following code to check whether a file is a PDF/A-1b file:

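import java.io.File;
import java.io.IOException;

import org.apache.pdfbox.preflight.Format;
import org.apache.pdfbox.preflight.PreflightDocument;
import org.apache.pdfbox.preflight.ValidationResult;
import org.apache.pdfbox.preflight.parser.PreflightParser;

public boolean isPDF_A1BFile(File file) throws IOException {
    // Parse the file against the PDF/A-1b profile.
    PreflightParser parser = new PreflightParser(file);
    parser.parse(Format.PDF_A1B);
    PreflightDocument preflightDocument = parser.getPreflightDocument();
    preflightDocument.validate();

    ValidationResult validationResult = preflightDocument.getResult();
    return validationResult.isValid(); // returns false in every case
}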

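But it always returns false, whether or not the file is PDF/A-1b. The file I am testing is a PDF/A-1b file: I verified it with the Preflight tool in Acrobat, which reports that the file is PDF/A-1b compliant. Can anybody tell me what is wrong with my code, or what I am missing?
Also, is there any way to check whether a file complies with PDF/A-2b?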

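The file is tolerated by some PDF applications, since many of them silently repair such discrepancies, but PDFBox detects a lot of oddities in it. I have not spent much time on it, but the reported errors look potentially valid, so the file is potentially not compliant.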

The file Doc1-withHelvetica-pdfa1b.pdf is not a valid PDF/A-1b file, error(s) :
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 32264 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Length}:COSInt{8702};COSName{Subtype}:COSName{XML};COSName{Type}:COSName{Metadata};}; defined length=8702; actual length=8702, starting offset=23561
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 35134 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{2574};COSName{N}:COSInt{3};COSName{Range}:COSArray{COSFloat{0.0};COSFloat{1.0};0;1065353216;0;1065353216;};}; defined length=2574; actual length=2574, starting offset=32559
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 1562 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{202};}; defined length=202; actual length=202, starting offset=1359
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 4486 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Alternate}:COSName{DeviceRGB};COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{2612};COSName{N}:COSInt{3};}; defined length=2612; actual length=2612, starting offset=1873
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 4640 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{17};}; defined length=17; actual length=17, starting offset=4622
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 15067 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{10342};COSName{Length1}:COSInt{27968};}; defined length=10342; actual length=10342, starting offset=4724
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 16081 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{407};}; defined length=407; actual length=407, starting offset=15673
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 22792 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{6627};COSName{Length1}:COSInt{15080};}; defined length=6627; actual length=6627, starting offset=16164
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 23435 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{Length}:COSInt{355};}; defined length=355; actual length=355, starting offset=23079
1.2.2 : Body Syntax error, Expected 'EOL' before the endstream keyword at offset 822 but found '101'
1.2.5 : Body Syntax error, Stream length is invalid [dic=COSDictionary{COSName{Filter}:COSName{FlateDecode};COSName{I}:COSInt{93};COSName{Length}:COSInt{85};COSName{S}:COSInt{39};}; defined length=85; actual length=85, starting offset=736
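Every pair in the report above seems to describe the same condition. PDF/A-1 requires an EOL marker before the endstream keyword, and '101' is presumably the decimal byte value of 'e', the first letter of "endstream", meaning the stream data runs straight into the keyword. In every pair the 1.2.2 offset also equals the starting offset plus the declared length plus one (for example 736 + 85 + 1 = 822), which fits that reading. As a rough way to confirm this on a file, here is a minimal sketch that uses only the Java standard library. The class EndstreamEolCheck and its method names are hypothetical and not part of PDFBox, and the byte scan is naive: it does not parse the PDF, so an "endstream" byte sequence that happens to occur inside binary stream data would also be reported.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of PDFBox.
final class EndstreamEolCheck {

    private static final byte[] KEYWORD =
            "endstream".getBytes(StandardCharsets.US_ASCII);

    // Returns the 0-based byte offsets of every "endstream" keyword
    // that is not directly preceded by CR or LF.
    static List<Integer> findMissingEol(byte[] pdf) {
        List<Integer> offsets = new ArrayList<>();
        int i = 0;
        while (i + KEYWORD.length <= pdf.length) {
            if (!matchesAt(pdf, i)) {
                i++;
                continue;
            }
            boolean eolBefore = i > 0
                    && (pdf[i - 1] == '\n' || pdf[i - 1] == '\r');
            if (!eolBefore) {
                offsets.add(i);
            }
            i += KEYWORD.length; // skip past the keyword just examined
        }
        return offsets;
    }

    // True if the keyword bytes occur in data starting at pos.
    private static boolean matchesAt(byte[] data, int pos) {
        for (int k = 0; k < KEYWORD.length; k++) {
            if (data[pos + k] != KEYWORD[k]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        byte[] pdf = Files.readAllBytes(Paths.get(args[0]));
        for (int offset : findMissingEol(pdf)) {
            System.out.println("No EOL before endstream at offset " + offset);
        }
    }
}
```

It can be run against the file from the report, for instance with java EndstreamEolCheck.java Doc1-withHelvetica-pdfa1b.pdf on a JDK that supports single-file source launch. The offsets it prints are 0-based keyword positions and need not match the numbering PDFBox uses exactly.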

So, taking the report at face value, I simply rebuilt the file with "clean" in MuPDF and reran the validation in PDFBox.
C:\Apps\PDF\inspectors\Apache\preflight-app-3.0.0-alpha3.jar Doc1-withHelvetica-pdfa1ba.pdf
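The file Doc1-withHelvetica-pdfa1ba.pdf is a valid PDF/A-1b file
However, the catch-22 is that the rebuilt file now fails other validators, which report:
The PDF structure was damaged but has been repaired. Depending on the extent of the damage, some data may theoretically have been lost (although this is usually unlikely).
So I removed the PDF/A compliance and regenerated the file as PDF/A to see what the problem was. The report then showed at least one bad font definition, Calibri (not surprising, since the file was originally a Word document printout). What is not obvious is that there was a rogue Calibri space character at the end of the line containing the Helvetica bold text. Removing it led to other problems being reported, so I ran the file through an editor once more, and with all the cruft removed both validators finally agree that there are no more problems.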
