Is there an existing issue for the same bug?
- I have checked the existing issues.
Branch name
main
Commit ID
Other environment information
No response
Actual behavior
I ran into this problem:
PS C:\PYTHON\ragflow> python .\deepdoc\vision\t_ocr.py --inputs "C:\TMP\PDF_Test\Old_file.pdf"
[HUQIE]:Build default trie
[HUQIE]:Build trie C:\PYTHON\ragflow\rag/res\huqie.txt
[HUQIE]:Faild to build trie, C:\PYTHON\ragflow\rag/res\huqie.txt 'charmap' codec can't decode byte 0x9d in position 20: character maps to <undefined>
File "C:\PYTHON\ragflow\deepdoc\vision\t_ocr.py", line 56, in <module>
main(args)
File "C:\PYTHON\ragflow\deepdoc\vision\t_ocr.py", line 45, in main
f.write("\n".join([o["text"] for o in bxs]))
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\uff16' in position 581: character maps to <undefined>
when using this old PDF:
https://doi.org/10.1021/ie00095a027
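Both messages in the log above point at the same root cause: on this Windows setup Python falls back to cp1252 when no encoding is given, and cp1252 can neither decode byte 0x9d nor encode the character '\uff16' (FULLWIDTH DIGIT SIX). A minimal sketch that reproduces both failures in isolation, without ragflow; the helper names `can_encode` and `can_decode` are hypothetical and exist only for this illustration:

```python
# Sketch: check the two failures from the log without touching ragflow itself.
# can_encode / can_decode are illustrative helpers, not part of ragflow.

def can_encode(text: str, encoding: str) -> bool:
    """Return True if `text` can be encoded with `encoding`."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


def can_decode(data: bytes, encoding: str) -> bool:
    """Return True if `data` can be decoded with `encoding`."""
    try:
        data.decode(encoding)
        return True
    except UnicodeDecodeError:
        return False


fullwidth_six = "\uff16"  # the character named in the UnicodeEncodeError

cp1252_encodes = can_encode(fullwidth_six, "cp1252")
utf8_encodes = can_encode(fullwidth_six, "utf-8")
cp1252_decodes_9d = can_decode(b"\x9d", "cp1252")  # the byte named in the trie error

print(cp1252_encodes, utf8_encodes, cp1252_decodes_9d)
```

This suggests the huqie.txt trie message and the crash in t_ocr.py are two symptoms of the same missing `encoding` argument, one on read and one on write.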
Expected behavior
No response
Steps to reproduce
Adding encoding='utf-8' to the open() call in deepdoc\vision\t_ocr.py (line 45 in the traceback writes to this file) fixed it for me:
with open(outputs[i] + ".txt", "w+", encoding='utf-8') as f:
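For anyone who wants to check the workaround outside ragflow, here is a small self-contained sketch of the same idea: pass the encoding explicitly on both write and read, so the result no longer depends on the Windows default code page. The function names, the temporary directory, and the sample strings are assumptions made for this example, not code from t_ocr.py:

```python
import os
import tempfile


def write_text_utf8(path, lines):
    """Write lines joined by newlines, always as UTF-8 (hypothetical helper)."""
    with open(path, "w+", encoding="utf-8") as f:
        f.write("\n".join(lines))


def read_text_utf8(path):
    """Read the file back with the same explicit encoding (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as f:
        return f.read()


# Sample OCR-like text containing the fullwidth digit that broke cp1252.
sample_lines = ["page \uff16", "plain ascii"]

tmp_dir = tempfile.mkdtemp()
out_path = os.path.join(tmp_dir, "Old_file.pdf.txt")  # illustrative file name

write_text_utf8(out_path, sample_lines)
round_trip = read_text_utf8(out_path)
```

The same explicit `encoding="utf-8"` would presumably also help wherever huqie.txt is opened for reading, though I have not verified that part of the code.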
Additional information
ragflow is awesome! Thanks, everyone!
1 answer
hpcdzsge1#
Thanks for your attention.