ragflow [Bug]: UnicodeEncodeError: 'charmap' codec can't encode character

imzjd6km  posted 3 months ago in Other

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Branch name

main

Commit ID

3ae8a87

Other environment information

  • No response

Actual behavior

I ran into this issue:

PS C:\PYTHON\ragflow> python .\deepdoc\vision\t_ocr.py --inputs "C:\TMP\PDF_Test\Old_file.pdf"
[HUQIE]:Build default trie
[HUQIE]:Build trie C:\PYTHON\ragflow\rag/res\huqie.txt
[HUQIE]:Faild to build trie,  C:\PYTHON\ragflow\rag/res\huqie.txt 'charmap' codec can't decode byte 0x9d in position 20: character maps to <undefined>
File "C:\PYTHON\ragflow\deepdoc\vision\t_ocr.py", line 56, in <module>
main(args)
File "C:\PYTHON\ragflow\deepdoc\vision\t_ocr.py", line 45, in main
f.write("\n".join([o["text"] for o in bxs]))
File "C:\Program Files\WindowsApps\PythonSoftwareFoundation.Python.3.11_3.11.2544.0_x64__qbz5n2kfra8p0\Lib\encodings\cp1252.py", line 19, in encode
return codecs.charmap_encode(input,self.errors,encoding_table)[0]
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
UnicodeEncodeError: 'charmap' codec can't encode character '\uff16' in position 581: character maps to <undefined>

This happens when using this old PDF:
https://doi.org/10.1021/ie00095a027
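
The encode failure itself can be reproduced without ragflow or the PDF. The following is a minimal sketch, assuming the default file encoding is cp1252, as the traceback shows. The sample string and the helper name `can_encode` are made up for illustration:

```python
# '\uff16' is FULLWIDTH DIGIT SIX, the character named in the traceback.
sample = "page \uff16"  # hypothetical sample text, not taken from the PDF


def can_encode(text: str, encoding: str) -> bool:
    """Return True if every character in text can be encoded with the codec."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# cp1252 has no mapping for U+FF16, while UTF-8 can encode any code point
# that appears in ordinary text.
print("cp1252:", can_encode(sample, "cp1252"))
print("utf-8:", can_encode(sample, "utf-8"))
```

When `open()` is called in text mode without an `encoding` argument, Python normally falls back to the locale's preferred encoding, which is why the same script can behave differently on Windows than on Linux or macOS.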

Expected behavior

  • No response

Steps to reproduce

Adding encoding='utf-8' to the open() call fixed it for me:

with open(outputs[i] + ".txt", "w+", encoding='utf-8') as f:
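
The workaround above can be tried in isolation. This is only a sketch under stated assumptions, not the actual t_ocr.py code: the helper `write_lines_utf8`, the sample `bxs` list, and the temporary output path are all hypothetical stand-ins.

```python
import os
import tempfile


def write_lines_utf8(path: str, lines: list[str]) -> None:
    """Write lines joined by newlines, always encoded as UTF-8."""
    # Passing encoding= explicitly avoids the locale default
    # (cp1252 in the traceback above).
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(lines))


# Hypothetical stand-in for the OCR result: a list of dicts with a "text" key.
bxs = [{"text": "Table \uff16"}, {"text": "plain ASCII"}]

# Write to a throwaway directory rather than next to a real PDF.
out_path = os.path.join(tempfile.mkdtemp(), "Old_file.pdf.txt")
write_lines_utf8(out_path, [o["text"] for o in bxs])

# Read the file back with the same encoding to confirm the round trip.
with open(out_path, encoding="utf-8") as f:
    round_trip = f.read()
```

Reading the file back requires the same `encoding="utf-8"` argument. Otherwise the reader falls back to the locale default again and may fail or produce mojibake.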

Additional information

ragflow is really great! Thanks, everyone!
