DB-GPT [Bug] [QWEN] The Excel file has Chinese column names and Chinese text, but after the file is analyzed the column names appear as Unicode escapes

Posted by mnemlml8 4 months ago in Other


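Search before asking

I searched the issues and found no similar issue.

Operating system information

Windows

Python version information

=3.11

DB-GPT version

main branch

Installation information

Device information

CPU

Model information

LLM: qwen

What happened

When the column names come through as Unicode escapes, the generated SQL fails to run. For example, with the context below: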

1.[column name] _ u5730 u5e02 u540d u79f0_
1.[description] City name, used to identify different geographic regions
2.[column name] 2024 u5e7404 u6708 u79fb u52a8 u65b0 u589e u7528 u6237 u6570 u91cf
2.[description] Number of new users in April 2024, reflecting new-user growth for that month
3.[column name] _ u622a u6b622024 u5e7404 u6708 u79fb u52a8 u65b0 u589e u7528 u6237 u6570 u91cf u7d2f u8ba1_
3.[description] Month-over-month decrease or increase in new users for April 2024, usually expressed as a percentage
4.[column name] 2024 u5e7405 u6708 u79fb u52a8 u65b0 u589e u7528 u6237 u6570 u91cf
4.[description] Number of new users in May 2024, used to track the monthly trend
5.[column name] _ u622a u6b622024 u5e7405 u6708 u79fb u52a8 u65b0 u589e u7528 u6237 u6570 u91cf u7d2f u8ba1_
SQL[SELECT DISTINCT "u5730" "u5e02" "u540d" "u79f0" FROM "excel_data" ].Error:Parser Error: syntax error at or near ""u540d"".
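
The parser error above can be reproduced outside DB-GPT. The sketch below is only an illustration and rests on assumptions: it uses Python's built-in sqlite3 rather than the engine DB-GPT uses, and the table contents ("杭州") are made-up sample data. The column name "地市名称" is what the fragments u5730 u5e02 u540d u79f0 decode to. A likely reading of the error is that the first quoted token is parsed as a column, the second as its alias, and the third is unexpected, which matches the error pointing at "u540d".

```python
import sqlite3

# In-memory table standing in for the imported Excel sheet (hypothetical data).
conn = sqlite3.connect(":memory:")
conn.execute('CREATE TABLE "excel_data" ("地市名称" TEXT)')
conn.execute('INSERT INTO "excel_data" VALUES (?)', ("杭州",))

# The SQL from the report: one column name split into four quoted fragments.
broken_sql = 'SELECT DISTINCT "u5730" "u5e02" "u540d" "u79f0" FROM "excel_data"'
try:
    conn.execute(broken_sql)
    broken_error = None
except sqlite3.OperationalError as exc:
    # The statement is rejected at parse time, before any column lookup.
    broken_error = str(exc)

# The same query with the real (unescaped) column name runs normally.
rows = conn.execute('SELECT DISTINCT "地市名称" FROM "excel_data"').fetchall()

print(broken_error)
print(rows)
```

So the SQL fails because the model was shown escaped fragments instead of the actual column name, not because the database cannot handle Chinese identifiers.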


Source: https://github.com/eosphoros-ai/DB-GPT/issues/1660

1 answer


f1tvaqid1#

DB-GPT/dbgpt/app/scene/chat_data/chat_excel/excel_learning/chat.py
Line 48 in 84988b8:
    "data_example": json.dumps(datas, cls=EnhancedJSONEncoder),
When calling json.dumps, add the argument ensure_ascii=False:

input_values = {
    "data_example": json.dumps(datas, cls=EnhancedJSONEncoder, ensure_ascii=False),
    "file_name": self.excel_reader.excel_file_name,
}
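
To see why this argument matters, here is a minimal standalone sketch using only the standard json module. The `datas` value is a hypothetical stand-in for the sample rows DB-GPT passes to the prompt (the city and number are invented), and DB-GPT's EnhancedJSONEncoder is left out so the example runs on its own. By default, json.dumps escapes every non-ASCII character as \uXXXX, so the prompt contains escape sequences rather than the Chinese column names.

```python
import json

# Hypothetical sample: a header row with Chinese column names plus one data row.
datas = [["地市名称", "2024年04月移动新增用户数量"], ["杭州", 1200]]

# Default behaviour (ensure_ascii=True): non-ASCII characters become \uXXXX.
escaped = json.dumps(datas)

# With ensure_ascii=False the characters are written out unchanged.
readable = json.dumps(datas, ensure_ascii=False)

print(escaped)
print(readable)

# Both strings decode to the same data; only the text shown to the LLM differs.
assert json.loads(escaped) == json.loads(readable)
```

The two outputs are equivalent JSON, so the change should not affect any code that parses the string. It only affects what the model reads in the prompt.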

Answered 4 months ago
