DataX MySQL reader and HDFS writer: some records are duplicated, but the duplication cannot be reproduced when importing only a single record

Posted by oxcyiej7 on 2021-11-29 in Java

Steps to reproduce:

Using the MySQL reader and the HDFS writer, the number of records read from MySQL is 258507, and DataX reports "读出记录总数:258507" (total records read: 258507).
I then checked the Hive table, which contains 259461 records, so I suspect some records were duplicated.
I ran this query in Hive: "select fskuid,count(*) from tablenamexxxx group by fskuid having count(*)>1". It listed some records, for example: 66 8 (the value of fskuid is 66, and it appears 8 times).
I then edited the DataX JSON config file, changed the MySQL reader's querySql to add "where fskuid=66", and re-ran the job.
This time I got only one record.
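
To make the numbers above easier to reason about, here is a minimal Python sketch of the same two checks: counting keys that occur more than once (the in-memory equivalent of the GROUP BY ... HAVING query) and computing how many surplus rows the target holds. The function names `duplicate_keys` and `surplus_rows` and the sample key list are hypothetical and only for illustration; they are not part of DataX or Hive.

```python
from collections import Counter


def duplicate_keys(keys):
    """Return {key: count} for every key that occurs more than once.

    Mirrors "group by fskuid having count(*) > 1" on an in-memory list.
    """
    counts = Counter(keys)
    return {key: n for key, n in counts.items() if n > 1}


def surplus_rows(source_count, target_count):
    """Number of rows in the target that are not accounted for by the source."""
    return target_count - source_count


if __name__ == "__main__":
    # Hypothetical sample: fskuid 66 appears 8 times, others once.
    sample = [66] * 8 + [67, 68]
    print(duplicate_keys(sample))

    # Counts reported in the post: 258507 read from MySQL, 259461 in Hive.
    print(surplus_rows(258507, 259461))
```

With the counts from the post, the Hive table holds 954 more rows than DataX read from MySQL, which is the total that the duplicated keys would have to account for.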

Source: https://github.com/alibaba/DataX/issues/305
