When I want to upload data to my "test cluster" in Apache Cassandra, I open a terminal and run:
export PATH=/home/mypc/dsbulk-1.7.0/bin:$PATH
source ~/.bashrc
dsbulk load -url /home/mypc/Desktop/test/file.csv -k keyspace_test -t table_test
But then...
At least 1 record does not match the provided schema.mapping or schema.query. Please check that the connector configuration and the schema configuration are correct.
Operation LOAD_20201105-103000-577734 aborted: Too many errors, the maximum allowed is 100.
total | failed | rows/s | p50ms | p99ms | p999ms | batches
104 | 104 | 0 | 0,00 | 0,00 | 0,00 | 0,00
Rejected records can be found in the following file(s): mapping.bad
Errors are detailed in the following file(s): mapping-errors.log
Last processed positions can be found in positions.txt
What does this mean? Why can't I load the data?
Thank you!
2 Answers
pbwdgjma1#
The error means that you didn't provide a mapping between the CSV data and the table. This can be done in two ways:
If the column names in the CSV file's header match the column names in Cassandra, pass -header true.
Otherwise, use the -m option (see the documentation) to map the CSV columns to Cassandra columns. There is a very good series of blog posts on the different aspects of using DSBulk:
https://www.datastax.com/blog/2019/03/datastax-bulk-loader-introduction-and-loading
https://www.datastax.com/blog/2019/04/datastax-bulk-loader-more-loading
https://www.datastax.com/blog/2019/04/datastax-bulk-loader-common-settings
https://www.datastax.com/blog/2019/06/datastax-bulk-loader-unloading
https://www.datastax.com/blog/2019/07/datastax-bulk-loader-counting
https://www.datastax.com/blog/2019/12/datastax-bulk-loader-examples-loading-other-locations
The first two cover data loading in great detail.
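The two approaches above can be sketched as follows, reusing the file path, keyspace, and table name from the question; the column names in the second command are placeholders you would replace with your actual table columns.

```shell
# Option 1: the CSV has a header row whose column names match
# the columns of the Cassandra table.
dsbulk load -url /home/mypc/Desktop/test/file.csv \
    -k keyspace_test -t table_test -header true

# Option 2: map CSV fields to table columns explicitly with -m.
# "col1", "col2", "col3" are hypothetical column names; DSBulk
# accepts index-based mappings like this when there is no header.
dsbulk load -url /home/mypc/Desktop/test/file.csv \
    -k keyspace_test -t table_test \
    -m '0=col1, 1=col2, 2=col3'
```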
vuv7lop32#
This means that the columns in the CSV input file don't match the columns of the table_test table. You can find the details in mapping-errors.log, so you know which columns are problematic. Since the CSV columns don't match the table schema, you need to provide the mapping by specifying the --schema.mapping flag. See the DSBulk common options page for details. You can also look at the schema mapping examples in this post. Cheers!
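As a rough sketch, the long-form flag works the same way as -m; the field/column pairs below are placeholders, and the real field names can be taken from mapping-errors.log.

```shell
# Hypothetical mapping from CSV header fields to table columns.
# Replace field_a/col_a etc. with the names reported in
# mapping-errors.log and your actual table schema.
dsbulk load -url /home/mypc/Desktop/test/file.csv \
    -k keyspace_test -t table_test \
    --schema.mapping 'field_a=col_a, field_b=col_b'
```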