Sqoop export using update key

Posted by nbnkbykc on 2021-06-04 in Hadoop


I have to export an HDFS file into MySQL.
Suppose my HDFS file is:

1,abcd,23
2,efgh,24
3,ijkl,25
4,mnop,26
5,qrst,27

Suppose my MySQL table schema is:

+-----+-----+-------------+
| ID  | AGE |    NAME     |
+-----+-----+-------------+
|     |     |             |
+-----+-----+-------------+

When I insert using the following Sqoop command:

sqoop export \
--connect jdbc:mysql://localhost/DBNAME \
--username root \
--password root \
--export-dir /input/abc \
--table test \
--fields-terminated-by "," \
--columns "id,name,age"

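It works fine and the data is inserted into the database.
But when I need to update records that already exist, I have to use --update-key together with --columns.
Now, when I try to update the table with the following command: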

sqoop export \
--connect jdbc:mysql://localhost/DBNAME \
--username root \
--password root \
--export-dir /input/abc \
--table test \
--fields-terminated-by "," \
--columns "id,name,age" \
--update-key id

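The problem I am facing is that the data is not updated into the columns as specified in --columns. Am I doing something wrong?
Can't we update the database this way? Does the HDFS file have to follow the MySQL schema's column order for updates to work?
Is there any other way to achieve this?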

hadoop hdfs sqoop2

Source: https://stackoverflow.com/questions/25887086/sqoop-export-using-update-key

4 answers


bybem2ql1#

4b. Update data from HDFS into a table in a relational database
Create the emp table in the MySQL test database

create table emp
(
id int not null primary key,
name varchar(50)
);

vi emp --> create a file with the following content

1,Thiru
2,Vikram
3,Brij
4,Sugesh

Move the file to HDFS

hadoop fs -put emp <dir>

Run the Sqoop job below to export the data to MySQL

sqoop export --connect <jdbc connection> \
--username sqoop \
--password sqoop \
--table emp \
--export-dir <dir> \
--input-fields-terminated-by ',';

Verify the data in the MySQL table

mysql> select * from emp;

+----+--------+
| id | name   |
+----+--------+
|  1 | Thiru  |
|  2 | Vikram |
|  3 | Brij   |
|  4 | Sugesh |
+----+--------+

Update the emp file and move the updated file to HDFS. Contents of the updated file:

1,Thiru
2,Vikram
3,Sugesh
4,Brij
5,Sagar

Sqoop export for upsert: if the key matches, update; otherwise insert.

sqoop export --connect <jdbc connection> \
--username sqoop \
--password sqoop \
--table emp \
--update-mode allowinsert \
--update-key id \
--export-dir <dir> \
--input-fields-terminated-by ',';

Note: --update-mode <mode> accepts two values. "updateonly" (the default) updates a record only if its update key matches an existing row.
To do an upsert (if the row exists, UPDATE; otherwise INSERT), use "allowinsert" mode.
Example:
--update-mode updateonly \ --> for updates
--update-mode allowinsert \ --> for upsert

Verify the result:

mysql> select * from emp;
+----+--------+
| id | name   |
+----+--------+
|  1 | Thiru  |
|  2 | Vikram |
|  3 | Sugesh |--> Previous value "Brij"
|  4 | Brij   |--> Previous value "Sugesh"
|  5 | Sagar  |--> new value inserted
+----+--------+
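
The result above is what upsert semantics would predict. As a rough local illustration (no Sqoop or MySQL involved), the sketch below applies the updated file to the original rows with awk. The helper name apply_export and the temporary CSV files are assumptions made for this example only.

```shell
#!/bin/sh
# Local simulation only: mimics update-only vs. upsert behaviour on CSV text.
# apply_export is a hypothetical helper, not part of Sqoop.
# usage: apply_export <updateonly|allowinsert> <table_csv> <incoming_csv>
apply_export() {
  awk -F, -v OFS=, -v mode="$1" -v table="$2" '
    # Load the current table rows, keyed by id (first field).
    BEGIN {
      while ((getline line < table) > 0) { split(line, f, ","); row[f[1]] = f[2] }
    }
    # A matching key is always updated; a new key is stored only when upserting.
    ($1 in row) || mode == "allowinsert" { row[$1] = $2 }
    END { for (id in row) print id, row[id] }
  ' "$3" | sort -t, -k1,1n
}

# Demo with the emp data from this answer.
tmp=$(mktemp -d)
printf '1,Thiru\n2,Vikram\n3,Brij\n4,Sugesh\n' > "$tmp/table.csv"
printf '1,Thiru\n2,Vikram\n3,Sugesh\n4,Brij\n5,Sagar\n' > "$tmp/update.csv"

apply_export allowinsert "$tmp/table.csv" "$tmp/update.csv"  # ids 3 and 4 swapped, id 5 inserted
apply_export updateonly  "$tmp/table.csv" "$tmp/update.csv"  # ids 3 and 4 swapped, id 5 skipped
```

With allowinsert the output matches the table shown above; with updateonly the row for id 5 is absent.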


wvt8vs2t2#

You may want to try --input-fields-terminated-by. You are currently using --fields-terminated-by, which is meant for imports.


btqmn9zl3#

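Actually, I tried this in Sqoop in several ways. --update-key can only update rows that already exist in the table and cannot insert new ones, unless you also set --update-mode to allowinsert (not all databases support this mode). If you do run an update with --update-key, it updates the rows whose key matches the column named in --update-key.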
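
To make the update-only behaviour concrete: in that mode each input record is conceptually turned into one UPDATE statement keyed on the update column, and a statement that matches no row changes nothing. The sketch below prints such statements for the question's data. The helper to_update_sql is hypothetical, and the SQL text is an approximation for illustration, not what Sqoop literally sends (it builds its own prepared statements).

```shell
#!/bin/sh
# to_update_sql is a hypothetical helper for illustration only.
# usage: to_update_sql <table> <key_column> <col1,col2,...>   (CSV on stdin)
to_update_sql() {
  awk -F, -v table="$1" -v key="$2" -v cols="$3" '
    # Quote non-numeric values; illustration only, no SQL escaping.
    function q(v) { return (v ~ /^-?[0-9]+$/) ? v : "\047" v "\047" }
    BEGIN { n = split(cols, name, ",") }
    {
      assign = ""; sep = ""; cond = ""
      for (i = 1; i <= n; i++) {
        if (name[i] == key) { cond = name[i] "=" q($i); continue }
        assign = assign sep name[i] "=" q($i)
        sep = ", "
      }
      print "UPDATE " table " SET " assign " WHERE " cond ";"
    }
  '
}

# Fields arrive as id,name,age, matching --columns "id,name,age" in the question.
printf '1,abcd,23\n2,efgh,24\n' | to_update_sql test id id,name,age
```

If no row with a given id exists, the corresponding UPDATE simply affects zero rows, which is why new records do not appear unless allowinsert is used.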


vu8f3i0k4#

Try --update-key primary_key:
```
sqoop export --connect jdbc:mysql://localhost/DBNAME --username root --password root --export-dir /input/abc --table test --fields-terminated-by "," --update-key id
```

It worked for me. It updates all records that match the primary key (it may not insert new data).
Use `--update-mode updateonly/allowinsert` wisely.
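
Note that the command above omits --columns, in which case Sqoop maps the file's fields onto the table's columns in the table's own order. A file laid out as id,name,age would then not line up with a table declared as ID, AGE, NAME. One possible workaround, assuming that layout mismatch is the issue, is to reorder the fields before exporting. The helper reorder_fields below is a hypothetical name used only for this sketch.

```shell
#!/bin/sh
# reorder_fields is a hypothetical helper: it rewrites id,name,age lines
# as id,age,name so the field order follows the table's column order.
reorder_fields() {
  awk -F, -v OFS=, '{ print $1, $3, $2 }'
}

# Two sample lines from the question's HDFS file.
printf '1,abcd,23\n2,efgh,24\n' | reorder_fields
```

On a cluster the same filter could sit between `hadoop fs -cat /input/abc/*` and `hadoop fs -put - <new dir>/part-00000`, where the target directory is a placeholder you would choose yourself, and the export would then point --export-dir at that directory.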

