Sqoop date to Impala timestamp

Posted by fdbelqdn on 2021-05-29 in Hadoop


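I use Sqoop to import data from MySQL into HDFS as Parquet files that Impala reads. The problem is converting the MySQL DATE type into an Impala TIMESTAMP.
Running COMPUTE STATS on the table, or SELECT *, fails in Impala with this error message: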

File 'hdfs://....parquet'
has an incompatible type with the table schema for column 'day'.
Expected type: INT32.  Actual type: INT64

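Changing the data type of the day column to BIGINT or STRING in Impala produces the same error message.
Even if I change the day column to STRING in Impala and set --map-column-java "day=String" in Sqoop (I also tried day=Integer and day=Long), I still get: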

Expected type: INT32.  Actual type: BYTE_ARRAY in Impala

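I also tried setting mapdatetotimestamp=false (and true) in the JDBC connection string, with no effect.
What is the way to convert a MySQL DATE into an Impala TIMESTAMP with Sqoop when using Parquet files (note that I am not using Avro)?
(My Sqoop version is 1.4.5-cdh5.3.3.) The Sqoop command looks like this: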

sqoop import \
  --connect jdbc:mysql://adress/db \
  --username name \
  --password pass \
  --table tableName \
  --target-dir dir \
  --as-parquetfile -m 1 \
  --driver com.mysql.jdbc.Driver

Edit: I tried using

--query "SELECT UNIX_TIMESTAMP(STR_TO_DATE(day, '%Y-%m-%d'))

but Impala sees it as INT64 and expects INT96. So a follow-up question is: how can it be converted to INT96?
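
For context on the INT64 versus INT96 mismatch above, here is a minimal Python sketch of the 12-byte INT96 timestamp layout that Impala and Hive conventionally use in Parquet: 8 bytes of nanoseconds within the day followed by 4 bytes of Julian day number, both little-endian. The function names `to_int96` and `from_int96` are hypothetical, and the layout is stated as an assumption based on that convention. The sketch does not make Sqoop emit INT96; it only illustrates why a plain INT64 epoch value is not interchangeable with it.

```python
import struct
from datetime import date, datetime, time, timedelta

# Julian day number of 1970-01-01 (the Unix epoch).
JULIAN_DAY_OF_UNIX_EPOCH = 2440588
_EPOCH_ORDINAL = date(1970, 1, 1).toordinal()


def to_int96(ts: datetime) -> bytes:
    """Pack a naive datetime into the assumed INT96 layout (hypothetical helper)."""
    days_since_epoch = ts.date().toordinal() - _EPOCH_ORDINAL
    julian_day = JULIAN_DAY_OF_UNIX_EPOCH + days_since_epoch
    seconds_of_day = ts.hour * 3600 + ts.minute * 60 + ts.second
    nanos_of_day = (seconds_of_day * 1_000_000 + ts.microsecond) * 1000
    # "<qi": little-endian, 8-byte signed nanos, then 4-byte signed Julian day.
    return struct.pack("<qi", nanos_of_day, julian_day)


def from_int96(raw: bytes) -> datetime:
    """Unpack the 12-byte value back into a naive datetime (hypothetical helper)."""
    nanos_of_day, julian_day = struct.unpack("<qi", raw)
    day = date.fromordinal(_EPOCH_ORDINAL + julian_day - JULIAN_DAY_OF_UNIX_EPOCH)
    return datetime.combine(day, time()) + timedelta(microseconds=nanos_of_day // 1000)
```

A MySQL DATE has no time part, so its nanoseconds-of-day field would be zero and only the Julian day would vary.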

hadoop mysql impala sqoop timestamp

Source: https://stackoverflow.com/questions/34680975/sqoop-date-to-impala-timestamp

3 Answers


brqmpdu1 #1

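It seems the only way is to use --map-column-java "day=String", and then either keep the column as a STRING in Impala and use the cast() function when querying by date, or use a temporary table.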
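
As a rough illustration of that workaround, the sketch below assembles the import command from the question with the string mapping added. The helper name `build_sqoop_import` is hypothetical, and the connection string, credentials, table, and directory are the placeholders from the question rather than real values. The list form is chosen so that it can be passed to `subprocess.run` without shell quoting issues.

```python
from typing import List


def build_sqoop_import(column: str = "day") -> List[str]:
    """Return the question's sqoop import command with `column` mapped to a Java String.

    Hypothetical helper; every value below is a placeholder from the question.
    """
    return [
        "sqoop", "import",
        "--connect", "jdbc:mysql://adress/db",
        "--username", "name",
        "--password", "pass",
        "--table", "tableName",
        "--target-dir", "dir",
        "--as-parquetfile",
        "-m", "1",
        "--driver", "com.mysql.jdbc.Driver",
        # Ask Sqoop to map the DATE column to a Java String.
        "--map-column-java", f"{column}=String",
    ]


if __name__ == "__main__":
    # Print the command instead of running it; on a machine with Sqoop
    # installed it could be run with subprocess.run(args, check=True).
    print(" ".join(build_sqoop_import()))
```

The matching Impala column would then be declared as STRING and cast to a timestamp at query time, as the answer describes.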

2021-05-30

exdqitrt #2

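Our usual practice is that when importing with Sqoop or from other external systems, every column lands as a string in the landing (staging) table. After landing, we use

from_unixtime(unix_timestamp(<datecol>,'yyyyMMdd'),'yyyy-MM-dd')

to convert the value to an Impala-specific timestamp format.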
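
To make the effect of that expression concrete, here is a small Python sketch of the same string transformation: parse a `yyyyMMdd` string and render it as `yyyy-MM-dd`. The function name `reformat_date` is hypothetical, and the sketch assumes the landed value really is an 8-digit date string; it only mirrors the formatting logic and is not how Impala evaluates the expression.

```python
from datetime import datetime


def reformat_date(raw: str) -> str:
    """Parse a yyyyMMdd string and return it as yyyy-MM-dd (hypothetical helper)."""
    # strptime raises ValueError if the input is not a valid yyyyMMdd date.
    return datetime.strptime(raw, "%Y%m%d").strftime("%Y-%m-%d")


if __name__ == "__main__":
    print(reformat_date("20160108"))
```

The result is still a string, so a cast to TIMESTAMP would presumably still be needed in queries that compare or filter by date.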

2021-05-30

628mspwn #3

Have you tried INT96? Also, I don't think you can convert TINYINT or SMALLINT columns to BIGINT.

2021-05-29
