Hive cannot create a table with a nested Avro schema

h9a6wy2h  posted on 2021-06-26  in Hive

I am trying to create a Hive table from a nested Avro schema, but it does not work. I am using Hive 1.1 on CDH 5.7.2.
Here is my nested Avro schema:

    [
      {
        "type": "record",
        "name": "Id",
        "namespace": "com.test.app_list",
        "doc": "Device ID",
        "fields": [
          {
            "name": "idType",
            "type": "int"
          },
          {
            "name": "id",
            "type": "string"
          }
        ]
      },
      {
        "type": "record",
        "name": "AppList",
        "namespace": "com.test.app_list",
        "doc": "",
        "fields": [
          {
            "name": "appId",
            "type": "string",
            "avro.java.string": "String"
          },
          {
            "name": "timestamp",
            "type": "long"
          },
          {
            "name": "idList",
            "type": [{"type": "array", "items": "com.test.app_list.Id"}]
          }
        ]
      }
    ]

And here is the SQL I use to create the table:

    CREATE EXTERNAL TABLE app_list
    ROW FORMAT SERDE
      'org.apache.hadoop.hive.serde2.avro.AvroSerDe'
    STORED AS INPUTFORMAT
      'org.apache.hadoop.hive.ql.io.avro.AvroContainerInputFormat'
    OUTPUTFORMAT
      'org.apache.hadoop.hive.ql.io.avro.AvroContainerOutputFormat'
    TBLPROPERTIES (
      'avro.schema.url'='/hive/schema/test_app_list.avsc');

But Hive gives me:

    FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.DDLTask. java.lang.RuntimeException: MetaException(message:org.apache.hadoop.hive.serde2.avro.AvroSerdeException Schema for table must be of type RECORD. Received type: UNION)

The Hive documentation says `Supports arbitrarily nested schemas.` (from https://cwiki.apache.org/confluence/display/hive/avroserde).

Sample data:

    {
      "appId": {"string": "com.test.app"},
      "timestamp": {"long": 1495893601606},
      "idList": {
        "array": [
          {"idType": 15, "id": "6c:5c:14:c3:a5:39"},
          {"idType": 13, "id": "eb297afe56ff340b6bb7de5c5ab09193"}
        ]
      }
    }

But I don't know how to make this work. I need some help solving this problem. Thanks!
jfewjypa1#

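The top level of the Avro schema for a Hive table must be a record type. The schema file in the question is a JSON array, which Avro parses as a union, hence `Received type: UNION`, and that is why Hive rejects it. A workaround is to make the top level a record and define the two records as fields inside it.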

    {
      "type": "record",
      "name": "myRecord",
      "namespace": "com.test.app_list",
      "fields": [
        {
          "name": "id",
          "type": {
            "type": "record",
            "name": "Id",
            "doc": "Device ID",
            "fields": [
              {
                "name": "idType",
                "type": "int"
              },
              {
                "name": "id",
                "type": "string"
              }
            ]
          }
        },
        {
          "name": "appList",
          "type": {
            "type": "record",
            "name": "AppList",
            "doc": "",
            "fields": [
              {
                "name": "appId",
                "type": "string",
                "avro.java.string": "String"
              },
              {
                "name": "timestamp",
                "type": "long"
              },
              {
                "name": "idList",
                "type": [{"type": "array", "items": "com.test.app_list.Id"}]
              }
            ]
          }
        }
      ]
    }

Every entry in `fields` needs its own `name` and `type`, so each record definition sits under a field's `type`. The field names `id` and `appList` are placeholders; pick whatever fits your data.
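
If `AppList` is meant to be the table's row type (an assumption; the question does not say so explicitly), another option is to keep `AppList` as the top-level record and inline the `Id` definition where it is first referenced. The Python sketch below does that rewrite on the schema from the question. `to_single_record` and `SCHEMA_KEYS` are hypothetical names made up for this illustration, and the helper only resolves fully qualified references such as `com.test.app_list.Id`.

```python
import json

# The union-style schema from the question (a top-level JSON array).
UNION_SCHEMA = json.loads("""
[
  {"type": "record", "name": "Id", "namespace": "com.test.app_list",
   "doc": "Device ID",
   "fields": [{"name": "idType", "type": "int"},
              {"name": "id", "type": "string"}]},
  {"type": "record", "name": "AppList", "namespace": "com.test.app_list",
   "doc": "",
   "fields": [{"name": "appId", "type": "string",
               "avro.java.string": "String"},
              {"name": "timestamp", "type": "long"},
              {"name": "idList",
               "type": [{"type": "array", "items": "com.test.app_list.Id"}]}]}
]
""")

# Keys whose values can contain nested schemas or type references.
SCHEMA_KEYS = ("type", "items", "values", "fields")


def full_name(record):
    """Return the namespace-qualified name of a record definition."""
    namespace = record.get("namespace")
    return f"{namespace}.{record['name']}" if namespace else record["name"]


def to_single_record(union_schema, root_name):
    """Hypothetical helper: rewrite a top-level union of records as one
    record rooted at `root_name`, inlining each referenced record at its
    first use. Later uses stay as name references, which Avro allows."""
    named = {full_name(r): r for r in union_schema}
    defined = {root_name}  # names whose definitions are already emitted

    def resolve(node):
        if isinstance(node, str):
            # A string is either a primitive type or a name reference.
            if node in named and node not in defined:
                defined.add(node)
                return resolve(named[node])
            return node
        if isinstance(node, list):
            return [resolve(item) for item in node]
        if isinstance(node, dict):
            return {key: resolve(value) if key in SCHEMA_KEYS else value
                    for key, value in node.items()}
        return node

    return resolve(named[root_name])


schema = to_single_record(UNION_SCHEMA, "com.test.app_list.AppList")
print(json.dumps(schema, indent=2))
```

The printed schema has a record at the top level, so it should get past the `must be of type RECORD` check; save it as the `.avsc` file that `avro.schema.url` points to. Existing data files must have been written with a compatible `AppList` schema for the table to be readable.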