Hive query returns no data

Posted by sycxhyv7 on 2021-06-26 in Hive
    CREATE EXTERNAL TABLE invoiceitems (
      InvoiceNo INT,
      StockCode INT,
      Description STRING,
      Quantity INT,
      InvoiceDate BIGINT,
      UnitPrice DOUBLE,
      CustomerID INT,
      Country STRING,
      LineNo INT,
      InvoiceTime STRING,
      StoreID INT,
      TransactionID STRING
    )
    ROW FORMAT SERDE 'org.apache.hive.hcatalog.data.JsonSerDe'
    LOCATION 's3a://streamingdata/data/*';

The data files were created by a Spark Structured Streaming job:

    ...
    data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json 7.1 KB 29/08/2018 10:27:32 PM
    data/part-00000-0075634b-8513-47b3-b5f8-19df8269cf9d-c000.json 1.3 KB 30/08/2018 10:47:32 AM
    data/part-00000-00b6b230-8bb3-49d1-a42e-ad768c1f9a94-c000.json 2.3 KB 30/08/2018 1:25:02 AM
    ...

Here are the first few lines of the first file:

    {"InvoiceNo":5421462,"StockCode":22426,"Description":"ENAMEL WASH BOWL CREAM","Quantity":8,"InvoiceDate":1535578020000,"UnitPrice":3.75,"CustomerID":13405,"Country":"United Kingdom","LineNo":6,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"542146260180829"}
    {"InvoiceNo":5501932,"StockCode":22170,"Description":"PICTURE FRAME WOOD TRIPLE PORTRAIT","Quantity":4,"InvoiceDate":1535578020000,"UnitPrice":6.75,"CustomerID":13952,"Country":"United Kingdom","LineNo":26,"InvoiceTime":"21:27:00","StoreID":0,"TransactionID":"5501932260180829"}

However, when I run a query, no data is returned:

    hive> select * from invoiceitems limit 5;
    OK
    Time taken: 24.127 seconds

The Hive log directories are empty:

    $ ls /var/log/hive*
    /var/log/hive:
    /var/log/hive-hcatalog:
    /var/log/hive2:

How can I debug this further?

Answer #1, by dzjeubhm

I got more hints about the error when I ran:

    select count(*) from invoiceitems;

This returned the following error:

    ...
    DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
    FAILED: Execution Error, return code 2 from org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1535521291031_0011_00, diagnostics=[Vertex vertex_1535521291031_0011_00 [Map 1] killed/failed due to:ROOT_INPUT_INIT_FAILURE, Vertex Input: invoiceitems initializer failed, vertex=vertex_1535521291031_0011_00 [Map 1], java.io.IOException: cannot find dir = s3a://streamingdata/data/part-00000-006fc42a-c6a1-42a2 in pathToPartitionInfo: [s3a://streamingdata/data/part-00000-006fc42a-c6a1-42a2-af03-ae0c326b40bd-c000.json]

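I decided to change the LOCATION clause of the CREATE TABLE definition from:

    LOCATION 's3a://streamingdata/data/*';

to:

    LOCATION 's3a://streamingdata/data/';

That solved the problem. Hive expects LOCATION to name a directory and does not expand wildcards in it, so the trailing `*` had to go.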
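
If the table already exists, the same location change can probably be applied without dropping and recreating it. The following is a minimal sketch, not part of the original answer. It assumes the table name and bucket path from the question, and a Hive version that supports `ALTER TABLE ... SET LOCATION` and the `INPUT__FILE__NAME` virtual column:

```sql
-- Point the existing external table at the directory itself (no wildcard).
ALTER TABLE invoiceitems SET LOCATION 's3a://streamingdata/data/';

-- Confirm that the Location field now shows the directory path.
DESCRIBE FORMATTED invoiceitems;

-- Check which files Hive actually reads rows from.
SELECT INPUT__FILE__NAME, COUNT(*) AS row_count
FROM invoiceitems
GROUP BY INPUT__FILE__NAME
LIMIT 5;
```

If the last query lists the `part-*.json` files with non-zero counts, the table is reading the streaming output as intended.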
