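When I run a "create table as select" query in the Hive CLI, the table is created but no data is populated. When I run the same query in Hive Beeswax, the target table is populated with data.
Here is the query: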
# Date literals use double quotes because the whole query sits inside a single-quoted shell string.
hive -e '
create table table_validation as
select
  listing_id, city, area, expected_amount_inr, property_id, house_type,
  case when area_builtup_sqft is NULL or area_builtup_sqft = 0 or area_builtup_sqft = " "
       then plot_area else area_builtup_sqft end as area_sqft,
  case when area_builtup_sqft is NULL or area_builtup_sqft = 0 or area_builtup_sqft = " "
       then expected_amount_inr/plot_area else expected_amount_inr/area_builtup_sqft end as price_sqft,
  listing_state,
  case when house_type like "apartment" then "apartment"
       when house_type like "plot" then "plot"
       else "others" end as property_type,
  case when house_type like "plot" then "NA"
       when num_bedrooms between 1 and 1.9 then 1
       when num_bedrooms between 2 and 2.9 then 2
       when num_bedrooms between 3 and 3.9 then 3
       when num_bedrooms >= 4 then 4
       else num_bedrooms end as number_bedrooms
from realestate_listing_main
where listing_type LIKE "rent"
  and added_on between "2015-02-01" and "2015-03-31"
' --database default;
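A note on quoting when a query is passed through hive -e '...': any single quote inside the query ends the shell string early, so a date literal written as '2015-02-01' reaches Hive without its quotes. The sketch below only shows what the shell passes along; show_arg is a hypothetical helper, not part of Hive.

```shell
# Print the single argument exactly as the shell passes it to a program.
show_arg() { printf '%s\n' "$1"; }

# Nested single quotes: the inner quotes close and reopen the string,
# so the date literals arrive unquoted.
nested=$(show_arg 'added_on between '2015-02-01' and '2015-03-31'')

# Double quotes inside the single-quoted string are passed through untouched.
safe=$(show_arg 'added_on between "2015-02-01" and "2015-03-31"')

echo "$nested"
echo "$safe"
```

Without the quotes, Hive would most likely read 2015-02-01 as integer subtraction rather than a date string, which can make the filter match no rows. Beeswax takes the query text directly, with no shell in between, so the same statement can behave differently there.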
When I run this query, I get the following output:
running hive query
0 2015-03-31 18:40:41,025 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.input.dir.recursive is deprecated. Instead, use mapreduce.input.fileinputformat.input.dir.recursive
2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.max.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.maxsize
2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize
2015-03-31 18:40:41,030 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size.per.rack is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.rack
2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.min.split.size.per.node is deprecated. Instead, use mapreduce.input.fileinputformat.split.minsize.per.node
2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.reduce.tasks is deprecated. Instead, use mapreduce.job.reduces
2015-03-31 18:40:41,031 INFO [main] Configuration.deprecation (Configuration.java:warnOnceIfDeprecated(1011)) - mapred.reduce.tasks.speculative.execution is deprecated. Instead, use mapreduce.reduce.speculative
2015-03-31 18:40:41,336 WARN [main] conf.HiveConf (HiveConf.java:initialize(1155)) - DEPRECATED: Configuration property hive.metastore.local no longer has any effect. Make sure to provide a valid value for hive.metastore.uris if you are connecting to a remote metastore.
Logging initialized using configuration in jar:file:/usr/lib/hive/lib/hive-common-0.12.0-cdh5.1.2.jar!/hive-log4j.properties
OK
Time taken: 0.621 seconds
Total MapReduce jobs = 3
Launching Job 1 out of 3
Number of reduce tasks is set to 0 since there's no reduce operator
Starting Job = job_1427789583342_0014, Tracking URL = http://ip-10-172-133-249.ap-southeast-1.compute.internal:8088/proxy/application_1427789583342_0014/
Kill Command = /usr/lib/hadoop/bin/hadoop job -kill job_1427789583342_0014
Hadoop job information for Stage-1: number of mappers: 10; number of reducers: 0
2015-03-31 18:40:59,849 Stage-1 map = 0%, reduce = 0%
2015-03-31 18:41:10,188 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:11,219 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:12,252 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:13,289 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:14,321 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:15,357 Stage-1 map = 10%, reduce = 0%, Cumulative CPU 5.86 sec
2015-03-31 18:41:16,393 Stage-1 map = 35%, reduce = 0%, Cumulative CPU 39.78 sec
2015-03-31 18:41:17,428 Stage-1 map = 40%, reduce = 0%, Cumulative CPU 41.17 sec
2015-03-31 18:41:18,460 Stage-1 map = 45%, reduce = 0%, Cumulative CPU 43.26 sec
2015-03-31 18:41:19,499 Stage-1 map = 67%, reduce = 0%, Cumulative CPU 49.68 sec
2015-03-31 18:41:20,536 Stage-1 map = 70%, reduce = 0%, Cumulative CPU 50.49 sec
2015-03-31 18:41:21,569 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:22,598 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:23,627 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:24,655 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:25,684 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:26,714 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:27,743 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:28,773 Stage-1 map = 80%, reduce = 0%, Cumulative CPU 56.28 sec
2015-03-31 18:41:29,803 Stage-1 map = 85%, reduce = 0%, Cumulative CPU 61.88 sec
2015-03-31 18:41:30,840 Stage-1 map = 90%, reduce = 0%, Cumulative CPU 63.8 sec
2015-03-31 18:41:31,872 Stage-1 map = 90%, reduce = 0%, Cumulative CPU 63.8 sec
2015-03-31 18:41:32,905 Stage-1 map = 95%, reduce = 0%, Cumulative CPU 69.86 sec
2015-03-31 18:41:33,935 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.58 sec
2015-03-31 18:41:34,964 Stage-1 map = 100%, reduce = 0%, Cumulative CPU 71.58 sec
MapReduce Total cumulative CPU time: 1 minutes 11 seconds 580 msec
Ended Job = job_1427789583342_0014
Stage-4 is selected by condition resolver.
Stage-3 is filtered out by condition resolver.
Stage-5 is filtered out by condition resolver.
Moving data to: hdfs://ip-10-172-133-249.ap-southeast-1.compute.internal:8020/tmp/hive-root/hive_2015-03-31_18-40-42_689_38529489390850959-1/-ext-10001
Moving data to: hdfs://ip-10-172-133-249.ap-southeast-1.compute.internal:8020/user/hive/warehouse/default.db/table_validation
Table default.table_validation stats: [num_partitions: 0, num_files: 10, num_rows: 0, total_size: 0, raw_data_size: 0]
MapReduce Jobs Launched:
Job 0: Map: 10 Cumulative CPU: 71.58 sec HDFS Read: 2635527679 HDFS Write: 0 SUCCESS
Total MapReduce CPU Time Spent: 1 minutes 11 seconds 580 msec
OK
Time taken: 52.896 seconds
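It does not execute the second and third jobs. But when I run the query in Hive Beeswax, all of the jobs are executed and the table is created with data.
Please let me know what I am missing. I have been stuck on this for the past three days.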
1 Answer
bcs8qyzn1#
I found the answer. serde.jar needs to be added before running the query; without this jar, Hive cannot recognize the data.
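As a minimal sketch of that fix, assuming the jar is added per session with Hive's ADD JAR statement (the answer does not say how it was added), the shell helpers below prepend ADD JAR to a query before handing it to hive -e. The path /path/to/serde.jar and the function names with_serde_jar and run_hive_query are hypothetical placeholders.

```shell
# Hypothetical location of the SerDe jar; point this at the real file.
SERDE_JAR="${SERDE_JAR:-/path/to/serde.jar}"

# Build the HiveQL text: ADD JAR first, then the caller's query.
with_serde_jar() {
  printf 'ADD JAR %s;\n%s\n' "$SERDE_JAR" "$1"
}

# Run a query through the Hive CLI in the default database,
# with the jar registered for that session.
run_hive_query() {
  hive --database default -e "$(with_serde_jar "$1")"
}

# Usage (needs a working Hive installation):
#   run_hive_query 'select count(*) from realestate_listing_main;'
```

If the jar should be available to every CLI session instead, the hive --auxpath option or the hive.aux.jars.path property may be a better fit. Beeswax may already have the jar configured on the server side, which would explain why the same query worked there.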