hive在选择特定列时打印空值

3xiyfsfu  于 2021-05-29  发布在  Hadoop
关注(0)|答案(3)|浏览(367)

我有一张Hive桌。我正在为配置单元表使用json数据。当我选择整个表时,它对我有效。如果我选择一个特定的列,它将打印空值。
数据是这样的

{"page_1":"{\"city\":\"Bangalore\",\"locality\":\"Battarahalli\",\"Name_of_Person\":\"xxx\",\"User_email_address\":\"test@gmail.com\",\"user_phone_number\":\"\",\"sub_locality\":\"\",\"street_name\":\"7th Cross Road, Near Reliance Fresh, T.c Palya,\",\"home_plot_no\":\"45\",\"pin_code\":\"560049\",\"project_society_build_name\":\"Sunshine Layout\",\"landmark_reference_1\":\"\",\"landmark_reference_2\":\"\",\"No_of_Schools\":20,\"No_of_Hospitals\":20,\"No_of_Metro\":0,\"No_of_Mall\":11,\"No_of_Park\":10,\"Distance_of_schools\":1.55,\"Distance_of_Hospitals\":2.29,\"Distance_of_Metro\":0,\"Distance_of_Mall\":1.55,\"Distance_of_Park\":2.01,\"lat\":13.0243273,\"lng\":77.7077906,\"ipinfo\":{\"ip\":\"113.193.30.130\",\"hostname\":\"No Hostname\",\"city\":\"\",\"region\":\"\",\"country\":\"IN\",\"loc\":\"20.0000,77.0000\",\"org\":\"AS45528 Tikona Digital Networks Pvt Ltd.\"}}","page_2":"{\"home_type\":\"Flat\",\"area\":\"1350\",\"beds\":\"3 BHK\",\"bath_rooms\":2,\"building_age\":\"1\",\"floors\":2,\"balcony\":2,\"amenities\":\"premium\",\"amenities_options\":{\"gated_security\":\"\",\"physical_security\":\"\",\"cctv_camera\":\"\",\"controll_access\":\"\",\"elevator\":true,\"power_back_up\":\"\",\"parking\":true,\"partial_parking\":\"\",\"onsite_maintenance_store\":\"\",\"open_garden\":\"\",\"party_lawn\":\"\",\"amenities_balcony\":\"\",\"club_house\":\"\",\"fitness_center\":\"\",\"swimming_pool\":\"\",\"party_hall\":\"\",\"tennis_court\":\"\",\"basket_ball_court\":\"\",\"squash_coutry\":\"\",\"amphi_theatre\":\"\",\"business_center\":\"\",\"jogging_track\":\"\",\"convinience_store\":\"\",\"guest_rooms\":\"\"},\"interior\":\"regular\",\"interior_options\":{\"tiles\":true,\"marble\":\"\",\"wooden\":\"\",\"modular_kitchen\":\"\",\"partial_modular_kitchen\":\"\",\"gas_pipe\":\"\",\"intercom_system\":\"\",\"air_conditioning\":\"\",\"partial_air_conditioning\":\"\",\"wardrobe\":\"\",\"sanitation_fixtures\":\"\",\"false_ceiling\":\"\",\"partial_false_ceiling\":\"\",\"recessed_lighting\":\"\"},\"location\":\"regular\",\"location_options\":{\"good_view\":true,\"transporation_hub\":true,\"shopping_center\":\"\",\"hospital\":\"\",\"school\":\"\",\"ample_parking\":\"\",\"park\":\"\",\"temple\":\"\",\"bank\":\"\",\"less_congestion\":\"\",\"less_pollution\":\"\"},\"maintenance\":\"\",\"maintenance_value\":\"\",\"near_by\":{\"school\":\"\",\"hospital\":\"\",\"mall\":\"\",\"park\":\"\",\"metro\":\"\",\"Near_by_school\":\"Little Champ Gurukulam Pre School \\\/ 1.52 km\",\"Near_by_hospital\":\"Suresh Hospital \\\/ 2.16 km\",\"Near_by_mall\":\"LORVEN LEO \\\/ 2.13 km\",\"Near_by_park\":\"SURYA ENCLAIVE \\\/ 2.09 km\"},\"city\":\"Bangalore\",\"locality\":\"Battarahalli\",\"token\":\"344bd4f0fab99b460873cfff6befb12f\"}"}

我创建了这样的表

CREATE EXTERNAL TABLE orc_test (json string) 
LOCATION '/user/ec2-user/test_orc';

如果我使用这个查询,它对我有效。

select * from orc_test;

如果我尝试选择一列,它将打印空值

select get_json_object(orc_test.json,'$.locality') as loc
from orc_test;

它打印空
任何帮助都将不胜感激。

3j86kqsm

3j86kqsm1#

除了vmachan的答案(我认为是对的)之外,我在类似情况下遇到的问题是json记录没有放在单独的行中。当它是一个数组时,它也不起作用。因此,例如,在Hive3.1.0中使用横向视图/json\tuple时,这一切正常:

{"color":"black","category":"hue","type":"primary","code":{"rgba":[255,255,255,1],"hex":"#000"}}
{"color":"white","category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}}
{"color":"red","category":"hue","type":"primary","code":{"rgba":[255,0,0,1],"hex":"#FF0"}}
{"color":"blue","category":"hue","type":"primary","code":{"rgba":[0,0,255,1],"hex":"#00F"}}
{"color":"yellow","category":"hue","type":"primary","code":{"rgba":[255,255,0,1],"hex":"#FF0"}}
{"color":"green","category":"hue","type":"secondary","code":{"rgba":[0,255,0,1],"hex":"#0F0"}}

但效果并不理想:

[{"color":"black","category":"hue","type":"primary","code":{"rgba":[255,255,255,1],"hex":"#000"}},
{"color":"white","category":"value","code":{"rgba":[0,0,0,1],"hex":"#FFF"}},
{"color":"red","category":"hue","type":"primary","code":{"rgba":[255,0,0,1],"hex":"#FF0"}},
{"color":"blue","category":"hue","type":"primary","code":{"rgba":[0,0,255,1],"hex":"#00F"}},
{"color":"yellow","category":"hue","type":"primary","code":{"rgba":[255,255,0,1],"hex":"#FF0"}},
{"color":"green","category":"hue","type":"secondary","code":{"rgba":[0,255,0,1],"hex":"#0F0"}}]
q35jwt9p

q35jwt9p2#

在您的案例中,我认为数据中的反斜杠是导致问题的原因,同时也是围绕您的 page 数据。我在下面列出了更新后的数据,您可以将其保存到一个文件并将其加载到表中,然后您的查询就可以工作了。

{"page_1":{"city":"Bangalore","locality":"Battarahalli","Name_of_Person":"xxx","User_email_address":"test@gmail.com","user_phone_number":"","sub_locality":"","street_name":"7th Cross Road, Near Reliance Fresh, T.c Palya,","home_plot_no":"45","pin_code":"560049","project_society_build_name":"Sunshine Layout","landmark_reference_1":"","landmark_reference_2":"","No_of_Schools":20,"No_of_Hospitals":20,"No_of_Metro":0,"No_of_Mall":11,"No_of_Park":10,"Distance_of_schools":1.55,"Distance_of_Hospitals":2.29,"Distance_of_Metro":0,"Distance_of_Mall":1.55,"Distance_of_Park":2.01,"lat":13.0243273,"lng":77.7077906,"ipinfo":{"ip":"113.193.30.130","hostname":"No Hostname","city":"","region":"","country":"IN","loc":"20.0000,77.0000","org":"AS45528 Tikona Digital Networks Pvt Ltd."}},"page_2":{"home_type":"Flat","area":"1350","beds":"3 BHK","bath_rooms":2,"building_age":"1","floors":2,"balcony":2,"amenities":"premium","amenities_options":{"gated_security":"","physical_security":"","cctv_camera":"","controll_access":"","elevator":true,"power_back_up":"","parking":true,"partial_parking":"","onsite_maintenance_store":"","open_garden":"","party_lawn":"","amenities_balcony":"","club_house":"","fitness_center":"","swimming_pool":"","party_hall":"","tennis_court":"","basket_ball_court":"","squash_coutry":"","amphi_theatre":"","business_center":"","jogging_track":"","convinience_store":"","guest_rooms":""},"interior":"regular","interior_options":{"tiles":true,"marble":"","wooden":"","modular_kitchen":"","partial_modular_kitchen":"","gas_pipe":"","intercom_system":"","air_conditioning":"","partial_air_conditioning":"","wardrobe":"","sanitation_fixtures":"","false_ceiling":"","partial_false_ceiling":"","recessed_lighting":""},"location":"regular","location_options":{"good_view":true,"transporation_hub":true,"shopping_center":"","hospital":"","school":"","ample_parking":"","park":"","temple":"","bank":"","less_congestion":"","less_pollution":""},"maintenance":"","maintenance_value":"","near_by":{"school":"","hospital":"","mall":"","park":"","metro":"","Near_by_school":"Little Champ Gurukulam Pre School / 1.52 km","Near_by_hospital":"Suresh Hospital / 2.16 km","Near_by_mall":"LORVEN LEO / 2.13 km","Near_by_park":"SURYA ENCLAIVE / 2.09 km"},"city":"Bangalore","locality":"Battarahalli","token":"344bd4f0fab99b460873cfff6befb12f"}}

我试过这个,它对我有效。

hive> select get_json_object(orc_test.json,'$.page_1.locality') as loc from orc_test;
OK
Battarahalli
Time taken: 0.091 seconds, Fetched: 1 row(s)

hive> select get_json_object(orc_test.json,'$.page_1.city') as loc from orc_test;
OK
Bangalore
Time taken: 0.097 seconds, Fetched: 1 row(s)

hive> select get_json_object(orc_test.json,'$.page_2.home_type') as loc from orc_test;
OK
Flat
Time taken: 0.091 seconds, Fetched: 1 row(s)
gfttwv5a

gfttwv5a3#

似乎您还没有创建包含许多列的表。配置单元表中只有一列。在hive中,json值的所有数据都是一列的单个值。因此,它为列显示空值。
使用json serde让配置单元将jsonMap到表中的列。

相关问题