How to parse strings (from different tables) in Hive (Hadoop) and load them into a different table

f87krz0w asked on 2021-06-04 in Hadoop

I have this table as input:

    Table Name: Deals
    Columns: Doc_id (BIGINT), Nv_Pairs_Feed (STRING), Nv_Pairs_Category (STRING)
    For example:
    Doc_id: 4997143658422483637
    Nv_Pairs_Feed: "TYPE:Wiper Blade;CONDITION:New;CATEGORY:Auto Parts and Accessories;STOCK_AVAILABILITY:Y;ORIGINAL_PRICE:0.00"
    Nv_Pairs_Category: "Condition:New;Store:PartsGeek.com;"

I am trying to parse the fields `Nv_Pairs_Feed` and `Nv_Pairs_Category` and extract their name:value pairs (pairs are separated by ';', and each name and value are separated by ':'). My goal is to insert each name:value pair as a row into this table:

    Doc_id | Name | Value | Source_Field

Example of the desired result:

    4997143658422483637 | Condition | New | Nv_Pairs_Category
    4997143658422483637 | Store | PartsGeek.com | Nv_Pairs_Category
    4997143658422483637 | TYPE | Wiper Blade | Nv_Pairs_Feed
    4997143658422483637 | CONDITION | New | Nv_Pairs_Feed
    4997143658422483637 | CATEGORY | Auto Parts and Accessories | Nv_Pairs_Feed
    4997143658422483637 | STOCK_AVAILABILITY | Y | Nv_Pairs_Feed
    4997143658422483637 | ORIGINAL_PRICE | 0.00 | Nv_Pairs_Feed
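Outside of Hive, the intended transformation is straightforward to sketch. The following Python snippet (a hypothetical `parse_pairs` helper, shown only to pin down the expected semantics) splits each field on ';' and each surviving pair on the first ':':

```python
def parse_pairs(doc_id, text, source_field):
    """Split a 'name:value;name:value;...' string into output rows."""
    rows = []
    for pair in text.split(";"):
        if not pair:  # skip the empty piece left by a trailing ';'
            continue
        name, _, value = pair.partition(":")
        rows.append((doc_id, name, value, source_field))
    return rows

rows = parse_pairs(4997143658422483637,
                   "Condition:New;Store:PartsGeek.com;",
                   "Nv_Pairs_Category")
# rows == [(4997143658422483637, 'Condition', 'New', 'Nv_Pairs_Category'),
#          (4997143658422483637, 'Store', 'PartsGeek.com', 'Nv_Pairs_Category')]
```

Note the trailing ';' in `Nv_Pairs_Category`: whatever Hive-side solution is used has to tolerate the resulting empty pair.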

zsbz8rwp1#

You can use the standard Hive UDF `str_to_map` to convert the strings into maps, and then the Brickhouse UDFs (http://github.com/klout/brickhouse) `map_key_values`, `combine`, and `numeric_range` to explode those maps. I.e., something like this:

    create view deals_map_view as
    select doc_id,
           map_key_values(
             combine( str_to_map( nv_pairs_feed, ';', ':'),
                      str_to_map( nv_pairs_category, ';', ':'))) as deals_map_key_values
    from deals;

    select
      doc_id,
      array_index( deals_map_key_values, i ).key as name,
      array_index( deals_map_key_values, i ).value as value
    from deals_map_view
    lateral view numeric_range( size( deals_map_key_values) ) i1 as i;

You could simplify this with an `explode_map` UDF.
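To make the semantics of that last suggestion concrete: an `explode_map`-style UDTF emits one output row per map entry. A minimal Python analogue (illustrative only, not Hive code):

```python
def explode_map(doc_id, m):
    """Yield one (doc_id, key, value) row per entry of map m."""
    for key, value in m.items():
        yield (doc_id, key, value)

m = {"Condition": "New", "Store": "PartsGeek.com"}
rows = list(explode_map(4997143658422483637, m))
# rows == [(4997143658422483637, 'Condition', 'New'),
#          (4997143658422483637, 'Store', 'PartsGeek.com')]
```

This collapses the `map_key_values` / `numeric_range` / `array_index` combination above into a single exploding step.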
