How to map data from a CSV to a nested Avro schema

8ehkhllq  posted on 2021-05-29  in Hadoop

Suppose I have a schema like the following:

{
    "name": "phoneNumber",
    "type": {
      "type": "record",
      "name": "internalNumber",
      "namespace": "com.wiki",
      "fields": [{
        "name": "areacode",
        "type": "string"
      }, {
        "name": "phone",
        "type": ["null", "string"],
        "doc": "Actual full number",
        "default": null
      }]
    }
  }

I have a CSV that spreads this data across multiple columns, for example:

areaCode  phoneNumber
+1        1234512345

How can I get an Avro file like the following from a Pig script:

"phoneNumber" : {
  "areacode" : "+1",
  "phone" : "1234512345"
}

The difficulty is that the target record is nested.


8yparm6h  #1

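-- Assumes the Piggybank versions of CSVLoader and AvroStorage; the jar location is a placeholder.
REGISTER piggybank.jar;
A = LOAD 'path' USING org.apache.pig.piggybank.storage.CSVLoader() AS (areaCode:chararray, phoneNumber:chararray);
-- Wrap both columns in one named tuple so it is stored as a nested record.
B = FOREACH A GENERATE TOTUPLE(areaCode, phoneNumber) AS phoneNumber:tuple(areacode:chararray, phone:chararray);
STORE B INTO 'path' USING org.apache.pig.piggybank.storage.avro.AvroStorage();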

You need both the loader and the storage function from Piggybank for this.
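
To make the reshaping itself easier to see, here is a minimal sketch of the same flat-to-nested mapping in plain Python. It is not part of the original answer and does not write Avro; it only builds the nested structure the question asks for. The helper name `to_nested` and the comma-delimited sample with a header row are assumptions made for illustration.

```python
import csv
import io
import json


def to_nested(row):
    """Turn one flat CSV row into the nested phoneNumber record."""
    return {
        "phoneNumber": {
            "areacode": row["areaCode"],
            # "phone" is nullable in the schema, so map an empty cell to None.
            "phone": row["phoneNumber"] or None,
        }
    }


# Hypothetical sample input: comma-delimited, with a header row.
sample = io.StringIO("areaCode,phoneNumber\n+1,1234512345\n")
records = [to_nested(row) for row in csv.DictReader(sample)]

# Show the first record in the nested shape from the question.
print(json.dumps(records[0], indent=2))
```

The Pig script does the equivalent step with a tuple: the two columns are grouped under a single field, and the storage function then writes that field as a nested record.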
