Druid spatialDimensions error loading data during Hadoop ingestion

y4ekin9u posted on 2021-05-29 in Hadoop

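I have a Hadoop ingestion process for my data (as described at https://druid.apache.org/docs/latest/ingestion/hadoop.html).
The current Druid indexer version is 0.14.2-incubating.
The data is TSV files on GCS.
With the older version of the Druid indexer there were no problems. The error appeared after upgrading to the new version.
Some details
Here is the parser section of my spec: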

"parser": {
        "parseSpec": {
          "dimensionsSpec": {
            "spatialDimensions": [
              {
                "dimName": "geo",
                "dims": ["latitude", "longitude"]
              }
            ],
            "dimensionExclusions": [],
            "dimensions":[
              "ip_address",
              "radius",
              "confidence"
            ]
          },
          "timestampSpec": {
            "format": "millis",
            "column": "ts"
          },
          "columns": [
            "ts",
            "ip_address",
            "latitude",
            "longitude",
            "radius",
            "confidence"
          ],
          "format":"tsv"
        },
        "type": "lzo"
      }
    },

This section causes the following error:

java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.druid.cli.CliHadoopIndexer.run(CliHadoopIndexer.java:116)
    at org.apache.druid.cli.Main.main(Main.java:118)
Caused by: java.lang.IllegalArgumentException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
    at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3459)
    at shade.com.fasterxml.jackson.databind.ObjectMapper.convertValue(ObjectMapper.java:3378)
    at org.apache.druid.segment.indexing.DataSchema.getParser(DataSchema.java:126)
    at org.apache.druid.indexer.HadoopDruidIndexerConfig.verify(HadoopDruidIndexerConfig.java:591)
    at org.apache.druid.indexer.HadoopDruidIndexerJob.<init>(HadoopDruidIndexerJob.java:49)
    at org.apache.druid.cli.CliInternalHadoopIndexer.run(CliInternalHadoopIndexer.java:124)
    at org.apache.druid.cli.Main.main(Main.java:118)
    ... 6 more
Caused by: shade.com.fasterxml.jackson.databind.JsonMappingException: Instantiation of [simple type, class org.apache.druid.data.input.impl.DelimitedParseSpec] value failed: column[geo] not in columns. (through reference chain: org.apache.druid.data.input.impl.StringInputRowParser["parseSpec"])
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.wrapException(StdValueInstantiator.java:399)
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:231)
    at shade.com.fasterxml.jackson.databind.deser.impl.PropertyBasedCreator.build(PropertyBasedCreator.java:135)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:442)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
    at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
    at shade.com.fasterxml.jackson.databind.deser.SettableBeanProperty.deserialize(SettableBeanProperty.java:518)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeWithErrorWrapping(BeanDeserializer.java:463)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeUsingPropertyBased(BeanDeserializer.java:378)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializerBase.deserializeFromObjectUsingNonDefault(BeanDeserializerBase.java:1099)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserializeFromObject(BeanDeserializer.java:296)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer._deserializeOther(BeanDeserializer.java:166)
    at shade.com.fasterxml.jackson.databind.deser.BeanDeserializer.deserialize(BeanDeserializer.java:136)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer._deserializeTypedForId(AsPropertyTypeDeserializer.java:122)
    at shade.com.fasterxml.jackson.databind.jsontype.impl.AsPropertyTypeDeserializer.deserializeTypedFromObject(AsPropertyTypeDeserializer.java:93)
    at shade.com.fasterxml.jackson.databind.deser.AbstractDeserializer.deserializeWithType(AbstractDeserializer.java:131)
    at shade.com.fasterxml.jackson.databind.deser.impl.TypeWrappedDeserializer.deserialize(TypeWrappedDeserializer.java:42)
    at shade.com.fasterxml.jackson.databind.ObjectMapper._convert(ObjectMapper.java:3454)
    ... 12 more
Caused by: java.lang.IllegalArgumentException: column[geo] not in columns.
    at shade.com.google.common.base.Preconditions.checkArgument(Preconditions.java:148)
    at org.apache.druid.data.input.impl.DelimitedParseSpec.verify(DelimitedParseSpec.java:119)
    at org.apache.druid.data.input.impl.DelimitedParseSpec.<init>(DelimitedParseSpec.java:63)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
    at shade.com.fasterxml.jackson.databind.introspect.AnnotatedConstructor.call(AnnotatedConstructor.java:125)
    at shade.com.fasterxml.jackson.databind.deser.std.StdValueInstantiator.createFromObjectWith(StdValueInstantiator.java:227)
    ... 33 more

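I can see that the spec parser tries to find the dimension among the columns, but it is a spatial dimension!
This is a rather painful problem that has hit production. Does anyone know how to fix this error?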
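
To make the failure concrete, here is a minimal Python sketch of the check that the stack trace points at (`DelimitedParseSpec.verify` raising `column[geo] not in columns.`). It assumes, based only on that message, that the spatial `dimName` is validated against `columns` like an ordinary dimension. The helper name `missing_columns` is hypothetical and not part of Druid.

```python
import json


def missing_columns(parse_spec):
    """Return dimension names (including spatial dimName values) absent from 'columns'."""
    dims_spec = parse_spec.get("dimensionsSpec", {})
    names = []
    for dim in dims_spec.get("dimensions", []):
        # A dimension may be a plain string or an object with a "name" key.
        names.append(dim if isinstance(dim, str) else dim["name"])
    # Assumption: the spatial dimName is checked the same way as a regular dimension.
    for spatial in dims_spec.get("spatialDimensions", []):
        names.append(spatial["dimName"])
    columns = set(parse_spec.get("columns", []))
    return [name for name in names if name not in columns]


# The parseSpec from the question, trimmed to the fields the check looks at.
spec = json.loads("""
{
  "dimensionsSpec": {
    "spatialDimensions": [
      {"dimName": "geo", "dims": ["latitude", "longitude"]}
    ],
    "dimensionExclusions": [],
    "dimensions": ["ip_address", "radius", "confidence"]
  },
  "columns": ["ts", "ip_address", "latitude", "longitude", "radius", "confidence"],
  "format": "tsv"
}
""")

for name in missing_columns(spec):
    print(f"column[{name}] not in columns.")
```

Run against the parseSpec above, the sketch flags only `geo`, which matches the message in the trace.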

2guxujil #1

"parser": {
  "type": "string",
  "parseSpec": {
    "format": "json",
    "flattenSpec": {
      "fields": [
        { "type": "path", "name": "Longitude", "expr": "$.location.lon" },
        { "type": "path", "name": "Latitude", "expr": "$.location.lat" }
      ]
    },
    "timestampSpec": {
      "column": "timeStamp",
      "format": "auto"
    },
    "dimensionsSpec": {
      "dimensions": ["blogid", "category", "eventType", "userid"],
      "spatialDimensions": [
        {
          "dimName": "coordinates",
          "dims": ["Latitude", "Longitude"]
        }
      ]
    }
  }
}
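
For readers unfamiliar with `flattenSpec`, the sketch below illustrates what the two `path` fields in this spec are meant to do: pull nested values out of each JSON record so that `spatialDimensions` can refer to them by the flat names `Latitude` and `Longitude`. The sample record and the helpers `resolve_path` and `flatten` are hypothetical, and the resolver handles only simple `$.a.b` paths, not full JSONPath.

```python
def resolve_path(record, expr):
    """Resolve a simple '$.a.b' style path against a nested dict."""
    if not expr.startswith("$."):
        raise ValueError(f"unsupported path expression: {expr}")
    value = record
    for key in expr[2:].split("."):
        value = value[key]
    return value


def flatten(record, fields):
    """Map each field's 'name' to the value found at its 'expr' path."""
    return {field["name"]: resolve_path(record, field["expr"]) for field in fields}


# The two flattenSpec fields from the spec above.
fields = [
    {"type": "path", "name": "Longitude", "expr": "$.location.lon"},
    {"type": "path", "name": "Latitude", "expr": "$.location.lat"},
]

# Hypothetical input record; the values are made up for illustration.
record = {
    "timeStamp": "2021-05-29T00:00:00Z",
    "blogid": "b1",
    "location": {"lat": 52.52, "lon": 13.405},
}

print(flatten(record, fields))
```

Note that this spec uses the `json` format, so it does not go through the delimited (TSV) parse spec that raised the error in the question.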
