lucidworks save solr format unknown field

Posted by z2acfund on 2021-05-29 in Hadoop

I am writing a script with Spark in Java. I need to insert data (from a DataFrame) into a Solr collection using the lucidworks spark-solr tool (https://github.com/lucidworks/spark-solr).
My schema.xml:

<schema name="MY_NAME" version="1.6">
    <field name="_version_" type="long" indexed="true" stored="true" />
    <field name="_root_" type="string" indexed="true" stored="false" />
    <field name="ignored_id" type="ignored" />
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="age" type="int" indexed="true" stored="true" required="false" multiValued="false" />
    <field name="height" type="tlong" indexed="true" stored="true" required="false" multiValued="false" />
    <field name="name" type="string" indexed="true" stored="true" required="false" multiValued="false" />

    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0" />
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0" />
    <fieldType name="tlong" class="solr.TrieLongField" precisionStep="8" positionIncrementGap="0" />
    <fieldType name="ignored" stored="false" indexed="false" multiValued="true" class="solr.StrField" />

    <uniqueKey>id</uniqueKey>
</schema>

My DataFrame:

DataFrame df = sqlContext.sql("SELECT id, age, height, name FROM TABLE");

df.show() gives:

+--------------------+-----------+------+------+
|                  id|        age|height|  name|
+--------------------+-----------+------+------+
|12345678912345678...|         10|   101| hello|

But when I try to insert it into my Solr collection:

df.write()
    .format("solr")
    .option("collection", MY_COLLECTION)
    .option("zkhost", MY_ZKHOST)
    .save();

I get the following error:

Caused by: org.apache.solr.client.solrj.impl.HttpSolrClient$RemoteSolrException: Error from server at http://MY_IP/solr/MY_COLLECTION_SHARD_REPLICA: ERROR :[doc=123456789123456789] unknown field '_indexed_at_tdt'

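I don't understand where the field '_indexed_at_tdt' comes from.
The DataFrame looks correct, with only the 4 fields I want to insert, but I still cannot insert into my Solr collection because of this unknown field '_indexed_at_tdt'.
More info: I have an HBase indexer that inserts into the same collection, and it works.
Thanks in advance for your help!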

Java hadoop apache-spark solr lucidworks

Source: https://stackoverflow.com/questions/43371613/lucidworks-save-solr-format-unknown-field

1 Answer

0mkxixxg #1

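As you can see here, this field appears to be added automatically by the lucidworks code.
You just need to add the corresponding field to your schema and it will work: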

<field name="_indexed_at_tdt" type="tdate" indexed="true" stored="true" required="false" multiValued="false" />

Or, if you prefer, make it a dynamic field: *_tdt.
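
A minimal sketch of the dynamic-field variant, assuming a Trie-based Solr version like the one implied by the schema in the question. The posted schema.xml does not define a "tdate" field type, so the sketch declares one as well; the precisionStep value mirrors Solr's stock example schema and is an assumption, not something the question or answer specifies. Both elements would go inside the <schema> element:

```xml
<!-- Assumption: a date field type named "tdate", which the posted schema lacks.
     Needed by both the explicit field above and the dynamic field below. -->
<fieldType name="tdate" class="solr.TrieDateField" precisionStep="6" positionIncrementGap="0" />

<!-- Accepts any field whose name ends in _tdt, including _indexed_at_tdt. -->
<dynamicField name="*_tdt" type="tdate" indexed="true" stored="true" />
```

Since the collection is addressed through zkhost (SolrCloud), the updated schema would typically have to be uploaded to ZooKeeper and the collection reloaded before retrying the write.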

Answered 2021-05-29
