I crawl sites by URL and extract data from them. I am using Solr 3.4.0 and Nutch 1.9.
It had been working fine, but starting last week I suddenly began getting this error:
2015-06-18 18:32:49,718 INFO indexer.IndexingFilters - Adding org.apache.nutch.indexer.anchor.AnchorIndexingFilter
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: content dest: content
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: site dest: site
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: title dest: title
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: host dest: host
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: segment dest: segment
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: boost dest: boost
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: digest dest: digest
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: tstamp dest: tstamp
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: url dest: id
2015-06-18 18:32:53,531 INFO solr.SolrMappingReader - source: url dest: url
2015-06-18 18:32:54,484 INFO solr.SolrWriter - Adding 1000 documents
2015-06-18 18:34:34,156 WARN mapred.LocalJobRunner - job_local_0030
org.apache.solr.common.SolrException: Internal Server Error
Internal Server Error
request: http://host IP:port/solr/news/update?wt=javabin&version=2
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:430)
at org.apache.solr.client.solrj.impl.CommonsHttpSolrServer.request(CommonsHttpSolrServer.java:244)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:105)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:49)
at org.apache.nutch.indexer.solr.SolrWriter.write(SolrWriter.java:81)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:54)
at org.apache.nutch.indexer.IndexerOutputFormat$1.write(IndexerOutputFormat.java:44)
at org.apache.hadoop.mapred.ReduceTask$3.collect(ReduceTask.java:440)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:166)
at org.apache.nutch.indexer.IndexerMapReduce.reduce(IndexerMapReduce.java:51)
at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:463)
at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:411)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:216)
2015-06-18 18:34:35,062 ERROR solr.SolrIndexer - java.io.IOException: Job failed!
2015-06-18 18:34:35,140 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: starting at 2015-06-18 18:34:35
2015-06-18 18:34:35,140 INFO solr.SolrDeleteDuplicates - SolrDeleteDuplicates: Solr url: http://Host IP:Port/solr/news
2015-06-18 18:36:52,718 WARN mapred.LocalJobRunner - job_local_0031
java.lang.NullPointerException
at org.apache.hadoop.io.Text.encode(Text.java:388)
at org.apache.hadoop.io.Text.set(Text.java:178)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:270)
at org.apache.nutch.indexer.solr.SolrDeleteDuplicates$SolrInputFormat$1.next(SolrDeleteDuplicates.java:241)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:192)
at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:176)
at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:358)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:307)
at org.apache.hadoop.mapred.LocalJobRunner$Job.run(LocalJobRunner.java:177)
Can anyone help me get rid of this error? Thanks in advance.
The Solr log gives me this error: org.apache.solr.update.SolrIndexWriter finalize SEVERE: SolrIndexWriter was not closed prior to finalize(), indicates a bug -- POSSIBLE RESOURCE LEAK!!!
org.apache.solr.common.SolrException log SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: NativeFSLock