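I have configured my 2 servers to run in distributed mode (with Hadoop). My crawl process is set up with Nutch 2.2.1, HBase (as storage) and Solr, and Solr is run by Tomcat. The problem occurs every time I try the last step, that is, when I want to index the data from HBase into Solr: error [1] is thrown. I tried adding CATALINA_OPTS (or JAVA_OPTS) like this:
CATALINA_OPTS="$JAVA_OPTS -XX:+UseConcMarkSweepGC -Xms1g -Xmx6000m -XX:MinHeapFreeRatio=10 -XX:MaxHeapFreeRatio=30 -XX:MaxPermSize=512m -XX:+CMSClassUnloadingEnabled"
to Tomcat's catalina.sh script and running the server with it, but that didn't help. I also added the properties in [2] to the nutch-site.xml file, but it ended with an OutOfMemory error again.
Can you help me, please?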
[1]
2014-09-06 22:52:50,683 FATAL org.apache.hadoop.mapred.Child: Error running child : java.lang.OutOfMemoryError: Java heap space
at java.util.Arrays.copyOf(Arrays.java:2367)
at java.lang.AbstractStringBuilder.expandCapacity(AbstractStringBuilder.java:130)
at java.lang.AbstractStringBuilder.ensureCapacityInternal(AbstractStringBuilder.java:114)
at java.lang.AbstractStringBuilder.append(AbstractStringBuilder.java:587)
at java.lang.StringBuffer.append(StringBuffer.java:332)
at java.io.StringWriter.write(StringWriter.java:77)
at org.apache.solr.common.util.XML.escape(XML.java:204)
at org.apache.solr.common.util.XML.escapeCharData(XML.java:77)
at org.apache.solr.common.util.XML.writeXML(XML.java:147)
at org.apache.solr.client.solrj.util.ClientUtils.writeVal(ClientUtils.java:161)
at org.apache.solr.client.solrj.util.ClientUtils.writeXML(ClientUtils.java:129)
at org.apache.solr.client.solrj.request.UpdateRequest.writeXML(UpdateRequest.java:355)
at org.apache.solr.client.solrj.request.UpdateRequest.getXML(UpdateRequest.java:271)
at org.apache.solr.client.solrj.request.RequestWriter.getContentStream(RequestWriter.java:66)
at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getDelegate(RequestWriter.java:94)
at org.apache.solr.client.solrj.request.RequestWriter$LazyContentStream.getName(RequestWriter.java:104)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:247)
at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:197)
at org.apache.solr.client.solrj.request.AbstractUpdateRequest.process(AbstractUpdateRequest.java:117)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:68)
at org.apache.solr.client.solrj.SolrServer.add(SolrServer.java:54)
at org.apache.nutch.indexwriter.solr.SolrIndexWriter.close(SolrIndexWriter.java:96)
at org.apache.nutch.indexer.IndexWriters.close(IndexWriters.java:117)
at org.apache.nutch.indexer.IndexerOutputFormat$1.close(IndexerOutputFormat.java:54)
at org.apache.hadoop.mapred.MapTask$NewDirectOutputCollector.close(MapTask.java:650)
at org.apache.hadoop.mapred.MapTask.closeQuietly(MapTask.java:1793)
at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:779)
at org.apache.hadoop.mapred.MapTask.run(MapTask.java:364)
at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:415)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
[2]
<property>
<name>http.content.limit</name>
<value>150000000</value>
</property>
<property>
<name>indexer.max.tokens</name>
<value>100000</value>
</property>
<property>
<name>http.timeout</name>
<value>50000</value>
</property>
<property>
<name>solr.commit.size</name>
<value>100</value>
</property>
1 Answer
I solved this problem with the following configuration (in the mapred-site.xml file):
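
A minimal sketch of what such a configuration might look like, assuming Hadoop 1.x (which the `org.apache.hadoop.mapred.Child` frames in the stack trace suggest), where `mapred.child.java.opts` sets the JVM options for the map and reduce task child processes. The error in [1] is raised inside the map task's JVM rather than inside Tomcat, which would explain why changing CATALINA_OPTS had no effect. The heap size below is an illustrative value, not the one from the original answer:

```xml
<!-- Assumption: Hadoop 1.x (MRv1). The -Xmx value is illustrative only
     and should be sized to the memory available on each task node. -->
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx2048m</value>
</property>
```

The task trackers normally need to be restarted, or the job resubmitted, before a changed value takes effect.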