I'm using Storm 0.8.1 to read incoming messages off an Amazon SQS queue, and I consistently get the following exception when doing so:
2013-12-02 02:21:38 executor [ERROR]
java.lang.RuntimeException: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.)
at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:219)
at REDACTED.spouts.SqsQueueSpout.nextTuple(SqsQueueSpout.java:88)
at backtype.storm.daemon.executor$fn__3976$fn__4017$fn__4018.invoke(executor.clj:447)
at backtype.storm.util$async_loop$fn__465.invoke(util.clj:377)
at clojure.lang.AFn.run(AFn.java:24)
at java.lang.Thread.run(Thread.java:701)
Caused by: com.amazonaws.AmazonClientException: Unable to unmarshall response (ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.)
at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:524)
at com.amazonaws.http.AmazonHttpClient.executeHelper(AmazonHttpClient.java:298)
at com.amazonaws.http.AmazonHttpClient.execute(AmazonHttpClient.java:167)
at com.amazonaws.services.sqs.AmazonSQSClient.invoke(AmazonSQSClient.java:812)
at com.amazonaws.services.sqs.AmazonSQSClient.receiveMessage(AmazonSQSClient.java:575)
at REDACTED.spouts.SqsQueueSpout.handleNextTuple(SqsQueueSpout.java:191)
... 5 more
Caused by: javax.xml.stream.XMLStreamException: ParseError at [row,col]:[1,1]
Message: JAXP00010001: The parser has encountered more than "64000" entity expansions in this document; this is the limit imposed by the JDK.
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.setInputSource(XMLStreamReaderImpl.java:219)
at com.sun.org.apache.xerces.internal.impl.XMLStreamReaderImpl.<init>(XMLStreamReaderImpl.java:189)
at com.sun.xml.internal.stream.XMLInputFactoryImpl.getXMLStreamReaderImpl(XMLInputFactoryImpl.java:277)
at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLStreamReader(XMLInputFactoryImpl.java:129)
at com.sun.xml.internal.stream.XMLInputFactoryImpl.createXMLEventReader(XMLInputFactoryImpl.java:78)
at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:85)
at com.amazonaws.http.StaxResponseHandler.handle(StaxResponseHandler.java:41)
at com.amazonaws.http.AmazonHttpClient.handleResponse(AmazonHttpClient.java:503)
... 10 more
I've debugged the data on the queue and everything looks fine. I can't figure out why the API's XML response would cause these problems. Any ideas?
1 Answer
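Answering my own question for posterity.
There is currently a bug in the XML entity expansion limit handling in both Oracle's and OpenJDK's Java: the expansion counter is shared across parses, so it eventually hits the default limit when many XML documents are parsed, even if each document is harmless on its own.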
https://blogs.oracle.com/joew/entry/jdk_7u45_aws_issue_123
https://bugs.openjdk.java.net/browse/jdk-8028111
https://github.com/aws/aws-sdk-java/issues/123
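Although I believed our version (6b27-1.12.6-1ubuntu0.12.04.4) was not affected, running the sample code given in the OpenJDK bug report confirmed that we were in fact vulnerable to the bug.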
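To work around the issue, I needed to pass
jdk.xml.entityExpansionLimit=0
to the Storm workers as a JVM system property. By adding it to storm.yaml across my cluster, I was able to mitigate the problem. I should note that a value of 0 removes the limit entirely, which technically opens you up to a denial-of-service attack, but since our XML documents come only from SQS, I'm not worried about someone crafting malicious XML to kill our workers.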
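The exact storm.yaml snippet is not preserved above, so the following is only a sketch of what such an entry might look like. It assumes the flag is passed through Storm's worker.childopts setting, which holds the JVM options for worker processes. Setting this key replaces any value already configured for it, so merge the flag with your existing options (heap size and so on) rather than copying this line verbatim.

```yaml
# storm.yaml (sketch; the choice of key is an assumption, not the original snippet)
# Passes the JAXP entity expansion limit to each worker JVM as a system property.
# A value of 0 means "no limit".
worker.childopts: "-Djdk.xml.entityExpansionLimit=0"
```

The workers would likely need to be restarted to pick up the new JVM options. Once you are running a JDK build that contains the fix for the bug linked above, it is probably safer to remove the override and fall back to the default limit.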