spring集成dsl lasttime不更新错误

vwoqyblh  于 2021-07-23  发布在  Java
关注(0)|答案(1)|浏览(406)

我对springintegrationdsl完全陌生,我要做的是从mongo获取所有rss提要,将它们全部注册到flow上下文中,并处理从提要获取的所有文章。目前,我有一个for each循环,用于调用包含以下内容的函数:

IntegrationFlow newFlow = IntegrationFlows
            .from(Feed.inboundAdapter(new URL(url), source), e -> e.id(org + "-" + source + "-feed").poller(Pollers.fixedDelay(poll).maxMessagesPerPoll(maxMessages).errorChannel("feedErrors")))
            .enrichHeaders(h -> h.header("sourceId", sourceId))
            .enrichHeaders(h -> h.header("sourceName", sourceName))
            .enrichHeaders(h -> h.header("source", source))
            .enrichHeaders(h -> h.header("categories", categories))
            .enrichHeaders(h -> h.header("org", org))
            .channel(MessageChannels.executor(taskExecutor))
            .handle("enricher", "enhance")
            .channel(news())
            .get();

this.flowContext.registration(newFlow).id(org + "-" + source + "-flow").register();

它做得如此出色,除了例外,有趣的是。当提要当前不可用、已重命名或提要本身的文章格式不正确时,将引发异常,并转到流定义中指示的“feederrors”通道。该部分如下所示:

@Component
public class RssFeedErrorHandler {

    private final Logger logger = LoggerFactory.getLogger(RssFeedErrorHandler.class);

    @Bean
    public IntegrationFlow errorSender() {
        return IntegrationFlows.from("feedErrors")
            .handle("rssFeedErrorHandler", "handleError")
            .get();
    }

    public void handleError(Message<MessagingException> message) {
        logger.error(message.toString());
    }
}

例外示例:

AdviceMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen, feedResource=null, metadataKey='google-news-politics.https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen', lastTime=1613665200000}'; nested exception is java.io.FileNotFoundException: https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen was successful, headers={id=101f3da9-42de-eb03-0da2-015952fc0a23, timestamp=1613667428995}, inputMessage=ErrorMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen, feedResource=null, metadataKey='google-news-politics.https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen', lastTime=1613665200000}'; nested exception is java.io.FileNotFoundException: https://news.google.com/rss/topics/CAAqIQgKIhtDQkFTRGdvSUwyMHZNRFZ4ZERBU0FtVnVLQUFQAQ?hl=en-GB&gl=GB&ceid=GB%3Aen, headers={id=8a470a5f-4027-371c-44fd-1250703077b0, timestamp=1613667428995}]]
AdviceMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://www.startupbootcamp.org/feed/, feedResource=null, metadataKey='startupbootcamp-startups.https://www.startupbootcamp.org/feed/', lastTime=1613015630000}'; nested exception is java.net.ConnectException: Operation timed out (Connection timed out) was successful, headers={id=6b5a1a00-905a-68b6-128b-4cdd8eb06737, timestamp=1613664837150}, inputMessage=ErrorMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://www.startupbootcamp.org/feed/, feedResource=null, metadataKey='startupbootcamp-startups.https://www.startupbootcamp.org/feed/', lastTime=1613015630000}'; nested exception is java.net.ConnectException: Operation timed out (Connection timed out), headers={id=70ef8ae7-1b75-9e38-e631-c88777741609, timestamp=1613664837150}]]
AdviceMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms, feedResource=null, metadataKey='india-times-entertainment.https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms', lastTime=1613715922000}'; nested exception is com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: Character reference "&#55357" is an invalid XML character. was successful, headers={id=fb936878-6450-d79a-851a-ca1ceeef8182, timestamp=1613705658735}, inputMessage=ErrorMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms, feedResource=null, metadataKey='india-times-entertainment.https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms', lastTime=1613715922000}'; nested exception is com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: Character reference "&#55357" is an invalid XML character., headers={id=24f2a11a-80a6-6c13-dbe1-9548747174aa, timestamp=1613705658734}]]
AdviceMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://www.jeffbullas.com/feed/, feedResource=null, metadataKey='jeff-bullas-marketing.https://www.jeffbullas.com/feed/', lastTime=1613574000000}'; nested exception is java.io.IOException: Server returned HTTP response code: 500 for URL: https://www.jeffbullas.com/feed/ was successful, headers={id=e6998e35-afdc-b476-c2d9-2f4b0cc46c6b, timestamp=1613710313323}, inputMessage=ErrorMessage [payload=org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://www.jeffbullas.com/feed/, feedResource=null, metadataKey='jeff-bullas-marketing.https://www.jeffbullas.com/feed/', lastTime=1613574000000}'; nested exception is java.io.IOException: Server returned HTTP response code: 500 for URL: https://www.jeffbullas.com/feed/, headers={id=420f9b5d-1c58-cd0a-c4e2-5d640578f1c9, timestamp=1613710313322}]]

这里有两个完整的堆栈:

org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms, feedResource=null, metadataKey='india-times-entertainment.https://timesofindia.indiatimes.com/rssfeeds/1081479906.cms', lastTime=-1}'; nested exception is com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: Character reference "&#55357" is an invalid XML character.
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:239)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.populateEntryList(FeedEntryMessageSource.java:202)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:177)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:58)
        at org.springframework.integration.endpoint.AbstractMessageSource.receive(AbstractMessageSource.java:167)
        at org.springframework.integration.endpoint.SourcePollingChannelAdapter.receiveMessage(SourcePollingChannelAdapter.java:250)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.doPoll(AbstractPollingEndpoint.java:359)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.pollForMessage(AbstractPollingEndpoint.java:328)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$null$1(AbstractPollingEndpoint.java:275)
        at org.springframework.integration.util.ErrorHandlingTaskExecutor.lambda$execute$0(ErrorHandlingTaskExecutor.java:57)
        at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
        at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:55)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$2(AbstractPollingEndpoint.java:272)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:93)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: Character reference "&#55357" is an invalid XML character.
        at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:236)
        at com.rometools.rome.io.SyndFeedInput.build(SyndFeedInput.java:150)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:226)
        ... 20 more
Caused by: org.jdom2.input.JDOMParseException: Error on line 1: Character reference "&#55357" is an invalid XML character.
        at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:232)
        at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:303)
        at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1196)
        at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:233)
        ... 22 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 15681; Character reference "&#55357" is an invalid XML character.
        at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.scanCharReferenceValue(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanCharReference(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl$FragmentContentDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:217)
        ... 25 more

org.springframework.messaging.MessagingException: Failed to retrieve feed for 'FeedEntryMessageSource{feedUrl=http://www.independent.co.uk/news/uk/politics/rss, feedResource=null, metadataKey='independent-politics.http://www.independent.co.uk/news/uk/politics/rss', lastTime=1613726122000}'; nested exception is com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:239)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.populateEntryList(FeedEntryMessageSource.java:202)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:177)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.doReceive(FeedEntryMessageSource.java:58)
        at org.springframework.integration.endpoint.AbstractMessageSource.receive(AbstractMessageSource.java:167)
        at org.springframework.integration.endpoint.SourcePollingChannelAdapter.receiveMessage(SourcePollingChannelAdapter.java:250)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.doPoll(AbstractPollingEndpoint.java:359)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.pollForMessage(AbstractPollingEndpoint.java:328)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$null$1(AbstractPollingEndpoint.java:275)
        at org.springframework.integration.util.ErrorHandlingTaskExecutor.lambda$execute$0(ErrorHandlingTaskExecutor.java:57)
        at org.springframework.core.task.SyncTaskExecutor.execute(SyncTaskExecutor.java:50)
        at org.springframework.integration.util.ErrorHandlingTaskExecutor.execute(ErrorHandlingTaskExecutor.java:55)
        at org.springframework.integration.endpoint.AbstractPollingEndpoint.lambda$createPoller$2(AbstractPollingEndpoint.java:272)
        at org.springframework.scheduling.support.DelegatingErrorHandlingRunnable.run(DelegatingErrorHandlingRunnable.java:54)
        at org.springframework.scheduling.concurrent.ReschedulingRunnable.run(ReschedulingRunnable.java:93)
        at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:834)
Caused by: com.rometools.rome.io.ParsingFeedException: Invalid XML: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
        at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:236)
        at com.rometools.rome.io.SyndFeedInput.build(SyndFeedInput.java:150)
        at org.springframework.integration.feed.inbound.FeedEntryMessageSource.getFeed(FeedEntryMessageSource.java:226)
        ... 20 more
Caused by: org.jdom2.input.JDOMParseException: Error on line 1: DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
        at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:232)
            at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:303)
        at org.jdom2.input.SAXBuilder.build(SAXBuilder.java:1196)
        at com.rometools.rome.io.WireFeedInput.build(WireFeedInput.java:233)
            ... 22 more
Caused by: org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 10; DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true.
        at org.apache.xerces.util.ErrorHandlerWrapper.createSAXParseException(Unknown Source)
        at org.apache.xerces.util.ErrorHandlerWrapper.fatalError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLErrorReporter.reportError(Unknown Source)
        at org.apache.xerces.impl.XMLScanner.reportFatalError(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentScannerImpl$PrologDispatcher.dispatch(Unknown Source)
        at org.apache.xerces.impl.XMLDocumentFragmentScannerImpl.scanDocument(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XML11Configuration.parse(Unknown Source)
        at org.apache.xerces.parsers.XMLParser.parse(Unknown Source)
        at org.apache.xerces.parsers.AbstractSAXParser.parse(Unknown Source)
        at org.apache.xerces.jaxp.SAXParserImpl$JAXPSAXParser.parse(Unknown Source)
        at org.jdom2.input.sax.SAXBuilderEngine.build(SAXBuilderEngine.java:217)
        ... 25 more

所以错误被抛出,它被记录,生命继续,除了一些原因,轮询获取似乎被保存以供重试。如果它重试一次,这就不是问题了,但它只是不断地重试,当错误堆积得足够多时,它只会不断地重试这些错误,而不会获取任何新的内容。
例如,500个feed被轮询,其中10个抛出一个异常,其中feed已经被移动而不再存在,它有坏的xml或者可能发生了其他事情。现在,轮询器只能一遍又一遍地轮询这10个提要,直到不再失败(这永远不会发生)。我记得在某个地方读到,如果抛出异常 handleError 函数则会继续重试,但如果没有抛出异常,则应该继续。这里似乎不是这样。我已经花了3天的时间来考虑这个问题,尝试不同的解决方案,但它总是在大约4-10分钟后窒息,这取决于需要多长时间才能获取足够的错误提要。

c9qzyr3d

c9qzyr3d1#

我想说的是,您尝试谈论许多不同的异常:不可用、格式错误、处理错误等。我认为最佳做法是区分这些异常,并为每个异常做出适当的业务决策:我们绝对不能将“不可用”视为处理错误,必须重试,直到它恢复可用。
对于处理一个,我建议调查一个 ExpressionEvaluatingRequestHandlerAdvice 在服务上,您的流程输入的提要:https://docs.spring.io/spring-integration/docs/current/reference/html/messaging-endpoints.html#message-处理建议链
对于格式错误的源代码,我建议从上下文中查看流删除。所以,你再也不会试图调查一个错误的来源了。你绝对可以从错误流中找到答案 feedErrors 频道。
在这种情况下,你所面临的其他错误也是如此。
我们可能会修改你的建议 10 are of those throw an exception where the feeds have been removed for whatever reason 更详细地说 FeedEntryMessageSource 是基于 updatedDate 条目的名称。所以,如果我们不能正确地转换和生产,它真的会再次被拉,因为 lastTime 状态未适当更新。但是,让我们在单独的so线程中使用更多的细节进行调查!
更新
关于堆栈跟踪的一些想法。
例外情况如下 org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 15681; Character refer 是不好的,必须视为致命的。你肯定不能再解析rss源了,至少在它恢复正常之前。这种异常可以像 stop() 为了这个 Feed.inboundAdapter() . 所以,在 feedErrors 通道流并调用 stop() 导致终结点的。或者最好删除这个rss源的整个动态流。
例外 org.xml.sax.SAXParseException; lineNumber: 1; columnNumber: 10; DOCTYPE is disallowed when the feature "http://apache.org/xml/features/disallow-doctype-decl" set to true. 也是不好的,可能会被同样致命的方式与后续治疗 stop() 对于端点。或者你可以调查 xerces 库如何允许该dtd声明。可能是这样的选择 syndFeedInput(SyndFeedInput) 还可以为xml解析提供一些钩子。例如,我看到它有以下选项:

/**
 * Since ROME 1.5.1 we fixed a security vulnerability by disallowing Doctype declarations by default. 
 * This change breaks the compatibility with at least RSS 0.91N because it requires a Doctype declaration. 
 * You are able to allow Doctype declarations again with this property. You should only activate it 
 * when the feeds that you process are absolutely trustful. 
 *  
 * @param allowDoctypes true when Doctype declarations should be allowed again, false otherwise
 */
public void setAllowDoctypes(boolean allowDoctypes) {

相关问题