CoreNLP: How to use the neural coref model

rggaifut · asked 4 months ago

I am trying to use the coreference resolution models as a baseline. Everything works fine with the 'dcoref' and 'statistical' coref models, but when I try the 'neural' model I get this error:

[pool-1-thread-2] INFO CoreNLP - [/127.0.0.1:55494] API call w/annotators tokenize,ssplit,pos,lemma,ner,depparse,coref
Uh-huh . It happened that I was going to have lunch with a friend , um , at noon . And then , the friend first sent me an SMS , Uh-huh . saying he would come pick me up to go together .
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator tokenize
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ssplit
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator pos
[pool-1-thread-2] INFO edu.stanford.nlp.tagger.maxent.MaxentTagger - Loading POS tagger from edu/stanford/nlp/models/pos-tagger/english-left3words/english-left3words-distsim.tagger ... done [0.6 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator lemma
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator ner
[pool-1-thread-2] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.all.3class.distsim.crf.ser.gz ... done [1.3 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.muc.7class.distsim.crf.ser.gz ... done [0.6 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.ie.AbstractSequenceClassifier - Loading classifier from edu/stanford/nlp/models/ner/english.conll.4class.distsim.crf.ser.gz ... done [0.6 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.time.JollyDayHolidays - Initializing JollyDayHoliday for SUTime from classpath edu/stanford/nlp/models/sutime/jollyday/Holidays_sutime.xml as sutime.binder.1.
[pool-1-thread-2] INFO edu.stanford.nlp.time.TimeExpressionExtractorImpl - Using following SUTime rules: edu/stanford/nlp/models/sutime/defs.sutime.txt,edu/stanford/nlp/models/sutime/english.sutime.txt,edu/stanford/nlp/models/sutime/english.holidays.sutime.txt
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 580704 unique entries out of 581863 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_caseless.tab, 0 TokensRegex patterns.
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 4869 unique entries out of 4869 from edu/stanford/nlp/models/kbp/english/gazetteers/regexner_cased.tab, 0 TokensRegex patterns.
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.TokensRegexNERAnnotator - ner.fine.regexner: Read 585573 unique entries from 2 files
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator depparse
[pool-1-thread-2] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Loading depparse model: edu/stanford/nlp/models/parser/nndep/english_UD.gz ...
[pool-1-thread-2] INFO edu.stanford.nlp.parser.nndep.Classifier - PreComputed 99996, Elapsed Time: 8.211 (s)
[pool-1-thread-2] INFO edu.stanford.nlp.parser.nndep.DependencyParser - Initializing dependency parser ... done [9.4 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.StanfordCoreNLP - Adding annotator coref
[pool-1-thread-2] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref model edu/stanford/nlp/models/coref/neural/english-model-default.ser.gz ... done [0.4 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.coref.neural.NeuralCorefAlgorithm - Loading coref embeddings edu/stanford/nlp/models/coref/neural/english-embeddings.ser.gz ... done [0.4 sec].
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - Error with building coref mention annotator!
[pool-1-thread-2] INFO edu.stanford.nlp.pipeline.CorefMentionAnnotator - java.lang.ClassNotFoundException: edu.stanford.nlp.hcoref.md.MentionDetectionClassifier

For your information, I am using the latest release (3.9.2) and I am calling the server via stanford-corenlp (https://github.com/Lynten/stanford-corenlp):

from stanfordcorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP(r'/Users/stanford-corenlp-full-2018-10-05/', quiet=False)
props = {'annotators': 'coref', 'coref.algorithm': 'neural', 'pipelineLanguage': 'en'}

Can anyone help? Thanks a lot!

svmlkihl #1

I think the main issue here is that neural coref does not work with just a dependency parse (it was not trained to work with dependency-based mention detection), so you have to run the parse annotator (which, unfortunately, is slow). Making that change in your annotator list may be enough: instead of just 'annotators': 'coref', use 'annotators': 'tokenize,ssplit,pos,lemma,ner,parse,coref'.
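
For reference, here is a minimal sketch of that change using the same Python wrapper as in the question; the server path and the sample text are placeholders, not part of the original post:

from stanfordcorenlp import StanfordCoreNLP

nlp = StanfordCoreNLP(r'/Users/stanford-corenlp-full-2018-10-05/')
props = {
    # Run the constituency parser so neural coref gets the mention detection it was trained with.
    'annotators': 'tokenize,ssplit,pos,lemma,ner,parse,coref',
    'coref.algorithm': 'neural',
    'pipelineLanguage': 'en',
    'outputFormat': 'json',
}
text = 'Barack Obama was born in Hawaii. He was the 44th president.'  # placeholder input
result = nlp.annotate(text, properties=props)  # server response as a JSON string
nlp.close()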

You can also start the server yourself, following the Python library's instructions, so that it does not depend on your Python process:

Add -serverProperties neural-coref.props to the Java command.

Put these settings in neural-coref.props:

annotators = tokenize, ssplit, pos, lemma, ner, parse, coref
coref.algorithm = neural

Then make your requests without passing any special properties.
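
As a rough sketch of that setup, assuming the server was started separately on port 9000 with -serverProperties neural-coref.props: the wrapper from the question can attach to an existing server by URL, and the request then relies on the server-side defaults instead of per-request coref properties.

from stanfordcorenlp import StanfordCoreNLP

# Attach to the already-running server rather than spawning one from Python.
nlp = StanfordCoreNLP('http://localhost', port=9000)
text = 'Barack Obama was born in Hawaii. He was the 44th president.'  # placeholder input
# Only the output format is requested here; annotators and coref.algorithm
# come from neural-coref.props on the server side.
result = nlp.annotate(text, properties={'outputFormat': 'json'})
nlp.close()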

Of course, there are countless ways to handle this situation. Also, I think you have found a bug in the dependency-based mention detection models we are shipping; they have not been updated to match the current code. I will try to provide clearer documentation for the coref annotator about which option combinations actually work.

vmdwslir #2

Thanks for the reply.
Calling the server through the Python API still does not work, but I managed to get around it by first writing out a batch of documents and then parsing them directly with the Java command. So I think the problem is probably in the server's support for the coref package rather than in the coref package itself.
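
For completeness, a rough sketch of such a workaround (the file name, memory setting, and working directory are illustrative assumptions): it simply shells out to the standard CoreNLP command-line entry point with the same props file, bypassing the server.

import subprocess

# Run the pipeline directly over a text file from the CoreNLP distribution directory.
# Assumes the CoreNLP jars are picked up by the classpath wildcard and that
# neural-coref.props contains the annotator list and coref.algorithm shown above.
subprocess.run(
    ['java', '-Xmx5g', '-cp', '*',
     'edu.stanford.nlp.pipeline.StanfordCoreNLP',
     '-props', 'neural-coref.props',
     '-file', 'document.txt',
     '-outputFormat', 'json'],
    check=True,
)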
