nltk/nltk/tag/mapping.py
Lines 90 to 112 in 2a5aece
    if target == "universal":
        _load_universal_map(source)
        # Added the new Russian National Corpus mappings because the
        # Russian model for nltk.pos_tag() uses it.
        _MAPPINGS["ru-rnc-new"]["universal"] = {
            "A": "ADJ",
            "A-PRO": "PRON",
            "ADV": "ADV",
            "ADV-PRO": "PRON",
            "ANUM": "ADJ",
            "CONJ": "CONJ",
            "INTJ": "X",
            "NONLEX": ".",
            "NUM": "NUM",
            "PARENTH": "PRT",
            "PART": "PRT",
            "PR": "ADP",
            "PRAEDIC": "PRT",
            "PRAEDIC-PRO": "PRON",
            "S": "NOUN",
            "S-PRO": "PRON",
            "V": "VERB",
        }
This patch from #2151 does not work: a call with source == 'ru-rnc-new' fails at this line
nltk/nltk/tag/mapping.py
Line 90 in 2a5aece
    if target == "universal":
with a LookupError for the file 'ru-rnc-new.map'.
So why not change 'ru-rnc-new' to ru-rnc.map, or simply create ru-rnc-new.map?
P.S. This is @alvations's patch, so I am requesting the author's input.
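The failure reported above can be modelled without NLTK. The sketch below is a simplified stand-in, not the library's code: `load_universal_map`, `AVAILABLE_MAP_FILES`, and both `tagset_mapping_*` functions are hypothetical names, and it assumes the real loader raises `LookupError` when no `.map` file exists for the source tagset. It shows why the hardcoded mapping is never reached when the file lookup runs first, and one possible reordering that avoids the lookup.

```python
from collections import defaultdict

# Stand-in for the .map files available on disk (hypothetical contents):
# "ru-rnc" has a file, "ru-rnc-new" does not.
AVAILABLE_MAP_FILES = {"ru-rnc": {"A": "ADJ", "NN": "NOUN", "V": "VERB"}}

# Hardcoded mapping, abbreviated from the patch quoted in the issue.
RU_RNC_NEW = {"A": "ADJ", "S": "NOUN", "V": "VERB", "PR": "ADP"}


def load_universal_map(mappings, source):
    """Mimic a file-backed loader: raise LookupError if no .map file exists."""
    if source not in AVAILABLE_MAP_FILES:
        raise LookupError(f"Resource {source}.map not found")
    mappings[source]["universal"] = dict(AVAILABLE_MAP_FILES[source])


def tagset_mapping_as_reported(mappings, source, target):
    """Control flow as quoted in the issue: load from file, then patch."""
    if source not in mappings or target not in mappings[source]:
        if target == "universal":
            load_universal_map(mappings, source)  # raises for "ru-rnc-new"
            mappings["ru-rnc-new"]["universal"] = dict(RU_RNC_NEW)
    return mappings[source][target]


def tagset_mapping_reordered(mappings, source, target):
    """One possible fix: install the hardcoded mapping before any file lookup."""
    if "universal" not in mappings["ru-rnc-new"]:
        mappings["ru-rnc-new"]["universal"] = dict(RU_RNC_NEW)
    if source not in mappings or target not in mappings[source]:
        if target == "universal":
            load_universal_map(mappings, source)
    return mappings[source][target]
```

With a fresh `defaultdict(lambda: defaultdict(dict))`, the first function raises `LookupError` for `"ru-rnc-new"`, while the second returns the hardcoded dictionary and still loads file-backed tagsets such as `"ru-rnc"` on demand.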
3 answers
2jcobegt1#
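Thanks for raising this issue. The patch for ru-rnc-new was meant to hot-swap in the new mapping without breaking the existing data in nltk_data. I also think the better approach is to put the new mapping in a ru-rnc-new.map file and then add that file to nltk_data/taggers/universal_tagset. In any case, the code implementation did not fail and works as needed. I may not have understood the issue, so please explain if the tagset_mapping implemented in #2151 is not working as expected =)
huwehgph2#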
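The problem is that, when trying to use the universal tagset, a LookupError is raised while searching for the 'ru-rnc-new.map' file. I downloaded and checked all the files myself via nltk.download, but the error persists.
0kjbasz63#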
It looks like the bug still exists?
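The alternative raised in the thread, shipping the mapping as a ru-rnc-new.map file, could be sketched as follows. This is only an illustration: it assumes the .map files use one tab-separated "fine tag, universal tag" pair per line, the helper names `write_map_file` and `read_map_file` are hypothetical, and the file is written to a temporary directory rather than a real nltk_data tree.

```python
import os
import tempfile

# Mapping abbreviated from the patch quoted in the issue.
RU_RNC_NEW = {"A": "ADJ", "S": "NOUN", "V": "VERB", "PR": "ADP", "NONLEX": "."}


def write_map_file(mapping, path):
    """Write one 'fine<TAB>coarse' pair per line (assumed .map layout)."""
    with open(path, "w", encoding="utf-8") as fh:
        for fine, coarse in sorted(mapping.items()):
            fh.write(f"{fine}\t{coarse}\n")


def read_map_file(path):
    """Parse the file back into a dict, skipping blank lines."""
    mapping = {}
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            fine, coarse = line.split("\t")
            mapping[fine] = coarse
    return mapping


# Round-trip in a temporary directory so no real data directory is modified.
with tempfile.TemporaryDirectory() as tmp:
    map_path = os.path.join(tmp, "ru-rnc-new.map")
    write_map_file(RU_RNC_NEW, map_path)
    round_tripped = read_map_file(map_path)
```

A file produced this way would still have to be placed under nltk_data/taggers/universal_tagset, as the first answer suggests, before a file-backed loader could find it.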