While extracting entities from news articles, I noticed the following behavior:
These words are present in articles but are not extracted by the models.
Does anyone know the reason?
Info about spaCy
- spaCy version: 3.7.5
- Platform: Linux-6.1.85+-x86_64-with-glibc2.35
- Python version: 3.10.12
- Pipelines: es_core_news_lg (3.7.0), it_core_news_lg (3.7.0)
1 answer
nfzehxib
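This may be caused by the following: the es_core_news_lg and it_core_news_lg models were trained specifically for Spanish and Italian, respectively. If the entities you are trying to extract are domain-specific or uncommon, these models may perform poorly. To address this, you can try the following steps and let me know whether they work: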
Sample code:
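One common way to handle terms a trained pipeline misses is to add a rule-based EntityRuler in front of the statistical NER. The following is only a minimal sketch under stated assumptions: the helper names load_pipeline and add_missing_terms, the term "Acme Noticias", and the ORG label are all hypothetical placeholders, so substitute the words your articles actually contain.

```python
import spacy


def load_pipeline(model_name: str, lang: str):
    """Load a trained pipeline; fall back to a blank one if it is not installed."""
    try:
        return spacy.load(model_name)
    except OSError:
        # Model package missing: a blank pipeline still tokenizes but has no NER.
        return spacy.blank(lang)


def add_missing_terms(nlp, patterns):
    """Add an EntityRuler so the listed terms are tagged by rule."""
    if "ner" in nlp.pipe_names:
        # Placed before the statistical NER, which then predicts around these spans.
        ruler = nlp.add_pipe("entity_ruler", before="ner")
    else:
        ruler = nlp.add_pipe("entity_ruler")
    ruler.add_patterns(patterns)
    return nlp


# Hypothetical term and label: replace with the words the model fails to extract.
patterns = [{"label": "ORG", "pattern": "Acme Noticias"}]

nlp = add_missing_terms(load_pipeline("es_core_news_lg", "es"), patterns)
doc = nlp("Acme Noticias publicó un informe ayer.")
print([(ent.text, ent.label_) for ent in doc.ents])
```

The same approach should apply to it_core_news_lg by passing that model name and "it" instead. If the list of missing terms is large or keeps changing, fine-tuning the NER component on annotated examples may be a better fit than rules.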
Hope this helps, thanks!