I am trying to extract triples from an annotated string, but the code returns NULL. Here is the code I used:
library(tidyverse)
library(tm)
library(coreNLP)

# Increase Java heap space
options(java.parameters = "-Xmx4g")

# Initialize CoreNLP with the path to the unzipped folder
initCoreNLP("/Users/..../stanford-corenlp-4.5.7/")

# Function to extract relations using CoreNLP
extract_relations <- function(text) {
  cat("Text to be annotated:\n", text, "\n\n")
  annotation <- tryCatch({
    annotateString(text)
  }, error = function(e) {
    message("Error in annotation: ", e)
    return(NULL)
  })
  if (is.null(annotation)) {
    message("Annotation is NULL")
    return(list())
  }
  print(annotation)
  triples <- tryCatch({
    getOpenIE(annotation)
  }, error = function(e) {
    message("Error in extracting OpenIE triples: ", e)
    return(NULL)
  })
  if (is.null(triples) || length(triples) == 0) {
    message("No triples extracted.")
    return(list())
  }
  print(triples)
}

# Mock dataset to train the model
mock_data <- data.frame(
  match_id = 1,
  home_team = "Manchester United",
  away_team = "Chelsea",
  match_preview = "Manchester United won their last game convincingly and have a strong home record. Chelsea, on the other hand, are struggling with injuries and have lost three of their last five away games.",
  outcome = "homewin",
  stringsAsFactors = FALSE
)

# Extracting features and assigning scores
match <- mock_data[1, ]
relations <- extract_relations(match$match_preview)
This is the output:
Stanford CoreNLP version: stanford-corenlp-4.5.7
R version: 4.3.1
Is this an issue with the way CoreNLP is initialised, or something else? Any help is appreciated.
Regards.
3 answers
new9mtju1#
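Note that to highlight a large block of code you need three backticks ``` rather than one.
I don't know anything about the R interface to CoreNLP. I suggest checking the interface's output to make sure it is actually starting CoreNLP with the OpenIE annotator in the first place.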
qkf9rpyu2#
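Hi @AngledLuffa,
Thanks for the heads-up.
I think it is starting CoreNLP, because annotateString(text) annotates the text successfully. The problem is with the triple extraction. Could this be caused by some missing annotators?
P.S. I downloaded stanford-corenlp-4.5.7 manually, but I can't seem to find a .properties file in the package. Not sure if that is the problem.
Regards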
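For reference, a minimal sketch of what such a properties file could look like for OpenIE. The CoreNLP OpenIE documentation lists tokenize, ssplit, pos, lemma, depparse and natlog as the annotators that must run before openie. The file name and the variable names below are illustrative, and passing the file to initCoreNLP() through a parameterFile argument is an assumption about the R package that is not confirmed in this thread, so that call is left commented out.

```r
# Annotators that CoreNLP's OpenIE stage depends on, in pipeline order
openie_annotators <- c("tokenize", "ssplit", "pos", "lemma",
                       "depparse", "natlog", "openie")

# Write a minimal properties file (the file name is illustrative)
props_path <- file.path(tempdir(), "openie.properties")
writeLines(paste0("annotators = ", paste(openie_annotators, collapse = ", ")),
           props_path)

# Read the file back and confirm that openie is listed
props <- readLines(props_path)
annotator_line <- grep("^annotators", props, value = TRUE)
listed <- trimws(strsplit(sub("^annotators\\s*=\\s*", "", annotator_line), ",")[[1]])
print("openie" %in% listed)

# Assumption: initCoreNLP() accepts this file through a parameterFile argument.
# Check ?initCoreNLP before relying on it.
# initCoreNLP("/Users/..../stanford-corenlp-4.5.7/", parameterFile = props_path)
```

If the R package does accept a custom properties file, a file like this would be one way to make sure openie is part of the pipeline before getOpenIE() is called.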
brtdzjyr3#
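Sounds good. As a first step, I would check whether the OpenIE model is actually among the annotators that R loads when it creates the pipeline it interfaces with. If the R interface lets you see the pipeline's output, it should show up there.
Personally, I have no experience with the R interface, so I suggest testing this yourself rather than relying on our help. You could also track down the author of the R interface and ask them how to check for the OpenIE package.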
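One rough way to run that check from the R side is sketched below. It assumes the object returned by annotateString() is a named list whose components correspond to the annotators that ran, which nobody in this thread has confirmed. The helper has_component and the mock object are hypothetical and exist only to keep the example runnable without CoreNLP.

```r
# Hypothetical helper: TRUE if any top-level component name matches `pattern`
has_component <- function(x, pattern) {
  any(grepl(pattern, names(x), ignore.case = TRUE))
}

# Mock stand-in for an annotation object; the component names are made up
mock_annotation <- list(token = data.frame(), parse = character(0))

# List the top-level components, then look for anything named like "openie"
print(names(mock_annotation))
print(has_component(mock_annotation, "openie"))
```

In a real session, mock_annotation would be replaced by the result of annotateString(text), and str(annotation, max.level = 1) gives a fuller view of what the pipeline actually produced.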