CoreNLP is unable to extract triples

c2e8gylq  posted 2 months ago in Other

I am trying to extract triples from an annotated string, but the code returns NULL. Here is the code I used:

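# Increase Java heap space. This must be set before rJava is loaded
# (library(coreNLP) loads it); it has no effect on a JVM that is already running.
options(java.parameters = "-Xmx4g")

library(tidyverse)
library(tm)
library(coreNLP)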

# Initialize CoreNLP with the path to the unzipped folder
initCoreNLP("/Users/..../stanford-corenlp-4.5.7/")  

# Function to extract relations using CoreNLP
extract_relations <- function(text) {
  cat("Text to be annotated:\n", text, "\n\n")
  
  annotation <- tryCatch({
    annotateString(text)
  }, error = function(e) {
    message("Error in annotation: ", conditionMessage(e))
    return(NULL)
  })
  
  if (is.null(annotation)) {
    message("Annotation is NULL")
    return(list())
  }
  
  print(annotation)
  
  triples <- tryCatch({
    getOpenIE(annotation)
  }, error = function(e) {
    message("Error in extracting OpenIE triples: ", conditionMessage(e))
    return(NULL)
  })
  
  # NROW() also catches a data frame with zero rows, which length() would miss
  if (is.null(triples) || NROW(triples) == 0) {
    message("No triples extracted.")
    return(list())
  }
  
  print(triples)

  # Return the triples explicitly so the caller can use them
  triples
}

# Mock dataset to train the model
mock_data <- data.frame(
  match_id = 1,
  home_team = "Manchester United",
  away_team = "Chelsea",
  match_preview = "Manchester United won their last game convincingly and have a strong home record. Chelsea, on the other hand, are struggling with injuries and have lost three of their last five away games.",
  outcome = "homewin",
  stringsAsFactors = FALSE
)

# Extracting features and assigning scores
match <- mock_data[1, ]
relations <- extract_relations(match$match_preview)

This is the output:

Stanford CoreNLP version: stanford-corenlp-4.5.7
R version: 4.3.1
Is it an issue with the way CoreNLP is initialised or something else? Any help is appreciated.
Regards.

new9mtju1#

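Note that to highlight a large code block you need three backticks rather than one.
I don't know anything about the R interface to CoreNLP. I would suggest checking the interface's output to make sure it is actually starting CoreNLP with the OpenIE annotator in the first place.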
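One way to rule out a missing annotator is to hand CoreNLP an explicit properties file. The following is only a sketch: it assumes the installed coreNLP R package's initCoreNLP() accepts a parameterFile argument (check ?initCoreNLP). The annotator list is the dependency chain the CoreNLP documentation gives for OpenIE, the properties file name is arbitrary, and the library path is left elided exactly as in the question.

```r
# Annotators OpenIE depends on, in pipeline order (per the CoreNLP OpenIE docs)
annotators <- c("tokenize", "ssplit", "pos", "lemma", "depparse", "natlog", "openie")

# Write a one-line properties file to a temporary location (file name is arbitrary)
props_file <- file.path(tempdir(), "openie.properties")
writeLines(paste0("annotators = ", paste(annotators, collapse = ",")), props_file)

# Path exactly as given in the question; replace it with the real folder
corenlp_dir <- "/Users/..../stanford-corenlp-4.5.7/"

# Assumption: initCoreNLP() takes `parameterFile`.
# The call is skipped when the folder or the package is not available.
if (dir.exists(corenlp_dir) && requireNamespace("coreNLP", quietly = TRUE)) {
  coreNLP::initCoreNLP(corenlp_dir, parameterFile = props_file)
}
```

If the pipeline then starts with openie listed among its annotators and getOpenIE() still returns nothing, the problem is more likely in how the R package parses the output than in the initialisation.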

qkf9rpyu2#

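Hi @AngledLuffa,

Thanks for the heads-up.

I think it is starting CoreNLP, because annotateString(text) annotates the text successfully. The problem is with the triple extraction. Could it be caused by some missing annotators?

P.S. I downloaded stanford-corenlp-4.5.7 manually, but I can't seem to find a .properties file in the package. I'm not sure whether that is the problem.

Regards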

brtdzjyr3#

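Sounds good. First, I would check whether the OpenIE model is actually among the annotators R loads when it creates the pipeline for the interface. If the R interface lets you see the pipeline's output, it should show up there.
Personally, I have no experience with the R interface, so I'd suggest testing this yourself rather than relying on our help. You could also track down the author of the R interface and ask them how to check the OpenIE package.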
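A generic base-R way to do that check is to list the top-level components of the object returned by annotateString() and count how many rows each holds; a missing or empty OpenIE-related component would suggest the annotator never ran. This is a sketch only: it assumes the annotation is a list-like object, component_sizes is a hypothetical helper, and the stand-in object and its component names are invented for illustration, not real CoreNLP output.

```r
# Report how many rows (or elements) each top-level component of an object holds
component_sizes <- function(annotation) {
  vapply(annotation, NROW, integer(1))
}

# Stand-in object for illustration only; NOT real CoreNLP output,
# and the component names are hypothetical
fake_annotation <- list(
  token  = data.frame(sentence = c(1, 1), token = c("Chelsea", "lost")),
  openie = data.frame()
)

sizes <- component_sizes(fake_annotation)
print(sizes)
```

On a real session the call would be component_sizes(annotation), using the object that annotateString(text) returned.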
