Error in plot.window(...): need finite 'xlim' values when cutting off values at a threshold in R

I have a dataset like this (a small sample):

final_data=structure(list(Y = c(2282L, 2565L, 2242L, 2109L, 2704L, 2352L, 
2492L, 2608L, 2667L, 1863L), is_red_ndvi_v_down = c("yes", "yes", 
"yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes"), ndvi_v_down = c(0.032460447, 
0.028369653, 0.017094017, 0.016972906, 0.015228979, 0.020649285, 
0.028151986, 0.036528581, 0.036026201, 0.017097506), is_red_mtci_m50_85 = c("yes", 
"yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes"
), mtci_m50_85 = c(0.195646208, 0.112022057, 0.229670211, 0.19607818, 
0.205472798, 0.314782868, 0.238119728, 0.230381033, 0.21754644, 
0.092345478), is_red_gcvi_m75_2 = c("yes", "yes", "yes", "yes", 
"yes", "yes", "yes", "yes", "yes", "yes"), gcvi_m75_2 = c(5.590222802, 
4.439820215, 6.659634599, 6.321884806, 5.049482031, 5.039738058, 
5.336354603, 6.236330399, 6.231815273, 5.627697383), is_red_vdvi_vi_max = c("yes", 
"yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes", "yes"
), vdvi_vi_max = c(0.428571429, 0.283018868, 0.307692308, 0.307692308, 
0.591836735, 0.50877193, 0.393939394, 0.461538462, 0.514285714, 
0.428571429), is_red_mtci_m35_50 = c("yes", "yes", "yes", "yes", 
"yes", "yes", "yes", "yes", "yes", "yes"), mtci_m35_50 = c(0.080354124, 
0.134743258, 0.14510097, 0.198023501, 0.278444767, 0.235650507, 
0.316062043, 0.216993856, 0.235756291, 0.002585028)), row.names = c(NA, 
10L), class = "data.frame")

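The data contain measurement variables and, for each one, a categorical variable with the same name prefixed by is_red_. For example, ndvi_v_down is a measurement variable and is_red_ndvi_v_down is a categorical variable with the value yes or no. "yes" means the point is drawn in red on the plot, which indicates that it lies close to a monotonic straight line (when one predictor or another is related to Y). All of this was written to the final dataset for visual inspection.

Now I want to update this final dataset as follows. The cut-off threshold should be determined automatically (rather than manually, as I did before). To do that, the threshold is determined with k-means, which I have implemented. But after running the code, I need the values (yes or no) of all the categorical variables to be updated in the final dataset. Here is my attempt: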

plot_and_threshold <- function(data, response_var) {
   # Plotting and thresholding for each predictor
   for (x_var in names(data)) {
     if (x_var != response_var) {
       # Variables for x and y axis
       x <- as.numeric(data[[x_var]])
       y <- as.numeric(data[[response_var]])
      
       # Plot with red dots
       plot(x, y, col = "red", xlim = range(x, na.rm = TRUE))
      
       # Linear regression to fit data
       model <- lm(y ~ x)
      
       # Getting trend line coefficients
       a <- coef(model)[1]
       b <- coef(model)[2]
      
       # Calculate the distance between points and the trend line
       distances <- abs(y - (a + b * x))
      
       # Histogram clustering
       kmeans_obj <- kmeans(matrix(distances), centers = 2)
       cluster_centers <- kmeans_obj$centers
      
       # Threshold in the area of cluster separation
       threshold <- (cluster_centers[1] + cluster_centers[2]) / 2
      
       # Update variables is_red_*
       data[[paste0("is_red_", x_var)]] <- ifelse(distances > threshold, "no", "yes")
      
       # Plot with updated data
       points(x, y, col = ifelse(data[[paste0("is_red_", x_var)]] == "no", "grey", "red"))
     }
   }
  
   # Returning an updated dataset
   return(data)
}
final_data_updated <- plot_and_threshold(final_data, "Y")

I get this error:

Error in plot.window(...) : need finite 'xlim' values
In addition: Warning messages:
1: In plot_and_threshold(final_data, "Y") :
   NAs introduced by coercion
2: In min(x) : no non-missing arguments to min; returning Inf
3: In max(x) : no non-missing arguments to max; returning -Inf

What am I doing wrong? How can I correctly get the updated dataset? Thanks for your help.

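Your function works fine. It only fails because the loop also runs over the is_red_* character columns, which as.numeric() turns into NA. Pass only the numeric columns: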

> final_data_updated <- 
+   plot_and_threshold(final_data[sapply(final_data, is.numeric)], "Y")


> final_data_updated
      Y ndvi_v_down mtci_m50_85 gcvi_m75_2 vdvi_vi_max mtci_m35_50 is_red_ndvi_v_down
1  2282  0.03246045  0.19564621   5.590223   0.4285714 0.080354124                yes
2  2565  0.02836965  0.11202206   4.439820   0.2830189 0.134743258                yes
3  2242  0.01709402  0.22967021   6.659635   0.3076923 0.145100970                yes
4  2109  0.01697291  0.19607818   6.321885   0.3076923 0.198023501                yes
5  2704  0.01522898  0.20547280   5.049482   0.5918367 0.278444767                 no
6  2352  0.02064928  0.31478287   5.039738   0.5087719 0.235650507                yes
7  2492  0.02815199  0.23811973   5.336355   0.3939394 0.316062043                yes
8  2608  0.03652858  0.23038103   6.236330   0.4615385 0.216993856                yes
9  2667  0.03602620  0.21754644   6.231815   0.5142857 0.235756291                yes
10 1863  0.01709751  0.09234548   5.627697   0.4285714 0.002585028                 no
   is_red_mtci_m50_85 is_red_gcvi_m75_2 is_red_vdvi_vi_max is_red_mtci_m35_50
1                 yes               yes                yes                yes
2                  no               yes                 no                 no
3                  no               yes                yes                yes
4                  no                no                yes                 no
5                  no                no                yes                yes
6                  no               yes                yes                yes
7                 yes               yes                yes                yes
8                  no                no                yes                yes
9                  no                no                yes                yes
10                 no                no                 no                yes
>
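
To see where the error comes from, here is a minimal sketch with a made-up character vector (x_chr, x_num and x_lim are illustrative names, not part of the original code). It reproduces what happens when the loop reaches one of the is_red_* columns: the conversion yields only NA, range() has nothing left to work with, and plot() then refuses the resulting limits.

```r
# Hypothetical vector standing in for one of the is_red_* columns
x_chr <- c("yes", "yes", "no")

# as.numeric() cannot parse "yes"/"no", so every element becomes NA
# (this is the "NAs introduced by coercion" warning)
x_num <- suppressWarnings(as.numeric(x_chr))
print(x_num)

# With na.rm = TRUE no values remain, so min()/max() fall back to Inf / -Inf
x_lim <- suppressWarnings(range(x_num, na.rm = TRUE))
print(x_lim)

# plot(..., xlim = x_lim) needs finite limits; this check is FALSE here
print(all(is.finite(x_lim)))
```

The warnings are suppressed here only to keep the output short; they are the same three warnings shown in the question.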

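If you would rather keep the original column order and overwrite the existing is_red_* columns in place, one option is to skip non-numeric columns inside the function instead of subsetting before the call. The following is only a sketch under a few assumptions: update_is_red(), its seed argument and the demo data frame are hypothetical names and made-up values, plotting is left out for brevity, and set.seed() is added because kmeans() starts from random centers, so the flags can otherwise differ between runs.

```r
# Sketch: same thresholding idea as plot_and_threshold(), without the plots.
# update_is_red() and `seed` are hypothetical, not part of the original code.
update_is_red <- function(data, response_var, seed = 1) {
  set.seed(seed)  # fix the random starting centers used by kmeans()
  y <- data[[response_var]]

  # Treat only numeric columns other than the response as predictors
  predictors <- setdiff(names(data)[sapply(data, is.numeric)], response_var)

  for (x_var in predictors) {
    x <- data[[x_var]]

    # Absolute vertical distance of each point from the fitted trend line
    fit <- lm(y ~ x)
    distances <- abs(y - (coef(fit)[1] + coef(fit)[2] * x))

    # Two clusters of distances; threshold = midpoint of the two centers
    centers <- kmeans(matrix(distances), centers = 2)$centers
    threshold <- mean(centers)

    # Overwrite (or create) the matching is_red_* column
    data[[paste0("is_red_", x_var)]] <- ifelse(distances > threshold, "no", "yes")
  }
  data
}

# Made-up data, only to show the call; row 4 lies far from the trend line
demo <- data.frame(
  Y = c(10, 12, 15, 30, 18, 21),
  a = c(1, 2, 3, 4, 5, 6),
  is_red_a = rep("yes", 6),
  stringsAsFactors = FALSE
)
demo_updated <- update_is_red(demo, "Y")
print(demo_updated)
```

With your data the call would be update_is_red(final_data, "Y"). Like the original function, this assumes the numeric columns contain no missing values.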