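I am working on the GEO dataset GSE166427, which uses platform GPL13667. I am trying to fetch the data with the R script below, but the last line produces errors and warnings, and the ExpressionSet is not loaded into my session. How should I call GEOquery::getGEO to obtain the ExpressionSet?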
library(Biobase)
library(GEOquery)
library(limma)
series <- "GSE166427"
platform <- "GPL13667"
setwd("D:/Proj/DEGs/COAD DEGs/")
Sys.setenv(VROOM_CONNECTION_SIZE=131072*10)
gset <- getGEO(series,
               GSEMatrix = TRUE,
               AnnotGPL = TRUE,
               destdir = "COAD Data/")
At this step I get:
Found 2 file(s)
GSE166427-GPL13534_series_matrix.txt.gz
Using locally cached version: COAD Data//GSE166427-GPL13534_series_matrix.txt.gz
Rows: 0 Columns: 241
- Column specification -----------------------------------------------------------------------------------
Delimiter: "\t"
chr (241): ID_REF, GSM3759532, GSM3759533, GSM3759534, GSM3759535, GSM3759536, GSM3759537, GSM3759538, ...
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Annotation GPL not available, so will use submitter GPL instead
Using locally cached version of GPL13534 found here: `COAD Data//GPL13534.soft`
Error in `parseGSEMatrix(fname, destdir = destdir, AnnotGPL = AnnotGPL, : parsing failed--expected only one '!series_data_table_begin'`
In addition: Warning message:
In `download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery"))` :
cannot open URL <https://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL13nnn/GPL13534/annot/GPL13534.annot.gz>: HTTP status was '404 Not Found'
So I changed the script to:
gset <- getGEO(series,
               filename = "COAD Data/GSE166427-GPL13667_series_matrix.txt.gz",
               GSEMatrix = TRUE,
               AnnotGPL = TRUE,
               destdir = "COAD Data/")
The output is still:
Rows: 49386 Columns: 247
- Column specification -----------------------------------------------------------------------------------
Delimiter: "\t"
chr (1): ID_REF
dbl (246): GSM1077598, GSM1077599, GSM1077600, GSM1077601, GSM1077602, GSM1077603, GSM1077604, GSM10776...
i Use `spec()` to retrieve the full column specification for this data.
i Specify the column types or set `show_col_types = FALSE` to quiet this message.
Annotation GPL not available, so will use submitter GPL instead
Using locally cached version of GPL13667 found here: `COAD Data//GPL13667.soft`
Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length
In addition: Warning messages:
1: In `download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery"))` :
cannot open URL <https://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL13nnn/GPL13667/annot/GPL13667.annot.gz> : HTTP status was '404 Not Found'
2: One or more parsing issues, call `problems()` on your data frame for details, e.g.:
  dat <- vroom(...)
  problems(dat)
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)
Matrix products: default
1 Answer
enxuqcxy #1
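Your directory probably does not exist; try creating it before calling getGEO.
Also note that directory names should not contain spaces, so you should really use "COAD_Data".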
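A minimal sketch of that suggestion, assuming the fix is simply to make sure the download directory exists before getGEO writes into it. `ensure_dir` is a hypothetical helper introduced here for illustration, not part of GEOquery, and the getGEO call is left commented out because it needs network access and the GEOquery package:

```r
# Hypothetical helper (not part of GEOquery): create the directory if it
# is missing and return its path unchanged.
ensure_dir <- function(path) {
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  path
}

# Directory name without spaces, as the answer recommends.
dest <- ensure_dir("COAD_Data")

# With the directory in place, the download could then be retried, e.g.:
# library(GEOquery)
# gset <- getGEO("GSE166427", GSEMatrix = TRUE, AnnotGPL = TRUE,
#                destdir = dest)
```

If the call still fails, it may be worth deleting any partially downloaded files in that directory first, since getGEO reuses locally cached copies.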