ExpressionSet data (gset) is not loaded into my project

pcrecxhr, posted 2023-07-31 in Other

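I am working on the GEO dataset GSE166427, which is based on GPL13667. I am trying to fetch the data with the R script below, but the last line produces errors and warnings, and the expression set is not loaded into my project. How should I call GEOquery::getGEO to get the expression set?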

library(Biobase)  
library(GEOquery)  
library(limma)

series <- "GSE166427"  
platform <- "GPL13667"  
setwd("D:/Proj/DEGs/COAD DEGs/")  
Sys.setenv(VROOM_CONNECTION_SIZE=131072*10)

gset <- getGEO(series, 
                GSEMatrix = TRUE,  
                AnnotGPL = TRUE, 
                destdir = "COAD Data/")

At this step, I get:

Found 2 file(s)  
GSE166427-GPL13534_series_matrix.txt.gz  
Using locally cached version: COAD Data//GSE166427-GPL13534_series_matrix.txt.gz

Rows: 0 Columns: 241

- Column specification -----------------------------------------------------------------------------------

Delimiter: "\t"  
chr (241): ID_REF, GSM3759532, GSM3759533, GSM3759534, GSM3759535, GSM3759536, GSM3759537, GSM3759538, ...

i Use `spec()` to retrieve the full column specification for this data.  
i Specify the column types or set `show_col_types = FALSE` to quiet this message.  
Annotation GPL not available, so will use submitter GPL instead Using locally cached version of GPL13534 found here: `COAD Data//GPL13534.soft`

Error in `parseGSEMatrix(fname, destdir = destdir, AnnotGPL = AnnotGPL,  : parsing failed--expected only one '!series_data_table_begin'`
In addition: Warning message:
In `download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery"))` :
cannot open URL <https://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL13nnn/GPL13534/annot/GPL13534.annot.gz>: HTTP status was '404 Not Found'


So I changed the script to:

gset <- getGEO(series, 
                 filename ="COAD Data/GSE166427-GPL13667_series_matrix.txt.gz", 
                 GSEMatrix = TRUE, 
                 AnnotGPL = TRUE, 
                 destdir = "COAD Data/")


Still, the output is:

Rows: 49386 Columns: 247

- Column specification -----------------------------------------------------------------------------------

`Delimiter: "\t"`  
chr   (1): ID_REF  
dbl (246): GSM1077598, GSM1077599, GSM1077600, GSM1077601, GSM1077602, GSM1077603, GSM1077604, GSM10776...

i Use `spec()` to retrieve the full column specification for this data.  
i Specify the column types or set `show_col_types = FALSE` to quiet this message.  
Annotation GPL not available, so will use submitter GPL instead Using locally cached version of GPL13667 found here: `COAD Data//GPL13667.soft`

Error in `.rowNamesDF<-`(x, value = value) : invalid 'row.names' length  
In addition: Warning messages:  
1: In `download.file(myurl, destfile, mode = mode, quiet = TRUE, method = getOption("download.file.method.GEOquery"))` :
cannot open URL <https://ftp.ncbi.nlm.nih.gov/geo/platforms/GPL13nnn/GPL13667/annot/GPL13667.annot.gz> : HTTP status was '404 Not Found'

2: One or more parsing issues, call `problems()` on your data frame for details, e.g.: `dat <- vroom(...)`
problems(dat)
> sessionInfo()
R version 4.1.0 (2021-05-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 10 x64 (build 19045)

Matrix products: default

enxuqcxy1#

Your directory may not exist.
Try this:

dir.create("COAD_Data")  ## use this once, then comment out!

gset <- getGEO(series, 
               GSEMatrix = TRUE,  
               AnnotGPL = TRUE, 
               destdir = "COAD_Data")
# Found 2 file(s)
# GSE166427-GPL13534_series_matrix.txt.gz
# Annotation GPL not available, so will use submitter GPL instead
# |--------------------------------------------------|
# |==================================================|
# |--------------------------------------------------|
# |==================================================|
# GSE166427-GPL13667_series_matrix.txt.gz
# Annotation GPL not available, so will use submitter GPL instead

Note that directory names should not contain spaces, so you should really use "COAD_Data".
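If you would rather not comment out `dir.create()` after the first run, the directory check can be made repeatable. The following is only a minimal sketch: `ensure_dest_dir` is a hypothetical helper written for this answer, not a GEOquery function, and rejecting spaces in the folder name simply follows the advice above rather than any documented requirement.

```r
# Hypothetical helper (not part of GEOquery): create the download
# directory only when it is missing, and reject folder names with spaces.
ensure_dest_dir <- function(path) {
  # Only the last path component is checked, as the advice concerns
  # the folder name passed to destdir.
  if (grepl(" ", basename(path), fixed = TRUE)) {
    stop("directory name contains a space: ", basename(path))
  }
  if (!dir.exists(path)) {
    dir.create(path, recursive = TRUE)
  }
  # Return the full path invisibly so it can be passed on directly.
  invisible(normalizePath(path))
}

# Assumed usage with the call from this answer (needs GEOquery and network):
# gset <- getGEO(series,
#                GSEMatrix = TRUE,
#                AnnotGPL = TRUE,
#                destdir = ensure_dest_dir("COAD_Data"))
```

Calling the helper a second time leaves the existing directory alone, so the line can stay in the script.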

dir('COAD_Data')  ## list directory
# [1] "GPL13534.soft.gz"                       
# [2] "GPL13667.soft.gz"                       
# [3] "GSE166427-GPL13534_series_matrix.txt.gz"
# [4] "GSE166427-GPL13667_series_matrix.txt.gz"
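Because two series matrix files are found, `getGEO()` returns a list with one element per file, named after the file. To work with the GPL13667 data you still need to pick that element out of the list. A minimal sketch, assuming the list names match the file names shown above; `pick_platform` is a hypothetical helper, and the `demo` list is only a stand-in for the real return value:

```r
# Hypothetical helper (not part of GEOquery): return the element of a
# named list whose name mentions the requested platform ID.
pick_platform <- function(gset_list, platform) {
  idx <- grep(platform, names(gset_list), fixed = TRUE)
  if (length(idx) != 1L) {
    stop("expected exactly one entry for ", platform,
         ", found ", length(idx))
  }
  gset_list[[idx]]
}

# Stand-in for the list returned by getGEO(); the real elements would be
# ExpressionSet objects, not strings.
demo <- list(
  "GSE166427-GPL13534_series_matrix.txt.gz" = "placeholder for GPL13534",
  "GSE166427-GPL13667_series_matrix.txt.gz" = "placeholder for GPL13667"
)
selected <- pick_platform(demo, "GPL13667")

# Assumed usage with the real object from this answer:
# gset <- pick_platform(gset, platform)
```

Stopping when zero or several names match is a deliberate choice here, so that a wrong platform ID fails immediately instead of silently selecting the wrong dataset.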
