After reading a .csv file in R, all the data is in one column... how do I separate it?

Posted by v6ylcynt on 2023-03-27 in Other

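I don't usually need to ask a question on Stack Overflow, but I simply can't find a suitable solution. I'm trying to read a .csv file in R, and every time I open it, all the data ends up in a single column. I've tried everything I can think of, but I can't separate the data. I'd be glad of some help <3 Here is the link to the data I'm trying to get: https://data.worldbank.org/indicator/NY.GDP.PCAP.CD?end=2019&start=1990
I have already tried splitting the columns in Excel and tried several individual commands, but that approach feels too clumsy.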

fykwrbwg #1

I suggest you download the data programmatically through the World Bank API. You will need to install the WDI package.

gdp <- WDI::WDI(
  indicator = "NY.GDP.PCAP.CD",
  start = 1990,
  end = 2019
)

utils::head(gdp)
#>                       country iso2c iso3c year NY.GDP.PCAP.CD
#> 1 Africa Eastern and Southern    ZH   AFE 2019       1512.271
#> 2 Africa Eastern and Southern    ZH   AFE 2018       1564.734
#> 3 Africa Eastern and Southern    ZH   AFE 2017       1628.587
#> 4 Africa Eastern and Southern    ZH   AFE 2016       1443.692
#> 5 Africa Eastern and Southern    ZH   AFE 2015       1538.552
#> 6 Africa Eastern and Southern    ZH   AFE 2014       1719.184

Created on 2023-03-21 with reprex v2.0.2

ogsagwnx #2

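I assume you are loading the main data file, not the two metadata files that come with it (although those are worth checking to make sure you are interpreting the data correctly).
If we look at the data in a text editor or in the RStudio "Import Dataset" wizard, we can see that the first few lines of the file contain some metadata, and the actual table, including its header row, starts on line 5.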

"Data Source","World Development Indicators",

"Last Updated Date","2023-03-01",

"Country Name","Country Code","Indicator Name","Indicator Code","1960","1961","1962","1963","1964","1965","1966","1967","1968","1969","1970","1971","1972","1973","1974","1975","1976","1977","1978","1979","1980","1981","1982","1983","1984","1985","1986","1987","1988","1989","1990","1991","1992","1993","1994","1995","1996","1997","1998","1999","2000","2001","2002","2003","2004","2005","2006","2007","2008","2009","2010","2011","2012","2013","2014","2015","2016","2017","2018","2019","2020","2021",
"Aruba","ABW","GDP per capita (current US$)","NY.GDP.PCAP.CD","","","","","","","","","","","","","","","","","","","","","","","","","","","6283.00144344602","7567.25364168664","9274.51415613905","10767.3962204623","11638.7337057728","12850.2157123975","13657.6706444765","14970.1523419526","16675.2784883673","17140.4333687405","17375.2253063755","18713.4253880988","19742.3167386832","19833.8267458639","21023.1575127316","20913.2994971137","21377.0951851076","22050.8309318377","24104.6461765229","24975.6732567007","25833.445623022","27665.4264651752","29011.5592450359","25739.1372506975","24452.9283634427","26044.4359333351","25609.9557239373","26515.678080228","26942.3079764655","28421.3864931862","28451.2737445083","29326.7080582111","30220.5945232395","31650.7605367511","24487.8635601966","29342.1008575886",

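Most CSV readers use the first line as the header by default, but in this file the first line has only two labelled fields. Since the real header row is on line 5, we can read the data by skipping the first 4 lines.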

df <- readr::read_csv("~/API_NY.GDP.PCAP.CD_DS2_en_csv_v2_5182294.csv", 
    skip = 4)
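
If readr is not available, the same idea can be sketched in base R. The example below is only an illustration under stated assumptions: it writes a tiny file that imitates the layout shown above (two metadata lines separated by blank lines, then the header on line 5), and the "Exampleland" row and its values are made up rather than taken from the World Bank data.

```r
# Hypothetical miniature of the World Bank layout; the data row is invented.
sample_lines <- c(
  '"Data Source","World Development Indicators",',
  '',
  '"Last Updated Date","2023-03-01",',
  '',
  '"Country Name","Country Code","Indicator Name","Indicator Code","1960","1961",',
  '"Exampleland","EXL","GDP per capita (current US$)","NY.GDP.PCAP.CD","","100.5",'
)

# Write the sample to a temporary file so it is read like a real download.
sample_path <- tempfile(fileext = ".csv")
writeLines(sample_lines, sample_path)

# skip = 4 drops the four lines above the header (blank lines count too),
# so line 5 is used for the column names.
gdp_wide <- utils::read.csv(sample_path, skip = 4)

# read.csv() makes names syntactically valid by default, so "Country Name"
# becomes Country.Name and the year column "1961" becomes X1961.
print(names(gdp_wide))
print(gdp_wide$X1961)

unlink(sample_path)
```

Because every line in this layout ends with a trailing comma, base R also creates one extra, empty column at the end; it can be dropped after reading. Passing `check.names = FALSE` to `read.csv()` keeps the original column names instead of the `X`-prefixed ones.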
