Why do I get zeros when computing growth rates by country, year, and sector with tidyverse?

Posted by 3xiyfsfu on 2023-06-03 in Other

I want to compute the growth rate of sector_share for each country/region in the following dataset:

> sapply(sa1, class)
     country         year       sector sector_share 
    "factor"    "numeric"     "factor"    "numeric" 
> print(sa1)
               country year        sector sector_share
1   Sub-Saharan Africa 1981   agriculture    15.724457
2   Sub-Saharan Africa 1982   agriculture    16.165780
3   Sub-Saharan Africa 1983   agriculture    15.908671
4   Sub-Saharan Africa 1984   agriculture    17.593971
5   Sub-Saharan Africa 1985   agriculture    19.428871
6   Sub-Saharan Africa 1986   agriculture    19.593291
7   Sub-Saharan Africa 1987   agriculture    19.789807
8   Sub-Saharan Africa 1988   agriculture    20.597277
9   Sub-Saharan Africa 1989   agriculture    19.933259
10  Sub-Saharan Africa 1990   agriculture    19.790467

42  Sub-Saharan Africa 1981      industry    35.516119
43  Sub-Saharan Africa 1982      industry    32.407578
44  Sub-Saharan Africa 1983      industry    32.303477
45  Sub-Saharan Africa 1984      industry    30.437994
46  Sub-Saharan Africa 1985      industry    30.544564
47  Sub-Saharan Africa 1986      industry    29.458658
48  Sub-Saharan Africa 1987      industry    29.490104
49  Sub-Saharan Africa 1988      industry    29.009534
50  Sub-Saharan Africa 1989      industry    29.340000
51  Sub-Saharan Africa 1990      industry    29.698078
52  Sub-Saharan Africa 1991      industry    28.727260

83  Sub-Saharan Africa 1981 manufacturing    18.419694
84  Sub-Saharan Africa 1982 manufacturing    17.895412
85  Sub-Saharan Africa 1983 manufacturing    18.037958
86  Sub-Saharan Africa 1984 manufacturing    16.316419
87  Sub-Saharan Africa 1985 manufacturing    16.256940
88  Sub-Saharan Africa 1986 manufacturing    15.728073
89  Sub-Saharan Africa 1987 manufacturing    15.825253
90  Sub-Saharan Africa 1988 manufacturing    16.320170
91  Sub-Saharan Africa 1989 manufacturing    16.062034
92  Sub-Saharan Africa 1990 manufacturing    16.134401
93  Sub-Saharan Africa 1991 manufacturing    15.826331

124 Sub-Saharan Africa 1981      services    44.946512
125 Sub-Saharan Africa 1982      services    46.323757
126 Sub-Saharan Africa 1983      services    46.071141
127 Sub-Saharan Africa 1984      services    45.820815
128 Sub-Saharan Africa 1985      services    43.226268
129 Sub-Saharan Africa 1986      services    43.409858
130 Sub-Saharan Africa 1987      services    44.298582
131 Sub-Saharan Africa 1988      services    43.191570
132 Sub-Saharan Africa 1989      services    43.023115
133 Sub-Saharan Africa 1990      services    44.043939
134 Sub-Saharan Africa 1991      services    44.995853

I am using the following code:

sa1 <- sa1 %>%
  group_by(country, year, sector) %>%
  arrange(year) %>%
  mutate(growth_rate = ifelse(!is.na(lag(sector_share)), (sector_share / lag(sector_share) - 1) * 100, 0))

But I get zeros, which should not happen, because the sector_share column contains no NA values.

> print(sa1)
# A tibble: 164 × 5
# Groups:   country, year, sector [164]
   country             year sector        sector_share growth_rate
   <fct>              <dbl> <fct>                <dbl>       <dbl>
 1 Sub-Saharan Africa  1981 agriculture           15.7           0
 2 Sub-Saharan Africa  1981 industry              35.5           0
 3 Sub-Saharan Africa  1981 manufacturing         18.4           0
 4 Sub-Saharan Africa  1981 services              44.9           0
 5 Sub-Saharan Africa  1982 agriculture           16.2           0
 6 Sub-Saharan Africa  1982 industry              32.4           0
 7 Sub-Saharan Africa  1982 manufacturing         17.9           0
 8 Sub-Saharan Africa  1982 services              46.3           0
 9 Sub-Saharan Africa  1983 agriculture           15.9           0
10 Sub-Saharan Africa  1983 industry              32.3           0
# ℹ 154 more rows
# ℹ Use `print(n = ...)` to see more rows

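I tried to compute the growth rate, but every value comes out as zero. That makes no sense, because sector_share contains no NA values, and I even check for NA in the code just in case.
Can anyone help? Thank you!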

nzk0hqpo1#

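Because you group by year, each calculation only "sees" one year at a time, so it cannot compute growth across years. Your printed output confirms this: 164 groups for 164 rows, i.e. one row per group. Within a one-row group lag(sector_share) is always NA, so the ifelse() falls through to 0 every time. So don't group by year: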

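library(dplyr)

sa1 %>%
  # group by country and sector only, so each group spans all years
  group_by(country, sector) %>%
  arrange(year) %>%
  # percent change from the previous year; the first year of each group has no predecessor and is set to 0
  mutate(growth_rate = ifelse(!is.na(lag(sector_share)), (sector_share / lag(sector_share) - 1) * 100, 0))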
# A tibble: 43 × 5
# Groups:   country, sector [4]
   country  year sector        sector_share growth_rate
   <chr>   <int> <chr>                <dbl>       <dbl>
 1 Africa   1981 agriculture           15.7       0    
 2 Africa   1981 industry              35.5       0    
 3 Africa   1981 manufacturing         18.4       0    
 4 Africa   1981 services              44.9       0    
 5 Africa   1982 agriculture           16.2       2.81 
 6 Africa   1982 industry              32.4      -8.75 
 7 Africa   1982 manufacturing         17.9      -2.85 
 8 Africa   1982 services              46.3       3.06 
 9 Africa   1983 agriculture           15.9      -1.59 
10 Africa   1983 industry              32.3      -0.321
# ℹ 33 more rows
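
To see the difference in isolation, here is a minimal sketch on a toy subset of the question's data (the first three years of two sectors, values copied from the post). The names toy, by_year, fixed, and prev are made up for illustration. It also differs from the code above in two ways that are my own choices, not part of the original answer: it passes order_by = year to lag() instead of calling arrange(), and it leaves the first year of each group as NA rather than 0.

library(dplyr)

# Toy subset of the question's data (hypothetical object name `toy`)
toy <- tibble(
  country = "Sub-Saharan Africa",
  year = rep(1981:1983, times = 2),
  sector = rep(c("agriculture", "industry"), each = 3),
  sector_share = c(15.724457, 16.165780, 15.908671,
                   35.516119, 32.407578, 32.303477)
)

# Grouping by year as well leaves one row per group,
# so lag() has no previous row to return and yields NA everywhere
by_year <- toy %>%
  group_by(country, year, sector) %>%
  mutate(prev = lag(sector_share)) %>%
  ungroup()
all(is.na(by_year$prev))  # TRUE

# Grouping by country and sector only; order_by = year makes lag()
# pick the previous year regardless of the current row order
fixed <- toy %>%
  group_by(country, sector) %>%
  mutate(growth_rate = (sector_share / lag(sector_share, order_by = year) - 1) * 100) %>%
  ungroup()

Keeping NA for the first year distinguishes "no previous year available" from a genuine growth rate of zero. If you do want 0 there, as in the original code, wrapping the expression in coalesce(..., 0) should give the same result.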
