I have a combined frequency table with two conditions on a set of readings.
The data frame can be found here:
dput(Merchant_Category_Frequency_with_Target)
structure(list(Var1 = structure(1:31, .Label = c("Airline", "Airports",
"Alcohol", "Auto", "Books & stationery", "Business Services",
"Cloth stores", "Contracted services", "Dept stores", "Digital goods",
"Direct marketing", "Education", "Electronics", "Food", "Fuel",
"Govt services", "Home furnishing", "Hotels", "Insurance", "Medical",
"Misc Services", "Music stores", "Professional services & memberships",
"Quasi cash", "Railways", "Rent Payments", "Restaurants", "Retail",
"Transportation services", "Utility", "Wallet load"), class = "factor"),
Freq.x = c(429L, 1L, 325L, 499L, 239L, 1324L, 5242L, 38L,
3881L, 355L, 91L, 1554L, 2200L, 424L, 5588L, 1935L, 264L,
1409L, 2384L, 1789L, 971L, 23L, 505L, 5L, 1662L, 4408L, 1820L,
3135L, 1297L, 4660L, 1543L), Freq.y = c(16L, NA, 11L, 34L,
19L, 56L, 179L, 1L, 141L, 10L, 8L, 100L, 229L, 8L, 142L,
40L, 13L, 37L, 142L, 75L, 39L, NA, 18L, NA, 62L, 389L, 33L,
148L, 39L, 437L, 194L)), row.names = c(NA, -31L), class = "data.frame")
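I would like a combined frequency distribution plot with all the readings (Var1) and both frequencies: Freq.x should be a bar in one colour and, stacked on top of it, Freq.y should be a bar in another colour.
I tried following various tutorials online, but they don't seem to work, because the variable here is a character rather than numeric data.
Cheers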
1 answer
zpjtge22 (#1)
First, you need to reshape the data with pivot_longer or gather; I used pivot_longer.
Then use the ggplot function:
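The answer's original code is not preserved here, so the following is only a minimal sketch of the two steps it names, not the answerer's exact code. It assumes the tidyr and ggplot2 packages are installed, uses a four-row subset of the question's data frame for brevity, and picks the column names `type` and `freq` purely for illustration. Treating the NA counts as zero is also an assumption.

```r
library(tidyr)
library(ggplot2)

# Small subset of the question's data frame, kept short for illustration;
# substitute the full Merchant_Category_Frequency_with_Target in practice.
df <- data.frame(
  Var1   = factor(c("Airline", "Airports", "Alcohol", "Auto")),
  Freq.x = c(429L, 1L, 325L, 499L),
  Freq.y = c(16L, NA, 11L, 34L)
)

# Step 1: reshape from wide to long, one row per (category, frequency type).
# "type" and "freq" are illustrative names, not taken from the original answer.
long <- pivot_longer(
  df,
  cols      = c(Freq.x, Freq.y),
  names_to  = "type",
  values_to = "freq"
)

# Assumption: a missing count means zero. Remove this line to let ggplot2
# drop those rows (with a warning) instead.
long$freq[is.na(long$freq)] <- 0L

# By default ggplot2 stacks so that the first factor level ends up on top,
# hence Freq.y is listed first to sit above Freq.x.
long$type <- factor(long$type, levels = c("Freq.y", "Freq.x"))

# Step 2: stacked bars. Var1 is categorical, so it maps to a discrete axis;
# geom_col() uses the freq values directly as bar heights.
p <- ggplot(long, aes(x = Var1, y = freq, fill = type)) +
  geom_col(position = "stack") +
  coord_flip() +
  labs(x = "Merchant category", y = "Frequency", fill = NULL)

print(p)
```

Because Var1 is a factor, it needs no conversion to numeric: ggplot2 places each level on a discrete axis. The `coord_flip()` call is a design choice to keep the 31 long category labels readable and can be dropped.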
This is the output: