I have two data sets that look like this (this is a very small subset).
data1 <- data.frame(
  "Metal" = c("Al", "Al", "Al", "Al", "Al", "Al", "Al"),
  "Type" = c("F", "F", "F", "F", "F", "F", "F"),
  "Date" = c("2000-01-01", "2000-01-01", "2000-01-02", "2000-01-03",
             "2000-01-03", "2000-01-07", "2000-01-07"),
  "Time" = c("11:00:00", "12:00:00", "15:00:00", "13:00:00",
             "17:00:00", "20:00:00", "20:00:00"),
  "Value" = c(100, 200, 300, 100, 400, 500, 500)
)
data2 <- data.frame(
  "Metal" = c("Al", "Al", "Al", "Al", "Al", "Al", "Al"),
  "Type" = c("P", "P", "P", "P", "P", "P", "P"),
  "Date" = c("2000-01-01", "2000-01-01", "2000-01-01", "2000-01-03",
             "2000-01-03", "2000-01-04", "2000-01-07"),
  "Time" = c("11:00:00", "11:00:00", "14:00:00", "17:00:00",
             "13:00:00", "16:00:00", "20:00:00"),
  "Value" = c(100, 100, 200, 900, 100, 400, 999)
)
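I want to keep the rows from both tables that share the same date and time, and build a new table (data3) from them. Sometimes data1 and data2 contain duplicates; I don't want data3 to contain duplicates, only one copy of each row together with its pair from the other table. I would also like the output sorted so that each pair appears one row under the other (so my "Type" column alternates F, P, F, P, and so on).
Here is my desired output: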
data3 <- data.frame(
  "Metal" = c("Al", "Al", "Al", "Al", "Al", "Al", "Al", "Al"),
  "Type" = c("F", "P", "F", "P", "F", "P", "F", "P"),
  "Date" = c("2000-01-01", "2000-01-01", "2000-01-03", "2000-01-03",
             "2000-01-03", "2000-01-03", "2000-01-07", "2000-01-07"),
  "Time" = c("11:00:00", "11:00:00", "13:00:00", "13:00:00",
             "17:00:00", "17:00:00", "20:00:00", "20:00:00"),
  "Value" = c(100, 100, 100, 100, 400, 900, 500, 999)
)
I have tried the various kinds of joins in dplyr, but none of them joins the data the way I want.
Thanks for your help!!
5 Answers
wribegjk1#
We may need to bind the data and then, after grouping, filter out the duplicates.
knsnq2tg2#
This is not easy :-)
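The code for this answer is not preserved here. One possible route, following the bind-then-filter idea mentioned earlier in the thread, is sketched below. It is not the original answer's code. It assumes dplyr is installed, that duplicates are rows repeating the same Metal, Type, Date and Time, and that keeping the first such row is acceptable.

```r
library(dplyr)

data1 <- data.frame(
  Metal = rep("Al", 7), Type = rep("F", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-02", "2000-01-03",
           "2000-01-03", "2000-01-07", "2000-01-07"),
  Time = c("11:00:00", "12:00:00", "15:00:00", "13:00:00",
           "17:00:00", "20:00:00", "20:00:00"),
  Value = c(100, 200, 300, 100, 400, 500, 500),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  Metal = rep("Al", 7), Type = rep("P", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-01", "2000-01-03",
           "2000-01-03", "2000-01-04", "2000-01-07"),
  Time = c("11:00:00", "11:00:00", "14:00:00", "17:00:00",
           "13:00:00", "16:00:00", "20:00:00"),
  Value = c(100, 100, 200, 900, 100, 400, 999),
  stringsAsFactors = FALSE
)

data3 <- bind_rows(data1, data2) %>%
  # Assumption: keep only the first row per table and timestamp.
  distinct(Metal, Type, Date, Time, .keep_all = TRUE) %>%
  group_by(Metal, Date, Time) %>%
  # Keep timestamps that occur in both tables (both an F and a P row).
  filter(n_distinct(Type) == 2) %>%
  ungroup() %>%
  # ISO dates and HH:MM:SS times sort correctly as text; F sorts before P.
  arrange(Date, Time, Type)

print(data3)
```

Sorting on Type last is what produces the alternating F, P pattern within each date and time.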
7qhs6swi3#
An approach with inner_join. The difficulty here is getting the right format; the data filtering itself is done as soon as the inner_join has run.
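The filtering step described above might look roughly like this. It is a sketch rather than the answer's original code: it assumes dplyr is installed and that duplicates are exact copies of a whole row, so a plain distinct() removes them. The suffixes `_F` and `_P` are chosen here only for illustration.

```r
library(dplyr)

data1 <- data.frame(
  Metal = rep("Al", 7), Type = rep("F", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-02", "2000-01-03",
           "2000-01-03", "2000-01-07", "2000-01-07"),
  Time = c("11:00:00", "12:00:00", "15:00:00", "13:00:00",
           "17:00:00", "20:00:00", "20:00:00"),
  Value = c(100, 200, 300, 100, 400, 500, 500),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  Metal = rep("Al", 7), Type = rep("P", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-01", "2000-01-03",
           "2000-01-03", "2000-01-04", "2000-01-07"),
  Time = c("11:00:00", "11:00:00", "14:00:00", "17:00:00",
           "13:00:00", "16:00:00", "20:00:00"),
  Value = c(100, 100, 200, 900, 100, 400, 999),
  stringsAsFactors = FALSE
)

# Drop exact duplicate rows first, then keep only timestamps found in both tables.
pairs <- inner_join(
  distinct(data1), distinct(data2),
  by = c("Metal", "Date", "Time"),
  suffix = c("_F", "_P")   # illustrative suffixes for the non-key columns
)

print(pairs)
```

The result is wide, with one row per matching timestamp and the F and P values side by side, which is why a reshaping step is still needed to get the alternating layout.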
Tidying the data
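One way to do the tidying is to pivot a wide, joined table back to long form. The sketch below assumes dplyr and tidyr are installed. The `pairs` data frame is a hypothetical stand-in for the output of an inner join on Metal, Date and Time, with column names `Value_F` and `Value_P` invented for this example.

```r
library(dplyr)
library(tidyr)

# Hypothetical wide table: one row per timestamp present in both sources.
pairs <- data.frame(
  Metal = rep("Al", 4),
  Date = c("2000-01-01", "2000-01-03", "2000-01-03", "2000-01-07"),
  Time = c("11:00:00", "13:00:00", "17:00:00", "20:00:00"),
  Value_F = c(100, 100, 400, 500),
  Value_P = c(100, 100, 900, 999),
  stringsAsFactors = FALSE
)

data3 <- pairs %>%
  pivot_longer(
    cols = c(Value_F, Value_P),
    names_to = "Type",
    names_prefix = "Value_",   # strip the prefix so Type becomes "F" or "P"
    values_to = "Value"
  ) %>%
  # Restore the column order used in the question.
  select(Metal, Type, Date, Time, Value)

print(data3)
```

pivot_longer() emits the pivoted columns of each input row consecutively, so every F row is followed directly by its P partner without a separate sort.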
3hvapo4f4#
There are some approaches using the tidyverse packages that should work for the sorting; here they are combined with part of Beeflight31's solution:
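The code for this answer is not preserved here, so the following is only a guess at the general shape, not Beeflight31's solution or the answerer's code. It assumes dplyr is installed, uses semi_join() to keep matching rows from each table, and relies on arrange() for the ordering.

```r
library(dplyr)

data1 <- data.frame(
  Metal = rep("Al", 7), Type = rep("F", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-02", "2000-01-03",
           "2000-01-03", "2000-01-07", "2000-01-07"),
  Time = c("11:00:00", "12:00:00", "15:00:00", "13:00:00",
           "17:00:00", "20:00:00", "20:00:00"),
  Value = c(100, 200, 300, 100, 400, 500, 500),
  stringsAsFactors = FALSE
)
data2 <- data.frame(
  Metal = rep("Al", 7), Type = rep("P", 7),
  Date = c("2000-01-01", "2000-01-01", "2000-01-01", "2000-01-03",
           "2000-01-03", "2000-01-04", "2000-01-07"),
  Time = c("11:00:00", "11:00:00", "14:00:00", "17:00:00",
           "13:00:00", "16:00:00", "20:00:00"),
  Value = c(100, 100, 200, 900, 100, 400, 999),
  stringsAsFactors = FALSE
)

keys <- c("Metal", "Date", "Time")

data3 <- bind_rows(
  semi_join(data1, data2, by = keys),  # rows of data1 with a partner in data2
  semi_join(data2, data1, by = keys)   # rows of data2 with a partner in data1
) %>%
  distinct() %>%                 # assumption: duplicates are exact row copies
  arrange(Date, Time, Type)      # pairs end up adjacent, F before P

print(data3)
```

Because semi_join() never adds columns, the result keeps the original five-column layout and only needs sorting.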
jjhzyzn05#
Edit: is this what you want?