我有以下 Dataframe :
# Creating Data_One_df
data_one = {
"Date": ["12/1/2022"] * 7,
"Time": ["6:30", "6:35", "6:40", "6:45", "6:50", "6:55", "7:00"],
"Bars_left": [6, 5, 4, 3, 2, 1, 0],
"6:30": [""] * 7,
"6:35": [""] * 7,
"6:40": [""] * 7,
"6:45": [""] * 7,
"6:50": [""] * 7,
"6:55": [""] * 7,
"7:00": [""] * 7,
}
Data_One_df = pd.DataFrame(data_one)
# Creating Data_Two_df
data_two = {
"Time": ["6:30", "6:35", "6:40", "6:45", "6:50", "6:55", "7:00"],
"Fcst": [1, 2, 1, 1, 2, 1, 1],
}
Data_Two_df = pd.DataFrame(data_two)
我想填充Data_One_df
(输入表)的列,方法是将"Time"列的行值与Data_Two_df
(查找表)中的"Time"列匹配,并将"Fcst"的累计和的偏移量(向下1行)返回到"Input table"的偏移量(向右1列)。
例如:对于Data_One_df
中的行6:30,我想在Data_Two_df
('lookup table')中找到对应的行(即6:30),然后偏移(即6:35)。接下来,计算从6:35开始的累计和,即8。将结果放入'input table'的列(6:35)中。然后,我想在所有列中重复此值。
以下是预期结果:
data_three = {
"Date": ["12/1/2022"] * 7,
"Time": ["6:30", "6:35", "6:40", "6:45", "6:50", "6:55", "7:00"],
"Bars_left": [6, 5, 4, 3, 2, 1, 0],
"6:30": ["NaN", "NaN", "NaN", "NaN", "NaN", "NaN", "NaN"],
"6:35": [8, "NaN", "NaN", "NaN", "NaN", "NaN", "NaN"],
"6:40": [8, 6, "NaN", "NaN", "NaN", "NaN", "NaN"],
"6:45": [8, 6, 5, "NaN", "NaN", "NaN", "NaN"],
"6:50": [8, 6, 5, 4, "NaN", "NaN", "NaN"],
"6:55": [8, 6, 5, 4, 2, "NaN", "NaN"],
"7:00": [8, 6, 5, 4, 2, 1, "NaN"],
}
Desired_df = pd.DataFrame(data_three)
我尝试了许多方法,但都无济于事。最新的代码如下:
# Get a list of time columns
time_cols = Data_One_df.columns[3:]
# Loop through the time columns and update the values
# in Data_One_df based on Data_Two_df
for i, col in enumerate(time_cols):
# Get the corresponding forecast value from Data_Two_df
fcst = Data_Two_df["Fcst"][i]
# Update the values in Data_One_df based on the forecast
Data_One_df[col] = [np.nan] * fcst + list(Data_One_df[col][fcst:])
1条答案
按热度按时间1sbrub3j1#
下面是使用Pandas shift和cumsum执行此操作的一种方法:
然后: