pandas: KeyError in _get_loc_level after using .transform() or .apply()

Posted by nlejzf6q on 2023-01-19 in Other

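I have a large grouped DataFrame with many groups, and I am trying to filter rows within each group. To keep things short, here is a reduced DataFrame with a single Detail value that still reproduces the error. df5 is grouped by "Detail", "ID", "Year".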

import pandas as pd

data2 = {
    "Year": ["2012", "2012", "2012", "2012", "2012", "2012", "2012", "2012", "2012"],
    "Country": ["USA", "USA", "USA", "USA", "USA", "USA", "USA", "CANADA", "CANADA"],
    "Country_2": ["", "", "", "", "", "", "", "USA", "USA"],
    "ID": ["AF12", "A15", "BU14", "DU157", "L12", "N10", "RU156", "DU157", "RU156"],
    "Detail": [1, 1, 1, 1, 1, 1, 1, 1, 1],
    "Second_country_available": [False, False, False, False, False, False, False, True, True],
}
df5 = pd.DataFrame(data2)

# Boolean mask: rows that have a second country
df5_true = df5["Second_country_available"] == True
# One "|"-joined pattern per group, built only from the masked rows
Country_2_gr = df5[df5_true].groupby(["Detail", "ID", "Year"])["Country_2"].agg("|".join)
Country_2_gr

grouped_df5 = df5.groupby(["Detail", "ID", "Year"], group_keys=False)["Country"]
# Look up each group's pattern by its key (g.name); this is the line that raises
filtered = grouped_df5.transform(lambda g: g.str.fullmatch(Country_2_gr[g.name]))
filtered

This raises the following error:

return (self._engine.get_loc(key), None)
  File "pandas\_libs\index.pyx", line 774, in pandas._libs.index.BaseMultiIndexCodesEngine.get_loc
KeyError: (1, 'A15', '2012')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "packages\pandas\core\indexes\.py", line 3045, in _get_loc_level
    raise KeyError(key) from err
KeyError: (1, 'A15', '2012')

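The code works in most cases, so I don't want to change it fundamentally. I only want to fix it so that, in cases like the one shown here, the row is dropped instead of raising an error.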


#1 gt0wga4j

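Country_2_gr is built from a filtered DataFrame, so it does not contain every group key. Looking up a key such as (1, 'A15', '2012') with Country_2_gr[g.name] therefore raises a KeyError. You can switch to get with a default value: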

filtered = grouped_df5.transform(lambda g: g.str.fullmatch(Country_2_gr.get(g.name, default="")))
filtered
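To see the fix end to end, here is a minimal self-contained sketch that rebuilds the data from the question, lists the group keys missing from the lookup, and then uses the resulting mask to drop rows. The names `keys`, `country_2_gr`, `all_keys`, `missing`, `mask`, and `kept` are introduced here for illustration only, and the final filtering step is an assumption about what the asker wants to do with the mask.

```python
import pandas as pd

df5 = pd.DataFrame({
    "Year": ["2012"] * 9,
    "Country": ["USA"] * 7 + ["CANADA"] * 2,
    "Country_2": [""] * 7 + ["USA"] * 2,
    "ID": ["AF12", "A15", "BU14", "DU157", "L12", "N10", "RU156", "DU157", "RU156"],
    "Detail": [1] * 9,
    "Second_country_available": [False] * 7 + [True] * 2,
})

keys = ["Detail", "ID", "Year"]

# Lookup built only from rows that have a second country,
# so most group keys are absent from its index.
country_2_gr = (
    df5[df5["Second_country_available"]]
    .groupby(keys)["Country_2"]
    .agg("|".join)
)

# Group keys that exist in df5 but not in the lookup;
# indexing the lookup with any of these raises KeyError.
all_keys = pd.MultiIndex.from_frame(df5[keys].drop_duplicates())
missing = all_keys.difference(country_2_gr.index)

# Series.get returns the default instead of raising for a missing key.
mask = (
    df5.groupby(keys, group_keys=False)["Country"]
    .transform(lambda g: g.str.fullmatch(country_2_gr.get(g.name, default="")))
    .astype(bool)
)

# Assumed follow-up step: keep only rows whose Country matches the group's pattern.
kept = df5[mask]
print(missing.tolist())
print(kept)
```

With this data, only the "USA" rows of the DU157 and RU156 groups should remain. Note that the empty default pattern fully matches only an empty string, so rows in groups without a lookup entry are dropped unless their Country is itself empty.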
