Python: replace existing column values in a df based on a separate lookup code/value df (KeyError in get_loc)

Posted by mqkwyuun on 2022-10-30 in Python


Question

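I want to replace the existing values in a column of a target dataframe by looking them up in the "code/value" columns of a separate source dataframe, e.g. replace each code in the target column with the matching value from the source data. Basically, replace something like "10" with "your full name".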

KeyError when running my attempt

This raises a KeyError:

countynames.set_index('CountyCode')
employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])

Ideas for a possible solution

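Something like an apply() function that looks up employee['County.Code'] in the dataframe 'countynames' and replaces/overwrites/updates the existing employee['County.Code'] with countynames['Value'].
I am looking for an alternative approach because my first attempt resulted in a KeyError.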


### potential approach 1:
employee['County.Code'] = countynames.apply(lambda x: employee.loc[x['County.Code'], x['Value']], axis=1)
### potential approach 2:
employee['County.Code']<- lapply(employee, function(x) look$class[match(x, look$CountyCode)])

Sample code

employee = pd.read_csv("employee_data.csv")
countynames = pd.read_csv("County Codes.csv")
employee['County.Code']
0    34
1    34
2    34
3    34
4    55
Name: County.Code, dtype: int64

Source (lookup) dataframe:

countynames.head()
    CountyCode  Value
0   1   Alameda
1   2   Alpine
2   3   Amador
3   4   Butte
4   5   Calaveras

Error: KeyError

The error is raised at self.columns.get_loc(item):

KeyError                                  Traceback (most recent call last)
Input In [410], in <cell line: 2>()
      1 countynames.set_index('CountyCode')
----> 2 employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:4602, in DataFrame.lookup(self, row_labels, col_labels)
   4600     result = np.empty(n, dtype="O")
   4601     for i, (r, c) in enumerate(zip(row_labels, col_labels)):
-> 4602         result[i] = self._get_value(r, c)
   4604 if is_object_dtype(result):
   4605     result = lib.maybe_convert_objects(result)
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3615, in DataFrame._get_value(self, index, col, takeable)
   3612     series = self._ixs(col, axis=1)
   3613     return series._values[index]
-> 3615 series = self._get_item_cache(col)
   3616 engine = self.index._engine
   3618 if not isinstance(self.index, MultiIndex):
   3619     # CategoricalIndex: Trying to use the engine fastpath may give incorrect
   3620     #  results if our categories are integers that dont match our codes
   3621     # IntervalIndex: IntervalTree has no get_loc
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3931, in DataFrame._get_item_cache(self, item)
   3926 res = cache.get(item)
   3927 if res is None:
   3928     # All places that call _get_item_cache have unique columns,
   3929     #  pending resolution of GH#33047
-> 3931     loc = self.columns.get_loc(item)
   3932     res = self._ixs(loc, axis=1)
   3934     cache[item] = res
File ~\anaconda3\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
   3621     return self._engine.get_loc(casted_key)
   3622 except KeyError as err:
-> 3623     raise KeyError(key) from err
   3624 except TypeError:
   3625     # If we have a listlike key, _check_indexing_error will raise
   3626     #  InvalidIndexError. Otherwise we fall through and re-raise
   3627     #  the TypeError.
   3628     self._check_indexing_error(key)
KeyError: 1
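
Two things in the traceback above are worth separating. First, `set_index` returns a new dataframe by default, so the call on line 1 has no effect unless its result is assigned. Second, the second argument of `lookup` is treated as column labels, so the county code 1 is looked up as a column name, which is what `KeyError: 1` reports. The sketch below illustrates both points on a small made-up frame that mirrors `countynames.head()`. It does not call `DataFrame.lookup`, because that method was deprecated in pandas 1.2 and removed in pandas 2.0.

```python
import pandas as pd

# Made-up stand-in for the lookup table shown in the question.
countynames = pd.DataFrame(
    {"CountyCode": [1, 2, 3], "Value": ["Alameda", "Alpine", "Amador"]}
)

# set_index returns a new DataFrame; without assignment the original is unchanged.
countynames.set_index("CountyCode")
print(countynames.columns.tolist())

# The column labels are the strings 'CountyCode' and 'Value', so the integer 1
# is not a valid column label. Using it as one is what produces KeyError: 1.
print(1 in countynames.columns)

# Once the result is assigned, the code can be used as a row label.
indexed = countynames.set_index("CountyCode")
print(indexed.loc[1, "Value"])
```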

python

Source: https://stackoverflow.com/questions/74248845/python-replace-existing-column-value-in-df-based-on-separate-lookup-code-value

1 Answer


hsvhsicv1#

It's always hard without data.
But try this:

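# Build a {code: name} dict from the lookup table and assign the result back;
# inplace=True on a selected column is not guaranteed to update the parent frame.
employee['County.Code'] = employee['County.Code'].replace(countynames.set_index("CountyCode")["Value"].to_dict())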
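
As an alternative to `replace`, a lookup like this can also be sketched with `Series.map`, which takes a Series indexed by code. The data below is made up for illustration (the code 99 is deliberately missing from the lookup table), and the `County.Name` column and the `unmatched` variable are hypothetical names, not part of the original question. This assumes the codes in `countynames` are unique.

```python
import pandas as pd

# Made-up sample frames; column names follow the question.
employee = pd.DataFrame({"County.Code": [1, 2, 2, 99]})
countynames = pd.DataFrame(
    {"CountyCode": [1, 2, 3], "Value": ["Alameda", "Alpine", "Amador"]}
)

# Series of names indexed by code (codes are assumed to be unique).
lookup = countynames.set_index("CountyCode")["Value"]

# map() leaves a missing value wherever a code has no entry in the lookup.
employee["County.Name"] = employee["County.Code"].map(lookup)

# Collect the codes that could not be matched so they can be inspected.
unmatched = (
    employee.loc[employee["County.Name"].isna(), "County.Code"].unique().tolist()
)
print(employee)
print(unmatched)
```

Writing to a separate column keeps the unmatched codes visible. If overwriting is really wanted, the mapped result can be assigned to `employee['County.Code']` instead. Note that `map` turns unmatched codes into missing values, whereas `replace` leaves them as they were.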

2022-10-30
