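Problem
Replace existing column values in a target dataframe with values from a separate source dataframe, using a lookup-style operation that matches on the source's code column. For example, replace each code in the target dataframe column with the corresponding text from the source's "code/value" data. Basically, replace something like "10" with "your full name".
Attempted code raises a KeyError
This operation raised a KeyError.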
countynames.set_index('CountyCode')
employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])
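Potential solution ideas
Something like having apply() look up employee['County.Code'] in the dataframe 'countynames' and replace/overwrite/update the existing employee['County.Code'] with countynames['Value'].
I'm looking for an alternative approach because my first attempt resulted in a KeyError.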
### potential approach 1:
employee['County.Code'] = countynames.apply(lambda x: employee.loc[x['County.Code'], x['Value']], axis=1)
### potential approach 2:
employee['County.Code']<- lapply(employee, function(x) look$class[match(x, look$CountyCode)])
Experimental code
employee = pd.read_csv("employee_data.csv")
countynames = pd.read_csv("County Codes.csv")
employee['County.Code']
0 34
1 34
2 34
3 34
4 55
Name: County.Code, dtype: int64
Source (lookup) dataframe:
countynames.head()
CountyCode Value
0 1 Alameda
1 2 Alpine
2 3 Amador
3 4 Butte
4 5 Calaveras
Error: KeyError
The error is raised at columns.get_loc(item)
KeyError Traceback (most recent call last)
Input In [410], in <cell line: 2>()
1 countynames.set_index('CountyCode')
----> 2 employee['County.Code'] = countynames.lookup(countynames.index, countynames['CountyCode'])
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:4602, in DataFrame.lookup(self, row_labels, col_labels)
4600 result = np.empty(n, dtype="O")
4601 for i, (r, c) in enumerate(zip(row_labels, col_labels)):
-> 4602 result[i] = self._get_value(r, c)
4604 if is_object_dtype(result):
4605 result = lib.maybe_convert_objects(result)
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3615, in DataFrame._get_value(self, index, col, takeable)
3612 series = self._ixs(col, axis=1)
3613 return series._values[index]
-> 3615 series = self._get_item_cache(col)
3616 engine = self.index._engine
3618 if not isinstance(self.index, MultiIndex):
3619 # CategoricalIndex: Trying to use the engine fastpath may give incorrect
3620 # results if our categories are integers that dont match our codes
3621 # IntervalIndex: IntervalTree has no get_loc
File ~\anaconda3\lib\site-packages\pandas\core\frame.py:3931, in DataFrame._get_item_cache(self, item)
3926 res = cache.get(item)
3927 if res is None:
3928 # All places that call _get_item_cache have unique columns,
3929 # pending resolution of GH#33047
-> 3931 loc = self.columns.get_loc(item)
3932 res = self._ixs(loc, axis=1)
3934 cache[item] = res
File ~\anaconda3\lib\site-packages\pandas\core\indexes\base.py:3623, in Index.get_loc(self, key, method, tolerance)
3621 return self._engine.get_loc(casted_key)
3622 except KeyError as err:
-> 3623 raise KeyError(key) from err
3624 except TypeError:
3625 # If we have a listlike key, _check_indexing_error will raise
3626 # InvalidIndexError. Otherwise we fall through and re-raise
3627 # the TypeError.
3628 self._check_indexing_error(key)
KeyError: 1
1 answer
hsvhsicv1#
It's always hard without the data.
But give this a try:
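A minimal sketch of one way to do the replacement, assuming the column names shown in the question (`County.Code` in `employee`, `CountyCode` and `Value` in `countynames`). The sample rows are made up for illustration, since the real CSV files are not available. It builds a Series indexed by code and passes it to `Series.map`:

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV files in the question.
employee = pd.DataFrame({"County.Code": [1, 1, 3, 5, 99]})
countynames = pd.DataFrame({
    "CountyCode": [1, 2, 3, 4, 5],
    "Value": ["Alameda", "Alpine", "Amador", "Butte", "Calaveras"],
})

# set_index returns a new object rather than modifying countynames,
# so the result has to be kept; selecting "Value" gives a Series
# that maps each code to its county name.
code_to_name = countynames.set_index("CountyCode")["Value"]

# map looks up every code in that Series; codes without a match become NaN.
employee["County.Code"] = employee["County.Code"].map(code_to_name)

print(employee)
```

Note that in the question's attempt the result of `countynames.set_index('CountyCode')` is discarded, so `countynames` still has its default index when `lookup` is called. `DataFrame.lookup` was also deprecated in pandas 1.2 and removed in 2.0, so a `map`-based lookup like this avoids it entirely.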