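My output is not being exported to Excel, and I can't tell what is missing from my Python script. I have already installed the openpyxl package.
The script runs fine while linking the two datasets, but I can't get at the output.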
import pandas as pd
import recordlinkage
al_usa = pd.read_csv('va_reference.csv', index_col = 'id')
al_sample = pd.read_csv('state_va.csv', index_col = 'uei', low_memory=False)
al_sample = al_sample.rename(columns={'companyname': 'sample_companyname'})
indexer = recordlinkage.Index()
indexer.full()
candidates = indexer.index(al_usa, al_sample)
print(len(candidates))
compare = recordlinkage.Compare()
compare.string('companyname',
               'sample_companyname',
               threshold=0.85,
               label='business')
features = compare.compute(candidates, al_usa, al_sample)
match_counts = features.sum(axis=1).value_counts().sort_index(ascending=False)
print(match_counts)
# Save the reordered features to an Excel file
output_file = 'sorted_results.xlsx'
sorted_features.to_excel(output_file)
Here is the error: NameError: name 'sorted_features' is not defined
1 Answer
guicsvcw1#
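You never defined the variable sorted_features, so Python raises a NameError as soon as it reaches the to_excel line. The comparison results live in features, so try features.to_excel(output_file) instead. If you actually want the rows sorted before saving, assign the sorted result to sorted_features first.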
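If the sorted order matters, here is a minimal sketch of one way to build sorted_features before writing it out. The helper names sort_features and export_features, the total_score column, and the small stand-in table are all assumptions made for illustration; in the real script, features comes from compare.compute(...).

```python
import pandas as pd


def sort_features(features: pd.DataFrame) -> pd.DataFrame:
    """Add a per-pair total score and put the best-scoring pairs first."""
    # total_score is a hypothetical column name, not something recordlinkage creates.
    scored = features.assign(total_score=features.sum(axis=1))
    return scored.sort_values("total_score", ascending=False, kind="mergesort")


def export_features(features: pd.DataFrame, output_file: str) -> None:
    """Sort the comparison results and write them to an Excel workbook."""
    sorted_features = sort_features(features)
    # Writing .xlsx files requires the openpyxl package to be installed.
    sorted_features.to_excel(output_file)


# Stand-in for the output of compare.compute(): one row per candidate pair,
# indexed by (id, uei), with a single 'business' column of 0/1 scores.
index = pd.MultiIndex.from_tuples(
    [(1, "A"), (1, "B"), (2, "A")], names=["id", "uei"]
)
features = pd.DataFrame({"business": [0.0, 1.0, 0.0]}, index=index)

sorted_features = sort_features(features)
print(sorted_features)
```

In the original script you would then call export_features(features, 'sorted_results.xlsx'), or simply keep the existing sorted_features.to_excel(output_file) line once sorted_features has been assigned.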