I am reading a file from HDFS and keep getting the following error: TypeError: 'int' object is not subscriptable
CSV file:
CLAIM_NUM,BEN_ST,AGE,MEDICAL_ONLY_IND,TTL_MED_LOSS,TTL_IND_LOSS,TTL_MED_EXP,TTL_IND_EXP,BP_CD,NI_CD,legalrep,depression,cardiac,diabetes,hypertension,obesity,smoker,subabuse,arthritis,asthma,CPT_codes,D,P,NDC_codes
123456789,IL,99,1,2201.26,0,97.16,0,31,4,1,0,0,0,0,0,0,0,0,0,NA,8409~71941,NA,NA
987654321,AL,98,1,568.12,0,20.82,0,42,52,1,0,0,0,0,0,0,0,0,0,NA,7242~8472~E9273,NA,NA
My code:
with hdfs.open("/user/ras.csv") as f:
    reader = f.read()
    for i, row in enumerate(reader, start=1):
        root = ET.Element('cbcalc')
        icdNode = ET.SubElement(root, "icdcodes")
        for code in row['D'].split('~'):
            ET.SubElement(icdNode, "code").text = code
        ET.SubElement(root, "clientid").text = row['CLAIM_NUM']
        ET.SubElement(root, "state").text = row['BEN_ST']
        ET.SubElement(root, "country").text = "US"
        ET.SubElement(root, "age").text = row['AGE']
        ET.SubElement(root, "jobclass").text = "1"
        ET.SubElement(root, "fulloutput").text = "Y"
        cfNode = ET.SubElement(root, "cfactors")
        for k in ['legalrep', 'depression', 'diabetes',
                  'hypertension', 'obesity', 'smoker', 'subabuse']:
            ET.SubElement(cfNode, k.lower()).text = str(row[k])
        psNode = ET.SubElement(root, "prosummary")
        psicdNode = ET.SubElement(psNode, "icd")
        for code in row['P'].split('~'):
            ET.SubElement(psNode, "code").text = code
        psndcNode = ET.SubElement(psNode, "ndc")
        for code in row['NDC_codes'].split('~'):
            ET.SubElement(psNode, "code").text = code
        cptNode = ET.SubElement(psNode, "cpt")
        for code in row['CPT_codes'].split('~'):
            ET.SubElement(cptNode, "code").text = code
        ET.SubElement(psNode, "hcpcs")
        doc = ET.tostring(root, method='xml', encoding="UTF-8")
        response = requests.post(target_url, data=doc, headers=login_details)
        response_data = json.loads(response.text)
        if type(response_data)==dict and 'error' in response_data.keys():
            error_results.append(response_data)
        else:
            api_results.append(response_data)
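What do I need to change so that I can loop through the CSV file and convert the data to XML for the API call?
I have tested this code in plain Python against a local file and it seems to work, but as soon as I point it at my file in HDFS it starts to break.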
1 Answer
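The problem is (probably; I don't have this library installed) that f.read() is returning a bytes object. If you iterate over it (using enumerate, for example), you will be looking at ints (one per byte of the file) rather than any kind of structured "row" object, and indexing an int with row['D'] raises the TypeError. You will need some additional processing before starting the loop you want to write.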
Something like this may do what you want:
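A minimal sketch of that extra processing, using only the standard library. It assumes f.read() returns the whole file as bytes and that the file is UTF-8 encoded; neither is confirmed for the HDFS client in the question. The raw literal stands in for the HDFS read so the example runs on its own, and rows_from_bytes is a hypothetical helper name, not part of any library.

```python
import csv
import io

# Stand-in for the bytes that f.read() is assumed to return from HDFS.
raw = (
    b"CLAIM_NUM,BEN_ST,AGE,D\n"
    b"123456789,IL,99,8409~71941\n"
    b"987654321,AL,98,7242~8472~E9273\n"
)

# Iterating over a bytes object yields ints, so row['D'] would fail
# with "TypeError: 'int' object is not subscriptable".
first = next(iter(raw))


def rows_from_bytes(data, encoding="utf-8"):
    """Decode raw CSV bytes and return one dict per row, keyed by header.

    The utf-8 default is an assumption; adjust it to the file's encoding.
    """
    text = data.decode(encoding)
    return list(csv.DictReader(io.StringIO(text)))


rows = rows_from_bytes(raw)
for i, row in enumerate(rows, start=1):
    # Each row is now a mapping, so string keys work as in the question.
    codes = row["D"].split("~")
    print(i, row["CLAIM_NUM"], codes)
```

With rows built this way, the body of the original loop can stay as it is, since row['D'], row['CLAIM_NUM'] and the other lookups now operate on dicts of strings.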