pandas: Iterate CSV header and create a Python list of dictionaries

Posted by js5cn81o on 2023-03-11 in Python


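I am trying to create a list of dictionaries by iterating over the header of a CSV file. For each column, I need to check for empty cells and set a field to True/False accordingly.
I have already loaded the CSV into a pandas DataFrame and created a list of its columns.
Input CSV data: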
| ID | Name | Location |
| ------- | ------- | ------- |
| 1 | Henry | London |
| 2 | Joe | Peru |
| 3 | NaN | Germany |
| 4 | Smith | NaN |
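For reference, the table above could be loaded into a DataFrame roughly as follows. This is a minimal sketch: the question does not give the file path, so an inline string stands in for the file, and the comma delimiter, the column capitalization, and the empty cell in row 3 are assumptions.

```python
import io

import pandas as pd

# Inline stand-in for the CSV file from the question (layout is assumed).
csv_text = """ID,Name,Location
1,Henry,London
2,Joe,Peru
3,,Germany
4,Smith,NaN
"""

# By default read_csv treats both empty cells and the literal text "NaN" as missing.
df = pd.read_csv(io.StringIO(csv_text))

columns = list(df.columns)      # header names in file order
null_flags = df.isna().any()    # one boolean per column

print(columns)
print(null_flags.to_dict())
```

With this frame, `columns` is the list to iterate over and `null_flags` already holds the per-column True/False value that the question calls `is_null`.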
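Output:

item = column name
seq = i*2
is_null = True if the column has any empty cell, otherwise False

[{item: ID, is_null: False, seq: 2, version: done} {item: Name, new: name, is_null: True, seq: 4, version: done} {item: Location, is_null: True, seq: 6, version: done}]
Updated output: 1. Add an extra field, version. 2. Convert uppercase column names to lowercase and add them as an additional field, "new".
I need help iterating over the header and producing this output. I am still learning, so please correct me if anything is wrong :)
New output: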

pandas

Source: https://stackoverflow.com/questions/75694684/iterate-csv-header-and-create-a-python-list-of-dictionary

4 Answers


6ljaweal1#

Your question is not entirely clear, but I think you might want:

out = (
 df.isna().any()                # check if any NaN per col
   .rename_axis('item')         # set index name
   .reset_index(name='is_null') # set flag name
   .assign(seq=lambda x: range(2, 2*len(x)+2, 2)) # assign counter * 2
   .to_dict('records')          # convert to a list of dicts
)

Alternatively, using a list comprehension:

out = [{'item': col, 'is_null': df[col].isna().any(), 'seq': i*2}
       for i, col in enumerate(df, start=1)]

Or:

out = [{'item': col, 'is_null': flag, 'seq': i*2}
       for i, (col, flag) in enumerate(df.isna().any().items(), start=1)]

Output:

[{'item': 'id', 'is_null': False, 'seq': 2},
 {'item': 'name', 'is_null': True, 'seq': 4},
 {'item': 'location', 'is_null': True, 'seq': 6}]
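The question's update also asks for a `version` field and a lowercase copy of the column name under `new`. The comprehension could be extended along these lines. This is only a sketch: it assumes `new` should be present for every column and that `version` is the constant string "done", neither of which the question states explicitly, and the example frame's column capitalization is likewise assumed.

```python
import pandas as pd

# Small frame mirroring the question's table (column capitalization is assumed).
df = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Name": ["Henry", "Joe", None, "Smith"],
    "Location": ["London", "Peru", "Germany", None],
})

out = [
    {
        "item": col,
        "new": col.lower(),       # assumed meaning of the "new" field
        "is_null": bool(flag),    # plain bool instead of numpy.bool_
        "seq": i * 2,
        "version": "done",        # constant taken from the question's example
    }
    for i, (col, flag) in enumerate(df.isna().any().items(), start=1)
]

print(out)
```

Wrapping the flag in `bool()` is optional; it just makes the dictionaries hold built-in Python booleans, which is convenient if the list is later serialized.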

2023-03-11

to94eoyn2#

You can use a list comprehension for this:

[{"item": c, "is_null": df[c].isnull().values.any(), "seq": i * 2} for i, c in enumerate(df.columns, start=1)]

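It loops over each column, uses the column name as the value of the item key, checks whether the column contains any null value and stores the result as is_null, and finally sets seq.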
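The answers on this page spell the null check in slightly different ways (`isna().any()`, `isnull().any()`, `isnull().values.any()`). In pandas, `isnull` is an alias of `isna`, so these should agree. A small check on a made-up frame (the column names `a` and `b` are placeholders) illustrates this:

```python
import pandas as pd

# Made-up example frame: column "a" is complete, column "b" has one missing value.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", None, "z"]})

for col in df.columns:
    via_isna = bool(df[col].isna().any())
    via_isnull = bool(df[col].isnull().any())
    via_values = bool(df[col].isnull().values.any())
    # All three spellings report the same flag for the column.
    assert via_isna == via_isnull == via_values
    print(col, via_isna)
```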

2023-03-11

eqqqjvef3#

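First, I think you should include some code along with your question to show what you have already tried.
As for the question itself, here is what you can do:
1. Use df.columns to get the columns of the DataFrame and convert the result to a list.
2. Then iterate over that list. Each element gives you the value for the item key, seq is a simple calculation from the position, and for is_null you just call df[col].isnull().values.any(), which returns True if any value in the column is null and False otherwise.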
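The two steps above can be sketched as an explicit loop. The example frame and its column names are placeholders chosen for illustration, and `start=1` is used so that seq comes out as 2, 4, 6 as in the question.

```python
import pandas as pd

# Example frame; the data is a placeholder resembling the question's table.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "name": ["Henry", "Joe", None, "Smith"],
    "location": ["London", "Peru", "Germany", None],
})

columns = list(df.columns)   # step 1: column names as a plain list

result = []
for i, col in enumerate(columns, start=1):   # step 2: walk the list
    has_null = bool(df[col].isnull().values.any())
    result.append({"item": col, "is_null": has_null, "seq": i * 2})

print(result)
```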
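Let me know if this helps. If you still cannot get it to work, edit the question to show what you tried so we can look at the code.
Edit: I have described a simple iterative approach, but the first solution in mozway's answer is definitely the best way to do it.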

2023-03-11

yk9xbfzb4#

Something like the following should work.

import pandas as pd

# Dataframe
df = pd.DataFrame({'col1': [1, 2, 3, None, 5], 
                   'col2': ['a', 'b', 'c', None, 'e'], 
                   'col3': [True, False, True, True, None]})

# create a list of dictionaries with the null status of each column;
# start=1 makes seq run 2, 4, 6 as requested in the question
null_list = [{'item': col, 'is_null': bool(df[col].isnull().any()), 'seq': i*2}
             for i, col in enumerate(df.columns, start=1)]

print(null_list)

Output:

[
 {'item': 'col1', 'is_null': True, 'seq': 2},
 {'item': 'col2', 'is_null': True, 'seq': 4},
 {'item': 'col3', 'is_null': True, 'seq': 6}
]

2023-03-11
