csv: Formatting the printed output of a Pandas DataFrame

5lhxktic  posted on 2022-12-06  in Other

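I'm building a program to keep track of my employees, and I store their information in a CSV file. I'm trying to loop over it and print the rows of employees who have no end_date, that is, those who are still employed. I've been able to print the correct rows, but the format isn't what I want: I'd like each record printed as a single row. Here is a sample of my CSV file: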

csv = [employee_id,name,address,Phone,date_of_birth,job_title,start_date,end_date
      1,Arya,New York,1234567890,1/1/1970,lecturer,1/1/2021,10/20/2022
      2,Terri,New York,25151521,010109,Nurse,10/10/2022,
      42,Bill,New York,2314,09/10/1994,Teacher,10/14/2022,
      48,Steve,New York,454554,08/10/1994,Teacher,02/25/2022,
      9,Stephen,New York,526415252,10/08/1994,Teacher,10/15/2022,N/A]

Here is the program I'm running:

import pandas as pd

df2 = pd.read_csv('employees.csv')

print()
for index, row in df2.iterrows():
    # a missing end_date is read as NaN, and str(NaN) is 'nan' (3 characters)
    if len(str(row['end_date'])) <= 3:
        print(df2.loc[index])
print()

The printed output for each row looks like this (repeated once per matching row):

employee_id             8
name                 Bill
address          New York
phone               25235
date_of_birth      081019
job_title        Engineer
start_date         081019
end_date              NaN
Name: 2, dtype: object

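However, I'd like the printed output to look like the original CSV, showing only the rows for people with no value in the 'end_date' column, like this: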

[employee_id,name,address,Phone,date_of_birth,job_title,start_date,end_date
2,Terri,New York,25151521,010109,Nurse,10/10/2022,
42,Bill,New York,2314,09/10/1994,Teacher,10/14/2022,
48,Steve,New York,454554,08/10/1994,Teacher,02/25/2022,

I don't want to use df.drop, because I want to keep everyone's record.

3bygqnnd1#

This should work:

import pandas as pd
import numpy as np

df = pd.DataFrame({'employee_id': {0: 1, 1: 2, 2: 42, 3: 48, 4: 9}, 'name': {0: 'Arya', 1: 'Terri', 2: 'Bill', 3: 'Steve', 4: 'Stephen'}, 'address': {0: 'New York', 1: 'New York', 2: 'New York', 3: 'New York', 4: 'New York'}, 'Phone': {0: '1234567890', 1: '25151521', 2: '2314', 3: '454554', 4: '526415252'}, 'date_of_birth': {0: '1/1/1970', 1: '010109', 2: '09/10/1994', 3: '08/10/1994', 4: '10/08/1994'}, 'job_title': {0: 'lecturer', 1: 'Nurse', 2: 'Teacher', 3: 'Teacher', 4: 'Teacher'}, 'start_date': {0: '1/1/2021', 1: '10/10/2022', 2: '10/14/2022', 3: '02/25/2022', 4: '10/15/2022'}, 'end_date': {0: '10/20/2022', 1: '', 2: '', 3: '', 4: 'N/A'}})

# treat 'N/A' and empty strings as missing values
df['end_date'] = df['end_date'].replace(['N/A', ''], np.nan)

# print only the rows with a missing end_date; df itself is left unchanged
print(df[df['end_date'].isna()])
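
If the goal is output that looks like the original CSV, one possible approach is to render the filtered rows with to_csv. The following is only a sketch under a few assumptions: io.StringIO stands in for employees.csv, the variable names are made up for illustration, and dtype=str is used so that values such as 010109 are kept exactly as written.

```python
import io
import pandas as pd

# Sample data copied from the question; io.StringIO stands in for employees.csv.
csv_text = """employee_id,name,address,Phone,date_of_birth,job_title,start_date,end_date
1,Arya,New York,1234567890,1/1/1970,lecturer,1/1/2021,10/20/2022
2,Terri,New York,25151521,010109,Nurse,10/10/2022,
42,Bill,New York,2314,09/10/1994,Teacher,10/14/2022,
48,Steve,New York,454554,08/10/1994,Teacher,02/25/2022,
9,Stephen,New York,526415252,10/08/1994,Teacher,10/15/2022,N/A
"""

# dtype=str keeps every field as text; empty fields and N/A still become NaN.
employees = pd.read_csv(io.StringIO(csv_text), dtype=str)

# Boolean indexing returns a new frame; `employees` itself keeps all records.
still_employed = employees[employees["end_date"].isna()]

# Called without a path, to_csv returns the CSV text as a string.
csv_output = still_employed.to_csv(index=False)
print(csv_output, end="")
```

Because read_csv treats both empty fields and N/A as missing by default, the replace step is not needed when loading from a file. Nothing is dropped from the full frame either, so df.drop is not involved.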

Also, there is a simpler way to get the vertical printout for each employee:

for e in df['employee_id']:
    # print() is needed outside a notebook; a bare expression shows nothing
    print(df[df['employee_id'] == e].transpose())

#output
employee_id             1
name                 Arya
address          New York
Phone          1234567890
date_of_birth    1/1/1970
job_title        lecturer
start_date       1/1/2021
end_date       10/20/2022
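
For a vertical printout without the trailing "Name: ..., dtype: object" line seen in the question, Series.to_string might be an option, since it omits the name and dtype footer by default. A minimal sketch, using a shortened two-row frame with made-up variable names:

```python
import pandas as pd

# Two records taken from the question's sample data, shortened for illustration.
employees = pd.DataFrame({
    "employee_id": [1, 2],
    "name": ["Arya", "Terri"],
    "job_title": ["lecturer", "Nurse"],
    "end_date": ["10/20/2022", None],
})

# Each row is a Series; to_string() renders label/value pairs without a footer.
blocks = [row.to_string() for _, row in employees.iterrows()]

# Separate the employees with a blank line.
print("\n\n".join(blocks))
```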
