Pandas - Create a new column and assign values to it

6uxekuva  posted on 2023-03-28  in Other

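I would like help assigning a file name (e.g. file1.txt) to a new column (e.g. filename). I am stuck on creating the new column and filling it with the file name: no new column appears when I export to .csv. I would appreciate advice on whether my logic is wrong.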

-raw text (no column names)-

file1.txt -> AL; 1A;
file1.txt -> BL; 2A;
file1.txt -> CL; 3A;

-sample file path - C:\Users\CL\Desktop\folder\file1.txt-

-desired output (add filename col)-

name class filename
AL   1A    file1.txt
BL   2A    file2.txt
CL   3A    file3.txt
-current progress-

import os
import pandas as pd
import glob

src_path = r'C:\Users\CL\Desktop\folder' #3 files total

for f in glob.glob(os.path.join(src_path ,"*.txt")):
    filename = f #file1.txt
    files = [pd.read_csv(f, delimiter=';', names = ['name', 'class'], index_col = False)]
    files_df = pd.DataFrame(files) #convert to df to add new column
    files_df['filename'] = f #assign value to new column

files_df = pd.concat(files_df) #concat all file data together
files_df.to_csv("df.csv")
---update---

#trying to include index value for each row's data.

#desired output
name class filename
AL   1A    file1_1.txt
AL   1A    file1_2.txt
BL   2A    file2_1.txt
BL   2A    file2_2.txt
CL   3A    file3_1.txt
CL   3A    file3_2.txt

import os
import pandas as pd
import glob

i = 0

src_path = r'C:\Users\CL\Desktop\folder' #3 files total

pd.concat([pd.read_csv(f, delimiter=';', names=['name', 'class'], index_col=False
                       ).assign(filename=f)
           for i, f in enumerate(glob.glob(os.path.join(src_path ,"*.txt"))), i+=1]
          ).to_csv("df.csv")
cbjzeqam #1

Untested, but I think your code should be changed to:

import os
import pandas as pd
import glob

src_path = r'C:\Users\CL\Desktop\folder' #3 files total

all_dfs = []

for f in glob.glob(os.path.join(src_path ,"*.txt")):
    tmp_df = pd.read_csv(f, delimiter=';', names=['name', 'class'], index_col=False)
    # basename keeps only 'file1.txt' instead of the full path
    all_dfs.append(tmp_df.assign(filename=os.path.basename(f)))

files_df = pd.concat(all_dfs) #concat all file data together
files_df.to_csv("df.csv")
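
For reference, here is a self-contained sketch of the same loop that can be run without the original folder. It writes three small sample files into a temporary directory; the file contents and the helper name `load_with_filename` are assumptions made for illustration, not part of the question. It also uses `os.path.basename` so the new column holds `file1.txt` rather than the full path, and `skipinitialspace=True` to drop the space after each `;`.

```python
import glob
import os
import tempfile

import pandas as pd


def load_with_filename(src_path):
    """Read every .txt file in src_path and tag each row with its file name.

    Hypothetical helper, named here only for illustration.
    """
    all_dfs = []
    for f in sorted(glob.glob(os.path.join(src_path, "*.txt"))):
        tmp_df = pd.read_csv(
            f,
            delimiter=";",
            names=["name", "class"],
            index_col=False,        # rows end with ';', so do not use column 0 as the index
            skipinitialspace=True,  # drop the space that follows each ';'
        )
        # Store only the file name (e.g. 'file1.txt'), not the full path
        all_dfs.append(tmp_df.assign(filename=os.path.basename(f)))
    return pd.concat(all_dfs, ignore_index=True)


# Assumed sample data mirroring the layout shown in the question
with tempfile.TemporaryDirectory() as tmp:
    samples = {
        "file1.txt": "AL; 1A;\n",
        "file2.txt": "BL; 2A;\n",
        "file3.txt": "CL; 3A;\n",
    }
    for name, text in samples.items():
        with open(os.path.join(tmp, name), "w") as fh:
            fh.write(text)
    files_df = load_with_filename(tmp)

print(files_df)
```

Passing `ignore_index=True` to `pd.concat` gives the combined frame a fresh 0..n-1 index instead of repeating each file's own row numbers; add `index=False` to `to_csv` if the index should not be written at all.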

As a "one-liner":

import os
import pandas as pd
import glob

src_path = r'C:\Users\CL\Desktop\folder' #3 files total

pd.concat([pd.read_csv(f, delimiter=';', names=['name', 'class'], index_col=False
                       ).assign(filename=os.path.basename(f))  # file name only, not the full path
           for f in glob.glob(os.path.join(src_path ,"*.txt"))]
          ).to_csv("df.csv")
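
Regarding the update in the question: the `i+=1` inside the list comprehension is a syntax error, and `enumerate` already supplies a counter without it. One possible reading of the desired output is that each row gets its 1-based position within its file appended to the file-name stem. A minimal sketch under that assumption, with in-memory sample data standing in for the real files (the helper name `tag_rows` and the sample contents are hypothetical):

```python
import io
import os

import pandas as pd


def tag_rows(source, path):
    """Read one file and label each row as <stem>_<row number><ext>.

    `source` is anything pd.read_csv accepts (a path or a file-like object);
    `path` is only used to build the label.
    """
    df = pd.read_csv(source, delimiter=";", names=["name", "class"],
                     index_col=False, skipinitialspace=True)
    stem, ext = os.path.splitext(os.path.basename(path))
    # 1-based row position within this file, e.g. file1_1.txt, file1_2.txt
    df["filename"] = [f"{stem}_{n}{ext}" for n in range(1, len(df) + 1)]
    return df


# Hypothetical sample data: file name -> file contents
samples = {
    "file1.txt": "AL; 1A;\nAL; 1A;\n",
    "file2.txt": "BL; 2A;\nBL; 2A;\n",
}

result = pd.concat(
    [tag_rows(io.StringIO(text), name) for name, text in samples.items()],
    ignore_index=True,
)
print(result)
```

With real files, the path returned by `glob.glob(...)` can be passed as both arguments, e.g. `tag_rows(f, f)`.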
