Merging information from two files into one CSV file

y0u0uwnf  posted on 2022-12-06  in Other

For example, the first file contains names and dates separated by a colon:

john:01.01.2001
mary:06.03.2016

The second file contains names and cities:

john:london
mary:new york

I need to merge them by name into a CSV file, like this:

name,town,date
john,london,01.01.2001
mary,new york,06.03.2016

Also, if information about a person is missing, the output file should contain "-" in its place:

name,town,date
john,-,01.01.2001
mary,new york,-

kyvafyod1#

A rough draft. I will tidy it up later when I get the chance.

cat name_date.csv
john:01.01.2001
mary:06.03.2016
sue:

cat name_city.csv
john:london
mary:new york
bob:

import csv

new_dict = {}

# Read name:date pairs; an empty or missing date becomes '-'.
with open("name_date.csv", newline='') as dt_csv:
    dt_dictR = csv.DictReader(dt_csv, fieldnames=["name", "date"], delimiter=':')
    for row in dt_dictR:
        new_dict[row["name"]] = {"date": row["date"] or '-'}

# Read name:city pairs and merge them into the entries collected above.
with open("name_city.csv", newline='') as city_csv:
    dt_dictC = csv.DictReader(city_csv, fieldnames=["name", "city"], delimiter=':')
    for row in dt_dictC:
        city = row["city"] or '-'
        if row["name"] in new_dict:
            new_dict[row["name"]]["city"] = city
        else:
            # Name appears only in the city file, so it has no date.
            new_dict[row["name"]] = {"date": '-', "city": city}

# Write the merged rows; names without a city entry get '-'.
with open("merged_csv", "w", newline='') as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(["name", "town", "date"])
    for name, info in new_dict.items():
        csv_w.writerow([name, info.get("city", '-'), info["date"]])

cat merged_csv
name,town,date
john,london,01.01.2001
mary,new york,06.03.2016
sue,-,-
bob,-,-

Some simplification using defaultdict:

import csv
from collections import defaultdict


def cityDateDict():
    # Default record: both fields start out as '-'.
    return {"city": "-", "date": "-"}


new_dict = defaultdict(cityDateDict)

with open("name_date.csv", newline='') as dt_csv:
    dt_dictR = csv.DictReader(dt_csv, fieldnames=["name", "date"], delimiter=':')
    for row in dt_dictR:
        # Looking up the name creates its default record if it is new.
        record = new_dict[row["name"]]
        # row["date"] is None when the line has no ':' at all.
        if (row["date"] or "").strip():
            record["date"] = row["date"]

with open("name_city.csv", newline='') as city_csv:
    dt_dictC = csv.DictReader(city_csv, fieldnames=["name", "city"], delimiter=':')
    for row in dt_dictC:
        record = new_dict[row["name"]]
        if (row["city"] or "").strip():
            record["city"] = row["city"]

with open("merged_csv", "w", newline='') as out_file:
    csv_w = csv.writer(out_file)
    csv_w.writerow(["name", "town", "date"])
    for name, record in new_dict.items():
        csv_w.writerow([name, record["city"], record["date"]])

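defaultdict lets you build a dictionary dynamically with 'default' values. In this case the default city/date dict has - for both values. The corresponding key (city/date) can then be updated with a non-empty value to override the default.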
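To see the default-then-override behaviour in isolation, here is a minimal sketch that skips the file handling and fills the dictionary by hand. The names `city_date_defaults` and `records` are made up for this illustration; the sample values come from the question.

```python
from collections import defaultdict


def city_date_defaults():
    # Hypothetical factory name. It returns a fresh dict on every call,
    # so no two people share the same record.
    return {"city": "-", "date": "-"}


records = defaultdict(city_date_defaults)

# Merely looking up a missing key creates it with the default values.
records["sue"]

# Assigning a field overrides only that field's default.
records["john"]["date"] = "01.01.2001"
records["john"]["city"] = "london"
records["mary"]["city"] = "new york"

for name, info in records.items():
    print(name, info["city"], info["date"], sep=",")
# sue,-,-
# john,london,01.01.2001
# mary,new york,-
```

The factory has to be a callable rather than a ready-made dict; if every key pointed at one shared dict, updating one person's city would appear to change everyone's.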
