Read two rows of a CSV at a time, merge values, write to a new CSV

6qftjkof  posted on 2022-12-06  in Other

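How can I read an input CSV two rows at a time, merge some of the values into one new row, and then write that row to a new CSV?
In the input below, I want to read two rows, take price1 from the second row and set it as price2 in a new combined row, then repeat this for the next two rows: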

Input

date, name, qt, price1
9/12/22, AB, 2, 5.00
9/12/22, AB, 2, 5.08
9/12/22, BC, 1, 2.00
9/12/22, BC, 1, 2.03

New CSV

date, name, qt, price1, price2
9/12/22, AB, 2, 5.00, 5.08
9/12/22, BC, 1, 2.00, 2.03

First attempt
Without Adrian's answer:

  • import csv
  • # newline requires Python 3
  • the CSV must be UTF-8, and the headers must match

bnlyeluc1#

cat price_old.csv
date,name,qt,price1
9/12/22,AB,2,5.00
9/12/22,AB,2,5.08
9/12/22,BC,1,2.00
9/12/22,BC,1,2.03

import csv

with open("price_old.csv", newline="") as old_csv:
    new_list = []
    dt, nm = None, None
    c_reader = csv.DictReader(old_csv)
    for row in c_reader:
        # A row whose date or name differs from the pending one starts a new pair
        if row['date'] != dt or row['name'] != nm:
            dt, nm = row['date'], row['name']
            price_list = [row['price1']]
        else:
            # Second row of the pair: merge both prices into one output row
            price_list.append(row['price1'])
            row.update({'price1': price_list[0], 'price2': price_list[1]})
            new_list.append(row)
            price_list = []
            dt, nm = None, None

# newline='' requires Python 3
with open('price_new.csv', 'w', newline='') as new_csv:
    c_writer = csv.DictWriter(new_csv, fieldnames=['date', 'name', 'qt', 'price1', 'price2'])
    c_writer.writeheader()
    c_writer.writerows(new_list)

cat price_new.csv

date,name,qt,price1,price2
9/12/22,AB,2,5.00,5.08
9/12/22,BC,1,2.00,2.03

e4eetjau2#

Using pandas will make your life easier.

import pandas as pd

# skipinitialspace=True drops the space that follows each comma in the input
df = pd.read_csv('test.csv', skipinitialspace=True)

# Group by the first 3 columns, and use the rows to form the columns (unstack)
df = df.groupby(['date','name','qt'])['price1'].apply(
    lambda s: s.reset_index(drop=True)).unstack().reset_index()

# Rename the columns
df.columns = [*df.columns[:-2], 'price1', 'price2']

df.to_csv('output.csv', sep=',', index=False)

output.csv is:

date,name,qt,price1,price2
9/12/22,AB,2,5.0,5.08
9/12/22,BC,1,2.0,2.03

ryevplcw3#

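Python's csv readers make it easy to read any number of rows at a time.
The readers (both csv.reader and csv.DictReader) are iterators, which means we can call next() to fetch a row manually, just as elem = next(my_iterator) fetches the next element from any iterator: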

import csv
import pprint

new_rows = []
with open("input.csv", newline="") as f:
    reader = csv.reader(f)

    header = next(reader)
    new_rows.append(header + [" price 2"])

    row1 = next(reader)
    row2 = next(reader)
    new_rows.append(row1 + row2[-1:])  # keep only the price from the second row

    row1 = next(reader)
    row2 = next(reader)
    new_rows.append(row1 + row2[-1:])

pprint.pprint(new_rows)

which gives us:

[
    ['date',    ' name', ' qt', ' price1', ' price 2'],
    ['9/12/22',   ' AB',  ' 2',   ' 5.00',    ' 5.08'],
    ['9/12/22',   ' BC',  ' 1',   ' 2.00',    ' 2.03']
]

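The same approach can be driven by a for loop. The loop always hands us the "first row" of the current group, and inside the loop we fetch all the remaining rows of that group: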

with open("input.csv", newline="") as f:
    reader = csv.reader(f)

    header = next(reader)
    new_rows.append(header + [" price 2"])

    for row1 in reader:
        row2 = next(reader)
        new_rows.append(row1 + row2[-1:])
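
The snippets above stop at building new_rows; the question also asks for the result to be written to a new CSV. A minimal sketch of the whole round trip follows. The function name merge_pairs, the in-memory io.StringIO stand-ins for the two files, and the use of skipinitialspace=True to drop the space after each comma are assumptions made for illustration, not part of the original answer.

```python
import csv
import io


def merge_pairs(src, dst):
    """Read rows two at a time from src and write one merged row per pair to dst."""
    # skipinitialspace=True strips the blank that follows each comma in the input
    reader = csv.reader(src, skipinitialspace=True)
    writer = csv.writer(dst)

    header = next(reader)
    writer.writerow(header + ["price2"])

    for row1 in reader:
        row2 = next(reader)  # assumes an even number of data rows
        writer.writerow(row1 + row2[-1:])


# In-memory stand-ins for the input file and the new CSV (sample data from the question)
src = io.StringIO(
    "date, name, qt, price1\n"
    "9/12/22, AB, 2, 5.00\n"
    "9/12/22, AB, 2, 5.08\n"
    "9/12/22, BC, 1, 2.00\n"
    "9/12/22, BC, 1, 2.03\n"
)
dst = io.StringIO(newline="")
merge_pairs(src, dst)
print(dst.getvalue())
```

With real files, the two StringIO objects would be replaced by open("input.csv", newline="") and open("output.csv", "w", newline="").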

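This approach assumes that your input has a header and that the number of data rows is an exact multiple of the group size. If your input doesn't fit that, you'll get a StopIteration exception that looks like this (in Python 3.11):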

File "/Users/zyoung/develop/StackOverflow/./main.py", line 31, in <module>
    row2 = next(reader)
           ^^^^^^^^^^^^
StopIteration

If that can happen with your input, you can add exception handling:

for row1 in reader:
    try:
        row2 = next(reader)
    except StopIteration:
        print(f"encountered an odd row on line {reader.line_num}; stopping reading, moving on to writing")
        break
    new_rows.append(row1 + row2[-1:])
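
As an alternative to try/except, next() accepts a default value that is returned when the iterator is exhausted. The sketch below uses that form; the in-memory sample with a deliberately unpaired last row is an assumption made for illustration.

```python
import csv
import io

# Sample input with an odd number of data rows (the last BC row has no partner)
src = io.StringIO(
    "date,name,qt,price1\n"
    "9/12/22,AB,2,5.00\n"
    "9/12/22,AB,2,5.08\n"
    "9/12/22,BC,1,2.00\n"
)

reader = csv.reader(src)
header = next(reader)
new_rows = [header + ["price2"]]

for row1 in reader:
    # next() returns None here instead of raising StopIteration at end of input
    row2 = next(reader, None)
    if row2 is None:
        print(f"unpaired row at line {reader.line_num}; skipping it")
        break
    new_rows.append(row1 + row2[-1:])
```

The unpaired row is dropped here; whether to drop it, write it as-is, or raise an error depends on what the data means.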

Finally, this idea extends to any number of rows per logical group, for example 3 rows per group.
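
The code that originally followed this sentence did not survive on this page. One possible sketch of the idea uses itertools.islice to pull the remaining rows of each group. The helper name merge_groups, the ValueError on an incomplete group, and the three-row sample data (the third price in each group is invented) are all assumptions for illustration.

```python
import csv
import io
from itertools import islice


def merge_groups(reader, group_size):
    """Yield one merged row per group of `group_size` consecutive rows."""
    for first in reader:
        # Pull the remaining rows of this group from the same iterator
        rest = list(islice(reader, group_size - 1))
        if len(rest) < group_size - 1:
            raise ValueError(f"incomplete group ending at line {reader.line_num}")
        # Keep the first row whole and append the last column of each later row
        yield first + [row[-1] for row in rest]


# Made-up sample with 3 rows per group
src = io.StringIO(
    "date,name,qt,price1\n"
    "9/12/22,AB,2,5.00\n"
    "9/12/22,AB,2,5.08\n"
    "9/12/22,AB,2,5.10\n"
    "9/12/22,BC,1,2.00\n"
    "9/12/22,BC,1,2.03\n"
    "9/12/22,BC,1,2.05\n"
)

reader = csv.reader(src)
header = next(reader)
new_rows = [header + ["price2", "price3"]]
new_rows.extend(merge_groups(reader, 3))
```

islice is used instead of repeated next() calls so that running out of rows inside the generator shows up as a short list rather than a StopIteration raised from within it.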
