CSV to YAML, with multiple tests

vqlkdk9b  posted on 2023-02-27 in Other

I am trying to write a Python function that converts a CSV file to YAML. Right now my CSV looks like this:

name,description,tests
product_id,ID of the product,not null|unique
product_name,Name of the product,not null

I want the output to be:

- name: product_id
  description: ID of the product
  tests:
  - not null
  - unique
- name: product_name
  description: Name of the product
  tests:
  - not null

This is all I have so far:

result = []
# Only set if the corresponding column exists in the header row
type_index = child_fields_index = None

# datareader is assumed to be a csv.reader over the input file
for row_index, row in enumerate(datareader):
  if row_index == 0:
    # let's do this once here
    data_headings = list()
    for heading_index, heading in enumerate(row):
      fixed_heading = heading.lower().replace(" ", "_").replace("-", "")
      data_headings.append(fixed_heading)
      if fixed_heading == "type":
        type_index = heading_index
      elif fixed_heading == "childfields":
        child_fields_index = heading_index
  else:
    content = dict()
    is_array = False
    for cell_index, cell in enumerate(row):
      content[data_headings[cell_index]] = cell
      is_array = (cell_index == type_index) and (cell == "array")
    result.append(content)

70gysomp1#

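Python's standard library has a csv module for handling CSV files. Its DictReader class assumes that the first line of the input file holds the column names (unless you provide the fieldnames parameter). With it, you only need to do something special for the field name 'tests'. Unfortunately it cannot yet handle pathlib.Path() instances, so you have to open the file yourself.
You should dump the resulting data structure with ruamel.yaml, which you have to install in your virtualenv, e.g. using python -m pip install ruamel.yaml. It supports YAML 1.2 (disclaimer: I am the author of that package).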

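import csv
from pathlib import Path
import ruamel.yaml

input_path = Path('input.csv')    # renamed to avoid shadowing the built-in input()
output_path = Path('output.yaml')

data = []
# DictReader needs an open file object; the csv docs recommend newline=''
with input_path.open(newline='') as f:
    reader = csv.DictReader(f)
    for row in reader:
        d = {}
        data.append(d)
        for field in reader.fieldnames:
            # 'tests' is a '|'-separated list; every other field stays a string
            d[field] = row[field].split('|') if field == 'tests' else row[field]

yaml = ruamel.yaml.YAML()
yaml.dump(data, output_path)

print(output_path.read_text())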

which gives:

- name: product_id
  description: ID of the product
  tests:
  - not null
  - unique
- name: product_name
  description: Name of the product
  tests:
  - not null
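
One edge case the loop above does not cover is a row whose tests cell is empty: "".split("|") returns [""], so the YAML would contain a list with one empty string. The following is a minimal sketch of the parsing step only, assuming an empty cell should become an empty list. The function name csv_to_records and the product_note row are hypothetical and exist only for illustration; io.StringIO stands in for the opened file.

```python
import csv
import io

# Sample input copied from the question, plus one hypothetical row
# (product_note) whose tests cell is empty.
SAMPLE = """name,description,tests
product_id,ID of the product,not null|unique
product_name,Name of the product,not null
product_note,Free text,
"""


def csv_to_records(text):
    """Parse CSV text into a list of dicts, splitting 'tests' on '|'."""
    reader = csv.DictReader(io.StringIO(text, newline=""))
    records = []
    for row in reader:
        record = dict(row)
        # "".split("|") would give [""], so drop empty entries explicitly.
        record["tests"] = [t for t in row["tests"].split("|") if t]
        records.append(record)
    return records


records = csv_to_records(SAMPLE)
for record in records:
    print(record)
```

The resulting list can be passed to yaml.dump() exactly as in the answer above. Whether an empty list or a missing key is the better representation for "no tests" depends on whatever consumes the YAML.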

cbeh67ev2#

In my opinion, a more Pythonic solution is:

import yaml
import csv

# The csv docs recommend opening the file with newline=''
with open('sample_data.csv', newline='') as f:
    reader = csv.DictReader(f)
    data = []
    for row in reader:
        item = {}
        item['name'] = row['name']
        item['description'] = row['description']
        # 'tests' is a '|'-separated list of test names
        item['tests'] = row['tests'].split('|')
        data.append(item)

with open('output.yaml', 'w') as f:
    yaml.dump(data, f)

Input:

name,description,tests
product_id,ID of the product,not null|unique
product_name,Name of the product,not null

Output:

- description: ID of the product
  name: product_id
  tests:
  - not null
  - unique
- description: Name of the product
  name: product_name
  tests:
  - not null
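
Note that the keys in this output are sorted alphabetically (description before name), which differs from the order asked for in the question. PyYAML sorts mapping keys by default; passing sort_keys=False keeps insertion order instead. A minimal sketch, assuming PyYAML 5.1 or newer (where that parameter was added) and using safe_dump to a string so the result is easy to inspect:

```python
import yaml  # PyYAML, as used in the answer above

# Same structure the CSV loop produces, written out literally.
data = [
    {"name": "product_id", "description": "ID of the product",
     "tests": ["not null", "unique"]},
    {"name": "product_name", "description": "Name of the product",
     "tests": ["not null"]},
]

# sort_keys=False preserves the dict insertion order (name, description, tests)
# instead of PyYAML's default alphabetical ordering.
text = yaml.safe_dump(data, sort_keys=False)
print(text)
```

The same keyword works with yaml.dump(data, f, sort_keys=False) when writing directly to a file.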
