Python csv writer putting data on the wrong row (Python 3.7), strange formatting

Posted by enxuqcxy in Python, 12 months ago


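I am trying to extract data from a web page with BeautifulSoup and format it as a .csv file. I have successfully gotten the data into the file, but I cannot get the file formatted correctly.
My problem is that if the first column has 10 items (11 rows with the header), the data for the next column starts on row 12. The .csv ends up staggered (like a staircase), for example: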

Field1,Field2,Field3
data1,,
data1,,
data1,,
,data2,
,data2,
,data2,
,,data3
,,data3
,,data3

Obviously, the .csv would be much easier to use if it were formatted like this:

Field1,Field2,Field3
data1,data2,data3
data1,data2,data3
data1,data2,data3

My code looks like this:

import time
import requests
import csv
from bs4 import BeautifulSoup

# Time to wait between each item.
t = .010

# Create a csv file to write to.
f = open('filename.csv', 'w')
fieldnames = ('Field1','Field2')
writer = csv.DictWriter(f, fieldnames = fieldnames, lineterminator = '\n')
writer.writeheader()

# Define target page.
url = 'https://www.example.com'
page = requests.get(url)
soup = BeautifulSoup(page.text, 'html.parser')

# Filter useful information from the page.
data_list = soup.find(class_='class0')
data_raw = data_list.find_all(class_='class1')
otherData_raw = otherData_list.find_all(class_='class2')

# Extract [data1] from html.
for data_location in data_raw:
    data_refine = data_location.find_all('a')

    for data_item in data_refine:
        field1 = data_item.contents[0]
        writer.writerow({'Field1':field1})
    time.sleep(t)

# Extract [data2] from html.
for otherData_location in otherData_raw:
    otherData_refine = otherData_location.find_all('a')

    for otherData_item in otherData_refine:
        field2 = otherData_item.contents[0]
        writer.writerow({'Field2':field2})
    time.sleep(t)

f.close()

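I have tried a few solutions without any luck. I am a beginner with Python, so I apologize in advance if this is a stupid question. I would greatly appreciate any help with this problem. Thank you!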


Source: https://stackoverflow.com/questions/55167781/python-csv-writer-putting-data-on-wrong-row-python-3-7-strange-formatting

2 answers


#1 qaxu7uf2

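My suggestion is to collect all the data before writing anything out. If you want several values on one row, add them to a list, then write the lists to the CSV like this (here csv_data is a list of rows, each row a list of strings):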

with open('csv.csv', 'w', encoding='utf-8') as f:
    for line in csv_data:
        f.write(','.join(line) + '\n')

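Of course, you could also use the csv module.
Your question is rather vague; it would help to answer it if you provided an example page you want to scrape and the fields you are interested in.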
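A minimal sketch of the same idea using the csv module, assuming each column has already been collected into its own list. The list names and values below are hypothetical placeholders, and an in-memory buffer stands in for the real output file:

```python
import csv
import io

# Hypothetical scraped values, one list per column (placeholders, not real data).
field1_values = ["a1", "a2", "a3"]
field2_values = ["b1", "b2", "b3"]
field3_values = ["c1", "c2", "c3"]

# zip() pairs the i-th item of every column into a single row.
csv_data = list(zip(field1_values, field2_values, field3_values))

# The buffer stands in for open('filename.csv', 'w', newline='').
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Field1", "Field2", "Field3"])  # header row
writer.writerows(csv_data)                       # one call writes every data row

output = buffer.getvalue()
print(output)
```

Unlike joining with ',' by hand, csv.writer also quotes values that themselves contain commas or quotes, which scraped text often does.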


#2 v8wbuo2f

The code writes one cell per row:

writer.writerow({'Field1':field1})

will write

foo,,  # Only Field1 column is populated

writer.writerow({'Field2':field2})

will write

,foo,  # Only Field2 column is populated

Collect all of a row's columns before writing the row to the file:

row = {'Field1': 'foo', 'Field2': 'bar'...}
writer.writerow(row)
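Applied to the question's code, this could mean appending each value to a per-column list inside the scraping loops, then writing the rows once at the end. The sketch below assumes exactly that. The helper write_columns and the sample values are hypothetical, and an in-memory buffer stands in for the real file:

```python
import csv
import io
from itertools import zip_longest


def write_columns(out_file, columns):
    """Write a dict of {field name: list of values} as aligned CSV rows."""
    fieldnames = list(columns)
    writer = csv.DictWriter(out_file, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    # zip_longest pads shorter columns with '' so longer columns lose no data.
    for values in zip_longest(*columns.values(), fillvalue=""):
        writer.writerow(dict(zip(fieldnames, values)))


# Hypothetical values standing in for what the two scraping loops would collect.
columns = {
    "Field1": ["foo1", "foo2", "foo3"],
    "Field2": ["bar1", "bar2"],
}

buffer = io.StringIO()  # stands in for open('filename.csv', 'w', newline='')
write_columns(buffer, columns)
result = buffer.getvalue()
print(result)
```

zip_longest is used here instead of zip because the scraped columns may not have the same number of items. Plain zip would silently drop the extra values of the longer column.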

