I appended CSV data; now I want to loop through it in the main project file in the same format

gkl3eglg, posted on 2023-02-27 in Other

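I am trying to scrape news articles from several websites. I set this up as a project, but I don't know how to pull the results into main.py while keeping the right format, so that I get every category for every site. I can get the right format for each site on its own, for example by putting the with open section from main.py before the loop over lists in the individual site's .py file. What I want is to loop over the results from every website so that they all end up in the same .csv file.
An individual website's .py file looks something like this: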

from bs4 import BeautifulSoup as soup
import requests
import pandas as pd

URL = 'https://ir.akerotx.com/press-releases'
full = 'https://ir.akerotx.com'

AKROlinks = []

html_text = requests.get(URL).text
chickennoodle = soup(html_text, 'html.parser')

lists = chickennoodle.find_all("article", class_ = "clearfix node node--nir-news--nir-widget-list node--type-nir-news node--view-mode-nir-widget-list node--promoted")

for article in lists:  # "article" avoids shadowing the built-in list
    ticker = "AKRO"
    headline = article.find("div", class_ = "nir-widget--field nir-widget--news--headline")
    title = headline.text.strip()
    link = full + headline.a["href"]
    date = article.find("div", class_ = "nir-widget--field nir-widget--news--date-time").text.strip()
    AKROinfo = [ticker, title, link, date]
    #print(AKROinfo)
    AKROlinks.append(AKROinfo)

# Print once, after every article has been collected
print(AKROlinks)

main.py looks like this:

from csv import writer

output = "C:\\user\\file location.csv"

from AKROscrape import AKROlinks
from AXLAscrape import AXLAlinks

links2excel =(AXLAlinks, AKROlinks)

with open(output, 'w', encoding = 'utf8', newline = "") as f:
    thewriter = writer(f)
    header = ["Ticker","Title", "Link", "Date"]
    thewriter.writerow(header)
    for i in links2excel:
        thewriter.writerow(i)

My current output looks like this:

What I want is:
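The difference between the two shapes can be reproduced with a small sketch. The rows below are made-up placeholders, not real scraped data, and the to_csv helper is hypothetical; it writes to an in-memory buffer so nothing touches the real output file.

```python
import csv
import io

# Placeholder rows standing in for the scraped lists (not real data).
AKROlinks = [
    ["AKRO", "Title A1", "https://example.com/a1", "Jan 1, 2023"],
    ["AKRO", "Title A2", "https://example.com/a2", "Jan 2, 2023"],
]
AXLAlinks = [
    ["AXLA", "Title X1", "https://example.com/x1", "Jan 3, 2023"],
]


def to_csv(rows):
    """Write rows to an in-memory CSV and return the resulting text."""
    buffer = io.StringIO(newline="")
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


# Tuple of two lists: each "row" is a whole site's list of articles, so
# every cell ends up holding the string form of one article list.
current = to_csv((AXLAlinks, AKROlinks))

# One flat list of articles: each row is [ticker, title, link, date].
wanted = to_csv(AXLAlinks + AKROlinks)

print(current)
print(wanted)
```

In the first case there is one CSV row per website; in the second there is one CSV row per article, with four columns each.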

yqkkidmi


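ChatGPT answered it: the problem was the way links2excel was built. The tuple (AXLAlinks, AKROlinks) has only two elements, each of them a whole list of rows, so writerow wrote each site's entire list as a single row. Concatenating the lists with + gives one flat list in which every element is a single [ticker, title, link, date] row.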
main.py:

from csv import writer

output = "D:\\Lickety-Split\\Python\\Python to Excel\\From Python.csv"

from AKROscrape import AKROlinks
from AXLAscrape import AXLAlinks


links2excel = AXLAlinks + AKROlinks

with open(output, 'w', encoding = 'utf8', newline = "") as f:
    thewriter = writer(f)
    header = ["Ticker","Title", "Link", "Date"]
    thewriter.writerow(header)

    for link in links2excel:
        thewriter.writerow(link)
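If more sites are added later, the + concatenation can be generalized to any number of per-site lists. The following is only a sketch under some assumptions: each scraper module exposes a list of [ticker, title, link, date] rows as above, the sample rows are placeholders, and write_rows is a hypothetical helper name. It writes to an in-memory buffer rather than the real output path.

```python
import csv
import io
from itertools import chain

HEADER = ["Ticker", "Title", "Link", "Date"]

# Placeholder rows standing in for the imported per-site lists.
AKROlinks = [
    ["AKRO", "Title A1", "https://example.com/a1", "Jan 1, 2023"],
    ["AKRO", "Title A2", "https://example.com/a2", "Jan 2, 2023"],
]
AXLAlinks = [
    ["AXLA", "Title X1", "https://example.com/x1", "Jan 3, 2023"],
]


def write_rows(f, *site_lists):
    """Write the header, then every row from every per-site list."""
    thewriter = csv.writer(f)
    thewriter.writerow(HEADER)
    # chain.from_iterable flattens the per-site lists into a single
    # stream of rows, the same result as adding the lists with +.
    thewriter.writerows(chain.from_iterable(site_lists))


buffer = io.StringIO(newline="")
write_rows(buffer, AXLAlinks, AKROlinks)
csv_text = buffer.getvalue()
print(csv_text)
```

With a real file, the same helper could be called inside the existing with open(output, 'w', encoding = 'utf8', newline = "") as f: block, passing f and one list per site.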
