import re

with open('text.txt') as f:  # replace text.txt with the path to your text file
    movie_id = None
    for line in f:
        movie = re.search(r"^(\d+):", line)
        result = re.search(r"^(\d+),(\d+),(\d{4}-\d{2}-\d{2})", line)
        if movie is not None:
            movie_id = movie.group(1)  # a line such as "1:" starts a new movie
        elif result:
            customer_id = result.group(1)
            rating = result.group(2)
            date = result.group(3)
            data_list = [customer_id, rating, date, movie_id]  # the data you want; you can store it in a CSV file
            # YOUR CODE
        else:
            continue
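The comment in the code above suggests storing each data_list as a CSV row. Here is a minimal sketch of that step using the standard csv module. The function name parse_ratings, the sample text, and the column order are assumptions made for illustration and are not part of the original answer.

```python
import csv
import io
import re

MOVIE_RE = re.compile(r"^(\d+):")
RATING_RE = re.compile(r"^(\d+),(\d+),(\d{4}-\d{2}-\d{2})")

def parse_ratings(lines):
    """Yield [customer_id, rating, date, movie_id] for each rating line."""
    movie_id = None
    for line in lines:
        movie = MOVIE_RE.match(line)
        if movie:
            movie_id = movie.group(1)  # remember the current movie header
            continue
        rating = RATING_RE.match(line)
        if rating and movie_id is not None:
            yield [*rating.groups(), movie_id]

# Hypothetical sample in the layout the regexes expect.
sample = io.StringIO(
    "1:\n1488844,3,2005-09-06\n822109,5,2005-05-13\n2:\n2059652,4,2005-09-05\n"
)

# In-memory buffer; swap for open("ratings.csv", "w", newline="") to write a file.
out = io.StringIO()
writer = csv.writer(out)
writer.writerow(["customer_id", "rating", "date", "movie_id"])
writer.writerows(parse_ratings(sample))
csv_text = out.getvalue()
```

Keeping the parsing in a generator means the rows are written one at a time, so a large input file does not have to be held in memory.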
2 answers
hgc7kmma1#
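First, create a dictionary whose keys are these headers. Then read each line of the text file with a call such as readline([n]), and split the line on its delimiter characters, such as commas or spaces. Store the resulting values under the matching keys of the dictionary. You can then use Python's pandas library to turn the dictionary into a DataFrame and easily write it out as a CSV file. See the pandas documentation for details.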
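The steps described above could be sketched as follows. The column names and the sample lines are assumptions made for illustration, and the loop iterates over the lines directly instead of calling readline repeatedly, which reads the same data.

```python
import io

import pandas as pd

# Dictionary keyed by column header; each list collects one column.
columns = {"customer_id": [], "rating": [], "date": []}

# Hypothetical comma-separated input standing in for the text file.
source = io.StringIO("1488844,3,2005-09-06\n822109,5,2005-05-13\n")

for line in source:
    parts = line.strip().split(",")
    if len(parts) != len(columns):
        continue  # skip lines that do not have one value per header
    for key, value in zip(columns, parts):
        columns[key].append(value)

df = pd.DataFrame(columns)

# With no path, to_csv returns the CSV text; pass a file name to write to disk.
csv_text = df.to_csv(index=False)
```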
ctzwtxfj2#