Reading rpt files with pandas

enxuqcxy posted on 2023-03-11 in Other


I am reading report data with pandas like this:

import pandas as pd
df = pd.read_fwf("2014-1.rpt", skiprows=[1], nrows=150)

I followed the answer here, but for some columns the separation is inaccurate. Here is a sample of what I get:

Country   Order Date Device   Category
UK        2014-01-03 Desktop  Shoes
IT        2014-01-03 Desktop  Shoes
FR        2014-01-04 Desktop  Dress
FR        2014-01-04 Tablet   Dress
US        2014-01-05 Desktop  Bags
US        2014-01-06 Desktop  Bags
UK        2014-01-07 Tablet   Dress

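For example, it reads the Order Date and Device columns as a single column. This is only one example; there are many columns like this. How can I fix it? Any ideas? The problematic columns may in fact be fixed-width.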

pandas

Source: https://stackoverflow.com/questions/47685206/reading-rpt-files-with-pandas

2 answers


5lhxktic1#

This question is old, but here is an answer: you can read the file as a CSV with pandas. I have used this on a variety of rpt files and it worked.

import pandas as pd
df = pd.read_csv("2014-1.rpt", skiprows=[1], nrows=150)
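Whether `read_csv` works depends on how the rpt file was exported. The following is only a minimal sketch, assuming the export is comma-delimited and has a row of dashes as its second line. The in-memory sample is made up for illustration and reuses two rows from the question.

```python
import io

import pandas as pd

# Hypothetical comma-delimited export: header, dashes line, then data rows.
sample = (
    "Country,Order Date,Device,Category\n"
    "-------,----------,------,--------\n"
    "UK,2014-01-03,Desktop,Shoes\n"
    "IT,2014-01-03,Desktop,Shoes\n"
)

# skiprows=[1] drops the dashes line (row index 1, counting from 0).
df = pd.read_csv(io.StringIO(sample), skiprows=[1])
print(df.columns.tolist())
print(df)
```

If the file is padded with spaces rather than delimited, the default comma separator will not split the columns, and a fixed-width reader is likely the better fit.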

2023-03-11

yptwkmov2#

To read a SQL Server rpt data export file, you can do the following:

import pandas as pd

myfile = 'myfile.rpt'  # <--- edit this

# get the column specification from the second line (the row of dashes)
with open(myfile, encoding='utf8') as f:
    next(f)  # skip the header line
    cols = next(f).rstrip()  # dashes line without the trailing newline

# build colspecs list
colspecs = []
idx = 0
for c in cols.split(' '):
    n = len(c)
    colspecs.append((idx, idx + n))
    idx += 1 + n

df = pd.read_fwf(myfile, colspecs=colspecs, encoding='utf8', skiprows=[1])
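The same idea can be tried without a file on disk. The sketch below is self-contained and runs on an in-memory sample; the helper name `colspecs_from_dashes`, the column widths, and the sample rows are assumptions made for illustration, not part of the original export.

```python
import io

import pandas as pd


def colspecs_from_dashes(dash_line):
    """Turn a line like '--- ----- --' into [(0, 3), (4, 9), (10, 12)]."""
    colspecs = []
    idx = 0
    for group in dash_line.rstrip().split(' '):
        colspecs.append((idx, idx + len(group)))
        idx += len(group) + 1  # +1 for the single space between columns
    return colspecs


# Hypothetical widths; each cell is padded to the width of its dash group.
widths = [9, 10, 8, 8]
rows = [
    ["Country", "Order Date", "Device", "Category"],
    ["-" * w for w in widths],
    ["UK", "2014-01-03", "Desktop", "Shoes"],
    ["FR", "2014-01-04", "Tablet", "Dress"],
]
sample = "\n".join(
    " ".join(cell.ljust(w) for cell, w in zip(row, widths)) for row in rows
) + "\n"

# The second line (index 1) is the row of dashes that defines the columns.
colspecs = colspecs_from_dashes(sample.splitlines()[1])
df = pd.read_fwf(io.StringIO(sample), colspecs=colspecs, skiprows=[1])
print(colspecs)
print(df)
```

For this sample the computed `colspecs` are `[(0, 9), (10, 20), (21, 29), (30, 38)]`, so Order Date and Device end up in separate columns. The approach assumes the dash groups are separated by exactly one space; a dashes line with wider gaps would need a different split.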

2023-03-11
