numpy: read file.csv (two columns, x and y), then compute the cumulative moving average of the second column

Posted by u91tlkcl on 2022-11-10 in Other

First, here is my CSV file: https://github.com/hamzaal014/file/blob/main/file.csv
The .csv file contains two columns, x and y. Here is my script:

import numpy as np
from pandas import DataFrame as df
import csv

origin_data = open("file.csv", "r")
dato = list(csv.reader(origin_data, delimiter=","))
print(dato)

rowcount  = 0

# iterating through the whole file

for row in dato:
  rowcount+= 1

# printing the result

# _ print("Number of lines present:-", rowcount)

print(rowcount)

dati = df(dato, columns=['x', 'y'])

window = 6
roll_avg = dati.rolling(window).mean()

roll_avg_cumulative = dati['y'].cumsum()/np.arange(1, 25)
print(roll_avg_cumulative)

But my script does not work.
Error:

Traceback (most recent call last):
  File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/ops/array_ops.py", line 163, in _na_arithmetic_op
    result = func(left, right)
  File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 239, in evaluate
    return _evaluate(op, op_str, a, b)  # type: ignore[misc]
  File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 128, in _evaluate_numexpr
    result = _evaluate_standard(op, op_str, a, b)
  File "/home/haz/miniconda39/lib/python3.9/site-packages/pandas/core/computation/expressions.py", line 69, in _evaluate_standard
    return op(a, b)
TypeError: unsupported operand type(s) for /: 'str' and 'int'

dxxyhpgq1#

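csv.reader returns every value as a string. That is the root of the problem: the strings are never converted to numbers, so the division fails. You can fix it like this: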

dati = df(dato, columns=['x', 'y'], dtype=float)
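
To make the cause concrete, here is a minimal sketch using a small in-memory CSV. The sample values are made up for illustration and are not taken from file.csv. It shows that csv.reader yields strings, and that converting the DataFrame to float (here with astype, as one possible way) gives numeric columns.

```python
import csv
import io

import pandas as pd

# Hypothetical sample data standing in for file.csv (values are made up).
sample = io.StringIO("1,10\n2,20\n3,30\n")

rows = list(csv.reader(sample, delimiter=","))
print(type(rows[0][1]))  # csv.reader always yields strings

# Built without a dtype, the columns still hold the text from the file.
as_text = pd.DataFrame(rows, columns=["x", "y"])

# Converting to float makes arithmetic such as cumsum() / n possible.
as_numbers = as_text.astype(float)
print(as_numbers.dtypes)
```

Dividing as_text["y"].cumsum() by a number would fail the same way as in the question, while the same expression on as_numbers works.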

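If it helps, I would also like to point out a few things that could improve your code:

  • You are already using pandas as the data container, so I suggest letting pandas read the CSV file into a DataFrame (with pandas.read_csv) instead of doing it by hand
  • The number of rows can be obtained with the built-in len function, without iterating over all the rows
  • Stick to the widely used import alias (import pandas as pd) rather than inventing your own. This makes your code more readable for others

So your code could become: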

import numpy as np
import pandas as pd

dati = pd.read_csv("file.csv", sep=",", dtype=float, names=["x", "y"])
rowcount = len(dati)

window = 6
roll_avg = dati.rolling(window).mean()

# divide the running sum by the running count 1..rowcount instead of a hard-coded 24
roll_avg_cumulative = dati["y"].cumsum() / np.arange(1, rowcount + 1)
print(roll_avg_cumulative)
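
As a possible alternative to dividing the running sum by an explicit counter, pandas offers an expanding window whose mean is the cumulative average. The sketch below compares the two on a made-up Series (hypothetical values, not the contents of file.csv).

```python
import numpy as np
import pandas as pd

# Hypothetical y values standing in for the second column of file.csv.
y = pd.Series([2.0, 4.0, 6.0, 8.0])

# Cumulative average as running sum divided by running count.
manual = y.cumsum() / np.arange(1, len(y) + 1)

# The same quantity using an expanding window.
expanding = y.expanding().mean()

print(expanding.tolist())  # [2.0, 3.0, 4.0, 5.0]
```

The expanding form does not need the number of rows, so it keeps working if the file grows.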

brtdzjyr2#

What is wrong with your code:

  • All values are loaded as str

Simple approach:
import numpy as np
import pandas as pd
import csv

dati = pd.read_csv('file.csv', header=None)

window = 6
roll_avg = dati.rolling(window).mean()
print(dati[1].cumsum())

roll_avg_cumulative = dati[1].cumsum()/np.arange(1, len(dati) + 1)
print(roll_avg_cumulative)
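
Since the title asks about numpy, here is a sketch of the same calculation without pandas. It assumes the file has no header row and contains only numbers. The in-memory sample is hypothetical; passing "file.csv" to np.loadtxt instead should work the same way under that assumption.

```python
import io

import numpy as np

# Hypothetical stand-in for file.csv (values are made up).
source = io.StringIO("1,10\n2,20\n3,30\n4,40\n")

# loadtxt parses the values as float, so no string conversion is needed.
data = np.loadtxt(source, delimiter=",")

# Second column, then running sum divided by running count.
y = data[:, 1]
cumulative_avg = np.cumsum(y) / np.arange(1, len(y) + 1)
print(cumulative_avg)
```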
