Python Pandas带datetime对象的框架查询无法执行[重复]

jchrr9hc  于 11个月前  发布在  Python
关注(0)|答案(1)|浏览(116)

此问题在此处已有答案

Use variable in Pandas query(2个答案)
22天前关闭
我有一张中等大小的tableExcel中的数据(近1200行),其中字段包含我希望用Python分析的 str 值或 datetime.date 值。我将Excel数据读入Pandas嵌套框架(df),然后使用嵌套框架查询(df.query(query string))解析出我感兴趣的特定数据。下面提供了数据表的示例,显示与查询最相关的4/9列。

BF  Ship            …   Commissioned    Decommissioned
X   Able            …   22-Jul-1992     30-Sep-2030
X   Abraham Lincoln …   11-Nov-1989  
X   Abraham Lincoln …   11-Mar-1961     28-Feb-1981
X   Acadia          …   6-Jun-1981      16-Dec-1994
X   Adroit          …   4-Mar-1957      12-Dec-1991
X   Adventurous     …   19-Aug-1988      5-Jun-1992
X   Affray          …   8-Dec-1958      20-Dec-1992
X   Ainsworth       …   31-Mar-1973     27-May-1994
X   Ajax            …   30-Sep-1943     31-Dec-1986
X   Alabama         …   25-May-1985     30-Sep-2028
X   Alamo           …   24-Aug-1956     28-Sep-1990
X   Alan Shepard    …   26-Jun-2007  
X   Alaska          …   25-Jan-1986      1-Jul-2025
X   Albany          …   7-Apr-1990      30-Sep-2028
X   Albert David    …   19-Oct-1968     28-Sep-1988
X   Albuquerque     …   21-May-1983     16-Sep-2015
X   Alexandria      …   29-Jun-1991     30-Sep-2026
X   Algol           …   1-Oct-1981       1-Oct-2007
X   Altair          …   5-Feb-1982       1-Oct-2007
X   Amelia Earhart  …   30-Oct-2008  
X   America         …   11-Oct-2014

字符串
代码正确地将电子表格数据读入Pandas对象,每个列的数据类型都正确(字符串或日期时间)。我定义了一个日期时间变量now,它存储当前日期,并使用它来查询Battle Force(BF)中的活动船只数量。该查询工作得很好:

qrystr = "BF == 'X' "\
         "and (Commissioned != 'NaT' and Commissioned <= 'now')"\
         "and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
df3 = df.query(qrystr)
print("\nNumber of Battle Force ships is: ", len(df3))


随后开发了一个查询,只列出舰队中当前活动的巡洋舰,如下所示。该查询也正确运行。

qrystr = "Type == 'Cruiser' "\
         "and (Commissioned != 'NaT' and Commissioned <= 'now')"\
         "and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
dfCruisers = df3.query(qrystr)
print ("Currently active cruisers are:\n", dfCruisers)


这就导致了我的嵌套框查询算法失败的部分。我构建了一个循环来计算从1980年到2030年舰队中活跃巡洋舰的数量。我使用循环控制变量来表示每一年,然后构建一个嵌套框查询来过滤完整的船舶列表,得到一个新的嵌套框列表,其中只包含在循环控制年度内仍在服役的船舶。我使用len()函数来计算这个数字,然后将结果追加到lstCruisers中。不幸的是,当它试图执行查询时,我得到了一个意外的错误。这里是源代码和结果的摘要。有人能帮助我理解为什么datetime变量在最后一个df.query中没有被正确处理吗?

import datetime as dt
import pandas as pd

now = dt.date.today()
print(now, type(now))

df = pd.read_excel("Ships.xlsx")
print("\n", df.dtypes)

print("df:\n", df)

# Calculate number of current Battle Force ships
qrystr = "BF == 'X' "\
         "and (Commissioned != 'NaT' and Commissioned <= 'now')"\
         "and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
df3 = df.query(qrystr)
print("\nNumber of Battle Force ships is: ", len(df3))

qrystr = "Type == 'Cruiser' "\
         "and (Commissioned != 'NaT' and Commissioned <= 'now')"\
         "and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
dfCruisers = df3.query(qrystr)
print ("Currently active cruisers are:\n", dfCruisers)

lstCruisers = []
yearStart = 1980
yearStop = 1985
for yr in range(yearStart, yearStop+1):
    stopDate = pd.Timestamp(yr, 12, 31).date()
    print(stopDate, type(stopDate))

    qrystr = "Type == 'Cruiser' " \
             " and (Commissioned != 'NaT' and Commissioned <= 'stopDate')" \
             " and (Decommissioned >= 'stopDate' or Decommissioned == 'NaT') "
    dfCruisers = df.query(qrystr)
    print(dfCruisers)

    nrShips = len(dfCruisers)
    lstCruisers.append([yr, nrShips])

print(lstCruisers)
/usr/bin/python3/PycharmProjects/pythonProject3/Ships.py 

2023-12-12 <class 'datetime.date'>

BF                        object
Ship                      object
Trigraph                  object
Hull Number               object
HullType                  object
Type                      object
Cmd Level                 object
Commissioned      datetime64[ns]
Decommissioned    datetime64[ns]

dtype: object
df:
        BF       Ship Trigraph  ...       Cmd Level Commissioned Decommissioned
0       X  (Unnamed)      NaN  ...        CAPT-SWO   2029-07-01            NaT
1       X  (Unnamed)      NaN  ...        CAPT-SWO   2030-07-01            NaT
2       X  (Unnamed)      NaN  ...  Sequential-Air   2036-12-31            NaT
3       X  (Unnamed)      NaN  ...  Sequential-Air   2040-12-31            NaT
4       X  (Unnamed)      NaN  ...        CAPT-SWO   2029-07-01            NaT
...   ...        ...      ...  ...             ...          ...            ...
1152    X      Yuma       YMA  ...             NaN   2017-04-21            NaT
1153  NaN     Zephyr      ZPR  ...        LCDR-SWO   2011-09-30     2021-02-17
1154  NaN     Zephyr      ZPR  ...        LCDR-SWO   1994-10-15     2004-10-01
1155    X      Zeus       ZEU  ...             NaN          NaT            NaT
1156    X    Zumwalt      ZUM  ...         CDR-SWO   2016-10-15            NaT

[1157 rows x 9 columns]

Number of Battle Force ships is:  290
Currently active cruisers are:
      BF             Ship Trigraph  ... Cmd Level Commissioned Decommissioned
80    X         Antietam      ANT  ...  CAPT-SWO   1987-06-06     2024-09-30
196   X  Cape St. George      CSG  ...  CAPT-SWO   1993-06-12     2027-09-30
231   X           Chosin      CHO  ...  CAPT-SWO   1991-01-12     2027-09-30
275   X          Cowpens      COW  ...  CAPT-SWO   1991-03-09     2026-09-30
407   X       Gettysburg      GET  ...  CAPT-SWO   1991-06-22     2026-09-30
605   X        Lake Erie      LKE  ...  CAPT-SWO   1993-05-10     2025-09-30
623   X       Leyte Gulf      LTG  ...  CAPT-SWO   1987-09-26     2024-09-30
746   X         Normandy      NOR  ...  CAPT-SWO   1989-12-09     2025-09-30
806   X   Philippine Sea      PSE  ...  CAPT-SWO   1989-03-18     2025-09-30
841   X        Princeton      PRN  ...  CAPT-SWO   1989-02-11     2026-09-30
893   X    Robert Smalls      NaN  ...  CAPT-SWO   2023-03-04     2026-07-01
966   X           Shiloh      SHI  ...  CAPT-SWO   1992-07-18     2024-09-30
1095  X        Vicksburg      VIK  ...  CAPT-SWO   1992-11-14     2023-12-31

[13 rows x 9 columns]
1980-12-31 <class 'datetime.date'>
Traceback (most recent call last):
  File "pandas/_libs/tslibs/conversion.pyx", line 530, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
  File "pandas/_libs/tslibs/parsing.pyx", line 318, in pandas._libs.tslibs.parsing.parse_datetime_string
  File "/Library/Python/3.8/site-packages/dateutil/parser/_parser.py", line 1368, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/Library/Python/3.8/site-packages/dateutil/parser/_parser.py", line 643, in parse
    raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: stopDate

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/PycharmProjects/pythonProject3/Ships.py", line 55, in <module>
    dfCruisers = df.query(qrystr)
  File "/Library/Python/3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/frame.py", line 4474, in query
    res = self.eval(expr, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
    return func(*args, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/frame.py", line 4612, in eval
    return _eval(expr, inplace=inplace, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/eval.py", line 353, in eval
    parsed_expr = Expr(expr, engine=engine, parser=parser, env=env)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 813, in __init__
    self.terms = self.parse()
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 832, in parse
    return self._visitor.visit(self.expr)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 421, in visit_Module
    return self.visit(expr, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 424, in visit_Expr
    return self.visit(node.value, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 750, in visit_BoolOp
    return reduce(visitor, operands)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 744, in visitor
    rhs = self._try_visit_binop(y)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 739, in _try_visit_binop
    return self.visit(bop)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 750, in visit_BoolOp
    return reduce(visitor, operands)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 744, in visitor
    rhs = self._try_visit_binop(y)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 739, in _try_visit_binop
    return self.visit(bop)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 723, in visit_Compare
    return self.visit(binop)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
    return visitor(node, **kwargs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 538, in visit_BinOp
    return self._maybe_evaluate_binop(op, op_class, left, right)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 505, in _maybe_evaluate_binop
    res = op(lhs, rhs)
  File "/Library/Python/3.8/site-packages/pandas/core/computation/ops.py", line 379, in __init__
    self.convert_values()
  File "/Library/Python/3.8/site-packages/pandas/core/computation/ops.py", line 477, in convert_values
    v = Timestamp(ensure_decoded(v))
  File "pandas/_libs/tslibs/timestamps.pyx", line 1698, in pandas._libs.tslibs.timestamps.Timestamp.__new__
  File "pandas/_libs/tslibs/conversion.pyx", line 249, in pandas._libs.tslibs.conversion.convert_to_tsobject
  File "pandas/_libs/tslibs/conversion.pyx", line 533, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
ValueError: could not convert string to Timestamp

Process finished with exit code 1
bzzcjhmw

bzzcjhmw1#

查询失败,因为您没有正确引用stopDate变量,您可以使用@

qrystr = "Type == 'Cruiser' " \
         " and (Commissioned != 'NaT' and Commissioned <= @stopDate)" \
         " and (Decommissioned >= @stopDate or Decommissioned == 'NaT') "

字符串

相关问题