此问题在此处已有答案:
Use variable in Pandas query(2个答案)
22天前关闭
我有一张中等大小的tableExcel中的数据(近1200行),其中字段包含我希望用Python分析的 str 值或 datetime.date 值。我将Excel数据读入Pandas嵌套框架(df),然后使用嵌套框架查询(df.query(query string))解析出我感兴趣的特定数据。下面提供了数据表的示例,显示与查询最相关的4/9列。
BF Ship … Commissioned Decommissioned
X Able … 22-Jul-1992 30-Sep-2030
X Abraham Lincoln … 11-Nov-1989
X Abraham Lincoln … 11-Mar-1961 28-Feb-1981
X Acadia … 6-Jun-1981 16-Dec-1994
X Adroit … 4-Mar-1957 12-Dec-1991
X Adventurous … 19-Aug-1988 5-Jun-1992
X Affray … 8-Dec-1958 20-Dec-1992
X Ainsworth … 31-Mar-1973 27-May-1994
X Ajax … 30-Sep-1943 31-Dec-1986
X Alabama … 25-May-1985 30-Sep-2028
X Alamo … 24-Aug-1956 28-Sep-1990
X Alan Shepard … 26-Jun-2007
X Alaska … 25-Jan-1986 1-Jul-2025
X Albany … 7-Apr-1990 30-Sep-2028
X Albert David … 19-Oct-1968 28-Sep-1988
X Albuquerque … 21-May-1983 16-Sep-2015
X Alexandria … 29-Jun-1991 30-Sep-2026
X Algol … 1-Oct-1981 1-Oct-2007
X Altair … 5-Feb-1982 1-Oct-2007
X Amelia Earhart … 30-Oct-2008
X America … 11-Oct-2014
字符串
代码正确地将电子表格数据读入Pandas对象,每个列的数据类型都正确(字符串或日期时间)。我定义了一个日期时间变量now
,它存储当前日期,并使用它来查询Battle Force(BF)中的活动船只数量。该查询工作得很好:
qrystr = "BF == 'X' "\
"and (Commissioned != 'NaT' and Commissioned <= 'now')"\
"and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
df3 = df.query(qrystr)
print("\nNumber of Battle Force ships is: ", len(df3))
型
随后开发了一个查询,只列出舰队中当前活动的巡洋舰,如下所示。该查询也正确运行。
qrystr = "Type == 'Cruiser' "\
"and (Commissioned != 'NaT' and Commissioned <= 'now')"\
"and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
dfCruisers = df3.query(qrystr)
print ("Currently active cruisers are:\n", dfCruisers)
型
这就导致了我的嵌套框查询算法失败的部分。我构建了一个循环来计算从1980年到2030年舰队中活跃巡洋舰的数量。我使用循环控制变量来表示每一年,然后构建一个嵌套框查询来过滤完整的船舶列表,得到一个新的嵌套框列表,其中只包含在循环控制年度内仍在服役的船舶。我使用len()
函数来计算这个数字,然后将结果追加到lstCruisers
中。不幸的是,当它试图执行查询时,我得到了一个意外的错误。这里是源代码和结果的摘要。有人能帮助我理解为什么datetime变量在最后一个df.query中没有被正确处理吗?
import datetime as dt
import pandas as pd
now = dt.date.today()
print(now, type(now))
df = pd.read_excel("Ships.xlsx")
print("\n", df.dtypes)
print("df:\n", df)
# Calculate number of current Battle Force ships
qrystr = "BF == 'X' "\
"and (Commissioned != 'NaT' and Commissioned <= 'now')"\
"and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
df3 = df.query(qrystr)
print("\nNumber of Battle Force ships is: ", len(df3))
qrystr = "Type == 'Cruiser' "\
"and (Commissioned != 'NaT' and Commissioned <= 'now')"\
"and (Decommissioned >= 'now' or Decommissioned == 'NaT') "
dfCruisers = df3.query(qrystr)
print ("Currently active cruisers are:\n", dfCruisers)
lstCruisers = []
yearStart = 1980
yearStop = 1985
for yr in range(yearStart, yearStop+1):
stopDate = pd.Timestamp(yr, 12, 31).date()
print(stopDate, type(stopDate))
qrystr = "Type == 'Cruiser' " \
" and (Commissioned != 'NaT' and Commissioned <= 'stopDate')" \
" and (Decommissioned >= 'stopDate' or Decommissioned == 'NaT') "
dfCruisers = df.query(qrystr)
print(dfCruisers)
nrShips = len(dfCruisers)
lstCruisers.append([yr, nrShips])
print(lstCruisers)
/usr/bin/python3/PycharmProjects/pythonProject3/Ships.py
2023-12-12 <class 'datetime.date'>
BF object
Ship object
Trigraph object
Hull Number object
HullType object
Type object
Cmd Level object
Commissioned datetime64[ns]
Decommissioned datetime64[ns]
dtype: object
df:
BF Ship Trigraph ... Cmd Level Commissioned Decommissioned
0 X (Unnamed) NaN ... CAPT-SWO 2029-07-01 NaT
1 X (Unnamed) NaN ... CAPT-SWO 2030-07-01 NaT
2 X (Unnamed) NaN ... Sequential-Air 2036-12-31 NaT
3 X (Unnamed) NaN ... Sequential-Air 2040-12-31 NaT
4 X (Unnamed) NaN ... CAPT-SWO 2029-07-01 NaT
... ... ... ... ... ... ... ...
1152 X Yuma YMA ... NaN 2017-04-21 NaT
1153 NaN Zephyr ZPR ... LCDR-SWO 2011-09-30 2021-02-17
1154 NaN Zephyr ZPR ... LCDR-SWO 1994-10-15 2004-10-01
1155 X Zeus ZEU ... NaN NaT NaT
1156 X Zumwalt ZUM ... CDR-SWO 2016-10-15 NaT
[1157 rows x 9 columns]
Number of Battle Force ships is: 290
Currently active cruisers are:
BF Ship Trigraph ... Cmd Level Commissioned Decommissioned
80 X Antietam ANT ... CAPT-SWO 1987-06-06 2024-09-30
196 X Cape St. George CSG ... CAPT-SWO 1993-06-12 2027-09-30
231 X Chosin CHO ... CAPT-SWO 1991-01-12 2027-09-30
275 X Cowpens COW ... CAPT-SWO 1991-03-09 2026-09-30
407 X Gettysburg GET ... CAPT-SWO 1991-06-22 2026-09-30
605 X Lake Erie LKE ... CAPT-SWO 1993-05-10 2025-09-30
623 X Leyte Gulf LTG ... CAPT-SWO 1987-09-26 2024-09-30
746 X Normandy NOR ... CAPT-SWO 1989-12-09 2025-09-30
806 X Philippine Sea PSE ... CAPT-SWO 1989-03-18 2025-09-30
841 X Princeton PRN ... CAPT-SWO 1989-02-11 2026-09-30
893 X Robert Smalls NaN ... CAPT-SWO 2023-03-04 2026-07-01
966 X Shiloh SHI ... CAPT-SWO 1992-07-18 2024-09-30
1095 X Vicksburg VIK ... CAPT-SWO 1992-11-14 2023-12-31
[13 rows x 9 columns]
1980-12-31 <class 'datetime.date'>
Traceback (most recent call last):
File "pandas/_libs/tslibs/conversion.pyx", line 530, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
File "pandas/_libs/tslibs/parsing.pyx", line 318, in pandas._libs.tslibs.parsing.parse_datetime_string
File "/Library/Python/3.8/site-packages/dateutil/parser/_parser.py", line 1368, in parse
return DEFAULTPARSER.parse(timestr, **kwargs)
File "/Library/Python/3.8/site-packages/dateutil/parser/_parser.py", line 643, in parse
raise ParserError("Unknown string format: %s", timestr)
dateutil.parser._parser.ParserError: Unknown string format: stopDate
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/PycharmProjects/pythonProject3/Ships.py", line 55, in <module>
dfCruisers = df.query(qrystr)
File "/Library/Python/3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/frame.py", line 4474, in query
res = self.eval(expr, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/util/_decorators.py", line 331, in wrapper
return func(*args, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/frame.py", line 4612, in eval
return _eval(expr, inplace=inplace, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/eval.py", line 353, in eval
parsed_expr = Expr(expr, engine=engine, parser=parser, env=env)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 813, in __init__
self.terms = self.parse()
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 832, in parse
return self._visitor.visit(self.expr)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 421, in visit_Module
return self.visit(expr, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 424, in visit_Expr
return self.visit(node.value, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 750, in visit_BoolOp
return reduce(visitor, operands)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 744, in visitor
rhs = self._try_visit_binop(y)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 739, in _try_visit_binop
return self.visit(bop)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 750, in visit_BoolOp
return reduce(visitor, operands)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 744, in visitor
rhs = self._try_visit_binop(y)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 739, in _try_visit_binop
return self.visit(bop)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 723, in visit_Compare
return self.visit(binop)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 415, in visit
return visitor(node, **kwargs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 538, in visit_BinOp
return self._maybe_evaluate_binop(op, op_class, left, right)
File "/Library/Python/3.8/site-packages/pandas/core/computation/expr.py", line 505, in _maybe_evaluate_binop
res = op(lhs, rhs)
File "/Library/Python/3.8/site-packages/pandas/core/computation/ops.py", line 379, in __init__
self.convert_values()
File "/Library/Python/3.8/site-packages/pandas/core/computation/ops.py", line 477, in convert_values
v = Timestamp(ensure_decoded(v))
File "pandas/_libs/tslibs/timestamps.pyx", line 1698, in pandas._libs.tslibs.timestamps.Timestamp.__new__
File "pandas/_libs/tslibs/conversion.pyx", line 249, in pandas._libs.tslibs.conversion.convert_to_tsobject
File "pandas/_libs/tslibs/conversion.pyx", line 533, in pandas._libs.tslibs.conversion._convert_str_to_tsobject
ValueError: could not convert string to Timestamp
Process finished with exit code 1
1条答案
按热度按时间bzzcjhmw1#
查询失败,因为您没有正确引用
stopDate
变量,您可以使用@
:字符串