pandas 将时间帧掩码应用于 Dataframe 时出错

dy1byipe  于 2023-02-17  发布在  其他
关注(0)|答案(1)|浏览(121)
    • 简介**

我正在编写一段python代码,它使用pcycopg2、panda和datetime从PGAdmin数据库检索信息,PGAdmin数据库将数据过滤到最近14天的活动。
我要从数据库中提取的列是:id(唯一编号)、创建人(姓名)、创建日期
我提取所有的信息,因为我认为我可以将其传输到 Dataframe 中,然后使用掩码获得我想要的位。

    • 代码**
import psycopg2
import pandas as pd
import datetime

Today = datetime.datetime.now() #gets todays date
TimeFrameInDays = datetime.timedelta(days = 14) #gets int amount of days
TwoWeeksAgo = Today - TimeFrameInDays #calculates date two weeks ago

Query = 'SELECT createdby, created, id FROM "02 Planning"."fibre"'

print("Robo-Cop Initialised") #DEV

try:
    connection = psycopg2.connect(database = DB_Name,
                            user = DB_User,
                            password = DB_Pass,
                            host = DB_Host,
                            port = DB_Port)

    print("Database connected")

except (Exception, psycopg2.Error) as error: 
    #if error occurs, message is returned
    print("Error Occured Trying to Connect")

finally:
    cursor = connection.cursor() #makes refrencing the cursor easier
    cursor.execute(Query) #executes query
    Data = cursor.fetchall() #saves results
    
    DataFrame = pd.DataFrame(Data) #assembles into dataframe
    DataFrame.rename(columns={0:'Created By', 1:'Created On', 2:'Database Id'}) #renames columns
    #print(DataFrame)
    mask = (DataFrame['Created On'] > TwoWeeksAgo.date) & (DataFrame['Created On'] <= Today.date)
    DataFrame = DataFrame.loc[mask] 
    print(DataFrame)
    • 错误**
Traceback (most recent call last):
  File "c:\Users\\Documents\GitHub\\main.py", line 42, in <module>
    mask = (DataFrame['Created On'] > TwoWeeksAgo.date) & (DataFrame['Created On'] <= Today.date)
  File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\frame.py", line 3805, in __getitem__   
    indexer = self.columns.get_loc(key)
  File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexes\range.py", line 395, in get_loc
    raise KeyError(key)
KeyError: 'Created On'
    • 为什么我的问题不重复**

我的问题与重命名列无关,我并不是要编辑如何重命名列,而是要对一个 Dataframe 应用一个掩码,它只显示最近14天的日期。

    • 新错误**

有了这段代码
DataFrame = pd.DataFrame(Data, columns=['Created By','Created On','Database Id'])
我得到

File "c:\Users\\Documents\GitHub\\main.py", line 42, in <module>
    mask = (DataFrame['Created On'].dt.date > TwoWeeksAgo.date) & (DataFrame['Created On'].dt.date <= Today.date)
  File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\generic.py", line 5902, in __getattr__
    return object.__getattribute__(self, name)
  File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\accessor.py", line 182, in __get__
    accessor_obj = self._accessor(obj)
  File "C:\Users\\AppData\Local\Packages\PythonSoftwareFoundation.Python.3.10_qbz5n2kfra8p0\LocalCache\local-packages\Python310\site-packages\pandas\core\indexes\accessors.py", line 512, in __new__
    raise AttributeError("Can only use .dt accessor with datetimelike values")
AttributeError: Can only use .dt accessor with datetimelike values. Did you mean: 'at'?
cngwdvgl

cngwdvgl1#

完整解决方案:

#pass columns names to DataFrame constructor
DataFrame = pd.DataFrame(Data, columns=['Created By','Created On','Database Id'])

Today = datetime.datetime.now() #gets todays date
TimeFrameInDays = datetime.timedelta(days = 14) #gets int amount of days
 #calculates date two weeks ago
TwoWeeksAgo = Today - TimeFrameInDays

#convert column to datetimes
DataFrame['Created On'] = pd.to_datetime(DataFrame['Created On'])

#compare dates
mask = (DataFrame['Created On'].dt.date > TwoWeeksAgo.date()) & 
        (DataFrame['Created On'].dt.date <= Today.date())
DataFrame1 = DataFrame.loc[mask] 
print(DataFrame1)

或者使用Timestamp.normalize表示不带时间的日期时间(时间为00:00:00),以便可以比较日期时间列:

Today = pd.Timestamp.now().normalize()
TwoWeeksAgo = Today - pd.Timedelta(days = 14)

mask = (DataFrame['Created On'] > TwoWeeksAgo) & (DataFrame['Created On'] <= Today)
DataFrame1 = DataFrame.loc[mask] 
print(DataFrame1)

或者:

Today = pd.Timestamp.now().normalize()
TwoWeeksAgo = Today - pd.Timedelta(days = 14)

mask = DataFrame['Created On'].between(TwoWeeksAgo, Today, inclusive='right')
DataFrame1 = DataFrame.loc[mask]

相关问题