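My Python/Pandas code runs fine on my macOS machine, but now that I have moved it to Windows it no longer works because of type differences, and I get an error when trying to write to gbq (Google BigQuery):
The code looks like this: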
def formatNumber(x):
    if math.isnan(x):
        f_number = 0.0
    else:
        f_number = str(round(x, 8))
    return f_number
... <reading df from file> ...
print("A")
print(df.info())
df['Date'] = [x.date().strftime("%Y-%m-%d") for x in df['Date']]
df['A'] = [formatNumber(x) for x in df['A']]
# drop duplicates
print(df.shape)
df = df.drop_duplicates()
print(df.shape)
# upload to bigquery
print("B")
print(df.info())
table_schema = [{
    'name': 'Date',
    'type': 'date'
}, {
    'name': 'A',
    'type': 'numeric'
}, {
    'name': 'B',
    'type': 'string'
}]
df.to_gbq('tablename',
          'dbname',
          chunksize=None,
          if_exists='replace',
          table_schema=table_schema,
          credentials=credentials
          )
Output:
A
<class 'pandas.core.frame.DataFrame'>
Int64Index: 82624 entries, 0 to 9
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 82624 non-null datetime64[ns]
1 A 82624 non-null float64
2 B 80769 non-null object
...
dtypes: datetime64[ns](1), float64(6), object(6)
memory usage: 8.8+ MB
None
(82624, 13)
(82624, 13)
[5 rows x 13 columns]
B
<class 'pandas.core.frame.DataFrame'>
Int64Index: 82624 entries, 0 to 9
Data columns (total 13 columns):
# Column Non-Null Count Dtype
--- ------ -------------- -----
0 Date 82624 non-null datetime64[ns]
1 A 82624 non-null object
2 B 80769 non-null object
...
dtypes: datetime64[ns](1), float64(6), object(6)
memory usage: 8.8+ MB
Error message:
File "pyarrow\array.pxi", line 1044, in pyarrow.lib.Array.from_pandas
File "pyarrow\array.pxi", line 316, in pyarrow.lib.array
File "pyarrow\array.pxi", line 83, in pyarrow.lib._ndarray_to_array
File "pyarrow\error.pxi", line 123, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Expected bytes, got a 'datetime.time' object
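Another difference I noticed between running it on macOS and on Windows is that the index changes on macOS, whereas on Windows nothing changes.
macOS:
- A --> Int64Index: 82624 entries, 0 to 1015
- B --> RangeIndex: 1016 entries, 0 to 1015
Windows:
- A and B --> Int64Index: 82624 entries, 0 to 9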
1 Answer
Try changing
to
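The error you are getting indicates a type mismatch: pyarrow expected bytes but found a datetime.time object. This may be caused by the strftime() method of datetime objects behaving differently on macOS and Windows.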
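To check which column actually holds datetime.time values, it may help to count the Python types stored in each object column before uploading. The following is only a minimal sketch under assumptions: the helper names summarize_value_types and time_to_text, the column "C", and the sample values are hypothetical and do not come from the question.

```python
import datetime
from collections import Counter


def summarize_value_types(columns):
    """Count the Python types found in each column.

    `columns` is any iterable of (name, values) pairs, such as
    dict.items() or pandas' DataFrame.items().
    """
    return {name: Counter(type(v).__name__ for v in values)
            for name, values in columns}


def time_to_text(value):
    """Render datetime.time values as 'HH:MM:SS'; pass anything else through."""
    if isinstance(value, datetime.time):
        return value.strftime("%H:%M:%S")
    return value


# Hypothetical sample standing in for the object columns of df.
sample = {
    "B": ["x", "y", None],
    "C": ["08:30:00", datetime.time(9, 15), "10:00:00"],
}

report = summarize_value_types(sample.items())
# Columns that contain at least one datetime.time value.
mixed = [name for name, counts in report.items() if "time" in counts]
# The same column with every time object converted to a string.
cleaned = [time_to_text(v) for v in sample["C"]]
```

With the real data, something like summarize_value_types(df.select_dtypes(include="object").items()) should give the same kind of report, and df[col] = df[col].map(time_to_text) could then be applied to any column it flags, assuming a plain string is what the target BigQuery column expects.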