I am querying data from BQ. I have to insert it to SQL Server 2008.
pseudocode:
# get results from bigquery
client = bigquery.Client()
result = client.query(sql).result()
# create parametrized insert sql statement
sql = 'INSERT INTO [SANDBOX].[dbo].[table_stg] VALUES ((?),(?),(?),(?),(?),(?),(?),(?),(?),(?))'
data = [(2016, 'STRING', 'STRING', 'STRING', 'STRING', 'STRING', INT, 'STRING', '09/28/2015', '09/25/2016'),
(2016, 'STRING', 'STRING', 'STRING', 'STRING', 'STRING', INT, 'STRING', '09/28/2015', '09/25/2016')]
# target table schema
[varchar](50) NULL,
[varchar](50) NULL,
[varchar](50) NULL,
[varchar](50) NULL,
[varchar](50) NULL,
[varchar](50) NOT NULL,
[int] NULL,
[varchar](50) NULL,
[datetime] NULL,
[datetime] NULL
# insert into sql server with fast_executemany
cursor.fast_executemany = True
cursor.executemany(sql, data)
conn.commit()
cursor.close()
conn.close()
So when I run cursor.executemany(sql, data) with cursor.fast_executemany = True, I receive error:
Invalid character value for cast specification.
String data, right truncation: length 44 buffer 20, HY000
The code works perfectly without cursor.fast_executemany = True, yet it executes each insert with single commit meaning 8 hours to transfer 2 milion rows.
Thanks in advance for help. Also, if there is some filtering mechanism I could sue to split into fast_execute and normal execute I would be super gratefull.
EDIT: edited the question as per comments
1条答案
按热度按时间von4xj4u1#
The related question here illustrates how
fast_executemany = True
affects the way that empty string parameters are interpreted fordatetime
columns. Withfast_executemany = False
, empty strings are interpreted as1900-01-01 00:00:00
. Withfast_executemany = True
, empty string parameters cause the "Invalid character value for cast specification" error.The same appears to be true for
aa/bb/yyyy
string parameters. That format is potentially ambiguous (Is itmm/dd/yyyy
ordd/mm/yyyy
?) and SQL Server makes assumptions underfast_executemany = False
that it does not make underfast_executemany = True
.This does not throw an error for a column defined as
d1 [datetime]
but this fails
However, this works:
So, either reformat the strings to
yyyy/mm/dd
or parse them into adatetime
object