我正在使用scrapy和python从一个站点抓取数据,并将数据存储在csv文件中。然后我尝试从csv文件中获取值,并尝试将这些值存储在mysql数据库表中。insert语句既不会触发错误,也不会向数据库中插入任何数据。我检查了值来自csv的字段的数据类型。一切都是弦。存储在csv中的所有值都是字符串格式。这就是为什么在db中存储值时,除了string/varchar之外的所有数据类型都会出现问题。我现在该怎么办?除了varchar之外,我的数据库表中还有int(6)和timestamp数据类型的列。
导入csv导入重新导入pymysql导入sys
connection = pymysql.connect (host = "localhost", user = "root", passwd = ".....", db = "city_details")
cursor = connection.cursor ()
def insert_articles2(rows):
rowcount = 0
for row in rows:
if rowcount!= 0:
sql = "INSERT IGNORE INTO articles2 (country, event_name, md5, date_added, profile_image, banner, sDate, eDate, address_line1, address_line2, pincode, state, city, locality, full_address, latitude, longitude, start_time, end_time, description, website, fb_page, fb_event_page, event_hashtag, source_name, source_url, email_id_organizer, ticket_url) VALUES (%s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %d, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s, %s)"
cursor.execute = (sql, (row[0], row[1], row[2], row[3], row[4], row[5], row[6], row[7], row[8], row[9], row[10], row[11], row[12], row[13], row[14], row[15], row[16], row[17], row[18], row[19], row[20], row[21], row[22], row[23], row[24], row[25], row[26], row[27]))
rowcount+=1
rows = csv.reader(open("items.csv", "r"))
insert_articles2(rows)
connection.commit()
表articles2的表结构
CREATE TABLE IF NOT EXISTS `articles2` (
`id` int(6) NOT NULL AUTO_INCREMENT,
`country` varchar(45) NOT NULL,
`event_name` varchar(200) NOT NULL,
`md5` varchar(35) NOT NULL,
`date_added` timestamp NULL DEFAULT NULL,
`profile_image` varchar(350) NOT NULL,
`banner` varchar(350) NOT NULL,
`sDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`eDate` timestamp NOT NULL DEFAULT '0000-00-00 00:00:00',
`address_line1` mediumtext,
`address_line2` mediumtext,
`pincode` int(7) NOT NULL,
`state` varchar(30) NOT NULL,
`city` text NOT NULL,
`locality` varchar(50) NOT NULL,
`full_address` varchar(350) NOT NULL,
`latitude` varchar(15) NOT NULL,
`longitude` varchar(15) NOT NULL,
`start_time` time NOT NULL,
`end_time` time NOT NULL,
`description` longtext CHARACTER SET utf16 NOT NULL,
`website` varchar(50) DEFAULT NULL,
`fb_page` varchar(200) DEFAULT NULL,
`fb_event_page` varchar(200) DEFAULT NULL,
`event_hashtag` varchar(30) DEFAULT NULL,
`source_name` varchar(30) NOT NULL,
`source_url` varchar(350) NOT NULL,
`email_id_organizer` varchar(100) NOT NULL,
`ticket_url` mediumtext NOT NULL,
PRIMARY KEY (`id`),
KEY `full_address` (`full_address`),
KEY `full_address_2` (`full_address`),
KEY `id` (`id`),
KEY `event_name` (`event_name`),
KEY `sDate` (`sDate`),
KEY `eDate` (`eDate`),
KEY `id_2` (`id`),
KEY `country` (`country`),
KEY `event_name_2` (`event_name`),
KEY `sDate_2` (`sDate`),
KEY `eDate_2` (`eDate`),
KEY `state` (`state`),
KEY `locality` (`locality`),
KEY `start_time` (`start_time`),
KEY `start_time_2` (`start_time`),
KEY `end_time` (`end_time`),
KEY `id_3` (`id`),
KEY `id_4` (`id`),
KEY `event_name_3` (`event_name`),
KEY `md5` (`md5`),
KEY `sDate_3` (`sDate`),
KEY `eDate_3` (`eDate`),
KEY `latitude` (`latitude`),
KEY `longitude` (`longitude`),
KEY `start_time_3` (`start_time`),
KEY `end_time_2` (`end_time`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1 AUTO_INCREMENT=4182 ;
1条答案
按热度按时间rqcrx0a61#
不管这个与sql相关的错误是什么,这很可能取决于某些数据不匹配,我强烈建议避免导出到csv,而是添加粗糙的mysql管道,这将导出您的刮项目直接到一个mysql表,从那里你可以很容易地移动到其他表或处理它的日期。。。
如果你不能使用管道和/或你想要一些更可定制的东西,那么你可以在stackoverflow上看看这个答案,你会发现关于如何编写你自己的定制mysql管道的有用信息。。。