mariadb jdbc驱动程序不能有效地批处理更新vs sql server

kmynzznz  于 2021-06-20  发布在  Mysql
关注(0)|答案(1)|浏览(375)

我在我的应用程序中测试了insert/update/delete例程的性能,我正在将其从sqlserver移植到mariadb。
本地win10工作站上的java 1.8,i7 2.80ghz cpu+16gb ram
jdbc org.mariadb。jdbc:mariadb-java-client:2.2.4
10.2.12 aws上的mariadb-log mariadb服务器
基准测试触发50000个插入,同样的更新和删除。
sql server通过net.sourceforge.jtds jdbc驱动程序在1秒内处理所有这些问题。
带有mariadbjava客户机驱动程序的mariadb插入速度更快,但是更新(和删除)速度慢得多,只有3.5秒。
两个数据库中的模式是相同的,我假设由于mariadb中的插入速度很快,这可能排除了索引问题或服务器配置错误。
我尝试了jdbc连接字符串的多种变体,结果是最快的:

?verifyServerCertificate=true\
  &useSSL=true\
  &requireSSL=true\
  &allowMultiQueries=true\
  &cachePrepStmts=true\
  &cacheResultSetMetadata=true\
  &cacheServerConfiguration=true\
  &elideSetAutoCommits=true\
  &maintainTimeStats=false\
  &prepStmtCacheSize=50000\
  &prepStmtCacheSqlLimit=204800\
  &rewriteBatchedStatements=false\
  &useBatchMultiSend=true\
  &useBatchMultiSendNumber=50000\
  &useBulkStmts=true\
  &useLocalSessionState=true\
  &useLocalTransactionState=true\
  &useServerPrepStmts=true

mysql和mysql connectorj的性能在所有情况下都比mariadb差。
我已经看了一个星期了,正在考虑采用我在前面的问题中建议的解决方法如何提高mysql和sqlserver中一系列更新的速度?
为了防止服务器配置错误,以下是我得到的关键变量:

key_buffer_size                16MB
innodb_buffer_pool_size        24GB (mem 30GB)
innodb_log_file_size           134MB
innodb_log_buffer_size         8MB
innodb_flush_log_at_trx_commit 0
max_allowed_packet             16MB

我的50000次写入只是少量的数据-大约2mb。但是对于sql语法,当它通过jdbc连接时,这个值大概要大10倍-对吗?
以下是sql和解释计划:

Describe `data`
+---------------+------------------+------+-----+---------------------+-------------------------------+
| Field         | Type             | Null | Key | Default             | Extra                         |
+---------------+------------------+------+-----+---------------------+-------------------------------+
| parentId      | int(10) unsigned | NO   | PRI | NULL                |                               |
| modifiedDate  | date             | NO   | PRI | NULL                |                               |
| valueDate     | date             | NO   | PRI | NULL                |                               |
| value         | float            | NO   |     | NULL                |                               |
| versionstamp  | int(10) unsigned | NO   |     | 1                   |                               |
| createdDate   | datetime         | YES  |     | current_timestamp() |                               |
| last_modified | datetime         | YES  |     | NULL                | on update current_timestamp() |
+---------------+------------------+------+-----+---------------------+-------------------------------+

INSERT INTO `data` (`value`, `parentId`, `modifiedDate`, `valueDate`) VALUES (4853.16314229298,52054,'20-Apr-18','28-Dec-18')

+------+-------------+-------+------+---------------+------+---------+------+------+-------+
| id   | select_type | table | type | possible_keys | key  | key_len | ref  | rows | Extra |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+
|    1 | INSERT      | data  | ALL  | NULL          | NULL | NULL    | NULL | NULL | NULL  |
+------+-------------+-------+------+---------------+------+---------+------+------+-------+

UPDATE `data` SET `value` = 4853.16314229298 WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '28-Dec-18'

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | data  | range | PRIMARY       | PRIMARY | 10      | NULL |    1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

DELETE FROM `data` WHERE `parentId` = 52054 AND `modifiedDate` = '20-Apr-18' AND `valueDate` = '29-Jan-16'

+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
| id   | select_type | table | type  | possible_keys | key     | key_len | ref  | rows | Extra       |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+
|    1 | SIMPLE      | data  | range | PRIMARY       | PRIMARY | 10      | NULL |    1 | Using where |
+------+-------------+-------+-------+---------------+---------+---------+------+------+-------------+

[更新]
jdbc用法-这是一个精简版,请原谅任何严重错误:

final Connection connection = dataSource.getConnection();
    connection.setAutoCommit(false);
    try (PreparedStatement statement = connection.prepareStatement(
                 "UPDATE data SET value = ? " +
                         "WHERE parentId = ? " +
                         "AND modifiedDate = ? " +
                         "AND valueDate = ? ")) {
        // timeSeries is a list of 50,000 data points
        Arrays.stream(timeSeries)
                .forEach(ts -> {
            try {
                statement.setDouble(1, value);
                statement.setLong(2, parentId);
                statement.setDate(3, new java.sql.Date(
                        modifiedDate.getTime()));
                statement.setDate(4, new java.sql.Date(
                        valueDate.getTime()));
                statement.addBatch();
            } catch (SQLException e) {
                throw new RuntimeException(
                        "Bad batch statement handling", e);
            }
        });
        int[] results = statement.executeBatch();
        connection.commit();
    } catch (SQLException e) {
        connection.rollback();
        throw e;
    } finally {
        connection.close();
    }

我还有一些来自general\u日志的数据,显示了传入的jdbc调用,它看起来非常基本—一个“prepare”调用来设置语句,然后是单独的更新。
这让我很惊讶-似乎没有配料:

13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Query   set autocommit=0
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Prepare UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
13/06/2018 15:09    service_user_t[service_user_t] @  [9.177.2.31]  75954   298206495   Execute UPDATE `data` SET `value` = ? WHERE `parentId` = ? AND `modifiedDate` = ? AND `valueDate` = ?
etc
etc
jv4diomz

jv4diomz1#

在批处理中的一些行之间添加“begin”和“commit”语句。或者在批处理之前启动事务,然后提交。这将比成千上万的个人陈述快得多。
如果您只做插入,rewritebatchstatements=true应该会大大加快速度,而不需要事务处理。此外,您还可以将最大数据包大小增加到1gb,这将进行更多批处理,也许您的整个批处理将转换为1个非常大的多插入。

相关问题