初始行数据在下表中-
create table offer_row_data (
customer_key char(20),
offer1_id char(20),
offer1_cd char(20),
offer1_brand_nm char(20),
offer2_id char(20),
offer2_cd char(20),
offer2_brand_nm char(20),
offer3_id char(20),
offer3_cd char(20),
offer3_brand_nm char(20),
offer4_id char(20),
offer4_cd char(20),
offer4_brand_nm char(20)
);
我需要转换这个,并在下表中加载。。。
create table offer_data (
offer_key INTEGER NOT NULL AUTO_INCREMENT PRIMARY KEY,
offer_id char(20) NOT NULL,
offer_cd char(20) NOT NULL,
offer_brand_nm char(20));
这是一些随机数据-
INSERT INTO offer_row_data VALUES ('1','offer_id_1a','offe3_cd_1a','offer_nm_1a','offer_id_1b','offe3_cd_1b','offer_nm_1b','offer_id_1c','offe3_cd_1c','offer_nm_1c','offer_id_1d','offe3_cd_1d','offer_nm_1d');
INSERT INTO offer_row_data VALUES ('2','offer_id_2a','offe3_cd_2a','offer_nm_2a','offer_id_2b','offe3_cd_2b','offer_nm_2b','offer_id_2c','offe3_cd_2c','offer_nm_2c','offer_id_2d','offe3_cd_2d','offer_nm_2d');
INSERT INTO offer_row_data VALUES ('3','offer_id_3a','offe3_cd_3a','offer_nm_3a','offer_id_3b','offe3_cd_3b','offer_nm_3b','offer_id_3c','offe3_cd_3c','offer_nm_3c','offer_id_3d','offe3_cd_3d','offer_nm_3d');
这是我的解决方案-
INSERT INTO offer_data (offer_id,offer_cd,offer_brand_nm)
(SELECT distinct offer1_id, offer1_cd, offer1_brand_nm FROM offer_row_data)
UNION
(SELECT distinct offer2_id, offer2_cd, offer2_brand_nm FROM offer_row_data)
UNION
(SELECT distinct offer3_id, offer3_cd, offer3_brand_nm FROM offer_row_data)
UNION
(SELECT distinct offer4_id, offer4_cd, offer4_brand_nm FROM offer_row_data)
因为我有2500万的数据集,所以这将是一个性能负担,并期待一个更有效的解决方案
例子:
输入:
+--------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+
| customer_key | offer1_id | offer1_cd | offer1_brand_nm | offer2_id | offer2_cd | offer2_brand_nm | offer3_id | offer3_cd | offer3_brand_nm | offer4_id | offer4_cd | offer4_brand_nm |
+--------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+
| 1 | offer_id_1a | offe3_cd_1a | offer_nm_1a | offer_id_1b | offe3_cd_1b | offer_nm_1b | offer_id_1c | offe3_cd_1c | offer_nm_1c | offer_id_1d | offe3_cd_1d | offer_nm_1d |
| 2 | offer_id_2a | offe3_cd_2a | offer_nm_2a | offer_id_2b | offe3_cd_2b | offer_nm_2b | offer_id_2c | offe3_cd_2c | offer_nm_2c | offer_id_2d | offe3_cd_2d | offer_nm_2d |
| 3 | offer_id_3a | offe3_cd_3a | offer_nm_3a | offer_id_3b | offe3_cd_3b | offer_nm_3b | offer_id_3c | offe3_cd_3c | offer_nm_3c | offer_id_3d | offe3_cd_3d | offer_nm_3d |
+--------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+-------------+-------------+-----------------+
预期产量:
+-----------+-------------+-------------+----------------+
| offer_key | offer_id | offer_cd | offer_brand_nm |
+-----------+-------------+-------------+----------------+
| 1 | offer_id_1a | offe3_cd_1a | offer_nm_1a |
| 2 | offer_id_2a | offe3_cd_2a | offer_nm_2a |
| 3 | offer_id_3a | offe3_cd_3a | offer_nm_3a |
| 4 | offer_id_1b | offe3_cd_1b | offer_nm_1b |
| 5 | offer_id_2b | offe3_cd_2b | offer_nm_2b |
| 6 | offer_id_3b | offe3_cd_3b | offer_nm_3b |
| 7 | offer_id_1c | offe3_cd_1c | offer_nm_1c |
| 8 | offer_id_2c | offe3_cd_2c | offer_nm_2c |
| 9 | offer_id_3c | offe3_cd_3c | offer_nm_3c |
| 10 | offer_id_1d | offe3_cd_1d | offer_nm_1d |
| 11 | offer_id_2d | offe3_cd_2d | offer_nm_2d |
| 12 | offer_id_3d | offe3_cd_3d | offer_nm_3d |
+-----------+-------------+-------------+----------------+
3条答案
按热度按时间hmtdttj41#
如果使用cte,它只读取一次数据,而不是在原始sql中读取4次,而且速度可能更快。
请让我知道这是否解决了您的性能问题。
68de4m5k2#
我认为这很简单,而且可能是最快的,如果您不使用mysql 8.0/mariadb 10.2(其中包含cte),它就可以工作:
如果行中有“最多4个”报价,我建议您将此添加到
SELECT
#2:(其他选择也一样。)
DISTINCT
会让事情变慢,但如果你需要的话就去做。或者,使用INSERT IGNORE
吃点东西UNIQUE
钥匙准备好抓DUP了。你可能想重新考虑一下
AUTO_INCREMENT
在新table上。也许其中一个(或一个组合)的新列可以是“自然的”PRIMARY KEY
?这也是检查数据类型的好时机。也许你不需要8字节
BIGINT
,但可以使用4字节INT
或3字节MEDIUMINT
. 是char(20)
列的长度真的固定吗?是utf8吗?VARCHAR(20)
用合适的CHARACTER SET
在空间和性能方面可能会更好。一旦你把table装上东西,就把旧table扔了。将数组分布在多个列上是不好的做法。
uxh89sit3#
(修正了demicioglu答案的语法错误)
它暂时比我的答案快,我的答案包括4个独立的插入。我不知道它是否“只读取一次数据”。