如何合并多个SQLite数据库?

camsedfj  于 2023-11-21  发布在  SQLite
关注(0)|答案(8)|浏览(166)

如果我有大量的SQLite数据库,所有数据库都具有相同的模式,为了在所有数据库上执行查询,将它们合并在一起的最佳方法是什么?
我知道可以使用ATTACH来实现这一点,但它有32和64个数据库的限制,具体取决于机器上的内存系统。

f87krz0w

f87krz0w1#

从DavidM的回答中总结出Nabble post

attach 'c:\test\b.db3' as toMerge;           
BEGIN; 
insert into AuditRecords select * from toMerge.AuditRecords; 
COMMIT; 
detach toMerge;

字符串
根据需要重复。

  • 注:根据mike的评论添加了detach toMerge;。*
7kjnsjlb

7kjnsjlb2#

虽然这是一个非常古老的主题,但在今天的编程需求中,这仍然是一个相关的问题。我在这里发布这个问题,因为还没有一个答案是简洁的,简单的,直截了当的。这是为了最终出现在这个页面上的Google员工。GUI我们去:
1.下载Sqlitestudio
1.使用Ctrl + O键盘快捷键添加所有数据库文件
1.双击每个现在加载的数据库文件以全部打开/激活/展开它们
1.有趣的部分:只需右键单击每个表并单击Copy,然后转到已加载数据库文件列表中的目标数据库(或根据需要创建新数据库),右键单击目标数据库并单击Paste
我惊讶地意识到,这样一个艰巨的任务可以使用古老的编程技巧来解决,称为:复制和粘贴:)

anauzrmj

anauzrmj3#

这里有一个简单的python代码,可以合并两个数据库文件,或者扫描一个目录以找到所有数据库文件并将它们合并在一起(只需将其他文件中的所有数据插入到找到的第一个数据库文件中)。请注意,此代码只是将数据库附加到相同的模式。

import sqlite3
import os

def merge_databases(db1, db2):
    con3 = sqlite3.connect(db1)

    con3.execute("ATTACH '" + db2 +  "' as dba")

    con3.execute("BEGIN")
    for row in con3.execute("SELECT * FROM dba.sqlite_master WHERE type='table'"):
        combine = "INSERT OR IGNORE INTO "+ row[1] + " SELECT * FROM dba." + row[1]
        print(combine)
        con3.execute(combine)
    con3.commit()
    con3.execute("detach database dba")

def read_files(directory):
    fname = []
    for root,d_names,f_names in os.walk(directory):
        for f in f_names:
            c_name = os.path.join(root, f)
            filename, file_extension = os.path.splitext(c_name)
            if (file_extension == '.sqlitedb'):
                fname.append(c_name)

    return fname

def batch_merge(directory):
    db_files = read_files(directory)
    for db_file in db_files[1:]:
        merge_databases(db_files[0], db_file)

if __name__ == '__main__':
    batch_merge('/directory/to/database/files')

字符串

ccrfmcuu

ccrfmcuu4#

回答较晚,但您可以使用:用途:

#!/usr/bin/python

import sys, sqlite3

class sqlMerge(object):
    """Basic python script to merge data of 2 !!!IDENTICAL!!!! SQL tables"""

    def __init__(self, parent=None):
        super(sqlMerge, self).__init__()

        self.db_a = None
        self.db_b = None

    def loadTables(self, file_a, file_b):
        self.db_a = sqlite3.connect(file_a)
        self.db_b = sqlite3.connect(file_b)

        cursor_a = self.db_a.cursor()
        cursor_a.execute("SELECT name FROM sqlite_master WHERE type='table';")

        table_counter = 0
        print("SQL Tables available: \n===================================================\n")
        for table_item in cursor_a.fetchall():
            current_table = table_item[0]
            table_counter += 1
            print("-> " + current_table)
        print("\n===================================================\n")

        if table_counter == 1:
            table_to_merge = current_table
        else:
            table_to_merge = input("Table to Merge: ")

        return table_to_merge

    def merge(self, table_name):
        cursor_a = self.db_a.cursor()
        cursor_b = self.db_b.cursor()

        new_table_name = table_name + "_new"

        try:
            cursor_a.execute("CREATE TABLE IF NOT EXISTS " + new_table_name + " AS SELECT * FROM " + table_name)
            for row in cursor_b.execute("SELECT * FROM " + table_name):
                print(row)
                cursor_a.execute("INSERT INTO " + new_table_name + " VALUES" + str(row) +";")

            cursor_a.execute("DROP TABLE IF EXISTS " + table_name);
            cursor_a.execute("ALTER TABLE " + new_table_name + " RENAME TO " + table_name);
            self.db_a.commit()

            print("\n\nMerge Successful!\n")

        except sqlite3.OperationalError:
            print("ERROR!: Merge Failed")
            cursor_a.execute("DROP TABLE IF EXISTS " + new_table_name);

        finally:
            self.db_a.close()
            self.db_b.close()

        return

    def main(self):
        print("Please enter name of db file")
        file_name_a = input("File Name A:")
        file_name_b = input("File Name B:")

        table_name = self.loadTables(file_name_a, file_name_b)
        self.merge(table_name)

        return

if __name__ == '__main__':
    app = sqlMerge()
    app.main()

字符串
SRC:Tool to merge identical SQLite3 databases

ubbxdtey

ubbxdtey5#

如果你只需要执行一次合并操作(创建一个新的更大的数据库),你可以创建一个脚本/程序,它将循环所有的sqlite数据库,然后将数据插入到你的主(大)数据库。

wwwo4jvm

wwwo4jvm6#

Bash helper自动合并每个数据库中的所有表

下面是https://stackoverflow.com/a/68526717/895245的一个简洁的Bash版本,它循环遍历给定DB中具有相同模式的所有表:
sqlite-merge-dbs

#!/usr/bin/env bash
set -eu
outdb="$1"
shift
indb0="$1"
shift
cp "$indb0" "$outdb"
for table in $(sqlite3 "$outdb" "SELECT name FROM sqlite_master WHERE type='table'"); do
  echo "table: $table"
  for db in "$@"; do
    echo "db: $db"
    sqlite3 "$outdb" "attach '$db' as 'db2'" "insert into \"$table\" select * from \"db2\".\"$table\""
  done
done

字符串
这是通过从特殊的sqlite_master表中提取表名来实现的。
样品使用:

sqlite-merge-dbs out.sqlite in0.sqlite in1.sqlite in2.sqlite


试验项目:

rm -f in0.sqlite in1.sqlite in2.sqlite

sqlite3 in0.sqlite 'create table t(i integer, j integer)'
sqlite3 in1.sqlite 'create table t(i integer, j integer)'
sqlite3 in2.sqlite 'create table t(i integer, j integer)'
sqlite3 in0.sqlite 'insert into t values (1, -1), (2, -2)'
sqlite3 in1.sqlite 'insert into t values (3, -3), (4, -4)'
sqlite3 in2.sqlite 'insert into t values (5, -5), (6, -6)'

sqlite3 in0.sqlite 'create table s(k integer, l integer)'
sqlite3 in1.sqlite 'create table s(k integer, l integer)'
sqlite3 in2.sqlite 'create table s(k integer, l integer)'
sqlite3 in0.sqlite 'insert into s values (11, -11), (12, -12)'
sqlite3 in1.sqlite 'insert into s values (13, -13), (14, -14)'
sqlite3 in2.sqlite 'insert into s values (15, -15), (16, -16)'

./sqlite-merge-dbs out.sqlite in0.sqlite in1.sqlite in2.sqlite

sqlite3 out.sqlite 'select * from t'
echo
sqlite3 out.sqlite 'select * from s'


输出量:

1|-1
2|-2
3|-3
4|-4
5|-5
6|-6

11|-11
12|-12
13|-13
14|-14
15|-15
16|-16


一个Python版本:

#!/usr/bin/env python

import os
import argparse
import sqlite3
import shutil

parser = argparse.ArgumentParser()
parser.add_argument('out')
parser.add_argument('ins', nargs='+')
args = parser.parse_args()

shutil.copyfile(args.ins[0], args.out)
con = sqlite3.connect(args.out)
cur = con.cursor()
tables = list(map(lambda e: e[0], cur.execute("SELECT name FROM sqlite_master WHERE type='table'")))
for db2 in args.ins[1:]:
    cur.execute(f"attach '{db2}' as 'db2'")
    for table in tables:
        cur.execute(f"insert into {table} select * from db2.{table}")
    con.commit()
    cur.execute("detach database db2")

也跳过自动递增列的Bash帮助器

如果表中有重叠的自动递增PK列,sqlite-merge-dbs脚本可能会意外地爆炸。
下面的helper通过查看pragma table_info helper来决定列是否自动递增,并从合并中跳过这些列,让它们在最后重新自动递增。
sqlite-merge-dbs-id

#!/usr/bin/env bash
set -eu
outdb="$1"
shift
indb0="$1"
shift
cp "$indb0" "$outdb"
for table in $(sqlite3 "$outdb" "SELECT name FROM sqlite_master WHERE type='table'"); do
  echo "table: $table"
  cols="$(sqlite3 "$outdb" "pragma table_info($table)" | awk -F\| '{ if ( $3!="INTEGER" || $6=="0" ) { print $2 } else { print "NULL" } }' | paste -sd , -)"
  echo $cols
  for db in "$@"; do
    echo "db: $db"
    sqlite3 "$outdb" "attach '$db' as 'db2'" "insert into \"$table\" select $cols from \"db2\".\"$table\""
  done
done


试验项目:

rm -f in0.sqlite in1.sqlite in2.sqlite

sqlite3 in0.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in1.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in2.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in0.sqlite 'insert into t values (NULL, 1, -1), (NULL, 2, -2)'
sqlite3 in1.sqlite 'insert into t values (NULL, 3, -3), (NULL, 4, -4)'
sqlite3 in2.sqlite 'insert into t values (NULL, 5, -5), (NULL, 6, -6)'

sqlite3 in0.sqlite 'create table s(id integer primary key, k integer, l integer)'
sqlite3 in1.sqlite 'create table s(id integer primary key, k integer, l integer)'
sqlite3 in2.sqlite 'create table s(id integer primary key, k integer, l integer)'
sqlite3 in0.sqlite 'insert into s values (NULL, 11, -11), (NULL, 12, -12)'
sqlite3 in1.sqlite 'insert into s values (NULL, 13, -13), (NULL, 14, -14)'
sqlite3 in2.sqlite 'insert into s values (NULL, 15, -15), (NULL, 16, -16)'

./sqlite-merge-dbs-id out.sqlite in0.sqlite in1.sqlite in2.sqlite

sqlite3 out.sqlite 'select * from t'
echo
sqlite3 out.sqlite 'select * from s'


输出量:

1|1|-1
2|2|-2
3|3|-3
4|4|-4
5|5|-5
6|6|-6

1|11|-11
2|12|-12
3|13|-13
4|14|-14
5|15|-15
6|16|-16


pragma table_info的文档位于:https://www.sqlite.org/pragma.html#pragma_table_info
结果集中的列包括:“name”(其名称);“type”(数据类型,如果给定,则为"“);“notnull”(列是否可以为NULL);“dflt_value”(列的默认值);和“pk”(不属于主键的列为零,或者主键中列的从1开始的索引)。
在这个例子中,我们可以观察到:

sqlite3 in0.sqlite 'pragma table_info(t)'


其输出:

0|id|INTEGER|0||1
1|i|INTEGER|0||0
2|j|INTEGER|0||0


在脚本中,我们只是跳过了任何既是INTEGER又是pk的东西。
一个Python版本:

#!/usr/bin/env python

import os
import argparse
import sqlite3
import shutil

parser = argparse.ArgumentParser()
parser.add_argument('out')
parser.add_argument('ins', nargs='+')
args = parser.parse_args()

shutil.copyfile(args.ins[0], args.out)
con = sqlite3.connect(args.out)
cur = con.cursor()
tables = list(map(lambda e: e[0], cur.execute("SELECT name FROM sqlite_master WHERE type='table'")))
table_to_pk_col = {}
table_to_insert = {}
table_to_cols = {}
for table in tables:
    cols = cur.execute(f'pragma table_info({table})').fetchall()
    table_to_cols[table] = cols
    for row in cols:
        col_name = row[1]
        type_ = row[2]
        pk = row[5]
        if type_ == 'INTEGER' and pk != 0:
            if table in table_to_pk_col:
                del table_to_pk_col[table]
            else:
                table_to_pk_col[table] = col_name
table_to_insert = { table: ','.join(list(map(
    lambda c: 'NULL' if c[1] == table_to_pk_col.get(table, None) else c[1], table_to_cols[table]
))) for table in tables }
for db2 in args.ins[1:]:
    cur.execute(f"attach '{db2}' as 'db2'")
    cur.execute(f"begin")
    for table in tables:
        cur.execute(f"insert into {table} select {table_to_insert[table]} from db2.{table}")
    con.commit()
    cur.execute("detach database db2")

保留外键自动增加pk

好了,这就是BOSS级别!Bash变得太麻烦了,所以我拿出了一些Python:
sqlite-merge-dbs-id-ref.py

#!/usr/bin/env python

import os
import argparse
import sqlite3
import shutil

parser = argparse.ArgumentParser()
parser.add_argument('out')
parser.add_argument('ins', nargs='+')
args = parser.parse_args()

shutil.copyfile(args.ins[0], args.out)
con = sqlite3.connect(args.out)
cur = con.cursor()
tables = list(map(lambda e: e[0], cur.execute("SELECT name FROM sqlite_master WHERE type='table'")))
table_to_pk_col = {}
table_to_insert = {}
table_to_cols = {}
table_to_pk_count = {}
table_to_col_to_foreign = {}
for table in tables:
    col_to_foreign = {}
    table_to_col_to_foreign[table] = col_to_foreign
    cols = cur.execute(f'pragma foreign_key_list({table})').fetchall()
    for col in cols:
        col_name = col[3]
        target_table = col[2]
        col_to_foreign[col_name] = target_table
for table in tables:
    cols = cur.execute(f'pragma table_info({table})').fetchall()
    table_to_cols[table] = cols
    for row in cols:
        col_name = row[1]
        type_ = row[2]
        pk = row[5]
        if type_ == 'INTEGER' and pk != 0:
            if table in table_to_pk_col:
                del table_to_pk_col[table]
            else:
                table_to_pk_col[table] = col_name
    if table in table_to_pk_col:
        table_to_pk_count[table] = cur.execute(f'select max({table_to_pk_col[table]}) from {table}').fetchone()[0]
    else:
        table_to_pk_count[table] = cur.execute(f'select count(*) from {table}').fetchone()[0]
def inc_str(table, col):
    if table in table_to_col_to_foreign:
        col_to_foreign = table_to_col_to_foreign[table]
        if col in col_to_foreign:
            return f'+{table_to_pk_count[col_to_foreign[col]]}'
    return ''
for db2 in args.ins[1:]:
    cur.execute(f"attach '{db2}' as 'db2'")
    table_to_pk_count_inc = {}
    for table in tables:
        table_to_insert = {
            table: ','.join(list(map(
                lambda c: 'NULL' if c[1] == table_to_pk_col.get(table, None) else \
                    c[1] + inc_str(table, c[1]),
                table_to_cols[table]
            ))) for table in tables
        }
        cur.execute(f"insert into {table} select {table_to_insert[table]} from db2.{table}")
        table_to_pk_count_inc[table] = cur.rowcount
    for table in tables:
        table_to_pk_count[table] += table_to_pk_count_inc[table]
    con.commit()
    cur.execute("detach database db2")


该脚本假设任何INTEGER PRIMARY KEY也是表中唯一的pk都自动递增。
这是我们需要通过的测试,其中表ts由表ref链接起来,表ref的外键指向这两个表:

rm -f in0.sqlite in1.sqlite in2.sqlite

sqlite3 in0.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in1.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in2.sqlite 'create table t(id integer primary key, i integer, j integer)'
sqlite3 in0.sqlite 'insert into t values (1, 1, -1), (2, 2, -2), (3, 0, 0)'
sqlite3 in1.sqlite 'insert into t values (1, 3, -3), (2, 4, -4), (3, 0, 0)'
sqlite3 in2.sqlite 'insert into t values (1, 5, -5), (2, 6, -6), (3, 0, 0)'

sqlite3 in0.sqlite 'create table s(id integer primary key, i integer, j integer)'
sqlite3 in1.sqlite 'create table s(id integer primary key, i integer, j integer)'
sqlite3 in2.sqlite 'create table s(id integer primary key, i integer, j integer)'
sqlite3 in0.sqlite 'insert into s values (1, 1, -1), (2, 2, -2)'
sqlite3 in1.sqlite 'insert into s values (1, 3, -3), (2, 4, -4)'
sqlite3 in2.sqlite 'insert into s values (1, 5, -5), (2, 6, -6)'

for i in 0 1 2; do
  sqlite3 "in$i.sqlite" <<EOF
create table ref(
  k integer,
  l integer,
  primary key(k, l),
  foreign key(k) references t(id),
  foreign key(l) references s(id)
)
EOF
done
sqlite3 in0.sqlite 'insert into ref values (1, 2)'
sqlite3 in1.sqlite 'insert into ref values (1, 2)'
sqlite3 in2.sqlite 'insert into ref values (1, 2)'

./sqlite-merge-dbs-id-ref.py out.sqlite in0.sqlite in1.sqlite in2.sqlite

echo t
sqlite3 out.sqlite 'select * from t'
echo s
sqlite3 out.sqlite 'select * from s'
echo ref
sqlite3 out.sqlite 'select * from ref'


输出量:

t
1|1|-1
2|2|-2
3|0|0
4|3|-3
5|4|-4
6|0|0
7|5|-5
8|6|-6
9|0|0
+ echo

s
1|1|-1
2|2|-2
3|3|-3
4|4|-4
5|5|-5
6|6|-6
+ echo

ref
1|2
4|4
7|6


所以现在我们看到表ts的PK和以前一样递增,但是对于ref,发生了更复杂的事情:脚本相应地用PK递增来递增外键,所以我们正确地维护了关系。
例如:

4|4

from ref链接4|3|-3 from t4|4|-4 from s。这些值在合并之前已经链接起来了:

sqlite3 in1.sqlite 'insert into ref values (1, 2)'

它将价值观从以下方面联系起来:

sqlite3 in1.sqlite 'insert into t values (1, 3, -3), (2, 4, -4), (3, 0, 0)'
sqlite3 in1.sqlite 'insert into s values (1, 3, -3), (2, 4, -4)'

但是在连接表上,ID现在是4和4而不是1和2,我们已经正确地解释了这一点。
该脚本依赖于forign_key_list杂注来列出外键https://www.sqlite.org/pragma.html#pragma_foreign_key_list E>g.:

sqlite3 in0.sqlite 'pragma foreign_key_list(ref)'

给出:

0|0|s|l|id|NO ACTION|NO ACTION|NONE
1|0|t|k|id|NO ACTION|NO ACTION|NONE

作为参考,table_info给出:

sqlite3 in0.sqlite 'pragma table_info(s)'

提供:

0|k|INTEGER|0||1
1|l|INTEGER|0||2

在Ubuntu 23.04,sqlite 3.40.1上测试。

mf98qq94

mf98qq947#

如果您已经到达这个提要的底部,但还没有找到您的解决方案,这里也有一种方法可以合并2个或更多sqlite数据库的表。
首先尝试下载并安装DB browser for sqlite database。然后尝试在两个窗口中打开您的数据库,并尝试通过简单地将表从一个窗口拖放到另一个窗口来合并它们。但问题是,您只能拖放一个窗口。因此,它不是一个真正的解决方案,但它可以用来保存一些时间,从进一步的搜索,如果您的数据库是小了

lkaoscv7

lkaoscv78#

无意冒犯,就像一个开发人员对另一个开发人员一样,我担心你的想法似乎非常低效。在我看来,你应该在同一个数据库文件中存储多个表,而不是统一SQLite数据库。
但是如果我弄错了,我想你可以附加数据库,然后使用视图来简化查询,或者在内存中创建一个表,然后复制所有数据(但这会导致性能更差,特别是如果你有大型数据库)。

相关问题