使用Python脚本将树状数据集插入MySQL

edqdpe6u  于 12个月前  发布在  Mysql
关注(0)|答案(1)|浏览(100)

你好,我是MySQL的新手。我正面临着一些问题,插入一个大的“树”数据到我的数据库。

**DATA:**我有一个树状的数据,包含各种主题、子主题、子子主题等等。我把它写进了一个yml文件。下面是yml文件的一小部分。

mathematics:
  - algebra:
    - linear_and_multilinear_algebra:
      - vector_spaces
      - matrix_operations
    - group_theory
  - analysis:
    - real_analysis
    - complex_analysis
    - functional_analysis:
      - integral_equations
      - differential_equation:
        - ode
        - pde
      - operator_theory
      - operator_algebra:
        - c*-algebras:
          - c*-algebras_and_their_representations
          - operator_spaces_associated_with_c*-algebras
          - kk-theory_and_k-homology_of_c*-algebras
          - non-selfadjoint_operator_algebras
        - banach_algebras
        - von_neumann_algebras:
          - non-commutative_geometry
          - factors
          - tomita_takesaki_theory

        - operator_spaces
        - non-commutative_algebras
natural_sciences:
  - biology:
    - botany
    - zoology
    - microbiology:
      - virology
      - bacteriology
  - chemistry:
    - organic_chemistry
    - inorganic_chemistry
    - physical_chemistry
  - physics:
    - classical_mechanics
    - quantum_mechanics:
      - quantum_entanglement
      - quantum_field_theory
technology_and_engineering:
  - computer_science:
    - algorithms
    - data_structures
    - artificial_intelligence:
      - machine_learning

**MySQL数据库:**为了存储这些数据,我在MySQL中创建了以下表

CREATE DATABASE IF NOT EXISTS knowledge;
USE knowledge;

-- Table to store topics
CREATE TABLE IF NOT EXISTS topics (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(255) NOT NULL
);

-- Closure table
CREATE TABLE IF NOT EXISTS closure (
    ancestor INT NOT NULL,
    descendant INT NOT NULL,
    length INT NOT NULL,
    PRIMARY KEY (ancestor, descendant),
    FOREIGN KEY (ancestor) REFERENCES topics(id),
    FOREIGN KEY (descendant) REFERENCES topics(id)
);

**问题。**我想写一个python脚本,它可以读取yml文件并填充两个表。

我写了下面的脚本,但它给出了错误。

验证码

# Load the YAML data from your file
with open(cat_config_file, 'r') as yaml_file:
    data = yaml.load(yaml_file, Loader=yaml.FullLoader)

# Establish a connection to the MySQL database
db_connection = mysql.connector.connect(**db_config)

# Create a cursor object to interact with the database
cursor = db_connection.cursor()

# Recursive function to insert topics and build the closure table
def insert_topic(topic, parent_id=None, level=0):
    # Insert the topic into the topics table
    cursor.execute("INSERT INTO topics (name) VALUES (%s)", (topic,))
    topic_id = cursor.lastrowid

    # Insert the closure record for the current topic and its parent with level
    if parent_id is not None:
        cursor.execute("INSERT INTO closure (ancestor, descendant, length) VALUES (%s, %s, %s)", (parent_id, topic_id, level))

    # Recursively insert subtopics
    if isinstance(data[topic], list):
        for subtopic in data[topic]:
            insert_topic(subtopic, topic_id, level + 1)
            
# Iterate through the top-level topics and insert them
for top_level_topic in data.keys():
    insert_topic(top_level_topic)

# Commit changes and close the cursor and connection
db_connection.commit()
cursor.close()
db_connection.close()

print("Data inserted into the MySQL database, with hierarchy levels in the 'closure' table.")

错误

Traceback (most recent call last):
  File "book_category.py", line 54, in <module>
    insert_topic(top_level_topic)
  File "book_category.py", line 50, in insert_topic
    insert_topic(subtopic, topic_id, level + 1)
  File "book_category.py", line 40, in insert_topic
    cursor.execute("INSERT INTO topics (name) VALUES (%s)", (topic,))
  File "/home/indrajit/Documents/hello_world/mysql/env/lib/python3.8/site-packages/mysql/connector/cursor_cext.py", line 317, in execute
    prepared = self._cnx.prepare_for_mysql(params)
  File "/home/indrajit/Documents/hello_world/mysql/env/lib/python3.8/site-packages/mysql/connector/connection_cext.py", line 802, in prepare_for_mysql
    result = self._cmysql.convert_to_mysql(*params)
_mysql_connector.MySQLInterfaceError: Python type dict cannot be converted

请帮我修改一下代码。谢谢.

**PS.**我想补充的是,我将查询数据库以获取特定节点的所有父节点。例如,如果我输入pde,那么它应该返回mathematics > analysis > functional_analysis > differential_equation > pde

nzk0hqpo

nzk0hqpo1#

该错误与闭包表甚至SQL无关。这是一个纯粹的Python问题,关于如何遍历YAML结构。
开始插入层次结构:

for top_level_topic in data.keys():
    insert_topic(top_level_topic)

data.keys()返回一个字符串列表,这些字符串是层次结构顶部的键。字符串是标量,它们可以用作SQL语句的参数。
但是,当您尝试执行递归步骤以插入层次结构的下一级时,您传递的是刚刚插入的主题所引用的Python dict。

for subtopic in data[topic]:
            insert_topic(subtopic, topic_id, level + 1)

你没有使用subtopic.keys(),因此subtopic是一个dict,而不是一个字符串。
此外,递归函数将始终引用层次结构顶部的data[topic]。即使你传递了子主题键,它们也会试图在层次结构的顶部找到那个键。
我会设计这个递归函数来接受字典,并检查字典本身的键。
然后,您将开始第一步:

insert_topic(data)

我将让您决定如何设计该函数中的代码来遍历YAML结构,因为它既有dict又有tuple。YAML比简单的树要复杂一些。

相关问题