Create separate tables based on values in one column of a base table in SQL Server

jucafojl  于 2023-11-16  发布在  SQL Server
关注(0)|答案(2)|浏览(176)

My XML was:

  1. set @xml=N'
  2. <root>
  3. <attribute col1="attr1" col2="varchar(200)" col3="A"/>
  4. <attribute col1="attr2" col2="varchar(200)" col3="A"/>
  5. <attribute col1="attr3" col2="varchar(200)" col3="B"/>
  6. <attribute col1="attr4" col2="varchar(200)" col3="C"/>
  7. </root>'

I converted the XML into Base table in SQL Server:

  1. COL1 COL2 COL3
  2. -----------------------------
  3. attr1 varchar(200) A
  4. attr2 varchar(200) A
  5. attr3 varchar(200) B
  6. attr4 varchar(200) C

I want to create tables like

  1. create table A
  2. (
  3. attr1 varchar(200),
  4. attr2 varchar(200)
  5. )

and similarly for B and C in COL3

l2osamch

l2osamch1#

You can insert values into a new table based on a select query.

  1. CREATE TABLE A
  2. AS (SELECT COL1, COL2
  3. FROM old_table
  4. WHERE COL3 = "A");

But you will have to do this for every distinct value of COL3 .

rkttyhzu

rkttyhzu2#

you can use python to do it,

  1. import pandas as pd
  2. import xml.etree.ElementTree as ET
  3. import json
  4. # Function to remove XML namespaces from the XML content
  5. def remove_xml_namespaces(xml_content):
  6. return ''.join(['<{}>'.format(tag.split('}')[1]) if '}' in tag else tag for tag in xml_content.split('<')])
  7. # Read the mapping file
  8. mapping_df = pd.read_csv('mapping.csv')
  9. # Read the JSON data from a file
  10. with open('input2.json', 'r') as json_file:
  11. json_data = json.load(json_file)
  12. # Initialize an empty list to store the data
  13. data = []
  14. for item in json_data:
  15. # Parse the XML string from the JSON data
  16. xml_string = item['xmlcol']
  17. # Remove XML namespaces
  18. xml_string = remove_xml_namespaces(xml_string)
  19. root = ET.fromstring(xml_string)
  20. # Iterate through root elements in the XML
  21. for element in root:
  22. # Initialize a dictionary to store the data for this XML
  23. xml_data = {}
  24. # Add "_id" value to the dictionary
  25. xml_data["_id"] = item["_id"]["$oid"]
  26. # Iterate through the mapping DataFrame
  27. for index, row in mapping_df.iterrows():
  28. xml_path = row['xml_path']
  29. column_name = row['column_name']
  30. # Find the element in the current root using the path
  31. elements = element.findall(xml_path)
  32. # Extract the values and append to the dictionary
  33. values = [element.text if element is not None else '' for element in elements]
  34. xml_data[column_name] = values
  35. data.append(xml_data)
  36. # Create a DataFrame from the extracted data
  37. result_df = pd.DataFrame(data)
  38. # Iterate through columns containing arrays and apply explode
  39. for column_name in result_df.columns:
  40. if isinstance(result_df[column_name].iloc[0], list):
  41. result_df = result_df.apply(lambda x: pd.Series(x[column_name]), axis=1)
  42. result_df.rename(columns=lambda col: column_name + str(col), inplace=True)
  43. # Reset the DataFrame index
  44. result_df.reset_index(drop=True, inplace=True)
  45. # Replace empty strings with None
  46. result_df = result_df.applymap(lambda x: None if x == '' else x)
  47. # Show the resulting DataFrame
  48. print(result_df)

mapping contains the map of which column you want to extract

展开查看全部

相关问题