SQL Server SQL Database normalization and data insertion

qnakjoqk  于 2023-11-16  发布在  其他
关注(0)|答案(3)|浏览(85)

№1 source table: sales (from unstructured file .csv)
| id | auto | gasoline_consumption | country |
| ------------ | ------------ | ------------ | ------------ |
| 1 | Lada Vesta, grey | 7.3 | Russia |
| 2 | BMW F80, red | 8.3 | Germany |
| 3 | Hyundai Elantra, red | 5 | South Korea |
| ... | ... | ... | ... |

  1. CREATE TABLE [sales](
  2. [id] [nvarchar](50) NOT NULL,
  3. [auto] [nvarchar](50) NOT NULL,
  4. [gasoline_consumption] [nvarchar](50) NULL,
  5. [country] [nvarchar] (50)
  6. ) ON [PRIMARY];
  7. INSERT INTO sales values(1,'Lada Vesta, grey', 7.3, 'Russia');
  8. INSERT INTO sales values(2,'BMW F80, red', 8.3, 'Germany');
  9. INSERT INTO sales values(3,'Hyundai Elantra, red', 5, 'South Korea');

№2 source table: Brand

idname
1Lada
2Hyundai
3BMW
......
  1. CREATE TABLE Brand (
  2. id INT PRIMARY KEY IDENTITY(1,1) NOT NULL,
  3. name VARCHAR(50),
  4. );
  5. INSERT INTO Brand values(1, 'Lada');
  6. INSERT INTO Brand values(2,'Hyundai');
  7. INSERT INTO Brand values(3,'BMW');

№3 source table: Color

idname
1grey
2red
......
  1. CREATE TABLE Color (
  2. id INT PRIMARY KEY IDENTITY (1,1) NOT NULL,
  3. name VARCHAR(15)
  4. )
  5. INSERT INTO Color values(1, 'grey');
  6. INSERT INTO Color values(2,'red');

№4 source table: Country

idname
1Russia
2Germany
3South Korea
......
  1. CREATE TABLE Country (
  2. id INT PRIMARY KEY IDENTITY (1,1) NOT NULL,
  3. name VARCHAR(50)
  4. )
  5. INSERT INTO Country values(1, 'Russia');
  6. INSERT INTO Country values(2,'Germany');
  7. INSERT INTO Country values(2,'South Korea');

№5 Example of the final table: Cars

idid_brandmodel_namegasoline_consumptionid_colorid_country
13F808,322
21Vesta7,311
32Elantra523
..................
  1. CREATE TABLE Cars (
  2. id INT PRIMARY KEY IDENTITY(1,1) NOT NULL,
  3. id_brand INT NOT NULL,
  4. model_name varchar (30),
  5. gasoline_consumption DECIMAL(3,1),
  6. id_color INT NOT NULL,
  7. FOREIGN KEY (id_brand) REFERENCES Brand (id),
  8. FOREIGN KEY (id_color) REFERENCES Color (id),
  9. FOREIGN KEY (id_country) REFERENCES Country (id)
  10. )

I was able to transfer colors, car brand and country from the original table to other tables. But now I don't understand how to enter data into the "Cars" table if need to take into account the id relationship between the car brand, country, color and model name.

Note: These data are given as an example, in fact there are about 1000 of them there

30byixjq

30byixjq1#

You can join back to your reference tables to get back the id, e.g.

  1. INSERT dbo.Cars (id_brand, model_name, gasoline_consumption, id_color, id_country)
  2. SELECT id_brand = b.id,
  3. model_name = SUBSTRING(s.auto, CHARINDEX(' ', s.auto), CHARINDEX(', ', s.auto) -CHARINDEX(' ', s.auto)),
  4. gasoline_consumption = TRY_CONVERT(DECIMAL(3, 1), s.gasoline_consumption),
  5. id_color = cr.id,
  6. id_country = c.id
  7. FROM dbo.Sales AS s
  8. LEFT JOIN dbo.Country AS c
  9. ON c.name = s.country
  10. LEFT JOIN dbo.Brand AS b
  11. ON b.name = LEFT(s.auto,CHARINDEX(' ', s.auto))
  12. LEFT JOIN dbo.Color AS cr
  13. ON cr.name = TRIM(RIGHT(s.auto, CHARINDEX(' ', REVERSE(s.auto))))

N.B. The above is inaccurate and won't work many values in `auto. If you have a colour such as "Sky Blue" it will fail, if you have a brand like "Land Rover", it will fail. The parsing logic shown is simple and for demo purposes, this should be replaced with your actual parsing logic.

You should probably encapsulate your parsing logic though to avoid repeating yourself and reduce code maintenance, something like:

  1. CREATE VIEW dbo.Sales_Parsed
  2. AS
  3. SELECT model_name = SUBSTRING(s.auto, CHARINDEX(' ', s.auto), CHARINDEX(', ', s.auto) -CHARINDEX(' ', s.auto)),
  4. gasoline_consumption = TRY_CONVERT(DECIMAL(3, 1), s.gasoline_consumption),
  5. color = TRIM(RIGHT(s.auto, CHARINDEX(' ', REVERSE(s.auto)))),
  6. brand = LEFT(s.auto,CHARINDEX(' ', s.auto))
  7. FROM dbo.Sales AS s;

You can then reference this view when populating your reference tables, e.g.

  1. INSERT dbo.Country (name)
  2. SELECT DISTINCT sp.Country
  3. FROM dbo.Sales_Parsed AS sp
  4. EXCEPT
  5. SELECT c.name
  6. FROM dbo.Country AS c;
  7. INSERT dbo.Color (name)
  8. SELECT DISTINCT sp.color
  9. FROM dbo.Sales_Parsed AS sp
  10. EXCEPT
  11. SELECT c.name
  12. FROM dbo.color AS c;
  13. INSERT dbo.Brand (name)
  14. SELECT DISTINCT sp.brand
  15. FROM dbo.Sales_Parsed AS sp
  16. EXCEPT
  17. SELECT c.name
  18. FROM dbo.Brand AS c;

And also where you populate your cars table:

  1. INSERT dbo.Cars (id_brand, model_name, gasoline_consumption, id_color, id_country)
  2. SELECT id_brand = b.id,
  3. s.model_name,
  4. s.gasoline_consumption,
  5. id_color = cr.id,
  6. id_country = c.id
  7. FROM dbo.Sales_Parsed AS s
  8. INNER JOIN dbo.Country AS c
  9. ON c.name = s.country
  10. INNER JOIN dbo.Brand AS b
  11. ON b.name = s.brand
  12. INNER JOIN dbo.Color AS cr
  13. ON cr.name = s.color;

Example on db<>fiddle

展开查看全部
w8rqjzmb

w8rqjzmb2#

You can do it in two different ways:

1. way (manual):

  1. INSERT INTO
  2. Cars (id_brand, model_name, gasoline_consumption, id_color, id_country)
  3. VALUES
  4. (3, 'F80', 8.3, 2, 2),
  5. (1, 'Vesta', 7.3, 1, 1),
  6. (2, 'Elantra', 5, 2, 3)

2. way (with SELECT): If you run a script or anything else you can use TSQL (if you are using SQL Server) to itterate over all single lines from your import table. Then you can extract your values which you inserted in all the other tables programatically from the columns and then you can use a SELECT to get the correct Id like in this very simple example:

  1. INSERT INTO
  2. Cars (id_brand, model_name, gasoline_consumption, id_color, id_country)
  3. VALUES
  4. (SELECT
  5. id
  6. FROM
  7. Brand
  8. WHERE
  9. name = 'BMW',
  10. 'F80',
  11. 8.3,
  12. SELECT
  13. id
  14. FROM
  15. Color
  16. WHERE
  17. name = 'red',
  18. SELECT
  19. id
  20. FROM
  21. Country
  22. WHERE name = 'Germany'),
展开查看全部
ct2axkht

ct2axkht3#

If you are refering to what to enter for the foreign keys following very basic logic, then you need to know what the ID of the relevant value for the relevant table is.

So if you wanted to insert 'BMW' as car type, you would need to find the ID of that, which in this case would be 3.

The ID/Identity column will be auto populated when creating an entry (unless you have INSERT_IDENTITY set to ON.

When inserting values, if you use a column list to explicitly state what you are inserting values, it can help to clear up exactly what you are adding. You can insert entries in the following way b

  1. -- Explicit
  2. INSERT INTO Cars (
  3. id_brand,
  4. model_name,
  5. gasoline_consumption,
  6. id_color,
  7. id_country
  8. )
  9. VALUES (3, 'F80', 8.3, 2, 2)
  10. -- Implicit
  11. INSERT INTO Cars VALUES (3, 'F80', 8.3, 2, 2)
展开查看全部

相关问题