SQL Server How to create a key when producing a dimension and fact table from single table when the prod can't be unique identified?

6gpjuf90  于 6个月前  发布在  其他
关注(0)|答案(2)|浏览(61)

I have a wide table called Sales.

Each row represents a sale. It has got various info like sales, customer, product.

I'm building a star schema from this using SQL to create product dimension using:

Select distinct prodnumber,prodname,size,class
From sales

Building fact table as follows:

Select salesamt,salesdate,prodnumber,prodname,size,class
From sales

The reason for having prod fields in fact table is that- A unique product unfortunately isn't identified by productnumber, rather by prodnumber,prodname,size,class.

Hence the key is a combination.

So in Power BI when joining the prod dimension and fact table in the final report I need to join on 4 columns. Is there an alternative way to do this? Like somehow generating a unique key to make the join simpler?

kqlmhetl

kqlmhetl1#

This answer is done in Power BI PowerQuery.
It is doing the almost exactly the same thing as the other SQL answer.

For your first query, let's call it Dim Product , Go to Add Column in the ribbon, and select Index Column > From 1 . Then in the formula bar, rename "Index" to "ID" (or similar).

For your second query, let's call it Fact Sale , in the Home tab in the ribbon, select Merge Queries . In the pop-up:

  • Select 'Dim Product` in the drop-down in the middle.
  • Under Fact Sale at the top click on prodnumber column, then while holding down Ctrl or Shift , select prodname , size , and class . You should see 1, 2, 3, 4 appear in their headers.
  • Do the same for Dim Product headers, ensuring the same order as above.

It should look like:

Finally:

  • Select columns prodnumber , prodname , size , class - and remove those columns.
  • Click on the double arrows in the Dim Product header, and only select ID , and uncheck Use original column name as prefix .
  • Rename the new ID column to Product ID if you like.

And that's it.

1dkrff03

1dkrff032#

CREATE TABLE dbo.PRODUCT_DIM
(
PRODUCT_PK INT IDENTITY(1,1) PRIMARY KEY NOT NULL,
PRODNUMBER INT NOT NULL,
PRODNAME NVARCHAR(MAX) NOT NULL,
SIZE NVARCHAR(MAX) NOT NULL,
CLASS NVARCHAR(MAX) NOT NULL
)

INSERT INTO DBO.PRODUCT_DIM
(
PRODNUMBER,
PRODNAME,
SIZE,
CLASS
)
SELECT DISTINCT
PRODNUMBER,
PRODNAME,
SIZE,
CLASS
FROM SALES

Then you will need to include the PK (in an FK form) in your fact table You can find what this is by doing the select as follows

CREATE THE FACT TABLE FIRST WITH THE COLUMNS LISTED APART FROM THE PRODNUMBER, PRODNAME, CLASS AND SIZE COLUMNS BUT WITH A PRODUCT_FK INT NOT NULL

INSERT INTO SALES_FACT
(
SALESAMT,
SALESDATE
PRODUCT_FK
)

Select S.salesamt,S.salesdate, P.PRODUCT_PK
From sales S
INNER JOIN PRODUCT_DIM P
ON 
P.PRODNUMBER = S.PRODNUMBER
P.PRODNAME = S.PRODNAME
P.SIZE = S.SIZE
P.CLASS = S.CLASS

Finally you can then join a single table using the product_pk and product_fk in your dim and fact tables respectively (if in power bi create the relationship on this)

相关问题