I am trying to calculate something in Power BI or SQL, as this is conserning many many millions of rows, performance leads me to doing this in SQL, But i've been trying to get this work for quite a while already!
I am trying to make a calculate column that shows the current state of an account
| AccountId | PartnerId | Transactions | FirstTransaction | row_num | Industry | Private/Public | Result |
| ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ | ------------ |
| 6906 | 19 | 557 | 01-Jan-22 | 1 | Financial services | Private | Private Only |
| 6906 | 20 | 557 | 02-Jan-22 | 2 | Financial services | Private | Private Only |
| 6906 | 32 | 30 | 05-Jan-22 | 3 | Government | Public | Public + Private |
| 6906 | 738 | 2 | 12-Jan-22 | 4 | Government | Public | Public + Private |
| 6906 | 82 | 5 | 13-Jan-22 | 5 | | Private | Public + Private |
| 6906 | 74 | 1 | 11-Jun-22 | 6 | Financial services | Private | Public + Private |
| 6906 | 56 | 1 | 27-Jun-22 | 7 | Insurance | Private | Public + Private |
The result should check if an account has Public or Private transactions, in order of occurrence. I added a rank, partitioned by accountid and ordered by date ( = row_num) But next I would want a column that can crosscheck all the values within a partition.
In this example above, This account has the label "Private Only" on 01-01 as it only has done transactions with a private partner. the second line also says Private only, as it still has only done transactions with a private partner. But as of the third line it changes to Public + Private as it also used public transactions.
Anyone an idea how I can calcualte that Result column?
Edit: My Original Query:
SELECT ua.[AccountId],
ua.[PartnerId],
i.industry,
i.[Public/Private],
COUNT(*) AS Transactions,
MIN(ua.[FilepathDate]) AS FirstTransaction,
ROW_NUMBER() OVER (PARTITION BY accountid
ORDER BY MIN(ua.[FilepathDate] ))
Partner_row_num
FROM [dbo].[incremental_UserActions] ua
INNER JOIN tmp_Industry i ON ua.partnerid = i.partnerid
WHERE YEAR(ua.[FilepathDate]) = 2022
AND ua.accountid = 6906
GROUP BY ua.[AccountId],
ua.[PartnerId],
i.industry,
i.[Public/Private]
Edit 2 : the eventual output of the above looks like this :
| AccountId | FirstTransaction | MonthAgg | Result |
| --------- | ---------------- | ---------- | ----------------- |
| 7015 | 05/01/2022 | 01/01/2022 | FAS Only |
| 7015 | 06/01/2022 | 01/01/2022 | FAS + Other Public|
| 7015 | 19/01/2022 | 01/01/2022 | Public + Private |
| 7015 | 19/01/2022 | 01/01/2022 | Public + Private |
| 7015 | 24/01/2022 | 01/01/2022 | Public + Private |
| 7015 | 08/04/2022 | 01/04/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/06/2022 | Public + Private |
| 7799 | 04/01/2022 | 01/01/2022 | Private Only |
| 7799 | 10/01/2022 | 01/01/2022 | Public + Private |
| 7799 | 08/03/2022 | 01/03/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/04/2022 | Public + Private |
| 7799 | 17/12/2022 | 01/12/2022 | Public + Private |
The final result should have consecutive months per accountid. so 1 line per account per month
| AccountId | FirstTransaction | MonthAgg | Result |
| --------- | ---------------- | ---------- | ---------------- |
| 7015 | 24/01/2022 | 01/01/2022 | Public + Private |
| 7015 | 24/01/2022 | 01/02/2022 | Public + Private |
| 7015 | 24/01/2022 | 01/03/2022 | Public + Private |
| 7015 | 08/04/2022 | 01/04/2022 | Public + Private |
| 7015 | 08/04/2022 | 01/05/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/06/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/07/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/08/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/09/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/10/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/11/2022 | Public + Private |
| 7015 | 08/06/2022 | 01/12/2022 | Public + Private |
| 7799 | 10/01/2022 | 01/01/2022 | Public + Private |
| 7799 | 10/01/2022 | 01/02/2022 | Public + Private |
| 7799 | 08/03/2022 | 01/03/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/04/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/05/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/06/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/07/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/08/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/09/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/10/2022 | Public + Private |
| 7799 | 08/04/2022 | 01/11/2022 | Public + Private |
| 7799 | 17/12/2022 | 01/12/2022 | Public + Private |
3条答案
按热度按时间jv4diomz1#
You can use window functions. If there are only two possible values in the
[Public/Private]
column, then, starting from your existing query:The idea is to compare the MIN and MAX values; if they are the same, then it means there is only one distinct value in the window - else there are two different values, so we just use the mixed tag.
bjg7j2ky2#
Query:
Output:
fiddle
5m1hhzi43#
I have created a months table and inserted 12 months in it. Then I have joined it with previous result. No recursion is needed. Please check.
OUtput: