SQL count with grouped aggregation

8cdiaqws, posted 2021-07-15 in ClickHouse

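I'm trying to write a SQL query that lets me build a line chart from historical data. I want to work out how many users are on each version of my app over time (per day). The y-axis will be the percentage of usage across all app versions (out of 100), the x-axis is the day, and each build is a separate line. At any point in time, all lines should sum to 100%.
Because this query has to group by version/build as well as by day, I'm trying to figure out how to get the total number of users for any given day inside the query, so I can compute each build's percentage. So far I have the following query: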

SELECT DISTINCT 
    sub.Version, 
    sub.Build,     
    sub.app_id, 
    sub.Users, 
    sub.`day`,
    (
        SELECT COUNT(DISTINCT user_id)
        FROM snowplow_enricher_good seg
    ) AS Total,
    (sub.Users/Total) * 100 AS Percent
FROM 
(
    SELECT
        visitParamExtractString(seg.contexts, 'version') AS Version,
        visitParamExtractString(seg.contexts, 'build') AS Build,
        seg.app_id,
        seg.`day`,
        CONCAT(
            Version, 
            ' (', 
            Build, 
            ')'
        ) AS AppBuildVersion,
        COUNT(DISTINCT seg.user_id) AS Users
    FROM snowplow_enricher_good seg
    GROUP BY Version, Build, app_id, `day`
    ORDER BY Users DESC
) AS sub
WHERE sub.app_id = 'APPID';

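Note that the percentage currently shown is relative to all days, not to a single day. I tried adding a WHERE clause to my custom FROM statement, but it failed.
Thanks in advance :)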

Answer #1 (gab6jxml)

Use groupArray:

SELECT
    totalCnt,
    totalSum,
    ga.1 AS tag,
    ga.2 AS value,
    (value / totalSum) * 100 AS percent
FROM
(
    SELECT
        count() AS totalCnt,
        sum(value) AS totalSum,
        groupArray((tag, value)) AS ga
    FROM
    (
        SELECT
            tag,
            value
        FROM
        (
            SELECT
                [1, 2, 3, 4, 5] AS tag,
                [10, 100, 50, 100, 40] AS value
        )
        ARRAY JOIN
            tag,
            value
    )
)
ARRAY JOIN ga

┌─totalCnt─┬─totalSum─┬─tag─┬─value─┬────────────percent─┐
│        5 │      300 │   1 │    10 │ 3.3333333333333335 │
│        5 │      300 │   2 │   100 │  33.33333333333333 │
│        5 │      300 │   3 │    50 │ 16.666666666666664 │
│        5 │      300 │   4 │   100 │  33.33333333333333 │
│        5 │      300 │   5 │    40 │ 13.333333333333334 │
└──────────┴──────────┴─────┴───────┴────────────────────┘
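The toy example above could be adapted to the table from the question roughly as follows. This is a minimal sketch, not a tested query: it assumes the `snowplow_enricher_good` table and the `day`, `app_id`, `user_id` and `contexts` columns exactly as they appear in the question, and the aliases `users`, `total_users` and `ga` are made up for illustration. The only structural change is that the middle query groups by `day`, so each day gets its own total and its own array.

```sql
-- Sketch: groupArray / ARRAY JOIN pattern applied per day.
-- Table and column names come from the question; aliases are illustrative.
SELECT
    day,
    ga.1 AS version,
    ga.2 AS users,
    (users / total_users) * 100 AS percent
FROM
(
    -- One row per day: the day's total plus an array of (version, users) pairs.
    SELECT
        day,
        sum(users) AS total_users,
        groupArray((version, users)) AS ga
    FROM
    (
        -- Distinct users per day and version for a single app.
        SELECT
            day,
            visitParamExtractString(contexts, 'version') AS version,
            count(DISTINCT user_id) AS users
        FROM snowplow_enricher_good
        WHERE app_id = 'APPID'
        GROUP BY day, version
    )
    GROUP BY day
)
-- Expand the array back into one row per (day, version).
ARRAY JOIN ga
ORDER BY day, version
```

Here `total_users` is the sum of the per-version counts, so the percentages for a given day add up to 100. A user who appears under two versions on the same day is counted once in each.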

Answer #2 (jm81lzqq)

This can be solved with a series of joins and subqueries:

SELECT 
    day, 
    app_id, 
    version, 
    version_count, 
    app_count, 
    (version_count / app_count) * 100 AS percent
FROM (
    SELECT 
        day, 
        app_id, 
        visitParamExtractString(contexts, 'version') AS version, 
        count(DISTINCT user_id) AS version_count
    FROM 
        snowplow_enricher_good
    WHERE 
        day >= subtractDays(today(), 30)
    GROUP BY 
        day, 
        app_id, 
        version
) 
INNER JOIN (
    SELECT 
        day, 
        app_id, 
        count(DISTINCT user_id) AS app_count
    FROM 
        snowplow_enricher_good
    WHERE 
        day >= subtractDays(today(), 30)
    GROUP BY 
        day, 
        app_id
)
USING 
    day, 
    app_id
WHERE
    app_id = 'APPID'
ORDER BY 
    day DESC, 
    app_id, 
    version;
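
As a possible alternative to the self-join, the per-day denominator can be computed with a window function. This is a sketch under assumptions: it needs a ClickHouse release that supports window functions (older releases may lack them or treat them as experimental), it reuses the table and columns from the question, and `day_total` is an alias invented here. Unlike the join above, which divides by the distinct users per day, this version divides by the sum of the per-version counts, so the lines for one day add up to exactly 100 even when a user shows up under more than one version that day.

```sql
-- Sketch: per-day share via a window function instead of a self-join.
-- Assumes window function support in the ClickHouse release in use.
SELECT
    day,
    version,
    version_count,
    day_total,
    (version_count / day_total) * 100 AS percent
FROM
(
    -- Attach each day's total to every row of that day.
    SELECT
        day,
        version,
        version_count,
        sum(version_count) OVER (PARTITION BY day) AS day_total
    FROM
    (
        -- Distinct users per day and version over the last 30 days.
        SELECT
            day,
            visitParamExtractString(contexts, 'version') AS version,
            count(DISTINCT user_id) AS version_count
        FROM snowplow_enricher_good
        WHERE app_id = 'APPID'
          AND day >= subtractDays(today(), 30)
        GROUP BY day, version
    )
)
ORDER BY day DESC, version
```

The window sum is computed in its own subquery and the division happens one level up, which avoids relying on how a particular release handles window functions nested inside other expressions.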
