MariaDB: get the most common value per column

uurity8g  posted on 2022-11-08  in Other

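I'm trying to create an SQL query that retrieves the total_cost of all rows in a table. Alongside that, I also need to collect the most dominant value of both columnA and columnB, with their respective values.
For example, given the following table contents: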
| cost | columnA | columnB | target |
| --- | --- | --- | --- |
| 250 | Foo | Bar | XYZ |
| 200 | Foo | Bar | XYZ |
| 150 | Bar | Bar | ABC |
| 250 | Foo | Bar | ABC |
The result needs to be:
| total_cost | columnA_dominant | columnB_dominant | columnA_value | columnB_value |
| --- | --- | --- | --- | --- |
| 850 | Foo | Bar | 250 | 400 |
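I can already calculate the total cost, so that's not the problem. I can also get the most dominant value of columnA using this answer. But beyond that, I don't know how to *also* get the dominant value of columnB *and* the respective values.
This is my current SQL: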

SELECT 
    SUM(`cost`) AS `total_cost`,
    COUNT(`columnA`) AS `columnA_dominant` 
FROM `table`
GROUP BY `columnA_dominant`
ORDER BY `columnA_dominant` DESC
WHERE `target` = "ABC"

**UPDATE:** Thanks to @Barmar's idea of using subqueries, I managed to get the dominant values of columnA and columnB:

SELECT 
    -- Retrieve total cost.
    SUM(`cost`) AS `total_cost`,
    -- Get dominant values.
    (
        SELECT `columnA`
        FROM `table`
        GROUP BY `columnA`
        ORDER BY COUNT(*) DESC
        LIMIT 1
    ) AS `columnA_dominant`,
    (
        SELECT `columnB`
        FROM `table`
        GROUP BY `columnB`
        ORDER BY COUNT(*) DESC
        LIMIT 1
    ) AS `columnB_dominant`
FROM `table`
WHERE `target` = "XYZ"

However, I still have trouble figuring out how to calculate the respective values.

eh57zj3b  #1

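You are probably close. If we want percentage values, we can add subqueries that use COUNT(*) to get the highest count for columnA and for columnB, and then divide each by the total row count: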

SELECT 
    SUM(cost),
    (
        SELECT tt.columnA
        FROM T tt
        GROUP BY tt.columnA
        ORDER BY COUNT(*) DESC
        LIMIT 1
    )  AS columnA_dominant,
    (
        SELECT tt.columnB
        FROM T tt
        GROUP BY tt.columnB
        ORDER BY COUNT(*) DESC
        LIMIT 1
    )  AS columnB_dominant,
    (
        SELECT COUNT(*)
        FROM T tt
        GROUP BY tt.columnA
        ORDER BY COUNT(*) DESC
        LIMIT 1
    ) / COUNT(*) AS columnA_percentage,
    (
        SELECT COUNT(*)
        FROM T tt
        GROUP BY tt.columnB
        ORDER BY COUNT(*) DESC
        LIMIT 1
    ) / COUNT(*) AS columnB_percentage
FROM T t1
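
To try the subquery approach on the sample rows from the question, here is a minimal sketch. It uses Python's built-in `sqlite3` module as a stand-in for MariaDB, which is an assumption made only so the example runs without a server. The function name `dominant_with_subqueries` and the column alias `total_cost` are made up for illustration. SQLite divides integers as integers, so the sketch multiplies by `1.0`; in MariaDB the `/` operator already returns a decimal.

```python
import sqlite3

# Sample rows from the question: (cost, columnA, columnB, target).
ROWS = [
    (250, "Foo", "Bar", "XYZ"),
    (200, "Foo", "Bar", "XYZ"),
    (150, "Bar", "Bar", "ABC"),
    (250, "Foo", "Bar", "ABC"),
]

SUBQUERY_SQL = """
SELECT
    SUM(cost) AS total_cost,
    (SELECT columnA FROM T GROUP BY columnA ORDER BY COUNT(*) DESC LIMIT 1) AS columnA_dominant,
    (SELECT columnB FROM T GROUP BY columnB ORDER BY COUNT(*) DESC LIMIT 1) AS columnB_dominant,
    -- 1.0 * forces real division in SQLite; MariaDB's / already yields a decimal.
    1.0 * (SELECT COUNT(*) FROM T GROUP BY columnA ORDER BY COUNT(*) DESC LIMIT 1)
        / COUNT(*) AS columnA_percentage,
    1.0 * (SELECT COUNT(*) FROM T GROUP BY columnB ORDER BY COUNT(*) DESC LIMIT 1)
        / COUNT(*) AS columnB_percentage
FROM T
"""


def dominant_with_subqueries():
    """Load the sample rows into an in-memory table and run the query once."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (cost INTEGER, columnA TEXT, columnB TEXT, target TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", ROWS)
    row = conn.execute(SUBQUERY_SQL).fetchone()
    conn.close()
    return row


print(dominant_with_subqueries())  # (850, 'Foo', 'Bar', 0.75, 1.0)
```

Foo appears in 3 of the 4 rows and Bar in all 4, which is where 0.75 and 1.0 come from. If two values tie for the highest count, `LIMIT 1` picks one of them arbitrarily unless a tie-breaking column is added to the `ORDER BY`.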

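If your version supports window functions (MySQL 8.0+ or MariaDB 10.2+), there is another approach that reduces the number of table scans and can perform better than the subqueries above: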

SELECT SUM(cost) OVER(),
       FIRST_VALUE(columnA) OVER (ORDER BY counter1 DESC) columnA_dominant,
       FIRST_VALUE(columnB) OVER (ORDER BY counter2 DESC) columnB_dominant,
       FIRST_VALUE(counter1) OVER (ORDER BY counter1 DESC) / COUNT(*) OVER() columnA_percentage,
       FIRST_VALUE(counter2) OVER (ORDER BY counter2 DESC) / COUNT(*) OVER() columnB_percentage
FROM (
  SELECT *,
         COUNT(*) OVER (PARTITION BY columnA) counter1,
         COUNT(*) OVER (PARTITION BY columnB) counter2  
  FROM T
) t1
LIMIT 1
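
The window-function query can be checked against the question's sample data in the same way. This is a sketch only: SQLite (via Python's `sqlite3`) stands in for MariaDB, it needs an SQLite build of 3.25 or newer for window functions, and the name `dominant_with_windows` is hypothetical. As with any SQLite test of this query, `1.0 *` is added to avoid integer division, which MariaDB does not need.

```python
import sqlite3

# Sample rows from the question: (cost, columnA, columnB, target).
ROWS = [
    (250, "Foo", "Bar", "XYZ"),
    (200, "Foo", "Bar", "XYZ"),
    (150, "Bar", "Bar", "ABC"),
    (250, "Foo", "Bar", "ABC"),
]

WINDOW_SQL = """
SELECT SUM(cost) OVER () AS total_cost,
       FIRST_VALUE(columnA) OVER (ORDER BY counter1 DESC) AS columnA_dominant,
       FIRST_VALUE(columnB) OVER (ORDER BY counter2 DESC) AS columnB_dominant,
       1.0 * FIRST_VALUE(counter1) OVER (ORDER BY counter1 DESC)
           / COUNT(*) OVER () AS columnA_percentage,
       1.0 * FIRST_VALUE(counter2) OVER (ORDER BY counter2 DESC)
           / COUNT(*) OVER () AS columnB_percentage
FROM (
    -- Attach to every row how often its columnA / columnB value occurs.
    SELECT *,
           COUNT(*) OVER (PARTITION BY columnA) AS counter1,
           COUNT(*) OVER (PARTITION BY columnB) AS counter2
    FROM T
) AS t1
LIMIT 1
"""


def dominant_with_windows():
    """Run the window-function query on an in-memory copy of the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (cost INTEGER, columnA TEXT, columnB TEXT, target TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", ROWS)
    row = conn.execute(WINDOW_SQL).fetchone()
    conn.close()
    return row


print(dominant_with_windows())  # (850, 'Foo', 'Bar', 0.75, 1.0)
```

Every output row carries the same five values, because each window spans the whole result set, so the final `LIMIT 1` simply keeps one copy.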


ggazkfy8  #2

Please try the following query:

select sum(cost) as total_cost, p.columnA, q.columnB, p.columnA_percentage, q.columnB_percentage
from get_common,
(
    -- most frequent columnA value and its share of all rows
    select columnA, columnA_percentage
    from (
        select columnA,
               count(columnA) as count_columnA,
               count(columnA) / (select count(columnA) from get_common) as columnA_percentage
        from get_common
        group by columnA
    ) s
    order by count_columnA desc
    limit 1
) p,
(
    -- most frequent columnB value and its share of all rows
    select columnB, columnB_percentage
    from (
        select columnB,
               count(columnB) as count_columnB,
               count(columnB) / (select count(columnB) from get_common) as columnB_percentage
        from get_common
        group by columnB
    ) t
    order by count_columnB desc
    limit 1
) q
group by p.columnA, q.columnB, p.columnA_percentage, q.columnB_percentage

So if you want the percentage and the dominant value, you have to write a query like this for each column:

select columnA, columnA_percentage
from (
    select columnA,
           count(columnA) as count_columnA,
           count(columnA) / (select count(columnA) from get_common) as columnA_percentage
    from get_common
    group by columnA
) s
order by count_columnA desc
limit 1

Then you can join it with the sum query to get all the values you need.
I hope this helps.
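
The "one query per column, then join with the sum query" idea can be sketched as follows. The sketch makes several assumptions: SQLite (via Python's `sqlite3`) stands in for MariaDB, the table is called `T` as in the first answer, and the function `dominant_with_joins` and its optional `target` argument are made up for illustration. It also assumes that the question's `WHERE target = ...` filter should apply to the dominant-value counts as well as to the total, so the same condition is repeated inside every derived table.

```python
import sqlite3

# Sample rows from the question: (cost, columnA, columnB, target).
ROWS = [
    (250, "Foo", "Bar", "XYZ"),
    (200, "Foo", "Bar", "XYZ"),
    (150, "Bar", "Bar", "ABC"),
    (250, "Foo", "Bar", "ABC"),
]

JOIN_SQL = """
SELECT s.total_cost,
       p.columnA AS columnA_dominant,
       q.columnB AS columnB_dominant,
       1.0 * p.cnt / s.n AS columnA_percentage,  -- 1.0 * only needed in SQLite
       1.0 * q.cnt / s.n AS columnB_percentage
FROM (
    -- the "sum query": total cost and row count
    SELECT SUM(cost) AS total_cost, COUNT(*) AS n
    FROM T
    WHERE :target IS NULL OR target = :target
) AS s
CROSS JOIN (
    -- most frequent columnA value with its count
    SELECT columnA, COUNT(*) AS cnt
    FROM T
    WHERE :target IS NULL OR target = :target
    GROUP BY columnA
    ORDER BY cnt DESC
    LIMIT 1
) AS p
CROSS JOIN (
    -- most frequent columnB value with its count
    SELECT columnB, COUNT(*) AS cnt
    FROM T
    WHERE :target IS NULL OR target = :target
    GROUP BY columnB
    ORDER BY cnt DESC
    LIMIT 1
) AS q
"""


def dominant_with_joins(target=None):
    """Return one summary row; target=None means no filter is applied."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE T (cost INTEGER, columnA TEXT, columnB TEXT, target TEXT)")
    conn.executemany("INSERT INTO T VALUES (?, ?, ?, ?)", ROWS)
    row = conn.execute(JOIN_SQL, {"target": target}).fetchone()
    conn.close()
    return row


print(dominant_with_joins())       # (850, 'Foo', 'Bar', 0.75, 1.0)
print(dominant_with_joins("XYZ"))  # (450, 'Foo', 'Bar', 1.0, 1.0)
```

Each derived table returns exactly one row, so the cross joins produce a single result row and no outer `GROUP BY` is needed. The `:target` named placeholder is `sqlite3` syntax; with a MariaDB client the filter value would be passed in whatever parameter style that driver uses.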
