MySQL LIMIT: need to return a random n% of the records from a table

x6yk4ghg  posted on 2021-06-24  in Mysql

I'm completely stuck. I have tried several approaches with no luck. I'm using MySQL and Spark SQL, and I have a simple select:

SELECT 
rdw.merchant_id, hash_id, transaction_ts 
FROM table_1 rdw
JOIN table_2 bl ON rdw.MERCHANT_ID = bl.MERCHANT_ID
WHERE brand = 'TBrand'
order by rand()
limit 36

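This works perfectly, but the problem is the limit 36. I need to select 30% of all the records, and I can't pass a number in, because as far as I know Spark can't handle variables. I tried to compute the limit as a percentage, but that didn't work. Any ideas how to do this? The problem is that I don't know how many records the select will return, which is why I can't set a fixed limit; it has to be a percentage of the records.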


l7mqbcuq  #1

If you can live with approximately 30%, you can simply do:

where brand = 'TBrand' and rand() <= 0.3
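
Dropped into the query from the question, that filter might look like the sketch below. It assumes `rand()` returns a uniform value in [0, 1), which is the case in both MySQL and Spark SQL. Each row is kept independently with probability 0.3, so the result size hovers around 30% of the matching rows rather than hitting it exactly, and no `order by` or `limit` is needed.

```sql
-- Sketch: approximate 30% sample of the joined rows.
-- Table and column names are taken from the question.
SELECT rdw.merchant_id, hash_id, transaction_ts
FROM table_1 rdw
JOIN table_2 bl ON rdw.MERCHANT_ID = bl.MERCHANT_ID
WHERE brand = 'TBrand'
  AND rand() <= 0.3   -- keep each row with probability 0.3
```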

If you want a closer approximation to 30%, you can do the following:

SELECT x.*
FROM (SELECT rdw.merchant_id, hash_id, transaction_ts,
             (@rn := @rn + 1) as rn
      FROM table_1 rdw JOIN
           table_2 bl
           ON rdw.MERCHANT_ID = bl.MERCHANT_ID CROSS JOIN
           (SELECT @rn := 0) params
      WHERE brand = 'TBrand'
      ORDER BY rand()
     ) x
WHERE rn <= @rn * 0.30

yvgpqqbh  #2

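SET @row_number = 0;
SELECT merchant_id, hash_id, transaction_ts
FROM (
    -- number every matching row; @row_number ends up holding the row count
    SELECT (@row_number := @row_number + 1) AS num,
           rdw.merchant_id, hash_id, transaction_ts
    FROM table_1 rdw
    JOIN table_2 bl ON rdw.MERCHANT_ID = bl.MERCHANT_ID
    WHERE brand = 'TBrand'
    ORDER BY rand()
) t  -- MySQL requires an alias on every derived table
WHERE num <= FLOOR((@row_number * 3) / 10);  -- keep 30% of the rows, rounded down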
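
Both variable-based queries depend on MySQL user variables (`@rn`, `@row_number`), which Spark SQL does not have. A possible alternative, assuming window functions are available (MySQL 8.0+ or Spark SQL), is to number the rows in random order and compare each number against the total count. The names `rn`, `total_rows` and the alias `t` are made up for this sketch.

```sql
-- Sketch: 30% of the matching rows, without user variables.
SELECT merchant_id, hash_id, transaction_ts
FROM (
    SELECT rdw.merchant_id, hash_id, transaction_ts,
           ROW_NUMBER() OVER (ORDER BY rand()) AS rn,   -- random position of the row
           COUNT(*) OVER ()                    AS total_rows  -- number of matching rows
    FROM table_1 rdw
    JOIN table_2 bl ON rdw.MERCHANT_ID = bl.MERCHANT_ID
    WHERE brand = 'TBrand'
) t
WHERE rn <= total_rows * 0.30
```

Because the count is computed inside the same statement, nothing has to be passed in from outside, which is the constraint described in the question.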
