在hive中查找十大趋势推文

klh5stk1 于 2021-06-28 发布在 Hive

关注(0)|答案(1)|浏览(438)

我发现在Hive的前10个趋势推特转发计数的基础上，即推特有最高的转发计数将是第一，以此类推。。。。
这是选举表的细节

id                      bigint                  from deserializer   
created_at              string                  from deserializer   
source                  string                  from deserializer   
favorited               boolean                 from deserializer   
retweeted_status        struct<text:string,user:struct<screen_name:string,name:string>,retweet_count:int>   from deserializer   
entities                struct<urls:array<struct<expanded_url:string>>,user_mentions:array<struct<screen_name:string,name:string>>,hashtags:array<struct<text:string>>> from deserializer   
text                    string                  from deserializer   
user                    struct<screen_name:string,name:string,friends_count:int,followers_count:int,statuses_count:int,verified:boolean,utc_offset:int,time_zone:string,location:string>    from deserializer   
in_reply_to_screen_name string                  from deserializer

我的问题

select text 
from election 
where retweeted_status.retweet_count IN  
     (select  retweeted_status.retweet_count as zz 
      from election  
      order by zz desc  
      limit 10);

它给了我10次同样的微博(推送，推送，推送。推特（）
所以当我运行内部查询时，我所做的就是打破嵌套查询

select  retweeted_status.retweet_count as zz 
from election  
order by zz desc  
limit 10

它返回10个不同的值（12101209120812071206，…1201）
之后当我运行外部查询时

select text 
from election  
where retweeted_status.retweet_count 
      IN  (1210,1209,1208,1207,1206,....1201 );

结果是相同的10条推文（推文，推文，推文。推特（）
我的查询逻辑有什么问题？

sql Hive hiveql twitter mysql-logic

来源：https://stackoverflow.com/questions/38650607/finding-top-10-trending-tweets-in-hive

1条答案

按热度按时间

sz81bmfz1#

不要使用count，你应该使用id，这是因为如果你有100条tweet的计数相同，不管限制10，你将得到100条记录。

select text 
from election 
where id  IN  
     (select  id as zz 
      from election  
      order by retweeted_status.retweet_count desc  
      limit 10);

但仍然不知道为什么你会得到错误的结果。
编辑（在我的评论之后）：
如果我的意见是正确的，那么你将有相同的身份证十次。在这种情况下，换成

(select distinct id as zz 
      from election  
      order by retweeted_status.retweet_count desc  
      limit 10);

展开查看全部

赞(0）回复(0）举报 2021-06-28

我来回答

在hive中查找十大趋势推文

1条答案

相关问题

热门标签

最新问答