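How can I find the word with the highest count in a word-count topology built with Trident? Here is the link to the Trident word-count topology: https://github.com/nathanmarz/storm-starter/blob/master/src/jvm/storm/starter/trident/tridentwordcount.java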
1 Answer
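The Trident API provides max and maxBy operations, which return the maximum value on each partition of a batch of tuples in a Trident stream. So, after computing the count for each word as follows: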
Stream wordCountsStream = topology.newStream("spout1", spout)
        .parallelismHint(16)
        .each(new Fields("sentence"), new Split(), new Fields("word"))
        .groupBy(new Fields("word"))
        .persistentAggregate(new MemoryMapState.Factory(), new Count(), new Fields("count"))
        .parallelismHint(16)
        .newValuesStream();
use maxBy to get the word with the largest count:
wordCountsStream.maxBy("count");
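Because maxBy works per partition within a batch, a stream with several partitions can yield several "winners" rather than a single global one. The sketch below is a plain-Java model of that behavior, not Storm code: the class MaxByModel, the WordCount type, and both helper methods are hypothetical names introduced only for illustration. It shows a per-partition maximum followed by a second maximum over the per-partition results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// Plain-Java model of per-partition maxBy semantics (hypothetical, no Storm dependency).
class MaxByModel {

    // Stands in for a Trident tuple with the fields "word" and "count".
    static final class WordCount {
        final String word;
        final long count;

        WordCount(String word, long count) {
            this.word = word;
            this.count = count;
        }
    }

    // Models maxBy on a single partition: the tuple with the largest count, if any.
    static Optional<WordCount> maxByCount(List<WordCount> partition) {
        return partition.stream().max(Comparator.comparingLong((WordCount wc) -> wc.count));
    }

    // Takes the maximum of each partition, then the maximum of those winners.
    static Optional<WordCount> globalMaxByCount(List<List<WordCount>> partitions) {
        List<WordCount> winners = new ArrayList<>();
        for (List<WordCount> partition : partitions) {
            maxByCount(partition).ifPresent(winners::add);
        }
        return maxByCount(winners);
    }

    public static void main(String[] args) {
        List<WordCount> partition0 = Arrays.asList(new WordCount("the", 5), new WordCount("cow", 2));
        List<WordCount> partition1 = Arrays.asList(new WordCount("moon", 7), new WordCount("jumped", 1));

        // Each partition reports its own maximum: "the" and "moon".
        System.out.println(maxByCount(partition0).get().word);
        System.out.println(maxByCount(partition1).get().word);

        // Combining the per-partition winners gives the overall maximum.
        WordCount best = globalMaxByCount(Arrays.asList(partition0, partition1)).get();
        System.out.println(best.word + " " + best.count); // prints "moon 7"
    }
}
```

In an actual topology, one possible way to get a single result per batch (an assumption, not something the answer states) would be to route all tuples to one partition, for example with the stream's global() repartitioning, before calling maxBy.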