Is there any in-memory MapReduce technique in open source?

Posted by yvgpqqbh on 2021-06-04 in Hadoop


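I am trying to find a technique that makes Hadoop faster. Is there an open-source "in-memory Hadoop MapReduce" technique like GridGain? For GridGain, I can only download an evaluation edition.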

hadoop

Source: https://stackoverflow.com/questions/20440198/is-there-any-inmemmory-mapreduce-techniques-in-opensource

1 answer


vsaztqbk1#

You may be looking for Apache Spark.

To run programs faster, Spark offers a general execution model 
that can optimize arbitrary operator graphs, and supports in-memory 
computing, which lets it query data faster than disk-based engines like Hadoop.

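The code looks a bit different, though, because Spark is mainly written for Scala. You don't write separate map and reduce functions; instead you declaratively build computational blocks, which makes Spark much more flexible than the earlier MapReduce.
Let's look at WordCount. The Java version is a bit verbose: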

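// `words` is assumed to be a JavaRDD<String> holding one word per element.
// With Spark 1.0 and later, pair transformations use mapToPair;
// earlier releases accepted a PairFunction in map.
// Pair every word with the count 1.
JavaPairRDD<String, Integer> ones = words.mapToPair(new PairFunction<String, String, Integer>() {
  public Tuple2<String, Integer> call(String s) {
    return new Tuple2<String, Integer>(s, 1);
  }
});

// Sum the counts of all pairs that share the same word.
JavaPairRDD<String, Integer> counts = ones.reduceByKey(new Function2<Integer, Integer, Integer>() {
  public Integer call(Integer i1, Integer i2) {
    return i1 + i2;
  }
});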

It would probably look nicer with Java 8 features.
In Scala it is more compact:

val file = spark.textFile("hdfs://...")
val counts = file.flatMap(line => line.split(" "))
                  .map(word => (word, 1))
                  .reduceByKey(_ + _)
counts.saveAsTextFile("hdfs://...")
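
To see what the flatMap / map / reduceByKey chain computes without setting up a cluster, here is a minimal sketch of the same pipeline in plain Java 8 streams. It is only a local, single-JVM analogue and does not use the Spark API; the class name LocalWordCount and the method countWords are made up for this illustration.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper class: a local analogue of the Spark word count,
// not part of Spark or Hadoop.
final class LocalWordCount {

    static Map<String, Integer> countWords(List<String> lines) {
        return lines.stream()
                // flatMap step: split every line into words
                .flatMap(line -> Arrays.stream(line.split(" ")))
                // drop empty tokens produced by repeated spaces
                .filter(word -> !word.isEmpty())
                // map step: pair each word with the count 1
                .map(word -> new SimpleEntry<String, Integer>(word, 1))
                // reduceByKey step: sum the counts per word;
                // a TreeMap keeps the result sorted by word
                .collect(Collectors.toMap(
                        Map.Entry::getKey,
                        Map.Entry::getValue,
                        Integer::sum,
                        TreeMap::new));
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("to be or not to be", "to do");
        // prints {be=2, do=1, not=1, or=1, to=3}
        System.out.println(countWords(lines));
    }
}
```

The lambdas here play the same role as the anonymous PairFunction and Function2 classes in the Java version above, which is roughly the kind of shortening Java 8 makes possible. Unlike Spark, this runs on a single in-memory list and is not distributed.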

