Why are there no reducers when running a 1 TB teragen?

kq4fsx7k  posted on 2021-06-01 in Hadoop

I am running the terasort benchmark for Hadoop with the following command:

  hadoop jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar teragen -Dmapreduce.job.maps=100 1t random-data

The following log was printed for the 100 map tasks:

  18/03/27 13:06:03 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
  18/03/27 13:06:04 INFO client.RMProxy: Connecting to ResourceManager at /127.0.0.1:8032
  18/03/27 13:06:05 INFO terasort.TeraSort: Generating -727379968 using 100
  18/03/27 13:06:05 INFO mapreduce.JobSubmitter: number of splits:100
  18/03/27 13:06:05 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1522131782827_0001
  18/03/27 13:06:06 INFO impl.YarnClientImpl: Submitted application application_1522131782827_0001
  18/03/27 13:06:06 INFO mapreduce.Job: The url to track the job: http://localhost:8088/proxy/application_1522131782827_0001/
  18/03/27 13:06:06 INFO mapreduce.Job: Running job: job_1522131782827_0001
  18/03/27 13:06:16 INFO mapreduce.Job: Job job_1522131782827_0001 running in uber mode : false
  18/03/27 13:06:16 INFO mapreduce.Job: map 0% reduce 0%
  18/03/27 13:06:29 INFO mapreduce.Job: map 2% reduce 0%
  18/03/27 13:06:31 INFO mapreduce.Job: map 3% reduce 0%
  18/03/27 13:06:32 INFO mapreduce.Job: map 5% reduce 0%
  ....
  18/03/27 13:09:27 INFO mapreduce.Job: map 100% reduce 0%

These are the final counters printed on the console:

  18/03/27 13:09:29 INFO mapreduce.Job: Counters: 30
      File System Counters
          FILE: Number of bytes read=0
          FILE: Number of bytes written=10660990
          FILE: Number of read operations=0
          FILE: Number of large read operations=0
          FILE: Number of write operations=0
          HDFS: Number of bytes read=8594
          HDFS: Number of bytes written=0
          HDFS: Number of read operations=400
          HDFS: Number of large read operations=0
          HDFS: Number of write operations=200
      Job Counters
          Launched map tasks=100
          Other local map tasks=100
          Total time spent by all maps in occupied slots (ms)=983560
          Total time spent by all reduces in occupied slots (ms)=0
          Total time spent by all map tasks (ms)=983560
          Total vcore-milliseconds taken by all map tasks=983560
          Total megabyte-milliseconds taken by all map tasks=1007165440
      Map-Reduce Framework
          Map input records=0
          Map output records=0
          Input split bytes=8594
          Spilled Records=0
          Failed Shuffles=0
          Merged Map outputs=0
          GC time elapsed (ms)=9746
          CPU time spent (ms)=0
          Physical memory (bytes) snapshot=0
          Virtual memory (bytes) snapshot=0
          Total committed heap usage (bytes)=11220811776
      File Input Format Counters
          Bytes Read=0
      File Output Format Counters
          Bytes Written=0

Below is the job scheduler output:

Can someone explain why there are no reduce tasks?

kcrjzv8t  #1

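No reduce tasks run when teragen executes. Here is what the documentation says:
"teragen will run map tasks to generate the data and will not run any reduce tasks. The default number of map tasks is defined by the "mapreduce.job.maps=2" parameter. Its only purpose here is to generate the 1 TB of random data in the following format: 10 bytes key | 2 bytes break | 32 bytes ascii/hex | 4 bytes break | 48 bytes filler | 4 bytes break | \r\n".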
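
The row format quoted above can be turned into a quick size check. The sketch below sums the listed field widths, works out how many rows a decimal terabyte would need (teragen's first argument is a row count, not a byte count), and checks what 10^12 becomes when wrapped into a signed 32-bit integer. The result matches the `Generating -727379968` line in the question's log, which suggests the `1t` suffix overflowed while being parsed; together with `Map output records=0` and `Bytes Written=0`, that would mean the job produced no data. That reading is an inference from the numbers, not something confirmed in this thread. The helper names are hypothetical, and treating "1 TB" as 10^12 bytes is an assumption.

```python
# Field widths in bytes, taken from the row format quoted above.
FIELD_WIDTHS = [10, 2, 32, 4, 48, 4]
ROW_BYTES = sum(FIELD_WIDTHS)


def rows_for(total_bytes: int, row_bytes: int = ROW_BYTES) -> int:
    """Number of whole rows that fit in total_bytes."""
    return total_bytes // row_bytes


def wrap_int32(value: int) -> int:
    """Wrap an integer the way signed 32-bit arithmetic does on overflow."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


if __name__ == "__main__":
    one_tb = 10**12  # assumption: "1 TB" means a decimal terabyte
    print("bytes per row:", ROW_BYTES)
    print("rows for 1 TB:", rows_for(one_tb))
    print("10^12 wrapped to int32:", wrap_int32(10**12))
```

If this reading is right, passing the row count for 1 TB as a plain number (10000000000) instead of a suffixed value may avoid the problem, though that has not been verified against this Hadoop version.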

aurhwmvo  #2

Your command shows that you are running teragen, not terasort. teragen only generates the data that terasort then consumes, so it needs no reducers.
Run terasort on the data you just generated:

  hadoop jar /Users/karan.verma/Documents/backups/h/hadoop-2.6.4/share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar terasort random-data terasort-output

You should then see reducers.
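
To illustrate why generation can be map-only while sorting cannot, here is a toy model in plain Python. It is not Hadoop's code: `split_rows`, `generate_map`, and `sort_job` are hypothetical names, and the key-range routing is a simplified stand-in for whatever partitioning the real job uses. Each generating task needs only its own row range, so nothing has to move between tasks. Sorting needs every key in a given range to end up in one place, which is the job of the reduce phase.

```python
import random
from typing import List, Tuple

KEY_SPACE = 1_000_000  # hypothetical key range for this toy model


def split_rows(total_rows: int, num_maps: int) -> List[Tuple[int, int]]:
    """Divide [0, total_rows) into num_maps contiguous (start, count) ranges."""
    base, extra = divmod(total_rows, num_maps)
    splits, start = [], 0
    for i in range(num_maps):
        count = base + (1 if i < extra else 0)
        splits.append((start, count))
        start += count
    return splits


def generate_map(start: int, count: int) -> List[Tuple[int, int]]:
    """Map-only step: emit (key, row_id) pairs using only this task's range."""
    rng = random.Random(start)  # seeded per split so runs are reproducible
    return [(rng.randrange(KEY_SPACE), start + i) for i in range(count)]


def sort_job(map_outputs, num_reducers: int):
    """Sort step: route pairs to reducers by key range, then sort each bucket."""
    buckets = [[] for _ in range(num_reducers)]
    for output in map_outputs:
        for key, row_id in output:
            buckets[key * num_reducers // KEY_SPACE].append((key, row_id))
    return [sorted(bucket) for bucket in buckets]


if __name__ == "__main__":
    outputs = [generate_map(s, c) for s, c in split_rows(1000, 4)]
    reduced = sort_job(outputs, num_reducers=3)
    merged = [pair for bucket in reduced for pair in bucket]
    print("rows generated:", sum(len(o) for o in outputs))
    print("globally sorted:", merged == sorted(merged))
```

Concatenating the reducer outputs in order gives a globally sorted result only because each reducer owns a distinct key range, and that cross-task routing is exactly what a generating job has no need for.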
