I have just started learning Hadoop. A book contains an example that I don't fully understand.
The example:
Consider processing 200 GB of data with 50 nodes, in which each node processes 4 GB
of data located on a local disk. Each node takes 80 seconds to read the data (at the
rate of 50 MB per second). No matter how fast we compute, we cannot finish in under 80
seconds. Assume that the result of the process is a total dataset of size 200 MB, and
each node generates 4 MB of this result, which is transferred over a 1 Gbps (1 MB per
packet) network to a single node for display. It will take about 3 milliseconds (each
1 MB requires 250 microseconds to transfer over the network, and the network latency
per packet is assumed to be 500 microseconds, based on the previously referenced talk
by Dr. Jeff Dean) to transfer the data to the destination node. Ignoring computational
costs, the total processing time cannot be under 40.003 seconds.
In the example above, I cannot work out how the transfer time of 3 milliseconds and the total processing time of 40.003 seconds are derived.
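Working through the quoted numbers (a sketch of the arithmetic, not an authoritative answer): each node's 4 MB result is sent as four 1 MB packets, and each packet costs 250 µs of transfer time plus 500 µs of latency, so 4 × 750 µs = 3 ms. Adding that to the quoted 80-second read time gives 80.003 s, so the book's "40.003 seconds" only follows if the read time were 40 s (e.g. 2 GB per node at 50 MB/s); that discrepancy appears to be in the excerpt itself.

```python
# Arithmetic from the quoted example, using only the numbers it states.

data_per_node_mb = 4_000          # 4 GB of local data per node
read_rate_mb_s = 50               # local disk read rate, MB/s
read_time_s = data_per_node_mb / read_rate_mb_s   # 80.0 seconds

packets_per_node = 4              # 4 MB result, 1 MB per packet
per_packet_us = 250 + 500         # transfer time + per-packet latency, microseconds
transfer_ms = packets_per_node * per_packet_us / 1000   # 3.0 ms

total_s = read_time_s + transfer_ms / 1000
print(read_time_s, transfer_ms, total_s)   # 80.0  3.0  80.003
```

With the quoted 80 s read, the lower bound comes out as 80.003 s rather than 40.003 s; the per-node transfer time of 3 ms matches either way.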