I am new to Hadoop and need help with the following scenario. Suppose there are 5 MapReduce (MR) jobs:
P1 (MR1 + MR2)  // MR1 and MR2 run sequentially
P2 (MR3 + MR4)  // MR3 and MR4 run sequentially
P1 and P2 run in parallel. The outputs of P1 and P2 are then joined by MR5.
How can I define this kind of complex workflow in Hadoop?
1 Answer
dsf9zpds1#
If you have more complex requirements, take a look at Oozie.
For simple requirements, you can use the dependency management in the job API.
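For the simple case, the dependency structure from the question can be sketched as follows. This is only an illustration under stated assumptions, not the Hadoop API itself: it uses the Java standard library (`CompletableFuture`) to show the ordering, and `runJob` is a hypothetical callback that runs one named job to completion and reports success. In a real driver it might wrap something like `Job.waitForCompletion(true)`. Hadoop also ships `JobControl` and `ControlledJob` (in `org.apache.hadoop.mapreduce.lib.jobcontrol`), which let you declare the same dependencies with `addDependingJob`.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Predicate;

class WorkflowSketch {

    // Runs the five jobs in dependency order and returns the order in which
    // they finished. `runJob` is a hypothetical stand-in for submitting one
    // MapReduce job and blocking until it completes (true = success).
    static List<String> runWorkflow(Predicate<String> runJob) {
        List<String> finished = new CopyOnWriteArrayList<>();

        // P1: MR1 then MR2, sequentially, on its own thread.
        CompletableFuture<Void> p1 = CompletableFuture.runAsync(() -> {
            step("MR1", runJob, finished);
            step("MR2", runJob, finished);
        });

        // P2: MR3 then MR4, sequentially, in parallel with P1.
        CompletableFuture<Void> p2 = CompletableFuture.runAsync(() -> {
            step("MR3", runJob, finished);
            step("MR4", runJob, finished);
        });

        // MR5 joins the outputs of P1 and P2, so wait for both pipelines.
        // join() throws CompletionException if either pipeline failed.
        CompletableFuture.allOf(p1, p2).join();
        step("MR5", runJob, finished);
        return finished;
    }

    // Runs one job and records it; a failure aborts the enclosing pipeline.
    private static void step(String name, Predicate<String> runJob, List<String> finished) {
        if (!runJob.test(name)) {
            throw new IllegalStateException(name + " failed");
        }
        finished.add(name);
    }

    public static void main(String[] args) {
        List<String> order = runWorkflow(name -> {
            System.out.println("running " + name);
            return true; // placeholder: pretend every job succeeds
        });
        System.out.println("finished: " + order);
    }
}
```

The relative order of the P1 and P2 jobs varies between runs, but MR1 always precedes MR2, MR3 always precedes MR4, and MR5 always runs last. Once the workflow needs retries, scheduling, or non-MapReduce steps, a workflow engine such as Oozie is likely the better fit.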