MapReduce: Opening the Door to Big Data Computing

beichenroot 2019-04-27 13:22

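Hadoop MapReduce is a framework for processing very large data sets (1 TB and up) in parallel across a cluster. The original MapReduce (MR) framework is built around a central JobTracker, and this old MR design has the following problems: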

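1. The JobTracker is a single point of failure.

2. The JobTracker does too much work and consumes too many resources. When there are many MR jobs, its memory overhead grows and so does the risk of a JobTracker fail, which is why the old MR framework is generally considered to top out at about 4000 nodes.

3. On the TaskTracker side, resources are represented only by the number of MR tasks, with no account taken of CPU or memory usage. If two memory-hungry tasks are scheduled onto the same TaskTracker, an OOM is likely.

4. On the TaskTracker side, resources are rigidly divided into Map task slots and Reduce task slots, so when the cluster has only Map tasks or only Reduce tasks, the other kind of slot sits idle and resources are wasted.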

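5. At the source-code level, the code is hard to read because individual classes do too many things, which makes bug fixes and version maintenance harder.

6. Operationally, any change to the MR framework forces a system-level upgrade of the whole cluster.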

 

    

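To fix these problems in the old MR framework, Hadoop rebuilt it starting with Hadoop 0.23.0. The new Hadoop MR framework is called MRv2, or YARN.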

 

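The basic idea is to split the JobTracker's two main duties, resource management and job scheduling and monitoring, into separate components: a global ResourceManager and a per-application ApplicationMaster.

The ResourceManager takes over the JobTracker's resource-management role. It has two main parts: the Scheduler and the Applications Manager (ASM).

The ApplicationMaster takes over the per-job work that the JobTracker used to do together with the TaskTrackers. Each application has its own ApplicationMaster, and the ApplicationMaster requests resources for its tasks and monitors them.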

 

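The NodeManager is YARN's agent on each node. It manages the containers on that node and monitors their resource usage (CPU, memory, disk, and network).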

 

That is how YARN restructures the old MR framework.

 

    

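An MR program consists of a Map phase and a Reduce phase, with an optional Combiner. MR works on (key, value) pairs whose types are serializable (*Writable) classes. WordCount is the classic MR example.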

 

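Map phase: the Map function processes each input record and emits intermediate pairs <k2, v2>.

Reduce phase: the Reduce function receives the grouped Map output <k3, v3>, where v3 is the collection of values for one key, and emits <k4, v4>. The <k3, v3> input is built from the <k2, v2> pairs.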
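The flow from <k2, v2> through <k3, v3> to <k4, v4> can be sketched with a WordCount written in plain Python. This is only an in-memory illustration, not Hadoop code: the names `map_fn`, `group_by_key`, `reduce_fn`, and `word_count` are hypothetical, and `group_by_key` stands in for the grouping that the framework performs between the two phases.

```python
from collections import defaultdict


def map_fn(line_no, line):
    """Map: one input record (line number, line text) -> list of <k2, v2> = (word, 1)."""
    return [(word, 1) for word in line.split()]


def group_by_key(pairs):
    """Stand-in for the framework: turn <k2, v2> pairs into <k3, v3>, with v3 a list."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    # Keys reach the Reduce function in sorted order.
    return sorted(grouped.items())


def reduce_fn(word, counts):
    """Reduce: <k3, v3> = (word, [1, 1, ...]) -> <k4, v4> = (word, total)."""
    return (word, sum(counts))


def word_count(lines):
    intermediate = []
    for line_no, line in enumerate(lines):
        intermediate.extend(map_fn(line_no, line))
    return [reduce_fn(word, counts) for word, counts in group_by_key(intermediate)]


if __name__ == "__main__":
    print(word_count(["hello world", "hello hadoop"]))
    # [('hadoop', 1), ('hello', 2), ('world', 1)]
```

In a real job the Map and Reduce calls run on different machines, and the grouping step is the Shuffle.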

 

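Hadoop MapReduce moves data through disk IO and network IO, so every record has to be serialized (Serialization). MR types such as IntWritable and Text implement the Writable interface, and any type that implements Writable can be used as an MR value; a key type must also be comparable, that is, implement WritableComparable.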
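The idea behind a Writable type is that the object knows how to write its own fields to a byte stream and read them back. The following Python sketch mimics that contract; `IntPairWritable`, `write`, and `read_fields` are hypothetical names chosen for illustration, and the encoding (two big-endian 32-bit integers) is an assumption of this sketch.

```python
import io
import struct


class IntPairWritable:
    """Hypothetical value type that serializes itself, in the spirit of Writable."""

    def __init__(self, first=0, second=0):
        self.first = first
        self.second = second

    def write(self, out):
        # Two big-endian signed 32-bit integers, 8 bytes in total.
        out.write(struct.pack(">ii", self.first, self.second))

    def read_fields(self, inp):
        # Overwrite this object's fields with the next 8 bytes of the stream.
        self.first, self.second = struct.unpack(">ii", inp.read(8))

    def compare_to(self, other):
        # Ordering needed only if the type is used as a key.
        a, b = (self.first, self.second), (other.first, other.second)
        return (a > b) - (a < b)


if __name__ == "__main__":
    buf = io.BytesIO()
    IntPairWritable(3, -7).write(buf)
    buf.seek(0)
    copy = IntPairWritable()
    copy.read_fields(buf)
    print(copy.first, copy.second)
```

Reusing one object and refilling it with `read_fields` is what lets a framework avoid allocating a new object for every record.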

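MR performance is bound by IO, both disk IO and network IO, and most of that IO happens in the MR Shuffle. A good Shuffle should:

* Pull data completely from the Map side to the Reduce side.

* Use as little network bandwidth as possible when pulling data across nodes.

* Keep the effect of disk IO on task execution as small as possible.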

The MR Shuffle consists of three operations: partition, sort, and merge.

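1. Partition: decide which Reduce task each Map output pair goes to.

MR provides the Partitioner interface, which uses the key (or value) and the number of Reduce tasks to decide which Reduce task handles a given pair. The default takes the hash of the key modulo the number of Reduce tasks, which spreads the load roughly evenly across Reduce tasks.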

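2. Spill: before it goes to the Reduce side, Mapper output is written to an in-memory buffer (100M by default). When the buffer reaches its threshold (80M by default), a separate thread writes its contents to disk; this is the Spill. During the spill the records are sorted by key (for text keys this is byte order, so A-Z sort before a-z). This is why a key type must implement WritableComparable.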

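3. Combine: to cut the amount of data sent from Map to Reduce, MR lets you set a Combiner (in effect a Reducer that runs on the Map side), which merges the values that share a key.

Use a Combiner with care. In WordCount, summing counts on the Map side does not change what Reduce computes, so a Combiner is safe there. In general, use a Combiner only when it cannot change the final result.

That is the Map-side Shuffle. Next comes the Reduce-side Shuffle.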
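The three Map-side steps (partition, spill with sort, combine) can be sketched in one small Python model. Everything here is an assumption made for illustration: `SpillBuffer`, `collect`, and `spill` are hypothetical names, the buffer counts records rather than bytes, the combiner is a plain two-argument function, and `java_string_hash` uses the formula of Java's `String.hashCode()` as a stand-in for whatever hash the real key type defines.

```python
from collections import defaultdict


def java_string_hash(s):
    """Formula of Java's String.hashCode(), kept to 32 bits (BMP characters only)."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def partition(key, num_reducers):
    """Hash partitioning: clear the sign bit, then take the remainder."""
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reducers


class SpillBuffer:
    """Toy map-side buffer that spills sorted, combined records at a threshold."""

    def __init__(self, capacity=100, spill_percent=80, num_reducers=2, combiner=None):
        self.threshold = capacity * spill_percent // 100  # records, not bytes
        self.num_reducers = num_reducers
        self.combiner = combiner
        self.records = []
        self.spills = []  # one dict per spill: partition -> sorted [(key, value)]

    def collect(self, key, value):
        self.records.append((partition(key, self.num_reducers), key, value))
        if len(self.records) >= self.threshold:
            self.spill()

    def spill(self):
        if not self.records:
            return
        # Sort by partition, then by key, so equal keys end up adjacent.
        self.records.sort(key=lambda r: (r[0], r[1]))
        out = defaultdict(list)
        for part, key, value in self.records:
            bucket = out[part]
            if self.combiner and bucket and bucket[-1][0] == key:
                bucket[-1] = (key, self.combiner(bucket[-1][1], value))
            else:
                bucket.append((key, value))
        self.spills.append(dict(out))
        self.records = []


if __name__ == "__main__":
    buf = SpillBuffer(capacity=5, num_reducers=3, combiner=lambda a, b: a + b)
    for word in ["b", "a", "b", "a"]:
        buf.collect(word, 1)
    print(buf.spills)
```

A sum works as a combiner because adding partial sums gives the same total. An average would not, since an average of partial averages generally differs from the overall average.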

 

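1. Copy: the Reduce side pulls its share of the Map output from each Map task that has finished.

2. Merge: the fetched data is merged, and a Combiner is applied if one is set. This Merge works like the one on the Map side, except that its inputs come from different Map tasks and its output feeds Reduce.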

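Data fetched from the Map side goes into an in-memory buffer on the Reduce side first. The size of this buffer is set relative to the JVM heap size: the Reducer does not run during the Shuffle, so most of the memory can be used by the Shuffle.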

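3. When the Merge finishes, the final merged file becomes the Reducer's input and the Shuffle is over. The Reducer then runs and writes its output to HDFS.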
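The Reduce-side Merge is essentially a k-way merge of runs that are already sorted by key. A minimal Python sketch, assuming each Map task's output for this Reduce task arrives as a key-sorted list of (key, value) pairs; `merge_map_outputs` and `reduce_side` are hypothetical names.

```python
import heapq
from itertools import groupby
from operator import itemgetter


def merge_map_outputs(sorted_runs):
    """Merge several key-sorted runs into one key-sorted stream (k-way merge)."""
    return heapq.merge(*sorted_runs, key=itemgetter(0))


def reduce_side(sorted_runs, reduce_fn):
    """Merge the runs, group values by key, and call reduce_fn once per key."""
    merged = merge_map_outputs(sorted_runs)
    return [
        reduce_fn(key, [value for _, value in group])
        for key, group in groupby(merged, key=itemgetter(0))
    ]


if __name__ == "__main__":
    runs = [
        [("a", 2), ("c", 1)],  # output fetched from one Map task
        [("a", 1), ("b", 4)],  # output fetched from another Map task
    ]
    print(reduce_side(runs, lambda key, values: (key, sum(values))))
    # [('a', 3), ('b', 4), ('c', 1)]
```

Because every run is already sorted, the merge never has to hold all records in memory at once, which is the reason the Map side sorts before spilling.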

The Shuffle is the core of MR. Understanding it makes it easier to write and tune the Mapper and the Reducer.
