hadoop - What is the complete list of streaming command-line options for Hadoop YARN (MRv2)?
Question
I was browsing the Hadoop website and found the following page on Hadoop streaming:
https://hadoop.apache.org/docs/current1/streaming.html
However, I am more interested in the streaming command-line options for Hadoop YARN (MRv2).
If someone has an exhaustive list, could you please post it here?
If no such list exists, can somebody tell me whether any of the command-line options in the following command are illegal?
yarn jar /opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar \
-D mapred.jab.name="Streaming wordCount Rating" \
-D mapreduce.job.output.key.comparator.class=org.apache.hadoop.mapreduce.lib.partition.KeyFieldBasedComparator \
-D map.output.key.field.separator=\t \
-D mapreduce.partition.keycomparator.options=-k2,2nr \
-D mapreduce.job.reduces=${NUM_REDUCERS} \
-files mapper2.py,reducer2.py \
-mapper "python mapper2.py" \
-reducer "python reducer2.py" \
-input ${OUT_DIR} \
-output ${OUT_DIR_2} > /dev/null
Solution
To see every Hadoop streaming command-line option, refer to the setupOptions() method in StreamJob.java, which registers them all:
allOptions = new Options().
addOption(input).
addOption(output).
addOption(mapper).
addOption(combiner).
addOption(reducer).
addOption(file).
addOption(dfs).
addOption(additionalconfspec).
addOption(inputformat).
addOption(outputformat).
addOption(partitioner).
addOption(numReduceTasks).
addOption(inputreader).
addOption(mapDebug).
addOption(reduceDebug).
addOption(jobconf).
addOption(cmdenv).
addOption(cacheFile).
addOption(cacheArchive).
addOption(io).
addOption(background).
addOption(verbose).
addOption(info).
addOption(debug).
addOption(help).
addOption(lazyOutput);
The MapReduce-related options (the -D properties) are general options that apply to all MapReduce applications, not just streaming. To check whether they are valid, look them up among the configuration variables in mapred-default.xml. Note that this refers to Hadoop 2.8.0, so you may need to find the corresponding XML file for your version of Hadoop.
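The mapred-default.xml lookup can be partly automated. The following is a minimal sketch, not an official tool: the function names (known_properties, d_option_names, unknown_properties) are made up for illustration, and SAMPLE_XML is a tiny stand-in for the real mapred-default.xml, which you would read from your own Hadoop distribution instead. It collects the property names passed with -D and reports the ones the XML does not declare.

```python
import xml.etree.ElementTree as ET


def known_properties(xml_text):
    """Return the set of property names declared in a Hadoop *-default.xml document."""
    root = ET.fromstring(xml_text)
    names = set()
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name:
            names.add(name.strip())
    return names


def d_option_names(args):
    """Collect property names from '-D name=value' (or '-Dname=value') arguments."""
    names = []
    for i, arg in enumerate(args):
        if arg == "-D" and i + 1 < len(args):
            names.append(args[i + 1].split("=", 1)[0])
        elif arg.startswith("-D") and len(arg) > 2:
            names.append(arg[2:].split("=", 1)[0])
    return names


def unknown_properties(args, xml_text):
    """List the -D property names that the given XML does not declare."""
    known = known_properties(xml_text)
    return [name for name in d_option_names(args) if name not in known]


# Stand-in for mapred-default.xml (assumption: replace with the real file's contents).
SAMPLE_XML = """<configuration>
  <property><name>mapreduce.job.reduces</name><value>1</value></property>
  <property><name>mapreduce.job.maps</name><value>2</value></property>
</configuration>"""

# A shortened version of the argument list from the question.
ARGS = [
    "-D", "mapred.jab.name=Streaming wordCount Rating",
    "-D", "mapreduce.job.reduces=4",
    "-mapper", "python mapper2.py",
]

print(unknown_properties(ARGS, SAMPLE_XML))
```

A name reported here is only a candidate for a closer look, not proof that the option is illegal: deprecated property names and properties defined in other default files will not appear in mapred-default.xml either. Since -D accepts any key, a misspelled name is more likely to be silently ignored than rejected.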