File: file:/C:/Python/HDFS/program1/mapper.py is not readable

Problem description

Launching a Hadoop job fails with the following error:

File: file:/C:/Python/HDFS/program1/mapper.py is not readable.

This is the command I ran:

hadoop jar C:\hadoop\share\hadoop\tools\lib\hadoop-streaming-3.2.0.jar -file C:/Python/HDFS/program1/mapper.py -file C:/Python/HDFS/program1/reducer.py -mapper "python mapper.py" -reducer "python reducer.py" -input /sample/input_word.txt -output /sample/owc1.txt

Here is the content of my mapper.py:

import sys

for line in sys.stdin:
    line = line.strip()
    words = line.split()
    for word in words:
        print ("%s\t%s" % (word, 1))

And here is the content of reducer.py:

import sys
import collections

counter = collections.Counter()

for line in sys.stdin:
    word, count = line.strip().split("\t", 1)

    counter[word] += int(count)

for x in counter.most_common(9999):
    # Emit word<TAB>count; passing "\t" as a separate print argument would pad it with spaces
    print("%s\t%s" % (x[0], x[1]))
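
To check the mapper and reducer logic separately from the Hadoop file-staging error, the two scripts can be approximated in a single local run. The following is only a minimal sketch under assumptions: the function names map_lines, reduce_records and run_local are hypothetical helpers, not part of Hadoop Streaming, and the sample text is made up. It does not reproduce or fix the "not readable" error; it only confirms that the word-count logic behaves as expected on in-memory input.

```python
import collections
import io


def map_lines(lines):
    """Yield one tab-separated 'word<TAB>1' record per word, mirroring mapper.py."""
    for line in lines:
        for word in line.strip().split():
            yield "%s\t%s" % (word, 1)


def reduce_records(records):
    """Sum the counts per word, mirroring reducer.py, and return a Counter."""
    counter = collections.Counter()
    for record in records:
        word, count = record.strip().split("\t", 1)
        counter[word] += int(count)
    return counter


def run_local(text):
    """Simulate map -> sort -> reduce on an in-memory string, without Hadoop."""
    mapped = sorted(map_lines(io.StringIO(text)))
    return reduce_records(mapped)


if __name__ == "__main__":
    sample = "hello world\nhello hadoop\n"  # made-up sample input
    for word, count in run_local(sample).most_common():
        print("%s\t%s" % (word, count))
```

If this local run produces sensible counts, the scripts themselves are probably fine and the failure is more likely related to how the local file paths are passed to the streaming job.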

I am using the following environment:

Java version: java version "1.8.0_291"; Hadoop version: Hadoop 3.2.0; OS: Windows 10

Tags: python, hadoop, hdfs

Solution

