python - Where are the application error logs?
Problem description
In anticipation of having to debug our Python code by looking for error messages in the log files, I have created a Hadoop Streaming job that throws an exception, but I can't locate the error message (or the stack trace).
Similar questions ("hadoop streaming: where are application logs?" and "hadoop streaming: how to see application logs?") use Python's logging module, which is not desirable here: Python already logs the error, so we shouldn't have to.
Here is the mapper code; we use Hadoop's built-in reducer, aggregate.
#!/usr/bin/python
import sys, re
import random

def main(argv):
    line = sys.stdin.readline()
    pattern = re.compile("[a-zA-Z][a-zA-Z0-9]*")
    try:
        while line:
            for word in pattern.findall(line):
                print "LongValueSum:" + word.lower() + "\t" + "1"
                x = 1 / random.randint(0,99)
            line = sys.stdin.readline()
    except "end of file":
        return None

if __name__ == "__main__":
    main(sys.argv)
The x = 1 / random.randint(0,99) line is supposed to raise a ZeroDivisionError, and indeed the job fails, but grepping the log files doesn't show the error. Is there a special flag we need to set somewhere?
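For context on where the message ends up: Python writes the traceback of an uncaught exception to stderr, not stdout, so it would appear in whatever log captures the task's stderr rather than in the job's output records. The following is a minimal local sketch of that behavior, not a Hadoop run. It is written for Python 3.7+ (the mapper above is Python 2), and MAPPER_SOURCE and run_mapper are hypothetical names used only for illustration.

```python
import subprocess
import sys
import textwrap

# Hypothetical stand-in for the mapper: emits one record, then divides by zero.
MAPPER_SOURCE = textwrap.dedent("""
    import sys
    sys.stdout.write("LongValueSum:hello\\t1\\n")
    sys.stdout.flush()
    x = 1 / 0  # always raises ZeroDivisionError
""")


def run_mapper(source, stdin_text=""):
    """Run `source` as a child Python process and capture both streams separately."""
    proc = subprocess.run(
        [sys.executable, "-c", source],
        input=stdin_text,
        capture_output=True,
        text=True,
    )
    return proc.returncode, proc.stdout, proc.stderr


returncode, out, err = run_mapper(MAPPER_SOURCE)
print("exit code:", returncode)
print("stdout:", repr(out))
print("traceback on stderr:", "ZeroDivisionError" in err)
```

The record written before the failure arrives on stdout, while the traceback arrives only on stderr, and the process exits with a non-zero status. That is consistent with a job that fails without the error showing up in logs that only hold the driver or stdout output.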
We have gone through the Google Dataproc documentation as well as the Hadoop Streaming documentation.
Solution
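When you run a Cloud Dataproc job, the job driver output is streamed to the GCP Console, displayed in the command terminal window (for jobs submitted from the command line), and stored in Cloud Storage; see "Accessing job driver output". You can also find it in Stackdriver under the log name dataproc.job.driver.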
You can also enable YARN container logs when you create the cluster and view them in Stackdriver; see the instructions.
In addition, the yarn-userlogs log in Stackdriver may also be useful.
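Once the container logs are reachable, a fixed marker in front of the traceback can make it easier to search for. The sketch below assumes that the task's stderr is what ends up in those logs. It does not use the logging module, and it re-raises the exception so the task still fails. MARKER and run_with_marker are hypothetical names, not part of Hadoop or Dataproc.

```python
import io
import sys
import traceback

# Hypothetical tag, chosen only so the log line is easy to search for.
MARKER = "MAPPER_FAILURE"


def run_with_marker(task, stream=sys.stderr):
    """Call task(); on failure, write a tagged traceback to `stream` and re-raise."""
    try:
        return task()
    except Exception:
        stream.write(MARKER + "\n" + traceback.format_exc())
        stream.flush()
        raise


# Local demonstration with an in-memory stream standing in for stderr.
buf = io.StringIO()
try:
    run_with_marker(lambda: 1 / 0, stream=buf)
except ZeroDivisionError:
    pass  # the exception still propagates; swallowed here only for the demo

print(buf.getvalue().splitlines()[0])
```

In a mapper, the call would be something like run_with_marker(lambda: main(sys.argv)), after which searching the container logs for the marker should land on the traceback.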