首页 > 解决方案 > 运行配置单元查询时失败 - 映射运算符初始化失败 - OOM

问题描述

我正在执行通过加入 3 个表创建的配置单元查询。但是我遇到了错误,即请让我知道这个错误是什么以及如何解决这个错误。

org.apache.hadoop.hive.ql.exec.tez.TezTask. Vertex failed, vertexName=Map 1, vertexId=vertex_1535088835132_0435_1_02, diagnostics=[Task failed, taskId=task_1535088835132_0435_1_02_000028, diagnostics=[TaskAttempt 0 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
    ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
    ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.io.Text.setCapacity(Text.java:268)
    at org.apache.hadoop.io.Text.set(Text.java:224)
    at org.apache.hadoop.io.Text.set(Text.java:214)
    at org.apache.hadoop.io.Text.<init>(Text.java:93)
    at org.apache.hadoop.hive.serde2.objectinspector.primitive.WritableStringObjectInspector.copyObject(WritableStringObjectInspector.java:36)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:311)
    at org.apache.hadoop.hive.serde2.objectinspector.ObjectInspectorUtils.copyToStandardObject(ObjectInspectorUtils.java:346)
    at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.read(MapJoinKeyObject.java:112)
    at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKey.read(MapJoinKey.java:68)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:127)
    at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
    ... 4 more
], TaskAttempt 1 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
    ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
    ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
    at java.util.Arrays.hashCode(Arrays.java:4146)
    at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
    at java.util.HashMap.hash(HashMap.java:339)
    at java.util.HashMap.put(HashMap.java:612)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
    at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
    ... 4 more
], TaskAttempt 2 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
    ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
    ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.math.BigDecimal.valueOf(BigDecimal.java:1202)
    at java.math.BigDecimal.createAndStripZerosToMatchScale(BigDecimal.java:4401)
    at java.math.BigDecimal.stripTrailingZeros(BigDecimal.java:2598)
    at org.apache.hadoop.hive.common.type.HiveDecimal.trim(HiveDecimal.java:238)
    at org.apache.hadoop.hive.common.type.HiveDecimal.normalize(HiveDecimal.java:252)
    at org.apache.hadoop.hive.common.type.HiveDecimal.create(HiveDecimal.java:71)
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
    at java.util.Arrays.hashCode(Arrays.java:4146)
    at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
    at java.util.HashMap.hash(HashMap.java:339)
    at java.util.HashMap.put(HashMap.java:612)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
    at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
    ... 4 more
], TaskAttempt 3 failed, info=[Error: Failure while running task:java.lang.RuntimeException: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:173)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.run(TezProcessor.java:139)
    at org.apache.tez.runtime.LogicalIOProcessorRuntimeTask.run(LogicalIOProcessorRuntimeTask.java:347)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:194)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable$1.run(TezTaskRunner.java:185)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1869)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:185)
    at org.apache.tez.runtime.task.TezTaskRunner$TaskRunnerCallable.callInternal(TezTaskRunner.java:181)
    at org.apache.tez.common.CallableWithNdc.call(CallableWithNdc.java:36)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.RuntimeException: Map operator initialization failed
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:262)
    at org.apache.hadoop.hive.ql.exec.tez.TezProcessor.initializeAndRunProcessor(TezProcessor.java:149)
    ... 14 more
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:389)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:379)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:482)
    at org.apache.hadoop.hive.ql.exec.Operator.initializeChildren(Operator.java:439)
    at org.apache.hadoop.hive.ql.exec.Operator.initialize(Operator.java:376)
    at org.apache.hadoop.hive.ql.exec.tez.MapRecordProcessor.init(MapRecordProcessor.java:247)
    ... 15 more
Caused by: java.util.concurrent.ExecutionException: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at java.util.concurrent.FutureTask.report(FutureTask.java:122)
    at java.util.concurrent.FutureTask.get(FutureTask.java:192)
    at org.apache.hadoop.hive.ql.exec.Operator.completeInitialization(Operator.java:387)
    ... 20 more
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
    at org.apache.hadoop.hive.common.type.HiveDecimal.create(HiveDecimal.java:71)
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.getHiveDecimal(HiveDecimalWritable.java:95)
    at org.apache.hadoop.hive.serde2.io.HiveDecimalWritable.hashCode(HiveDecimalWritable.java:167)
    at java.util.Arrays.hashCode(Arrays.java:4146)
    at org.apache.hadoop.hive.ql.exec.persistence.MapJoinKeyObject.hashCode(MapJoinKeyObject.java:83)
    at java.util.HashMap.hash(HashMap.java:339)
    at java.util.HashMap.put(HashMap.java:612)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.put(HashMapWrapper.java:107)
    at org.apache.hadoop.hive.ql.exec.persistence.HashMapWrapper.putRow(HashMapWrapper.java:131)
    at org.apache.hadoop.hive.ql.exec.tez.HashTableLoader.load(HashTableLoader.java:211)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator.loadHashTable(MapJoinOperator.java:310)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:179)
    at org.apache.hadoop.hive.ql.exec.MapJoinOperator$1.call(MapJoinOperator.java:175)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache.retrieve(ObjectCache.java:75)
    at org.apache.hadoop.hive.ql.exec.tez.ObjectCache$1.call(ObjectCache.java:92)
    ... 4 more
]], Vertex did not succeed due to OWN_TASK_FAILURE, failedTasks:1 killedTasks:59, Vertex vertex_1535088835132_0435_1_02 [Map 1] killed/failed due to:OWN_TASK_FAILURE]Vertex killed, vertexName=Reducer 2, vertexId=vertex_1535088835132_0435_1_03, diagnostics=[Vertex received Kill while in RUNNING state., Vertex did not succeed due to OTHER_VERTEX_FAILURE, failedTasks:0 killedTasks:51, Vertex vertex_1535088835132_0435_1_03 [Reducer 2] killed/failed due to:OTHER_VERTEX_FAILURE]DAG did not succeed due to VERTEX_FAILURE. failedVertices:1 killedVertices:1
          (less...)

标签: hiveout-of-memoryapache-tez

解决方案


根据提供的日志,您正在 Tez 上运行 map join,但加载 hashMap 失败。例外是java.lang.OutOfMemoryError: GC overhead limit exceeded

尝试增加映射器容器大小和 Java 堆大小。检查您当前的配置并相应增加。这只是示例:

set hive.tez.container.size=9216;
set hive.tez.java.opts=-Xmx6144m;

可以在此处找到包含有关这些配置参数说明的良好调整指南:Demystify Apache Tez Memory Tuning - Step by Step
如果没有任何帮助,可能是表太大而无法放入内存,请考虑减小 的值hive.auto.convert.join.noconditionaltask.size以强制查询使用为这个大尺寸的表随机加入。


推荐阅读