Oozie job invocation takes a long time at higher volume

Problem description

I am new to Oozie.

We have a workflow in which we invoke a preprocessing Python job, then a Spark job, then a postprocessing Python job.
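For readers unfamiliar with this layout, here is a minimal sketch of what such a three-step workflow definition might look like. It is not the actual workflow from the question: it assumes the Python steps run as Oozie shell actions, and the workflow name, script names, Spark class, jar path, and the `${entity}` and `${appPath}` parameters are all hypothetical.

```xml
<!-- Hypothetical workflow.xml: every name, path, and parameter here is an assumption -->
<workflow-app name="entity-pipeline" xmlns="uri:oozie:workflow:0.5">
    <start to="preprocess"/>

    <!-- Step 1: preprocessing Python script, assumed to run as a shell action -->
    <action name="preprocess">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>python</exec>
            <argument>preprocess.py</argument>
            <argument>${entity}</argument>
            <file>${appPath}/preprocess.py#preprocess.py</file>
        </shell>
        <ok to="spark-step"/>
        <error to="fail"/>
    </action>

    <!-- Step 2: Spark job (class and jar are placeholders) -->
    <action name="spark-step">
        <spark xmlns="uri:oozie:spark-action:0.1">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <master>yarn-cluster</master>
            <name>entity-spark-job</name>
            <class>com.example.EntityJob</class>
            <jar>${appPath}/lib/entity-job.jar</jar>
            <arg>${entity}</arg>
        </spark>
        <ok to="postprocess"/>
        <error to="fail"/>
    </action>

    <!-- Step 3: postprocessing Python script, again assumed to be a shell action -->
    <action name="postprocess">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <exec>python</exec>
            <argument>postprocess.py</argument>
            <argument>${entity}</argument>
            <file>${appPath}/postprocess.py#postprocess.py</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>

    <kill name="fail">
        <message>Workflow failed at [${wf:lastErrorNode()}]</message>
    </kill>
    <end name="end"/>
</workflow-app>
```

In a layout like this, each entity is submitted as its own workflow instance, so every instance starts three separate launcher jobs, one per action.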

If we invoke the workflow for a single entity, it is processed immediately.

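But as we keep increasing the number of entities, the invocation of each job starts taking a long time. The jobs themselves still run quickly once started; it is the invocation that is slow.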

Below is the Oozie configuration we are using:

<property>

<name>oozie.service.CallableQueueService.queue.size</name>

<value>10000</value>

<description>Max callable queue size</description>

</property>


<property>

<name>oozie.service.SchedulerService.threads</name>

<value>100</value>

<description>The number of threads to be used by the SchedulerService to run daemon tasks. If maxed out, scheduled daemon tasks will be queued up and delayed until threads become available.</description>

</property>


<property>

<name>oozie.service.CallableQueueService.threads</name>

<value>600</value>

<description>Number of threads used for executing callables</description>

</property>


<property>

<name>oozie.service.CallableQueueService.callable.concurrency</name>

<value>200</value>

<description>Maximum concurrency for a given callable type. Each command is a callable type (submit, start, run, signal, job, jobs, suspend, resume, etc). Each action type is a callable type (Map-Reduce, Pig, SSH, FS, sub-workflow, etc). All commands that use action executors (action-start, action-end, action-kill and action-check) use the action type as the callable type.</description>

</property>


<property>

<name>oozie.service.coord.normal.default.timeout</name>

<value>120</value>

<description>Default timeout for a coordinator action input check (in minutes) for normal job. -1 means infinite timeout</description>

</property>


<property>

<name>oozie.action.launcher.mapreduce.job.ubertask.enable</name>

<value>true</value>

</property>


<property>

<name>oozie.action.shell.launcher.mapreduce.job.ubertask.enable</name>

<value>true</value>

</property>

We have tried changing various values but have not seen any significant improvement. Any suggestions would be appreciated.

Tags: oozie, oozie-workflow

Solution

