首页 > 解决方案 > 通过根据每个任务的失败率概率对任务进行优先级排序来量化节省的总时间

问题描述

我正在尝试解决一个问题,我试图根据每个任务的失败率对作业中的任务进行优先级排序。例如:

Task   p(Failure)    TimeTaken(sec)
 A       0.7            10 
 B       0.1            15
 C       0.5             3
 D       0.3             5 

这是一系列任务,即使一个任务失败,整个作业也会失败。所以我想优先处理我的任务以节省最大的时间。为此,我尝试按失败概率的顺序运行任务。所以我目前执行任务的顺序是A,C,D and then B。我觉得我的方法的问题是我没有考虑时间因素。有没有更好的方法根据所考虑的时间来确定我的任务的优先级?

标签: optimizationstatisticsprobabilityschedulingjob-scheduling

解决方案


嗯,你的直觉是正确的。您首先想做的事情要么是具有真正高失败概率的事情(这样您就不会浪费大量时间并在以后失败)以及持续时间很短的事情,因此如果您确实失败了,您还没有将时间浪费在一项长期任务上。

这是一个查看所有可能序列的蛮力解决方案。它可以缩放。可能有更优雅的解决方案,甚至可能还有一些不会很快想到的数学模型。

无论如何。假设:所有失败都是相互独立的,失败在任务结束时发生(或被识别),并且失败发生。我们从概率论中知道,如果任务是独立的,那么 P{成功} 不依赖于任务的顺序,所以所有的顺序总体上也具有相同的失败可能性。只是丢失的时间根据顺序以及在顺序中发生的时间而有所不同。

下面的代码计算 x 的期望值,假设发生故障,其中 x 是浪费的时间,它只是可能的中间序列时间和发生概率的和积。

# scheduling with failures

from itertools import permutations as perm
from math import prod
from tabulate import tabulate

tasks = {   'A': (0.7, 10),   # (P{fail}, time)
            'B': (0.1, 15),
            'C': (0.5, 3),
            'D': (0.3, 5)}

# let's start with finding P{success}
p_s = prod( (1-tasks[k][0]) for k in tasks)
success_time = sum(tasks[k][1] for k in tasks)
print(f'the probability the sequence completes successfully is: {p_s:0.3f}')
print(f'when successful, the time is: {success_time}')

min_E_x = success_time  # upper bound on min_E_x:  The minimum expected value of x
best = None
sequences, vals = [], []
for seq in perm(tasks,len(tasks)) :  # all permutations
    E_x = 0 # the expected value of x for this sequence, where x = time wasted
    for i in range(len(seq)):
        p = tasks[seq[i]][0]                                        # p{fail on last}
        earlier = prod((1-tasks[seq[j]][0]) for j in range(i))      # p{all earlier events pass}
        if earlier:
            p *= earlier
        # get elapsed time for this sequence
        time = sum(tasks[seq[j]][1] for j in range(i+1))
        # normalize the probability (we know a failure has occurred)
        p = p/(1-p_s)
        E_x += p*time  # E[x] = sum of all p*x
        # print(seq[0:i+1], time, p, E_x)
    sequences.append(seq)
    vals.append(E_x)

    if E_x < min_E_x:
        best = seq
        min_E_x = E_x

print(f'\nThe best selection with minimal wasted time given a failure is:  {best} with E[wasted time]: {min_E_x:0.3f}\n')

print(tabulate(zip(sequences, vals), floatfmt=".2f", headers=['Sequence', 'E[x]|failure']))

产量

the probability the sequence completes successfully is: 0.095
when successful, the time is: 33

The best selection with minimal wasted time given a failure is:  ('C', 'A', 'D', 'B') with E[wasted time]: 7.959

Sequence                E[x]|failure
--------------------  --------------
('A', 'B', 'C', 'D')           14.21
('A', 'B', 'D', 'C')           14.69
('A', 'C', 'B', 'D')           11.82
('A', 'C', 'D', 'B')           11.16
('A', 'D', 'B', 'C')           13.36
('A', 'D', 'C', 'B')           11.69
('B', 'A', 'C', 'D')           24.70
('B', 'A', 'D', 'C')           25.18
('B', 'C', 'A', 'D')           21.82
('B', 'C', 'D', 'A')           22.07
('B', 'D', 'A', 'C')           25.67
('B', 'D', 'C', 'A')           23.66
('C', 'A', 'B', 'D')            8.62
('C', 'A', 'D', 'B')            7.96
('C', 'B', 'A', 'D')           13.87
('C', 'B', 'D', 'A')           14.12
('C', 'D', 'A', 'B')            8.23
('C', 'D', 'B', 'A')           11.91
('D', 'A', 'B', 'C')           13.91
('D', 'A', 'C', 'B')           12.24
('D', 'B', 'A', 'C')           21.26
('D', 'B', 'C', 'A')           19.24
('D', 'C', 'A', 'B')           10.00
('D', 'C', 'B', 'A')           13.67
[Finished in 0.1s]

推荐阅读