How to merge timestamp lists by speaker name, v.2

Problem description

I am working on a project in which I have extracted data from a list, and I now have 3 lists:
List 1 - speaker names

['<M1>', '<M1>', '<M1>', '<M1>', '<M1>', '<M2>', '<M2>', '<M2>', '<M1>', '<M1>', '<M2>', '<M1>', '<M2>', '<M2>', '<M2>', '<M2>', '<M2>']

List 2 - start timestamps of each utterance

['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:21.120]', '[00:01:46.130]', '[00:01:47.180]', '[00:01:49.390]', '[00:01:50.670]', '[00:02:02.320]', '[00:02:16.010]', '[00:02:21.110]', '[00:02:27.610]']

List 3 - end timestamps of each utterance

['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:20.250]', '[00:01:33.850]', '[00:01:47.150]', '[00:01:49.370]', '[00:01:50.140]', '[00:02:01.350]', '[00:02:16.010]', '[00:02:20.150]', '[00:02:27.610]', '[00:02:39.040]'] 

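Whenever a speaker talks several times in a row (for example, the first 5 elements of the lists), I need to change the end of the first segment from [00:00:08.010] to [00:00:48.100] and drop all the entries in between, so that the 5 entries for that one speaker become 1 entry. The same should then be done for every speaker in the lists. If a speaker talks only once, the entry must stay unchanged.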

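There is one more rule that I need to follow and do not know how to implement: if someone speaks 5 times in a row (as in the example at the beginning) and there is a gap of more than 0.5 seconds between the 4th and the 5th utterance, the result should be two entries, for example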

['<M1>', '<M1>']

the first covering 0.00 - 0.39 and the second covering 0.39 - 0.48 (if there is such a 0.5-second gap).

Can someone help me find a way to do this in Python? Thanks!

This is what I have written so far:

newSpeakerOrder = []
newSpeakerBegin = []
newSpeakerEnd   = []

currentspeaker = None
for i in range(len(speakers)):
    if currentspeaker != speakers[i]:
        if currentspeaker != None:
            newSpeakerEnd.append(end[i - 1])
        newSpeakerOrder.append(speakers[i])
        newSpeakerBegin.append(start[i])
        currentspeaker = speakers[i]
newSpeakerEnd.append(end[-1])

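However, it does not look at the times. When there is a gap of 0.5 seconds between two segments, it merges everything from the speaker's first utterance to the last one without splitting them. The result I want is

<M1> [00:00:00.000]     [00:00:39.980]
<M1> [00:00:40.600]     [00:00:48.100]

instead of

<M1> [00:00:00.000]    [00:00:48.100]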

Tags: python, arrays, python-3.x, algorithm, datetime

Solution


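To handle the 0.5-second gap within a segment, convert the start and end times to datetime objects and compare each start value with the previous end value. A new segment starts whenever the speaker changes or the gap is 0.5 seconds or more: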

from datetime import datetime

speakers = ['<M1>', '<M1>', '<M1>', '<M1>', '<M1>', '<M2>', '<M2>', '<M2>', '<M1>', '<M1>', '<M2>', '<M1>', '<M2>', '<M2>', '<M2>', '<M2>', '<M2>']
start = ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:21.120]', '[00:01:46.130]', '[00:01:47.180]', '[00:01:49.390]', '[00:01:50.670]', '[00:02:02.320]', '[00:02:16.010]', '[00:02:21.110]', '[00:02:27.610]']
end = ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]', '[00:00:48.100]', '[00:00:56.770]', '[00:01:08.010]', '[00:01:20.250]', '[00:01:33.850]', '[00:01:47.150]', '[00:01:49.370]', '[00:01:50.140]', '[00:02:01.350]', '[00:02:16.010]', '[00:02:20.150]', '[00:02:27.610]', '[00:02:39.040]'] 

# convert times to datetime objects
startt = [datetime.strptime(t, '[%H:%M:%S.%f]') for t in start]
endt = [datetime.strptime(t, '[%H:%M:%S.%f]') for t in end]

newSpeakerOrder = []
newSpeakerBegin = []
newSpeakerEnd   = []

currentspeaker = None
currentend = None  # end time of the previous entry; None before the first one
for i in range(len(speakers)):
    # start a new segment when the speaker changes or when the pause since
    # the previous entry is 0.5 seconds or more
    if currentspeaker != speakers[i] or (startt[i] - currentend).total_seconds() >= 0.5:
        if currentspeaker is not None:
            # close the previous segment with the end time of the entry before this one
            newSpeakerEnd.append(end[i - 1])
        newSpeakerOrder.append(speakers[i])
        newSpeakerBegin.append(start[i])
    currentspeaker = speakers[i]
    currentend = endt[i]
# close the final segment
newSpeakerEnd.append(end[-1])

print(list(zip(newSpeakerOrder, newSpeakerBegin, newSpeakerEnd)))

Output:

[
 ('<M1>', '[00:00:00.000]', '[00:00:16.290]'),
 ('<M1>', '[00:00:16.890]', '[00:00:48.100]'),
 ('<M2>', '[00:00:48.100]', '[00:01:20.250]'),
 ('<M1>', '[00:01:21.120]', '[00:01:33.850]'),
 ('<M1>', '[00:01:46.130]', '[00:01:47.150]'),
 ('<M2>', '[00:01:47.180]', '[00:01:49.370]'),
 ('<M1>', '[00:01:49.390]', '[00:01:50.140]'),
 ('<M2>', '[00:01:50.670]', '[00:02:01.350]'),
 ('<M2>', '[00:02:02.320]', '[00:02:20.150]'),
 ('<M2>', '[00:02:21.110]', '[00:02:39.040]')
]
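
The same idea can also be packaged as a reusable function. The following is only a minimal sketch and not part of the original answer: merge_segments, TIME_FORMAT and the max_gap parameter are hypothetical names chosen for illustration. It assumes the three lists have equal length and that a pause of exactly 0.5 seconds also starts a new segment, matching the >= comparison above.

```python
from datetime import datetime, timedelta

TIME_FORMAT = '[%H:%M:%S.%f]'  # timestamp layout used in the question


def merge_segments(speakers, start, end, max_gap=timedelta(seconds=0.5)):
    """Merge consecutive entries of the same speaker, unless the pause
    between two entries is max_gap or longer (hypothetical helper)."""
    merged = []          # each item is a mutable [speaker, begin, end] triple
    previous_end = None  # parsed end time of the previous entry
    for speaker, begin, finish in zip(speakers, start, end):
        begin_t = datetime.strptime(begin, TIME_FORMAT)
        same_speaker = bool(merged) and merged[-1][0] == speaker
        if same_speaker and begin_t - previous_end < max_gap:
            # short pause, same speaker: extend the open segment
            merged[-1][2] = finish
        else:
            # speaker changed or the pause is too long: open a new segment
            merged.append([speaker, begin, finish])
        previous_end = datetime.strptime(finish, TIME_FORMAT)
    return [tuple(item) for item in merged]


# Small usage example built from the first entries of the question's data
demo = merge_segments(
    ['<M1>', '<M1>', '<M1>', '<M2>'],
    ['[00:00:00.000]', '[00:00:08.010]', '[00:00:16.890]', '[00:00:26.210]'],
    ['[00:00:08.010]', '[00:00:16.290]', '[00:00:26.210]', '[00:00:39.980]'],
)
print(demo)
```

Passing a different max_gap, for example timedelta(seconds=1), changes the pause threshold without touching the loop. Keeping the timestamps as their original strings in the result and parsing them only for the comparison avoids having to format them back.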
