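python - How do I match an audio clip inside another audio clip using Python?
Problem description
I'm trying to use Librosa to find a short mp3 jingle inside a larger mp3 audio clip. However, I'm struggling to get it working and I don't know where to go next. Here is the code I have so far, based on this StackOverflow answer, although I'm open to detecting the position of the jingle with another method or library.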
import librosa
import numpy as np
from scipy.ndimage import generate_binary_structure, maximum_filter

# Load the audio as a waveform
# Store the sampling rate
JingleWave, JingleSR = librosa.load("short.mp3")
EpisodeWave, EpisodeSR = librosa.load("long.mp3")
# Magnitude spectrograms of the files
# I notice through debugging that the first dimension of these arrays is the same
# despite them being very different file lengths
# (axis 0 holds the frequency bins; the time frames are on axis 1)
JingleSpectogram = np.abs(librosa.stft(JingleWave))
EpisodeSpectogram = np.abs(librosa.stft(EpisodeWave))
# Define binary structure for the footprint
# This is the part that is most likely to be faulty, as I mostly did it because
# maximum filter requires a footprint
structure = generate_binary_structure(2, 1)
# Find local peaks to create constellation maps (2D images only containing peaks)
# A bin counts as a peak when it equals the maximum of its neighbourhood
JingleCM = maximum_filter(JingleSpectogram, footprint=structure) == JingleSpectogram
EpisodeCM = maximum_filter(EpisodeSpectogram, footprint=structure) == EpisodeSpectogram
# Get time frames of the constellation maps (axis 1, not axis 0)
JingleLength = JingleCM.shape[1]
EpisodeLength = EpisodeCM.shape[1]
# Keep track of what segments match the most
scores = []
# Compare audio to find matching audio
for offset in range(EpisodeLength - JingleLength + 1):
    # Slice along the time axis so the excerpt has the same shape as the jingle
    EpisodeExcerpt = EpisodeCM[:, offset:offset + JingleLength]
    score = np.sum(np.multiply(EpisodeExcerpt, JingleCM))
    scores.append(score)
# Find when the highest score happens
highestScore = max(scores)
# Convert score into the position (frame index) of where the jingle starts
print(scores.index(highestScore))
print(highestScore)
I'm only a beginner at programming, so any help is much appreciated.
Solution
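One possible way to do the sliding comparison from the question is sketched below. It is a minimal illustration under stated assumptions, not a definitive answer. It uses random arrays as stand-ins for the two magnitude spectrograms so that it runs without any audio files. The function name `find_best_offset`, the array sizes, and the planted offset are all hypothetical. The score is normalised (cosine similarity) instead of a raw sum of products, because a raw product tends to favour loud passages regardless of how well they match.

```python
import numpy as np


def find_best_offset(long_spec, short_spec):
    """Slide short_spec along the time axis of long_spec and score each offset.

    Both inputs are 2D arrays shaped (frequency_bins, time_frames), which is
    the layout librosa.stft returns. Returns (best_frame_offset, scores).
    """
    n_short = short_spec.shape[1]
    n_long = long_spec.shape[1]
    if n_short > n_long:
        raise ValueError("short_spec has more time frames than long_spec")

    short_norm = np.linalg.norm(short_spec)
    n_offsets = n_long - n_short + 1
    scores = np.empty(n_offsets)
    for offset in range(n_offsets):
        # Slice along axis 1 (time) so the window matches short_spec's shape
        window = long_spec[:, offset:offset + n_short]
        denom = np.linalg.norm(window) * short_norm
        # Cosine similarity between the window and the short clip
        scores[offset] = np.sum(window * short_spec) / denom if denom > 0 else 0.0
    return int(np.argmax(scores)), scores


# Synthetic stand-ins for np.abs(librosa.stft(...)) of each file (assumption)
rng = np.random.default_rng(0)
long_spec = rng.random((64, 400))
short_spec = rng.random((64, 30))
true_offset = 123  # hypothetical position where the short clip is planted
long_spec[:, true_offset:true_offset + short_spec.shape[1]] = short_spec

best_offset, scores = find_best_offset(long_spec, short_spec)

# Convert the frame index to seconds, assuming the librosa defaults
# (sr=22050 from librosa.load, hop_length=512 from librosa.stft)
sample_rate = 22050
hop_length = 512
start_seconds = best_offset * hop_length / sample_rate
print(best_offset, start_seconds)
```

With real audio, `long_spec` and `short_spec` would be replaced by the spectrograms of the episode and the jingle, and both files need to be loaded at the same sampling rate for the frames to line up.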