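python - Memory error when using mfcc from python_speech_features
Problem description
I am using mfcc from python_speech_features to extract features from wave files that are between 5 and 120 seconds long. For shorter files (around 10 to 20 seconds) the features are extracted without problems, but for longer files I get this error: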
---------------------------------------------------------------------------
MemoryError Traceback (most recent call last)
<ipython-input-6-ea3546938d03> in <module>
14 print("\n\tFeatures\n")
15 data, sampling_rate = librosa.load(sample_data[i])
---> 16 mfcc_features = mfcc(data,sampling_rate,winlen=30,nfft=66150)
17 print(pd.DataFrame(mfcc_features))
18 print("========================================\n")
~/anaconda3/lib/python3.8/site-packages/python_speech_features/base.py in mfcc(signal, samplerate, winlen, winstep, numcep, nfilt, nfft, lowfreq, highfreq, preemph, ceplifter, appendEnergy, winfunc)
26 :returns: A numpy array of size (NUMFRAMES by numcep) containing features. Each row holds 1 feature vector.
27 """
---> 28 feat,energy = fbank(signal,samplerate,winlen,winstep,nfilt,nfft,lowfreq,highfreq,preemph,winfunc)
29 feat = numpy.log(feat)
30 feat = dct(feat, type=2, axis=1, norm='ortho')[:,:numcep]
~/anaconda3/lib/python3.8/site-packages/python_speech_features/base.py in fbank(signal, samplerate, winlen, winstep, nfilt, nfft, lowfreq, highfreq, preemph, winfunc)
53 highfreq= highfreq or samplerate/2
54 signal = sigproc.preemphasis(signal,preemph)
---> 55 frames = sigproc.framesig(signal, winlen*samplerate, winstep*samplerate, winfunc)
56 pspec = sigproc.powspec(frames,nfft)
57 energy = numpy.sum(pspec,1) # this stores the total energy in each frame
~/anaconda3/lib/python3.8/site-packages/python_speech_features/sigproc.py in framesig(sig, frame_len, frame_step, winfunc)
33 padsignal = numpy.concatenate((sig,zeros))
34
---> 35 indices = numpy.tile(numpy.arange(0,frame_len),(numframes,1)) + numpy.tile(numpy.arange(0,numframes*frame_step,frame_step),(frame_len,1)).T
36 indices = numpy.array(indices,dtype=numpy.int32)
37 frames = padsignal[indices]
<__array_function__ internals> in tile(*args, **kwargs)
~/anaconda3/lib/python3.8/site-packages/numpy/lib/shape_base.py in tile(A, reps)
1256 for dim_in, nrep in zip(c.shape, tup):
1257 if nrep != 1:
-> 1258 c = c.reshape(-1, n).repeat(nrep, 0)
1259 n //= dim_in
1260 return c.reshape(shape_out)
MemoryError: Unable to allocate 12.8 GiB for an array with shape (2591, 661500) and data type int64
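The size in the error message can be checked by hand, which also hints at where the problem comes from. The sketch below only uses the shape and dtype printed in the message; the 22050 Hz rate is an assumption based on the default of librosa.load, since the code passes no sr argument.

```python
# Shape and dtype are copied from the MemoryError message above.
num_frames, frame_len = 2591, 661500
bytes_per_int64 = 8

total_bytes = num_frames * frame_len * bytes_per_int64
total_gib = total_bytes / 2**30
print(f"{total_gib:.1f} GiB")  # prints "12.8 GiB"

# Assumption: the audio was loaded at 22050 Hz (librosa.load's default).
# At that rate, 661500 samples per frame is a 30-second window,
# which matches winlen=30 in the failing call.
window_seconds = frame_len / 22050
print(window_seconds)
```

Each of the 2591 frames therefore spans 30 seconds of audio, so the index array grows with both the file length and the very large window.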
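Here is the code; I am running it in a Jupyter notebook. I tried it on a laptop with 8 GB of RAM, a PC with 32 GB of RAM, and a Google Colab compute engine with nearly 12 GB of RAM, but the error persists on all of them.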
import os

import IPython.display as ipd
import librosa
import pandas as pd
from python_speech_features import mfcc

print("\nSample Data:")
print("============\n")
path = 'speech-sample-data'
# Collect every .wav file under the sample directory.
sample_data = [os.path.join(dp, f) for dp, dn, filenames in os.walk(path) for f in filenames if os.path.splitext(f)[1] == '.wav']
for i in range(5):
    print("Speech: ")
    ipd.display(ipd.Audio(sample_data[i]))
    print("Type: \n\tNormal\n")
    print("\n\tFeatures\n")
    data, sampling_rate = librosa.load(sample_data[i])
    # This is the call that raises the MemoryError for the longer files.
    mfcc_features = mfcc(data,sampling_rate,winlen=30,nfft=66150)
    print(pd.DataFrame(mfcc_features))
    print("========================================\n")
    print("Speech: ")
    ipd.display(ipd.Audio(sample_data[i+5]))
    print("Type: \n\tToxic\n")
    print("\n\tFeatures\n")
    data, sampling_rate = librosa.load(sample_data[i+5])
    mfcc_features = mfcc(data,sampling_rate,winlen=30,nfft=66150)
    print(pd.DataFrame(mfcc_features))
    print("========================================\n")
Solution
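One likely cause, judging from the traceback: the winlen parameter of mfcc is the analysis window length in seconds (the documented default is 0.025), so winlen=30 asks for 30-second frames. Files shorter than the window collapse into a single frame, which would explain why the short files work, while longer files produce thousands of overlapping 30-second frames. The sketch below estimates the size of the frame index array for both settings and shows a call with a conventional window. The helper names (frame_count, index_array_gib, next_pow2, extract_mfcc) are hypothetical, the framing rule is an approximation of the one in sigproc.framesig, and the power-of-two nfft is a common convention rather than something the question specifies.

```python
import math


def round_half_up(x):
    """Round to the nearest integer, with .5 rounding up."""
    return int(math.floor(x + 0.5))


def frame_count(num_samples, sample_rate, winlen, winstep=0.01):
    """Return (number of frames, samples per frame) for a window and step in seconds."""
    frame_len = round_half_up(winlen * sample_rate)
    frame_step = round_half_up(winstep * sample_rate)
    if num_samples <= frame_len:
        # A signal shorter than the window yields one zero-padded frame.
        return 1, frame_len
    return 1 + math.ceil((num_samples - frame_len) / frame_step), frame_len


def index_array_gib(num_samples, sample_rate, winlen, winstep=0.01):
    """Approximate size in GiB of an int64 index array of shape (frames, frame_len)."""
    frames, frame_len = frame_count(num_samples, sample_rate, winlen, winstep)
    return frames * frame_len * 8 / 2**30


def next_pow2(n):
    """Smallest power of two that is >= n."""
    return 1 << (n - 1).bit_length()


def extract_mfcc(path, winlen=0.025, winstep=0.01):
    """Hypothetical wrapper: MFCCs with a window given in seconds."""
    # Third-party imports are kept local so the helpers above run without them.
    import librosa
    from python_speech_features import mfcc

    data, sampling_rate = librosa.load(path)
    # Assumption: choose an FFT size that covers one whole frame.
    nfft = next_pow2(round_half_up(winlen * sampling_rate))
    return mfcc(data, sampling_rate, winlen=winlen, winstep=winstep, nfft=nfft)


if __name__ == "__main__":
    sample_rate = 22050          # librosa.load's default rate
    num_samples = 60 * sample_rate  # a 60-second file
    for winlen in (30, 0.025):
        size = index_array_gib(num_samples, sample_rate, winlen)
        print(f"winlen={winlen}: about {size:.3f} GiB of frame indices")
```

With a 25 ms window the index array for a one-minute file stays in the tens of megabytes, so extract_mfcc(sample_data[i]) should fit on any of the machines mentioned. If one feature vector per long segment is really what is needed, averaging the short-window MFCC rows over each segment is a cheaper alternative to a 30-second window.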