Energy and RMSE
The energy ([Wikipedia](https://en.wikipedia.org/wiki/Energy_(signal_processing%29); FMP, p. 66) of a signal corresponds to the total magnitude of the signal. For audio signals, that roughly corresponds to how loud the signal is. The energy in a signal is defined as

$$ \sum_n \left| x(n) \right|^2 $$

The root-mean-square energy (RMSE) in a signal is defined as

$$ \sqrt{ \frac{1}{N} \sum_n \left| x(n) \right|^2 } $$

where $N$ is the number of samples in the signal.
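As a quick sanity check, both quantities can be computed by hand for a tiny signal: the energy is the sum of squared magnitudes, and the RMSE is the square root of the mean squared magnitude. The sketch below uses plain Python so the arithmetic stays visible; the helper names `signal_energy` and `signal_rmse` and the four-sample signal `toy` are made up for illustration and are not part of librosa.

```python
import math

def signal_energy(x):
    """Sum of squared magnitudes of the samples (hypothetical helper)."""
    return sum(abs(v) ** 2 for v in x)

def signal_rmse(x):
    """Square root of the mean squared magnitude (hypothetical helper)."""
    return math.sqrt(signal_energy(x) / len(x))

# Four samples of magnitude 0.5: each contributes 0.25 to the energy.
toy = [0.5, -0.5, 0.5, -0.5]
print(signal_energy(toy))  # 1.0
print(signal_rmse(toy))    # 0.5
```

Note that the energy grows with the length of the signal, while the RMSE is normalized by the number of samples, so the RMSE is easier to compare across frames of different sizes.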
Let's load a signal:
In [3]:
x, sr = librosa.load('audio/simple_loop.wav')
In [4]:
sr
Out[4]:
22050
In [5]:
x.shape
Out[5]:
(49613,)
In [6]:
librosa.get_duration(y=x, sr=sr)
Out[6]:
2.2500226757369615
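The duration reported above is just the number of samples divided by the sampling rate. A small check using the values printed in the previous cells (the variable names `n_samples` and `duration` are chosen here for illustration):

```python
# Values taken from the outputs above.
n_samples = 49613   # x.shape[0]
sr = 22050          # sampling rate in Hz

# Duration in seconds = samples / (samples per second).
duration = n_samples / sr
print(duration)
```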
Listen to the signal:
Plot the signal:
Out[8]:
<matplotlib.collections.PolyCollection at 0x10cd21cc0>
Compute the short-time energy using a list comprehension:
In [9]:
hop_length = 256
frame_length = 512
In [10]:
energy = numpy.array([
    sum(abs(x[i:i+frame_length]**2))
    for i in range(0, len(x), hop_length)
])
In [11]:
energy.shape
Out[11]:
(194,)
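The frame count of 194 follows from the loop bounds: one frame starts at every multiple of `hop_length` below `len(x)`. The sketch below reproduces that count from the sample count shown earlier and also shows that the final frame is shorter than `frame_length`, because the slice `x[i:i+frame_length]` is simply cut off at the end of the signal. The names `frame_starts`, `n_frames`, and `last_frame_size` are introduced here for illustration.

```python
import math

n_samples = 49613
hop_length = 256
frame_length = 512

# One frame starts at each multiple of hop_length below n_samples.
frame_starts = range(0, n_samples, hop_length)
n_frames = len(frame_starts)

# Equivalent closed form: ceil(n_samples / hop_length).
assert n_frames == math.ceil(n_samples / hop_length)

# The last slice runs past the end of the signal, so it is truncated.
last_frame_size = min(frame_length, n_samples - frame_starts[-1])

print(n_frames)         # 194
print(last_frame_size)  # 205
```

Because the trailing frames contain fewer samples, their energy is not directly comparable to that of full frames.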
Compute the RMSE using librosa.feature.rms (named librosa.feature.rmse in older librosa releases):
In [12]:
rmse = librosa.feature.rms(y=x, frame_length=frame_length, hop_length=hop_length, center=True)
In [13]:
rmse.shape
Out[13]:
(1, 194)
In [14]:
rmse = rmse[0]
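To make the link between the two features explicit, the RMSE of a frame is the square root of that frame's energy divided by the number of samples in the frame. The following is a minimal sketch of that relationship in plain Python. It is not librosa's implementation: it assumes no centering or padding, so its values will differ from the library's output near the signal edges. The name `short_time_rmse` and the toy signal `demo` are hypothetical.

```python
import math

def short_time_rmse(x, frame_length, hop_length):
    """Frame-wise RMSE without centering or padding (illustrative sketch)."""
    values = []
    for start in range(0, len(x), hop_length):
        frame = x[start:start + frame_length]
        frame_energy = sum(v * v for v in frame)
        # Normalize by the actual frame size, since trailing frames are shorter.
        values.append(math.sqrt(frame_energy / len(frame)))
    return values

# Four loud samples followed by four silent ones.
demo = [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
demo_rmse = short_time_rmse(demo, frame_length=4, hop_length=2)
print(demo_rmse)
```

The first frame is entirely loud, the second straddles the boundary, and the remaining frames are silent, so the RMSE falls from 1 to about 0.707 and then to 0.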
Plot both the energy and RMSE along with the waveform:
In [15]:
frames = range(len(energy))
t = librosa.frames_to_time(frames, sr=sr, hop_length=hop_length)
Out[16]:
<matplotlib.legend.Legend at 0x10cd54cc0>
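The time axis used for the plot comes from converting frame indices to seconds: frame `i` corresponds to sample `i * hop_length`, and dividing by the sampling rate gives a time. A minimal sketch of that conversion, assuming no additional offset is applied (the function name `frames_to_time_sketch` is made up here and is not a librosa function):

```python
def frames_to_time_sketch(frames, sr, hop_length):
    """Map each frame index to the time in seconds of sample frame * hop_length."""
    return [frame * hop_length / sr for frame in frames]

# The first few frame times for the settings used in this notebook.
t_sketch = frames_to_time_sketch(range(4), sr=22050, hop_length=256)
print(t_sketch)
```

With `hop_length = 256` and `sr = 22050`, consecutive frames are roughly 11.6 milliseconds apart.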
Questions
Write a function, strip, that removes leading silence from a signal. Make sure it works for a variety of signals recorded in different environments and with different signal-to-noise ratios (SNR).
In [17]:
def strip(x, frame_length, hop_length):
    # Compute RMSE (librosa.feature.rms was named librosa.feature.rmse in older releases).
    rmse = librosa.feature.rms(y=x, frame_length=frame_length, hop_length=hop_length, center=True)
    # Identify the first frame index where RMSE exceeds a threshold.
    thresh = 0.01
    frame_index = 0
    # The bounds check keeps the loop from running past the last frame
    # when the whole signal stays below the threshold.
    while frame_index < rmse.shape[1] and rmse[0][frame_index] < thresh:
        frame_index += 1
    # Convert units of frames to samples.
    start_sample_index = librosa.frames_to_samples(frame_index, hop_length=hop_length)
    # Return the trimmed signal.
    return x[start_sample_index:]
Let's see if it works.
In [18]:
y = strip(x, frame_length, hop_length)
Out[20]:
<matplotlib.collections.PolyCollection at 0x10ce20128>
It worked!
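The fixed threshold of 0.01 assumes a particular recording level, so it may fail on quieter recordings or on recordings with a louder noise floor. One possible variation, sketched below under stated assumptions, sets the threshold relative to the loudest frame instead. This is a plain-Python illustration, not the notebook's method and not a librosa function: the names `frame_rms` and `strip_relative` and the default `ratio=0.1` are hypothetical, and the framing uses no centering or padding.

```python
import math

def frame_rms(x, frame_length, hop_length):
    """Frame-wise RMS without centering or padding (illustrative sketch)."""
    values = []
    for start in range(0, len(x), hop_length):
        frame = x[start:start + frame_length]
        values.append(math.sqrt(sum(v * v for v in frame) / len(frame)))
    return values

def strip_relative(x, frame_length, hop_length, ratio=0.1):
    """Drop leading frames whose RMS is below ratio * (peak frame RMS)."""
    rms = frame_rms(x, frame_length, hop_length)
    peak = max(rms) if rms else 0.0
    if peak == 0.0:
        # Empty or completely silent input: nothing worth keeping.
        return x[:0]
    thresh = ratio * peak
    for index, value in enumerate(rms):
        if value >= thresh:
            # Convert the frame index back to a sample index.
            return x[index * hop_length:]
    return x[:0]

# Low-level noise followed by a loud section.
noisy = [0.001, -0.001] * 4 + [1.0, -1.0] * 4
trimmed = strip_relative(noisy, frame_length=4, hop_length=2)
print(len(noisy), len(trimmed))
```

A relative threshold adapts to the overall level of the recording, but it still cannot separate signal from noise when the two have similar energy. For real use, librosa also provides `librosa.effects.trim`, which removes leading and trailing silence based on a `top_db` setting.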