Librosa Constant Q Transform (CQT) contains defects at the beginning and ending of the spectrogram

By : user3099746
Date : January 11 2021, 03:34 PM
I think you might want to try out pad_mode, which is supported in cqt. If you check out the np.pad documentation, you can see the available options (or see the end of this post). With the wrap option, you get a result like this, though I suspect the phase is a mess, so you should make sure this meets your needs. If you are always generating your own signal, you could try supplying a custom padding function (np.pad also accepts a callable as its mode) instead of one of the available options.
code :
import numpy as np
import matplotlib.pyplot as plt
from librosa import cqt

s = np.linspace(0,1,44100)
x = np.sin(2*np.pi*1000*s)

fmin = 500  # assumed value; fmin was not defined in the original snippet
cq_lib = cqt(x, sr=44100, fmin=fmin, n_bins=40, pad_mode='wrap')

plt.imshow(abs(cq_lib),aspect='auto', origin='lower')
plt.xlabel('Time Steps')
plt.ylabel('Freq bins')
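
For reference, here is how the np.pad modes mentioned above behave on a tiny array. This is only a minimal sketch: the array and pad width are made up for illustration, and it calls np.pad directly rather than going through cqt.

```python
import numpy as np

signal = np.array([1, 2, 3])  # toy signal, chosen only for illustration
pad_width = 2                 # samples added on each side

# Pad the same signal with several of the modes listed in the np.pad documentation.
padded = {mode: np.pad(signal, pad_width, mode=mode)
          for mode in ("constant", "edge", "reflect", "symmetric", "wrap")}

for mode, values in padded.items():
    print(f"{mode:>9}: {values}")
```

With wrap, the start of the signal is preceded by a copy of its end, which is why it suits a periodic test tone but may not suit arbitrary audio.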

Using Librosa to plot a mel-spectrogram

By : z6gcnm
Date : March 29 2020, 07:55 AM
Is your question mainly about how to save it as a JPG? If you just want to display the picture, you only need to add one line of code: plt.show()
If you want to save a JPG with no axes and no white border:
code :
import matplotlib
matplotlib.use('Agg')  # non-interactive backend: render to a file without opening a window
import pylab
import librosa
import librosa.display
import numpy as np

sig, fs = librosa.load('path_to_my_wav_file')
# output image file name
save_path = 'test.jpg'

pylab.axis('off') # no axis
pylab.axes([0., 0., 1., 1.], frameon=False, xticks=[], yticks=[]) # Remove the white edge
S = librosa.feature.melspectrogram(y=sig, sr=fs)
librosa.display.specshow(librosa.power_to_db(S, ref=np.max))
pylab.savefig(save_path, bbox_inches=None, pad_inches=0)

Why does the spectrogram from the librosa library have twice the time duration of the actual audio track?

By : user1468855
Date : March 29 2020, 07:55 AM
You need to pass the sampling rate to librosa.display.specshow (sr=self.SamplingFrequency). Otherwise it defaults to 22050, and if self.SamplingFrequency has a different value, the time axis will show the wrong duration.
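
The factor of two can be checked with a little arithmetic. The sketch below assumes the audio was sampled at 44100 Hz and analysed with a hop length of 512; both numbers are hypothetical, chosen because they are consistent with a doubled duration.

```python
def displayed_duration(n_frames, hop_length, sr):
    """Seconds spanned by n_frames spectrogram columns at sampling rate sr."""
    return n_frames * hop_length / sr

true_sr = 44100      # assumed sampling rate of the audio (hypothetical)
default_sr = 22050   # rate the plot assumes when sr is not passed
hop_length = 512     # assumed hop length (hypothetical)
n_frames = 862       # roughly 10 s of audio at 44100 Hz with this hop

correct = displayed_duration(n_frames, hop_length, true_sr)
wrong = displayed_duration(n_frames, hop_length, default_sr)
print(f"correct: {correct:.2f} s, without sr: {wrong:.2f} s, ratio: {wrong / correct:.1f}")
```

The same frames are mapped to twice as many seconds when the plot assumes half the true sampling rate.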

Difference between output of python librosa.core.stft() and matlab spectrogram(x)

By : Stephen Ward
Date : March 29 2020, 07:55 AM
After a long time, and an unsatisfied bounty, I have found the answer myself.
The MATLAB function spectrogram() outputs a vector of times that correspond to the middle of each window, while omitting the last window. For example, a signal 10 samples long with a 3-sample window and a 1-sample overlap results in 4 windows, covering samples 1-3, 3-5, 5-7 and 7-9.
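
The framing in the example above (10 samples, 3-sample window, 1-sample overlap) can be sketched in a few lines. This is an illustration of the windowing arithmetic only, not MATLAB's or librosa's actual implementation; taking the midpoint as start + window / 2 is an assumption based on the description "middle of each window".

```python
n_samples = 10   # signal length from the example above
window = 3       # window length in samples
overlap = 1      # samples shared by consecutive windows
hop = window - overlap

# Keep only windows that fit entirely inside the signal (no padding).
starts = range(0, n_samples - window + 1, hop)

# 1-based sample indices covered by each window.
windows = [list(range(s + 1, s + window + 1)) for s in starts]

# Window midpoints, measured in samples from the start of the signal.
centers = [s + window / 2 for s in starts]

print(windows)
print(centers)
```

Sample 10 never gets a window of its own, because a fifth window would need samples 9 to 11.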

Convert spectrogram to audio using librosa functions

By : user3197429
Date : March 29 2020, 07:55 AM
Your code works for me without error. I recommend reinstalling the latest version of librosa in a clean miniconda environment:
code :
conda install -c conda-forge librosa

How can I save a Librosa spectrogram plot as a specific sized image?

By : Mike Erick
Date : January 02 2021, 06:48 AM
Plots are for humans to look at, and contain things like axis markers and labels that are not useful for machine learning. To feed a model with an 'image' of the spectrogram, one should output only the data. This data can be stored in any format, but if you want to use a standard image format, you should use PNG. Lossy compression such as JPEG introduces compression artifacts.
Here follows working example code to save the spectrogram. Note that to get a fixed-size image output, the code extracts a fixed-length window of the audio signal. Dividing an audio stream into such fixed-length analysis windows is standard practice.
code :
import librosa
import numpy
import skimage.io  # the io submodule must be imported explicitly for skimage.io.imsave

def scale_minmax(X, min=0.0, max=1.0):
    X_std = (X - X.min()) / (X.max() - X.min())
    X_scaled = X_std * (max - min) + min
    return X_scaled

def spectrogram_image(y, sr, out, hop_length, n_mels):
    # use log-melspectrogram
    mels = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=n_mels,
                                            n_fft=hop_length*2, hop_length=hop_length)
    mels = numpy.log(mels + 1e-9) # add small number to avoid log(0)

    # min-max scale to fit inside 8-bit range
    img = scale_minmax(mels, 0, 255).astype(numpy.uint8)
    img = numpy.flip(img, axis=0) # put low frequencies at the bottom in image
    img = 255-img # invert. make black==more energy

    # save as PNG
    skimage.io.imsave(out, img)

if __name__ == '__main__':
    # settings
    hop_length = 512 # number of samples per time-step in spectrogram
    n_mels = 128 # number of bins in spectrogram. Height of image
    time_steps = 384 # number of time-steps. Width of image

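    # load audio. Using an example recording bundled with librosa
    # (librosa.util.example_audio_file() was removed in newer releases; librosa.example() needs librosa >= 0.8)
    path = librosa.example('nutcracker')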
    y, sr = librosa.load(path, offset=1.0, duration=10.0, sr=22050)
    out = 'out.png'

    # extract a fixed length window
    start_sample = 0 # starting at beginning
    length_samples = time_steps*hop_length
    window = y[start_sample:start_sample+length_samples]

    # convert to PNG
    spectrogram_image(window, sr=sr, out=out, hop_length=hop_length, n_mels=n_mels)
    print('wrote file', out)
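
To see what the scale, flip and invert steps do to the pixel values, here is a minimal sketch on a made-up 2x2 array standing in for the log-mel spectrogram. The input values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import numpy as np

def scale_minmax(X, min=0.0, max=1.0):
    X_std = (X - X.min()) / (X.max() - X.min())
    return X_std * (max - min) + min

# Toy "spectrogram": rows are frequency bins (low to high), columns are time steps.
mels = np.array([[1.0, 2.0],
                 [3.0, 5.0]])

img = scale_minmax(mels, 0, 255).astype(np.uint8)  # float values are truncated, not rounded
img = np.flip(img, axis=0)                          # highest frequency bin becomes the top row
img = 255 - img                                     # largest value becomes 0 (black)
print(img)
```

Note that scale_minmax divides by X.max() - X.min(), so a completely constant input (for example pure digital silence) would divide by zero; a guard for that case may be worth adding in real use.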
