Time-Frequency Transforms
TODO: make this more self-contained (for instance, by defining a super class for all the possible time-frequency transforms)
TFTransform is the Time-Frequency Transform base class. All the TF representations sub-classing it should implement the following methods:
Computes the transform on the provided data. Sub-classes should re-implement this method and store the result in the attribute TFTransform.transfo.
Computes the inverse transform from the transform stored in TFTransform.transfo.
TFTransform.transfo receives the transform when computeTransform is called.
A convenience dictionary, with abbreviated names for the transforms.
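The contract described above can be illustrated with a minimal sketch. Only `TFTransform`, `computeTransform` and `transfo` come from this documentation; the inverse method name `invertTransform`, the toy subclass `PlainFFT` and the dictionary key `'fft'` are hypothetical and chosen only for the example.

```python
import numpy as np


class TFTransform:
    """Minimal stand-in for the base class described above."""

    def __init__(self):
        # Receives the transform when computeTransform is called.
        self.transfo = None

    def computeTransform(self, data):
        raise NotImplementedError

    def invertTransform(self):  # hypothetical name for the inverse method
        raise NotImplementedError


class PlainFFT(TFTransform):
    """Toy subclass (hypothetical): one FFT over the whole signal."""

    def computeTransform(self, data):
        self.length = len(data)
        self.transfo = np.fft.rfft(data)
        return self.transfo

    def invertTransform(self):
        return np.fft.irfft(self.transfo, n=self.length)


# Hypothetical convenience dictionary: abbreviated name -> transform class.
transforms = {'fft': PlainFFT}

signal = np.sin(2 * np.pi * 5 * np.arange(64) / 64)
tft = transforms['fft']()
tft.computeTransform(signal)
recovered = tft.invertTransform()
```

Looking a class up by its abbreviated name keeps calling code independent of which representation is used, as long as each class honours the same two methods.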
Object that implements the computation of the Short-Time Fourier Transform (STFT) and its inverse.
Sequentially compute the Fourier transform, filter, and overlap-add.
INPUTS
- W: M x F x N (or M x F) filter for the data, which should be single channel
- data: T (number of samples, number of channels)
...
Sequentially compute the Fourier transform, filter, and overlap-add.
W is the M x M x F x N filter for the data, which should be T x M (number of samples, number of channels).
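The filtering step that sits between the forward transform and the overlap-add can be sketched as a per-bin matrix product. This is a sketch under assumptions: the spectrogram `X` is laid out as M x F x N (channels, frequency bins, frames), and the first index of `W` is taken to be the output channel; the actual index order used by the method may differ.

```python
import numpy as np

M, F, N = 2, 5, 4  # channels, frequency bins, frames
rng = np.random.default_rng(0)
X = rng.standard_normal((M, F, N)) + 1j * rng.standard_normal((M, F, N))

# Identity filter: every channel is passed through unchanged.
W = np.zeros((M, M, F, N))
for m in range(M):
    W[m, m] = 1.0

# Y[m, f, n] = sum over k of W[m, k, f, n] * X[k, f, n]
Y = np.einsum('mkfn,kfn->mfn', W, X)

# A filter that swaps the two channels in every time-frequency bin.
W_swap = np.zeros((M, M, F, N))
W_swap[0, 1] = 1.0
W_swap[1, 0] = 1.0
Y_swap = np.einsum('mkfn,kfn->mfn', W_swap, X)
```

The filtered spectrogram would then be passed to the inverse transform, frame by frame, to obtain the time-domain output.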
data = istft(X,window=sinebell(2048),hopsize=1024.0,nfft=2048.0,fs=44100)
Computes an inverse of the short-time Fourier transform (STFT). The inverse is implemented with the overlap-add procedure.
Computes the short-time Fourier transform (STFT) of data.
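The forward transform and its overlap-add inverse can be sketched as follows. This is not the module's implementation: the functions below are simplified re-implementations with their own signatures, `sine_window` is a hypothetical stand-in for the `sinebell` window, the zero-padding scheme is an assumption, and the hop size is an integer number of samples.

```python
import numpy as np


def sine_window(n):
    """Sine window (assumed similar in spirit to sinebell)."""
    return np.sin(np.pi * (np.arange(n) + 0.5) / n)


def stft(data, window, hopsize):
    """Return an F x N array of windowed FFT frames."""
    nwin = len(window)
    # Zero-pad so that every sample is covered by at least one frame.
    padded = np.concatenate([np.zeros(nwin - hopsize), data, np.zeros(nwin)])
    nframes = (len(padded) - nwin) // hopsize + 1
    frames = np.stack(
        [padded[i * hopsize:i * hopsize + nwin] * window
         for i in range(nframes)], axis=1)
    return np.fft.rfft(frames, axis=0)


def istft(X, window, hopsize, length):
    """Overlap-add inverse, normalized by the summed squared window."""
    nwin = len(window)
    nframes = X.shape[1]
    frames = np.fft.irfft(X, n=nwin, axis=0)
    out = np.zeros((nframes - 1) * hopsize + nwin)
    norm = np.zeros_like(out)
    for i in range(nframes):
        seg = slice(i * hopsize, i * hopsize + nwin)
        out[seg] += frames[:, i] * window
        norm[seg] += window ** 2
    out /= np.maximum(norm, 1e-12)
    start = nwin - hopsize  # undo the padding added in stft
    return out[start:start + length]


rng = np.random.default_rng(0)
data = rng.standard_normal(1000)
window = sine_window(64)
X = stft(data, window, hopsize=32)
recovered = istft(X, window, hopsize=32, length=len(data))
```

Dividing by the accumulated squared window makes the reconstruction exact wherever at least one frame covers a sample, whatever the window and hop size.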
Constant-Q transform after the work by C. Schoerkhuber and A. Klapuri, 2010 [SK2010]
Adaptation of the Constant Q transform as presented in [SK2010].
[SK2010] Schoerkhuber, C. and Klapuri, A., “Constant-Q transform toolbox for music processing,” submitted to the 7th Sound and Music Computing Conference, Barcelona, Spain.
Comments beginning with ‘%’ and ‘%%’ are retained from the original Matlab code.
Python/Numpy/Scipy by Jean-Louis Durrieu, EPFL, 2012 - 2013
The CQT Kernel contains everything that can be precomputed for Constant-Q transforms. This relies on [SK2010], and therefore computes a Kernel for a single octave. It is then efficiently used to compute the decomposition on the different octaves by downsampling the signal.
Parameters:
- fmax: the maximum desired central frequency
- bins: the number of bins per octave
- fs: sampling rate of the audio files
- q: parameter that controls the quality
- atomHopFactor: hop size between successive analysis windows, as a fraction of the window size (0.25 is a hop size of 25% of the window size)
- thresh: threshold value for sparsifying the kernel (Note: this implementation does not exploit the sparsity; considering it could make the computation more efficient)
- winFunc (python function that outputs an array): the analysis window function
- perfRast: whether to compute the rasterized version (if so, the decompositions at all scales have the same number of frames; otherwise, each lower analysis octave has half as many frames as the octave directly above it)
Attributes:
sparKernel, weight, atomHOP, FFTLen, fftOLP, fftHOP, bins, winNr, Nk_max, Q, fmin, fmax, frequencies, perfRast, first_center, fs, winFunc, thresh, q
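How some of the attributes above relate to the parameters can be sketched with the usual constant-Q definitions for a single-octave kernel. The formulas below are assumed to match what the class computes for `fmin`, `Q`, `frequencies` and `Nk_max`; the numeric values are arbitrary and `Nk` is a hypothetical intermediate name.

```python
import numpy as np

# Example parameter values (arbitrary, for illustration only).
fmax = 8000.0   # maximum desired central frequency, in Hz
bins = 12       # bins per octave
fs = 44100.0    # sampling rate, in Hz
q = 1.0         # quality scaling

# Lowest central frequency of the highest octave.
fmin = (fmax / 2.0) * 2.0 ** (1.0 / bins)

# Quality factor: ratio of central frequency to bandwidth.
Q = q / (2.0 ** (1.0 / bins) - 1.0)

# Central frequencies of the octave covered by the kernel.
frequencies = fmin * 2.0 ** (np.arange(bins) / bins)

# Window length for each bin; the lowest frequency needs the longest one.
Nk = np.round(Q * fs / frequencies).astype(int)
Nk_max = int(Nk.max())
```

With these definitions the top bin lands on `fmax`, and the same kernel can be reused for lower octaves once the signal has been downsampled by two.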
Constant Q Transform
frequency stamps for spCQT
Assuming we have self.spCQT, and not self.cellCQT, we recompute self.cellCQT from self.spCQT, and then invert as usual.
NB: self.cellCQT is overwritten if it already exists.
This inverts the transform. If perfRast is set, each hop of the different octaves can be inverted.
Invert the desired transform; here, invert the CQT from the cell CQT, as in the original from [SK2010].
$Q$ values, approximated
spCQT: the constant Q transform, in a readable format.
generates self.cellCQT from self.spCQT
NB: after transformation of spCQT (by filtering, for instance), this method only keeps downsampled versions of each CQT representation for each octave. More elaborate computations may be necessary to take into account more precise time variations in the low-frequency octaves.
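The downsampling mentioned in the note can be pictured with a toy example. The layout is an assumption made for illustration: the rasterized array has one row per bin, ordered from low to high frequency, and the octave that lies k octaves below the top keeps every 2**k-th frame. The helper name `sp_to_cells` is hypothetical.

```python
import numpy as np


def sp_to_cells(sp, bins):
    """Split a rasterized array into per-octave arrays (hypothetical helper).

    Octave k, counted from the highest octave, keeps every 2**k-th frame.
    """
    n_rows = sp.shape[0]
    n_octaves = n_rows // bins
    cells = []
    for k in range(n_octaves):
        rows = slice(n_rows - (k + 1) * bins, n_rows - k * bins)
        cells.append(sp[rows, ::2 ** k])
    return cells


bins = 3
sp = np.arange(9 * 8, dtype=float).reshape(9, 8)  # 3 octaves, 8 frames
cells = sp_to_cells(sp, bins)
shapes = [c.shape for c in cells]
```

Frames that are dropped this way are lost, which is why edits made to the low octaves at the full frame rate are only partially reflected after the conversion.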
time stamps for spCQT
returns the computed transform
Hybrid CQT/Linear kernel
Compute the missing (high) frequency components, and make a similar Kernel for them.
We can use this for the first octave (the highest frequency octave) to extend the high frequencies. This can also be used to compute a hybrid CQT transform on the low frequencies, while keeping linear frequencies in the high spectrum and still benefiting from the invertibility of the CQT transform by Schoerkhuber and Klapuri.
Hybrid Constant Q Transform
Same as computeCQT, except it uses the linear frequency components in cqtkernel.linearSparKernel
NB: since this should be equivalent to computing an FFT after windowing each frame, there may be a faster way of implementing this function. For now, we keep the same rules as the original CQT implementation, for consistency and to avoid problems with window synchrony.
Invert the hybrid transform.
Because the inverse is linear, the CQT part can be inverted first and the inverse of the linear-frequency part added afterwards (or the other way around).
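The linearity argument can be checked on a simpler case. In this sketch a plain FFT stands in for both parts of the hybrid transform (an assumption made to keep the example short): the spectrum is split into a low band and a high band, each band is inverted on its own, and the two time-domain signals are summed.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(256)
X = np.fft.rfft(x)

split = 40  # arbitrary bin index separating the two bands

# Low band: plays the role of the CQT part.
low = X.copy()
low[split:] = 0

# High band: plays the role of the linear-frequency part.
high = X.copy()
high[:split] = 0

# Invert each part separately, then add the results.
x_low = np.fft.irfft(low, n=len(x))
x_high = np.fft.irfft(high, n=len(x))
x_sum = x_low + x_high
```

The order of the two inversions does not matter, since addition is commutative.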
Min Q Transform Kernel
Compute the missing (high) frequency components, and make a similar Kernel for them.
We can use this for the first octave (the highest frequency octave) to extend the high frequencies. This can also be used to compute a hybrid CQT transform on the low frequencies, while keeping linear frequencies in the high spectrum and still benefiting from the invertibility of the CQT transform by Schoerkhuber and Klapuri.
Minimum Q Transform
Compute the linear frequency part with an STFT, keeping only the desired frequencies.
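Keeping only the desired frequencies amounts to masking FFT bins. The sketch below works on a single windowed frame and assumes the linear part covers every bin strictly above a cutoff; the variable names, the cutoff `fmax_cqt` and the Hann window are hypothetical choices for the example.

```python
import numpy as np

fs = 8000.0        # sampling rate, in Hz
nfft = 256         # frame and FFT length
fmax_cqt = 2000.0  # hypothetical upper limit of the constant-Q part

# Centre frequency of every FFT bin, and the bins left to the linear part.
freqs = np.fft.rfftfreq(nfft, d=1.0 / fs)
keep = freqs > fmax_cqt

# Test frame: one sinusoid below the cutoff, one above it.
n = np.arange(nfft)
frame = np.sin(2 * np.pi * 500.0 * n / fs) + np.sin(2 * np.pi * 3000.0 * n / fs)

# Windowed FFT of the frame, restricted to the linear-frequency bins.
spectrum = np.fft.rfft(frame * np.hanning(nfft))
linear_part = spectrum[keep]
linear_freqs = freqs[keep]
peak_freq = linear_freqs[np.argmax(np.abs(linear_part))]
```

Only the 3000 Hz component survives the mask, so the strongest retained bin sits at that frequency.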
This inverts the linear part of the hybrid transform
NB: as for the computation of this part in transform, a windowed version of a plain FFT should do the same job, and faster.