DEMIX Python/NumPy implementation
DEMIX is an algorithm that counts the sources in a mixture from their spatial cues and returns, for each source, the estimated mixing parameters: the relative amplitude between the channels and the relative time shift. The full description is given in [Arberet2010]:
Arberet, S.; Gribonval, R. & Bimbot, F. ``A Robust Method to Count and Locate Audio Sources in a Multichannel Underdetermined Mixture'', IEEE Transactions on Signal Processing, 2010, 58, 121-133.
This implementation is based on the MATLAB Toolbox provided by the authors of the above article.
In addition, this implementation allows time-frequency representations other than the short-time Fourier transform (STFT).
DEMIX algorithm, for 2 channels.
Computes, for each cluster in self.clusters, a threshold that depends on the other clusters, so as to keep only the points of the cluster that are close to its own centroid and not close to the centroids of other clusters.
The returned clusters are the original clusters after thresholding.
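One way to picture such a threshold is the sketch below. It is not the rule used by the MATLAB toolbox or by this module: it assumes, purely for illustration, that each cluster's threshold is half the distance to the nearest other centroid. The function name threshold_clusters and its arguments are hypothetical.

```python
import numpy as np

def threshold_clusters(points, labels, centroids):
    """Keep points that are closer to their own centroid than a per-cluster threshold.

    Assumption (illustrative only): the threshold of a cluster is half the
    distance from its centroid to the nearest other centroid.
    """
    points = np.asarray(points, dtype=float)
    centroids = np.asarray(centroids, dtype=float)
    labels = np.asarray(labels, dtype=int)
    # Pairwise distances between centroids; ignore the distance to itself.
    gaps = np.linalg.norm(centroids[:, None, :] - centroids[None, :, :], axis=-1)
    np.fill_diagonal(gaps, np.inf)
    thresholds = 0.5 * gaps.min(axis=1)
    # Distance of every point to the centroid of the cluster it belongs to.
    dist_to_own = np.linalg.norm(points - centroids[labels], axis=1)
    keep = dist_to_own < thresholds[labels]
    return keep, thresholds
```

With a single cluster the threshold is infinite, so every point is kept.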
Computes the time-frequency clusters, along with their centroids, which contain the parameters of the mixing process: theta, which parameterizes the relative amplitude, and delta, which has the dimension of a delay in samples between the two channels.
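The roles of theta and delta can be illustrated with a small sketch. It assumes the two-channel anechoic parameterization [cos(theta), sin(theta) * exp(-2j*pi*f*delta)], with f in cycles per sample; the function names are hypothetical and are not taken from this module.

```python
import numpy as np

def steering_vector(theta, delta, freqs):
    """Two-channel steering vector under the assumed parameterization.

    theta: mixing angle in radians; delta: delay in samples;
    freqs: normalized frequencies in cycles per sample.
    """
    freqs = np.asarray(freqs, dtype=float)
    left = np.cos(theta) * np.ones_like(freqs, dtype=complex)
    right = np.sin(theta) * np.exp(-2j * np.pi * freqs * delta)
    return np.stack([left, right])

def estimate_theta_delta(x_left, x_right, freq):
    """Recover (theta, delta) from one TF point dominated by a single source."""
    theta = np.arctan2(np.abs(x_right), np.abs(x_left))
    # Inter-channel phase; unambiguous only while |freq * delta| < 0.5.
    phase = np.angle(x_right * np.conj(x_left))
    delta = -phase / (2 * np.pi * freq)
    return theta, delta
```

For example, estimate_theta_delta applied to the two entries of steering_vector(0.6, 2.0, [0.1]) gives back theta = 0.6 and delta = 2.0, because the phase does not wrap at that frequency.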
Computes the inverse Fourier transform of the estimated steering vectors, weighted by their inverse variance.
The result is a detection function whose peaks fall at the most likely values of delta, the delay in samples.
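A minimal sketch of such a detection function follows. It assumes that each frequency contributes one unit-modulus inter-channel phase term and one variance estimate; the name delay_detection_function and its arguments are hypothetical and do not reproduce this module's code.

```python
import numpy as np

def delay_detection_function(phase_terms, variances, freqs, candidate_delays):
    """Inverse-variance weighted inverse Fourier sum over frequency.

    phase_terms: complex inter-channel phase terms, one per frequency.
    variances: variance of each estimate (larger means less reliable).
    freqs: normalized frequencies in cycles per sample.
    candidate_delays: delays (in samples) at which to evaluate the function.
    """
    weights = 1.0 / np.asarray(variances, dtype=float)
    # One row per candidate delay, one column per frequency.
    kernel = np.exp(2j * np.pi * np.outer(candidate_delays, freqs))
    return np.real(kernel @ (weights * np.asarray(phase_terms)))
```

When the phase terms are exp(-2j*pi*f*delta) for a single delay, all terms add coherently at that delay, so the function peaks there with a height equal to the sum of the weights.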
Reconfigures the cluster indices in self.clusters so that all time-frequency (TF) points that appear in more than one cluster are excluded from all computations.
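The idea can be sketched as follows, assuming for illustration that the clusters are stored as boolean masks of identical shape (self.clusters may use a different layout, such as index lists). The name drop_shared_points is hypothetical.

```python
import numpy as np

def drop_shared_points(cluster_masks):
    """Remove from every cluster the TF points claimed by more than one cluster.

    cluster_masks: array-like of shape (n_clusters, ...) with boolean entries.
    """
    masks = np.asarray(cluster_masks, dtype=bool)
    # Number of clusters that contain each TF point.
    membership_count = masks.sum(axis=0)
    # A point survives only in a cluster that is its sole owner.
    return masks & (membership_count == 1)
```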
Computes the maximum distance between the centroid and the points.
Returns a TF mask that is True where the corresponding value of delta is close enough to the delta of the centroid.
Returns the TF points whose theta is close to that of the centroid, among the points listed in index_pts_to_classify.
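The selection on theta could look like the sketch below. It assumes that theta is a flat array with one value per TF point, that index_pts_to_classify holds integer indices into it, and that closeness means an absolute difference no larger than a given max_dist. The function name and the max_dist argument are hypothetical.

```python
import numpy as np

def points_close_in_theta(theta, centroid_theta, index_pts_to_classify, max_dist):
    """Return the candidate indices whose theta lies within max_dist of the centroid."""
    theta = np.asarray(theta, dtype=float)
    candidates = np.asarray(index_pts_to_classify, dtype=int)
    # Compare only the candidate points against the centroid value.
    close = np.abs(theta[candidates] - centroid_theta) <= max_dist
    return candidates[close]
```

Points outside index_pts_to_classify are never returned, even if their theta matches the centroid.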
TODO: make the function work for different scales, as in the MATLAB toolbox.
Returns the delay maxDelta, in samples, that corresponds to the largest peak of the cluster identified by the provided cluster index.
Re-estimates the cluster centroids.
Considering all the cluster masks, re-estimates the centroids, discarding the clusters for which there is no well-defined delta.
Refines the clusters in order to verify that they are possible. If self.nsources is defined, this method keeps only that number of clusters. Otherwise, the number is decided by choosing the most likely centroids.
DJL: this never happened in the MATLAB version of DEMIX; the authors need to be contacted for an explanation.
Uses optimal spatial filters to obtain the separated signals.
This is a beamformer implementation: either MVDR, or a variant that assumes the sources are normal, independent and of equal variance (it is unclear whether that assumption prevents separating them).
From:
Maazaoui, M.; Grenier, Y. & Abed-Meraim, K.
``Blind Source Separation for Robot Audition using
Fixed Beamforming with HRTFs'',
in proc. of INTERSPEECH, 2011.
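For reference, the textbook MVDR filter for one frequency bin can be sketched as below. This is the standard formula w = R^{-1} a / (a^H R^{-1} a), not a copy of this module's code; the name mvdr_weights is hypothetical, and the covariance matrix is assumed to be Hermitian and invertible.

```python
import numpy as np

def mvdr_weights(covariance, steering):
    """MVDR weights for one frequency bin.

    covariance: (n_channels, n_channels) Hermitian spatial covariance matrix.
    steering: (n_channels,) steering vector of the target source.
    The separated signal is obtained as np.vdot(weights, x) for a mixture x.
    """
    covariance = np.asarray(covariance, dtype=complex)
    steering = np.asarray(steering, dtype=complex)
    # Solve R z = a instead of inverting R explicitly.
    r_inv_a = np.linalg.solve(covariance, steering)
    return r_inv_a / (steering.conj() @ r_inv_a)
```

The weights satisfy the distortionless constraint w^H a = 1. With an identity covariance and a unit-norm steering vector, they reduce to the steering vector itself.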
per channel, the filter steering vector, source p: