Jean-Louis Durrieu, PhD Candidate at the ENST - Blind Source Separation Results

Jean-Louis DURRIEU
PhD Candidate at the Ecole Nationale Supérieure des Télécommunications (ENST or TELECOM ParisTech)

Below are some sounds produced by the source separation algorithm we have designed. It first detects the melody (see Durrieu et al., "Singer melody extraction in polyphonic signals using source separation methods", on the publications page) and then applies spectral Wiener filtering to the original signal. The songs come from the ISMIR 2004 Audio Melody Extraction Contest database (see the dedicated ISMIR 2004 website for more details and to download the test set and the reference files). We suggest using headphones so that slight differences and artifacts in the resulting sounds can be heard. If you experience any problems with the Flash player, you can also check the alternate page with direct links to the files.
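To give an idea of what the Wiener filtering step does, here is a minimal sketch of a generic soft-mask Wiener filter applied to one STFT frame. It is not the exact estimator from the article: the function name `wiener_separate` and its inputs (per-bin power estimates for the singer and for the music, which would come from the melody detection model) are hypothetical and only serve the illustration.

```python
def wiener_separate(mix_frame, voice_power, music_power, eps=1e-12):
    """Split one complex STFT frame into singer and music estimates.

    mix_frame   : complex STFT coefficients of the mixture (one per bin)
    voice_power : assumed estimated power of the singer in each bin
    music_power : assumed estimated power of the music in each bin
    """
    voice, music = [], []
    for x, pv, pm in zip(mix_frame, voice_power, music_power):
        # Wiener gain: share of the bin's power attributed to the singer.
        gain = pv / (pv + pm + eps)
        voice.append(gain * x)
        music.append((1.0 - gain) * x)
    return voice, music


# Toy frame with three frequency bins.
mix = [1 + 1j, 2 - 1j, 0.5j]
voice, music = wiener_separate(mix, [4.0, 1.0, 0.0], [1.0, 1.0, 2.0])
```

With this kind of mask the two estimates always add up to the original mixture, so nothing is lost or created: each bin is simply shared between the two sources.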

The analysis parameters for the ISMIR 2004 database songs are:

sampling rate: 44100 Hz
length of analysis windows: 46.44 ms (2048 samples)
hop size: 5.8 ms (256 samples)
frequency range for melody detection: Fmin = 100 Hz, Fmax = 800 Hz (unless otherwise specified)
frequency quantization for the melody: one frequency every eighth of a tone (that is to say, 48 frequencies per octave)
length of the songs: between 14s and 25s
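The figures in this list can be checked with a few lines of Python. This is only a sketch: the constant names are mine, and the assumption that the frequency grid is geometric and includes both Fmin and Fmax is an illustration, not a statement about the actual implementation.

```python
import math

SAMPLE_RATE = 44100        # Hz
WINDOW = 2048              # samples
HOP = 256                  # samples
F_MIN, F_MAX = 100.0, 800.0
STEPS_PER_OCTAVE = 48      # one step per eighth of a tone

# Durations in milliseconds.
window_ms = 1000.0 * WINDOW / SAMPLE_RATE
hop_ms = 1000.0 * HOP / SAMPLE_RATE

# Candidate melody frequencies, assuming a geometric grid
# that includes both end points (an assumption).
n_octaves = math.log2(F_MAX / F_MIN)
n_steps = round(n_octaves * STEPS_PER_OCTAVE)
grid = [F_MIN * 2 ** (k / STEPS_PER_OCTAVE) for k in range(n_steps + 1)]

print(f"window: {window_ms:.2f} ms, hop: {hop_ms:.1f} ms")
print(f"{n_octaves:.0f} octaves, {len(grid)} candidate frequencies")
```

Under these assumptions, the 100 Hz to 800 Hz range spans three octaves, which gives 145 candidate frequencies.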

The columns in the following table are:

Title: name of the song in the original test set
Original: the original song
Sep. Singer: estimated/separated singer signal
Sep. Music: estimated/separated music signal
Remix: left channel = estimated singer, right channel = estimated music
Pitch Match: percentage of correctly estimated pitches over the song's pitched frames (see the ISMIR 2004 website for a description)
Total Match: percentage of correctly estimated pitches over the whole song (also taking into account the silences in the singer reference track)
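As an illustration of the Remix column, here is a small sketch that puts the estimated singer on the left channel and the estimated music on the right one. The function `remix_stereo` is hypothetical, and the output format (interleaved 16-bit little-endian PCM) is an assumption chosen for the example, not necessarily what was used for the files on this page.

```python
import struct


def remix_stereo(singer, music):
    """Interleave two mono signals (floats in [-1, 1]) as 16-bit stereo PCM.

    Left channel = singer, right channel = music. The longer signal
    is truncated to the length of the shorter one.
    """
    frames = bytearray()
    for left, right in zip(singer, music):
        for sample in (left, right):
            clipped = max(-1.0, min(1.0, sample))   # avoid integer overflow
            frames += struct.pack("<h", int(round(clipped * 32767)))
    return bytes(frames)


# Two samples of singer, three of music: only two stereo frames are kept.
pcm = remix_stereo([0.0, 1.0], [-1.0, 0.0, 0.25])
```

The resulting bytes could then be written with the standard `wave` module, using two channels and a sample width of 2 bytes.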

Here is a table showing the results obtained by the system described in our ICASSP'08 article, "Singer melody extraction in polyphonic signals using source separation methods". A. Ehmann, from the University of Illinois at Urbana-Champaign, kindly agreed to run it on the database that was used for MIREX 2006.

Algorithm	Vx Recall	Vx False Alarm	Vx d'	Raw Pitch	Raw Chroma	Overall Acc.
dressler	89.9%	23.5%	2.00	80.0%	82.9%	77.3%
ryynanen	80.4%	15.4%	1.88	75.5%	78.2%	72.1%
poliner	90.2%	35.7%	1.66	72.6%	75.7%	69.3%
sutton	67.6%	17.0%	1.41	59.1%	62.5%	55.7%
brossier	99.5%	97.1%	0.66	48.3%	61.7%	39.8%
durrieu	99.99%	99.81%	0.72	76.13%	80.02%	61.92%
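The Vx d' column looks consistent with the usual sensitivity index from signal detection theory, d' = z(recall) - z(false alarm rate), where z is the inverse of the standard normal cumulative distribution. The sketch below is written under that assumption. The function name is mine, and the official evaluation may compute or average the figure differently (for instance per file), so values recomputed from the rounded rates in the table need not match exactly.

```python
from statistics import NormalDist


def voicing_d_prime(recall, false_alarm):
    """Sensitivity index d' from voicing recall and false alarm rates (0..1)."""
    z = NormalDist().inv_cdf   # inverse standard normal CDF
    return z(recall) - z(false_alarm)


# Rates taken from the first two rows of the table.
d_dressler = voicing_d_prime(0.899, 0.235)
d_ryynanen = voicing_d_prime(0.804, 0.154)
print(f"dressler: {d_dressler:.2f}, ryynanen: {d_ryynanen:.2f}")
```

A system that labels almost every frame as voiced gets a recall and a false alarm rate both close to 100%, hence a low d', even when its raw pitch accuracy is high.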

Here is the OpenDocument spreadsheet with all the results. The results might differ from the ones reported below on this page, because our system is not deterministic: it relies on a random initialization. However, the results do not seem to differ significantly from one run to the next.

Title	Original	Sep. Singer	Sep. Music	Remix	Pitch Match (%)	Total Match (%)
opera_fem2					70.2	63.7
opera_fem4					78.8	80.2
opera_male3					65.5	66.8
opera_male5					82.8	77.7
daisy1					85.5	72.1
daisy2					85.3	74.6
daisy3					84.7	84.7
daisy4					90.5	90.5
pop1					74.2	62.7
pop2					79.8	63.8
pop3					76.8	62.1
pop4					79.4	64.8
jazz1					73.4	71.1
jazz2					71.2	67.3
jazz3					78.8	52.3
jazz4					71.4	57.0
midi1					74.6	70.7
midi2					81.9	81.9
midi3					60.0	60.5
midi4 (Fmax=1200)					84.8	73.5

Some results on a database I got from http://www.ee.columbia.edu/~graham/mirex_melody/, which seems to be the set of files that "competitors" in the MIREX 2005 Melody Extraction Task could use to tune their algorithms.
The parameters are almost the same as before, except that the hop size of the analysis windows is 10 ms (441 samples), to match the given ground truth.

Title	Original	Sep. Singer	Sep. Music	Remix	Pitch Match (%)	Total Match (%)
train01					80.9	55.1
train02					56.9	37.6
train03					77.7	48.3
train04					71.2	61.5
train05					76.5	60.0
train06					58.7	29.3
train07					72.2	56.4
train08					78.7	58.9
train09					85.1	70.4

... and below is another example showing a possible use of our model for other instruments, on excerpts of Take Five (being a saxophonist, I could not leave this example out!):

Title	Original	Sep. Saxophone	Sep. Background Music	Remix
takefive01
takefive02