AN ITERATIVE APPROACH TO MONAURAL MUSICAL MIXTURE DE-SOLOING
DURRIEU J.-L., RICHARD Gaël and DAVID Bertrand
(accepted for ICASSP'09)




Article:
Jean-Louis DURRIEU, Gaël RICHARD and Bertrand DAVID, "An Iterative Approach to Monaural Musical Mixture De-soloing", ICASSP 2009, Taipei, Taiwan. [pdf][poster][audio examples][bibtex][copyright]
(for ICASSP attendees: search paper #3601 at http://icassp09.org/Papers/Search.asp)
List of songs used:
  • SiSEC professionally produced material dataset (subset A):
    • bearlin-roads_85-99_with_effects (14", piano, bass, drums, male singer)
    • tamy-que_pena_tanto_faz_6-19 (13", female singer + guitar)
  • From A. Ozerov and M. Lagrange database (subset B):
    • Joyce (37", synthetic accompaniment, male singer voice)
    • Katzen_jammer__Clipsinvegas (4 excerpts of 1' each, rock, strong effects on the male singer voice)
    • Katzen_jammer__Darkeyed (3 ex., 2 x 1' + 18", female singer)
    • Sting__Every_Breath_You_Take (4 ex. : 3 x 1' + 17", karaoke with male singer)
    • bentOutOfShape (3 ex. : 2 x 1' + 40", rock, male singer)
    • chevalierBran (4 ex. : 4 x 1', Celtic rock, male singer + strong presence of violin and "biniou")
    • intoTheUnknown (3 ex. : 2 x 1' + 39", rock, male singer)
    • lePub (4 ex. : 4 x 1', Celtic rock, same as chevalierBran)
    • schizosonic (3 ex., rock, male singer)
  • Shannon Hurley's songs (Creative Commons licence) (subset C), with melody-annotated subset C1 (annotation files):
    • (C1) Silence (4 excerpts : 4 x 1')
    • (C1) Sunrise (4 ex. : 3 x 1' + 16")
    • (C1) We Are in Love (4 ex. : 3 x 1' + 42")
    • Matter of Time (5 ex. : 4 x 1' + 37")
    • Shame (5 ex. : 4 x 1' + 40")
Separation examples:

If the Flash plugin does not work, the zip archives contain all the "wav" files. Click on the name of a song to download its archive. Each archive contains:
    • %song_mix.wav : original mixture
    • %song_voc.wav : original solo voice track
    • %song_voc_est_1.wav : estimated solo, directly after melody extraction step
    • %song_voc_est_2.wav : estimated solo, after parameter re-estimation
    • %song_mus.wav : original accompaniment (or background music) track
    • %song_mus_est_1.wav : estimated accompaniment, directly after melody extraction step
    • %song_mus_est_2.wav : estimated accompaniment, after parameter re-estimation
where %song is the name of the song. The software page also provides some Python software that makes listening to these examples easier.
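The naming scheme above can be turned into a small helper, for instance to check that an unpacked archive is complete. This is only a sketch: the function names `archive_files` and `missing_files` and the example song name are hypothetical, and the actual value of %song in each archive should be read from the archive itself.

```python
from pathlib import Path

# Suffixes and descriptions copied from the archive description above.
SUFFIXES = {
    "mix": "original mixture",
    "voc": "original solo voice track",
    "voc_est_1": "estimated solo, directly after melody extraction step",
    "voc_est_2": "estimated solo, after parameter re-estimation",
    "mus": "original accompaniment (or background music) track",
    "mus_est_1": "estimated accompaniment, directly after melody extraction step",
    "mus_est_2": "estimated accompaniment, after parameter re-estimation",
}


def archive_files(song, folder="."):
    """Map each suffix to the path expected for `song` (hypothetical helper)."""
    return {key: Path(folder) / f"{song}_{key}.wav" for key in SUFFIXES}


def missing_files(song, folder="."):
    """Return the suffixes whose wav file is not present in `folder`."""
    expected = archive_files(song, folder)
    return sorted(key for key, path in expected.items() if not path.exists())


# "tamy" is an assumed song name, used here for illustration only.
files = archive_files("tamy")
```

Calling `missing_files("tamy", "some/folder")` on an unpacked archive would list the suffixes that could not be found there.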
  • Dataset A:
song      track           original SDR (dB)   1-step estimation SDR (dB)   2-step estimation SDR (dB)
Bearlin   solo voice      -5.4                3.7                          6.2
          accompaniment   5.4                 9.1                          11.6
          mixture         -                   -                            -
Tamy      solo            0.5                 10.8                         11.5
          acc.            -0.5                10.6                         9.2
          mix.            -                   -                            -

Remarks: the results are very satisfying, with 2-step SDRs between 6.2 and 11.6. In the estimated accompaniment, the usual residual of the singer consists of unvoiced parts and reverberation left-overs. These effects are not explicitly estimated within our framework.
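To help read the SDR columns, here is a minimal sketch of a simplified SDR, defined as the ratio of reference energy to error energy in dB. This definition is an assumption made for illustration; the published figures were presumably computed with a complete evaluation toolkit, and the toy signals below are made up. The sketch also shows that, when the unprocessed mixture is taken as the estimate, the solo and accompaniment SDRs are opposite, which matches the sign pattern of the "original SDR" values (e.g. -5.4 and 5.4 for Bearlin).

```python
import math


def energy(x):
    """Sum of squared samples."""
    return sum(v * v for v in x)


def sdr_db(reference, estimate):
    """Simplified SDR in dB: reference energy over error energy.

    Assumption: no allowed distortion filter, unlike fuller toolkits.
    """
    error_energy = energy([r - e for r, e in zip(reference, estimate)])
    if error_energy == 0:
        return math.inf
    return 10.0 * math.log10(energy(reference) / error_energy)


# Toy signals, made up for illustration (not taken from the datasets).
solo = [0.5, -0.5, 0.5, -0.5]
acc = [1.0, 1.0, -1.0, -1.0]
mix = [s + a for s, a in zip(solo, acc)]

# Using the mixture itself as the "estimate" of each source:
sdr_solo = sdr_db(solo, mix)  # 10 * log10(E_solo / E_acc)
sdr_acc = sdr_db(acc, mix)    # 10 * log10(E_acc / E_solo)
```

With this definition, `sdr_solo` and `sdr_acc` always sum to zero, whatever the signals.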
  • Dataset B:
song                         track   original SDR (dB)   1-step estimation SDR (dB)   2-step estimation SDR (dB)
bentOutOfShape (excerpt 2)   solo    0.0                 5.8                          5.5
                             acc.    0.0                 5.8                          5.6
                             mix.    -                   -                            -
chevalierBran (excerpt 3)    solo    -6.8                1.5                          1.5
                             acc.    6.8                 8.3                          8.3
                             mix.    -                   -                            -

Remarks: there is almost no improvement from the 1st to the 2nd estimation. The pitch estimation makes many octave errors, which lead to badly re-estimated parameters. The parameters obtained after the 1st estimation are more flexible: when the estimated pitch is not the right fundamental frequency, the parameters estimated there compensate for the error. The re-estimation, in contrast, may yield parameters that fit a wrong melody line more closely, hence the importance of a good pitch estimator.
  • Dataset C:
song                                      track   original SDR (dB)   1-step estimation SDR (dB)   2-step estimation SDR (dB)
Silence (excerpt 2, knowing the melody)   solo    0.2                 NA                           10.6
                                          acc.    -0.2                NA                           10.4
                                          mix.    -                   -                            -
Silence (excerpt 2)                       solo    0.2                 7.4                          8.6
                                          acc.    -0.2                7.1                          8.4
                                          mix.    -                   -                            -
Matter of Time (excerpt 3)                solo    -4.7                6.4                          8.0
                                          acc.    4.7                 11.3                         12.7
                                          mix.    -                   -                            -
NA: step 2 is applied directly when the melody is given.


A detailed example: We Are in Love, Shannon Hurley, excerpt 3.

To give deeper insight into our algorithm, we analyze it on the 3rd excerpt of "We Are In Love" (S. Hurley), for which we have the 8 separate instrument tracks of the song. The figure below shows the evolution of the SIR gain of the estimate over the mixture for 4 cases, depending on which track we consider as the "main instrument": the guitar, the piano, the flugelhorn or the singer.



In this excerpt, the singer finishes her phrase at t=3s and sings again at t=38s. The flugelhorn plays from t=5s to t=38s. The piano and the guitar also have solo notes at t=20s and t=29s. As shown in the figure, the SIR gain of each of these instruments is highest at the times where it is soloing: our system successfully separates the predominant instrument.
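As an illustration of what an SIR gain measures, the sketch below computes an SIR for one segment by projecting the estimate onto the target track and a single interfering track, then subtracts the SIR obtained with the mixture as the estimate. This is a simplified, BSS-Eval-like definition assumed for illustration, not necessarily the measure used for the figure: it uses one interferer instead of all the other tracks, handles a single segment rather than a time evolution, and the toy signals are made up.

```python
import math


def dot(x, y):
    return sum(a * b for a, b in zip(x, y))


def sir_db(estimate, target, interference):
    """SIR in dB of `estimate` for one target and one interfering track.

    Assumption: target and interference are not collinear (det != 0).
    """
    stt = dot(target, target)
    sii = dot(interference, interference)
    sti = dot(target, interference)
    ets = dot(estimate, target)
    eti = dot(estimate, interference)
    # Least-squares fit estimate ~ a * target + b * interference,
    # obtained by solving the 2x2 normal equations.
    det = stt * sii - sti * sti
    a = (ets * sii - eti * sti) / det
    b = (eti * stt - ets * sti) / det
    # Target part: projection of the estimate onto the target alone.
    c = ets / stt
    s_target = [c * t for t in target]
    # Interference part: joint projection minus the target part.
    e_interf = [a * t + b * i - s for t, i, s in zip(target, interference, s_target)]
    interf_energy = dot(e_interf, e_interf)
    if interf_energy == 0:
        return math.inf
    return 10.0 * math.log10(dot(s_target, s_target) / interf_energy)


def sir_gain_db(estimate, mixture, target, interference):
    """SIR of the estimate minus SIR of the unprocessed mixture."""
    return sir_db(estimate, target, interference) - sir_db(mixture, target, interference)


# Toy orthogonal signals, made up for illustration.
target = [1.0, -1.0, 1.0, -1.0]
interf = [1.0, 1.0, -1.0, -1.0]
mix = [t + i for t, i in zip(target, interf)]
est = [t + 0.1 * i for t, i in zip(target, interf)]  # interferer attenuated 10x

gain = sir_gain_db(est, mix, target, interf)  # about 20 dB
```

Applying the same function to successive short frames would give a curve over time.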

We Are In Love, excerpt 3:
track             original SDR   1-step estimation SDR   2-step estimation SDR
singer            -10.8          est. solo (1): -1.1     est. solo (2): 0.2
flugelhorn        -13.3          est. acc. (1): 9.9      est. acc. (2): 11.0
piano             -23.3          -                       -
electric guitar   -19.6          -                       -
mix.              -              -                       -


Copyright 2009 IEEE. Published in the IEEE 2009 International Conference on Acoustics, Speech, and Signal Processing (ICASSP 2009), scheduled for April 19 - 24, 2009 in Taipei, Taiwan. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works, must be obtained from the IEEE. Contact: Manager, Copyrights and Permissions / IEEE Service Center / 445 Hoes Lane / P.O. Box 1331 / Piscataway, NJ 08855-1331, USA. Telephone: + Intl. 908-562-3966.