Main Title: Final Research Report on Auto-Tagging of Music
Author(s): Schwarz, Diemo
Peeters, Geoffroy
Cohen-Hadria, Alice
Fourer, Dominique
Marchand, Ugo
Mignot, Rémi
Cornu, Frédéric
Laffitte, Pierre
Schindler, Daniel
Hofmann, Robin
Spadaveccia, Rino
Type: Report
Abstract: Deliverable D4.7 is a research report on the work carried out by IRCAM up to M36 on the “auto-tagging of music”. The software libraries resulting from the research have been integrated into the Fincons/HearDis! Music Library Manager or are used by TU Berlin. The final software libraries are described in D4.5. The research work on auto-tagging has concentrated on four aspects:
1) Further improving IRCAM’s machine-learning system ircamclass. This has been done by developing the new MASSS audio features and by integrating audio augmentation and audio segmentation into ircamclass. The system has then been applied to train the HearDis! “soft” features (Vocals-1, Vocals-2, Pop-Appeal, Intensity, Instrumentation, Timbre, Genre, Style). This is described in Part 3.
2) Developing two sets of “hard” features (i.e. features related to musical or musicological concepts) as specified by HearDis! (for integration into the Fincons/HearDis! Music Library Manager) and by TU Berlin (as input for the prediction model of the GMBI attributes). Such features are either derived from previously estimated higher-level concepts (such as structure, key or succession of chords) or obtained by developing new signal processing algorithms, such as HPSS or main melody estimation. This is described in Part 4.
3) Developing audio features to characterize the audio quality of a music track. The goal is to describe the quality of the audio independently of its apparent encoding. These features are then used to estimate audio degradation or music decade, so that playlists can be built from tracks with similar audio quality. This is described in Part 5.
4) Developing innovative algorithms to extract specific audio features to improve music mixes. So far, innovative techniques (based on various Blind Audio Source Separation algorithms and Convolutional Neural Networks) have been developed for singing voice separation, singing voice segmentation, music structure boundary estimation, and DJ cue-region estimation. This is described in Part 6.
Subject(s): audio branding
music branding
machine learning
Issue Date: 12-Dec-2018
Date Available: 18-Feb-2019
Language Code: en
DDC Class: 780 Music
Sponsor/Funder: EC/H2020/688122/EU/Artist-to-Business-to-Business-to-Consumer Audio Branding System/ABC DJ
TU Affiliation(s): Fak. 1 Geistes- und Bildungswissenschaften » Inst. Sprache und Kommunikation » FG Audiokommunikation
Appears in Collections: Technische Universität Berlin » Publications

This item is licensed under a Creative Commons License.