FG Audiokommunikation

190 Items

Recent Submissions
Prediction of Dialogue Success with Spectral and Rhythm Acoustic Features Using DNNs and SVMs

Lykartsis, Athanasios ; Kotti, Margarita ; Papangelis, Alexandros ; Stylianou, Yannis (2019-02-14)

In this paper we investigate the novel use of exclusively audio to predict whether a spoken dialogue will be successful or not, both in a subjective and in an objective manner. To achieve that, multiple spectral and rhythmic features are inputted to support vector machines and deep neural networks. We report results on data from 3267 spoken dialogues, using both the full user response as well a...

Speaker Identification for Swiss German with Spectral and Rhythm Features

Lykartsis, Athanasios ; Weinzierl, Stefan ; Dellwo, Volker (2017-06-13)

We present results of speech rhythm analysis for automatic speaker identification. We expand previous experiments using similar methods for language identification. Features describing the rhythmic properties of salient changes in signal components are extracted and used in a speaker identification task to determine to what extent they are descriptive of speaker variability. We also test the ...

Using the beat histogram for speech rhythm description and language identification

Lykartsis, Athanasios ; Weinzierl, Stefan (2015)

In this paper we present a novel approach for the description of speech rhythm and the extraction of rhythm-related features for automatic language identification (LID). Previous methods have extracted speech rhythm through the calculation of features based on salient elements of speech such as consonants, vowels and syllables. We present how an automatic rhythm extraction method borrowed from ...

Speech and music discrimination: Human detection of differences between music and speech based on rhythm

Stanev, Madeleine ; Redlich, Johannes ; Knörzer, Christian ; Rosenfeld, Ninett ; Lykartsis, Athanasios (2016)

Rhythm in speech and singing forms one of their basic acoustic components. Therefore, it is interesting to investigate the capability of subjects to distinguish between speech and singing when only the rhythm remains as an acoustic cue. For this study we developed a method to eliminate all linguistic components but rhythm from the speech and singing signals. The study was conducted online and par...

Beat histogram features for rhythm-based musical genre classification using multiple novelty functions

Lykartsis, Athanasios ; Lerch, Alexander (2015)

In this paper we present beat histogram features for multiple level rhythm description and evaluate them in a musical genre classification task. Audio features pertaining to various musical content categories and their related novelty functions are extracted as a basis for the creation of beat histograms. The proposed features capture not only amplitude, but also tonal and general spectral chan...

Beat histogram features from NMF-based novelty functions for music classification

Lykartsis, Athanasios ; Wu, Chih-Wei ; Lerch, Alexander (2015)

In this paper we present novel rhythm features derived from drum tracks extracted from polyphonic music and evaluate them in a genre classification task. Musical excerpts are analyzed using an optimized, partially fixed Non-Negative Matrix Factorization (NMF) method and beat histogram features are calculated on the basis of the resulting activation functions for each one out of three drum tracks ex...

On the analysis of speech rhythm for language and speaker identification

Lykartsis, Athanasios (2020)

In the context of this dissertation, novel methods for rhythm description and extraction originating from the area of Music Information Retrieval (MIR) were adapted and applied to represent speech rhythm and its properties. These methods were then used to extract rhythmic information to be used in two specific classification scenarios relevant to speech technology: language identification (LID)...

The FABIAN head-related transfer function data base

Brinkmann, Fabian ; Lindau, Alexander ; Weinzierl, Stefan ; Geissler, Gunnar ; van de Par, Steven ; Müller-Trapet, Markus ; Opdam, Rob ; Vorländer, Michael (2017-02-09)

This data base includes head-related transfer functions (HRTFs), headphone transfer functions (HpTFs), and 3D-meshes of the FABIAN head and torso simulator. More detailed information is provided in the documentation within the data base.

The Sound of Success: Investigating Cognitive and Behavioral Effects of Motivational Music in Sports

Elvers, Paul ; Steffens, Jochen (2017-11-21)

Listening to music before, during, or after sports is a common phenomenon, yet its functions and effects on performance, cognition, and behavior remain to be investigated. In this study we present a novel approach to the role of music in sports and exercise that focuses on the notion of musical self-enhancement (Elvers, 2016). We derived the following hypotheses from this framework: listening t...

The busking experiment: A field study measuring behavioral responses to street music performances

Anglada-Tort, Manuel ; Thueringer, Heather ; Omigie, Diana (2019-03)

A field experiment was conducted with a professional busker in the London Underground over the course of 24 days. Its aim was to investigate the extent to which performative aspects influence behavioral responses to music street performances. Two aspects of the performance were manipulated: familiarity of the music (familiar vs. unfamiliar) and body movements (expressive vs. restricted). The am...

Popular music lyrics and musicians’ gender over time: A computational approach

Anglada-Tort, Manuel ; Krause, Amanda E. ; North, Adrian C. (2019-10-23)

The present study investigated how the gender distribution of the United Kingdom’s most popular artists has changed over time and the extent to which these changes might relate to popular music lyrics. Using data mining and machine learning techniques, we analyzed all songs that reached the UK weekly top 5 sales charts from 1960 to 2015 (4,222 songs). DICTION software facilitated a computerized...

Music induces universal emotion-related psychophysiological responses: comparing Canadian listeners to Congolese Pygmies

Egermann, Hauke ; Fernando, Nathalie ; Chuen, Lorraine ; McAdams, Stephen (2015-01-07)

Subjective and psychophysiological emotional responses to music from two different cultures were compared within these two cultures. Two identical experiments were conducted: the first in the Congolese rainforest with an isolated population of Mebenzélé Pygmies without any exposure to Western music and culture, the second with a group of Western music listeners, with no experience with Congoles...

Processing of emotional words in bilinguals: Testing the effects of word concreteness, task type and language status

Ferré, Pilar ; Anglada-Tort, Manuel ; Guasch, Marc (2017-12-19)

The present study investigates whether the emotional content of words has the same effect in the different languages of bilinguals by testing the effects of word concreteness, the type of task used, and language status. Highly proficient bilinguals of Catalan and Spanish who learned Catalan and Spanish in early childhood in a bilingual immersion context, and who still live in such a context, pe...

Room Acoustical Parameters as Predictors of Room Acoustical Impression: What Do We Know and What Would We Like to Know?

Weinzierl, Stefan ; Vorländer, Michael (2015-02-26)

Room acoustical parameters are audio features, usually extracted from monaural or binaural measurements of room acoustical environments, and used to predict different aspects of the ‘room acoustical impression’. The paper takes a closer look at the nature of this perceptional construct and at different approaches to develop a psychological measuring instrument for the multidimensional perceptio...

Generation and analysis of an acoustic radiation pattern database for forty-one musical instruments

Shabtai, Noam R. ; Behler, Gottfried ; Vorländer, Michael ; Weinzierl, Stefan (2017-02-28)

A database of acoustic radiation patterns was recorded, modeled, and analyzed for 41 modern or authentic orchestral musical instruments. The generation of this database included recordings of each instrument over the entire chromatic tone range in an anechoic chamber using a surrounding spherical microphone array. Acoustic source centering was applied in order to align the acoustic center of th...

Instruments for Spatial Sound Control in Real Time Music Performances. A Review

Pysiewicz, Andreas ; Weinzierl, Stefan (2016-12-10)

The systematic arrangement of sound in space is widely considered as one important compositional design category of Western art music and acoustic media art in the 20th century. A lot of attention has been paid to the artistic concepts of sound in space and its reproduction through loudspeaker systems. Much less attention has been attracted by live-interactive practices and tools for spatialisa...

Mixed Analytical-Numerical Filter Design for Optimized Electronic Control of Line Source Arrays

Straube, Florian ; Schultz, Frank ; Makarski, Michael ; Weinzierl, Stefan (2018-09-16)

Line source arrays (LSAs) are used for large-scale sound reinforcement that synthesizes homogeneous sound fields over the full audio bandwidth. The deployed loudspeaker cabinets are rigged with different tilt angles and are electronically controlled to provide the intended coverage of the audience zones and to avoid radiation toward the ceiling, reflective walls, or residential areas. In this a...

Prediction of speech intelligibility using pseudo-binaural room impulse responses

Kokabi, Omid ; Brinkmann, Fabian ; Weinzierl, Stefan (2019-04-29)

Head orientation (HO) affects better-ear-listening and spatial-release-from-masking, which are two key aspects in binaural speech intelligibility. To incorporate HO in speech intelligibility prediction, binaural room impulse responses (BRIRs) for every HO of interest could be used. Due to the limited spectral bandwidth of speech, however, approximate representations might be sufficient, which c...

Segmentation of binaural room impulse responses for speech intelligibility prediction

Kokabi, Omid ; Brinkmann, Fabian ; Weinzierl, Stefan (2018-11-16)

The two most important aspects in binaural speech perception—better-ear-listening and spatial-release-from-masking—can be predicted well with current binaural modeling frameworks operating on head-related impulse responses, i.e., anechoic binaural signals. To incorporate effects of reverberation, a model extension was proposed, splitting binaural room impulse responses into an early, useful, an...

Audibility and Interpolation of Head-Above-Torso Orientation in Binaural Technology

Brinkmann, Fabian ; Roden, Reinhild ; Lindau, Alexander ; Weinzierl, Stefan (2015-03-20)

Head-related transfer functions (HRTFs) incorporate fundamental cues required for human spatial hearing and are often applied to auralize results obtained from room acoustic simulations. HRTFs are typically available for various directions of sound incidence and a fixed head-above-torso orientation (HATO). If, in interactive auralizations, HRTFs are exchanged according to the head rotations of a ...