Detecting violent content in Hollywood movies by mid-level audio representations
dc.contributor.author | Acar, Esra | |
dc.contributor.author | Hopfgartner, Frank | |
dc.contributor.author | Albayrak, Sahin | |
dc.date.accessioned | 2018-04-17T09:26:00Z | |
dc.date.available | 2018-04-17T09:26:00Z | |
dc.date.issued | 2013 | |
dc.description.abstract | Detecting violent content in movies, e.g., for providing automated youth protection services, is a valuable video content analysis functionality. Choosing discriminative features for the representation of video segments is a key issue in designing violence detection algorithms. In this paper, we employ mid-level audio features which are based on a Bag-of-Audio Words (BoAW) method using Mel-Frequency Cepstral Coefficients (MFCC). BoAW representations are constructed with two different methods, namely the vector quantization-based (VQ-based) method and the sparse coding-based (SC-based) method. We choose two-class support vector machines (SVMs) for classifying video shots as (non-)violent. Our experimental results on detecting violent video shots in Hollywood movies show that the mid-level audio features provide promising results. Additionally, we establish that the SC-based method outperforms the VQ-based one. More importantly, in terms of average precision, the SC-based method outperforms all unimodal submissions in the MediaEval Violent Scenes Detection (VSD) task except for one visual-based method. | en |
dc.identifier.isbn | 978-1-4799-0956-8 | |
dc.identifier.issn | 1949-3991 | |
dc.identifier.uri | https://depositonce.tu-berlin.de/handle/11303/7609 | |
dc.identifier.uri | http://dx.doi.org/10.14279/depositonce-6799 | |
dc.language.iso | en | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject.ddc | 000 Informatik, Informationswissenschaft, allgemeine Werke | de |
dc.subject.other | video coding | en |
dc.subject.other | feature extraction | en |
dc.subject.other | mel frequency cepstral coefficient | en |
dc.subject.other | visualization | en |
dc.subject.other | support vector machines | en |
dc.subject.other | dictionaries | en |
dc.subject.other | training | en |
dc.title | Detecting violent content in Hollywood movies by mid-level audio representations | en |
dc.type | Conference Object | en |
dc.type.version | acceptedVersion | en |
dcterms.bibliographicCitation.doi | 10.1109/CBMI.2013.6576556 | en |
dcterms.bibliographicCitation.editor | Czúni, László | |
dcterms.bibliographicCitation.editor | Schöffmann, Klaus | |
dcterms.bibliographicCitation.editor | Szirányi, Tamás | |
dcterms.bibliographicCitation.originalpublishername | IEEE | en |
dcterms.bibliographicCitation.originalpublisherplace | Veszprem, Hungary | en |
dcterms.bibliographicCitation.pageend | 78 | en |
dcterms.bibliographicCitation.pagestart | 73 | en |
dcterms.bibliographicCitation.proceedingstitle | 2013 11th International Workshop on Content-Based Multimedia Indexing (CBMI) | en |
dcterms.bibliographicCitation.volume | 2013 | en |
tub.accessrights.dnb | domain | en |
tub.affiliation | Fak. 4 Elektrotechnik und Informatik::Inst. Wirtschaftsinformatik und Quantitative Methoden::FG Agententechnologien in betrieblichen Anwendungen und der Telekommunikation (AOT) | de |
tub.affiliation.faculty | Fak. 4 Elektrotechnik und Informatik | de |
tub.affiliation.group | FG Agententechnologien in betrieblichen Anwendungen und der Telekommunikation (AOT) | de |
tub.affiliation.institute | Inst. Wirtschaftsinformatik und Quantitative Methoden | de |
tub.publisher.universityorinstitution | Technische Universität Berlin | en |
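
Illustrative note on the abstract: the VQ-based Bag-of-Audio-Words idea can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: random vectors stand in for real MFCC frames, the dictionary is learned with plain k-means (the abstract does not say how the dictionary is built), and the function names, the 13-dimensional frames, and the dictionary size of 8 are all hypothetical.

```python
import numpy as np

def assign(frames, codebook):
    """Index of the nearest codeword (squared Euclidean distance) per frame."""
    d2 = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def learn_codebook(frames, k, n_iter=20, seed=0):
    """Plain k-means over frame-level vectors (an assumed dictionary learner)."""
    rng = np.random.default_rng(seed)
    # Initialise the codewords from k distinct training frames.
    codebook = frames[rng.choice(len(frames), size=k, replace=False)].copy()
    for _ in range(n_iter):
        labels = assign(frames, codebook)
        for j in range(k):
            members = frames[labels == j]
            if len(members):  # keep the old codeword if a cluster is empty
                codebook[j] = members.mean(axis=0)
    return codebook

def boaw_histogram(frames, codebook):
    """L1-normalised histogram of codeword counts for one video shot."""
    counts = np.bincount(assign(frames, codebook), minlength=len(codebook))
    return counts / counts.sum()

# Synthetic stand-ins for MFCC frames (assumption: 13 coefficients per frame).
rng = np.random.default_rng(42)
train_frames = rng.normal(size=(500, 13))   # frames pooled from training audio
shot_frames = rng.normal(size=(40, 13))     # frames of a single shot

codebook = learn_codebook(train_frames, k=8)
hist = boaw_histogram(shot_frames, codebook)  # fixed-length shot descriptor
```

In a pipeline of the kind the abstract describes, one such fixed-length histogram per shot would then be passed to a two-class SVM; the SC-based variant would replace the hard nearest-codeword assignment with sparse codes over the dictionary.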