Violence detection in Hollywood movies by the fusion of visual and mid-level audio cues

dc.contributor.author: Acar, Esra
dc.contributor.author: Hopfgartner, Frank
dc.contributor.author: Albayrak, Sahin
dc.date.accessioned: 2018-04-17T08:41:35Z
dc.date.available: 2018-04-17T08:41:35Z
dc.date.issued: 2013
dc.description.abstract: Detecting violent scenes in movies is an important video content understanding functionality, e.g., for providing automated youth protection services. One key issue in designing algorithms for violence detection is the choice of discriminative features. In this paper, we employ mid-level audio features and compare their discriminative power against low-level audio and visual features. We fuse these mid-level audio cues with low-level visual ones at the decision level in order to further improve the performance of violence detection. We use Mel-Frequency Cepstral Coefficients (MFCC) as audio and average motion as visual features. In order to learn a violence model, we choose two-class support vector machines (SVMs). Our experimental results on detecting violent video shots in Hollywood movies show that mid-level audio features are more discriminative and provide more precise results than low-level ones. The detection performance is further enhanced by fusing the mid-level audio cues with low-level visual ones using an SVM-based decision fusion. [en]
dc.identifier.isbn: 978-1-4503-2404-5
dc.identifier.uri: https://depositonce.tu-berlin.de/handle/11303/7608
dc.identifier.uri: http://dx.doi.org/10.14279/depositonce-6798
dc.language.iso: en [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject.ddc: 000 Informatik, Informationswissenschaft, allgemeine Werke [de]
dc.subject.other: algorithms [en]
dc.subject.other: performance [en]
dc.subject.other: experimentation [en]
dc.subject.other: bag-of-audio-words [en]
dc.subject.other: mel-frequency cepstral coefficients [en]
dc.subject.other: motion [en]
dc.subject.other: decision fusion [en]
dc.subject.other: support vector machine [en]
dc.title: Violence detection in Hollywood movies by the fusion of visual and mid-level audio cues [en]
dc.type: Conference Object [en]
dc.type.version: acceptedVersion [en]
dcterms.bibliographicCitation.doi: 10.1145/2502081.2502187 [en]
dcterms.bibliographicCitation.originalpublishername: ACM [en]
dcterms.bibliographicCitation.originalpublisherplace: New York, NY, USA [en]
dcterms.bibliographicCitation.pageend: 720 [en]
dcterms.bibliographicCitation.pagestart: 717 [en]
dcterms.bibliographicCitation.proceedingstitle: Proceedings of the 21st ACM international conference on Multimedia - MM ’13 [en]
dcterms.bibliographicCitation.volume: 2013 [en]
tub.accessrights.dnb: domain [en]
tub.affiliation: Fak. 4 Elektrotechnik und Informatik::Inst. Wirtschaftsinformatik und Quantitative Methoden::FG Agententechnologien in betrieblichen Anwendungen und der Telekommunikation (AOT) [de]
tub.affiliation.faculty: Fak. 4 Elektrotechnik und Informatik [de]
tub.affiliation.group: FG Agententechnologien in betrieblichen Anwendungen und der Telekommunikation (AOT) [de]
tub.affiliation.institute: Inst. Wirtschaftsinformatik und Quantitative Methoden [de]
tub.publisher.universityorinstitution: Technische Universität Berlin [en]

Files

Original bundle
Name: 2013_acar_etal.pdf
Size: 1010.28 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 4.9 KB
Format: Item-specific license agreed upon to submission
