Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine

dc.contributor.author: Chi, Chi Ching
dc.contributor.author: Juurlink, Ben
dc.contributor.author: Meenderinck, Cor
dc.date.accessioned: 2017-10-24T10:05:20Z
dc.date.available: 2017-10-24T10:05:20Z
dc.date.issued: 2010
dc.description.abstract: How to develop efficient and scalable parallel applications is the key challenge for emerging many-core architectures. We investigate this question by implementing and comparing two parallel H.264 decoders on the Cell architecture. It is expected that future many-cores will use a Cell-like local store memory hierarchy, rather than a non-scalable shared memory. The two implemented parallel algorithms, the Task Pool (TP) and the novel Ring-Line (RL) approach, both exploit macroblock-level parallelism. The TP implementation follows the master-slave paradigm and is very dynamic so that in theory perfect load balancing can be achieved. The RL approach is distributed and more predictable in the sense that the mapping of macroblocks to processing elements is fixed. This allows to better exploit data locality, to overlap communication with computation, and to reduce communication and synchronization overhead. While TP is more scalable in theory, the actual scalability favors RL. Using 16 SPEs, RL obtains a scalability of 12x, while TP achieves only 10.3x. More importantly, the absolute performance of RL is much higher. Using 16 SPEs, RL achieves a throughput of 139.6 frames per second (fps) while TP achieves only 76.6 fps. A large part of the additional performance advantage is due to hiding the memory latency. From the results we conclude that in order to fully leverage the performance of future many-cores, a centralized master should be avoided and the mapping of tasks to cores should be predictable in order to be able to hide the memory latency. [en]
dc.identifier.isbn: 978-1-4503-0018-6
dc.identifier.uri: https://depositonce.tu-berlin.de/handle/11303/6924
dc.identifier.uri: http://dx.doi.org/10.14279/depositonce-6263
dc.language.iso: en
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject.ddc: 004 Datenverarbeitung; Informatik
dc.subject.other: H.264 [en]
dc.subject.other: cell [en]
dc.subject.other: decoding [en]
dc.subject.other: parallel [en]
dc.subject.other: programming [en]
dc.subject.other: video [en]
dc.title: Evaluation of parallel H.264 decoding strategies for the Cell Broadband Engine [en]
dc.type: Conference Object [en]
dc.type.version: acceptedVersion [en]
dcterms.bibliographicCitation.doi: 10.1145/1810085.1810102
dcterms.bibliographicCitation.originalpublishername: Association for Computing Machinery (ACM) [en]
dcterms.bibliographicCitation.originalpublisherplace: New York, NY [en]
dcterms.bibliographicCitation.pageend: 114
dcterms.bibliographicCitation.pagestart: 105
dcterms.bibliographicCitation.proceedingstitle: Proceedings of the 24th ACM International Conference on Supercomputing [en]
tub.accessrights.dnb: domain
tub.affiliation: Fak. 4 Elektrotechnik und Informatik::Inst. Technische Informatik und Mikroelektronik::FG Architektur eingebetteter Systeme [de]
tub.affiliation.faculty: Fak. 4 Elektrotechnik und Informatik [de]
tub.affiliation.group: FG Architektur eingebetteter Systeme [de]
tub.affiliation.institute: Inst. Technische Informatik und Mikroelektronik [de]
tub.publisher.universityorinstitution: Technische Universität Berlin [en]

Files

Name: Evaluation_of_Parallel.pdf
Size: 556.04 KB
Format: Adobe Portable Document Format
