Local memory-aware kernel perforation

dc.contributor.author: Maier, Daniel
dc.contributor.author: Cosenza, Biagio
dc.contributor.author: Juurlink, Ben
dc.date.accessioned: 2018-06-04T15:23:14Z
dc.date.available: 2018-06-04T15:23:14Z
dc.date.issued: 2018
dc.description.abstract: Many applications provide inherent resilience to some amount of error and can potentially trade accuracy for performance by using approximate computing. Applications running on GPUs often use local memory to minimize the number of global memory accesses and to speed up execution. Local memory can also be very useful to improve the way approximate computation is performed, e.g., by improving the quality of approximation with data reconstruction techniques. This paper introduces local memory-aware perforation techniques specifically designed for the acceleration and approximation of GPU kernels. We propose a local memory-aware kernel perforation technique that first skips the loading of parts of the input data from global memory, and later uses reconstruction techniques on local memory to reach higher accuracy while having performance similar to state-of-the-art techniques. Experiments show that our approach is able to accelerate the execution of a variety of applications from 1.6× to 3× while introducing an average error of 6%, which is much smaller than that of other approaches. Results further show how much the error depends on the input data and application scenario, the impact of local memory tuning and different parameter configurations. [en]
dc.identifier.isbn: 978-1-4503-5617-6
dc.identifier.uri: https://depositonce.tu-berlin.de/handle/11303/7911
dc.identifier.uri: http://dx.doi.org/10.14279/depositonce-7072
dc.language.iso: en [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject.ddc: 004 Datenverarbeitung; Informatik [de]
dc.subject.other: approximate computing [en]
dc.subject.other: GPU [en]
dc.subject.other: kernel perforation [en]
dc.title: Local memory-aware kernel perforation [en]
dc.type: Conference Object [en]
dc.type.version: acceptedVersion [en]
dcterms.bibliographicCitation.doi: 10.1145/3168814 [en]
dcterms.bibliographicCitation.originalpublishername: Association for Computing Machinery (ACM) [en]
dcterms.bibliographicCitation.originalpublisherplace: New York, NY, USA [en]
dcterms.bibliographicCitation.pageend: 287 [en]
dcterms.bibliographicCitation.pagestart: 278 [en]
dcterms.bibliographicCitation.proceedingstitle: Proceedings of 2018 IEEE/ACM International Symposium on Code Generation and Optimization (CGO’18) [en]
tub.accessrights.dnb: free [en]
tub.affiliation: Fak. 4 Elektrotechnik und Informatik::Inst. Technische Informatik und Mikroelektronik::FG Architektur eingebetteter Systeme [de]
tub.affiliation.faculty: Fak. 4 Elektrotechnik und Informatik [de]
tub.affiliation.group: FG Architektur eingebetteter Systeme [de]
tub.affiliation.institute: Inst. Technische Informatik und Mikroelektronik [de]
tub.publisher.universityorinstitution: Technische Universität Berlin [en]

Files

Original bundle
Name: MaierCGO18.pdf
Size: 688.83 KB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 4.9 KB
Format: Item-specific license agreed upon to submission