Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology

dc.contributor.author: Studer, Stefan
dc.contributor.author: Bui, Thanh Binh
dc.contributor.author: Drescher, Christian
dc.contributor.author: Hanuschkin, Alexander
dc.contributor.author: Winkler, Ludwig
dc.contributor.author: Peters, Steven
dc.contributor.author: Müller, Klaus-Robert
dc.date.accessioned: 2021-05-19T07:58:39Z
dc.date.available: 2021-05-19T07:58:39Z
dc.date.issued: 2021-04-22
dc.date.updated: 2021-05-03T18:56:51Z
dc.description.abstract: Machine learning is an established and frequently used technique in industry and academia, but a standard process model to improve success and efficiency of machine learning applications is still missing. Project organizations and machine learning practitioners face manifold challenges and risks when developing machine learning applications and have a need for guidance to meet business expectations. This paper therefore proposes a process model for the development of machine learning applications, covering six phases from defining the scope to maintaining the deployed machine learning application. Business and data understanding are executed simultaneously in the first phase, as both have considerable impact on the feasibility of the project. The next phases are comprised of data preparation, modeling, evaluation, and deployment. Special focus is applied to the last phase, as a model running in changing real-time environments requires close monitoring and maintenance to reduce the risk of performance degradation over time. With each task of the process, this work proposes quality assurance methodology that is suitable to address challenges in machine learning development that are identified in the form of risks. The methodology is drawn from practical experience and scientific literature, and has proven to be general and stable. The process model expands on CRISP-DM, a data mining process model that enjoys strong industry support, but fails to address machine learning specific tasks. The presented work proposes an industry- and application-neutral process model tailored for machine learning applications with a focus on technical tasks for quality assurance. [en]
dc.description.sponsorship: BMBF, 01IS14013A, Verbundprojekt: BBDC - Berliner Kompetenzzentrum für Big Data [en]
dc.description.sponsorship: BMBF, 01IS18025A, Verbundprojekt BIFOLD-BBDC: Berlin Institute for the Foundations of Learning and Data [en]
dc.description.sponsorship: BMBF, 01IS18037A, Verbundprojekt BIFOLD-BZML: Berlin Institute for the Foundations of Learning and Data [en]
dc.description.sponsorship: BMBF, 01GQ1115, D-JPN Verbund: Adaptive Gehirn-Computer-Schnittstellen (BCI) in nichtstationären Umgebungen [en]
dc.description.sponsorship: BMBF, 01GQ0850, Verbundprojekt: Bernstein Fokus Neurotechnologie - Nichtinvasive Neurotechnologie für Mensch-Maschine Interaktion - Teilprojekte A1, A3, A4, B4, W3, Zentrum [en]
dc.description.sponsorship: DFG, 390685689, EXC 2046: MATH+: Berlin Mathematics Research Center [en]
dc.description.sponsorship: DFG, 414044773, Open Access Publizieren 2021 - 2022 / Technische Universität Berlin [de]
dc.identifier.eissn: 2504-4990
dc.identifier.uri: https://depositonce.tu-berlin.de/handle/11303/13132
dc.identifier.uri: http://dx.doi.org/10.14279/depositonce-11926
dc.language.iso: en [en]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/ [en]
dc.subject.ddc: 004 Datenverarbeitung; Informatik [de]
dc.subject.other: machine learning applications [en]
dc.subject.other: quality assurance methodology [en]
dc.subject.other: process model [en]
dc.subject.other: automotive industry and academia [en]
dc.subject.other: best practices [en]
dc.subject.other: guidelines [en]
dc.title: Towards CRISP-ML(Q): A Machine Learning Process Model with Quality Assurance Methodology [en]
dc.type: Article [en]
dc.type.version: publishedVersion [en]
dcterms.bibliographicCitation.doi: 10.3390/make3020020 [en]
dcterms.bibliographicCitation.issue: 2 [en]
dcterms.bibliographicCitation.journaltitle: Machine Learning and Knowledge Extraction [en]
dcterms.bibliographicCitation.originalpublishername: MDPI [en]
dcterms.bibliographicCitation.originalpublisherplace: Basel [en]
dcterms.bibliographicCitation.pageend: 413 [en]
dcterms.bibliographicCitation.pagestart: 392 [en]
dcterms.bibliographicCitation.volume: 3 [en]
tub.accessrights.dnb: free [en]
tub.affiliation: Fak. 4 Elektrotechnik und Informatik::Inst. Softwaretechnik und Theoretische Informatik::FG Maschinelles Lernen [de]
tub.affiliation.faculty: Fak. 4 Elektrotechnik und Informatik [de]
tub.affiliation.group: FG Maschinelles Lernen [de]
tub.affiliation.institute: Inst. Softwaretechnik und Theoretische Informatik [de]
tub.publisher.universityorinstitution: Technische Universität Berlin [en]

Files

Original bundle
Name: make-03-00020-v3.pdf
Size: 1.2 MB
Format: Adobe Portable Document Format
License bundle
Name: license.txt
Size: 4.9 KB
Format: Item-specific license agreed upon to submission