Stochastic Control for Bayesian Neural Network Training
dc.contributor.author | Winkler, Ludwig | |
dc.contributor.author | Ojeda, César | |
dc.contributor.author | Opper, Manfred | |
dc.date.accessioned | 2022-09-19T11:07:02Z | |
dc.date.available | 2022-09-19T11:07:02Z | |
dc.date.issued | 2022-08-09 | |
dc.date.updated | 2022-09-07T13:01:41Z | |
dc.description.abstract | In this paper, we propose to leverage the Bayesian uncertainty information encoded in parameter distributions to inform the learning procedure for Bayesian models. We derive a first-principles stochastic differential equation for the training dynamics of the mean and uncertainty parameters in the variational distributions. On the basis of the derived Bayesian stochastic differential equation, we apply the methodology of stochastic optimal control to the variational parameters to obtain individually controlled learning rates. We show that the resulting optimizer, StochControlSGD, is significantly more robust to large learning rates and can adaptively and individually control the learning rates of the variational parameters. The evolution of the control suggests separate and distinct dynamical behaviours in the training regimes for the mean and uncertainty parameters in Bayesian neural networks. | en |
dc.description.sponsorship | DFG, 318763901, SFB 1294: Datenassimilation – Die nahtlose Verschmelzung von Daten und Modellen | en |
dc.description.sponsorship | BMBF, 01IS18025A, Verbundprojekt BIFOLD-BBDC: Berlin Institute for the Foundations of Learning and Data | en |
dc.description.sponsorship | BMBF, 01IS18037A, Verbundprojekt BIFOLD-BZML: Berlin Institute for the Foundations of Learning and Data | en |
dc.description.sponsorship | TU Berlin, Open-Access-Mittel – 2022 | |
dc.identifier.eissn | 1099-4300 | |
dc.identifier.uri | https://depositonce.tu-berlin.de/handle/11303/17492 | |
dc.identifier.uri | http://dx.doi.org/10.14279/depositonce-16273 | |
dc.language.iso | en | en |
dc.rights.uri | https://creativecommons.org/licenses/by/4.0/ | en |
dc.subject.ddc | 004 Datenverarbeitung; Informatik | de |
dc.subject.other | Bayesian inference | en |
dc.subject.other | Bayesian neural networks | en |
dc.subject.other | learning | en |
dc.title | Stochastic Control for Bayesian Neural Network Training | en |
dc.type | Article | en |
dc.type.version | publishedVersion | en |
dcterms.bibliographicCitation.articlenumber | 1097 | en |
dcterms.bibliographicCitation.doi | 10.3390/e24081097 | en |
dcterms.bibliographicCitation.issue | 8 | en |
dcterms.bibliographicCitation.journaltitle | Entropy | en |
dcterms.bibliographicCitation.originalpublishername | MDPI | en |
dcterms.bibliographicCitation.originalpublisherplace | Basel | en |
dcterms.bibliographicCitation.volume | 24 | en |
tub.accessrights.dnb | free | en |
tub.affiliation | Fak. 4 Elektrotechnik und Informatik::Inst. Softwaretechnik und Theoretische Informatik::FG Maschinelles Lernen | de |
tub.affiliation.faculty | Fak. 4 Elektrotechnik und Informatik | de |
tub.affiliation.group | FG Maschinelles Lernen | de |
tub.affiliation.institute | Inst. Softwaretechnik und Theoretische Informatik | de |
tub.publisher.universityorinstitution | Technische Universität Berlin | en |