Towards an Optimal Control Perspective of ResNet Training

dc.contributor.author: Püttschneider, Jens
dc.contributor.author: Heilig, Simon
dc.contributor.author: Fischer, Asja
dc.contributor.author: Faulwasser, Timm
dc.date.accessioned: 2025-07-08T12:32:27Z
dc.date.available: 2025-07-08T12:32:27Z
dc.date.issued: 2025
dc.description.abstract: We propose a training formulation for ResNets reflecting an optimal control problem that is applicable for standard architectures and general loss functions. We suggest bridging both worlds via penalizing intermediate outputs of hidden states corresponding to stage cost terms in optimal control. For standard ResNets, we obtain intermediate outputs by propagating the state through the subsequent skip connections and the output layer. We demonstrate that our training dynamic biases the weights of the unnecessary deeper residual layers to vanish. This indicates the potential for a theory-grounded layer pruning strategy. [en]
dc.identifier.uri: http://hdl.handle.net/2003/43792
dc.identifier.uri: http://dx.doi.org/10.17877/DE290R-25566
dc.language.iso: en
dc.relation.ispartofseries: TRR 391 Working Paper; 6
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject: ResNets [en]
dc.subject: optimal control [en]
dc.subject: regularization [en]
dc.subject: network depth [en]
dc.subject.ddc: 310
dc.title: Towards an Optimal Control Perspective of ResNet Training [en]
dc.type: Text
dc.type.publicationtype: WorkingPaper
dcterms.accessRights: open access
eldorado.secondarypublication: false
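
Illustrative sketch (not part of the repository record)

The abstract describes penalizing intermediate outputs of hidden states as stage cost terms. The following is a minimal sketch of that idea and not the authors' implementation. Everything specific in it is an assumption made for illustration: the skip connections are taken to be identities, so the intermediate output of a hidden state reduces to applying the output layer directly to that state; the residual branch w * tanh(x), the linear read-out, the squared-error loss, and the names stage_cost_loss and stage_weight are all hypothetical.

```python
import math


def residual_block(x, w):
    """One residual step x + f(x, w), with the hypothetical branch f = w * tanh(x)."""
    return [xi + w * math.tanh(xi) for xi in x]


def output_layer(x, v):
    """Hypothetical linear read-out: dot product of the hidden state with v."""
    return sum(xi * vi for xi, vi in zip(x, v))


def squared_error(pred, target):
    """Stand-in for a general loss function."""
    return (pred - target) ** 2


def stage_cost_loss(x0, weights, v, target, stage_weight=1.0):
    """Terminal loss plus a penalty on the intermediate output of each hidden state.

    Assumption: skip connections are identities, so sending a hidden state
    through the remaining skips and the output layer is just output_layer(x, v).
    """
    x = list(x0)
    stage_costs = []
    for w in weights:
        # Stage cost: loss of the intermediate output of the current state.
        stage_costs.append(squared_error(output_layer(x, v), target))
        # Advance the state through the next residual block.
        x = residual_block(x, w)
    terminal = squared_error(output_layer(x, v), target)
    total = terminal + stage_weight * sum(stage_costs)
    return total, stage_costs, terminal


if __name__ == "__main__":
    total, stages, terminal = stage_cost_loss(
        x0=[1.0, 2.0], weights=[0.1, 0.0, 0.0], v=[0.5, 0.5], target=1.5
    )
    print("stage costs:", stages)
    print("terminal loss:", terminal)
    print("total loss:", total)
```

In this toy setting a residual block with zero weight leaves the state unchanged, so once an early state already produces the target output, any nonzero weight in a later block only adds cost. This is meant to give intuition for the abstract's statement about deeper layers, not to reproduce the paper's results.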
