# Curriculum vitæ: Felix Dangel

Felix Dangel is a postdoctoral researcher at the Vector Institute in Toronto. He completed his PhD in Philipp Hennig's lab at the University of Tübingen and the Max Planck Institute for Intelligent Systems. His PhD focused on extending gradient backpropagation to efficiently extract higher-order geometrical and statistical information about the loss landscape of neural networks, with the aim of improving their training and inspiring novel algorithmic research. Before that, he studied physics at the University of Stuttgart, with a focus on simulating quantum many-body systems with tensor networks. He is passionate about

• developing automatic differentiation tricks to tackle efficient extraction of richer deep learning quantities, like second-order and per-sample information, and integrating that functionality into machine learning libraries,
• using these quantities to build better algorithms or gain insights into deep learning phenomena, and
• releasing code that empowers the community (see for example cockpit, backpack, and vivit).

## Education

| Period | Position / degree | Details |
|---|---|---|
| 2023 -- now | Postdoctoral researcher, Vector Institute, Toronto | With: Prof. Dr. Yaoliang Yu |
| 2018 -- 2023 | PhD in Computer Science, Max Planck Institute for Intelligent Systems & University of Tübingen | Thesis: Backpropagation beyond the Gradient. Advisor: Prof. Dr. Philipp Hennig |
| 2017 -- 2018 | Researcher, University of Stuttgart | Paper: Topological invariants in dissipative extensions of the Su-Schrieffer-Heeger model. Host: Institute for Theoretical Physics 1 |
| 2015 -- 2017 | MSc in Physics, University of Stuttgart | Thesis: Bosonic many-body systems with topologically nontrivial phases subject to gain and loss. Advisor: P.D. Holger Cartarius |
| 2012 -- 2015 | BSc in Physics, University of Stuttgart | Thesis: Microscopic description of a coupling process for $$\mathcal{PT}$$-symmetric Bose-Einstein condensates. Advisor: Prof. Dr. Günter Wunner |

## Publications

• The Geometry of Neural Nets' Parameter Spaces Under Reparametrization, pre-print 2023
Agustinus Kristiadi, Felix Dangel, Philipp Hennig (pdf | arXiv)
• ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure, TMLR 2022
Felix Dangel, Lukas Tatzel, Philipp Hennig (pdf | journal | arXiv | code | www)

We present novel ways to compute with curvature that allow quantifying noise in both gradient and curvature.

• Cockpit: A Practical Debugging Tool for Training Deep Neural Networks, NeurIPS 2021 (poster)
Frank Schneider, Felix Dangel, Philipp Hennig (pdf | conference | arXiv | code | www | video)

Just as classic debuggers help find bugs in code, our neural debugger assists deep learning engineers in troubleshooting training. It is used to debug non-standard deep learning tasks that, unlike standard problems, do not work out of the box, and may soon be introduced as a tool to ML students in the classroom.

• BackPACK: Packing more into backprop, ICLR 2020 (spotlight)
Felix Dangel, Frederik Kunstner, Philipp Hennig (pdf | conference | arXiv | code | www | video)

The paper received perfect scores before the rebuttal. BackPACK has raised awareness in the community of issues with autodiff in ML frameworks. Since its introduction, major engines like TensorFlow (Google) and PyTorch (Meta) have provided some of its functionality as first-party solutions (TF, PT) using our approach.

• Modular Block-diagonal Curvature Approximations for Feedforward Architectures, AISTATS 2020 (poster)
Felix Dangel, Stefan Harmeling, Philipp Hennig (pdf | conference | arXiv | code | video)

A take on the Hessian chain rule for neural networks: Layer-wise Hessians can be computed with a procedure similar to gradient backpropagation, where the propagated quantities are the Hessians.

• Topological invariants in dissipative extensions of the Su-Schrieffer-Heeger model, Phys. Rev. A 2018
Felix Dangel, Marcel Wagner, Holger Cartarius, Jörg Main, Günter Wunner (pdf | journal | arXiv)
• Numerical calculation of the complex Berry phase in non-Hermitian systems, Acta Polytechnica 2018
Marcel Wagner, Felix Dangel, Holger Cartarius, Jörg Main, Günter Wunner (pdf | journal | arXiv)

## Teaching & Reviewing

Created: 2023-05-23 Tue 18:00
