\(\def\mymacro{{\mathbf{\alpha,\beta,\gamma}}}\)
\(\def\va{{\mathbf{a}}}\)
\(\def\vb{{\mathbf{b}}}\)
\(\def\vc{{\mathbf{c}}}\)
\(\def\vd{{\mathbf{d}}}\)
\(\def\ve{{\mathbf{e}}}\)
\(\def\vf{{\mathbf{f}}}\)
\(\def\vg{{\mathbf{g}}}\)
\(\def\vh{{\mathbf{h}}}\)
\(\def\vi{{\mathbf{i}}}\)
\(\def\vj{{\mathbf{j}}}\)
\(\def\vk{{\mathbf{k}}}\)
\(\def\vl{{\mathbf{l}}}\)
\(\def\vm{{\mathbf{m}}}\)
\(\def\vn{{\mathbf{n}}}\)
\(\def\vo{{\mathbf{o}}}\)
\(\def\vp{{\mathbf{p}}}\)
\(\def\vq{{\mathbf{q}}}\)
\(\def\vr{{\mathbf{r}}}\)
\(\def\vs{{\mathbf{s}}}\)
\(\def\vt{{\mathbf{t}}}\)
\(\def\vu{{\mathbf{u}}}\)
\(\def\vv{{\mathbf{v}}}\)
\(\def\vw{{\mathbf{w}}}\)
\(\def\vx{{\mathbf{x}}}\)
\(\def\vy{{\mathbf{y}}}\)
\(\def\vz{{\mathbf{z}}}\)
\(\def\vmu{{\mathbf{\mu}}}\)
\(\def\vsigma{{\mathbf{\sigma}}}\)
\(\def\vtheta{{\mathbf{\theta}}}\)
\(\def\vzero{{\mathbf{0}}}\)
\(\def\vone{{\mathbf{1}}}\)
\(\def\vell{{\mathbf{\ell}}}\)
\(\def\mA{{\mathbf{A}}}\)
\(\def\mB{{\mathbf{B}}}\)
\(\def\mC{{\mathbf{C}}}\)
\(\def\mD{{\mathbf{D}}}\)
\(\def\mE{{\mathbf{E}}}\)
\(\def\mF{{\mathbf{F}}}\)
\(\def\mG{{\mathbf{G}}}\)
\(\def\mH{{\mathbf{H}}}\)
\(\def\mI{{\mathbf{I}}}\)
\(\def\mJ{{\mathbf{J}}}\)
\(\def\mK{{\mathbf{K}}}\)
\(\def\mL{{\mathbf{L}}}\)
\(\def\mM{{\mathbf{M}}}\)
\(\def\mN{{\mathbf{N}}}\)
\(\def\mO{{\mathbf{O}}}\)
\(\def\mP{{\mathbf{P}}}\)
\(\def\mQ{{\mathbf{Q}}}\)
\(\def\mR{{\mathbf{R}}}\)
\(\def\mS{{\mathbf{S}}}\)
\(\def\mT{{\mathbf{T}}}\)
\(\def\mU{{\mathbf{U}}}\)
\(\def\mV{{\mathbf{V}}}\)
\(\def\mW{{\mathbf{W}}}\)
\(\def\mX{{\mathbf{X}}}\)
\(\def\mY{{\mathbf{Y}}}\)
\(\def\mZ{{\mathbf{Z}}}\)
\(\def\mStilde{\mathbf{\tilde{\mS}}}\)
\(\def\mGtilde{\mathbf{\tilde{\mG}}}\)
\(\def\mGoverline{{\mathbf{\overline{G}}}}\)
\(\def\mBeta{{\mathbf{\beta}}}\)
\(\def\mPhi{{\mathbf{\Phi}}}\)
\(\def\mLambda{{\mathbf{\Lambda}}}\)
\(\def\mSigma{{\mathbf{\Sigma}}}\)
\(\def\tA{{\mathbf{\mathsf{A}}}}\)
\(\def\tB{{\mathbf{\mathsf{B}}}}\)
\(\def\tC{{\mathbf{\mathsf{C}}}}\)
\(\def\tD{{\mathbf{\mathsf{D}}}}\)
\(\def\tE{{\mathbf{\mathsf{E}}}}\)
\(\def\tF{{\mathbf{\mathsf{F}}}}\)
\(\def\tG{{\mathbf{\mathsf{G}}}}\)
\(\def\tH{{\mathbf{\mathsf{H}}}}\)
\(\def\tI{{\mathbf{\mathsf{I}}}}\)
\(\def\tJ{{\mathbf{\mathsf{J}}}}\)
\(\def\tK{{\mathbf{\mathsf{K}}}}\)
\(\def\tL{{\mathbf{\mathsf{L}}}}\)
\(\def\tM{{\mathbf{\mathsf{M}}}}\)
\(\def\tN{{\mathbf{\mathsf{N}}}}\)
\(\def\tO{{\mathbf{\mathsf{O}}}}\)
\(\def\tP{{\mathbf{\mathsf{P}}}}\)
\(\def\tQ{{\mathbf{\mathsf{Q}}}}\)
\(\def\tR{{\mathbf{\mathsf{R}}}}\)
\(\def\tS{{\mathbf{\mathsf{S}}}}\)
\(\def\tT{{\mathbf{\mathsf{T}}}}\)
\(\def\tU{{\mathbf{\mathsf{U}}}}\)
\(\def\tV{{\mathbf{\mathsf{V}}}}\)
\(\def\tW{{\mathbf{\mathsf{W}}}}\)
\(\def\tX{{\mathbf{\mathsf{X}}}}\)
\(\def\tY{{\mathbf{\mathsf{Y}}}}\)
\(\def\tZ{{\mathbf{\mathsf{Z}}}}\)
\(\def\gA{{\mathcal{A}}}\)
\(\def\gB{{\mathcal{B}}}\)
\(\def\gC{{\mathcal{C}}}\)
\(\def\gD{{\mathcal{D}}}\)
\(\def\gE{{\mathcal{E}}}\)
\(\def\gF{{\mathcal{F}}}\)
\(\def\gG{{\mathcal{G}}}\)
\(\def\gH{{\mathcal{H}}}\)
\(\def\gI{{\mathcal{I}}}\)
\(\def\gJ{{\mathcal{J}}}\)
\(\def\gK{{\mathcal{K}}}\)
\(\def\gL{{\mathcal{L}}}\)
\(\def\gM{{\mathcal{M}}}\)
\(\def\gN{{\mathcal{N}}}\)
\(\def\gO{{\mathcal{O}}}\)
\(\def\gP{{\mathcal{P}}}\)
\(\def\gQ{{\mathcal{Q}}}\)
\(\def\gR{{\mathcal{R}}}\)
\(\def\gS{{\mathcal{S}}}\)
\(\def\gT{{\mathcal{T}}}\)
\(\def\gU{{\mathcal{U}}}\)
\(\def\gV{{\mathcal{V}}}\)
\(\def\gW{{\mathcal{W}}}\)
\(\def\gX{{\mathcal{X}}}\)
\(\def\gY{{\mathcal{Y}}}\)
\(\def\gZ{{\mathcal{Z}}}\)
\(\def\sA{{\mathbb{A}}}\)
\(\def\sB{{\mathbb{B}}}\)
\(\def\sC{{\mathbb{C}}}\)
\(\def\sD{{\mathbb{D}}}\)
\(\def\sF{{\mathbb{F}}}\)
\(\def\sG{{\mathbb{G}}}\)
\(\def\sH{{\mathbb{H}}}\)
\(\def\sI{{\mathbb{I}}}\)
\(\def\sJ{{\mathbb{J}}}\)
\(\def\sK{{\mathbb{K}}}\)
\(\def\sL{{\mathbb{L}}}\)
\(\def\sM{{\mathbb{M}}}\)
\(\def\sN{{\mathbb{N}}}\)
\(\def\sO{{\mathbb{O}}}\)
\(\def\sP{{\mathbb{P}}}\)
\(\def\sQ{{\mathbb{Q}}}\)
\(\def\sR{{\mathbb{R}}}\)
\(\def\sS{{\mathbb{S}}}\)
\(\def\sT{{\mathbb{T}}}\)
\(\def\sU{{\mathbb{U}}}\)
\(\def\sV{{\mathbb{V}}}\)
\(\def\sW{{\mathbb{W}}}\)
\(\def\sX{{\mathbb{X}}}\)
\(\def\sY{{\mathbb{Y}}}\)
\(\def\sZ{{\mathbb{Z}}}\)
\(\def\E{{\mathbb{E}}}\)
\(\def\jac{{\mathbf{\mathrm{J}}}}\)
\(\def\argmax{{\mathop{\mathrm{arg}\,\mathrm{max}}}}\)
\(\def\argmin{{\mathop{\mathrm{arg}\,\mathrm{min}}}}\)
\(\def\Tr{{\mathop{\mathrm{Tr}}}}\)
\(\def\diag{{\mathop{\mathrm{diag}}}}\)
\(\def\vec{{\mathop{\mathrm{vec}}}}\)
\(\def\Kern{{\mathop{\mathrm{Kern}}}}\)
\(\def\llbracket{⟦}\)
\(\def\rrbracket{⟧}\)

Curriculum vitæ: Felix Dangel

(Download as pdf)

Felix Dangel is a postdoctoral researcher at the Vector Institute in Toronto. He completed his PhD in Philipp Hennig's lab at the University of Tübingen and the Max Planck Institute for Intelligent Systems. His PhD focused on extending gradient backpropagation to efficiently extract higher-order geometric and statistical information about the loss landscape of neural networks, with the goal of improving their training and inspiring novel algorithmic research. Before that, he studied physics at the University of Stuttgart, focusing on the simulation of quantum many-body systems with tensor networks.

Education

2023–now   Postdoctoral researcher, Vector Institute, Toronto
           With: Prof. Dr. Yaoliang Yu

2018–2023  PhD in Computer Science, Max Planck Institute for Intelligent Systems & University of Tübingen
           Thesis: Backpropagation beyond the Gradient
           Advisor: Prof. Dr. Philipp Hennig

2017–2018  Researcher, University of Stuttgart
           Paper: Topological invariants in dissipative extensions of the Su-Schrieffer-Heeger model
           Host: Institute for Theoretical Physics 1

2015–2017  MSc in Physics, University of Stuttgart
           Thesis: Bosonic many-body systems with topologically nontrivial phases subject to gain and loss
           Advisor: PD Dr. Holger Cartarius

2012–2015  BSc in Physics, University of Stuttgart
           Thesis: Microscopic description of a coupling process for \(\mathcal{PT}\)-symmetric Bose-Einstein condensates
           Advisor: Prof. Dr. Günter Wunner

Publications

  • Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC for Large Neural Nets, pre-print 2023
    W. Lin, F. Dangel, R. Eschenhagen, K. Neklyudov, A. Kristiadi, R. Turner, A. Makhzani (code)
  • On the Disconnect Between Theory and Practice of Overparametrized Neural Networks, pre-print 2023
    J. Wenger, F. Dangel, A. Kristiadi (pdf | arXiv)
  • Convolutions Through the Lens of Tensor Networks, pre-print 2023
    F. Dangel (pdf | arXiv | code)
  • The Geometry of Neural Nets' Parameter Spaces Under Reparametrization, NeurIPS 2023
    A. Kristiadi, F. Dangel, P. Hennig (pdf | arXiv)
  • ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure, TMLR 2022
    F. Dangel, L. Tatzel, P. Hennig (pdf | journal | arXiv | code | www)

    We present novel ways to compute with curvature that make it possible to quantify noise in both the gradient and the curvature.

  • Cockpit: A Practical Debugging Tool for Training Deep Neural Networks, NeurIPS 2021 (poster)
    F. Schneider, F. Dangel, P. Hennig (pdf | conference | arXiv | code | www | video)

    Just like classic debuggers help to track down bugs in code, our neural debugger assists deep learning engineers in troubleshooting training. It is used to debug non-standard deep learning tasks that, in contrast to standard problems, do not work out of the box, and may soon be introduced to ML students as a classroom tool.

  • BackPACK: Packing more into backprop, ICLR 2020 (spotlight)
    F. Dangel, F. Kunstner, P. Hennig (pdf | conference | arXiv | code | www | video)

    The paper received perfect scores before the rebuttal. BackPACK has raised the community's awareness of shortcomings of automatic differentiation in ML frameworks. Since its introduction, major engines like TensorFlow (Google) and PyTorch (Meta) have provided some of its functionality as first-party solutions (TF, PT) based on our approach.

  • Modular Block-diagonal Curvature Approximations for Feedforward Architectures, AISTATS 2020 (poster)
    F. Dangel, S. Harmeling, P. Hennig (pdf | conference | arXiv | code | video)

    A take on the Hessian chain rule for neural networks: layer-wise Hessians can be computed with a procedure similar to gradient backpropagation, where the propagated quantities are Hessians instead of gradients.

  • Topological invariants in dissipative extensions of the Su-Schrieffer-Heeger model, Phys. Rev. A 2018
    F. Dangel, M. Wagner, H. Cartarius, J. Main, G. Wunner (pdf | journal | arXiv)
  • Numerical calculation of the complex Berry phase in non-Hermitian systems, Acta Polytechnica 2018
    M. Wagner, F. Dangel, H. Cartarius, J. Main, G. Wunner (pdf | journal | arXiv)
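The Hessian backpropagation idea from the AISTATS 2020 paper above can be illustrated with a minimal NumPy sketch (illustrative only, not the paper's code): for one layer \(\vz = \tanh(\mW\vx)\) with loss \(\tfrac{1}{2}\lVert\vz\rVert^2\), the Hessian with respect to the input is obtained by propagating the output Hessian backward through the layer, \(\mH_{\vx} = \mW^\top (\mD \mH_{\vz} \mD + \diag(\vg_{\vz} \odot \tanh''(\vs))) \mW\), and can be checked against finite differences.

```python
import numpy as np

rng = np.random.default_rng(0)

# One layer z = tanh(W x) with loss 0.5 * ||z||^2.
W = rng.standard_normal((3, 4))
x = rng.standard_normal(4)

s = W @ x
z = np.tanh(s)
g_z = z            # gradient of 0.5 * ||z||^2 w.r.t. z
H_z = np.eye(3)    # Hessian of 0.5 * ||z||^2 w.r.t. z

d1 = 1.0 - z**2    # tanh'(s)
d2 = -2.0 * z * d1 # tanh''(s)

# Hessian backpropagation: propagate H_z backward through the layer.
D = np.diag(d1)
H_x = W.T @ (D @ H_z @ D + np.diag(g_z * d2)) @ W

# Sanity check against a finite-difference Hessian of
# f(x) = 0.5 * ||tanh(W x)||^2.
def f(v):
    return 0.5 * np.sum(np.tanh(W @ v) ** 2)

eps = 1e-4
H_num = np.zeros((4, 4))
for i in range(4):
    for j in range(4):
        e_i, e_j = np.eye(4)[i], np.eye(4)[j]
        H_num[i, j] = (
            f(x + eps * (e_i + e_j)) - f(x + eps * e_i)
            - f(x + eps * e_j) + f(x)
        ) / eps**2

assert np.allclose(H_x, H_num, atol=1e-4)
```

The same recursion applies layer by layer, which is what makes the procedure modular: each module only needs to know how to push a Hessian through itself.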

Talks & Workshops

Teaching & Reviewing

Created: 2023-12-13 Wed 10:01
