Hi, I'm Felix!

I am a Postdoc at the Vector Institute in Toronto.

I did my PhD in Philipp Hennig's lab (and the IMPRS-IS) in Tübingen. Before that, I did my BSc and MSc in Physics at the University of Stuttgart.

You can contact me via GitHub, Twitter, or email.


Papers

Check out my Google Scholar profile for an always up-to-date publication record.

  • Structured Inverse-Free Natural Gradient: Memory-Efficient & Numerically-Stable KFAC for Large Neural Nets, pre-print 2023
    W. Lin, F. Dangel, R. Eschenhagen, K. Neklyudov, A. Kristiadi, R. Turner, A. Makhzani (code)
  • On the Disconnect Between Theory and Practice of Overparametrized Neural Networks, pre-print 2023
    J. Wenger, F. Dangel, A. Kristiadi (pdf | arXiv)
  • Convolutions Through the Lens of Tensor Networks, pre-print 2023
    F. Dangel (pdf | arXiv | code)
  • The Geometry of Neural Nets' Parameter Spaces Under Reparametrization, NeurIPS 2023
    A. Kristiadi, F. Dangel, P. Hennig (pdf | arXiv)
  • ViViT: Curvature access through the generalized Gauss-Newton's low-rank structure, TMLR 2022
    F. Dangel, L. Tatzel, P. Hennig (pdf | journal | arXiv | code | www)
  • Cockpit: A Practical Debugging Tool for Training Deep Neural Networks, NeurIPS 2021
    F. Schneider, F. Dangel, P. Hennig (pdf | conference | arXiv | code | www | video)
  • Modular Block-diagonal Curvature Approximations for Feedforward Architectures, AISTATS 2020
    F. Dangel, S. Harmeling, P. Hennig (pdf | conference | arXiv | code | video)
  • BackPACK: Packing more into backprop, ICLR 2020
    F. Dangel, F. Kunstner, P. Hennig (pdf | conference | arXiv | code | www | video)
  • Topological invariants in dissipative extensions of the Su-Schrieffer-Heeger model, Phys. Rev. A 2018
    F. Dangel, M. Wagner, H. Cartarius, J. Main, G. Wunner (pdf | journal | arXiv)
  • Numerical calculation of the complex berry phase in non-Hermitian systems, Acta Polytechnica 2018
    M. Wagner, F. Dangel, H. Cartarius, J. Main, G. Wunner (pdf | journal | arXiv)

Theses:

  • Backpropagation Beyond the Gradient
    PhD thesis 2023 (pdf | source | template)
  • Bosonic many-body systems with topologically nontrivial phases subject to gain and loss
    Master's thesis 2017 (pdf)
  • Mikroskopische Beschreibung eines Einkoppelprozesses für PT-symmetrische Bose-Einstein-Kondensate
    Bachelor's thesis 2015 (pdf, German only)

Code

Check out my GitHub profile for an always up-to-date list. Some highlights:

Cockpit (co-maintainer)
A practical debugging tool for training deep neural networks in PyTorch.
BackPACK (maintainer)
A backpropagation package on top of PyTorch that efficiently computes more than the gradient.
unfoldNd (maintainer)
N=1,2,3-dimensional unfold (im2col) and fold (col2im) in PyTorch.
ViViT (maintainer)
Curvature access (eigenvalues, eigenvectors, directional derivatives & Newton steps) through the generalized Gauss-Newton's low-rank structure.
curvlinops (maintainer)
SciPy linear operators for the Hessian, Fisher/GGN, and more in PyTorch.
singd (maintainer)
KFAC-like Structured Inverse-Free Natural Gradient Descent.
einconv (maintainer)
Convolutions and more as einsum for PyTorch.

Notes

An ongoing note and code snippet collection. To navigate to a post, click on its title.


KFAC explained

How to arrive at the Kronecker-factorized Hessian approximations, how to generalize them to transpose convolutions, and how to link them to other approximations.


Printing a poster towel

How I printed my poster towel for the ELLIS Doctoral Symposium 2022 in Alicante 🏖.


Expanding einsum expressions

A utility function to combine nested einsum expressions.
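
To illustrate what combining nested einsum expressions means, here is a minimal sketch. It is not the utility function from the post: the equation is expanded by hand, and the tensor names `A`, `B`, `C` and their shapes are assumptions chosen for the example.

```python
import torch

torch.manual_seed(0)
# Hypothetical operands; double precision keeps the comparison robust.
A = torch.rand(2, 3, dtype=torch.float64)
B = torch.rand(3, 4, dtype=torch.float64)
C = torch.rand(4, 5, dtype=torch.float64)

# Nested form: the result of the inner einsum is an operand of the outer one.
inner = torch.einsum("ab,bc->ac", A, B)
nested = torch.einsum("ij,jk->ik", inner, C)

# Expanded form: substitute the inner equation into the outer one by
# renaming the outer indices (i -> a, j -> c) and inlining A and B.
expanded = torch.einsum("ab,bc,ck->ak", A, B, C)

# Both forms describe the same contraction.
same = torch.allclose(nested, expanded)
```

A single expanded expression avoids materializing the intermediate tensor explicitly and hands the whole contraction to one einsum call.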


Structural implications of batch normalization

BN spoils the concept of per-sample quantities (like individual gradients). Which structure remains?


Hessian row sum in PyTorch

Example use case for Hessian-vector products in PyTorch (using a utility function in BackPACK).
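
The Hessian row sum equals the Hessian applied to a vector of ones, so one Hessian-vector product suffices. The sketch below shows this with plain `torch.autograd` (double backward) rather than the BackPACK utility the post refers to. The helper `hessian_row_sum` and the test function `quadratic` are hypothetical names introduced for illustration.

```python
import torch


def hessian_row_sum(f, x):
    """Return H @ 1, the row sums of the Hessian of the scalar function f at x."""
    x = x.detach().requires_grad_(True)
    # First backward pass: gradient, kept differentiable via create_graph.
    (grad,) = torch.autograd.grad(f(x), x, create_graph=True)
    # Second backward pass: vector-Jacobian product of the gradient with a
    # vector of ones. The Hessian is symmetric, so this equals H @ 1.
    ones = torch.ones_like(x)
    (row_sum,) = torch.autograd.grad(grad, x, grad_outputs=ones)
    return row_sum


# Hypothetical test problem: f(x) = 0.5 * x^T A x with symmetric A has Hessian A.
A = torch.tensor([[2.0, 1.0], [1.0, 3.0]])


def quadratic(x):
    return 0.5 * x @ A @ x


x0 = torch.tensor([0.5, -1.0])
row_sum = hessian_row_sum(quadratic, x0)  # row sums of A: [3., 4.]
```

The Hessian is never built explicitly, so the cost is roughly that of two backward passes regardless of the number of parameters.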


My template for new posts

My website is an .org file exported to HTML with ReadTheOrg. This snippet is for new posts.

Org mode has been a great and free tool throughout, and after, my PhD (task and time management, notes, website, …). You can support its maintainers!

Author: Felix Dangel

Created: 2023-12-06 Wed 20:30
