On Thursday I'll be presenting a paper on our new system for differentiating implicit functions. A 🧵 on the paper (arxiv.org/abs/2105.15183)

The motivating problem is that of computing the Jacobian of an implicit function, that is, a function defined as the solution of an optimization problem or as the solution of a non-linear equation.
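In symbols (a minimal restatement in my own notation, not the paper's):

```latex
\[
  x^\star(\theta) \in \operatorname*{arg\,min}_{x} f(x, \theta)
  \qquad \text{or} \qquad
  F\big(x^\star(\theta), \theta\big) = 0,
\]
and the quantity we want is the Jacobian $\partial x^\star(\theta)$ of the
solution with respect to the parameters $\theta$.
```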

This problem arises in many machine learning problems, such as bi-level optimization (hyperparameter optimization, meta-learning, dataset distillation, etc.) or optimization as a layer. Sometimes this Jacobian is interesting in its own right, for example to perform a sensitivity analysis with respect to hyperparameters.

One of the main approaches to computing the Jacobian of an implicit function is through the implicit function theorem. This approach first casts the definition of the implicit function as the solution of a (potentially non-linear) equation, for example using the KKT conditions. It then differentiates both sides of this equation, which in turn gives an implicit formula for the desired Jacobian. The Jacobian can then be recovered by solving a linear system of equations.
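Written out (same notation as above), differentiating both sides of the optimality conditions with respect to $\theta$ gives

```latex
\[
  \partial_1 F\big(x^\star(\theta), \theta\big)\,\partial x^\star(\theta)
  + \partial_2 F\big(x^\star(\theta), \theta\big) = 0,
\]
so the Jacobian is recovered by solving the linear system
\[
  \partial_1 F \; \partial x^\star(\theta) = -\,\partial_2 F.
\]
```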

One drawback of this approach is that it's cumbersome to implement. Each problem is defined by a different implicit equation, so the system of equations that determines the Jacobian needs to be derived anew for every problem. Furthermore, non-smooth problems (such as the Lasso) require a different treatment due to the non-differentiability of their implicit formulation.

Our main contribution is an implementation that abstracts the problem-specific aspects of this derivation.

In this framework, the user defines a mapping F capturing the optimality conditions of the problem solved by the algorithm. This function is then passed to a Python decorator, @custom_root, placed on top of the declaration of the solver we wish to differentiate. And that's it!
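Here's a minimal sketch for ridge regression (the import path jaxopt.implicit_diff and the closed-form solver are my choices for illustration, not the paper's example):

```python
import jax
import jax.numpy as jnp
from jaxopt import implicit_diff

# Optimality conditions for ridge regression: the gradient of
# 0.5 * ||X w - y||^2 + 0.5 * lam * ||w||^2 must vanish at the solution.
def F(w, lam, X, y):
    return X.T @ (X @ w - y) + lam * w

@implicit_diff.custom_root(F)
def ridge_solver(init_w, lam, X, y):
    # Any solver works here; a closed-form solve keeps the sketch short.
    # init_w is unused, but the decorated solver takes the initial
    # parameters as its first argument.
    d = X.shape[1]
    return jnp.linalg.solve(X.T @ X + lam * jnp.eye(d), X.T @ y)
```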

Calling jax.jacobian will then return the Jacobian of the implicit function.
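Continuing the ridge sketch above:

```python
X = jnp.arange(15.0).reshape(5, 3)
y = jnp.ones(5)
w0 = jnp.zeros(3)

# Jacobian of the solution with respect to the hyperparameter lam,
# obtained by implicit differentiation rather than unrolling the solver.
dw_dlam = jax.jacobian(ridge_solver, argnums=1)(w0, 10.0, X, y)
```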

Under the hood, the system generates the matrices appearing in the implicit function theorem and solves the resulting linear system. All of this is transparent to the user, who only needs to provide the optimality conditions of the implicit function.
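For intuition, this is roughly what that amounts to for the ridge sketch above (a dense, materialize-everything version of the idea, not the library's actual internals):

```python
def implicit_jacobian(w_star, lam, X, y):
    # A = dF/dw and B = dF/dlam, both evaluated at the solution w_star.
    A = jax.jacobian(F, argnums=0)(w_star, lam, X, y)
    B = jax.jacobian(F, argnums=1)(w_star, lam, X, y)
    # Implicit function theorem: A @ (dw*/dlam) = -B.
    return -jnp.linalg.solve(A, B)
```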

The result is an exceptionally effective and easy-to-use system. Furthermore, the software package also contains helper functions that provide the optimality conditions for common solvers such as projected gradient descent, mirror descent, QPs, etc.
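For instance, with a built-in solver the optimality conditions don't need to be written at all. A sketch, reusing the imports and data from above (the ProjectedGradient class and its run signature are my assumptions about the library's API and may differ across versions):

```python
from jaxopt import ProjectedGradient
from jaxopt.projection import projection_non_negative

def objective(w, lam, X, y):
    return 0.5 * jnp.sum((X @ w - y) ** 2) + 0.5 * lam * jnp.sum(w ** 2)

def nnls_solution(lam, X, y):
    # Built-in solvers are differentiable out of the box: the library
    # supplies their optimality conditions, so no @custom_root is needed.
    pg = ProjectedGradient(fun=objective, projection=projection_non_negative)
    return pg.run(jnp.zeros(X.shape[1]), None, lam, X, y).params

dw_dlam = jax.jacobian(nnls_solution)(10.0, X, y)
```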


This is joint work with Mathieu Blondel (@mblondel), Quentin Berthet, Marco Cuturi, Roy Frostig, Stephan Hoyer, Felipe Llinares-López, and Jean-Philippe Vert. Since the software was first published, we've also received many open source contributions, for which we're immensely grateful!