Home
- Move to live doc
- Inverting through reductions (sum); a hand-written sketch follows this list
- Scattering, general multi-stage updates
- Non-invertible Func dependencies (should become scattering update Funcs in the general case)
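A rough illustration of the reduction/scattering bullets above: the reverse-mode derivative of a gathering sum reduction is a scattering update definition. This is a hand-written sketch in plain Halide of what the transform would need to produce; the names, the 3-tap window, and the fixed 10-element output domain are just for illustration, not generated code.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    Var x("x");
    Func input("input"), blurred("blurred");
    input(x) = cast<float>(x);

    // Forward: a gathering sum reduction over a 3-tap window.
    RDom r(0, 3);
    blurred(x) = sum(input(x + r));

    // Reverse (written by hand for a fixed 10-element output): each input
    // element's adjoint accumulates the adjoints of every output that read
    // it, which turns the gather into a scattering update definition.
    Func d_blurred("d_blurred"), d_input("d_input");
    d_blurred(x) = 1.0f;                  // placeholder upstream adjoint
    d_input(x) = 0.0f;
    RDom s(0, 3, 0, 10);                  // s.x = tap, s.y = output index
    d_input(s.y + s.x) += d_blurred(s.y); // scatter into the input's adjoint

    d_input.realize({12});                // adjoint indices 0..11 get touched
    return 0;
}
```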
There is a lot of power in supporting the full language: if we stay closed under the gradient transform, higher-order derivatives come from applying it repeatedly.
"Forward-over-reverse" is common for getting second-order Hessian-vector products, so we may want forward mode for that. It's also useful for Gauss-Newton on image unknowns (Opt). Related discussion by one of the autograd authors here: https://j-towns.github.io/2017/06/12/A-new-trick.html. (The autograd guys are among the best and clearest thinkers about AD in the ML community, from what I've seen.)
```cpp
auto d = propagate_adjoints(myPipeline);
auto someGrad = d(myPipeline) / d(param);
```
One thing I realize writing this down: we need to be sure we can effectively output many differently sized/shaped grad terms from a single run of the pipeline.
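One way the call pattern sketched above might look concretely, assuming `propagate_adjoints` returns an object that can be indexed by a Func to get its adjoint; the `Derivative`-style return type, the toy Funcs, and the shapes here are assumptions for illustration, not a committed interface. It also shows the "many differently sized/shaped grad terms from a single run" point: gradients of a 2D image Func and a 1D weights Func grouped into one Pipeline.

```cpp
#include "Halide.h"
using namespace Halide;

int main() {
    Var x("x"), y("y");

    // Toy forward pipeline: a 2D "image" Func and a 1D "weights" Func
    // feeding a scalar loss.
    Func image("image"), weights("weights"), weighted("weighted"), loss("loss");
    image(x, y) = cast<float>(x + y);
    weights(x) = 0.1f * cast<float>(x);
    weighted(x, y) = weights(x) * image(x, y);
    RDom r(0, 16, 0, 8);
    loss() = sum(weighted(r.x, r.y) * weighted(r.x, r.y));

    // Build all adjoints in one transform...
    auto d = propagate_adjoints(loss);
    // ...then pull out gradients with respect to whatever we care about.
    Func d_image = d(image);      // 2D gradient term
    Func d_weights = d(weights);  // 1D gradient term, a different shape

    // Differently shaped gradient terms coming out of a single run:
    Pipeline grads({d_image, d_weights});
    Buffer<float> g_image(16, 8), g_weights(16);
    grads.realize({g_image, g_weights});
    return 0;
}
```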
Are we losing anything by not being able to apply gradients to update some weights in the middle of the backprop pipeline, rather than having to generate all the gradients before applying or discarding any of them?
- Python bindings
  - compile_to_tensorflow
- General optimization frameworks we can fit into?