add stochastic methods #52
Conversation
Hello @kilianFatras, and thank you for the work. We needed a stochastic implementation and you have provided a good one. But there are still a few things that need to be done before merging.
Note that all the tests are run again after each commit push. Again, thank you for the work; these are only remarks aimed at a better integration of your algorithm into POT. Rémi
(force-pushed from ef14ddb to e2d06ef)
(force-pushed from 4008808 to 3617b42)
(force-pushed from 14e4aaa to 055417e)
Thank you @kilianFatras,
the code looks great, I left a few comments
ot/stochastic.py
Outdated
return u

def transportation_matrix_entropic(a, b, M, reg, method, numItermax=10000,
rename the function to solve_entropic
or solve_semi_dual_entropic
ot/stochastic.py
Outdated
for i in range(n_source):
    r = M[i, :] - v
    exp_v = np.exp(-r / reg) * b
    u[i] = - reg * np.log(np.sum(exp_v))
should we use a stabilized version of logsumexp?
https://en.wikipedia.org/wiki/LogSumExp
so that the algorithm does not blow up numerically
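A hedged sketch of what the stabilized version could look like (the helper name and signature are illustrative, not the actual POT API); it shifts by the largest exponent before exponentiating, the standard logsumexp trick:

```python
import numpy as np

def c_transform_stable(b, M_row, v, reg):
    # Stabilized replacement for:
    #   -reg * log(sum(exp((v - M_row) / reg) * b))
    # Subtracting the max exponent before calling exp avoids overflow
    # when (v - M_row) / reg is large (e.g. for small reg).
    x = (v - M_row) / reg
    x_max = np.max(x)
    return -reg * (x_max + np.log(np.sum(b * np.exp(x - x_max))))
```

For moderate values it agrees with the naive formula, while staying finite where the naive one overflows.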
np.testing.assert_allclose(
    zero, (G_asgd - G_sinkhorn).sum(1), atol=1e-03)  # cf convergence asgd
np.testing.assert_allclose(
    zero, (G_asgd - G_sinkhorn).sum(0), atol=1e-03)  # cf convergence asgd
also test the whole matrix and not only the marginals, with np.testing.assert_allclose(G_asgd, G_sinkhorn, atol=1e-03)
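To illustrate why this matters, here is a made-up pair of plans whose marginals agree while the entries do not; only the full-matrix check catches the difference:

```python
import numpy as np

# Two transport plans with identical marginals but different entries.
G_a = np.array([[0.3, 0.2], [0.2, 0.3]])
G_b = np.array([[0.25, 0.25], [0.25, 0.25]])

zero = np.zeros(2)
# The marginal test passes: row sums and column sums both match.
np.testing.assert_allclose(zero, (G_a - G_b).sum(1), atol=1e-03)
# The full-matrix test fails: entries differ by 0.05 > 1e-03.
try:
    np.testing.assert_allclose(G_a, G_b, atol=1e-03)
    entrywise_equal = True
except AssertionError:
    entrywise_equal = False
```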
ot/stochastic.py
Outdated
opt_u = c_transform_entropic(b, M, reg, opt_v)
pi = (np.exp((opt_u[:, None] + opt_v[None, :] - M[:, :]) / reg) *
      a[:, None] * b[None, :])
return pi
Please add a log parameter to the function that returns a dictionary with u and v (probably better named alpha and beta; see sinkhorn/emd, where emd also returns the dual variables).
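A hedged sketch of the requested interface (the function name and signature are illustrative, not the final POT API), mirroring the sinkhorn/emd convention of returning a log dict with the dual variables:

```python
import numpy as np

def make_plan_with_log(a, b, M, reg, alpha, beta, log=False):
    # Build the entropic transport plan from the optimized duals
    # (alpha, beta stand in for opt_u, opt_v) and, when log=True,
    # also return them in a dictionary.
    pi = (np.exp((alpha[:, None] + beta[None, :] - M) / reg)
          * a[:, None] * b[None, :])
    if log:
        return pi, {'alpha': alpha, 'beta': beta}
    return pi
```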
(force-pushed from dd3398b to 74cfe5a)
(force-pushed from e77c487 to e068b58)
(force-pushed from f1fa080 to 52134e9)
(force-pushed from 49a50b6 to 7073e41)
(force-pushed from 6837b9c to 6777ffd)
(force-pushed from 4aa054f to af5f726)
(force-pushed from 03d7e25 to e8cf3cc)
Hello @kilianFatras,
The code is far better now and the function names are much clearer.
The example file does not run, due to a small bug discussed in the comments. I have also seen some problems with the documentation compilation, caused by small typos and missing lines.
The dual SGD is a bit slow, but I guess it is difficult to compete with SAG and ASGD.
Again, thank you for your work; we will merge shortly now.
examples/plot_stochastic.py
Outdated
method = "ASGD"
asgd_pi, log = ot.stochastic.solve_semi_dual_entropic(a, b, M, reg, method,
                                                      numItermax, log)
Bug: log is passed as the wrong positional argument. Replace log with log_asgd on the left-hand side, and pass log=log as a keyword argument on the right-hand side.
import matplotlib.pylab as pl
import numpy as np
import ot
also import ot.plot, because it is not imported by default
Compute the coordinate gradient update for regularized discrete
distributions for (i, :)

The function computes the gradient of the semi dual problem:
add a blank line before the math directive, for proper math rendering in the documentation
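For illustration, reST/Sphinx requires a blank line between the prose and the .. math:: directive; the equation below is a placeholder, not the actual one from ot/stochastic.py:

```python
def documented_solver():
    r"""The function solves the following optimization problem:

    .. math::
        \gamma = \arg\min_\gamma \langle \gamma, M \rangle_F
        + \mathrm{reg} \cdot \Omega(\gamma)

    Without the blank line above ``.. math::``, Sphinx emits a warning
    and the equation is not rendered.
    """
    return None
```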
ot/stochastic.py
Outdated
optimal transport max problem

The function solves the following optimization problem:
.. math::
same here
ot/stochastic.py
Outdated
optimal transport max problem

The function solves the following optimization problem:
.. math::
same here
ot/stochastic.py
Outdated
measures optimal transport max problem

The function solves the following optimization problem:
.. math::
same here
ot/stochastic.py
Outdated
Computes the partial gradient of F_\W_varepsilon

Compute the partial gradient of the dual problem:
..Math:
same here, use math instead of Math
ot/stochastic.py
Outdated
Computes the partial gradient of F_\W_varepsilon

Compute the partial gradient of the dual problem:
..Math:
same here, use math instead of Math
ot/stochastic.py
Outdated
optimal transport dual problem

The function solves the following optimization problem:
.. math::
same here
ot/stochastic.py
Outdated
optimal transport dual problem

The function solves the following optimization problem:
.. math::
same here, use math instead of Math
(force-pushed from 29f025a to 9fecd51)
I think that SGD's performance also depends on the hyperparameters. It is really hard to find a good batch size or learning rate. I tried many, but since this is the dual problem, it is not as easy as for the semi-dual problem. In [Genevay et al.], they pointed out that for the semi-dual problem you can get an upper bound on the Lipschitz constant, which is not the case for the dual problem. I will keep working on this, as it can increase the convergence speed.
(force-pushed from 011f5e1 to 968ad58)
(force-pushed from 64d504b to 208ff46)
SAG for discrete measures and ASGD for semi-continuous measures.