Skip to content

add stochastic methods #52

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 16 commits into from
Jun 27, 2018
Merged

Conversation

kilianFatras
Copy link
Contributor

SAG for discrete measures and ASGD for semi continuous measure.

@rflamary
Copy link
Collaborator

Hello @kilianFatras and thank you for the job. We needed the stochastic implementation and you provided a good implementation there.

But there are still a few things that need to be done before merging

  • The PR do not build because of pep8 problems and failing tests.
  • The submodule stochastic is not imported in module ot which is why it does not work in the test. You need to add from . import stochastic to the ot/__init__.py file in order to import it while doing import ot.
  • The order of the parameters in all your function should follow closely the order in other functions such as ot.emd and ot.sinkhorn so that you can switch from one to the other easily (at least a,b M , and reg (which you call epsilon and should be renamed also) followed by optimization parameters).
  • n_source and n_target are not useful since those dimensions can be inferred from a,b or M. You also should provide default values for n_iter (renamed to numItermax to fit ot.emd and ot.sinkhorn functions) and lr.
  • In the test i would like you to also test that your solver return the same solution as the ot.sinkkorn function. This is an important test since both solver address the same optimization problem with different algorithms.
  • please use POT functions for providing uniform distributions and computing distance matrices M (in the doc, the test and the examples). uniform distributions of dimensionality n are obtained easily by ot.unif, and M can be computed very easily with ot.dist function.
  • We try to provide in the documentation of the function the optimization problem that is solved (see ot.emd or ot.sinkhorn). So plesa express the dual problem solved by the saga and asgd function. The c transform should also appear in the documentation since it can be used in numerous occasions. Also try to provide the euaton or algorithm number of the reference paper use in the documentation of all the functions. It helps the user/reader to better follow what is done.

Note that all the tests are performed again after you do a commit push.

Again thank you for the job, those are only remarks aiming at having a better integration of your algorithm into POT.

Rémi

Kilian Fatras added 2 commits June 15, 2018 19:15
Copy link
Collaborator

@rflamary rflamary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @kilianFatras,

the code looks great, I left a few comments

ot/stochastic.py Outdated
return u


def transportation_matrix_entropic(a, b, M, reg, method, numItermax=10000,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rename the function to solve_entropic

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

or solve_semi_dial_entropic

ot/stochastic.py Outdated
for i in range(n_source):
r = M[i, :] - v
exp_v = np.exp(-r / reg) * b
u[i] = - reg * np.log(np.sum(exp_v))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we use a stabilized version of logsumexp?
https://en.wikipedia.org/wiki/LogSumExp

so that the algorithm do not explode

np.testing.assert_allclose(
zero, (G_asgd - G_sinkhorn).sum(1), atol=1e-03) # cf convergence asgd
np.testing.assert_allclose(
zero, (G_asgd - G_sinkhorn).sum(0), atol=1e-03) # cf convergence asgd
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also test the whole matrix and not onmly the marginals with np.testing.assert_allclose(G_asgd nG_sinkhorn, atol=1e-03)

ot/stochastic.py Outdated
opt_u = c_transform_entropic(b, M, reg, opt_v)
pi = (np.exp((opt_u[:, None] + opt_v[None, :] - M[:, :]) / reg) *
a[:, None] * b[None, :])
return pi
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add a log parameter to the function that return a dictionary with u and v (probably better name alpha and beta see sinkhorn/emd wher emd return also the dual)

Copy link
Collaborator

@rflamary rflamary left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @kilianFatras,

The code is far better now and the function names are much clearer.

The example file do not run due to a small bug discussed in the comments. Also I have seen some problems with documentation compilation due to small typos and missing lines.

The dual SGD is a bit slow but i guess it is difficult to compete with SAG and ASGD.

Again thank you for your work, we will merge shortly now.


method = "ASGD"
asgd_pi, log = ot.stochastic.solve_semi_dual_entropic(a, b, M, reg, method,
numItermax, log)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bug because log is the wrong positional argument. Replace log by log_asgd is the left hand side and log by log=log in the right hand side.


import matplotlib.pylab as pl
import numpy as np
import ot
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also import ot.plot because it is not imported by default

Compute the coordinate gradient update for regularized discrete
distributions for (i, :)

The function computes the gradient of the semi dual problem:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

add line before math operator for proper math generation in documentation

ot/stochastic.py Outdated
optimal transport max problem

The function solves the following optimization problem:
.. math::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

ot/stochastic.py Outdated
optimal transport max problem

The function solves the following optimization problem:
.. math::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

ot/stochastic.py Outdated
measures optimal transport max problem

The function solves the following optimization problem:
.. math::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

ot/stochastic.py Outdated
Computes the partial gradient of F_\W_varepsilon

Compute the partial gradient of the dual problem:
..Math:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, use math instead of Math

ot/stochastic.py Outdated
Computes the partial gradient of F_\W_varepsilon

Compute the partial gradient of the dual problem:
..Math:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, use math instead of Math

ot/stochastic.py Outdated
optimal transport dual problem

The function solves the following optimization problem:
.. math::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here

ot/stochastic.py Outdated
optimal transport dual problem

The function solves the following optimization problem:
.. math::
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same here, use math instead of Math

@kilianFatras
Copy link
Contributor Author

kilianFatras commented Jun 25, 2018

I think that SGD's performance also depends on the hyperparameters. It is really hard to find the good batch size or the learning rate. I tried many but as it is the dual problem, it is not as easy as for the semi dual problem. In [Genevay & al.], they pointed out that for the semi dual problem, you can have an upper bound of the Lipschitz constant which is not the case for the dual problem. I will keep working on this as it can increase the convergence speed.

@rflamary rflamary merged commit 39cbcd3 into PythonOT:master Jun 27, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants