We used it in a project that performed document recognition and analysis. It was useful for aligning the fields to be extracted from the source image, since we formulated the alignment as an optimization problem. It was straightforward to use, and it includes some neat robust loss functions that reduce the effect of outliers (e.g. CauchyLoss).
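For illustration, here's a minimal sketch (not the actual project code) of how a robust loss and automatic differentiation fit together in Ceres; the residual and the data are made up:

    #include "ceres/ceres.h"

    // Illustrative residual: distance between a predicted field position
    // (p[0], p[1]) and an observed corner. A real alignment model would
    // encode its own parametrization here.
    struct AlignmentResidual {
      AlignmentResidual(double ox, double oy) : ox_(ox), oy_(oy) {}
      template <typename T>
      bool operator()(const T* const p, T* residual) const {
        residual[0] = p[0] - T(ox_);
        residual[1] = p[1] - T(oy_);
        return true;
      }
      double ox_, oy_;
    };

    int main() {
      double p[2] = {0.0, 0.0};  // parameters to solve for
      ceres::Problem problem;
      // CauchyLoss down-weights residuals much larger than the scale
      // (1.0), so outlier observations pull less on the solution.
      problem.AddResidualBlock(
          new ceres::AutoDiffCostFunction<AlignmentResidual, 2, 2>(
              new AlignmentResidual(3.0, 4.0)),
          new ceres::CauchyLoss(1.0), p);
      ceres::Solver::Options options;
      ceres::Solver::Summary summary;
      ceres::Solve(options, &problem, &summary);
    }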
I believe Ceres was originally developed for nonlinear least squares optimization, particularly in the context of structure from motion (as bundle adjustment) and related problems in computer vision and robotics (esp. SLAM). As such it has some features that are useful in that context, such as automatic differentiation and covariance estimation. I see that it also has a more general nonlinear optimization interface, but even there it seems to assume that at least first-order gradients are available (and again, for this the auto differentiation is handy).
On the other hand, NLOPT seems oriented towards implementing a variety of more general "black box" optimization methods, only some of which need or support gradient information, and it has no auto differentiation.
So if I were working on some kind of SfM/SLAM problem, I'd probably use Ceres, but if I had a less structured optimization problem - and especially if I didn't have gradients - I'd try NLOPT.
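For contrast, a minimal sketch of the gradient-free path with NLOPT's C++ API, using the derivative-free COBYLA algorithm on a made-up quadratic objective:

    #include <nlopt.hpp>
    #include <vector>

    // Gradient-free objective: NLOPT still passes a gradient vector,
    // but it stays empty for derivative-free algorithms like COBYLA.
    double objective(const std::vector<double>& x,
                     std::vector<double>& /*grad*/, void* /*data*/) {
      return (x[0] - 1.0) * (x[0] - 1.0) + (x[1] + 2.0) * (x[1] + 2.0);
    }

    int main() {
      nlopt::opt opt(nlopt::LN_COBYLA, 2);  // LN_* = local, no derivatives
      opt.set_min_objective(objective, nullptr);
      opt.set_xtol_rel(1e-6);
      std::vector<double> x = {0.0, 0.0};   // initial guess
      double minf;
      opt.optimize(x, minf);
    }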
For all of the success applying GPUs to optimization problems in ML, why don't any of the common optimization packages seem to support GPU acceleration?
There are two "directions" along which you can parallelize:
- explore different parts of the parameter hyperspace in parallel.
- for a given parametrization, split the model and/or objective function so that its parts can be computed in parallel.
The second approach is model-specific and gives you nice speedups (make your model N times faster, and you will converge N times faster), but it is often not particularly well suited to accelerators (including GPUs) due to the latency of moving data back and forth; with model-specific tuning you can maybe make it work. For most traditional problems, SIMD on the CPU is the best fit here.
The first approach, which in practice requires the second so that the model and the optimizer can run on the same computing unit, isn't particularly great either, since you're doing computations that are suboptimal and/or redundant to begin with. Any speedup isn't obvious; it depends on the optimization algorithm and the convergence characteristics of your problem. Also, as you follow several paths in parallel, you'll eventually need to sync up, and since the paths have divergent control flow, you can't make the most of the computing resources, which will be stalling quite often. Often, with enough tuning for your particular problem and method, you can make it work.
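A toy sketch of that first approach on the CPU (the local optimizer is a hypothetical stand-in): launch several starting points in parallel and keep the best result; the join at the end is exactly the sync point mentioned above.

    #include <algorithm>
    #include <future>
    #include <limits>
    #include <vector>

    // Stand-in local optimizer (hypothetical): here it just evaluates a
    // toy objective at the starting point; a real one would iterate to
    // convergence from there.
    double optimize_from(std::vector<double> s) {
      return (s[0] - 1.0) * (s[0] - 1.0) + s[1] * s[1];
    }

    // Explore several starting points in parallel and keep the best.
    // Paths converge at different rates, so threads that finish early
    // sit idle until the final join.
    double multi_start(const std::vector<std::vector<double>>& starts) {
      std::vector<std::future<double>> runs;
      for (const auto& s : starts)
        runs.push_back(std::async(std::launch::async, optimize_from, s));
      double best = std::numeric_limits<double>::infinity();
      for (auto& r : runs) best = std::min(best, r.get());
      return best;
    }

    int main() {
      multi_start({{0.0, 0.0}, {5.0, -3.0}, {-2.0, 2.0}});
    }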
So why don't generic libraries do it on GPU? Because unless you tune everything for your particular problem, it's just not going to perform as well as on CPU.
Many already do, but not necessarily explicitly. If an optimizer accepts user-defined functions for its evaluation and derivatives, these computations can be done on a GPU even though the optimizer itself knows nothing about the GPU. For example, GPUs are extensively used in parameter estimation problems associated with PDE-constrained optimization. Essentially, the PDE solves use GPUs to solve the differential equation, and the results are fed back into the optimizer. Many of these packages use common open source optimizers.
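A sketch of that pattern (the GPU solve here is a hypothetical stand-in, and NLOPT is used only as an example of a CPU-side optimizer):

    #include <nlopt.hpp>
    #include <vector>

    // CPU stand-in so the sketch compiles; a real implementation would
    // launch the forward PDE solve on the GPU and return a data misfit.
    double solve_pde_on_gpu(const std::vector<double>& params) {
      double misfit = 0.0;
      for (double p : params) misfit += p * p;
      return misfit;
    }

    // The optimizer only sees an ordinary callback; all GPU work
    // happens inside the user-supplied function.
    double objective(const std::vector<double>& x,
                     std::vector<double>& /*grad*/, void* /*data*/) {
      return solve_pde_on_gpu(x);
    }

    int main() {
      nlopt::opt opt(nlopt::LN_BOBYQA, 8);  // derivative-free local method
      opt.set_min_objective(objective, nullptr);
      opt.set_xtol_rel(1e-6);
      std::vector<double> x(8, 1.0);
      double minf;
      opt.optimize(x, minf);
    }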
More generally, there's a question of whether the algorithms themselves benefit from GPUs, or parallelism in general. For large-scale nonlinear, continuous optimization problems using second-order, Newton-like methods, the big costs are in the function evaluations, their derivatives, and the linear system preconditioners/solves. Generally speaking, how the function evaluations and derivatives are computed is up to the user. For the preconditioning/linear system solves, there's value in parallelism. However, here the GPUs have traditionally lagged. Basically, we need a factorization, be it sparse or dense, and only recently has good library support for that been extended to GPUs. For the longest time, the entire matrix factorization needed to fit onto the GPU, and many of these matrices were large. That said, for optimizers that accept a user-defined preconditioner, the use of GPUs is already possible.
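To make the cost concrete: each iteration of a Newton-like method solves a linear system in the Hessian (or an approximation to it),

    \nabla^2 f(x_k) \, p_k = -\nabla f(x_k),

and factoring or preconditioning that matrix is the step where GPU parallelism would have to pay off.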
I believe that when the number of parameters isn't in the millions, as it is in deep networks, there's less of an advantage to using GPUs. But there is definitely research on using GPUs to solve nonlinear least squares optimization problems. Here's one paper I saw a few years ago: https://dl.acm.org/doi/10.1145/3132188
There are a lot of large-scale optimization problems in industry that are still compute-bound. Currently available solvers are either single-threaded on the CPU, or offer "parallelism" by running copies of the same problem on multiple threads, but with different initial conditions, in hopes that one happens to converge faster.
I'll contend that this depends. For example, in mixed-integer linear programs, the function evaluations are trivial, but the combinatorial search is expensive. Alternatively, for nonlinear, continuous optimization problems with constraints, the primary cost is often the preconditioning or factorization of the systems associated with either an augmented system or a KKT system. That said, for unconstrained or bound-constrained, nonlinear, continuous optimization problems, I largely agree. And even with general constraints, it's sometimes the case that the function evaluations dominate. Mostly, I wanted to contend that the linear system solves can often be the limiting factor, and better large-scale factorization codes are needed.
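For reference, by KKT system I mean the saddle-point system solved at each iteration of an equality-constrained problem, minimize f(x) subject to c(x) = 0:

    \begin{bmatrix} \nabla_{xx}^2 L(x,\lambda) & \nabla c(x)^T \\ \nabla c(x) & 0 \end{bmatrix}
    \begin{bmatrix} \Delta x \\ \Delta \lambda \end{bmatrix}
    = -\begin{bmatrix} \nabla f(x) + \nabla c(x)^T \lambda \\ c(x) \end{bmatrix}

Factoring that large, sparse, indefinite matrix is exactly the step where better GPU factorization codes would help.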
I'd be interested to see how this compares to Google's also recently released Vizier (https://oss-vizier.readthedocs.io/en/latest/), other than that one is black-box and the other isn't.
Can anyone comment on what this project considers "large optimization problems"? More than 2 variables, a few hundred variables, or many thousands of variables?
- Estimate the pose of Street View cars, aircraft, and satellites.
- Build 3D models for PhotoTours.
- Estimate satellite image sensor characteristics.
- Stitch panoramas on Android and iOS.
- Apply Lens Blur on Android.
- Solve bundle adjustment and SLAM problems in Project Tango.
Microsoft Research uses Ceres for nonlinear optimization of objectives involving subdivision surfaces under skinned control meshes.
http://ceres-solver.org/users.html