I recently read about the simple but brilliantly clever improvement to RANSAC, called PROSAC. Maybe it’s not really all that brilliant, because as far as I can tell, it is not picked up and built upon in the authors’ subsequent RANSAC-related papers. But I think it’s cool, and easy enough that I want to play around with adding it to the OpenCV baseline, maybe even eventually get it solid enough to request it to be added officially.

I don’t want to try to explain or reiterate the entire paper, the general idea is, if your prospective matches are usefully ordered (say, small descriptor distances up front, or larger 1st/2nd-best-dist-ratio up front), then it makes no sense for RANSAC to give every element the same chance to be in a sample that defines a model. Why not give priority to the (probably) better elements at the front of the list? So authors Chum&Matas derive a meaningful schedule for drawing samples with max element m, then m+1, then m+2, etc. So instead of simply RANSAC=RANdom SAmple Consensus, we have PROgressive SAmple Consensus. Chum&Matas also provide more useful stopping criteria than RANSAC alone, which I haven’t gotten into yet.

So without further ado, here’s my first stab at code for refactoring a RANSAC implementation over to PROSAC (no thanks to wordpress for not maintaining indentation of my paste!):

#include <stdlib.h> // rand()
#include <math.h> // ceil()

#ifdef DEBUG
#include <iostream>
using std::cout;
using std::endl;

typedef unsigned int uint;

#ifdef DEBUG
#define DPRINT(x) cout << "\t" << x << endl;
#define DPRINT(x)

class ProsacHelper {
 ProsacHelper(uint m_, uint N_, uint TN_=200000) 
 : m(m_) // sample size
 , N(N_) // full size
 , TN(TN_) // max iterations
 , poolmax(m-1) // initial poolmax is m-1
 , poolit(0) // initial iteration is 0
 , si(0) // be ready to serve first element of first sample
   // if (m<2 || m>100) throw...
   DPRINT("m = " << m);
   DPRINT("N = " << N);
   DPRINT("TN = " << TN);
   DPRINT("poolmax = " << poolmax);

   // initialize current T(n=m) by direct computation
   T = TN;
   for (uint i=0; i<=m-1; ++i) {
     T *= m-i;
     T /= N-i;
   DPRINT("T = " << T);

   // spend only 1 it in first pool (sample=0...m-1)
   for (uint i=0; i<m; ++i) sample[i] = i;
   DPRINT("3:0 0 1 2 3");

 uint randomSample() {
   uint ret = sample[si]; // this is what will be returned
   si++; // advance within current sample

   if (si==m) { 
     // we have exhausted current sample of m values for iteration. Time
     // to advance to the next iteration, and perhaps the next poolmax, and
     // draw the sample for the next iteration

     if (poolit >= poolits && // time to expand the poolmax
         poolmax < N) { // ...unless we can't
       DPRINT("poolmax = " << poolmax);

       // eq (3) n=poolmax, n+1=poolmax+1
       DPRINT("curT = " << T);
       double nxtT = T * (poolmax+1) / (poolmax+1-m);
       DPRINT("nxtT = " << nxtT);
       double dpoolits = nxtT - T;
       poolits = ceil(dpoolits);
       DPRINT("its with max= "<<poolmax<<": "<<
       dpoolits<<" --> "<<poolits);
       T = nxtT;

     // now that we know what the poolmax is, 
     // create the next sample

     uint n_unique_elts = (poolmax >= N ? m : m-1);

     // loop to populate elements 0..m-2 from 0..n-2
     for (uint i=0; i<n_unique_elts; ++i) {
       bool unique=false;
       // draw random r until different from all previous
       do {
         sample[i]=rand()%poolmax; // 0..poolmax-1
         for (uint j=0; j<i; ++j) {
           if (sample[i] == sample[j]) {
       } while (!unique);

     if (poolmax < N) {
     // last element must be end of pool
     sample[m-1] = poolmax;
   si=0; // reset sample counter for next call
   DPRINT(poolmax<<":"<<poolit<<": " << sample[0] << " " << sample[1]
          << " " << sample[2] << " " << sample[3]);

  return ret;

 uint m; // sample size: homographies=4, fundamental matrices=7 or 8
 uint N; // full number of items
 uint poolmax; // current pool max (pool = 0..poolmax)
 uint poolit;  // iteration within current pool size
 uint TN;      // max iterations
 double T;     // T(n)=avg # regular RANSAC seeds out of TN with max<=n-1
 uint poolits; // # iterations to spend in current pool; i=0..poolits-1
 uint sample[100]; // cache holding the current sample of m indices
 uint si;          // current position within the current sample

So the whole point is that you construct a ProsacHelper with parameters relevant to your case (m=4 for fitting a homography, or 7 or 8 for fitting a fundamental matrix; and N=number of elements; you can probably leave TN default), and then call randomSample() repeatedly. It remembers its state within each sample of m (with member si), and (as long as you don’t call it to use the random indices for other purposes) it will give you m-samples according to the schedule laid out by PROSAC.

In the OpenCV calib3d module, source file modelest.cpp, method CvModelEstimator2::getSubset(), the line

idx[i] = idx_i = cvRandInt(&rng)%count;

would be replaced by

idx[i] = idx_i = prosac_helper.randomSample();

–that is, assuming a ProsacHelper object could be set up beforehand and persisted across calls to getSubset(). I hope to show how this works out in a next post.

I did a few things a little differently from the paper. Member ‘poolsize’ is n, the size of the current pool of samples (0..n-1, or more specifically, n-1 and m-1 from 0..n-2). Rather than maintain t=global iteration number and T’n=iteration to bump up n=poolsize, I maintain only T'(n) – T'(n-1) = ceil(T(n) – T(n-1)) = ‘poolits’, and ‘poolit’ is my counter from 0..poolits-1.

I did spend time in there to guarantee that each m-sample does not repeat itself. I figured that this time is probably overshadowed by the time required to make sure that random samples are not degenerate (i.e. 3/4 collinear when fitting a homography).

Test code like this:

ProsacHelper prosac(4, 100);
uint sample[4];

for (int it=0; it<100; ++it) {
   for (int mi=0; mi<4; ++mi) sample[mi] = prosac.randomSample();
   cout << it << ": " << sample[0] << " " << sample[1]
               << " " << sample[2] << " " << sample[3] << endl;

creates output like this:

0: 0 1 2 3
1: 3 2 1 4
2: 0 3 1 5
3: 0 3 1 6
4: 2 1 5 6
5: 0 3 1 7
6: 2 3 0 7
7: 7 5 6 8
8: 2 6 3 8
9: 3 7 1 8

You can see that, since T(n) increases by less than one the first few times, poolmax=3,4,5 each only get one iteration, but then as T(n) grows more quickly, poolmax=6,7 get 2 iterations each, poolmax=8 gets 3 iterations, and it grows from there.

Leave a Reply

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

%d bloggers like this: