Can you help me with optimizing, Fast Poisson Disk Sampling

[h]Are you up for a challenge?[/h]

I’ve created some code to perform “Fast Poisson Disk Sampling”, only I want it to be faster, I think I’ve almost reached the limits of my capabilities, can you do better? I am comparing with some computer science papers (from 2012) that claim this algorithm can perform in 12 milliseconds, I’m getting 1000x that, and of course they don’t publish their source code.

Download the code from http://www.ohanaware.com/xojo/PoissonDiskSampling.zip

Before you start:
Run the program as is; click the only button 10 times and record the averages (at the bottom of the list).
Compile the program, run it and click the only button 10 times, record the averages.

MacBook Pro (Retina, Mid 2012) 2.6 Ghz Core i7; macOS 10.11.6; Xojo 2018 release 3. debug: 61; 15,600; 2,659; 5.85 compiled: 61; 9,691; 2;077; 3.62
I didn’t record the exact metrics when I started, but the compiled version was taking 18,xxxx milliseconds.

Some rules;

  1. Don’t alter the dimensions, the minRadius (currently set to 30) or poissonDiskSampling2D.k (also set to 30).
  2. Use the default compiler (In the “Shared” settings of the project, leave the “Optimization” level at “Default”).
  3. When replacing code, please comment out original code, leaving it there.
  4. Make sure you comment your code, tag it with your name and the date. “// Sam Rowlands, Jan 17th 2018”
  5. When you share your changes, please post the before and after metrics.

Thank you for looking into this to help me, good luck and may the odds be forever in your favor :wink:

Edit: Updated point 2 to (hopefully) be clearer.

What does “use the default compiler” mean exactly?

I just had a quick look, and it appears that there is too much variance between runs (due to different random numbers) to get a meaningful conclusion about any small incremental changes. For example, in the refresh routine, I replaced the 3 lines having the floor() function and floating point divide to an integer divide which should accomplish the same thing:

'grid( floor( p.x / cellSize ), floor( p.y / cellSize ) ) = p 'replaced with: grid( ( p.x \\ cellSize ), ( p.y \\ cellSize ) ) = p ' and 'gridLoc.x = floor( cSample.x / cellSize ) 'gridLoc.y = floor( cSample.y / cellSize ) 'replaced with: gridLoc.x = ( cSample.x \\ cellSize ) gridLoc.y = ( cSample.y \\ cellSize )
but any improvement (if there was any) was masked by the large variance in run times. I’d recommend using a much larger number of runs, or else use a fixed sequence of pseudo random numbers so that they are the same in every run.

[code]MacBook Pro (Retina, 15-inch, Mid 2015) 2.2 GHz OSX 10.10.5

Original
debug: 59.6 9971.9 2635.0 3.80
compiled: 60.1 8823.3 2532.0 3.36

Mod to use integer divide
debug: 59.8 9941.9 2638.00 3.77
compiled: 60.1 7872.1 2671.00 2.95[/code]

Sorry for not being clear, in the “Shared” settings of the project, leave the “Optimization” level at “Default”.

Thanks Robert.

There is an improvement there, considering it’s a small change.

The article I was working from is https://www.cct.lsu.edu/~fharhad/ganbatte/siggraph2007/CD2/content/sketches/0250.pdf.
if I’m honest; it took me about 10 reads (while making notes) to understand what I think he means. I am not well versed in working from published works like this, but it seemed so simple… I understood the principle behind this was to be as random as possible.

That said, I am not against comparing with a much larger amount of runs. I think we should continue to keep 10 runs, but I see no problem in performing a larger amount of runs as well. What number of run do you think would give a suitable set of data?

Here it is: https://www.cct.lsu.edu/~fharhad/ganbatte/siggraph2007/CD2/content/sketches/supplemental/0250-02.pdf

I’ve made the following changes to the poissonDisk2D.refresh routine which give a slight improvement in speed, but nothing dramatic:

#pragma boundsChecking false
#pragma stackOverflowChecking false
#pragma backgroundTasks false

timeTaken = microseconds
'R. Weaver 2019-01-17 - Add new variable, the square of minRadius
dim minRadiusSQ As Double = minRadius*minRadius
dim cellSize as double = minRadius / sqrt( 2 )
dim gridSize as new xojo.core.size( width / cellSize, height / cellSize )

redim locations( -1 )
dim activeList(-1) as xojo.core.point

// --- Build the neighborhood loop
static neighborhood() as xojo.core.point = array( new xojo.core.point( -1, -1 ), _
new xojo.core.point( 0, -1 ), new xojo.core.point( 1, -1 ), new xojo.core.point( -1, 0 ), _
new xojo.core.point( 1, 0 ), new xojo.core.point( -1, 1 ), new xojo.core.point( 0, 1 ), _
new xojo.core.point( 1, 1 ) )

Dim grid( -1, -1 ) as xojo.core.point                         // --- Build the grid.
redim grid( gridSize.width, gridSize.height )                 // --- we can't use a variable to build it.

Dim p as new xojo.core.point( rnd * width, rnd * height )     // --- Step 1: Set the initial sample
'R. Weaver 2019-01-17 - replace floor and floating point divide with integer divide
'grid( floor( p.x / cellSize ), floor( p.y / cellSize ) ) = p  // --- Add to the grid, so it can be checked.
grid( ( p.x \\ cellSize ), ( p.y \\ cellSize ) ) = p  // --- Add to the grid, so it can be checked.
activeList.append p                                           // --- Add it to the list of locations to explore next.
samplesCreated = 1
Try                                                           // --- Store the location
  Dim t as pds2DLoc
  t.x = p.x
  t.y = p.y
  locations.append t
End Try

Dim tooCloseForComfort, addedToGrid as boolean
Dim gridLoc, compareLoc, cSample as pds2DLoc                  // --- Using structures instead of objects, we improved performance.
Dim rndIdx, iteration, fnsm as integer
Dim radii, angle as double
Dim listCount as integer = actiVeList.ubound

do
  rndIdx = rnd * ( listCount + 1 )                            // --- Pick a random item from the list of locations to explore
  p = activeList( rndIdx )
  addedToGrid = false
  
  For iteration = 1 to k                                      // --- Create k samples from our randomly selected existing point.
    'R. Weaver 2019-01-17 - Trigonometry error: 361 should be 360
    'angle = 0.01745329252 * ( rnd * 361 )                     // --- Random direction
    angle = 0.01745329252 * ( rnd * 360 )                     // --- Random direction
    radii = ( rnd * minRadius ) + minRadius                   // --- Random distance within minRadius & 2minRadius
    
    cSample.x = ( radii * cos( angle ) ) + p.x
    cSample.y = ( radii * sin( angle ) ) + p.y
    
    // --- Make sure the sample is within bounds.
    if cSample.x < 0 then cSample.x = 0
    if cSample.x > width then cSample.x = width
    if cSample.y < 0 then cSample.y = 0
    if cSample.y > height then cSample.y = height
    
    'R. Weaver 2019-01-17 - replace floor and floating point divide with integer divide
    'gridLoc.x = floor( cSample.x / cellSize )                // --- Obtain the location on our grid.
    'gridLoc.y = floor( cSample.y / cellSize )
    gridLoc.x = ( cSample.x \\ cellSize )                // --- Obtain the location on our grid.
    gridLoc.y = ( cSample.y \\ cellSize )
    
    if grid( gridLoc.x, gridLoc.y ) <> nil then continue     // --- if there's already one in here, we can skip this sample.
    
    // --- Now we check our neighbors to make sure that our sample is minRadius away from neighbors at home.
    tooCloseForComfort = false
    For fnsm = 0 to 7
      compareLoc.x = gridLoc.x + neighborhood( fnsm ).x
      compareloc.y = gridLoc.y + neighborhood( fnsm ).y
      
      // --- It's this location is out of bounds then skip to the next neighbor.
      if ( compareLoc.x < 0 or compareLoc.x >= gridSize.width ) or ( compareLoc.y < 0 or compareLoc.y >= gridSize.height ) then continue
      
      // --- If no neighbor is home, then skip to the next neighbor.
      if grid( compareLoc.x, compareLoc.y ) = nil then continue
      
      // --- If we're too close to the neighbor, we reject this sample and move to the next.
      'R. Weaver 2019-01-17 - remove sqrt and abs functions and replace minRadius with minRadiusSQ
      'if minRadius > sqrt( ( abs( cSample.x - grid( compareLoc.x, compareLoc.y ).x ) ^ 2 ) + ( abs( cSample.y - grid( compareLoc.x, compareLoc.y ).y ) ^ 2 ) ) then
      if minRadiusSQ > (( cSample.x - grid( compareLoc.x, compareLoc.y ).x ) ^ 2 ) + (( cSample.y - grid( compareLoc.x, compareLoc.y ).y ) ^ 2 )  then
        tooCloseForComfort = true
        exit
      end if
    Next
    
    if not tooCloseForComfort then // --- so we're cool with our neighbors.
      grid( gridLoc.x, gridLoc.y ) = new xojo.core.point( cSample.x, cSample.y ) // --- Add to the grid, so it can be checked.
      activeList.append grid( gridLoc.x, gridLoc.y )                             // --- Add it to the list of locations to explore next.
      addedToGrid = true
      
      Dim n as pds2DLoc                                                          // --- Store the location
      n.x = cSample.x
      n.y = cSample.y
      locations.append n
    end if
  Next
  
  samplesCreated = samplesCreated + k
  
  if not addedToGrid then activeList.remove rndIdx             // --- We couldn't place any more samples for this item, so we drop it.
  listCount = activeList.ubound
Loop until listcount = -1

timeTaken = microseconds - timeTaken

For testing only, I would recommend using the Random class. Add a random object rnd to the application

Property rnd as Random

and an open event to the application with this code:

rnd = new random
rnd.Seed=12345.6789

Then change all occurrences of “rnd” to “app.rnd.number” in the refresh routine.
This will ensure that every time you run the program, it will produce exactly the same sequence of random numbers, making it easier to compare changes. The random class appears to be a bit slower in generating random numbers than the old rnd function, so once you get it running, you may want to switch back to the rnd function.

Thanks for digging that out; I’ve had a quick look at it.

If the goal is to create the fastest code, why?

Because I’ve had some problems in the past with the “aggressive” option, and while I’m pretty sure it will be fine, I don’t want to rely on it for the performance improvements. Plus if it can really be improved without, should I choose to use the “Aggressive” option, it will make it even faster (in theory).

When I optimized my M_Crypto module, using Aggressive dropped the Bcrypt hash time from seconds to milliseconds. There is only so far you can go without it.

Considering that there’s always a trade off between speed and memory usage, you may be better off ditching everything that involves dynamic memory allocation, and simply dimension all of your arrays to fixed sizes, and avoid the use of dynamically allocated objects. For example, use an array of separate x and y doubles rather than creating new point objects.

There’s another thing I noticed which is probably not very significant, but I’ll point it out anyway since I tend to be picky about these kinds of things. Your method of generating random points has a couple of problems. First, the use of random angles and radii will lead to non uniform distribution of points. They will be more dense towards the centre. The following diagram shows this, with the random angle/radius method on the left and a uniform sample rejection method on the right.

It’s not as bad in your case since you’re generating points that fall outside of the centre area, and so the non-uniformity is not as bad, though it’s still there. The only reliable way to generate uniformly distributed random points over a circular area is to generate random points over a square area that bounds the circle. Then, check to see if the point falls inside the circle. If not, it’s rejected. Because the area of the bounding square is 14% bigger than the area of the circle, it means that 14% of the random points will be rejected. In the case of a circular area, because the sample rejection method avoids the calculation of sin and cos, this often offsets the extra 14% overhead. However, in your case your points are distributed over an annulus rather than a circle, and so the inner points must be rejected as well. In this case 52% of the points will be rejected. So, this may make the sample rejection method too inefficient in this application.

The second problem is where you check to see if the random point falls outside of the bounds of your graphical region. When they do, your program forces them inside. To be statistically correct, these points should be rejected.

The following code uses sample rejection for generating uniform points, and also rejects out of bound points rather than force them within limits. It’s for interest only, because I think it likely runs slower than the original. (I can’t tell with any certainty because the variance between run times seems more dependent on other background tasks.) It also includes a couple of other minor changes that are commented.

#pragma boundsChecking false
#pragma stackOverflowChecking false
#pragma backgroundTasks false

timeTaken = microseconds

dim cellSize as double = minRadius / sqrt( 2 )
dim gridSize as new xojo.core.size( width / cellSize, height / cellSize )

redim locations( -1 )
dim activeList(-1) as xojo.core.point

'R. Weaver 2019-01-17 - Add several new variables for faster distance testing and sample rejection
dim minRadiusSQ As Double = minRadius*minRadius
dim minRadius4 As Double = 4*minRadius
dim minRadius2SQ As Double = 4*minRadiusSQ


// --- Build the neighborhood loop
static neighborhood() as xojo.core.point = array( new xojo.core.point( -1, -1 ), _
new xojo.core.point( 0, -1 ), new xojo.core.point( 1, -1 ), new xojo.core.point( -1, 0 ), _
new xojo.core.point( 1, 0 ), new xojo.core.point( -1, 1 ), new xojo.core.point( 0, 1 ), _
new xojo.core.point( 1, 1 ) )

Dim grid( -1, -1 ) as xojo.core.point                         // --- Build the grid.
redim grid( gridSize.width, gridSize.height )                 // --- we can't use a variable to build it.

Dim p as new xojo.core.point( app.rand.number * width, app.rand.number * height )     // --- Step 1: Set the initial sample
'R. Weaver 2019-01-17 - replace floor and floating point divide with integer divide
'grid( floor( p.x / cellSize ), floor( p.y / cellSize ) ) = p  // --- Add to the grid, so it can be checked.
grid( ( p.x \\ cellSize ), ( p.y \\ cellSize ) ) = p  // --- Add to the grid, so it can be checked.
activeList.append p                                           // --- Add it to the list of locations to explore next.
samplesCreated = 1
Try                                                           // --- Store the location
  Dim t as pds2DLoc
  t.x = p.x
  t.y = p.y
  locations.append t
End Try

Dim tooCloseForComfort, addedToGrid as boolean
Dim gridLoc, compareLoc, cSample as pds2DLoc                  // --- Using structures instead of objects, we improved performance.
Dim rndIdx, iteration, fnsm as integer
Dim radii, angle as double
Dim listCount as integer = actiVeList.ubound

do
  rndIdx = app.rand.number * ( listCount + 1 )                            // --- Pick a random item from the list of locations to explore
  p = activeList( rndIdx )
  addedToGrid = false
  
  For iteration = 1 to k                                      // --- Create k samples from our randomly selected existing point.
    
    'R. Weaver 2019-01-18 - Replace angle/radius random point generation
    'with uniform distribution sample rejection method
    
    'Original code:
    'angle = 0.01745329252 * ( rnd * 361 )                     // --- Random direction
    'radii = ( rnd * minRadius ) + minRadius                   // --- Random distance within minRadius & 2minRadius
    
    'cSample.x = ( radii * cos( angle ) ) + p.x
    'cSample.y = ( radii * sin( angle ) ) + p.y
    '
    '// --- Make sure the sample is within bounds.
    '
    'if cSample.x < 0 then cSample.x = 0
    'if cSample.x > width then cSample.x = width
    'if cSample.y < 0 then cSample.y = 0
    'if cSample.y > height then cSample.y = height
    
    'Sample rejection code
    dim x,y,xySQ As Double
    do
      x=rnd-0.5
      y=rnd-0.5
      xySQ=x*x+y*y
    loop until xySQ<0.25 and xySQ>0.0625
    cSample.x=x*minRadius4 + p.x
    cSample.y=y*minRadius4 + p.y
    
    'R. Weaver 2019-01-17 - replace floor and floating point divide with integer divide
    'gridLoc.x = floor( cSample.x / cellSize )                // --- Obtain the location on our grid.
    'gridLoc.y = floor( cSample.y / cellSize )
    gridLoc.x = ( cSample.x \\ cellSize )                // --- Obtain the location on our grid.
    gridLoc.y = ( cSample.y \\ cellSize )
    
    'R. Weaver 2019-01-18 - Replaces limit conditioning (above) with array index check
    'and reject the point rather than force it within bounds
    if gridLoc.x<0 or gridLoc.x>gridSize.Width or gridLoc.y<0 or gridLoc.y>gridSize.height then Continue
    
    if grid( gridLoc.x, gridLoc.y ) <> nil then continue     // --- if there's already one in here, we can skip this sample.
    
    // --- Now we check our neighbors to make sure that our sample is minRadius away from neighbors at home.
    tooCloseForComfort = false
    For fnsm = 0 to 7
      compareLoc.x = gridLoc.x + neighborhood( fnsm ).x
      compareloc.y = gridLoc.y + neighborhood( fnsm ).y
      
      // --- It's this location is out of bounds then skip to the next neighbor.
      if ( compareLoc.x < 0 or compareLoc.x >= gridSize.width ) or ( compareLoc.y < 0 or compareLoc.y >= gridSize.height ) then continue
      
      // --- If no neighbor is home, then skip to the next neighbor.
      if grid( compareLoc.x, compareLoc.y ) = nil then continue
      
      // --- If we're too close to the neighbor, we reject this sample and move to the next.
      'R. Weaver 2019-01-17 - remove sqrt and abs functions and replace minRadius with minRadiusSQ
      'if minRadius > sqrt( ( abs( cSample.x - grid( compareLoc.x, compareLoc.y ).x ) ^ 2 ) + ( abs( cSample.y - grid( compareLoc.x, compareLoc.y ).y ) ^ 2 ) ) then
      if minRadiusSQ > (( cSample.x - grid( compareLoc.x, compareLoc.y ).x ) ^ 2 ) + (( cSample.y - grid( compareLoc.x, compareLoc.y ).y ) ^ 2 )  then
        tooCloseForComfort = true
        exit
      end if
    Next
    
    if not tooCloseForComfort then // --- so we're cool with our neighbors.
      grid( gridLoc.x, gridLoc.y ) = new xojo.core.point( cSample.x, cSample.y ) // --- Add to the grid, so it can be checked.
      activeList.append grid( gridLoc.x, gridLoc.y )                             // --- Add it to the list of locations to explore next.
      addedToGrid = true
      
      Dim n as pds2DLoc                                                          // --- Store the location
      n.x = cSample.x
      n.y = cSample.y
      locations.append n
    end if
  Next
  
  samplesCreated = samplesCreated + k
  
  if not addedToGrid then activeList.remove rndIdx             // --- We couldn't place any more samples for this item, so we drop it.
  listCount = activeList.ubound
Loop until listcount = -1

timeTaken = microseconds - timeTaken

Actually, your displayed time is microseconds, not milliseconds. So, your code is just as fast.

Wow Robert; you really understand this a lot better than I do! There is a lot to digest in what you said, it may take me some time to go through your points.

Thank you so much for looking at my code and spending so much time on it, I hope that you enjoyed this process.

lol… I had assumed (obviously incorrectly) that as Microseconds are 1/1000000 that they were comparable to milliseconds.

I did some work a few years ago on Monte Carlo simulations, where it’s critical that the random numbers be uniformly distributed.

I should have added some better comments to the sample rejection method of generating points.

    do
      x = rnd - 0.5
      y = rnd - 0.5
      xySQ = x * x + y * y
    loop until xySQ < 0.25 and xySQ > 0.0625
    cSample.x = x * minRadius4 + p.x
    cSample.y = y * minRadius4 + p.y

It’s based on the circle formula:
x^2 + y^2 = Ro^2
where Ro is the radius.
So, if we generate two random points x and y, each of which is in the range -1 to +1, then they will fall somewhere inside a 2x2 square which has its lower left corner at -1,-1 and its upper right corner at +1,+1. To see if the point then falls inside a circle of radius =1, we just test to see if:
Ro^2 > (x^2 + y^2)
In the case of an annulus, we also need to test if the point is outside the inner circle. If the inner circle has a radius of Ri then the complete test of the point is:
Ro^2 > (x^2 + y^2) and Ri^2 < (x^2 + y^2)
In my code, in order to make the loop as tight as possible, the points are generated over a range of -0.5 to +0.5, for an outer radius of 0.5 and an inner radius of 0.25. The radius test values are then 0.5^2 = 0.25 and 0.25^2 = 0.0625. After finding a valid point, it is then scaled up by the minRadius factor.

Have you considered using Halton points instead of Poisson Disk points? They don’t look quite the same but they are similar, and Halton points are faster to generate. I’ve got some code for Halton points somewhere.

use log(a)-log(b) instead of a/b

[quote]In the case of an annulus, we also need to test if the point is outside the inner circle. If the inner circle has a radius of Ri then the complete test of the point is:
Ro^2 > (x^2 + y^2) and Ri^2 < (x^2 + y^2)[/quote]

I wonder if a brute force method may be faster than the elegant math?

If you want to see if a point is in a circle which is a certain size,
Create a picture, make it white
Draw a black filled circle on it of the right size

Then:

[code]//outside of loop
dim mypicture = new picture (sizex,sizey)
mypicture.graphics.forecolor = &cffffff
mypicture.graphics.fillrect 0,0,mypicture.width,mypicture.height
mypicture.graphics.forecolor = &c000000
mypicture.graphics.filloval 0,0,mypicture.width,mypicture.height

dim rgbs as rgbsurface = mypicture.rgbsurface

repeat
//get co-ords
until g.pixel(x,y).red < 10
//now x and y are definitely in our zone of interest
[/code]