marching squares

a short post about the marching squares algorithm, a technique mostly used to vectorize the outline of a binary image.

it’s a very well known & very well spread technique, I happened to have some spare time & the urge to do something graphic.
for a start let’s look at how things work.

the idea is to draw a shape through the edges of cells which vertices are “weighted” ( each point of the cell has a value ). in our case, we use an image to represent the cells’ grid ( an image is a grid of pixels ) & if we use a binary image (black & white, no greyscale), the weights are the colors of the pixels ( 0 or 1 ) . then we need a structuring element or kernel to scan our grid & find the outline.

the algorithm goes as follow:

  1. apply a threshold to the image to obtain a binary image
  2. find the first non transparent pixel
  3. position the kernel at this location
  4. while the kernel has not reached the starting location
    • move the kernel around the shape
    • store the new location
    • if the number of iteration is too high, break
    • else continue

the first step is fairly easy ; when you access the imageData of a given canvas, you can scan all the pixels’ components ( r,g,b or a ) & match them against a given value.

a good idea though is to reduce the image’s colors before performing the marching square process ; this can be done quite simply either beforehand by using a restricted palette or on the fly by applying a simple reduction algorithm like:

which will give you something like

segmentation

the idea is simple, for each component of the color, first normalize it ( data[ X ] / 0xFF ) then interpolate between 0 & the desired amount of steps ( * steps ) & Math.ceil it ( parseInt( X + .5 )  ) then multiply this by the value of a color step ( 0xFF / steps ).

the lower the number of steps, the easier it should be to vectorize, adding a blur step before reduces the noise & gives more usable regions.

to understand what comes next, you may need to understand the notion of kernel but there’s this example by Sakri that shows exactly what the last step of the algorithm is about:  http://codepen.io/sakri/pen/aIirl

as we’re using a simple 2 * 2 kernel, there are very few possibilities ( only 16 actually ) to move the kernel when looking for the direction. here’s a table with the integers 0-15 in binary form, the 4 rectangles below the figure represent the bytes ( white=0, black = 1 ) & the squares are the kernel itself ; what we’ll use to actually determine the direction to move to (represented by the arrow above the box).

binary_table

by sampling the right, bottom & bottom right pixels, this table allows us to quickly determine where we should go next but there are 2 flaws.

a minor flaw is that it contains both a 2 * 2 void square ( 0000 ) and a 2 * 2 full square ( 1111 ). this is not that bad ; as we select the first non-transparent pixel, it is usually located on the top left area of the picture so 0000 will head towards the bottom right & 1111 will head towards the top left.

the second flaw is more embarrassing though, it rises about the values 6 ( 0110 ) & 9 ( 1001 ) and form a diagonal pattern. in these cases, there is no way to tell where to send the kernel next, ending up in infinite loops. there are ways to deal with this problem but that’s beyond the scope of this article.

to prevent the infinite loops, I tried simply not to use those patterns.

which failed of course.

but it gave me an idea ; why not alter the binary shape so that this pattern doesn’t occur?

enter the Morphological operators.
the morphological operators work on binary images & perform very simple tests between a pixel & its neighborhood to determine whether or not to toggle a pixel.

this simple operators fill the smaller gaps that cause most of the infinite loops and applying this during the collection phase ( step 1 of the algorithm ) allows for not having to bother with ambiguous case anymore.

let’s take this burning head I did so it has a fair amount of ambiguous cases (click to view full size) especially in the bottom right corner.

burning_head

trying a brute force marching square will fail, but applying the following kernel:

where sum is the sum of the neighboring pixels’ values, allows to preserve only the outline of the shape and gives the red outline (click to view full size)

burning_head_border

if you look closely, you can follow the whole border without finding any ambiguous case + smaller elements (namely single pixels) are being discarded. now instead of 16 possibilities, we end up managing only 12 the Lookup table now looks like this:

binary_path

 

the burning head vectorized output looks like this

burning_head_vector

the blue area is the vectorized outline and the orange line & dots is the simplified version of the path. nice isn’t it?

here’s the code of the marching squares method:

it’s intentionally unoptimized for legibility but one would probably merge & inline the morphological operator, unfold & reorganize the way getDirection sets the x/y directions, reuse the canvas & the buffer instead of recreating it a.s.o.

and a standard call would go like :

for instance calling this on a unicorn picture would give this kind of output

unicorn_vector

one of the greatest benefits of vectorization being to make images scale-free, my favourite results come from smaller images ; here are 2 32 * 32 pictures vectorized & simplified:

gunmonster

gun_monster

that’s far from bulletproof and I wouldn’t recommend using it in production but that’s a fun little algorithm to play with. it allows on the fly vectorization of free shapes to use in physics games for instance.

possible enhancements include, multiple shapes vectorization & triangulation, using better interpolation techniques than binary, using integral images (?!) also, I’d recommend reading this lovely article: Metaballs and marching squares by Jamie Wong.

that was it!

enjoy :)

One Comment

  1. forouzan

    Hello.

    I want to use Marching square algorithm to detect the outline of a bitmap image in unity. Could you help me what should I do? Could I translate this code to c# for that?

    Thanks

Leave a Reply to forouzan Cancel reply

Your email address will not be published. Required fields are marked *