Square Root Part 2 (or 1.414, If You Prefer)

Last time I talked about an efficient (and old) technique for generating square root estimates. As you might expect, there are many different ways to do this calculation, some of which are more amenable to computers than others. I was browsing http://is.gd/UZgRhd to see if there were any other methods that made sense on restricted CPUs.

One algorithm that caught my eye was the “digit-by-digit” binary algorithm. If you aren’t old enough to have done square roots by hand (or, like me, you just forgot), you might start by reading the base 10 version first – if you understand that one, the base 2 version is the same idea but applied to base 10.

This is one of those algorithms that is easier to code than it is to explain. So with that in mind, here’s the code slightly adapted from the code in the Wikipedia article:

/* Another square root -- Williams http://www.ddj.com/embedded */
#include <stdio.h>

#ifdef TEST
#include <stdlib.h> /* random numbers */

/* Work out the square root bit by bit */
#define BITS 32   // set for the number of bits in an int

int bit_isqrt(int num, float *v) 
     int num0=num;
     int result=0; 
     // The second-to-top bit is set: 1L<<14 or 1L<<30 or whatever for your sizeof(int)
     int tbit=1<<(BITS-2); 
     // tbit starts at the highest power of four <= the argument.
     while (tbit>num0)
     while (tbit!=0) 
	 if (num0>=result+tbit) 
     if (v) *v=(float)result;
     return result;

#ifdef TEST

/* Simple routine to drive the calculation and print stats */
void debugsqrt(int x)
  float v;
  int iv=bit_isqrt(x,&v);
  printf("Square root of %d = %d  %f",x,iv,v);
#ifdef TEST
  printf(" Check=%f error=%f (%f%%)\n",v,v-x,((v-x)/x)*100);

/* Simple main */

// generate random numbers from 0 to MAXTESTINT-1
#define MAXTESTINT (10000)
// Random number seed change for different sequence of randoms */
#define SEED 1

void main(int argc, char *argv[])
  int i;
  debugsqrt(9);  /* known test case ! */
  debugsqrt(512);  /* known test case ! */
  for (i=0;i<100;i++) 


If you compare the test harness output with last time’s program, you’ll see that this algorithm is a little less accurate because it rounds down. So the square root of 512 shows up as 22 instead of 23 (it’s really about 22.6, so 23 is closer). You could fix that by always calculating the error of the answer and the answer plus one. But that’s two extra multiplies. Or you can do fixed-point math to get some data to the left of the decimal point.

You can extend the algorithm to do fixed point math pretty easily. Or if you don’t mind a little floating point, you can first multiply the input by a perfect square scaling factor. Then the answer can be divided ( ***a***that’s the floating-point part) by the square root of the scaling factor.

For example, if you wanted the square root of 2 with a few digits of precision, yo ***a***u could multiply by 256 (shift left 8 times) and actually find the square root of 512. Then divide the answer by 16. That gives you 1.375, which is not too far off from 1.414 (about 3%).
I think I’ll stick with the old EINAC algorithm for now. But if I had to do a fixed-point square root, this algorithm would definitely be one to try.