is an estimate
for
, then the average of
and
/
is an even better estimate for
.
First consider

And now compare distances

What this means,
in terms of our algorithm, is that a new estimate
and its companion
estimate
/
are only half as far apart as
and its companion
estimate
/
.
In other words, using induction, we would have

Since the right-hand side of this inequality tends to zero,
we find that
and
/
converge to the same limit, say
.
This common limit is necessarily
because it satisfies

Finally, now
that we know the limit, we can speak of the rate of convergence. For this
we (essentially) repeat our first calculation:

where
is a constant.
Now in our case, if we set
=
, then we have
(at least for
).
Hence,

This is what is meant by "quadratic convergence". It's a good thing
to know: A quadratically convergent sequence zooms toward its limit!
For example, if
< 1, then we would have

Thus, if
(or any other
)
is less than 1 unit away from
,
then
tends to
faster than
tends to 0,
where
< 1.
(In other words, pretty darn fast!)
Et voila!
We now know that the Babylonian method for computing
converges, and
converges rapidly.
You
can call it quits,
of course, but you might also be interested in
a related algorithm.