In our last article we looked at how the dimension of data space impacts Machine Learning algorithms. This is often referred to as the curse of dimensionality.
At the heart of the article we discussed the fact that an hypersphere hyper-volume trends to zero as dimension increases.
Here I want to demonstrate how to find the hyper-volume of an hypersphere of dimension N.
The Math Reference Project gives a short & sweet demonstration. I personally found it hard to follow. Foundations of Data Science by John Hopcroft & Ravindran Kannan (chapter 2) starts a demonstration but does cut short.
I wanted to contribute a complete demonstration because I just love that type of mathematical problem. It’s one of my many flaws.
We’ll use the Cartesian coordinates and the fact that the volume of an hypersphere of dimension N can be found by integrating the volume of an hypersphere of dimension N-1 with an infinitesimal thickness:
Well, : it’s a line.
We already know the result should be , but let’s demonstrate it.
We know the result should be , but again, let’s demonstrate it.
Let’s find the hyper-volume of an hypersphere of dimension 4.
Now we have quite some practice. Let’s try to generalize the hypersphere volume formula.
First let’s assume the volume formula has the following form:
Where is a constant (independent of R). We’ll see that we only need to assume that form for the volumes of N-1 and less. Since we already know it to be true for N <= 4, it isn’t a strong assumption.
With that, let’s proceed:
We’re dealing with a recursion here, so let’s rewrite this equation in terms of two sets of constants:
Let’s work on the set of constants C. We know the first couple of values:
We can also obtain a recursive expression.
If we observes that
Then we can write
Therefore we found that
Which gives us an explicit formula for the volume of an hypersphere in N dimensions.
Given the formula for , it is easy to it is a product of smaller and smaller terms.
As soon as N becomes bigger than (i.e. at N=6), the terms becomes smaller than 1 and therefore the products start to shrink.
This is why the hyper volume vanishes as N grows towards infinity.
We can then compute values (for R=1):
which corresponds to what we gave in our last article.
We demonstrated how to find the hyper volume of an hyper sphere of dimension N and could rigorously find that the hyper volume vanishes as the dimension grows.
That result is counterintuitive and this is why we thought a mathematical proof was warranted.