This is a fun problem. Not just because it's entertaining to look at it geometrically, but also because it exercises a lot of fundamental linear algebra muscle, which makes it an ideal review of concepts from time to time.
Let's say you have a point $b$ that is floating above some plane and you want to project that point onto the plane. The question of projection is really a question of finding the quickest route (in layman's terms) from the point $b$ to the plane. Another way to look at it would be to imagine dropping the point $b$ straight onto the surface of the plane (imagining gravity pulling the point straight down to the plane). From this analogy, the projection is then the spot (or point $p$) on the plane at which $b$ landed when it fell from its original position. Graphically it would look something like this:
But let's be a bit more precise when we talk about the plane (or the surface atop which the point falls).
Two vectors are linearly independent if neither can be described in terms of the other (such as one being the multiple of the other). Another way of saying this is if two vectors are linearly independent, then neither is a linear combination of the other. However - and this is important - all linear combinations of two linearly independent vectors span (or fully make up) a plane. From this definition one can imagine a plane being a web-work of an infinite number of crisscrossing vectors, where each vector is some simple combination of two base vectors. So in linear algebra when we talk about a plane, we talk about the space spanned by the basis formed from two linearly independent vectors.
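
To make the idea of "two linearly independent vectors span a plane" concrete, here is a minimal NumPy sketch. The function name `spans_a_plane` and the example vectors are illustrative assumptions, not part of the original discussion. It stacks the two vectors as columns of a matrix and checks that the matrix has rank 2, which is one common way to test linear independence numerically.

```python
import numpy as np

def spans_a_plane(u, v, tol=1e-10):
    """Return True if u and v are linearly independent (so they span a plane)."""
    # Put the two candidate basis vectors side by side as columns.
    M = np.column_stack([u, v])
    # Rank 2 means neither column is a multiple of the other.
    return bool(np.linalg.matrix_rank(M, tol=tol) == 2)

u = np.array([1.0, 0.0, 0.0])
v = np.array([0.0, 1.0, 0.0])
print(spans_a_plane(u, v))          # independent vectors
print(spans_a_plane(u, 2.0 * u))    # one is a multiple of the other

# Any point in the plane is some linear combination of the basis vectors.
point_in_plane = 3.0 * u + 4.0 * v
```

Every choice of the two coefficients (here 3 and 4) lands on a different point of the same plane, which is the "web-work" picture described above.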
Given all of this so far we can begin to see what's really happening when we project a point onto a plane: the projection is the point on the plane - defined by a linear combination of the plane's basis vectors - that is closest to the point suspended off the plane (closest in terms of Euclidean distance). So what we're looking for is a linear combination of the plane's basis vectors that creates a point on the plane as close as possible to the point not in the plane.
What's interesting about all of this is there's no need to limit ourselves to a point and a plane; we can just as easily talk about projecting a point onto a line, or onto a 3-dimensional subspace of a 4-dimensional space, and so on. To see what I mean, let's look at a simple example of projecting a point onto a line.
Here we see a point described by a vector $b$. Our goal is to project $b$ onto the line spanned by the unit vector $a$ (see how even when we're not talking about a plane, we're still talking about a space being spanned by a linear combination of one or more vectors; in this case the line we're projecting onto is a space spanned by linear combinations of a single unit vector that lies along the line). If we continue our analogy of dropping the point (or vector $b$) onto the subspace, then $e$ can be seen as the path taken by $b$ as it falls onto the point $p$.
Let's pause for a moment and think about what's happening here. If we were to "drop" the point $b$ straight onto the line, the path taken ($e$) would be exactly perpendicular, or orthogonal, to the line (and therefore to each and every linear combination of our unit vector $a$, including $a$ itself). This exact same situation happens if we were projecting a point onto a plane, but this time the path vector is orthogonal to each and every linear combination of the vectors which span the plane instead. And again, this concept can be extended to higher dimensions.
Continuing on, vector subtraction shows $e$ can also be defined as
$$e = b - p \tag{1}$$
And since we also know that $p$ is a linear combination of some basis for the line (defined by some scalar $\hat{x}$ applied to $a$), we can then define $p$ as
$$p = \hat{x} a \tag{2}$$
Combining (1) and (2) gives us
$$e = b - \hat{x} a$$
Going back to our earlier observation regarding orthogonality, taking the dot product of any vector within the line with the vector $e$ should result in zero. Therefore
$$a \cdot (b - \hat{x} a) = 0$$
And finally, after some algebraic manipulation, we see that
$$\hat{x} = \frac{a \cdot b}{a \cdot a}$$
So the projection of $b$ onto the line is simply $a$ scaled by (multiplied by) $\hat{x}$ (which is the dot product of $a$ and $b$ divided by the dot product of $a$ with itself).
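
The line projection above can be sketched in a few lines of NumPy. The names here are assumptions made for illustration: `b` is the vector being projected, `a` is the vector spanning the line, `x_hat` is the scalar, `p` is the projection, and `e` is the path from `p` back up to `b`. The function `project_onto_line` is hypothetical, not from any library.

```python
import numpy as np

def project_onto_line(b, a):
    """Project vector b onto the line spanned by vector a.

    Returns the projection p and the error vector e = b - p.
    """
    b = np.asarray(b, dtype=float)
    a = np.asarray(a, dtype=float)
    # Scalar coefficient: (a . b) / (a . a)
    x_hat = np.dot(a, b) / np.dot(a, a)
    # The projection is a scaled by that coefficient.
    p = x_hat * a
    # The path from the projection back to the original point.
    e = b - p
    return p, e

b = np.array([2.0, 3.0])
a = np.array([1.0, 0.0])   # unit vector along the x-axis
p, e = project_onto_line(b, a)
print(p)              # [2. 0.]
print(np.dot(a, e))   # 0.0, confirming e is orthogonal to the line
```

Dividing by `np.dot(a, a)` means the sketch also works when `a` is not a unit vector; for a true unit vector that denominator is simply 1.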
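Can we extend this to projecting a point onto a plane? Of course. The only difference is that instead of projecting a point onto a space spanned by a single vector, we are projecting a point onto a space spanned by two vectors; something which can easily be encoded in a matrix. So we simply make a matrix, say $A$, whose columns are the two basis vectors of the plane. The scalar coefficient then becomes a vector $\hat{x}$ holding one coefficient per basis vector, and requiring the path from the plane to the point $b$ to be orthogonal to every column of $A$ updates our equation like this:
$$\hat{x} = (A^T A)^{-1} A^T b$$
Therefore multiplying $\hat{x}$ by the basis of our plane (which is $A$) will identify the point $p = A\hat{x}$ on the plane which is the projection of $b$.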
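
As a rough sketch of the plane case, the snippet below builds a matrix whose columns are two basis vectors and solves for the coefficients. The names `project_onto_plane`, `A`, `b`, `x_hat`, and `p` are illustrative assumptions. Note one swapped-in detail: rather than forming the inverse of $A^T A$ explicitly, the sketch calls `np.linalg.solve` on the equivalent system $(A^T A)\hat{x} = A^T b$, which gives the same answer when the columns of `A` are linearly independent.

```python
import numpy as np

def project_onto_plane(b, A):
    """Project vector b onto the column space of A.

    A is assumed to have linearly independent columns (the plane's basis).
    Returns the projection p and the coefficient vector x_hat.
    """
    A = np.asarray(A, dtype=float)
    b = np.asarray(b, dtype=float)
    # Solve (A^T A) x_hat = A^T b for the coefficients.
    x_hat = np.linalg.solve(A.T @ A, A.T @ b)
    # Combine the basis vectors using those coefficients.
    p = A @ x_hat
    return p, x_hat

# Basis vectors of the xy-plane, stored as the columns of A.
A = np.column_stack([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
b = np.array([3.0, 4.0, 5.0])
p, x_hat = project_onto_plane(b, A)
print(p)                # [3. 4. 0.]
print(A.T @ (b - p))    # [0. 0.], the leftover is orthogonal to the plane
```

With a single column in `A`, the same function reduces to the line projection, since $A^T A$ and $A^T b$ collapse to the two dot products from the earlier formula.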