Steve says: When I made this problem I
wondered how people might approach its solution. In my notes,
I recorded that "This quadratic does the trick $y= 2(x-1)(x+1) -1$"
and was delighted to see Herschel's and Ben's wonderful analyses by
the least squares method improving upon this. I include both of
these in full.
Both Ben, from Kenny, and Herschel, from
the European School of Varese, realised that minimising the squared
distance from the points reduced to an (involved!) exercise in
calculus (Ben provides the details in his write-up). Both of the
solvers discovered the 'ideal' solution
$$
f(x)= \frac{100}{49}x^2-\frac{20}{7}
$$
The least squares analysis needed to find
this points towards the sort of thing that goes on at university,
so very well done for thinking about this!
Herschel gave a lovely account of the
thought process:
At first, when considering this problem, the only simple approach
seems to be trial and error - experimenting with different
values for $a$, $b$ and $c$ in $y=ax^2+bx+c$.
Since the points are symmetric about the vertical $y$-axis, we can
reduce this to $y=ax^2+c$, so that our parabola is also symmetrical
about the $y$-axis. One could try various combinations of $a$ and
$c$ and search for a good combination, but there is no
guarantee of a simple solution. Also, we need a way to measure the
distance from a parabola to a point, which isn't easy if you don't
know what the parabola is in the first place!
Mathematicians trying to fit a parabola to a set of data often use
the "Least squares" method. Instead of trying to find the
perpendicular (and usually diagonal) distance between the point and
the curve, we will measure only the vertical distance. The distance
between a point and the parabola could be positive or negative
depending on whether the point is above or below the parabola. We
want to find the sum of all these distances (errors), so we first
make them all positive by squaring them (hence the name "Least
squares": we try to minimise the sum of the squares of the
distances). We can form a set of equations from the Least Squares
principle. Using some calculus (differentiation, which I won't
delve into now), they can be reduced to a set of
linear equations:
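As a sketch of what such a system looks like, assume the six data points are $(\pm 1, 0)$, $(\pm 2, 4)$ and $(\pm 3, 16)$ (inferred from the pattern in Ben's second solution, not stated in this account), and order the equations so that the one involving only $b$ comes third. Setting the partial derivatives of $\sum (ax^2+bx+c-y)^2$ with respect to $a$, $c$ and $b$ to zero would then give:
$$
\begin{aligned}
196a + 28c &= 320\\
28a + 6c &= 40\\
28b &= 0
\end{aligned}
$$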
The third of these equations tells us straight away that $b=0$,
although we already knew that! Solving the other two equations gives
us $a=\frac{100}{49}$, $c=-\frac{20}{7}$. Thus, our "ideal" quadratic
according to this method would be
$$
y = \frac{100}{49}x^2 -\frac{20}{7}
$$
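The fit can be checked with a few lines of exact arithmetic. This is only a sketch: the point list `POINTS` and the helper `fit_even_quadratic` are illustrative names, and the six points are an assumption inferred from the pattern in Ben's second solution rather than taken from the problem statement.

```python
from fractions import Fraction

# Assumed data points (inferred from the page, not stated explicitly).
POINTS = [(-3, 16), (-2, 4), (-1, 0), (1, 0), (2, 4), (3, 16)]


def fit_even_quadratic(points):
    """Least-squares fit of y = a*x^2 + c, returned as exact fractions."""
    n = len(points)
    s2 = sum(Fraction(x) ** 2 for x, _ in points)       # sum of x^2
    s4 = sum(Fraction(x) ** 4 for x, _ in points)       # sum of x^4
    sy = sum(Fraction(y) for _, y in points)            # sum of y
    s2y = sum(Fraction(x) ** 2 * y for x, y in points)  # sum of x^2 * y
    # Normal equations: s4*a + s2*c = s2y  and  s2*a + n*c = sy,
    # solved here by Cramer's rule.
    det = s4 * n - s2 * s2
    a = (s2y * n - s2 * sy) / det
    c = (s4 * sy - s2 * s2y) / det
    return a, c


a, c = fit_even_quadratic(POINTS)
print(a, c)  # 100/49 -20/7
```

Using `Fraction` rather than floats keeps the result in the same form as the values quoted above.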
Of course, we could round $a$ to $2$, for which the best value of
$c$ becomes $-\left(2+\frac{2}{3}\right)$, or round even further to $c=-3$, so that
we get a nice and simple final result of $y=2x^2-3$. Some might say
that we could've just found this by trial and error, and indeed,
they would be right - we would have skipped all the calculus and
algebra and got to the answer much sooner! But where's the fun in
that, eh? :)
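To see how little the rounding costs, one can compare the sum of squared vertical errors for the three candidate quadratics. The sketch below again assumes the six points $(\pm 1, 0)$, $(\pm 2, 4)$, $(\pm 3, 16)$; the names `sum_squared_error` and `candidates` are illustrative only.

```python
from fractions import Fraction

# Assumed data points (inferred from the page, not stated explicitly).
POINTS = [(-3, 16), (-2, 4), (-1, 0), (1, 0), (2, 4), (3, 16)]


def sum_squared_error(a, c, points=POINTS):
    """Sum of squared vertical distances from the points to y = a*x^2 + c."""
    return sum((a * x * x + c - y) ** 2 for x, y in points)


candidates = {
    "ideal": (Fraction(100, 49), Fraction(-20, 7)),
    "a rounded": (Fraction(2), Fraction(-8, 3)),
    "both rounded": (Fraction(2), Fraction(-3)),
}
errors = {name: sum_squared_error(a, c) for name, (a, c) in candidates.items()}
for name, err in errors.items():
    print(name, err, float(err))
```

Under these assumptions the three totals come out at $\frac{256}{49}$, $\frac{16}{3}$ and $6$, so the simple $y=2x^2-3$ is only slightly worse than the "ideal" curve.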
For the second part of the question,
Ben looked at finding the smallest distances between the points and
the curve, and realised that the perpendicular distances are what is
needed.
The smallest distance will lie along the normal to the curve at a
point where the normal also passes through the given point. In other
words, the point on the curve whose normal passes through the given
point is the closest part of the curve. So
first we must work out the derivative of $f(x)$. We
can then use this in the formula for a straight line to work
out the normal at the point of the curve with $x$-coordinate $x_0$, and denote this as $g(x)$:
$$
g(x) = \frac{100}{49}x_0^2 -\frac{49}{200 x_0}x -\frac{3657}{1400}
$$
Ben then found the normals which pass
through the points and solved numerically to show
that:
... it may not be close for each predicted y-coordinate, but
it does get very close for the iterated points and never exceeds
being $0.2$ units away from any of the points given.
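The $0.2$ bound can be checked with a rough numerical scan. This is not Ben's method (his working is not reproduced here) but a brute-force sketch: for each point it samples the curve near the point and keeps the smallest straight-line distance. The point list and the function names are assumptions made for illustration.

```python
import math

A, C = 100 / 49, -20 / 7  # coefficients of the fitted quadratic from the text
# Assumed data points (inferred from the page, not stated explicitly).
POINTS = [(-3, 16), (-2, 4), (-1, 0), (1, 0), (2, 4), (3, 16)]


def f(x):
    """The fitted quadratic y = (100/49) x^2 - 20/7."""
    return A * x * x + C


def closest_distance(px, py, half_width=1.0, steps=20000):
    """Approximate the shortest distance from (px, py) to the curve.

    Scans x over [px - half_width, px + half_width] on an even grid.
    """
    best = math.inf
    for i in range(steps + 1):
        x = px - half_width + 2 * half_width * i / steps
        best = min(best, math.hypot(x - px, f(x) - py))
    return best


distances = {p: closest_distance(*p) for p in POINTS}
for point, d in distances.items():
    print(point, round(d, 3))
```

A grid scan is crude, but it avoids solving the equation for the normal and is accurate enough to compare against a bound of $0.2$.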
Ben rather cunningly also found another
interesting solution by considering functions which pass exactly
through the points and noting that splicing together two quadratics
yields a solution.
As well as noticing that $f(x) = f(-x)$ I noticed another
interesting property of this function (assuming the function goes
through all the points). Imagine that it was a standard $x^2$
graph when $x> 0$, then the x values needed to pass through
these points would be $0, 2, 4$. If you match these up with the
ones we are given and call the standard ones $u$ and the given ones
$x$ then you'll notice a basic pattern emerge if we
map $x\rightarrow u$: $1\rightarrow 0, 2\rightarrow 2,
3\rightarrow 4, \dots, x\rightarrow 2x-2$. Therefore
$$
u=2(x-1) \mbox{ if } x> 0
$$
If you do this with $x< 0$ then the pattern becomes
$$
u=2(x+1) \mbox{ if } x< 0
$$
This means that $u$ can be written as a function of $x$ (i.e.
$u=g(x)$):
$$
g(x) = \begin{cases} 2(x-1) & \mbox{if } x> 0\\ 2(x+1) & \mbox{if } x< 0 \end{cases}
$$
This function returns the value of $u$, which we can then square to
get the actual value of the function. Or, mathematically,
$$
f(x)=g(x)^2
$$
We can also simplify the function $g$ using the sgn function, which
returns the sign of the variable. In other words $\mathrm{sgn}(+5)=+1$,
$\mathrm{sgn}(-123)=-1$, $\mathrm{sgn}(0)=0$. Also note that $\mathrm{sgn}(x)^2 = 1$, unless
$x=0$, and that $x\,\mathrm{sgn}(x)=|x|$. This simplifies $g(x)$ to
$$
g(x)=2(x-\mathrm{sgn}(x))
$$
This can then be substituted into $f(x)$ and simplified to give
$$
f(x)=4(x-\mathrm{sgn}(x))^2 = 4x^2-8|x|+4
$$
This can be expressed as
$$
f(x)=4x^2-8\sqrt{x^2}+4
$$
This formula works, but breaks down when asked what $f(0)$ is (the
squared form with sgn gives $0$, while the expanded form gives $4$), and
it's not really a proper quadratic, but other than that it gets
all the right results and is 0 units away from every point.
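A short check of the spliced function, as a sketch only: `sgn`, `f_sgn` and `f_abs` are illustrative names, and the point list is the set of six points implied by the mapping $1\rightarrow 0$, $2\rightarrow 2$, $3\rightarrow 4$ together with the symmetry $f(x)=f(-x)$.

```python
def sgn(x):
    """Sign of x: +1 for positive, -1 for negative, 0 for zero."""
    return (x > 0) - (x < 0)


def f_sgn(x):
    """The squared form f(x) = 4 (x - sgn(x))^2."""
    return 4 * (x - sgn(x)) ** 2


def f_abs(x):
    """The expanded form f(x) = 4x^2 - 8|x| + 4."""
    return 4 * x * x - 8 * abs(x) + 4


# Assumed data points (implied by the mapping above, not stated explicitly).
POINTS = [(-3, 16), (-2, 4), (-1, 0), (1, 0), (2, 4), (3, 16)]
hits = all(f_sgn(x) == y == f_abs(x) for x, y in POINTS)
print(hits)                # True
print(f_sgn(0), f_abs(0))  # 0 4
```

The two forms agree at every data point and differ only at $x=0$, which is where the step $\mathrm{sgn}(x)^2=1$ fails.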