## Fitting Polynomials Quickly

Given two points, it is easy to find the equation of a line between them. But how difficult is it to fit a parabola to three arbitrary points, or a cubic to four, etc.?

First I’ll consider a cold-blooded way to do this, then a recursive method.

Suppose you want to fit an $n$-degree polynomial perfectly through $n+1$ points,

$(x_1,y_1)$

$(x_2,y_2)$

${} \qquad \vdots$

$(x_{n+1},y_{n+1})$

You know the polynomial will have the form

$y = a_0 + a_1x + a_2x^2 + \cdots + a_nx^n$.

Plugging in the $n+1$ values of the $x$ and $y$ coordinates, we get $n+1$ linear equations in the variables $a_0 \ldots a_n$.

$a_0 + a_1x_1 + \cdots + a_nx_1^n = y_1$

$a_0 + a_1x_2 + \cdots + a_nx_2^n = y_2$

$\vdots$

$a_0 + a_1x_{n+1} + \cdots + a_nx_{n+1}^{n} = y_{n+1}$

which we can write in matrix form as

$\left[ \begin{array}{cccc} 1 & x_1 & \ldots & x_1^n \\ 1 & x_2 & \ldots & x_2^n \\ \vdots & \vdots \\ 1 & x_{n+1} & \ldots & x_{n+1}^n \end{array} \right] \left[ \begin{array}{c} a_0 \\ a_1 \\ \vdots \\ a_{n+1} \end{array} \right] = \left[ \begin{array}{c} y_1 \\ y_2 \\ \vdots \\ y_{n+1} \end{array} \right]$

We can solve the matrix equation using Gaussian elimination. This takes $O(n^3)$ arithmetic operations1.

For the next method, begin by considering three points to be fit by a quadratic. This would be quite simple if you knew the zeroes of the quadratic. Any quadratic with zeroes at $a$ and $b$ is of the form

$A*(x-a)(x-b)$.

The freedom in $A$ lets us fit an arbitrary third point $(c,y_c)$ by setting

$y_c = A(c-a)(c-b)$

$A = \frac{y_c}{(c-a)(c-b)}$.

So the full quadratic is

$y = \frac{y_c}{(c-a)(c-b)} (x-a)(x-b)$.

But this only works for special sets of points in which two $y$ coordinates are zero. However, we can use it as a starting point to fit a curve through any three points. The idea is to find the line that goes through the first two points. Subtract that from all three coordinates, and now you’ve reduced it too the previously-solved problem, with two points zero.

Specifically, if

$y = a_2x^2 + a_1x + a_0$

goes through

$(x_1,y_1), (x_2,y_2), (x_3,y_3)$,

then

$y - \left(y_1 + \frac{y_2-y_1}{x_2-x_1}x\right)$

goes through

$(x_1,0), (x_2,0), \left(x_3, y_3 - y_1 - \frac{y_2-y_1}{x_2-x_1}x_3\right)$

Letting

$a = x_1$

$b = x_2$

$c = x_3$

$y_c = y_3 - y_1 - \frac{y_2-y_1}{x_2-x_1}x_3$

and plugging into the solution to the quadratic with zeroes gives

$y - \left(y_1 + \frac{y_2-y_1}{x_2-x_1}x\right) = \frac{y_3 - y_1 - \frac{y_2-y_1}{x_2-x_1}x_3}{(x_3-x_1)(x_3-x_2)} (x-x_1)(x-x_2)$.

We can use the result to solve an arbitrary cubic in the same way. First, find the solution when three of four points are zeroes. Then go back to the four arbitrary points and fit a quadratic through the first three. Subtract that quadratic from all four points. You’ve got three zeroes and a fourth arbitrary point. Fit that with the solution from the first step, and add the quadratic back in to complete the solution.

Similarly, we could extend the method to fit $n+1$ points with an $n^{th}$ degree polynomial.

Let’s count the number of operations to implement this method. The first step, in which we solve the specialized problem with a bunch of zeroes, involves multiplying $n$ numbers and doing one division, because

$A = \frac{y_{n+1}}{\left( (x_{n+1} - x_1)*(x_{n+1} - x_2)*\ldots*(x_{n+1} - x_n)\right)}$.

The second step, fitting the first $n$ points to a degree $n-1$ polynomial, takes an unknown number of operations, since we aren’t done our count yet.

The third step, subtracting the degree $n-1$ polynomial from each of the $n$ points, takes $n$ operations.

The fourth step, substituting the modified points into the formula from step 1, takes $n$ operations, too.

That’s it – then we’re done. So overall, fitting with a degree $n$ polynomial takes $O(n)$ more steps than fitting with a degree $n-1$ polynomial. That implies that the overall procedure, including recursion, is $O(n^2)$, an improvement over Gaussian elimination.

reference
1) Strang, Gilbert Linear Algebra and Its Applications, 2nd ed.

pp. 4-5 explains the $n^3$ count for Gaussian elimination.