
Linear Regression #

Infographic #

The linear regression infographic is shown below.

[Infographic: linear regression]

Simple Linear Regression #

Ordinary Least Squares (OLS) can be used to fit a straight line to noisy data.

If \(y\) is a vector of measured data, \(\beta_{0}\) and \(\beta_{1}\) are the true linear model parameters (intercept and gradient), and \(\epsilon\) is the error vector, the model can be expressed as:

\[y_{1} = \beta_{0} + \beta_{1}x_{1} + \epsilon_{1} \\y_{2} = \beta_{0} + \beta_{1}x_{2} + \epsilon_{2} \\\vdots \\y_{n} = \beta_{0} + \beta_{1}x_{n} + \epsilon_{n}\]

Which can be expressed as:

\[\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1} \\ 1 & x_{2} \\ \vdots & \vdots \\ 1 & x_{n} \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \end{bmatrix} + \begin{bmatrix} \epsilon_{1} \\ \epsilon_{2} \\ \vdots \\ \epsilon_{n} \end{bmatrix}\]

Or in matrix form:

\[y=X \beta + \epsilon\]

For some estimate \(\hat{\beta}\) of the model parameters, the error and error squared are defined by:

\[\epsilon = y - X \hat \beta \\\epsilon^T\epsilon = (y - X \hat \beta)^T(y - X \hat \beta)\]

Expanding the above equation, and noting that \(y^T (X \hat \beta)\) is a scalar and therefore equal to its transpose \((X \hat \beta)^T y\), gives:

\[\epsilon^T\epsilon = y^Ty - y^T (X \hat \beta) - (X \hat \beta)^Ty + (X \hat \beta)^T(X \hat \beta) \\\epsilon^T\epsilon = y^Ty - (X \hat \beta)^T y - (X \hat \beta)^Ty + (X \hat \beta)^T(X \hat \beta) \\\epsilon^T\epsilon = y^Ty - 2(X \hat \beta)^T y + (X \hat \beta)^T(X \hat \beta) \\\epsilon^T\epsilon = y^Ty - 2\hat \beta ^T X^T y + \hat \beta^T X^T X \hat \beta\]

To find the \(\hat \beta\) which minimises the squared error, the above expression can be differentiated with respect to \(\hat \beta\) and set equal to 0:

\[\frac{\partial [\epsilon^T\epsilon]}{\partial \hat \beta} = - 2 X^T y + 2 X^T X \hat \beta = 0\]
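This differentiation uses the standard matrix-calculus identities

\[\frac{\partial}{\partial \hat \beta}\left(\hat \beta^T X^T y\right) = X^T y \qquad \text{and} \qquad \frac{\partial}{\partial \hat \beta}\left(\hat \beta^T X^T X \hat \beta\right) = 2 X^T X \hat \beta,\]

where the second holds because \(X^T X\) is symmetric.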

Therefore,

\[X^T X \hat \beta = X^T y\]

Finally, this can be rearranged to give the familiar OLS equation for the estimated coefficient vector:

\[\hat \beta = (X^T X)^{-1} X^T y\]
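As a quick sketch of how this looks in practice, the estimate can be computed directly with NumPy. The data below is synthetic and the true coefficients (2 and 3) are made up purely for illustration:

```python
import numpy as np

# Synthetic noisy data around the (illustrative) true line y = 2 + 3x
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(scale=1.0, size=x.size)

# Design matrix: a column of ones (intercept) and a column of x values
X = np.column_stack([np.ones_like(x), x])

# beta_hat = (X^T X)^{-1} X^T y, solved without forming the explicit inverse
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # expect values close to [2, 3]
```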

Multiple Linear Regression #

The example above demonstrates fitting a linear model with intercept and gradient parameters to noisy data. We can also use the OLS equation to fit a higher-order polynomial to noisy data, in this case one where the independent variable is squared. Although this may seem counter-intuitive, a linear relationship still holds between the independent variables ( \(x^0\) , \(x^1\) and \(x^2\) ) and the dependent variable \(y\) , so we can still use the OLS equation. Using linear regression to fit a quadratic model in this way is termed Multiple Linear Regression, since the number of independent variables is now greater than 1.

Along the same lines as the previous section, let’s consider a 2nd-order polynomial model. Again \(y\) is a vector of measured data, but this time \(\beta_{0}\) , \(\beta_{1}\) and \(\beta_{2}\) are the quadratic model parameters, \(x\) is the independent variable and \(\epsilon\) is the error vector. The model can therefore be expressed as:

\[y_{1} = \beta_{0} + \beta_{1}x_{1} + \beta_{2}x_{1}^2 + \epsilon_{1} \\y_{2} = \beta_{0} + \beta_{1}x_{2} + \beta_{2}x_{2}^2 + \epsilon_{2} \\\vdots \\y_{n} = \beta_{0} + \beta_{1}x_{n} + \beta_{2}x_{n}^2 + \epsilon_{n}\]

As mentioned previously, this non-linear model can effectively be treated as a linear model if, instead of one independent variable \(x\) , we consider the model to have 3 independent variables \(x^0\) , \(x^1\) and \(x^2\) . Then we can write the model in the required linear form:

\[\begin{bmatrix} y_{1} \\ y_{2} \\ \vdots \\ y_{n} \end{bmatrix} = \begin{bmatrix} 1 & x_{1} & x_{1}^2 \\ 1 & x_{2} & x_{2}^2 \\ \vdots & \vdots & \vdots \\ 1 & x_{n} & x_{n}^2 \end{bmatrix} \begin{bmatrix} \beta_{0} \\ \beta_{1} \\ \beta_{2} \end{bmatrix} + \begin{bmatrix} \epsilon_{1} \\ \epsilon_{2} \\ \vdots \\ \epsilon_{n} \end{bmatrix}\]

Or in matrix form:

\[y=X \beta + \epsilon\]

And as per the simple linear regression example, we can calculate \(\hat \beta\) from the OLS equation:

\[\hat \beta = (X^T X)^{-1} X^T y\]
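Only the design matrix changes; the OLS solve is identical. A minimal sketch with synthetic data (the true coefficients 1, -2 and 0.5 are again made up for illustration):

```python
import numpy as np

# Synthetic noisy data around the (illustrative) curve y = 1 - 2x + 0.5x^2
rng = np.random.default_rng(1)
x = np.linspace(-5, 5, 100)
y = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.5, size=x.size)

# Design matrix with columns x^0, x^1 and x^2
X = np.column_stack([np.ones_like(x), x, x**2])

# Identical OLS solve as in the simple linear regression case
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)
print(beta_hat)  # expect values close to [1, -2, 0.5]
```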

Python Implementation #

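A self-contained sketch of an OLS implementation covering both examples above is given below. The function names, test data and chosen coefficients are our own illustrative choices, not a fixed API:

```python
import numpy as np

def ols_fit(X: np.ndarray, y: np.ndarray) -> np.ndarray:
    """Return the OLS estimate beta_hat = (X^T X)^{-1} X^T y.

    The normal equations are passed to np.linalg.solve rather than
    forming the explicit inverse, which is more numerically stable.
    """
    return np.linalg.solve(X.T @ X, X.T @ y)

def polynomial_design_matrix(x: np.ndarray, degree: int) -> np.ndarray:
    """Build the design matrix with columns x^0, x^1, ..., x^degree."""
    return np.column_stack([x**p for p in range(degree + 1)])

if __name__ == "__main__":
    rng = np.random.default_rng(42)
    x = np.linspace(-3, 3, 200)

    # Simple linear regression: data generated from y = 2 + 3x plus noise
    y_line = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=x.size)
    print(ols_fit(polynomial_design_matrix(x, 1), y_line))

    # Multiple linear regression: data from y = 1 - 2x + 0.5x^2 plus noise
    y_quad = 1.0 - 2.0 * x + 0.5 * x**2 + rng.normal(scale=0.5, size=x.size)
    print(ols_fit(polynomial_design_matrix(x, 2), y_quad))

    # Cross-check against NumPy's least-squares polynomial fit
    # (np.polyfit returns coefficients highest-degree first, hence [::-1])
    print(np.polyfit(x, y_quad, 2)[::-1])
```

For larger or badly scaled problems, np.linalg.lstsq would be the safer choice, since it avoids forming \(X^T X\) at all; the normal-equations route is used here because it mirrors the derivation above.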

Do this with twinkly.io #

We are currently working on adding this capability to twinkly.io. Please check back at a future date.
