Force piecewise linear fit through data
April 15, 2018
I’ve received a few request for pwlf to perform fits through a particular data point, or set of data points. For instance, you may know that your model must go through the origin (0, 0). We can use Lagrange multipliers to solve a constrained least squares problem to find the best piecewise linear fit while forcing the fit through a set of data points.
Constrained least squares fit
So from my previous post, we have defined a piecewise linear regression problem as
where \(\mathbf{A} \) is our regression matrix, \(\mathbf{\beta} \) is our unknown set of parameters, and \(\mathbf{y} \) is our vector of \(y \) values. We have a set of data points at \((\mathbf{w},\mathbf{z}) \) locations where \( w\) is the \(x \) locations, and \( z\) is the \(y \) locations. We we want to force our fit to go through \((\mathbf{w},\mathbf{z}) \).
I searched for a good explanation of constrained least squares problems, and I think the upcoming book by Boyd and Vandenberghe [1] is great. Essentially we need to construct a constraint matrix \(\mathbf{C} \), such that
is strictly enforced. The constraint matrix \(\mathbf{C} \), is of the exact same form as \(\mathbf{A} \), but evaluated at the \(w \) values as oppose to the \(x \) data points. This means \(\mathbf{C} \) will be a matrix with the number of constraints as rows, and one plus the number of line segments as columns. Using a Lagrangian formulation defined in [1], we can set up the constrained least squares problem as
where \(\mathbf{\zeta} \) is some set of Lagrangian multipliers which will be solved along with \(\mathbf{\beta} \). This is the KKT equations for a constrained least squares problem. We are left with a square matrix, which can be used to solve for our unknown set of \(\mathbf{\beta} \) parameters.
Python implementation in pwlf
This has now been implemented into pwlf with the latest version. You can now fit piecewise linear function for a particular number of line segments, and force the fit to go through a particular set of points.
This example shows how you can force a fit through the origin given a set of x y data.
import numpy as np
import pwlf
# Arbitrary sin wave
x = np.linspace(0.0, 1.0, num=100)
y = np.sin(6.0*x)
# initialize pwlf
myPWLF = pwlf.PiecewiseLinFit(x, y, disp_res=True)
x_c = [0.0]
y_c = [0.0]
# fit three line segments, and force fit through (x_c,y_c)
res = myPWLF.fit(3, x_c, y_c)
This will give you a similar fit to the following, which has been forced to go through the origin.
References
[1] Boyd, Stephen and Vandenberghe, Lieven. Introduction to Applied Linear Algebra - Vectors, Matrices, and Least Squares. Chapter 16 - Constrained least squares. Cambridge University Press. 2018. link