This article shows how a point on the loss surface corresponds to a regression line in the X-Y plane. The example used here is a simple linear regression between two variables: sunshine (in hours) and attendance (in thousands).
# Import libraries
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
# Importing the dataset
dataset = pd.read_csv('sunshine.csv')
# Check the data
dataset.head()
# Check correlation between dependent and independent variables
dataset.corr()
# Assign columns to X and y
X = dataset.iloc[:, [0]].values
y = dataset.iloc[:, 1].values
print(X.shape)
print(y.shape)
# Check the scatter plot
plt.scatter(X, y)
plt.xlabel("Sunshine in hrs")
plt.ylabel("Attendance in '000s")
plt.title("Sunshine vs Attendance")
plt.show()
# Create LinearRegression model
from sklearn.linear_model import LinearRegression
# Create linear regression object
model = LinearRegression()
model.fit(X, y)
print(model.coef_)
print(model.intercept_)
# Draw the predicted line
plt.scatter(X, y)
plt.plot(X, model.predict(X))
plt.xlabel("Sunshine in hrs")
plt.ylabel("Attendance in '000s")
plt.title("Sunshine vs Attendance")
plt.show()
Now the best-fit line has a loss, defined as the sum of squared errors (the L2 loss), which we want to minimise:
min Σ (actual y − predicted y)²
So for the fitted coefficient of 5.45 we can compute this loss. Let's plot the loss against the coefficient, side by side with our regression line.
# Compute the L2 loss for the fitted line
ypred = model.predict(X)
loss = np.sum((y - ypred)**2)
plt.scatter(model.coef_, loss)
plt.xlabel('w')
plt.ylabel('loss')
plt.show()
Now, let's vary the coefficient over a range from 2.5 to 9 and plot the different lines that we get.
For each coefficient you get a line and a corresponding loss, so each loss point on the LHS figure is actually a regression line on the RHS figure. We have ignored the bias/intercept so far in this visualization.
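A minimal sketch of such a sweep could look like the following. This is illustrative rather than the exact code behind the article's figures: it assumes the X, y and model objects defined above and holds the intercept fixed at its fitted value.
# Sweep the coefficient from 2.5 to 9, keeping the fitted intercept fixed
coefs = np.arange(2.5, 9.5, 0.5)
losses = []
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
for w in coefs:
    ypred_w = w * X.ravel() + model.intercept_
    losses.append(np.sum((y - ypred_w) ** 2))
    ax2.plot(X, ypred_w, alpha=0.5)   # one regression line per coefficient (RHS)
ax1.plot(coefs, losses, marker='o')   # loss for each coefficient (LHS)
ax1.set_xlabel('w')
ax1.set_ylabel('loss')
ax2.scatter(X, y)
ax2.set_xlabel("Sunshine in hrs")
ax2.set_ylabel("Attendance in '000s")
plt.show()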
Plotting L2 Loss
Suppose we plot the loss against the bias instead: we will trace out a similar curve. The L2 loss function is quadratic in nature, hence we get a U-shaped curve.
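As a minimal illustrative sketch (not from the original article), we can hold the slope at the fitted coefficient and sweep an assumed range of bias values:
# Hold the slope at the fitted coefficient and vary only the bias
biases = np.arange(10, 22, 0.5)
bias_losses = [np.sum((y - (model.coef_[0] * X.ravel() + b)) ** 2) for b in biases]
plt.plot(biases, bias_losses)
plt.xlabel('bias')
plt.ylabel('loss')
plt.title('L2 loss vs bias')
plt.show()
Varying the slope and the bias together turns this curve into a surface: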
# Compute the summed squared error for every (slope, bias) pair on a grid
slope = np.arange(2.5, 7.5, 0.5)
bias = np.arange(13.2, 18, 0.5)
w0, w1 = np.meshgrid(slope, bias)
loss = np.zeros_like(w0)
for i in range(w0.shape[0]):
    for j in range(w0.shape[1]):
        ypred_grid = w0[i, j] * X.ravel() + w1[i, j]
        loss[i, j] = np.sum((y - ypred_grid) ** 2)

from mpl_toolkits.mplot3d import Axes3D  # needed on older matplotlib versions
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
surf = ax.plot_surface(w0, w1, loss,
                       cmap='viridis',
                       edgecolor='none')
ax.set_xlabel('Slope')
ax.set_ylabel('Bias')
ax.set_zlabel('L2 loss')
plt.show()
Geometrically, the L2 loss function is a convex function, as shown above.
Plotting L1 Loss
Similarly, you can plot the L1 loss which is abs(y-ypred). Here there is no quadratic term. So how does the geometry of this loss function look? It looks V-shaped.
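A rough sketch of the same idea, reusing the w0, w1 grid and X, y from the L2 plot above (again illustrative rather than the article's original code):
# L1 loss: sum of absolute errors for each (slope, bias) pair
l1_loss = np.zeros_like(w0)
for i in range(w0.shape[0]):
    for j in range(w0.shape[1]):
        ypred_grid = w0[i, j] * X.ravel() + w1[i, j]
        l1_loss[i, j] = np.sum(np.abs(y - ypred_grid))
fig = plt.figure()
ax = fig.add_subplot(projection='3d')
ax.plot_surface(w0, w1, l1_loss, cmap='viridis', edgecolor='none')
ax.set_xlabel('Slope')
ax.set_ylabel('Bias')
ax.set_zlabel('L1 loss')
plt.show()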
You can visualize the other loss functions in the same way.
I have made a video on this topic and uploaded it here.
The code is also uploaded.