Introduction
Regression is one of the most important tools in machine learning. It’s used to predict numbers, like estimating house prices, forecasting stock trends, or analyzing customer behavior. Simply put, regression helps us understand the relationship between different variables and make accurate predictions. In this blog post, we’ll explore the top 10 regression techniques you need to know in 2025. These techniques are powerful tools for solving a range of problems, from simple predictions to more complex scenarios involving large datasets.
Each method has unique strengths, and choosing the right one can greatly improve the accuracy of your results. Don’t worry if you’re new to regression—this guide will explain everything.
By the end, you’ll have a solid understanding of these techniques, how they work, and where to apply them in real-world projects. Let’s get started!

1. Linear Regression: A Simple and Powerful Regression Technique
What is Linear Regression?
Linear regression is one of the simplest and most commonly used techniques in machine learning. It helps us understand the relationship between two variables by fitting a straight line through the data points. This line is used to predict the value of one variable based on the value of the other. In other words, linear regression answers the question: “How does one thing change when another thing changes?”
The formula for linear regression is expressed as:
Y = mX + b
Where:
- Y is the dependent variable (the value you want to predict).
- X is the independent variable (the variable you use to predict Y).
- m is the slope of the line (indicating how much Y changes when X changes).
- b is the y-intercept (where the line crosses the Y-axis when X = 0).
How Does Linear Regression Work?
Let’s walk through an example to make it clearer:
Imagine that you’re a real estate agent, and you want to predict the price of a house based on its size (in square feet). Here is data collected for several houses, with their sizes and prices:
House Size (X) | Price (Y) |
---|---|
1,000 sq ft | $150,000 |
1,500 sq ft | $200,000 |
2,000 sq ft | $250,000 |
2,500 sq ft | $300,000 |
Now you can use linear regression to predict the price of a house based on its size. The technique will find the best line that fits this data, so you can use the size (X) to predict the price (Y).
In this case, the relationship between the size of the house and its price appears to be linear: as the size increases, so does the price. Linear regression will calculate the slope (m) and intercept (b) for the best-fit line that represents this relationship.
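For this particular table, the slope and intercept can even be worked out by hand, because the price rises at a constant rate with size:
m = (300,000 − 150,000) / (2,500 − 1,000) = 100 dollars per square foot
b = 150,000 − 100 × 1,000 = 50,000
So the best-fit line is Price = 100 × Size + 50,000, and a 1,800 sq ft house would be predicted at 100 × 1,800 + 50,000 = $230,000.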
Example: Applying Linear Regression Techniques in Python
Here’s a Python implementation of linear regression using the scikit-learn library, which is a popular machine learning library. We’ll use the same house pricing example to illustrate how linear regression works in practice.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
# Sample data: House size in square feet and their corresponding prices
house_size = np.array([1000, 1500, 2000, 2500]).reshape(-1, 1) # Independent variable (X)
house_price = np.array([150000, 200000, 250000, 300000]) # Dependent variable (Y)
# Splitting the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(house_size, house_price, test_size=0.2, random_state=42)
# Initializing the Linear Regression model
model = LinearRegression()
# Training the model
model.fit(X_train, y_train)
# Making predictions on the test data
y_pred = model.predict(X_test)
# Evaluating the model
# Note: with only 4 data points the test set holds a single sample, so R² is not
# well-defined here; these metrics are shown purely for illustration.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
# Displaying the results
print(f"Mean Squared Error: {mse}")
print(f"R-squared: {r2}")
# Plotting the data and the regression line
plt.scatter(house_size, house_price, color='blue', label='Data Points')
plt.plot(house_size, model.predict(house_size), color='red', label='Regression Line')
plt.title('House Price Prediction')
plt.xlabel('Size of House (sq ft)')
plt.ylabel('Price ($)')
plt.legend()
plt.show()

Explanation of the Code:
- Importing Libraries: We use numpy for handling numerical data, matplotlib for plotting the graph, and sklearn for building and evaluating the model.
- Data Preparation: house_size is the independent variable (X), containing the sizes of the houses; house_price is the dependent variable (Y), containing the corresponding house prices.
- Train-Test Split: We split the data with train_test_split, using 80% of it to train the model and the remaining 20% to test it.
- Model Creation: We initialize the LinearRegression() model and train it on the training data (X_train, y_train).
- Prediction: After training, we use the model to predict the house prices for the test data (X_test).
- Model Evaluation: We evaluate the model by calculating:
  - Mean Squared Error (MSE): a measure of how close the predicted values are to the actual values. Lower MSE indicates better performance.
  - R-squared (R²): how well the regression line fits the data. An R² value closer to 1 means the model explains most of the variance in the data.
- Plotting the Results: The code also plots the original data points and the regression line, helping visualize how well the model fits the data.
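Once the model is trained, it can also price a new listing. Here is a minimal follow-up sketch that assumes the model object fitted above; the 1,800 sq ft figure is just an illustrative input.
# Predicting the price of a new 1,800 sq ft house with the trained model
new_size = np.array([[1800]])  # must be 2D, matching the shape used for training
predicted_price = model.predict(new_size)
print(f"Predicted price for 1,800 sq ft: ${predicted_price[0]:,.0f}")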
When to Use Linear Regression
Linear regression is ideal when:
- There is a clear, linear relationship between the variables.
- The data is not too complex or noisy.
- You are working with numerical data.
Limitations of Linear Regression
While linear regression is a great starting point, it does have some limitations:
- Assumes linearity: Linear regression assumes that the relationship between the variables is linear. If the relationship is more complex (e.g., exponential), linear regression might not work well.
- Sensitive to outliers: Outliers (extreme data points) can significantly affect the slope and intercept, leading to inaccurate predictions (see the sketch after this list).
- Multicollinearity: If multiple independent variables are highly correlated with each other, it can create problems in the model’s performance.
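To make the outlier point concrete, here is a minimal sketch that refits the house-price example with one extra, deliberately made-up outlier (a large house that sold unusually cheaply) and compares the fitted slopes.
# How a single outlier shifts the fitted slope (outlier value is made up for illustration)
import numpy as np
from sklearn.linear_model import LinearRegression
sizes = np.array([1000, 1500, 2000, 2500]).reshape(-1, 1)
prices = np.array([150000, 200000, 250000, 300000])
clean_slope = LinearRegression().fit(sizes, prices).coef_[0]
# Add one outlier: a 2,600 sq ft house sold for only $100,000
sizes_out = np.vstack([sizes, [[2600]]])
prices_out = np.append(prices, 100000)
outlier_slope = LinearRegression().fit(sizes_out, prices_out).coef_[0]
print(f"Slope without outlier: {clean_slope:.1f} $/sq ft")
print(f"Slope with outlier:    {outlier_slope:.1f} $/sq ft")
A single bad data point pulls the slope well below the 100 $/sq ft seen in the clean data, which is why outliers deserve attention before fitting.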
2. Logistic Regression: Predicting Probabilities and Classifications
What is Logistic Regression?
While linear regression helps predict continuous values (like house prices), logistic regression is used when we need to predict categorical outcomes. For example, you might want to predict whether an email is spam or not, or if a patient has a disease or doesn’t.
Unlike linear regression, which predicts continuous values, logistic regression is designed to predict the probability of a binary outcome (yes/no, 0/1, true/false). The result of logistic regression is a probability that is then mapped to one of the two categories.
How Does Logistic Regression Work?
The formula for logistic regression is based on the logistic function, also known as the sigmoid function. It transforms the output of a linear equation into a value between 0 and 1, which is perfect for binary classification.
The equation for logistic regression is:
P(Y = 1 | X) = 1 / (1 + e^-(b_0 + b_1·X))
Where:
- P(Y=1 | X) is the probability of the positive class (e.g., “spam” or “disease”).
- X is the independent variable(s) (input features).
- b_0 and b_1 are the model parameters (intercept and slope).
- e is Euler’s number (a mathematical constant).
The output is always a probability between 0 and 1. If this probability is greater than 0.5, we classify it as 1 (positive class), and if it’s less than 0.5, we classify it as 0 (negative class).
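To make the threshold rule concrete, here is a minimal sketch of the sigmoid mapping. The intercept and coefficient (b0 = −4, b1 = 1) are made-up values chosen only for illustration, not learned from any data.
# Sigmoid mapping from a linear score to a probability, then to a class label
import numpy as np
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
b0, b1 = -4.0, 1.0  # hypothetical intercept and slope
for links in [2, 5, 7]:
    p = sigmoid(b0 + b1 * links)  # P(spam | number of links)
    label = 1 if p > 0.5 else 0   # apply the 0.5 threshold
    print(f"{links} links -> P(spam) = {p:.2f} -> class {label}")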
Example: Predicting Email Spam or Not (Binary Classification)
Let’s consider a simple example where you want to predict whether an email is spam or not based on the number of links in the email.
Here’s a dataset for training:
Number of Links (X) | Spam (Y) |
---|---|
2 | 0 |
5 | 1 |
7 | 1 |
3 | 0 |
6 | 1 |
In this case:
- X (Number of Links) is the feature.
- Y (Spam or Not) is the target variable (0 for not spam, 1 for spam).
Python Code for Logistic Regression Techniques
Let’s see how to apply logistic regression in Python using the scikit-learn library to predict whether an email is spam or not based on the number of links in the email.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix
# Sample data: Number of links in email and whether the email is spam (1) or not (0)
X = np.array([2, 5, 7, 3, 6]).reshape(-1, 1) # Feature: Number of Links
Y = np.array([0, 1, 1, 0, 1]) # Target: Spam (1) or Not Spam (0)
# Splitting the data into training and testing sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=0.2, random_state=42)
# Initializing the Logistic Regression model
model = LogisticRegression()
# Training the model
model.fit(X_train, y_train)
# Making predictions on the test data
y_pred = model.predict(X_test)
# Evaluating the model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
# Displaying the results
print(f"Accuracy: {accuracy}")
print(f"Confusion Matrix:\n{conf_matrix}")
# Plotting the data points and the fitted probability curve
X_range = np.linspace(X.min(), X.max(), 100).reshape(-1, 1)  # sorted grid so the curve is smooth
plt.scatter(X, Y, color='blue', label='Data Points')
plt.plot(X_range, model.predict_proba(X_range)[:, 1], color='red', label='Logistic Regression Curve')
plt.title('Spam Prediction')
plt.xlabel('Number of Links')
plt.ylabel('Spam (1) or Not Spam (0)')
plt.legend()
plt.show()

Explanation of the Code:
- Importing Libraries: numpy for handling numerical data, matplotlib for plotting the graph, sklearn.linear_model for the logistic regression model, sklearn.model_selection for splitting the data into training and test sets, and sklearn.metrics for evaluating the model.
- Data Preparation: X holds the number of links in each email, and Y records whether the email is spam (1) or not spam (0).
- Train-Test Split: We split the data into training and test sets, using 80% of the data for training and 20% for testing.
- Model Training: The LogisticRegression model is created and trained on the training data (X_train, y_train).
- Prediction: The model is then used to predict whether the test emails are spam or not.
- Evaluation: We calculate accuracy, which tells us the percentage of correct predictions, and display the confusion matrix, which shows how many emails were correctly classified as spam or not spam.
- Plotting: The plot displays the data points and the logistic regression probability curve, showing how the predicted probability of spam rises with the number of links.
When to Use Logistic Regression
Logistic regression is best suited for problems where:
- The target variable is binary (two possible outcomes, such as yes/no, true/false).
- You want to predict probabilities of an event occurring (such as the probability of an email being spam).
- The relationship between the dependent and independent variables is approximately linear.
Advantages and Limitations of Logistic Regression Techniques
Advantages:
- Simple and easy to implement: Logistic regression is easy to understand and implement, especially for binary classification problems.
- Interpretable results: The coefficients provide insights into the impact of the independent variables on the probability of the outcome.
- Works well for linearly separable data: If the data can be separated by a straight line (or a hyperplane in higher dimensions), logistic regression performs well.
Limitations:
- Assumes linearity: Logistic regression assumes that the log-odds of the dependent variable is a linear combination of the independent variables, which may not always be the case.
- Sensitive to outliers: Outliers in the data can distort the predictions and performance of the model.
- Binary outcomes only: Logistic regression is primarily used for binary classification. For multiclass problems, you would need to use multinomial logistic regression.
3. Ridge Regression: Tackling Overfitting with Regularization
What is Ridge Regression?
Ridge regression is a variation of linear regression that aims to address the problem of overfitting. Overfitting happens when the model becomes too complex and fits the noise in the data rather than the actual underlying pattern. In simple terms, it happens when a model is too sensitive to small fluctuations in the training data, which makes it perform poorly on unseen (test) data.
How Does Ridge Regression Work?
Ridge regression adds an additional term to the ordinary least squares objective used in linear regression. This term is the L2 penalty, the sum of the squared values of the model’s coefficients (weights). The cost function for ridge regression looks like this:
Cost = Sum of Squared Errors + λ × (β_1² + β_2² + … + β_n²)
Where:
- Sum of Squared Errors: This is the regular error term used in linear regression.
- λ (lambda): This is a regularization parameter. It controls how much penalty we apply to the size of the coefficients. A larger value of λ will lead to smaller coefficients, while a smaller value will allow the model to fit the data more closely.
- β_i: These are the coefficients of the model.
- n: The number of features.
The goal of ridge regression is to find the set of coefficients that minimize both the error and the penalty term. The larger the λ, the stronger the penalty, which means the coefficients will shrink, and the model becomes simpler.
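To see this shrinkage directly, here is a minimal sketch that fits Ridge on the small house-price dataset from the example that follows, using two arbitrary λ (alpha) values to contrast a weak and a strong penalty; the exact coefficient values depend on the data and on feature scaling.
# Comparing ridge coefficients under a weak and a strong penalty
import numpy as np
from sklearn.linear_model import Ridge
X = np.array([[1000, 3], [1500, 4], [2000, 4], [2500, 5], [3000, 5]])  # sq ft, rooms
y = np.array([300000, 400000, 450000, 500000, 550000])                 # price
for alpha in [0.1, 1000.0]:  # arbitrary small vs. large regularization strengths
    coefs = Ridge(alpha=alpha).fit(X, y).coef_
    print(f"alpha={alpha}: coefficients = {coefs}")
With the larger alpha, the coefficient vector is pulled toward zero, which is exactly the "simpler model" behaviour described above.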
Example: Predicting House Prices with Ridge Regression
Let’s consider a dataset where we are trying to predict the price of a house based on the number of rooms and square footage. Suppose we have some data:
Square Footage | Rooms | House Price (Target) |
---|---|---|
1000 | 3 | 300,000 |
1500 | 4 | 400,000 |
2000 | 4 | 450,000 |
2500 | 5 | 500,000 |
3000 | 5 | 550,000 |
Without regularization, linear regression might give large weights to certain features, especially if the data is noisy. Ridge regression helps to reduce these large weights and makes the model more reliable.
Ridge Regression Techniques in Action
To understand how ridge regression works in practice, let’s take a closer look at how it would apply to this dataset. The general steps are:
- Fit the model using linear regression.
- Add the regularization term to the error function to penalize large weights.
- Solve the cost function to find the optimal weights that balance fitting the data and keeping the weights small.
Python Code for Ridge Regression Techniques
Let’s implement ridge regression in Python using scikit-learn. We’ll use the house price example with square footage and rooms as input features.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Sample data: Square footage, Rooms, and House Price
X = np.array([[1000, 3], [1500, 4], [2000, 4], [2500, 5], [3000, 5]]) # Features: Square footage, Rooms
y = np.array([300000, 400000, 450000, 500000, 550000]) # Target: House Price
# Splitting data into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initializing the Ridge Regression model with lambda (alpha) = 1.0
ridge_reg = Ridge(alpha=1.0)
# Training the model
ridge_reg.fit(X_train, y_train)
# Making predictions on the test data
y_pred = ridge_reg.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred) # Mean Squared Error
print(f"Mean Squared Error: {mse}")
# Plotting the predictions vs actual values
plt.scatter(y_test, y_pred)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red', linestyle='--')
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Ridge Regression: Predicted vs Actual House Prices')
plt.show()

Explanation of the Code:
- Importing Libraries: numpy for data manipulation, matplotlib for visualizing the results, Ridge from sklearn for ridge regression, train_test_split for splitting the dataset into training and test sets, and mean_squared_error to evaluate the model’s performance.
- Data Preparation: We create a simple dataset with square footage and rooms as features and house price as the target variable.
- Train-Test Split: We split the data into training and test sets (80% for training, 20% for testing).
- Ridge Regression Model: We initialize the Ridge regression model with an alpha (regularization strength) of 1.0. The higher the alpha, the stronger the penalty and the more the coefficients are shrunk.
- Model Training: The model is trained on the training data using the fit() method.
- Prediction and Evaluation: We make predictions on the test data using the predict() method, and the Mean Squared Error (MSE) is calculated to assess how well the model is performing.
- Plotting: We plot the true house prices vs. the predicted prices to visualize the performance of the model. The red dashed line represents perfect predictions, and the points show how close the predictions come to it.
When to Use Ridge Regression Techniques
Ridge regression is most beneficial when:
- You have a large number of features (especially when some of them are highly correlated).
- The model is overfitting and you’re looking for a way to shrink the coefficients without completely removing any feature.
- You want to improve model generalization and avoid overfitting, especially with complex or noisy data.
Advantages and Limitations of Ridge Regression Techniques
Advantages:
- Reduces overfitting: By shrinking the coefficients, ridge regression can prevent overfitting and help the model generalize better to unseen data.
- Works well with correlated features: It can handle situations where the features are highly correlated, unlike regular linear regression, which may give unstable estimates when features are correlated.
- Computationally efficient: Ridge regression is computationally efficient, even for large datasets.
Limitations:
- Does not perform feature selection: Unlike Lasso regression, which can set some coefficients to zero (effectively removing features), ridge regression shrinks all coefficients but does not eliminate any.
- Sensitive to the choice of λ (alpha): The value of the regularization parameter λ must be carefully chosen. If it’s too large, it can underfit the model, and if it’s too small, the model might overfit.
4. Lasso Regression: Feature Selection with Regularization
What is Lasso Regression?
Lasso Regression, which stands for Least Absolute Shrinkage and Selection Operator, is another variation of linear regression that adds a regularization term. Like Ridge Regression, Lasso also addresses overfitting by penalizing large coefficients. However, Lasso has a unique feature: it can set some coefficients to zero, effectively removing those features from the model.
In simpler terms, Lasso is like a feature selector. It helps not only to shrink coefficients but also to automatically perform feature selection by removing unnecessary features.
How Does Lasso Regression Work?
The main difference between Ridge and Lasso lies in the penalty term. While Ridge uses an L2 penalty (the sum of squared coefficients), Lasso uses an L1 penalty (the sum of the absolute values of the coefficients). This difference in penalties gives Lasso the ability to set some coefficients exactly to zero, which leads to simpler models with fewer features.
The formula for the cost function in Lasso regression looks like this:
Cost = Sum of Squared Errors + λ × (|β_1| + |β_2| + … + |β_n|)
Where:
- Sum of Squared Errors: This is the usual error term in linear regression.
- λ (lambda): The regularization parameter, which controls how strongly the penalty is applied. A larger λ will shrink the coefficients more, and possibly remove some of them entirely (set them to zero).
- β_i: These are the model’s coefficients.
- n: The number of features.
The goal of Lasso regression is to find the coefficients that minimize both the error term and the penalty, with the added benefit of eliminating unimportant features.
Example: Predicting House Prices with Lasso Regression Techniques
Imagine we have a dataset with features like square footage, number of rooms, and age of the house. We want to predict the house price. Some of these features may be irrelevant or redundant. Lasso regression helps us select only the most important features and discard the less relevant ones.
Here’s an example dataset:
Square Footage | Rooms | House Age | House Price (Target) |
---|---|---|---|
1000 | 3 | 10 | 300,000 |
1500 | 4 | 8 | 400,000 |
2000 | 4 | 5 | 450,000 |
2500 | 5 | 3 | 500,000 |
3000 | 5 | 1 | 550,000 |
Lasso regression will help us identify the most important features (like square footage or rooms) and eliminate less relevant ones (like house age if it turns out not to affect the price significantly).
Why Use Lasso Regression Techniques?
Lasso regression is particularly useful when:
- You have a large number of features, and you suspect some of them are irrelevant or redundant.
- You want to automatically select features in your model, so you don’t have to manually decide which ones to include.
- You are dealing with high-dimensional datasets where the number of features exceeds the number of observations.
Python Code for Lasso Regression Techniques
Let’s walk through an implementation of Lasso regression in Python. We will use the house price example again with square footage, rooms, and house age as features.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Lasso
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Sample data: Square footage, Rooms, House Age, and House Price
X = np.array([[1000, 3, 10], [1500, 4, 8], [2000, 4, 5], [2500, 5, 3], [3000, 5, 1]]) # Features: Square footage, Rooms, Age
y = np.array([300000, 400000, 450000, 500000, 550000]) # Target: House Price
# Splitting data into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initializing the Lasso Regression model with lambda (alpha) = 1.0
lasso_reg = Lasso(alpha=1.0)
# Training the model
lasso_reg.fit(X_train, y_train)
# Making predictions on the test data
y_pred = lasso_reg.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred) # Mean Squared Error
print(f"Mean Squared Error: {mse}")
# Plotting the predictions vs actual values
plt.scatter(y_test, y_pred)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red', linestyle='--')
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Lasso Regression: Predicted vs Actual House Prices')
plt.show()
Explanation of the Code:
- Importing Libraries: numpy for data manipulation, matplotlib for plotting results, Lasso from sklearn for performing Lasso regression, train_test_split for dividing the data into training and test sets, and mean_squared_error for evaluating model performance.
- Data Preparation: We create a simple dataset with square footage, rooms, and house age as features, and house price as the target variable.
- Train-Test Split: The dataset is split into training and testing sets.
- Lasso Regression Model: The Lasso model is initialized with a regularization parameter (alpha) of 1.0. Alpha controls the strength of the penalty term; a larger alpha leads to more feature elimination.
- Model Training: The model is trained on the training data using the fit() method.
- Prediction and Evaluation: The model makes predictions on the test set using the predict() method, and Mean Squared Error (MSE) is used to evaluate its performance.
- Plotting: The predicted house prices are plotted against the true house prices to see how well the model performs. The red dashed line indicates perfect predictions.
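To see which features Lasso actually keeps, you can inspect the fitted coefficients. This is a minimal follow-up sketch that assumes the lasso_reg model trained above; note that on this tiny, unscaled dataset an alpha of 1.0 is a very weak penalty, so all coefficients may remain non-zero, and a much larger alpha (or standardized features) may be needed before any coefficient is driven exactly to zero.
# Inspecting which features Lasso kept (non-zero) and which it dropped (zero)
feature_names = ['square_footage', 'rooms', 'house_age']
for name, coef in zip(feature_names, lasso_reg.coef_):
    status = 'kept' if coef != 0 else 'dropped'
    print(f"{name}: coefficient = {coef:.2f} ({status})")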
Advantages and Limitations of Lasso Regression Techniques
Advantages:
- Feature selection: Lasso can automatically remove irrelevant features by setting their coefficients to zero, leading to simpler models.
- Prevents overfitting: By adding the L1 penalty, Lasso helps reduce overfitting, especially in models with many features.
- Improves model interpretability: Since Lasso tends to select only a few features, the resulting model is easier to interpret.
Limitations:
- Can be too aggressive: If the regularization parameter λ (alpha) is too large, Lasso may eliminate useful features.
- Sensitive to the choice of α: The value of alpha needs to be carefully chosen. If it’s too small, the model may overfit, and if it’s too large, the model might underfit.
- Not ideal for highly correlated features: If two features are highly correlated, Lasso might randomly choose one and discard the other, which may not always be desirable.
When to Use Lasso Regression Techniques
Lasso regression is especially useful when:
- You have a large set of features, and you suspect some of them are irrelevant.
- You want an automatic way of selecting the most important features and eliminating the unnecessary ones.
- You are working with high-dimensional datasets and need a simple model with fewer features.
5. Elastic Net Regression: Combining Ridge and Lasso for Optimal Performance
What is Elastic Net Regression?
Elastic Net Regression is a machine learning technique that combines the features of both Ridge Regression (L2 regularization) and Lasso Regression (L1 regularization). It is particularly useful when there are many features in the dataset, and it’s uncertain whether Lasso or Ridge would be the better choice for regularization.
Elastic Net works by adding a mix of L1 and L2 penalties to the cost function, making it a more flexible and effective tool than either Ridge or Lasso alone. This combined penalty can help when there are highly correlated features in the dataset or when the number of predictors exceeds the number of observations.
How Does Elastic Net Regression Work?
The Elastic Net cost function can be represented as:
Cost = Sum of Squared Errors + λ₁ × (|β_1| + |β_2| + … + |β_n|) + λ₂ × (β_1² + β_2² + … + β_n²)
Where:
- Sum of Squared Errors: This is the usual error term in linear regression.
- λ₁: The regularization parameter for the L1 penalty (from Lasso).
- λ₂: The regularization parameter for the L2 penalty (from Ridge).
- β_i: The model’s coefficients (parameters).
- n: The number of features.
Elastic Net combines both penalties by introducing two parameters: λ₁ for L1 (Lasso) and λ₂ for L2 (Ridge). This allows for a balance between feature selection (Lasso’s strength) and shrinkage of coefficients (Ridge’s strength).
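To make the combined penalty concrete, here is a minimal sketch of how the two terms are mixed. It follows scikit-learn’s parameterization, which uses a single overall strength (alpha) plus a mixing weight (l1_ratio) instead of two separate λ values; the coefficient vector is made up purely for illustration.
# The Elastic Net penalty as a weighted mix of L1 (Lasso) and L2 (Ridge) terms
import numpy as np
def elastic_net_penalty(coefs, alpha=1.0, l1_ratio=0.5):
    l1 = np.sum(np.abs(coefs))  # Lasso-style term: sum of absolute coefficients
    l2 = np.sum(coefs ** 2)     # Ridge-style term: sum of squared coefficients
    return alpha * (l1_ratio * l1 + 0.5 * (1 - l1_ratio) * l2)
coefs = np.array([0.5, -1.2, 0.0, 3.0])  # hypothetical coefficients
print(elastic_net_penalty(coefs, alpha=1.0, l1_ratio=0.5))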
Why Use Elastic Net?
Elastic Net is particularly beneficial in situations where:
- You have highly correlated features: Lasso may randomly pick one feature from a correlated group and discard others, while Ridge may keep all the features but shrink them too much. Elastic Net can manage correlations between features by selecting groups of correlated features and maintaining them in the model.
- You have more predictors than observations: In situations where the number of features exceeds the number of data points, both Lasso and Ridge might struggle. Elastic Net can handle this scenario better.
- You want a flexible regularization technique that adapts to the nature of your data, offering the benefits of both Lasso and Ridge.
Example: Predicting House Prices with Elastic Net Regression Techniques
Let’s use the same dataset to predict house prices, but this time we’ll apply Elastic Net Regression to see how it works.
Here’s the dataset:
Square Footage | Rooms | House Age | House Price (Target) |
---|---|---|---|
1000 | 3 | 10 | 300,000 |
1500 | 4 | 8 | 400,000 |
2000 | 4 | 5 | 450,000 |
2500 | 5 | 3 | 500,000 |
3000 | 5 | 1 | 550,000 |
Python Code for Elastic Net Regression
Let’s implement Elastic Net in Python using the scikit-learn library to predict house prices. We’ll use the same dataset and split it into training and test sets.
# Importing necessary libraries
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import ElasticNet
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
# Sample data: Square footage, Rooms, House Age, and House Price
X = np.array([[1000, 3, 10], [1500, 4, 8], [2000, 4, 5], [2500, 5, 3], [3000, 5, 1]]) # Features: Square footage, Rooms, Age
y = np.array([300000, 400000, 450000, 500000, 550000]) # Target: House Price
# Splitting data into training and test sets (80% train, 20% test)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Initializing the Elastic Net model with overall strength alpha = 1.0 and an equal L1/L2 mix (l1_ratio = 0.5)
elastic_net = ElasticNet(alpha=1.0, l1_ratio=0.5)
# Training the model
elastic_net.fit(X_train, y_train)
# Making predictions on the test data
y_pred = elastic_net.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred) # Mean Squared Error
print(f"Mean Squared Error: {mse}")
# Plotting the predictions vs actual values
plt.scatter(y_test, y_pred)
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red', linestyle='--')
plt.xlabel('True Values')
plt.ylabel('Predictions')
plt.title('Elastic Net Regression: Predicted vs Actual House Prices')
plt.show()

Explanation of the Code:
- Importing Libraries: We use ElasticNet from scikit-learn to perform Elastic Net regression, train_test_split to split the data into training and test sets, mean_squared_error to evaluate model performance, and matplotlib to plot the results.
- Data Preparation: As in the previous examples, we create a dataset with square footage, rooms, and house age as features, with house price as the target.
- Train-Test Split: The dataset is split into training and testing sets.
- Elastic Net Model: The model is initialized with alpha (the overall regularization strength) and l1_ratio (the mixing parameter between the L1 and L2 penalties):
  - An l1_ratio of 1.0 is equivalent to Lasso.
  - An l1_ratio of 0 is equivalent to Ridge.
  - Values between 0 and 1 balance Lasso and Ridge.
- Training and Prediction: The model is trained on the training data and makes predictions on the test data.
- Evaluation: We evaluate the model performance using Mean Squared Error (MSE).
- Plotting: We plot the true vs. predicted house prices to visually assess the model’s accuracy.
Advantages and Limitations of Elastic Net Regression
Advantages:
- Flexibility: Elastic Net combines the benefits of both Lasso and Ridge. It can handle a variety of data scenarios by adjusting the L1 and L2 regularization parameters.
- Works well with correlated features: Elastic Net can handle highly correlated features, unlike Lasso, which might discard them entirely.
- Feature selection: Like Lasso, Elastic Net can perform feature selection by setting coefficients to zero.
Limitations:
- Tuning required: Elastic Net requires tuning two parameters, alpha and l1_ratio, which can be challenging; the sketch below shows one way to automate this with cross-validation.
- Interpretability: While it helps with feature selection, the model might still be less interpretable than simpler models like linear regression.
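As a minimal sketch of how that tuning can be automated, scikit-learn’s ElasticNetCV searches over candidate alpha and l1_ratio values with cross-validation. The X and y below are assumed to be a feature matrix and target vector of reasonable size (the five-row toy dataset above is too small for meaningful cross-validation), and the candidate l1_ratio grid is an arbitrary choice.
# Tuning alpha and l1_ratio automatically with cross-validation
from sklearn.linear_model import ElasticNetCV
# X, y: your feature matrix and target vector (assumed to be defined already)
cv_model = ElasticNetCV(l1_ratio=[0.1, 0.5, 0.9], cv=5, random_state=42)
cv_model.fit(X, y)
print(f"Best alpha: {cv_model.alpha_}")
print(f"Best l1_ratio: {cv_model.l1_ratio_}")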
When to Use Elastic Net Regression
Elastic Net is most useful when:
- You have correlated features and want to avoid the problem of one feature being randomly selected over another.
- Your dataset has more features than observations, which may make Ridge and Lasso less effective.
- You need a flexible regularization model that can combine both L1 and L2 penalties to handle different types of data efficiently.
6. Support Vector Regression (SVR)
What is Support Vector Regression?
Support Vector Regression (SVR) is a machine learning algorithm based on Support Vector Machines (SVM), which were primarily designed for classification. SVR adapts the principles of SVM to regression problems: instead of predicting discrete class labels, it predicts continuous output values by finding a hyperplane that fits the data within a specified margin of tolerance (epsilon).
SVR is highly effective for complex datasets where relationships between variables may not be linear. By using kernels, it can model non-linear relationships efficiently.
How SVR Works
- Core Idea:
- The goal of SVR is to find a hyperplane (or regression line) that predicts the target variable as accurately as possible while keeping errors within a specified margin (epsilon margin).
- Unlike traditional regression, SVR tries to minimize a loss function that ignores errors within this margin, focusing on significant deviations.
- Key Parameters:
- Epsilon (ε): Defines the margin of tolerance around the hyperplane. Predictions falling within this margin are considered acceptable errors.
- C (Regularization Parameter): Controls the trade-off between achieving a low error and maintaining a simple model. Higher values of C aim for fewer errors but risk overfitting.
- Kernel: Determines how SVR handles non-linear relationships. Common kernels include:
- Linear: For linear relationships.
- Polynomial: For polynomial relationships.
- Radial Basis Function (RBF): For complex non-linear relationships.
- Objective Function: SVR minimizes (1/2)||w||² + C Σ (ξ_i + ξ_i*), subject to each prediction lying within ε of the true value; the slack variables ξ_i and ξ_i* measure how far a point falls outside the ε-margin, and C controls how heavily those violations are penalized.
Example: Predicting House Prices with SVR
Let’s use SVR to predict house prices based on features like square footage, the number of rooms, and house age.
Python Code for SVR
# Importing libraries
import numpy as np
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, r2_score
import matplotlib.pyplot as plt
# Sample dataset
X = np.array([[1000, 3, 10], [1500, 4, 8], [2000, 4, 5], [2500, 5, 3], [3000, 5, 1]]) # Features: Square footage, Rooms, Age
y = np.array([300000, 400000, 450000, 500000, 550000]) # Target: House Price
# Splitting dataset into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Creating and fitting the SVR model with an RBF kernel
# Note: SVR is sensitive to the scale of features and targets; in practice the
# inputs are usually standardized first (see the scaling sketch below).
svr_model = SVR(kernel='rbf', C=1000, epsilon=5000)
svr_model.fit(X_train, y_train)
# Making predictions
y_pred = svr_model.predict(X_test)
# Evaluating the model
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)
print(f"Mean Squared Error: {mse}")
print(f"R-squared Score: {r2}")
# Plotting the predictions vs actual values
plt.scatter(y_test, y_pred, color='blue', label='Predictions')
plt.plot([min(y_test), max(y_test)], [min(y_test), max(y_test)], color='red', linestyle='--', label='Ideal Fit')
plt.xlabel('Actual Values')
plt.ylabel('Predicted Values')
plt.title('SVR: Predicted vs Actual House Prices')
plt.legend()
plt.show()

Code Explanation
- Data Preparation:
- The dataset includes features like square footage, number of rooms, and house age, with house price as the target variable.
- Splitting the Dataset:
- The data is divided into training and testing sets.
- SVR Model:
- An SVR model is initialized with:
- Kernel: RBF, suitable for non-linear relationships.
- C: A high value encourages fewer errors but increases the risk of overfitting.
- Epsilon: Defines the acceptable margin of error for predictions.
- Model Evaluation:
- Mean Squared Error (MSE) quantifies prediction errors.
- R-squared Score (R²) measures the proportion of variance in the target variable explained by the model.
- Visualization:
- A scatter plot compares the actual house prices with the predicted values, along with a reference line showing the ideal fit.
Advantages of SVR
- Handles Non-linearity: SVR can model complex relationships using kernels like RBF.
- Robustness: It performs well on small or medium-sized datasets.
- Flexibility: The epsilon margin allows tolerance for small errors.
Limitations of SVR
- Computational Cost: SVR can be slow with large datasets due to its reliance on support vectors.
- Parameter Tuning: Choosing appropriate values for C, epsilon, and the kernel is critical and can be challenging.
- Scaling Issues: SVR is sensitive to the scale of input features, so feature scaling (e.g., standardization) is often required; see the sketch after this list.
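As a minimal sketch of how scaling is usually handled, the SVR model can be wrapped in a scikit-learn Pipeline with a StandardScaler so the features are standardized before fitting. It assumes the X_train, y_train, and X_test arrays from the example above; the C and epsilon values are carried over unchanged and are illustrative, not tuned.
# Standardizing features before SVR using a Pipeline
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR
scaled_svr = make_pipeline(StandardScaler(), SVR(kernel='rbf', C=1000, epsilon=5000))
scaled_svr.fit(X_train, y_train)   # the scaler is fitted on the training data only
print(scaled_svr.predict(X_test))  # predictions on the (automatically scaled) test data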
Conclusion
As we step into 2025, mastering the top regression techniques is crucial for tackling diverse real-world challenges. Each method we’ve explored—Linear Regression, Logistic Regression, Ridge Regression, Lasso Regression, Elastic Net Regression, and Support Vector Regression (SVR)—offers unique strengths for specific scenarios.
- Linear Regression remains the go-to for simple, interpretable relationships.
- Logistic Regression is indispensable for classification tasks with probabilistic insights.
- Ridge and Lasso Regression help manage multicollinearity and feature selection, respectively.
- Elastic Net Regression strikes a balance between Ridge and Lasso, making it flexible for complex datasets.
- Support Vector Regression shines when non-linear patterns demand sophisticated modeling.
The key to using these techniques lies in understanding your data and problem statement. By aligning the strengths of each technique with the task at hand, you can build more accurate and reliable models.
FAQs
What is regression in machine learning?
Regression is a statistical method used to model relationships between dependent and independent variables, predicting continuous outcomes like prices, temperatures, or sales.
What is the difference between Linear and Logistic Regression?
Linear Regression predicts continuous values, while Logistic Regression predicts probabilities for classification tasks.
When should I use Ridge Regression?
Use Ridge Regression when your dataset has multicollinearity (highly correlated features) to prevent overfitting.
What does Lasso Regression do?
Lasso Regression performs feature selection by shrinking some coefficients to zero, simplifying models and improving interpretability.
Why use Elastic Net Regression?
Elastic Net combines Ridge’s handling of multicollinearity with Lasso’s feature selection, making it effective for complex datasets.
What is Support Vector Regression (SVR)?
Support Vector Regression (SVR) is a powerful technique for predicting continuous outcomes, especially in datasets with non-linear relationships.
External Resources
Kaggle Datasets and Tutorials
Access free datasets and practical tutorials for experimenting with regression techniques.
https://www.kaggle.com/
Towards Data Science: Regression Techniques
Articles explaining various regression methods with clear examples and use cases.
https://towardsdatascience.com/