Logistic Regression in Machine Learning

1. Introduction to Logistic Regression

Logistic Regression is a Supervised Learning algorithm used for classification problems. Unlike Linear Regression, which predicts continuous values, Logistic Regression predicts categorical outcomes (such as Yes/No, 0/1, True/False).

🚀 Example Use Cases:
✔ Email Spam Detection (Spam or Not Spam)
✔ Disease Prediction (Has Disease or Not)
✔ Customer Churn Prediction (Will Leave or Stay)
✔ Credit Risk Assessment (Loan Default or Not)

📌 Despite its name, Logistic Regression is a classification algorithm, not a regression algorithm.


2. Why Do We Need Logistic Regression?

Linear Regression predicts continuous values, but in many cases, we need a model that predicts discrete classes (e.g., “Yes” or “No”).

🔹 Example:
Imagine predicting whether a student passes or fails an exam based on study hours. Linear Regression might give values like -0.5 or 1.2, which are not valid class labels. Logistic Regression solves this problem by producing a probability between 0 and 1.


3. Mathematical Representation of Logistic Regression

Logistic Regression uses the Sigmoid Function (Logistic Function) to map any real-valued number to a probability between 0 and 1:

σ(z) = 1 / (1 + e^(−z))

Where:

  • σ(z) = Sigmoid function output (the predicted probability)
  • z = b₀ + b₁X₁ + b₂X₂ + … + bₙXₙ (the linear combination of features)
  • e = Euler's number (≈ 2.718)

📌 Sigmoid Function Interpretation:

  • If σ(z) ≥ 0.5, the outcome is Class 1 (Yes, True, Positive)
  • If σ(z) < 0.5, the outcome is Class 0 (No, False, Negative)

📊 Graph of the Sigmoid Function:

  • S-Shaped curve
  • Values range between 0 and 1
  • Helps model classification problems
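
A minimal sketch of this curve in NumPy/Matplotlib (names like z_values are illustrative, not from any particular library):

import numpy as np
import matplotlib.pyplot as plt

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

z_values = np.linspace(-10, 10, 200)
plt.plot(z_values, sigmoid(z_values), label="σ(z)")
plt.axhline(0.5, color='green', linestyle='dashed', label="0.5 threshold")
plt.xlabel("z")
plt.ylabel("σ(z)")
plt.title("Sigmoid: S-shaped, bounded between 0 and 1")
plt.legend()
plt.show()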

4. Types of Logistic Regression

Binary Logistic Regression → Two possible outcomes (e.g., Spam or Not Spam)
Multinomial Logistic Regression → More than two categories (e.g., Dog, Cat, Horse)
Ordinal Logistic Regression → Ordered categories (e.g., Low, Medium, High)
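
As a quick illustration of the multinomial case, here is a sketch using Scikit-Learn's built-in Iris dataset (three classes); recent versions of scikit-learn handle the multiclass case automatically:

from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# Iris has three classes (setosa, versicolor, virginica)
X, y = load_iris(return_X_y=True)
model = LogisticRegression(max_iter=200)  # multiclass handled automatically
model.fit(X, y)
print(model.predict(X[:3]))        # predicted class labels
print(model.predict_proba(X[:3]))  # one probability per class; rows sum to 1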


5. Decision Boundary in Logistic Regression

The decision boundary is a threshold that separates different classes.
If we set 0.5 as the threshold, the decision rule is: if σ(z) ≥ 0.5, predict Class 1; otherwise, predict Class 0.

This boundary can be linear or non-linear, depending on the dataset.
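
A small illustration of how the threshold turns probabilities into class labels (the probability values here are hypothetical). Raising the threshold makes the model more conservative about predicting Class 1:

import numpy as np

# Hypothetical predicted probabilities from a fitted model
proba = np.array([0.12, 0.47, 0.51, 0.93])
print((proba >= 0.5).astype(int))  # default threshold -> [0 0 1 1]
print((proba >= 0.8).astype(int))  # stricter threshold -> [0 0 0 1]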


6. Cost Function in Logistic Regression

Instead of using Mean Squared Error (MSE) as in Linear Regression, Logistic Regression uses the Log Loss (Logistic Loss) function, also called the Binary Cross-Entropy Loss:

J(θ) = −(1/m) Σᵢ₌₁ᵐ [ yᵢ log(ŷᵢ) + (1 − yᵢ) log(1 − ŷᵢ) ]

Where:

  • m = Number of training examples
  • yᵢ = Actual class label (0 or 1)
  • ŷᵢ = Predicted probability for example i

🔹 Log Loss Explanation:

  • If prediction is correct, the loss is small.
  • If prediction is wrong, the loss is large.
  • Goal: Minimize Log Loss to improve classification accuracy.
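
A minimal NumPy sketch of this loss (the eps clipping to avoid log(0) is an implementation choice, not part of the formula):

import numpy as np

def log_loss(y_true, y_prob, eps=1e-15):
    # Clip probabilities so log(0) never occurs
    y_prob = np.clip(y_prob, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_prob) + (1 - y_true) * np.log(1 - y_prob))

# Confident correct prediction -> small loss; confident wrong one -> large loss
print(log_loss(np.array([1]), np.array([0.9])))  # ~0.105
print(log_loss(np.array([1]), np.array([0.1])))  # ~2.303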

7. Optimization: Gradient Descent in Logistic Regression

To minimize the cost function, Logistic Regression uses Gradient Descent, an optimization algorithm that updates weights iteratively:

1️⃣ Compute the gradient of the cost function.
2️⃣ Update the parameters θ using: θ = θ − α · (∂J/∂θ)

Where:

  • α = Learning rate (step size)
  • J(θ) = Cost function
  • ∂J/∂θ = Gradient of the cost with respect to θ

Gradient Descent helps find the best-fit parameters that minimize Log Loss.
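
A from-scratch sketch of these two steps (the learning rate and iteration count are illustrative choices, assuming X has one row per example):

import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(X, y, alpha=0.1, n_iters=1000):
    """Fit weights w and bias b by gradient descent on Log Loss."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(n_iters):
        y_hat = sigmoid(X @ w + b)      # step 1: current predicted probabilities
        error = y_hat - y               # gradient of Log Loss w.r.t. z
        w -= alpha * (X.T @ error) / m  # step 2: update weights
        b -= alpha * error.mean()       #         update bias
    return w, b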


8. Implementing Logistic Regression in Python (Sklearn)

Let’s implement Logistic Regression using Python and Scikit-Learn.

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

# Generate synthetic dataset
np.random.seed(42)
X = np.random.rand(100, 1) * 10  # Random values between 0 and 10
y = (X > 5).astype(int).flatten()  # Class 1 if X > 5, else Class 0

# Split into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train Logistic Regression model
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict on test data
y_pred = model.predict(X_test)

# Evaluate model
accuracy = accuracy_score(y_test, y_pred)
conf_matrix = confusion_matrix(y_test, y_pred)
report = classification_report(y_test, y_pred)

print(f'Accuracy: {accuracy:.2f}')
print('Confusion Matrix:')
print(conf_matrix)
print('Classification Report:')
print(report)

# Visualize Decision Boundary
X_curve = np.linspace(0, 10, 100).reshape(-1, 1)
y_curve = model.predict_proba(X_curve)[:, 1]

plt.scatter(X_test, y_test, color='blue', label="Actual Data")
plt.plot(X_curve, y_curve, color='red', label="Logistic Regression Curve")
plt.axhline(0.5, color='green', linestyle='dashed', label="Decision Boundary")
plt.xlabel("Feature X")
plt.ylabel("Probability")
plt.legend()
plt.show()

9. Model Evaluation Metrics for Logistic Regression

📌 1. Accuracy Score: Accuracy = Correct Predictions / Total Predictions

📌 2. Precision: Precision = TP / (TP + FP)

📌 3. Recall (Sensitivity): Recall = TP / (TP + FN)

📌 4. F1-Score (Harmonic Mean of Precision and Recall): F1 = 2 × (Precision × Recall) / (Precision + Recall)

(TP = true positives, FP = false positives, FN = false negatives)

📌 5. ROC Curve & AUC Score

  • Receiver Operating Characteristic (ROC) Curve shows the model’s ability to distinguish between classes.
  • Area Under Curve (AUC) quantifies performance (1 = perfect model, 0.5 = random guessing).
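
All of these metrics are available in Scikit-Learn. A short sketch, assuming y_test, y_pred, model, and X_test from the Section 8 example are still in scope:

from sklearn.metrics import precision_score, recall_score, f1_score, roc_auc_score

print('Precision:', precision_score(y_test, y_pred))
print('Recall:   ', recall_score(y_test, y_pred))
print('F1-score: ', f1_score(y_test, y_pred))
# ROC-AUC needs predicted probabilities, not hard class labels
y_prob = model.predict_proba(X_test)[:, 1]
print('ROC-AUC:  ', roc_auc_score(y_test, y_prob))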

10. Advantages and Disadvantages of Logistic Regression

Advantages:

✔ Simple & easy to interpret.
✔ Works well for binary classification.
✔ Computationally efficient.

Disadvantages:

❌ Assumes a linear decision boundary.
❌ Does not perform well with complex relationships.
❌ Sensitive to outliers.


11. Summary

✔ Logistic Regression is a classification algorithm.
✔ It uses the Sigmoid Function to output probabilities.
✔ The cost function is Log Loss (Binary Cross-Entropy Loss).
✔ Gradient Descent is used for optimization.
✔ Models are evaluated with Accuracy, Precision, Recall, F1-Score, and ROC-AUC.

Mastering Logistic Regression is key to understanding fundamental classification problems in Machine Learning!
