Differentiation and Partial Derivatives – A Comprehensive Guide
Differentiation and partial derivatives are fundamental concepts in calculus that help in understanding how functions change. These concepts are widely used in mathematics, physics, engineering, economics, and machine learning.
In machine learning and deep learning, differentiation plays a crucial role in gradient descent, optimization, and backpropagation. Partial derivatives, in particular, are essential when dealing with functions of multiple variables, such as loss functions in neural networks.
1. Introduction to Differentiation
1.1 What is Differentiation?
Differentiation is the process of finding the rate of change of a function. If a function represents a curve, the derivative at a particular point gives the slope of the tangent line at that point.
Mathematically, the derivative of a function f(x) is defined as: f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h}
where:
- f'(x) represents the derivative (also called the first derivative).
- h is a small increment that approaches zero.
- The limit ensures the slope is calculated at an infinitesimally small interval.
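To make the limit concrete, the difference quotient can be evaluated numerically for shrinking values of h. The sketch below is a minimal example assuming f(x) = x^2, whose exact derivative at x = 2 is 4.

```python
# Approximate f'(x) with the forward difference quotient (f(x+h) - f(x)) / h.
def forward_difference(f, x, h):
    return (f(x + h) - f(x)) / h

f = lambda x: x ** 2  # sample function; exact derivative is 2x

for h in (1e-1, 1e-3, 1e-5):
    print(h, forward_difference(f, 2.0, h))  # approaches 4.0 as h shrinks
```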
2. Rules of Differentiation
To differentiate functions efficiently, we use standard rules:
2.1 Power Rule
If: f(x) = x^n
Then its derivative is: f'(x) = n x^{n-1}
Example: f(x) = x^3 \quad \Rightarrow \quad f'(x) = 3x^2
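This can be confirmed with a computer algebra system; the short sketch below assumes SymPy is installed and simply differentiates x^3.

```python
# Symbolic check of the power rule for f(x) = x**3 using SymPy.
import sympy as sp

x = sp.symbols('x')
print(sp.diff(x**3, x))  # prints 3*x**2
```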
2.2 Sum and Difference Rule
For two functions f(x) and g(x): \frac{d}{dx} [f(x) + g(x)] = f'(x) + g'(x) \qquad \frac{d}{dx} [f(x) - g(x)] = f'(x) - g'(x)
Example: f(x) = x^3 + 2x^2 - 5x \quad \Rightarrow \quad f'(x) = 3x^2 + 4x - 5
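As an optional sanity check, the analytic derivative can be compared with a central-difference estimate at an arbitrary sample point, here x = 1.5.

```python
# Numerical check of the sum/difference rule result at x = 1.5.
f = lambda x: x**3 + 2*x**2 - 5*x
f_prime = lambda x: 3*x**2 + 4*x - 5   # analytic derivative from above

x0, h = 1.5, 1e-6
numeric = (f(x0 + h) - f(x0 - h)) / (2 * h)   # central difference
print(numeric, f_prime(x0))  # both are approximately 7.75
```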
2.3 Product Rule
If: h(x) = f(x) g(x)
Then: h'(x) = f'(x) g(x) + f(x) g'(x)
Example: f(x) = x^2, \quad g(x) = e^x \quad \Rightarrow \quad \frac{d}{dx} (x^2 e^x) = 2x e^x + x^2 e^x
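The same product can be differentiated symbolically as a verification; the sketch below assumes SymPy is available.

```python
# Symbolic check of the product rule for x**2 * exp(x).
import sympy as sp

x = sp.symbols('x')
print(sp.diff(x**2 * sp.exp(x), x))  # prints x**2*exp(x) + 2*x*exp(x)
```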
2.4 Quotient Rule
If: h(x) = \frac{f(x)}{g(x)}
Then: h'(x) = \frac{f'(x) g(x) - f(x) g'(x)}{[g(x)]^2}
Example: f(x) = x^2, \quad g(x) = x + 1 \quad \Rightarrow \quad \frac{d}{dx} \left( \frac{x^2}{x+1} \right) = \frac{2x(x+1) - x^2(1)}{(x+1)^2} = \frac{x^2 + 2x}{(x+1)^2}
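Again as an optional check, SymPy reproduces and simplifies the quotient-rule result.

```python
# Symbolic check of the quotient rule for x**2 / (x + 1).
import sympy as sp

x = sp.symbols('x')
print(sp.simplify(sp.diff(x**2 / (x + 1), x)))  # prints x*(x + 2)/(x + 1)**2
```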
2.5 Chain Rule
If: y = f(g(x))
Then: \frac{dy}{dx} = f'(g(x)) \cdot g'(x)
Example: y = \sin(x^2)
Using the chain rule: \frac{d}{dx} \sin(x^2) = \cos(x^2) \cdot 2x
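A numerical spot check at an arbitrary point, here x = 0.7, shows that the chain-rule formula agrees with a finite-difference estimate.

```python
# Chain-rule check: d/dx sin(x^2) should equal cos(x^2) * 2x at any x.
import math

x0, h = 0.7, 1e-6
numeric = (math.sin((x0 + h)**2) - math.sin((x0 - h)**2)) / (2 * h)
analytic = math.cos(x0**2) * 2 * x0
print(numeric, analytic)  # both are approximately 1.235
```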
3. Higher-Order Derivatives
The second derivative is the derivative of the derivative: f''(x) = \frac{d^2}{dx^2} f(x)
Example: f(x) = x^3 \quad \Rightarrow \quad f'(x) = 3x^2 \quad \Rightarrow \quad f''(x) = 6x
Higher-order derivatives (f'''(x), f''''(x), etc.) are useful in physics and machine learning for analyzing curvature and acceleration.
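For instance, the second derivative above can be obtained by differentiating twice; the sketch assumes SymPy is installed.

```python
# Second derivative of x**3: differentiate twice with respect to x.
import sympy as sp

x = sp.symbols('x')
print(sp.diff(x**3, x, 2))  # prints 6*x
```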
4. Partial Derivatives
4.1 What are Partial Derivatives?
A partial derivative is the derivative of a function of multiple variables with respect to one variable, treating all other variables as constants.
For a function: f(x, y) = x^2 + 3xy + y^2
The partial derivative with respect to x is: \frac{\partial f}{\partial x} = 2x + 3y
The partial derivative with respect to y is: \frac{\partial f}{\partial y} = 3x + 2y
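Both partial derivatives can be verified symbolically; the sketch below assumes SymPy is available.

```python
# Partial derivatives of f(x, y) = x**2 + 3*x*y + y**2.
import sympy as sp

x, y = sp.symbols('x y')
f = x**2 + 3*x*y + y**2
print(sp.diff(f, x))  # prints 2*x + 3*y
print(sp.diff(f, y))  # prints 3*x + 2*y
```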
4.2 Gradient and Directional Derivatives
The gradient is the vector of all partial derivatives: \nabla f(x, y) = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right]
For f(x, y) = x^2 + 3xy + y^2: \nabla f(x, y) = (2x + 3y, 3x + 2y)
The directional derivative of f along a unit vector u is the dot product \nabla f \cdot u, which gives the rate of change of f in that direction.
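The gradient can also be written as a small Python function and checked against central differences at a sample point, here (x, y) = (1, 2), where the analytic gradient is (8, 7).

```python
# Analytic gradient of f(x, y) = x^2 + 3xy + y^2, checked numerically.
import numpy as np

def f(p):
    x, y = p
    return x**2 + 3*x*y + y**2

def grad_f(p):
    x, y = p
    return np.array([2*x + 3*y, 3*x + 2*y])

p0 = np.array([1.0, 2.0])
h = 1e-6
numeric = np.array([
    (f(p0 + [h, 0]) - f(p0 - [h, 0])) / (2 * h),   # partial wrt x
    (f(p0 + [0, h]) - f(p0 - [0, h])) / (2 * h),   # partial wrt y
])
print(grad_f(p0), numeric)  # both approximately [8. 7.]
```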
5. Applications of Differentiation and Partial Derivatives
5.1 Machine Learning – Gradient Descent
Gradient descent uses the gradient of the loss to update the parameters: \theta = \theta - \alpha \nabla f(\theta)
where:
- \theta are the model parameters.
- \alpha is the learning rate.
- \nabla f(\theta) is the gradient of the loss function.
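The sketch below is a minimal gradient-descent loop on a hypothetical toy loss f(\theta) = (\theta_1 - 1)^2 + (\theta_2 + 2)^2, chosen only for illustration; its minimiser is (1, -2).

```python
# Minimal gradient-descent sketch on a toy convex loss (hypothetical example).
import numpy as np

def grad(theta):
    # Gradient of (theta1 - 1)^2 + (theta2 + 2)^2.
    return 2 * (theta - np.array([1.0, -2.0]))

theta = np.zeros(2)   # initial parameters
alpha = 0.1           # learning rate
for _ in range(200):
    theta = theta - alpha * grad(theta)

print(theta)  # approximately [ 1. -2.]
```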
5.2 Neural Networks – Backpropagation
Backpropagation applies the chain rule layer by layer to compute the gradient of the loss with respect to every weight, and those gradients are then used to update the weights.
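The sketch below hand-rolls this idea for a single sigmoid neuron with a squared-error loss; the input, target, and initial weights are made-up values, and the backward pass multiplies the chain-rule factors explicitly.

```python
# Minimal backpropagation sketch for y_hat = sigmoid(w*x + b)
# with loss L = (y_hat - target)^2. All numbers are illustrative only.
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

x, target = 2.0, 1.0
w, b = 0.5, 0.1

# Forward pass
z = w * x + b
y_hat = sigmoid(z)
loss = (y_hat - target) ** 2

# Backward pass: chain rule, factor by factor
dL_dyhat = 2 * (y_hat - target)
dyhat_dz = y_hat * (1 - y_hat)        # derivative of the sigmoid
dL_dw = dL_dyhat * dyhat_dz * x       # dz/dw = x
dL_db = dL_dyhat * dyhat_dz * 1.0     # dz/db = 1
print(loss, dL_dw, dL_db)

# Gradient-descent weight update
alpha = 0.1
w -= alpha * dL_dw
b -= alpha * dL_db
```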
5.3 Physics – Motion and Acceleration
- Velocity is the first derivative of position: v(t) = \frac{dx}{dt}
- Acceleration is the second derivative: a(t) = \frac{dv}{dt} = \frac{d^2x}{dt^2}
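For instance, taking a sample position function x(t) = t^3 - 2t (chosen only for illustration), velocity and acceleration follow by differentiating once and twice; the sketch assumes SymPy.

```python
# Velocity and acceleration as first and second derivatives of position.
import sympy as sp

t = sp.symbols('t')
x = t**3 - 2*t         # sample position function
v = sp.diff(x, t)      # velocity: 3*t**2 - 2
a = sp.diff(v, t)      # acceleration: 6*t
print(v, a)
```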
6. Summary Table
| Concept | Formula | Application |
| --- | --- | --- |
| Derivative | f'(x) = \lim_{h \to 0} \frac{f(x+h) - f(x)}{h} | Rate of change |
| Power Rule | f(x) = x^n \Rightarrow f'(x) = n x^{n-1} | Simple polynomials |
| Product Rule | (uv)' = u'v + uv' | Multiplication of functions |
| Quotient Rule | \left( \frac{u}{v} \right)' = \frac{u'v - uv'}{v^2} | Division of functions |
| Chain Rule | (f(g(x)))' = f'(g(x)) g'(x) | Composite functions |
| Partial Derivative | \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} | Multivariable functions |
| Gradient | \nabla f(x, y) = \left[ \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right] | Optimization |