Long Short-Term Memory Networks (LSTMs) for Time Series Forecasting

1. Introduction to LSTMs for Time Series Forecasting

Long Short-Term Memory (LSTM) networks are a type of Recurrent Neural Network (RNN) designed to handle sequential and time series data. Unlike traditional neural networks, LSTMs can retain past information for long periods, making them ideal for forecasting problems where past values influence future trends.

Why Use LSTMs for Time Series?

  • Handles Long-Term Dependencies – Unlike standard RNNs, LSTMs mitigate the vanishing gradient problem, allowing them to remember long-term patterns.
  • Works Well with Sequential Data – Useful for stock prices, weather data, sales forecasting, and more.
  • Captures Trends and Seasonality – Can model complex, non-linear time dependencies.
  • Handles Multivariate Time Series – Can process multiple input features simultaneously.

Common Applications of LSTMs in Time Series Forecasting

📌 Stock Market Predictions 📈
📌 Weather Forecasting
📌 Sales and Demand Forecasting 🛒
📌 Energy Load Forecasting 🔋
📌 Website Traffic Prediction 🌍
📌 Financial Time Series Analysis 💰


2. Understanding the LSTM Architecture

An LSTM network consists of memory cells, each containing:

  • Forget Gate: Decides what information to discard.
  • Input Gate: Determines what new information to store.
  • Cell State: Maintains long-term memory.
  • Output Gate: Determines the final output.

Mathematical Formulation of LSTMs

1️⃣ Forget Gate: $f_t = \sigma(W_f \cdot [h_{t-1}, x_t] + b_f)$

Decides which past information to forget.

2️⃣ Input Gate: $i_t = \sigma(W_i \cdot [h_{t-1}, x_t] + b_i)$, with candidate values $\tilde{C}_t = \tanh(W_C \cdot [h_{t-1}, x_t] + b_C)$

Controls what new information to store.

3️⃣ Cell State Update: $C_t = f_t * C_{t-1} + i_t * \tilde{C}_t$

Stores long-term dependencies.

4️⃣ Output Gate: $o_t = \sigma(W_o \cdot [h_{t-1}, x_t] + b_o)$, with hidden state $h_t = o_t * \tanh(C_t)$

Determines the next hidden state.
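
To make these equations concrete, below is a minimal NumPy sketch of a single LSTM cell step. The weight matrices, sizes, and inputs here are made up purely for illustration; in practice a framework such as Keras creates and learns these parameters for you.

import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Illustrative sizes: 3 input features, 4 hidden units (assumed values)
n_in, n_hid = 3, 4
rng = np.random.default_rng(0)

# Randomly initialized weights and biases, for illustration only
W_f, W_i, W_C, W_o = (rng.normal(size=(n_hid, n_hid + n_in)) for _ in range(4))
b_f = b_i = b_C = b_o = np.zeros(n_hid)

def lstm_step(x_t, h_prev, C_prev):
    concat = np.concatenate([h_prev, x_t])   # [h_{t-1}, x_t]
    f_t = sigmoid(W_f @ concat + b_f)        # forget gate
    i_t = sigmoid(W_i @ concat + b_i)        # input gate
    C_tilde = np.tanh(W_C @ concat + b_C)    # candidate cell state
    C_t = f_t * C_prev + i_t * C_tilde       # cell state update
    o_t = sigmoid(W_o @ concat + b_o)        # output gate
    h_t = o_t * np.tanh(C_t)                 # new hidden state
    return h_t, C_t

h_t, C_t = lstm_step(rng.normal(size=n_in), np.zeros(n_hid), np.zeros(n_hid))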


3. Preparing the Data for LSTM

Step 1: Load Required Libraries

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense, Dropout
from sklearn.preprocessing import MinMaxScaler
from sklearn.metrics import mean_absolute_error, mean_squared_error

Step 2: Load and Visualize the Time Series Data

# Load dataset
df = pd.read_csv("time_series_data.csv")
df['Date'] = pd.to_datetime(df['Date'])
df.set_index('Date', inplace=True)

# Plot the time series data
plt.figure(figsize=(12, 5))
plt.plot(df, label="Time Series Data")
plt.xlabel("Time")
plt.ylabel("Value")
plt.title("Time Series Data Visualization")
plt.legend()
plt.show()

Step 3: Normalize Data Using MinMax Scaling

LSTMs generally train better when inputs are scaled to a small range, such as between 0 and 1.

scaler = MinMaxScaler(feature_range=(0, 1))
df_scaled = scaler.fit_transform(df)
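
One caveat: fitting the scaler on the whole series lets information from the test period leak into preprocessing. A slightly more careful variant (a sketch, assuming the same 80/20 split used later for training and testing) fits the scaler on the training portion only:

# Fit the scaler on the training portion only to avoid test-set leakage
split = int(len(df) * 0.8)
scaler = MinMaxScaler(feature_range=(0, 1))
train_scaled = scaler.fit_transform(df.iloc[:split])
test_scaled = scaler.transform(df.iloc[split:])
df_scaled = np.concatenate([train_scaled, test_scaled])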

4. Creating Sequences for LSTM Training

Since LSTMs process sequential data, we must create input-output pairs from the time series.

def create_sequences(data, time_steps=50):
    X, y = [], []
    for i in range(len(data) - time_steps):
        X.append(data[i:i+time_steps])
        y.append(data[i+time_steps])
    return np.array(X), np.array(y)

# Define time step (number of past values to consider)
time_steps = 50

# Create sequences
X, y = create_sequences(df_scaled, time_steps)

# Split into training and testing sets
train_size = int(len(X) * 0.8)
X_train, y_train = X[:train_size], y[:train_size]
X_test, y_test = X[train_size:], y[train_size:]
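
Before building the model, it is worth sanity-checking the array shapes; with a window of 50 steps, each input sample should be a (50, n_features) slice and each target a single row:

# Sanity-check the sequence shapes
print("X_train:", X_train.shape)  # (samples, time_steps, n_features)
print("y_train:", y_train.shape)  # (samples, n_features)
print("X_test:", X_test.shape)
print("y_test:", y_test.shape)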

5. Building the LSTM Model

Now, we define an LSTM-based neural network.

# Define the LSTM model
model = Sequential([
    LSTM(units=100, activation='relu', return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])),
    Dropout(0.2),
    LSTM(units=50, activation='relu', return_sequences=False),
    Dropout(0.2),
    Dense(units=25),
    Dense(units=1)
])

# Compile the model
model.compile(optimizer='adam', loss='mse')
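
Calling model.summary() is a quick way to confirm the layer stack and parameter counts before training:

# Inspect the architecture and number of trainable parameters
model.summary()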

6. Training the LSTM Model

# Train the model
history = model.fit(X_train, y_train, epochs=50, batch_size=32, validation_data=(X_test, y_test), verbose=1)

  • Epochs: Number of full passes the model makes over the training data.
  • Batch Size: Number of samples processed before the model weights are updated.
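
Training for a fixed 50 epochs can overfit. A common refinement (a sketch using the same model and data as above) is to add an EarlyStopping callback so training stops once validation loss stops improving and the best weights are restored:

from tensorflow.keras.callbacks import EarlyStopping

# Stop when validation loss has not improved for 5 consecutive epochs
early_stop = EarlyStopping(monitor='val_loss', patience=5, restore_best_weights=True)

history = model.fit(
    X_train, y_train,
    epochs=50, batch_size=32,
    validation_data=(X_test, y_test),
    callbacks=[early_stop],
    verbose=1
)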


7. Evaluating Model Performance

Plot Training & Validation Loss

plt.figure(figsize=(10,5))
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training vs Validation Loss')
plt.legend()
plt.show()

Make Predictions

# Predict on test data
predictions = model.predict(X_test)

# Rescale predictions back to original scale
predictions = scaler.inverse_transform(predictions)
y_test_rescaled = scaler.inverse_transform(y_test.reshape(-1, 1))

Calculate Error Metrics

mae = mean_absolute_error(y_test_rescaled, predictions)
rmse = np.sqrt(mean_squared_error(y_test_rescaled, predictions))

print(f"MAE: {mae}, RMSE: {rmse}")

Lower RMSE and MAE → Better model performance!


8. Visualizing Predictions vs Actual Values

plt.figure(figsize=(12,6))
plt.plot(y_test_rescaled, label="Actual Data", color="blue")
plt.plot(predictions, label="Predicted Data", color="red", linestyle="dashed")
plt.xlabel("Time")
plt.ylabel("Value")
plt.title("LSTM Time Series Forecasting")
plt.legend()
plt.show()

The red dashed line represents the LSTM model’s predictions, while the blue line is actual data.


9. Extending LSTMs for More Accurate Forecasting

  • Use Bidirectional LSTMs – Capture context from both directions within each input window (see the sketch after this list).
  • Stack Multiple LSTM Layers – Deeper networks can model more complex dynamics, at the cost of longer training.
  • Use Attention Mechanisms – Help the model focus on the most relevant time steps.
  • Try GRUs (Gated Recurrent Units) – A simpler, faster alternative to LSTMs with comparable accuracy on many tasks.
  • Experiment with Hyperparameter Tuning – Adjust the window length, learning rate, batch size, layer sizes, etc.
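
As a starting point for the first and fourth ideas, here is a rough sketch of a bidirectional variant and a GRU variant of the model above; the layer sizes are illustrative, not tuned:

from tensorflow.keras.layers import Bidirectional, GRU

# Stacked bidirectional LSTM, mirroring the earlier architecture
bi_model = Sequential([
    Bidirectional(LSTM(64, return_sequences=True), input_shape=(X_train.shape[1], X_train.shape[2])),
    Dropout(0.2),
    Bidirectional(LSTM(32)),
    Dense(1)
])
bi_model.compile(optimizer='adam', loss='mse')

# A GRU-based alternative often trains faster with similar accuracy
gru_model = Sequential([
    GRU(64, return_sequences=True, input_shape=(X_train.shape[1], X_train.shape[2])),
    GRU(32),
    Dense(1)
])
gru_model.compile(optimizer='adam', loss='mse')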

