Pandas is a powerful Python library for data manipulation and analysis, widely used in financial modeling. It provides easy-to-use data structures for handling time-series data, stock market data, financial statements, and more.
Key features of Pandas for financial data:
✔ Handling large financial datasets
✔ Data cleaning and preprocessing
✔ Working with time-series data
✔ Calculating financial indicators
1. Installing Pandas and Dependencies
Ensure you have pandas, numpy, matplotlib, and yfinance installed. If not, install them using:
pip install pandas numpy matplotlib yfinance
2. Loading Financial Data with Pandas
Financial data can be loaded from multiple sources, including CSV files, Excel sheets, APIs, and databases.
Loading CSV Files
import pandas as pd
# Load stock data from a CSV file
df = pd.read_csv("financial_data.csv")
# Display first five rows
print(df.head())
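For time-series work it usually helps to parse the date column while loading, so the index is a DatetimeIndex from the start. The sketch below is a minimal illustration, assuming the file has a column named Date; the CSV text and its values are made up for demonstration and stand in for financial_data.csv.

```python
import io
import pandas as pd

# Made-up sample rows standing in for financial_data.csv (assumed column names)
csv_text = """Date,Open,High,Low,Close,Volume
2023-01-02,100.0,102.0,99.0,101.0,1000
2023-01-03,101.0,103.0,100.0,102.0,1200
2023-01-04,102.0,104.0,101.0,103.0,900
"""

# parse_dates converts the Date column to datetimes; index_col makes it the index
df_csv = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

print(df_csv.index)
print(df_csv.dtypes)
```

With a real file, replace io.StringIO(csv_text) with the file path.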
Loading Stock Market Data from Yahoo Finance
You can fetch historical stock market data using the yfinance package.
import yfinance as yf
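# Fetch Apple (AAPL) daily stock data for 2023 (requires an internet connection)
# Note: some yfinance versions return multi-level columns; check df.columns before indexing df['Close']
df = yf.download("AAPL", start="2023-01-01", end="2024-01-01")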
# Display first five rows
print(df.head())
✔ Automatically retrieves historical stock prices
✔ Includes Open, High, Low, Close, and Volume (an Adjusted Close column may also appear, depending on the yfinance version and the auto_adjust setting)
3. Handling Missing Financial Data
Financial data often has missing values due to market holidays or incomplete records.
# Check for missing values
print(df.isnull().sum())
# Forward-fill: replace each missing value with the previous known value
# (fillna(method='ffill') is deprecated in recent pandas versions)
df.ffill(inplace=True)
# Drop rows with missing values (if necessary)
df.dropna(inplace=True)
✔ Fills missing data using previous values
✔ Removes rows with missing data
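To see what these two steps do, here is a small sketch on a made-up price series with gaps. The values and the names prices, filled, and dropped are illustrative assumptions, not part of the downloaded data.

```python
import numpy as np
import pandas as pd

# Five business days with two missing closing prices (made-up values)
dates = pd.date_range("2023-01-02", periods=5, freq="B")
prices = pd.DataFrame({"Close": [100.0, np.nan, 102.0, np.nan, 104.0]}, index=dates)

# Count the gaps before doing anything
missing_before = int(prices["Close"].isna().sum())

# Forward-fill carries the last known price forward
filled = prices.ffill()

# Dropping instead removes the incomplete rows entirely
dropped = prices.dropna()

print(missing_before)
print(filled)
print(dropped)
```

Forward-filling cannot fill a gap at the very start of the data, since there is no earlier value to copy, which is one reason to follow it with dropna().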
4. Time-Series Analysis in Pandas
Financial data is mostly time-series, meaning each observation is associated with a specific date/time.
Setting Date as Index
df.index = pd.to_datetime(df.index)
print(df.index)
✔ Converts the index into a DatetimeIndex
✔ Allows easy time-based filtering
Filtering Data by Date
# Filter stock data for June 2023 (use .loc; df["2023-06"] looks up a column and fails in current pandas)
june_data = df.loc["2023-06"]
print(june_data)
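Beyond a single month, a DatetimeIndex also supports date-range slices and boolean filters. The following is a minimal sketch on a synthetic daily series; the frame demo and its values are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic daily data covering calendar year 2023
idx = pd.date_range("2023-01-01", "2023-12-31", freq="D")
demo = pd.DataFrame({"Close": np.arange(len(idx), dtype=float)}, index=idx)

# Partial-string indexing: every row in June 2023
june = demo.loc["2023-06"]

# Label-based slice: both endpoints are included
q2 = demo.loc["2023-04-01":"2023-06-30"]

# Boolean mask: everything from 1 October onward
after_oct = demo[demo.index >= pd.Timestamp("2023-10-01")]

print(len(june), len(q2), len(after_oct))
```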
Resampling for Monthly Data
# Resample data to monthly frequency (mean closing price)
# 'ME' (month end) is the alias in pandas 2.2+; older versions use 'M'
monthly_data = df['Close'].resample('ME').mean()
print(monthly_data)
✔ Aggregates data at different time intervals (daily, weekly, monthly)
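A single mean is not always the right aggregate for price data. A common pattern, sketched here under assumed column names and made-up values, is to build weekly bars by taking the first Open, the highest High, the lowest Low, the last Close, and the summed Volume.

```python
import numpy as np
import pandas as pd

# Two full business weeks of synthetic daily bars (Mon 2 Jan to Fri 13 Jan 2023)
idx = pd.date_range("2023-01-02", periods=10, freq="B")
steps = np.arange(10, dtype=float)
daily = pd.DataFrame(
    {
        "Open": 100 + steps,
        "High": 102 + steps,
        "Low": 99 + steps,
        "Close": 101 + steps,
        "Volume": [1000] * 10,
    },
    index=idx,
)

# Each column gets the aggregation that matches its meaning
weekly = daily.resample("W").agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last", "Volume": "sum"}
)

print(weekly)
```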
5. Calculating Financial Indicators
Simple Moving Average (SMA)
SMA helps in analyzing trends by smoothing price fluctuations.
df['SMA_50'] = df['Close'].rolling(window=50).mean()
df['SMA_200'] = df['Close'].rolling(window=200).mean()
Exponential Moving Average (EMA)
EMA gives more weight to recent prices.
df['EMA_50'] = df['Close'].ewm(span=50, adjust=False).mean()
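To make the difference between the two averages concrete, the sketch below uses a short made-up price series and a small span of 3 so the numbers are easy to follow. With adjust=False, pandas applies the recursion EMA_t = alpha * price_t + (1 - alpha) * EMA_(t-1), where alpha = 2 / (span + 1); the loop reproduces that by hand.

```python
import numpy as np
import pandas as pd

# Made-up closing prices, kept short so the arithmetic can be checked by hand
close = pd.Series([10.0, 11.0, 12.0, 11.0, 13.0])
span = 3
alpha = 2 / (span + 1)  # 0.5 for span=3

# SMA: the first (window - 1) values are NaN because the window is not yet full
sma = close.rolling(window=span).mean()

# EMA as computed by pandas
ema_pandas = close.ewm(span=span, adjust=False).mean()

# The same EMA computed step by step, seeded with the first price
ema_manual = [close.iloc[0]]
for price in close.iloc[1:]:
    ema_manual.append(alpha * price + (1 - alpha) * ema_manual[-1])

print(sma.tolist())
print(ema_pandas.tolist())
print(ema_manual)
```

The same NaN warm-up applies to the 50-day and 200-day SMAs above: they have no value until 50 and 200 rows of data are available.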
Relative Strength Index (RSI)
RSI is a momentum indicator that identifies overbought or oversold conditions.
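# Day-over-day change in the closing price
delta = df['Close'].diff()
# Average gain and average loss over 14 periods (losses stored as positive numbers)
# This version uses a simple rolling mean rather than Wilder's smoothing
gain = delta.where(delta > 0, 0).rolling(window=14).mean()
loss = (-delta.where(delta < 0, 0)).rolling(window=14).mean()
# Relative strength, then scale to the 0-100 range
rs = gain / loss
df['RSI'] = 100 - (100 / (1 + rs))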
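Many charting tools smooth the gains and losses exponentially instead of using a simple rolling mean. The sketch below approximates that approach with ewm(alpha=1/period); the function name rsi_wilder is hypothetical, the prices are randomly generated, and the result will not match every platform exactly because seeding conventions differ.

```python
import numpy as np
import pandas as pd

def rsi_wilder(close: pd.Series, period: int = 14) -> pd.Series:
    """RSI with exponentially smoothed gains and losses (Wilder-style approximation)."""
    delta = close.diff()
    gain = delta.clip(lower=0)    # upward moves, zero otherwise
    loss = -delta.clip(upper=0)   # downward moves as positive numbers
    # alpha = 1 / period gives Wilder-style smoothing; min_periods delays output
    # until `period` price changes have been observed
    avg_gain = gain.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    avg_loss = loss.ewm(alpha=1 / period, min_periods=period, adjust=False).mean()
    rs = avg_gain / avg_loss
    return 100 - 100 / (1 + rs)

# Synthetic random-walk prices, for demonstration only
rng = np.random.default_rng(0)
prices = pd.Series(100 + rng.normal(0, 1, 200).cumsum())
rsi = rsi_wilder(prices)

# A strictly rising series has no losses, so its RSI saturates at 100
rising_rsi = rsi_wilder(pd.Series(np.arange(1.0, 31.0)))

print(rsi.tail())
print(rising_rsi.iloc[-1])
```

A common convention reads values above 70 as overbought and below 30 as oversold, though those thresholds are rules of thumb rather than guarantees.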
6. Visualizing Financial Data
Matplotlib and Seaborn can help plot stock prices and indicators.
import matplotlib.pyplot as plt
# Plot stock closing price
plt.figure(figsize=(10,5))
plt.plot(df.index, df['Close'], label='Closing Price')
plt.plot(df.index, df['SMA_50'], label='50-day SMA', linestyle='dashed')
plt.legend()
plt.title("Stock Closing Price with 50-day SMA")
plt.show()
✔ Visualizes trends in stock prices
✔ Compares different indicators
7. Analyzing Portfolio Performance
Calculating Daily Returns
df['Daily Return'] = df['Close'].pct_change()
Calculating Cumulative Returns
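# Compound the daily returns; subtract 1 so the result is a return (0.10 = +10%) rather than a growth factor
df['Cumulative Return'] = (1 + df['Daily Return']).cumprod() - 1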
Risk Analysis: Standard Deviation
volatility = df['Daily Return'].std()
print("Stock Volatility:", volatility)
✔ Measures risk as the standard deviation of daily returns
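Daily volatility is often scaled to an annual figure so it can be compared across assets. The sketch below works through the three calculations on a few made-up prices; the factor of 252 trading days per year is a common convention and an assumption here, not something derived from the data.

```python
import numpy as np
import pandas as pd

# Made-up closing prices for illustration
close = pd.Series([100.0, 102.0, 101.0, 105.0, 103.0])

# Daily percentage change; the first value is NaN because there is no prior price
daily_return = close.pct_change()

# Compounded return since the first observation (0.03 means +3%)
cumulative_return = (1 + daily_return).cumprod() - 1

# Annualize by the square root of the assumed number of trading days
TRADING_DAYS = 252  # assumption: typical for US equity markets
annualized_vol = daily_return.std() * np.sqrt(TRADING_DAYS)

print(cumulative_return.iloc[-1])
print(annualized_vol)
```

The final cumulative return equals the last price divided by the first, minus 1, which is a quick sanity check on the compounding.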
8. Exporting Processed Data
Once the analysis is complete, you may need to save the results.
Saving to CSV
df.to_csv("processed_financial_data.csv")
Saving to Excel
# Writing .xlsx files requires an Excel engine such as openpyxl (pip install openpyxl)
df.to_excel("financial_analysis.xlsx")
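When a saved CSV is loaded again later, the dates come back as plain text unless you ask pandas to parse them. The round trip below is a minimal sketch using an in-memory string instead of a file; the frame out, its values, and the index name Date are assumptions for illustration.

```python
import io
import pandas as pd

# Small made-up frame with a named DatetimeIndex
idx = pd.date_range("2023-01-02", periods=3, freq="B", name="Date")
out = pd.DataFrame({"Close": [100.0, 101.0, 102.0]}, index=idx)

# to_csv() with no path returns the CSV text instead of writing a file
csv_text = out.to_csv()

# Restore the index and parse it back into datetimes on load
restored = pd.read_csv(io.StringIO(csv_text), index_col="Date", parse_dates=True)

print(restored.index)
```

With a real file, pass the same index_col and parse_dates arguments to pd.read_csv("processed_financial_data.csv").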