Data Visualization with Matplotlib – A Comprehensive Guide
1. Introduction to Data Visualization & Matplotlib
Data visualization is an essential part of data analysis, machine learning, and business intelligence. It helps with:
✅ Understanding Trends & Patterns
✅ Communicating Insights Effectively
✅ Identifying Anomalies & Outliers
✅ Comparing Different Data Points
📌 What is Matplotlib?
Matplotlib is a powerful Python library used for creating static, animated, and interactive visualizations. It is widely used in data science, machine learning, and exploratory data analysis (EDA).
📌 Why Use Matplotlib?
✔ Highly customizable
✔ Works well with NumPy & Pandas
✔ Supports various types of charts
✔ Widely used in scientific computing & ML workflows
2. Installing and Importing Matplotlib
A. Install Matplotlib
pip install matplotlib
B. Import Matplotlib
import matplotlib.pyplot as plt
import numpy as np
3. Basic Plot with Matplotlib
A. Create a Simple Line Plot
# Sample Data
x = np.linspace(0, 10, 100) # Generate 100 points between 0 and 10
y = np.sin(x) # Apply sine function
# Create a Plot
plt.plot(x, y, label="Sine Wave", color='blue', linestyle='--')
# Add Labels and Title
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Simple Line Plot")
# Add a Legend
plt.legend()
# Show the Plot
plt.show()
📌 Key Elements in the Plot:
✔ plt.plot() → Plots the data
✔ plt.xlabel(), plt.ylabel() → Label the axes
✔ plt.title() → Adds a title
✔ plt.legend() → Adds a legend
✔ plt.show() → Displays the plot
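The same plot can also be built with explicit Figure and Axes objects, which is the style the subplot section uses later. This is a minimal sketch rather than the only way to do it; the variable names fig and ax are just a common convention.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)  # 100 points between 0 and 10
y = np.sin(x)

# Create the Figure and a single Axes explicitly
fig, ax = plt.subplots()
ax.plot(x, y, label="Sine Wave", color="blue", linestyle="--")

# Axes methods use a set_ prefix for the labels and the title
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.set_title("Simple Line Plot")
ax.legend()
```

Call plt.show() afterwards to display the figure.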
4. Different Types of Plots in Matplotlib
A. Line Plot
x = np.arange(1, 10)
y = x * 2 # Linear function
plt.plot(x, y, marker='o', linestyle='-', color='r', label='y = 2x')
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Line Plot Example")
plt.legend()
plt.show()
📌 Best Use Case:
✔ Showing trends over time
B. Bar Chart
categories = ["A", "B", "C", "D"]
values = [10, 20, 15, 25]
plt.bar(categories, values, color=['red', 'blue', 'green', 'purple'])
plt.xlabel("Categories")
plt.ylabel("Values")
plt.title("Bar Chart Example")
plt.show()
📌 Best Use Case:
✔ Comparing categories or groups
C. Histogram
data = np.random.randn(1000) # Draw 1000 samples from a standard normal distribution
plt.hist(data, bins=30, color='skyblue', edgecolor='black')
plt.xlabel("Data Values")
plt.ylabel("Frequency")
plt.title("Histogram Example")
plt.show()
📌 Best Use Case:
✔ Showing data distribution & frequency
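To see what the bins argument does, it can help to look at the values the histogram call returns. The sketch below assumes a seeded NumPy generator (the seed is arbitrary, used only for reproducibility) and inspects the per-bin counts and the bin edges.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
data = rng.standard_normal(1000)

fig, ax = plt.subplots()
# hist() returns the count in each bin, the bin edges, and the drawn bars
counts, bin_edges, patches = ax.hist(data, bins=30, color="skyblue", edgecolor="black")
ax.set_xlabel("Data Values")
ax.set_ylabel("Frequency")

print(len(counts))        # one count per bin
print(len(bin_edges))     # one more edge than there are bins
print(int(counts.sum()))  # every sample falls into exactly one bin
```

With 30 bins there are 31 edges, and the counts add up to the number of samples.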
D. Scatter Plot
x = np.random.rand(50)
y = np.random.rand(50)
plt.scatter(x, y, color='red', marker='x')
plt.xlabel("X-axis")
plt.ylabel("Y-axis")
plt.title("Scatter Plot Example")
plt.show()
📌 Best Use Case:
✔ Visualizing relationships between two variables
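The random data above has no real relationship between x and y. As a rough illustration, the sketch below assumes a made-up linear relationship with added noise and puts the correlation coefficient in the title; the slope, noise level, and seed are arbitrary choices.

```python
import matplotlib.pyplot as plt
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, only for reproducibility
x = rng.random(50)
y = 2 * x + rng.normal(scale=0.2, size=50)  # hypothetical linear trend plus noise

# Pearson correlation between the two variables
r = np.corrcoef(x, y)[0, 1]

fig, ax = plt.subplots()
ax.scatter(x, y, color="red", marker="x")
ax.set_xlabel("X-axis")
ax.set_ylabel("Y-axis")
ax.set_title(f"Scatter Plot (r = {r:.2f})")
```

The points should cluster around an upward-sloping line, which is the kind of pattern a scatter plot makes visible.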
E. Pie Chart
labels = ["Apples", "Bananas", "Cherries", "Dates"]
sizes = [30, 20, 40, 10]
colors = ["red", "yellow", "pink", "brown"]
plt.pie(sizes, labels=labels, colors=colors, autopct="%1.1f%%", startangle=140)
plt.title("Pie Chart Example")
plt.show()
📌 Best Use Case:
✔ Representing proportions & percentages
5. Customizing Matplotlib Plots
A. Adding Gridlines
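x = np.arange(1, 10) # Recreate the line-plot data (x and y were overwritten by the scatter example)
y = x * 2
plt.plot(x, y, label="y = 2x", color='g')
plt.grid(True) # Enable gridlines
plt.legend()
plt.show()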
B. Changing Line Styles & Markers
plt.plot(x, y, linestyle="--", marker="o", color="blue")
plt.show()
📌 Common Line Styles & Markers:
- Linestyles: "-", "--", "-.", ":"
- Markers: "o", "s", "x", "d"
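One way to compare these options is to draw one line per style on the same axes. The sketch below pairs each linestyle with a marker from the lists above; the pairing and the vertical offsets are arbitrary, chosen only to keep the lines apart.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 10)
linestyles = ["-", "--", "-.", ":"]  # solid, dashed, dash-dot, dotted
markers = ["o", "s", "x", "d"]       # circle, square, x, thin diamond

fig, ax = plt.subplots()
for offset, (ls, mk) in enumerate(zip(linestyles, markers)):
    # Shift each line upward so the styles do not overlap
    ax.plot(x, x + 2 * offset, linestyle=ls, marker=mk,
            label=f"linestyle={ls!r}, marker={mk!r}")
ax.legend()
```

Call plt.show() to view the four styles side by side.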
C. Setting Figure Size
plt.figure(figsize=(8, 5)) # Width = 8 inches, Height = 5 inches
plt.plot(x, y)
plt.show()
D. Adjusting Ticks & Labels
plt.plot(x, y)
plt.xticks(rotation=45) # Rotate x-axis tick labels by 45 degrees
plt.yticks(fontsize=12) # Set font size of y-axis tick labels
plt.show()
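plt.xticks() can also set the tick positions and their text labels in one call. In the sketch below the positions and the month names are hypothetical, picked only to show the mechanism.

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 10)
y = x * 2

tick_positions = [1, 3, 5, 7, 9]                  # hypothetical positions
tick_labels = ["Jan", "Mar", "May", "Jul", "Sep"]  # hypothetical labels

plt.figure()
plt.plot(x, y)
# First argument: where the ticks go; second: the text shown at each tick
plt.xticks(tick_positions, tick_labels, rotation=45)
plt.yticks(fontsize=12)
```

Call plt.show() to display the result.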
6. Creating Subplots
Subplots allow multiple plots in a single figure. The example below reuses x, y, categories, values, and data defined in the earlier sections.
fig, ax = plt.subplots(2, 2, figsize=(10, 8)) # 2x2 Grid of Subplots
# First Plot
ax[0, 0].plot(x, y, color='blue')
ax[0, 0].set_title("Plot 1")
# Second Plot
ax[0, 1].bar(categories, values, color='orange')
ax[0, 1].set_title("Plot 2")
# Third Plot
ax[1, 0].hist(data, bins=20, color='green')
ax[1, 0].set_title("Plot 3")
# Fourth Plot
ax[1, 1].scatter(x, y, color='red')
ax[1, 1].set_title("Plot 4")
plt.tight_layout() # Adjust layout
plt.show()
📌 Why Use Subplots?
✔ Comparing multiple visualizations in one figure
✔ Helps in multi-variable analysis
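When the goal is comparison, sharing an axis between subplots keeps the scales identical. The sketch below assumes two simple curves (sine and cosine) and uses the sharey option of plt.subplots().

```python
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 10, 100)

# sharey=True gives both subplots the same y-axis limits
fig, axes = plt.subplots(1, 2, figsize=(10, 4), sharey=True)

axes[0].plot(x, np.sin(x), color="blue")
axes[0].set_title("sin(x)")

axes[1].plot(x, 0.5 * np.cos(x), color="orange")
axes[1].set_title("0.5 * cos(x)")

fig.suptitle("Shared y-axis")
fig.tight_layout()
```

Because the y-axis is shared, the smaller amplitude of the second curve is visible at a glance. Call plt.show() to display the figure.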
7. Saving Figures in Matplotlib
plt.plot(x, y)
plt.savefig("plot.png", dpi=300, bbox_inches="tight") # Save as PNG (call this before plt.show())
📌 Common Formats: .png, .jpg, .svg, .pdf
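The output format is chosen from the file extension, so one figure can be written in several formats in a loop. This sketch writes to a temporary directory (an assumption made to avoid cluttering the working directory) and leaves out .jpg only to keep it short.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x = np.arange(1, 10)
y = x * 2

fig, ax = plt.subplots()
ax.plot(x, y)

out_dir = tempfile.mkdtemp()  # hypothetical output location
saved = []
for ext in ("png", "svg", "pdf"):
    path = os.path.join(out_dir, f"plot.{ext}")
    # The extension decides the format; dpi mainly matters for raster output such as PNG
    fig.savefig(path, dpi=300, bbox_inches="tight")
    saved.append(path)

print(saved)
```

Replace out_dir with a project folder to keep the files.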
8. Interactive Plots with plt.show(block=False)
plt.ion() # Turn on interactive mode
plt.plot(x, y)
plt.show(block=False)
📌 Best Use Case:
✔ When running continuous updates in live applications
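A live update usually means changing the data of an existing line and redrawing, rather than calling plt.plot() again each time. The sketch below is one possible pattern, not the only one; the function name run_live_update and its parameters are hypothetical.

```python
import matplotlib.pyplot as plt
import numpy as np


def run_live_update(n_frames=20, shift_per_frame=0.2):
    """Redraw a sine wave n_frames times, shifting its phase on each frame."""
    plt.ion()  # Turn on interactive mode
    x = np.linspace(0, 10, 100)
    fig, ax = plt.subplots()
    (line,) = ax.plot(x, np.sin(x))

    for frame in range(1, n_frames + 1):
        # Update the existing line instead of creating a new one
        line.set_ydata(np.sin(x + frame * shift_per_frame))
        fig.canvas.draw()          # Redraw the figure
        fig.canvas.flush_events()  # Let the GUI process pending events
    return line


line = run_live_update(n_frames=5)
```

On a GUI backend, adding a short plt.pause() inside the loop slows the updates down enough to watch them.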
9. Summary
✔ Matplotlib is a core library for data visualization in Python
✔ Supports line, bar, histogram, scatter, and pie charts
✔ Highly customizable with labels, colors, gridlines, etc.
✔ Subplots allow multiple plots in one figure
✔ Can save figures for reports & presentations
📌 Next Steps:
✅ Learn Seaborn for Advanced Visualizations
✅ Use Matplotlib with Pandas & NumPy
✅ Try interactive dashboards with Plotly