Data Visualization with Seaborn

Data Visualization with Seaborn – A Comprehensive Guide

1. Introduction to Data Visualization & Seaborn

A. What is Data Visualization?

Data visualization is the graphical representation of data that helps in:
✅ Identifying trends & patterns
✅ Communicating insights effectively
✅ Comparing different data points
✅ Enhancing decision-making

B. What is Seaborn?

Seaborn is a Python library for statistical data visualization built on top of Matplotlib. It provides:
✅ A high-level interface for complex visualizations
✅ Attractive, customizable graphs
✅ More polished default styling than plain Matplotlib
✅ Integration with Pandas & NumPy

📌 Why Use Seaborn?
✔ Needs less code than Matplotlib for common statistical plots
✔ Attractive default styles
✔ Built-in themes for better presentation
✔ Understands categorical columns directly (grouping, ordering, coloring)
✔ Produces complex visualizations from a single function call


2. Installing and Importing Seaborn

A. Install Seaborn

pip install seaborn

B. Import Seaborn and Other Required Libraries

import seaborn as sns
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
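
A quick sanity check after installing, as a minimal sketch: print the versions that were actually picked up. A few Seaborn and Pandas behaviors differ between releases, so it helps to know which ones are in use.

```python
import matplotlib
import pandas as pd
import seaborn as sns

# Collect the installed versions of the libraries used in this guide.
versions = {
    "seaborn": sns.__version__,
    "matplotlib": matplotlib.__version__,
    "pandas": pd.__version__,
}
for name, version in versions.items():
    print(f"{name}: {version}")
```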

3. Basic Seaborn Plots

A. Load Sample Dataset

Seaborn provides built-in datasets for practice.

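# Load Seaborn's built-in dataset
# (downloaded from Seaborn's online data repository on first use, so this needs an internet connection)
tips = sns.load_dataset("tips")
print(tips.head())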

📌 Common Datasets in Seaborn:

  • tips → Restaurant bills and tips
  • iris → Flower measurements
  • penguins → Penguin species data
  • diamonds → Diamond pricing
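
The built-in datasets are only a convenience. Seaborn plots any Pandas DataFrame that has one row per observation and one column per variable. Here is a minimal sketch of building such a frame by hand; the name `bills` and all of its values are made up for illustration and are not the real tips data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(seed=0)  # fixed seed so the data is reproducible
n = 60

# Hypothetical stand-in for "tips": one row per observation, one column per variable.
bills = pd.DataFrame({
    "total_bill": rng.uniform(10, 50, size=n).round(2),
    "day": rng.choice(["Thur", "Fri", "Sat", "Sun"], size=n),
    "smoker": rng.choice(["Yes", "No"], size=n),
})
bills["tip"] = (bills["total_bill"] * rng.uniform(0.10, 0.25, size=n)).round(2)

print(bills.head())
print(bills.dtypes)
```

A frame like this can be passed as data= to any of the plotting functions in the sections that follow.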

B. Scatter Plot (for Relationships Between Variables)

sns.scatterplot(x="total_bill", y="tip", data=tips, hue="sex", style="smoker", size="size")
plt.title("Scatter Plot of Total Bill vs Tip")
plt.show()

📌 Best Use Case:
✔ Visualizing relationships between numerical variables


C. Line Plot (for Trends Over Time)

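# "day" is categorical, so each point is the mean total_bill for that day
# (the shaded band is a 95% confidence interval), not a true time series.
sns.lineplot(x="day", y="total_bill", data=tips, hue="sex", marker="o")
plt.title("Mean Total Bill by Day")
plt.show()

📌 Best Use Case:
✔ Time series analysis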


D. Bar Plot (for Categorical Data Comparison)

sns.barplot(x="day", y="total_bill", data=tips, hue="sex", estimator=np.mean)
plt.title("Average Total Bill per Day")
plt.show()

📌 Best Use Case:
✔ Comparing categories based on numerical values
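
To see what a bar plot is summarizing, compare it with a plain groupby. The sketch below uses a tiny made-up frame (not the real tips data) so the means are easy to check by hand. The bar heights match the group means; the error bars are an extra that Seaborn adds.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical mini dataset with easy-to-check means.
bills = pd.DataFrame({
    "day": ["Thur", "Thur", "Fri", "Fri", "Sat", "Sat"],
    "total_bill": [10.0, 20.0, 30.0, 50.0, 15.0, 25.0],
})
day_order = ["Thur", "Fri", "Sat"]

# The same numbers barplot will draw as bar heights.
means = bills.groupby("day")["total_bill"].mean().loc[day_order]
print(means)

fig, ax = plt.subplots()
sns.barplot(data=bills, x="day", y="total_bill", order=day_order, ax=ax)
ax.set_title("Average Total Bill per Day")
```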


E. Histogram & KDE Plot (for Data Distribution)

sns.histplot(tips["total_bill"], bins=20, kde=True, color="blue")
plt.title("Histogram of Total Bill Amount")
plt.show()

📌 Best Use Case:
✔ Understanding data distribution & frequency


F. Box Plot (for Outlier Detection)

sns.boxplot(x="day", y="total_bill", data=tips, hue="sex")
plt.title("Box Plot of Total Bill by Day")
plt.show()

📌 Best Use Case:
✔ Identifying outliers & spread of data
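
The points a box plot draws beyond the whiskers follow a simple rule. With the default whisker setting, a value is flagged when it lies more than 1.5 × IQR outside the quartiles. A minimal sketch of that rule in Pandas, on made-up numbers:

```python
import pandas as pd

# Hypothetical bills with one obviously extreme value.
bills = pd.Series([12.0, 15.0, 18.0, 20.0, 21.0, 23.0, 25.0, 90.0], name="total_bill")

q1, q3 = bills.quantile([0.25, 0.75])
iqr = q3 - q1

# Points beyond 1.5 * IQR from the quartiles are the ones drawn past the whiskers.
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = bills[(bills < lower) | (bills > upper)]
print(outliers.tolist())
```

Here only the 90.0 bill falls outside the fences. Whether a flagged point is a real error or just a large bill is still a judgment call.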


G. Violin Plot (for Distribution & Density)

sns.violinplot(x="day", y="total_bill", data=tips, hue="sex", split=True)
plt.title("Violin Plot of Total Bill by Day")
plt.show()

📌 Best Use Case:
✔ Combining box plot & KDE plot for better insights


H. Pair Plot (for Multi-Variable Relationships)

sns.pairplot(tips, hue="sex")
plt.show()

📌 Best Use Case:
✔ Analyzing relationships between multiple variables


4. Advanced Seaborn Customizations

A. Setting Themes

sns.set_theme(style="darkgrid")

📌 Styles Available (values for the style argument):

  • "darkgrid"
  • "whitegrid"
  • "dark"
  • "white"
  • "ticks"
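
set_theme changes the style for every plot that follows. To style just one figure, sns.axes_style can be used as a context manager instead. A short sketch:

```python
import matplotlib.pyplot as plt
import seaborn as sns

# axes_style returns the matplotlib settings behind a named style.
print(sns.axes_style("whitegrid")["axes.grid"])  # grid lines on
print(sns.axes_style("white")["axes.grid"])      # grid lines off

# As a context manager, the style applies only to figures created inside the block.
with sns.axes_style("ticks"):
    fig, ax = plt.subplots()
    ax.plot([0, 1, 2], [0, 1, 4])
    ax.set_title("Temporary 'ticks' style")
```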

B. Customizing Colors

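# Seaborn 0.13+ expects hue whenever palette is set; legend=False hides the redundant legend
sns.barplot(x="day", y="total_bill", data=tips, hue="day", palette="coolwarm", legend=False)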

📌 Popular Color Palettes:

  • "coolwarm"
  • "Blues"
  • "Reds"
  • "magma"
  • "viridis"
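
A palette name is shorthand for a list of colors, and sns.color_palette exposes that list. The sketch below pulls five colors from "viridis" and also builds a dictionary that pins each day to a fixed color. The variable names are only for illustration.

```python
import seaborn as sns

# Ask for five evenly spaced colors from the "viridis" colormap.
colors = sns.color_palette("viridis", n_colors=5)
print(colors.as_hex())  # hex strings, handy for reuse outside seaborn

# A palette can also map specific category levels to specific colors.
day_colors = dict(zip(["Thur", "Fri", "Sat", "Sun"], sns.color_palette("Blues", 4)))
```

A dictionary like day_colors can be passed as palette= together with hue="day", which keeps each category's color stable across several plots.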

C. Adding Titles & Labels

plt.title("Customized Seaborn Plot", fontsize=15, fontweight="bold")
plt.xlabel("X-axis Label", fontsize=12)
plt.ylabel("Y-axis Label", fontsize=12)
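
The plt.* calls act on whichever plot is "current". When a script has more than one figure, it is often safer to set labels on the Axes object itself. A minimal sketch with made-up data:

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical data for illustration.
bills = pd.DataFrame({"total_bill": [10.5, 22.0, 31.2, 18.4], "tip": [1.5, 3.0, 5.0, 2.5]})

fig, ax = plt.subplots(figsize=(6, 4))
sns.scatterplot(data=bills, x="total_bill", y="tip", ax=ax)

# Labels set on the Axes object apply to this plot only, whichever figure is "current".
ax.set_title("Tip vs Total Bill", fontsize=15, fontweight="bold")
ax.set_xlabel("Total bill (USD)", fontsize=12)
ax.set_ylabel("Tip (USD)", fontsize=12)
```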

D. Using FacetGrid (Multiple Plots in One Figure)

g = sns.FacetGrid(tips, col="sex", row="smoker", margin_titles=True)
g.map_dataframe(sns.scatterplot, x="total_bill", y="tip")
plt.show()

📌 Best Use Case:
✔ Creating multiple related plots
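
A FacetGrid can also color points by a third variable and share one legend across all panels. The sketch below uses an invented `bills` frame with the same column names as tips; set_axis_labels and add_legend are FacetGrid methods.

```python
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(seed=4)
n = 120

# Hypothetical data with two categorical columns to facet on.
bills = pd.DataFrame({
    "total_bill": rng.uniform(10, 50, n),
    "sex": rng.choice(["Male", "Female"], size=n),
    "smoker": rng.choice(["Yes", "No"], size=n),
    "time": rng.choice(["Lunch", "Dinner"], size=n),
})
bills["tip"] = bills["total_bill"] * rng.uniform(0.10, 0.25, n)

grid = sns.FacetGrid(bills, col="sex", row="smoker", hue="time", margin_titles=True)
grid.map_dataframe(sns.scatterplot, x="total_bill", y="tip")
grid.set_axis_labels("Total bill", "Tip")  # one shared label per row and column
grid.add_legend(title="Time")              # a single legend for all facets
print(grid.axes.shape)
```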


5. Heatmaps (for Correlation & Relationships)

# Compute Correlation Matrix
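# numeric_only=True skips text/categorical columns such as "sex" and "day";
# without it, recent pandas versions raise an error on this dataset
corr_matrix = tips.corr(numeric_only=True)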

# Create Heatmap
sns.heatmap(corr_matrix, annot=True, cmap="coolwarm", linewidths=0.5)
plt.title("Correlation Heatmap")
plt.show()

📌 Best Use Case:
✔ Finding relationships between multiple numerical variables
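
Correlations are only defined for numeric columns, and a correlation matrix is symmetric, so half of the heatmap is redundant. Here is one possible sketch, on a made-up frame, that keeps only the numeric columns and masks the upper triangle:

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

rng = np.random.default_rng(seed=5)
n = 100

# Hypothetical frame mixing numeric and text columns.
bills = pd.DataFrame({"total_bill": rng.uniform(10, 50, n), "size": rng.integers(1, 7, n)})
bills["tip"] = 0.15 * bills["total_bill"] + rng.normal(0, 1, n)
bills["day"] = rng.choice(["Sat", "Sun"], size=n)

# Keep only numeric columns before computing correlations.
corr_matrix = bills.select_dtypes(include="number").corr()

# Hide the upper triangle, which mirrors the lower one.
mask = np.triu(np.ones_like(corr_matrix, dtype=bool), k=1)

fig, ax = plt.subplots()
sns.heatmap(corr_matrix, mask=mask, annot=True, fmt=".2f", cmap="coolwarm",
            vmin=-1, vmax=1, linewidths=0.5, ax=ax)
ax.set_title("Correlation Heatmap")
```

Fixing vmin and vmax at -1 and 1 keeps the color scale comparable between heatmaps.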


6. Saving Seaborn Plots

# Call savefig before plt.show(); once the figure window is closed, the saved image may be blank
plt.savefig("seaborn_plot.png", dpi=300, bbox_inches="tight")
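
Saving through the Figure object avoids any doubt about which figure gets written. The sketch below uses made-up values and writes to the system temp directory; both the data and the output location are assumptions for illustration.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical values for illustration.
bills = pd.DataFrame({"day": ["Thur", "Fri", "Sat"], "total_bill": [17.7, 17.2, 20.4]})

fig, ax = plt.subplots()
sns.barplot(data=bills, x="day", y="total_bill", ax=ax)
ax.set_title("Total Bill by Day")

# Save through the Figure object, before any call to plt.show().
out_path = os.path.join(tempfile.gettempdir(), "seaborn_plot.png")
fig.savefig(out_path, dpi=300, bbox_inches="tight")
plt.close(fig)  # free the figure once it is on disk
print("saved to", out_path)
```

Changing the file extension (for example to .pdf or .svg) switches the output format.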

7. Summary

✔ Seaborn is a high-level Python library for data visualization
✔ Supports scatter, line, bar, histogram, box, violin, and pair plots
✔ Offers built-in themes and color palettes
✔ Allows advanced customizations with FacetGrid & Heatmaps
✔ Best suited for statistical data analysis

📌 Next Steps:
✅ Explore Matplotlib for lower-level customization
✅ Use Seaborn with Pandas & NumPy
✅ Try interactive dashboards with Plotly
