📊 Data Visualization: Seaborn Cheat Sheet

date: Apr 1, 2023
slug: data-visualization-seaborn-cheat-sheet
status: Published
tags: Data Visualization Basics
summary: A Seaborn Cheat Sheet for your Data Analysis.
type: Post
Last updated: Apr 1, 2023 04:47 PM
👋 Hi guys, in this lesson I will walk you through a broad Seaborn Cheat Sheet so you know which plots are available in this library and which situations each one suits best. So let's go!!
 
P.S.: you can get all the code and datasets used here in this GitHub repository: data-visualization-basics-posts.
 

 
First of all, let’s load an image summarizing everything I will cover in this lesson:
 
from IPython.display import Image
Image('./datas/seaborn-plots.png')
 

 

-) Preparing the Libraries and Datasets

 
#
# ---- Importing the modules ----
#
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

sns.set_style('whitegrid')


#
# ---- Reading the datasets ----
#

spotify_data = pd.read_csv('./datas/spotify.csv'
                           , index_col='Date'
                           , parse_dates=True)

flight_data = pd.read_csv('./datas/flight_delays.csv'
                         , index_col='Month')

insurance_data = pd.read_csv('./datas/insurance.csv')

iris_data = pd.read_csv('./datas/iris.csv'
                      , index_col='Id')
 

 

0) Trend Plots

 
Trend Plots are used to show patterns of change, especially over time (hours, days, weeks, years and so on). Styles:
 
  • Line Plots
 

0.1) Line Plots

 
Line Plots are useful to show how variables change over time. Commands:
 
All Variables
sns.lineplot(data=data)
 
Just a Few Variables
sns.lineplot(data=data1, label='label1')
sns.lineplot(data=data2, label='label2')
 
#
# ---- Showing a line plot with all columns ----
#
plt.figure(figsize=(10,7))
plt.title('Songs on Spotify: Number of Views per Date')
plt.xlabel('Date')
plt.ylabel('Number of Views')

sns.lineplot(data=spotify_data)

plt.show()
 
#
# ---- Line Plot with just a few columns/variables ----
#
plt.figure(figsize=(10,7))

plt.title('Spotify Songs: Number of Views per Date')
plt.xlabel('Date')
plt.ylabel('Number of Views')

sns.lineplot(data=spotify_data['Despacito'], label='Despacito')
sns.lineplot(data=spotify_data['Something Just Like This'], label='Something Just Like This')
sns.lineplot(data=spotify_data['Unforgettable'], label='Unforgettable')

plt.show()
 

 

1) Relationship Plots

 
Relationship Plots are used to show comparisons/relationships between the dataset variables. Styles:
 
  • Bar Plot
  • HeatMap Plot
  • Scatter Plot
  • Regression Plot
  • Lm Plot
  • Swarm Plot
 

1.0) Bar Plots

 
Bar Plots compare quantities that correspond to different groups. Commands:
 
Simple Bar Plot
sns.barplot(data=data, x=x_data, y=y_data)
Bar Plot With Categorical Variable
sns.barplot(data=data, x=x_data, y=y_data, hue=categorical_variable)
 
#
# ---- Simple Bar Plot ----
#
plt.figure(figsize=(10,7))
plt.title("Monthly Delays of HA's Flights")

# positive values >> the flight was delayed
# negative values >> the flight was early
sns.barplot(data=flight_data, x=flight_data.index, y='HA')

plt.show()
 
#
# ---- Bar Plot with Categorical Variable ----
#
plt.figure(figsize=(10,7))
plt.title('Charges by BMI and Smoker Status')


sns.barplot(data=insurance_data[0:20]
           , x='bmi'
           , y='charges'
           , hue='smoker')

plt.show()
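One thing worth knowing: when several rows share the same x value, sns.barplot draws their mean by default, with an error bar on top. The sketch below uses made-up, insurance-like values (demo_data is hypothetical, not the real insurance.csv) and compares the bars with a plain pandas groupby:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical, insurance-like data (NOT the real insurance.csv)
demo_data = pd.DataFrame({
    'smoker':  ['yes', 'yes', 'yes', 'no', 'no', 'no'],
    'charges': [30000, 34000, 38000, 4000, 5000, 6000],
})

# The same aggregation done by hand, for comparison with the bars
mean_charges = demo_data.groupby('smoker')['charges'].mean()
print(mean_charges)

plt.figure(figsize=(10,7))
plt.title('Demo: Mean Charges by Smoker Status')

# Each bar height is the mean of 'charges' within that group
ax = sns.barplot(data=demo_data, x='smoker', y='charges')

plt.show()
```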
 

1.1) HeatMap Plots

 
HeatMaps show patterns and correlations by colouring each cell according to its value. They only accept numerical variables, so categorical variables must be encoded as numbers (labelled) first. Commands:
 
Simple HeatMap
sns.heatmap(data=data, annot=True|False)
 
#
# ---- Simple HeatMap ----
#
plt.figure(figsize=(10,7))

plt.title('Flights Delays per Month')
plt.xlabel('Flights')
plt.ylabel('Months')

sns.heatmap(data=flight_data, annot=True)

plt.show()
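To illustrate the point about categorical variables, here is a minimal sketch that encodes a yes/no column as 0/1 before drawing a correlation heatmap. The demo values are made up for illustration, and mapping with a dict is just one of several ways to do the encoding:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical, insurance-like data (NOT the real insurance.csv)
demo_data = pd.DataFrame({
    'bmi':     [22.0, 31.5, 27.3, 35.1, 24.8, 29.9],
    'smoker':  ['no', 'yes', 'no', 'yes', 'no', 'yes'],
    'charges': [3200.0, 38000.0, 4100.0, 42000.0, 2900.0, 21000.0],
})

# Encode the categorical column as numbers so the heatmap can use it
encoded = demo_data.copy()
encoded['smoker'] = encoded['smoker'].map({'no': 0, 'yes': 1})

# Pairwise correlations between the (now all numerical) columns
corr = encoded.corr()

plt.figure(figsize=(10,7))
plt.title('Demo: Correlation Between Variables')

ax = sns.heatmap(data=corr, annot=True)

plt.show()
```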
 

1.2) Scatter Plot

 
Scatter Plots show how two numerical variables relate to each other by drawing one point per observation. Commands:
 
Simple Scatter Plot
sns.scatterplot(data=data, x=x_data, y=y_data)
Scatter Plot with Categorical Variable
sns.scatterplot(data=data, x=x_data, y=y_data, hue=categorical_variable)
 
#
# ---- Simple Scatter Plot ----
#
plt.figure(figsize=(10,7))

plt.title('Charges by BMI')
plt.xlabel('BMI')
plt.ylabel('Charges')

sns.scatterplot(data=insurance_data, x='bmi', y='charges')

plt.show()
 
#
# ---- Scatter Plot with Categorical Variable ----
#
plt.figure(figsize=(10,7))

plt.title('Charges by BMI and Smoker Status')
plt.xlabel('BMI')
plt.ylabel('Charges')

sns.scatterplot(data=insurance_data, x='bmi', y='charges', hue='smoker')

plt.show()
 

1.3) Regression Plots

 
Regression Plots are like the scatter ones, but with a regression line displaying the relationship between the x and y variables. Commands:
 
Regression Plot
sns.regplot(data=data, x=x_data, y=y_data)
 
#
# ---- Regression Plot ----
#
plt.figure(figsize=(10,7))

plt.title('Charges by BMI')
plt.xlabel('BMI')
plt.ylabel('Charges')

# the higher the BMI, the more the person tends to pay in charges
sns.regplot(data=insurance_data, x='bmi', y='charges')

plt.show()
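If you also want a number for the trend the regression line shows, one option (not used elsewhere in this post) is numpy.polyfit with degree 1, which fits a straight line by least squares. A small sketch with made-up values; demo_data is hypothetical:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical BMI/charges pairs (NOT the real insurance.csv)
demo_data = pd.DataFrame({'bmi':     [20, 25, 30, 35, 40],
                          'charges': [2000, 2600, 2900, 3600, 4000]})

# Degree-1 least-squares fit: returns the slope first, then the intercept
slope, intercept = np.polyfit(demo_data['bmi'], demo_data['charges'], 1)
print(f'charges ~ {slope:.1f} * bmi + {intercept:.1f}')

plt.figure(figsize=(10,7))
plt.title('Demo: Charges by BMI')
plt.xlabel('BMI')
plt.ylabel('Charges')

ax = sns.regplot(data=demo_data, x='bmi', y='charges')

plt.show()
```

A positive slope means charges tend to rise with BMI; the size of the slope tells you roughly how much per BMI unit.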
 

1.4) LM Plots

 
LM Plots are like the regression ones, but with a categorical variable and a regression line for each categorical group. Commands:
 
LM Plot
sns.lmplot(data=data, x=x_data, y=y_data, hue=categorical_variable)
 
#
# ---- LM Plot ----
#
#
# the regression lines suggest that smokers pay higher charges
# as their BMI increases.
#
# the same holds for non-smokers, but their charges grow much
# more slowly than the smokers' ones
sns.lmplot(data=insurance_data, x='bmi', y='charges', hue='smoker');
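Note that sns.lmplot creates its own figure, so a plt.figure(figsize=...) call before it would not size the plot. A sketch of sizing and labelling it through its own height and aspect parameters and the grid object it returns; the demo values are made up:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical, insurance-like data (NOT the real insurance.csv)
demo_data = pd.DataFrame({
    'bmi':     [20, 25, 30, 35, 20, 25, 30, 35],
    'charges': [15000, 22000, 31000, 40000, 2000, 2500, 3200, 3600],
    'smoker':  ['yes'] * 4 + ['no'] * 4,
})

# height is in inches; the width is height * aspect
g = sns.lmplot(data=demo_data, x='bmi', y='charges', hue='smoker',
               height=7, aspect=10/7)

# lmplot returns a FacetGrid; labels and title are set through it
g.set_axis_labels('BMI', 'Charges')
g.ax.set_title('Demo: Charges by BMI and Smoker Status')

plt.show()
```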
 

1.5) Swarm Plots

 
Swarm Plots show the relationship between a categorical variable and a numerical one. Commands:
 
Swarm Plot
sns.swarmplot(data=data, x=x_data, y=y_data)
 
#
# ---- Swarm Plot ----
#
plt.figure(figsize=(10,7))

plt.title('Charges by Smoker Status')
plt.xlabel('Smoker or Not')
plt.ylabel('Charges')

sns.swarmplot(data=insurance_data, x='smoker', y='charges')

plt.show()
 

 

2) Distribution Plots

 
Distribution Plots show how the data are distributed in the dataset and help the Data Scientist estimate the likely output values for specific inputs. Styles:
 
  • Histogram Plots
  • KDE Plots
  • Joint Plots
 

2.0) Histogram Plots

 
Histogram Plots show the distribution of a single variable, with or without a categorical one. Commands:
 
Simple Hist Plot
sns.histplot(data=data, x=x_data)
Hist Plot with Categorical Variable
sns.histplot(data=data, x=x_data, hue=categorical_variable)
 
#
# ---- Simple Hist Plot ----
#
plt.figure(figsize=(10,7))

plt.title('Number of Flowers by Sepal Length (cm)')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Count')

sns.histplot(data=iris_data, x='Sepal Length (cm)')

plt.show()
 
#
# ---- Hist Plot with Categorical Variable ----
#
plt.figure(figsize=(10,7))

plt.title('Number of Flowers by Sepal Length (cm) and Species')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Count')

sns.histplot(data=iris_data, x='Sepal Length (cm)', hue='Species')

plt.show()
 
#
# ---- Histogram of all numerical columns at once (one I really like) ----
#
plt.figure(figsize=(10,7))

plt.title('Number of Flowers by Measurement (cm)')
plt.xlabel('Measurement (cm)')
plt.ylabel('Count')

sns.histplot(data=iris_data)

plt.show()
 

2.1) KDE Plots

 
KDE Plots show a smoother distribution (compared to the histograms) of one variable with or without a categorical one. Commands:
 
Simple KDE Plot
sns.kdeplot(data=data, x=x_data, fill=True|False)
KDE Plot with Categorical Variable
sns.kdeplot(data=data, x=x_data, hue=categorical_variable, fill=True|False)
 
#
# ---- Simple KDE Plot ----
#
plt.figure(figsize=(10,7))

plt.title('Number of Flowers per Sepal Length (cm)')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Density')

# fill=True replaces the deprecated shade=True
sns.kdeplot(data=iris_data, x='Sepal Length (cm)', fill=True)

plt.show()
 
#
# ---- KDE Plot with Categorical Variable ----
#
plt.figure(figsize=(10,7))

plt.title('Number of Flowers by Sepal Length (cm) and Species')
plt.xlabel('Sepal Length (cm)')
plt.ylabel('Density')

# fill=True replaces the deprecated shade=True
sns.kdeplot(data=iris_data, x='Sepal Length (cm)', hue='Species', fill=True)

plt.show()
 

2.2) Joint Plot

 
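Joint Plots show the joint distribution of two numerical variables, with each variable's own distribution drawn along the margins; with kind='kde' all of them are drawn as KDE plots. Commands: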
 
Joint Plot
sns.jointplot(data=data, x=x_data, y=y_data, kind='kde')
 
#
# ---- Joint Plot ----
#
sns.jointplot(data=iris_data
             , x='Sepal Length (cm)'
             , y='Petal Length (cm)'
             , kind='kde');
 

 
Yeah, yeah, I know that this lesson has a little too much information, but you don't have to learn all of these things at once. Come back to this post whenever you have to create plots and you will get used to it.
 
See ya in the next lesson!! 👋