Comical way of storytelling with xkcd Matplotlib
Making comical charts for infographics in Matplotlib with the Netflix Movies and TV Shows dataset.
Data visualization is a great way to tell a story, because people understand charts more easily than tables of values. Charts help you absorb data quickly, but what if we could do it in a fun way?
Have you ever wondered about a comical way to represent data? In this article, we will use a Matplotlib feature for creating fun visualizations. Since we are naturally drawn to visuals, we can use these skills to enhance our data with charts and tell stories.
We can make bar charts, line charts, scatter charts, and many more in a comical way with the xkcd mode in Matplotlib. It is a style meant to mimic the drawing style of the popular comic XKCD, a series created by Randall Munroe that is famous for its simple design and witty humor.
The data I am using is the Netflix Movies and TV Shows dataset from Kaggle. If you want an EDA for this data, do visit here.
Python interface with xkcd
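There is also a Python library called xkcd that uses a JSON interface to Randall’s site to retrieve comic data; it supports both Python 2 and Python 3. The charts in this article do not need it, since the xkcd drawing style ships with Matplotlib as plt.xkcd().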
Table of Contents
- Plotting chart with xkcd
- Description of Dataset
- Netflix Timeline chart
- Pie Chart
- Bar Chart
- Line Chart
- Plotting charts with subplots
- Infographics with xkcd
- Where to use xkcd types of charts
Plotting chart with xkcd
To make a comical chart, we just need to put our plotting code inside the following block. Any Matplotlib code for a bar chart, line chart, or other chart placed inside it will be drawn in the comic style.
with plt.xkcd():
Importing the library
import pandas as pd
import matplotlib.pyplot as plt
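As a minimal sketch of how the block behaves, the example below draws a small bar chart inside plt.xkcd() and records the "path.sketch" setting before, inside, and after the block. The labels and numbers are made up for illustration, and the Agg backend line is only there so the sketch runs without a display; in a notebook you would likely drop it and call plt.show() instead.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt

# Made-up numbers, only here to have something to draw
labels = ["Mon", "Tue", "Wed"]
hours_watched = [2, 3, 1]

sketch_before = plt.rcParams["path.sketch"]

with plt.xkcd():
    # Inside the block the hand-drawn settings are active
    sketch_inside = plt.rcParams["path.sketch"]
    fig, ax = plt.subplots(figsize=(5, 3))
    ax.bar(labels, hours_watched)
    ax.set_title("Hours of Netflix per day")
    fig.canvas.draw()  # render once while the xkcd settings apply

# Leaving the block restores the previous settings
sketch_after = plt.rcParams["path.sketch"]
print(sketch_before, sketch_inside, sketch_after)
```

Because the settings are restored when the block ends, charts created outside the block keep the regular Matplotlib look.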
Dataset
First, let’s see what the data looks like. Before starting the analysis, the data needs cleaning; follow the steps provided here.
df = pd.read_csv(r'D:\netflix_titles.csv')
df.head(2)
The data covers Netflix Movies and TV Shows and has features such as type, title, director, country, cast, year, and duration. In this article, we will see how to make comical charts and compare them with charts made in plotly.
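The line chart later in this article reads a year_added column, which comes from the cleaning steps rather than from the raw file. One possible way to derive it is sketched below, assuming the raw data has a date_added column with strings such as "September 25, 2021". The helper name add_year_added and the sample rows are hypothetical.

```python
import pandas as pd

# Tiny stand-in for the raw file; the real rows would come from pd.read_csv
sample = pd.DataFrame({
    "title": ["A", "B", "C"],
    "date_added": ["September 25, 2021", " August 4, 2017", None],
})

def add_year_added(frame):
    """Return a copy with a nullable integer 'year_added' column."""
    out = frame.copy()
    # Strip stray spaces, then parse; values that do not parse become NaT
    parsed = pd.to_datetime(
        out["date_added"].str.strip(), format="%B %d, %Y", errors="coerce"
    )
    # Int64 (capital I) keeps missing years as <NA> instead of failing
    out["year_added"] = parsed.dt.year.astype("Int64")
    return out

cleaned = add_year_added(sample)
print(cleaned[["title", "year_added"]])
```

Keeping missing dates as missing years, rather than dropping the rows, leaves the other charts unaffected.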
Netflix Journey
Before diving into the individual charts, let’s make a journey chart of Netflix. To make this chart we need to combine scatter and line charts. This visual shows the journey of Netflix from DVDs to Netflix and Chill!
import numpy as np
# these go on the numbers below
tl_dates = [
    "1997\nFounded",
    "1999\nStart Monthly\nsubscription\nservice",
    "2004\nLaunches online\n DVD rental\nservice",
    "2007\nStreaming service",
    "2016\nGoes Global",
    "2021\nNetflix & Chill"
]
# the numbers go on these
tl_x = [1, 2.2, 4, 5.3, 7.5, 8.5]
tl_sub_x = [1.5, 3, 5, 6.5]
tl_sub_times = ["1998", "2000", "2006", "2012"]
tl_text = [
    "Netflix.com launched",
    "Starts\nPersonal\nRecommendations",
    "Billionth DVD Delivery",
    "UK Launch"
]
with plt.xkcd():
    # Set figure & Axes
    fig, ax = plt.subplots(figsize=(15, 6), constrained_layout=True)
    ax.set_ylim(-2, 1.75)
    ax.set_xlim(0, 10)
    # Timeline : line
    ax.axhline(0, xmin=0.1, xmax=0.85, c='#000', zorder=1)
    # Timeline : Date Points
    ax.scatter(tl_x, np.zeros(len(tl_x)), s=120, c='#000', zorder=2)
    ax.scatter(tl_x, np.zeros(len(tl_x)), s=30, c='#000', zorder=3)
    # Timeline : Time Points
    ax.scatter(tl_sub_x, np.zeros(len(tl_sub_x)), s=50, c='#777', zorder=4)
    # Date Text
    for x, date in zip(tl_x, tl_dates):
        ax.text(x, -0.55, date, ha='center',
                fontfamily='serif', fontweight='bold',
                color='#111', fontsize=12)
    # Stemplot : vertical lines, alternating above and below the timeline
    levels = np.zeros(len(tl_sub_x))
    levels[::2] = 0.3
    levels[1::2] = -0.3
    # use_line_collection is no longer passed: it was removed in Matplotlib 3.8
    markerline, stemline, baseline = ax.stem(tl_sub_x, levels)
    plt.setp(baseline, zorder=0)
    plt.setp(markerline, marker=',', color='#000')
    plt.setp(stemline, color='#000')
    # Text : year and description for each in-between event
    for idx, x, time, txt in zip(range(1, len(tl_sub_x)+1), tl_sub_x, tl_sub_times, tl_text):
        ax.text(x, 1.3*(idx%2)-0.5, time, ha='center',
                fontfamily='serif', fontweight='bold',
                color='#111', fontsize=11)
        ax.text(x, 1.3*(idx%2)-0.6, txt, va='top', ha='center',
                fontfamily='serif', color='#111')
    # Spine
    for spine in ["left", "top", "right", "bottom"]:
        ax.spines[spine].set_visible(False)
    # Ticks
    ax.set_xticks([])
    ax.set_yticks([])
    # Title
    ax.set_title("From DVD rental to Netflix & chill", fontweight="bold", fontfamily='serif', fontsize=16, color='#111')
    plt.show()
The reference for this chart is taken from here.
To enhance this chart, you can add different colors to the timeline, where each color shows a different part of the Netflix journey.
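One way to try the color idea is sketched below: each milestone point gets a color for the era it belongs to. The positions and years follow the timeline above, while the split into a DVD era and a streaming era and the two hex colors are my own illustrative assumptions. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt

# Milestone positions and years from the journey chart
tl_x = [1, 2.2, 4, 5.3, 7.5, 8.5]
tl_years = ["1997", "1999", "2004", "2007", "2016", "2021"]
# Hypothetical grouping: first three milestones = DVD era, last three = streaming era
era_colors = ["#b20710"] * 3 + ["#221f1f"] * 3

with plt.xkcd():
    fig, ax = plt.subplots(figsize=(10, 2.5))
    ax.set_xlim(0, 10)
    ax.set_ylim(-1, 1)
    ax.axhline(0, xmin=0.1, xmax=0.85, c="#000", zorder=1)
    # One color per point, so each era stands out on the line
    points = ax.scatter(tl_x, [0] * len(tl_x), s=120, c=era_colors, zorder=2)
    for x, year, color in zip(tl_x, tl_years, era_colors):
        ax.text(x, -0.5, year, ha="center", color=color)
    ax.set_xticks([])
    ax.set_yticks([])
    fig.canvas.draw()
```

The same list of colors can be reused for the date labels, as done here, so the text and the points stay visually linked.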
1. Pie chart in a comical way
Let’s see the ratio of Movies to TV Shows. In plotly, this chart can be presented like this.
But to spice things up and make it a little more interesting, we will create a chart like this:
# Count titles per type; naming both columns explicitly works across pandas versions
df_type = df['type'].value_counts().rename_axis('Type').reset_index(name='Count')
with plt.xkcd():
    # Pull the second slice slightly out of the pie
    explode = (0, 0.1)
    fig1, ax1 = plt.subplots(figsize=(5, 5), dpi=100)
    ax1.pie(df_type["Count"], explode=explode, labels=df_type["Type"], autopct='%1.1f%%',
            shadow=True)
    ax1.set_title('Most watched on Netflix')
    plt.show()
As we can see, Movies outnumber TV Shows on Netflix, which suggests that audiences lean toward Movies.
2. Bar chart in a comical way
Let’s look at the distribution of ratings on Netflix to find out which type of content the audience prefers most. The chart below is made in plotly.
Let’s make the bar chart with xkcd.
import numpy as np
# Count titles per rating; naming both columns explicitly works across pandas versions
df_rating = df['rating'].value_counts().rename_axis('rating').reset_index(name='count')
with plt.xkcd():
    fig, ax = plt.subplots(figsize=(10, 4), dpi=100)
    y_pos = np.arange(len(df_rating.rating))
    ax.barh(df_rating.rating, df_rating['count'], align='center')
    ax.set_yticks(y_pos)
    ax.set_yticklabels(df_rating.rating)
    # Put the most frequent rating at the top
    ax.invert_yaxis()
    ax.set_title('Distribution of Ratings')
    plt.show()
Interpreting the visual
Most of the content is rated for mature audiences, which suggests that most Netflix users are adults. The highest number of titles carry the TV-MA and TV-14 rating tags. The least common rating tag is NC-17, which means no one aged 17 or under is admitted.
3. Line chart in a comical way
Let’s see how much of an impact TV Shows have made over the years, so Netflix can judge whether it should produce more of them. It can also plan strategies to get the audience to watch more TV Shows.
d1 = df[df["type"] == "TV Show"]
# year_added comes from the data cleaning step mentioned earlier
col = "year_added"
# Count TV Shows per year; naming both columns explicitly works across pandas versions
vc1 = d1[col].value_counts().rename_axis(col).reset_index(name="count")
vc1['percent'] = 100 * vc1['count'] / vc1['count'].sum()
vc1 = vc1.sort_values(col)
with plt.xkcd():
    fig, ax = plt.subplots(figsize=(10, 6))
    ax.plot(vc1[col], vc1["count"])
    plt.title('TV Shows impact over the years')
    plt.show()
Interpreting the visual
From the chart, TV Shows appear to be in demand from 2017 onward, with the trend shifting after 2020. We can run the same check for Movies by selecting the Movie rows and comparing both types across the years.
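A minimal sketch of that comparison is shown below, with both types drawn on one chart. The rows are made up so the example is self-contained; with the real data you would use the cleaned DataFrame and its type and year_added columns instead. The Agg backend line is only there so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend (assumption: no display available)
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for the cleaned dataset
sample = pd.DataFrame({
    "type": ["Movie", "Movie", "TV Show", "Movie", "TV Show", "TV Show", "Movie"],
    "year_added": [2018, 2019, 2019, 2020, 2020, 2020, 2020],
})

# Rows: year, columns: content type, values: number of titles
per_year = sample.groupby(["year_added", "type"]).size().unstack(fill_value=0)

with plt.xkcd():
    fig, ax = plt.subplots(figsize=(8, 4))
    # One line per content type, sharing the same year axis
    for content_type in per_year.columns:
        ax.plot(per_year.index, per_year[content_type], label=content_type)
    ax.set_title("Movies vs TV Shows added per year")
    ax.legend()
    fig.canvas.draw()
```

Filling missing combinations with 0 keeps both lines defined for every year, so a year with no TV Shows shows up as a dip rather than a gap.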
4. Working with subplots
Let’s see how to work with subplots in xkcd mode in Matplotlib. We will look at the most popular directors, those with the most content, from four countries: India, United States, Canada, and United Kingdom. For easy differentiation, I have given each country a different color.
from collections import Counter
colours = ["#4c78a8", "#e45766", "#72b7b2", "#b279a2"]
countries_list = ["India", "United States", "Canada", "United Kingdom"]
with plt.xkcd():
    plt.figure(figsize=(20, 8))
    # x is the subplot position, starting at 1
    for x, country in enumerate(countries_list, start=1):
        country_df = df[df["country"] == country]
        # Split the comma-separated director names into one flat list
        categories = ", ".join(country_df['director'].fillna("")).split(", ")
        counter_list = Counter(categories).most_common(5)
        # Drop the empty name that stands for missing directors
        counter_list = [(name, count) for name, count in counter_list if name != ""]
        labels = [name for name, _ in counter_list][::-1]
        values = [count for _, count in counter_list][::-1]
        # Integer ticks: step 1 for small counts, step 2 otherwise
        step = 1 if max(values) < 10 else 2
        values_int = range(0, max(values) + 1, step)
        plt.subplot(2, 2, x)
        plt.barh(labels, values, color=colours[x-1])
        plt.xticks(values_int)
        plt.title(country)
    plt.suptitle('Popular Directors with the Highest content')
    plt.tight_layout()
    plt.show()
Interpreting the visual
We have now seen how to make a pie chart, a bar chart, and a line chart, and how to work with subplots, using xkcd in Matplotlib. We can use these charts to represent data in a fun way.
Infographics in Matplotlib
Let’s combine all the charts we made with xkcd into a small comical infographic for the Netflix dataset. I made some changes to the bar and line charts by removing spines (right, top, and bottom for the bar chart; right and top for the line chart) to make them look more presentable.
with plt.xkcd():
    fig = plt.figure(figsize=(15, 8))
    plt.subplots_adjust(wspace=0.35, hspace=0.40)
    # First axes: the top-left bar chart
    ax1 = fig.add_subplot(2, 2, 1)
    ax1.barh(df_rating.rating, df_rating['count'])
    ax1.annotate('TV-MA is highest', xy=(1400, 5), va='center', ha='center', weight='bold', fontsize=15)
    ax1.set_title("Distribution of Ratings")
    ax1.axes.get_xaxis().set_visible(False)
    ax1.spines['right'].set_visible(False)
    ax1.spines['top'].set_visible(False)
    ax1.spines['bottom'].set_visible(False)
    # Second axes: the top-right pie chart
    ax2 = fig.add_subplot(2, 2, 2)
    ax2.pie(df_type["Count"], explode=explode,
            labels=df_type["Type"], autopct='%1.1f%%', shadow=True)
    ax2.set_title('Ratio of Movies vs TV shows')
    # Third axes: the line chart, spanning the third and fourth cells
    ax3 = fig.add_subplot(2, 2, (3, 4))
    # Use the column name directly; col was reassigned in the subplots section
    ax3.plot(vc1["year_added"], vc1["count"])
    ax3.set_title('TV shows over the Years')
    ax3.spines['right'].set_visible(False)
    ax3.spines['top'].set_visible(False)
    plt.tight_layout()
    plt.show()
Interpreting the infographics
From the infographic, we can see that Movies are the preferred type of content and that most titles are rated TV-MA. The number of TV Shows fluctuates over the years. We can add more images or charts to describe the use case.
Where to use xkcd type of charts
We normally use charts in our analysis or in presentations, and comical charts can turn them into a fun way of telling a story.
End Note
We saw how to make boring charts interesting with the help of xkcd. We can add these types of charts to meetings or presentations, and they are fun to create. I hope you liked the journey from boring charts to comical visuals.