This notebook was created by Jean de Dieu Nyandwi for the love of the machine learning community. For feedback, error reports, or suggestions, he can be reached by email (johnjw7084 at gmail dot com), Twitter, or LinkedIn.
Data Visualizations with Matplotlib¶
To gain insight into the problem you are solving, it is important to visualize the dataset you are working with.
Matplotlib is a powerful visualization tool in Python. It can be used to create 2D and 3D figures. Seaborn, which we will see later, is built on Matplotlib.
This is what you will learn:
If you would like more inspiration before continuing, check out the Gallery page of Matplotlib. The plots there are fascinating.
import matplotlib.pyplot as plt
import numpy as np
# In a notebook, this magic command displays plots inline
%matplotlib inline
# In other editors or in scripts (.py), call plt.show() instead
A Simple Starter plot¶
x_data = [1,2,3,4,5,6,7,8,9]
y_data = np.exp(x_data)
plt.plot(x_data, y_data)
plt.title('This is a title')
plt.xlabel('X_data, normally X axis Title')
plt.ylabel('Y_data, normally Y axis Title')
Text(0, 0.5, 'Y_data, normally Y axis Title')
x_data and y_data can be lists, NumPy arrays, or Pandas dataframes. In later parts, you will work with real-world datasets where you can plot a feature taken from a dataframe.
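As a small illustration of the point above, the sketch below plots the same values once from NumPy arrays and once from the columns of a Pandas dataframe. The dataframe and its column names (`x`, `y`) are made up for this example.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical data, used only for illustration
x_values = np.arange(1, 10)
frame = pd.DataFrame({'x': x_values, 'y': np.exp(x_values)})

# NumPy arrays and dataframe columns are passed to plt.plot in the same way
line_from_array, = plt.plot(x_values, np.exp(x_values), label='NumPy arrays')
line_from_frame, = plt.plot(frame['x'], frame['y'], ls='--', label='Dataframe columns')
plt.legend()
```

Both lines lie on top of each other, since they are drawn from the same numbers.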
Below is the anatomy of a figure. It is taken from matplotlib.org and was built with Matplotlib.
Subplots¶
plt.subplot displays multiple plots on the same figure.
# plt.subplot(number of rows, number of columns, plot number)
# Create a figure with three plots side by side
plt.subplot(1,3,1)
plt.plot(x_data, y_data)
plt.subplot(1,3,2)
plt.plot(x_data, np.sin(y_data))
plt.subplot(1,3,3)
plt.plot(x_data, np.cos(y_data))
[<matplotlib.lines.Line2D at 0x7fc604eb3b50>]
Figure Size¶
You can use plt.figure(figsize=(width, height)) to adjust the figure to your desired size. The width and height are given in inches.
plt.figure(figsize=(10,8))
plt.plot(x_data, y_data)
[<matplotlib.lines.Line2D at 0x7fc5f1c87750>]
You can also specify the dots per inch (DPI).
plt.figure(figsize=(8,5), dpi=100)
plt.plot(x_data, y_data)
[<matplotlib.lines.Line2D at 0x7fc6234d92d0>]
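To see how figsize and dpi work together, here is a minimal sketch that computes the size of a figure in pixels as inches multiplied by dots per inch. It assumes the backend does not rescale the DPI, which some interactive backends may do on high-resolution displays.

```python
import matplotlib.pyplot as plt

fig = plt.figure(figsize=(8, 5), dpi=100)

# Size in pixels = size in inches * dots per inch
width_in, height_in = fig.get_size_inches()
width_px = width_in * fig.dpi
height_px = height_in * fig.dpi
print(width_px, height_px)
```

Raising dpi while keeping figsize fixed gives a sharper image with the same proportions.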
figsize=(width, height) can also be passed to plt.subplots, which creates a figure and its axes in one call.
plt.subplots(figsize=(10,5))
Saving Figure¶
We can save Matplotlib figures in different formats, such as PNG, JPG, or PDF.
figure = plt.figure(figsize=(8,5))
# Draw something first; otherwise an empty figure is saved
plt.plot(x_data, y_data)
figure.savefig('Savedfigure.png')
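The following sketch saves one figure in two formats. The file names and the temporary folder are choices made for this example only; `dpi` and `bbox_inches='tight'` are optional arguments of `savefig` that control the resolution and trim the surrounding white space.

```python
import os
import tempfile

import matplotlib.pyplot as plt
import numpy as np

x_values = np.arange(1, 10)
fig = plt.figure(figsize=(8, 5))
plt.plot(x_values, np.exp(x_values))

# Save into a temporary folder so the example does not clutter the working directory
output_dir = tempfile.mkdtemp()
png_path = os.path.join(output_dir, 'saved_figure.png')
pdf_path = os.path.join(output_dir, 'saved_figure.pdf')

# The output format is inferred from the file extension
fig.savefig(png_path, dpi=150, bbox_inches='tight')
fig.savefig(pdf_path, bbox_inches='tight')
```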
Figure Title, Axis Labels and Legends¶
# A title can be added to a figure with plt.title('.....')
x = np.linspace(0, 50, 100)
y1 = np.sin(x)
y2 = np.cos(x)
y3 = np.tan(x)
plt.plot(x, y1)
plt.title('This is a title')
Text(0.5, 1.0, 'This is a title')
# X and Y labels can be added to a figure with plt.xlabel('...') and plt.ylabel('...')
plt.plot(x, y2)
plt.xlabel('This is X Label')
plt.ylabel('This is Y Label')
Text(0, 0.5, 'This is Y Label')
# A legend is added to a figure with plt.legend()
plt.plot(x, y1, label = 'Y1')
plt.plot(x, y2, label = 'Y2')
plt.xlabel('This is X Label')
plt.ylabel('This is Y Label')
plt.legend(loc='lower left')
<matplotlib.legend.Legend at 0x7fc604fde4d0>
Line Colors and Styles¶
You can set the color of a line plot with the color parameter, which accepts color names, single-letter color codes (rgbcmyk), hex codes, and grayscale levels.
Color codes (RGBCYMK):
'b'....blue
'g'....green
'r'....red
'c'....cyan
'm'....magenta
'y'....yellow
'k'....black
Hex codes:
Example: '#00FF00' is green. You can look up the hex code of any color online.
plt.figure(figsize=(10,8))
plt.plot(x, y1, color='red') #red
plt.plot(x, y2, color='y') # yellow
plt.plot(x, y2*2, color='#00FF00') # green
plt.plot(x, y2*3, color='0.8') #grayscale
[<matplotlib.lines.Line2D at 0x7fc6050254d0>]
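The different ways of writing a color all end up as the same kind of value. As a rough sketch, the helper functions in `matplotlib.colors` can convert between them, which is handy for checking what a color code means.

```python
from matplotlib import colors

# Single-letter code converted to a hex code
red_hex = colors.to_hex('r')

# Hex code and grayscale level converted to (red, green, blue, alpha) tuples
green_rgba = colors.to_rgba('#00FF00')
gray_rgba = colors.to_rgba('0.8')

print(red_hex, green_rgba, gray_rgba)
```

A grayscale level is a string between '0' (black) and '1' (white), so '0.8' is a light gray.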
Line styles and markers¶
Line styles
'-' ....solid line style
'--'....dashed line style
'-.'....dash-dot line style
':'....dotted line style
Example format strings
'b'...# blue markers with default shape
'or'...# red circles
'-g'....# green solid line
'--'...# dashed line with default color
'^k:'...# black triangle_up markers connected by a dotted line
Line Markers
'.'...point marker
','...pixel marker
'o'...circle marker
'v'...triangle_down marker
'^'...triangle_up marker
'<'....triangle_left marker
'>'....triangle_right marker
'1'....tri_down marker
'2'....tri_up marker
'3'....tri_left marker
'4'....tri_right marker
's'....square marker
'p'....pentagon marker
'*'....star marker
'h'....hexagon1 marker
'H'....hexagon2 marker
'+'....plus marker
'x'....x marker
'D'....diamond marker
'd'....thin_diamond marker
'|'....vline marker
'_'....hline marker
## Line Styles
## I will also add line width(lw) and marker size
plt.figure(figsize=(10,8))
plt.plot(x, y1, color='red', ls='solid', marker='+', lw=3, markersize=3)
plt.plot(x, y2, color='b', ls='dotted', marker='D', lw=4, markersize=4)
plt.plot(x, y2*2, color='#00FF00', ls='dashed', marker='*', lw=5, markersize=5)
plt.plot(x, y2*3, color='0.8', ls=':', marker='s', lw=6, markersize=6)
[<matplotlib.lines.Line2D at 0x7fc6051a0290>]
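The format strings listed earlier combine marker, line style, and color in one short argument. Here is a minimal sketch that uses three of them; the data is arbitrary and only serves to draw something.

```python
import matplotlib.pyplot as plt
import numpy as np

x_values = np.linspace(0, 10, 20)

plt.figure(figsize=(10, 8))
# 'or': red circles, no connecting line
red_circles, = plt.plot(x_values, np.sin(x_values), 'or')
# '-g': green solid line
green_line, = plt.plot(x_values, np.cos(x_values), '-g')
# '^k:': black triangle_up markers connected by a dotted line
black_triangles, = plt.plot(x_values, 2 * np.sin(x_values), '^k:')
```

Format strings are convenient for quick plots, while keyword arguments such as color, ls, and marker are easier to read when many properties are set.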
Text and Annotation¶
Matplotlib lets you add text and annotations to a visualization.
x = [1,2,3,4,5,6,7,8]
y = np.square(x)
# Plot x and y
plt.plot(x, y)
# Add the text to the plot
# plt.text(x position, y position, string, color or style)
plt.text(2,60, 'This texts is added at the plot where x=2, y=60', color='r')
Text(2, 60, 'This texts is added at the plot where x=2, y=60')
# Text Annotation
plt.plot(x, y)
plt.annotate('This is the Minimum point',
             xy=(1,1),
             xytext=(1.5, 30), arrowprops=dict(facecolor='red'))
plt.annotate('This is the Maximum point', # string to annotate
             xy=(8,64), # the point (x, y) being annotated
             xytext=(1, 55), # the (x, y) position of the text
             bbox=dict(boxstyle='round'), # draw a rounded box around the text
             arrowprops=dict(arrowstyle='fancy', connectionstyle='angle3, angleA=0, angleB=-90'))
plt.annotate('Middle, estimated', # string to annotate
             xy=(4.5,22), # the point (x, y) being annotated
             xytext=(5, 12), # the (x, y) position of the text
             bbox=dict(boxstyle='round', fc='none'), # rounded box with no fill color
             arrowprops=dict(arrowstyle='wedge, tail_width=0.5', alpha=0.3))
Text(5, 12, 'Middle, estimated')
There are a lot of interesting things you can do with text and annotations using plt.text() and plt.annotate(). Learn more in the Matplotlib documentation.
If you have ever used MATLAB (short for Matrix Laboratory), the way we have created plots and subplots so far will look familiar, because that is how it is done in MATLAB. This style is called the MATLAB-style interface.
How do we create customized plots?
Object Oriented Method¶
If you want to create and customize figures, the object-oriented method is for you.
Below are examples.
# Create empty figure
fig = plt.figure()
# Create axes
ax = plt.axes()
# Plot on that axes
ax.plot(x, y, 'g') # green color
# Add the title
ax.set_title('This is a title')
# Add X and Y label
ax.set_xlabel('This is X Label')
ax.set_ylabel('This is Y Label')
Text(0, 0.5, 'This is Y Label')
The same approach works for subplots.
# Create the figure and its axes in one call
fig, axes = plt.subplots(nrows=1, ncols=2)
# Plot on that axes
axes[0].plot(x, y, 'g') # green color
axes[1].plot(x, y, 'b') # blue color
# Add the title
axes[0].set_title('This is a title 1')
axes[1].set_title('This is a title 2')
Text(0.5, 1.0, 'This is a title 2')
You can also iterate through the axes of subplots. This is very helpful if you have many rows and columns.
When using axes and subplots, the elements of the figure can overlap. You can use plt.tight_layout() to correct that automatically.
fig, axes = plt.subplots(nrows=1, ncols=3)
for ax in axes:
    ax.plot(x, y, 'r')
    ax.set_xlabel('This is Xlabel')
    ax.set_ylabel('This is Y Label')
    ax.set_title('This is a Title')
# Show the figure
fig
# Use plt.tight_layout to adjust the positions of the axes and labels automatically
# Comment it out and see how bad the figure is
plt.tight_layout()
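With several rows and columns, plt.subplots returns a 2D NumPy array of axes. One possible way to loop over all of them is `axes.flat`, sketched below on a 2x2 grid. The plotted powers of x are arbitrary example data.

```python
import matplotlib.pyplot as plt
import numpy as np

x_values = np.arange(1, 9)
powers = [1, 2, 3, 4]

fig, axes = plt.subplots(nrows=2, ncols=2, figsize=(8, 6))
# axes.flat visits every axes of the 2D array in row-major order
for ax, power in zip(axes.flat, powers):
    ax.plot(x_values, x_values ** power, 'r')
    ax.set_title(f'x to the power {power}')
    ax.set_xlabel('x')

# Same effect as plt.tight_layout(), called on the figure object
fig.tight_layout()
```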
Many functions are the same in the MATLAB-style interface and the object-oriented interface, except for things like setting labels and titles, which we already saw above.
To conclude the object-oriented interface, you can use ax.set() to set several plot properties at once, such as xlim, ylim, xlabel, ylabel, and title.
# Create an axes
# End the last line with ; to suppress the text output of the cell
ax = plt.axes()
ax.plot(x, y, 'g')
ax.set(title='This is a Title',
       xlabel='This is xlabel',
       ylabel='This is ylabel',
       ylim=(-10,10), # This is the y limit, also set by ax.set_ylim()
       xlim=(0,25)); # This is the x limit, also set by ax.set_xlim()
Colors, Color Maps and Style¶
2. More Types of Plots¶
Scatter Plot¶
Scatter plots display the relationship between two variables using dots (the default marker; other marker styles can be used).
plt.scatter(y1, y2, marker='s')
<matplotlib.collections.PathCollection at 0x7fc6239d0a90>
plt.scatter(y, x, marker='>')
<matplotlib.collections.PathCollection at 0x7fc6052b5cd0>
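Beyond the marker shape, plt.scatter can also vary the color and size of the dots. The sketch below uses synthetic data (random numbers generated only for this example) and colors each dot by its y value.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic data for illustration: y roughly follows x, plus noise
rng = np.random.default_rng(seed=0)
x_points = rng.uniform(0, 10, size=50)
y_points = 2 * x_points + rng.normal(0, 2, size=50)

# c sets the value mapped to a color, s sets the marker size in points squared
points = plt.scatter(x_points, y_points, c=y_points, s=40, cmap='viridis', marker='o')
plt.xlabel('x')
plt.ylabel('y')
```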
Bar Plot¶
Bar plots are used to display the relationship between two variables, one numeric and the other categorical.
quantity = np.array([30,30,30,10])
fruits = ['Apple', 'Orange','Lemon', 'Pineapple']
plt.bar(fruits, quantity)
<BarContainer object of 4 artists>
Pie Charts¶
Pie charts show the proportion of each element of a feature in circular form.
plt.pie(quantity, labels=fruits);
quantity = np.array([30,30,30,10])
fruits= ['Apple', 'Orange','Lemon', 'Pineapple']
plt.pie(quantity, labels=fruits, shadow=True);
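Since a pie chart is about proportions, it often helps to print the percentage on each slice. A minimal sketch using the `autopct` argument of pie, with the same fruit data as above:

```python
import matplotlib.pyplot as plt
import numpy as np

quantity = np.array([30, 30, 30, 10])
fruits = ['Apple', 'Orange', 'Lemon', 'Pineapple']

fig, ax = plt.subplots()
# autopct is a format string applied to each slice's share of the total
wedges, label_texts, percent_texts = ax.pie(quantity, labels=fruits, autopct='%1.1f%%')
percentages = [text.get_text() for text in percent_texts]
print(percentages)
```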
Histograms¶
Histograms are used to show the frequency distribution of the data values.
fruits= ['Apple', 'Orange','Lemon', 'Pineapple','Orange', 'Pineapple']
plt.hist(fruits);
data = np.random.randn(100)
plt.hist(data);
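The number of bars in a histogram is controlled by the `bins` argument, and plt.hist also returns the counts it computed. The sketch below uses random data generated for this example to show both.

```python
import matplotlib.pyplot as plt
import numpy as np

# Random sample for illustration, with a fixed seed so the plot is reproducible
rng = np.random.default_rng(seed=0)
data = rng.standard_normal(1000)

# counts holds the number of values in each bin; bin_edges holds the bin boundaries
counts, bin_edges, patches = plt.hist(data, bins=20)
plt.xlabel('Value')
plt.ylabel('Frequency')
```

With 20 bins there are 21 bin edges, and the counts add up to the number of data points.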
That's it for the basic plots in Matplotlib. To learn more about 3D plotting and other advanced plots, check out the official documentation.
3. Matplotlib for Image Visualization¶
Using Matplotlib image functions, we can visualize images.
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
We will use a sample image from Scikit-Learn, a machine learning framework.
from PIL import Image
from sklearn.datasets import load_sample_image
# Loading sample image from sklearn images
image = load_sample_image('flower.jpg')
image_2 =load_sample_image('china.jpg')
# Plotting image with plt.imshow
img_plot = plt.imshow(image)
Enhancing the Image¶
The image we have has three color channels: RGB (Red, Green, Blue). Let's select a single channel using NumPy slicing.
img = image[:, :,0]
plt.imshow(img)
<matplotlib.image.AxesImage at 0x7fc60555b110>
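To make the slicing step concrete, here is a small sketch on a synthetic image (a tiny array built for this example, not the flower photo). An RGB image is an array of shape (height, width, 3), and slicing the last axis returns one channel as a 2D array.

```python
import matplotlib.pyplot as plt
import numpy as np

# Synthetic 4x6 RGB image: pure red on the left half, pure blue on the right half
rgb = np.zeros((4, 6, 3), dtype=np.uint8)
rgb[:, :3, 0] = 255
rgb[:, 3:, 2] = 255

# Index 0 is the red channel, 1 is green, 2 is blue
red_channel = rgb[:, :, 0]
blue_channel = rgb[:, :, 2]
print(rgb.shape, red_channel.shape)

fig, axes = plt.subplots(nrows=1, ncols=3)
axes[0].imshow(rgb)
axes[1].imshow(red_channel, cmap='gray')
axes[2].imshow(blue_channel, cmap='gray')
```

A single channel carries no color information of its own, so imshow colors it with a color map. That is why a sliced channel shown without a cmap does not look like the original photo.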
plt.imshow(img, cmap='ocean')
<matplotlib.image.AxesImage at 0x7fc6054dbad0>
cmap stands for color map. There are a lot of color maps to choose from.
plt.imshow(img, cmap='hot')
<matplotlib.image.AxesImage at 0x7fc6054d0610>
plt.imshow(img, cmap='Reds')
<matplotlib.image.AxesImage at 0x7fc60557fa10>
For more color maps, see the list in the Matplotlib documentation.
Image Color bar¶
img_plot = plt.imshow(img, cmap='Oranges')
plt.colorbar()
<matplotlib.colorbar.Colorbar at 0x7fc612e94710>
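The color bar can also be created with the object-oriented interface, which makes it easy to attach a label and fix the value range. This is a sketch on a made-up 10x10 array; the label text and the `vmin`/`vmax` values are choices for this example.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up 10x10 array with values from 0 to 99
values = np.arange(100).reshape(10, 10)

fig, ax = plt.subplots()
# vmin and vmax fix which values map to the two ends of the color map
image_artist = ax.imshow(values, cmap='Oranges', vmin=0, vmax=99)
colorbar = fig.colorbar(image_artist, ax=ax, label='Pixel value')
```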
This is the end of the Matplotlib notebook!