Data visualization helps you make sense of your data through maps or graphs. This tutorial shows how to draw and format graphs in Jupyter Notebook.
Prerequisites
You need to have Jupyter installed on your machine. If it isn’t, you can install it from the command line, for example with pip: pip install jupyter
You’ll also need the pandas and matplotlib libraries, which can be installed the same way, for example: pip install pandas matplotlib
After the installations are complete, start the Jupyter Notebook server by typing jupyter notebook in your terminal. A Jupyter page listing the files in the current directory will open in your computer’s default browser.
Note: Do not close the terminal window that you run this command in. Your server will stop should you do so.
Simple Plot
In a new Jupyter page, run this code:
The code is for a simple line plot. The first line imports the pyplot module from the matplotlib library. The third and fourth lines define the values for the x and y axes respectively.
The plot() function is then called to draw the graph, and the show() function displays it.
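A minimal sketch of such a line plot might look like the following. The sample values are assumptions chosen purely for illustration, and the layout follows the description above (import on the first line, axis values on the third and fourth).

```python
import matplotlib.pyplot as plt  # pyplot is matplotlib's plotting interface

x = [1, 2, 3, 4, 5]       # values for the x-axis (illustrative)
y = [2, 4, 6, 8, 10]      # values for the y-axis, one per x value

lines = plt.plot(x, y)    # draw the line; returns a list of Line2D objects
plt.show()                # display the figure
```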
Suppose you wish to draw a curve instead. The process is the same. Just change the values in the Python list for the y-axis.
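As a rough illustration, the sketch below swaps in y values that grow non-linearly (squares of x, an assumed example). Note that matplotlib still joins the points with straight segments, so the more points you supply, the smoother the curve looks.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [value ** 2 for value in x]  # 1, 4, 9, 16, 25: non-linear growth

curve = plt.plot(x, y)  # same call as before; only the y values changed
plt.show()
```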
Notice something important: in both graphs, there’s no explicit scale definition. The scale is automatically calculated and applied by matplotlib. This is one of the many conveniences of plotting in Jupyter, letting you focus on your work (data analysis) instead of worrying about code.
If you look closely, you may also observe that the x and y axes have the same number of values. If one list is shorter than the other, matplotlib raises an error when you run the code and the data is not plotted.
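The length mismatch described above can be reproduced with a short sketch. The lists are made-up sample data, and the try/except is only there so the error can be inspected instead of stopping the cell.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4]
y = [10, 20, 30]  # deliberately one value short

error_message = None
try:
    plt.plot(x, y)  # x and y must have the same number of values
except ValueError as err:
    # matplotlib reports that the two sequences have different lengths
    error_message = str(err)
    print("Plot failed:", error_message)
```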
Types Available
Unlike the line graph and curve above, other graph types (e.g., a histogram or a bar chart) need to be requested explicitly in order to be shown.
Bar Graph
To show a bar plot you will need to use the bar() method.
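A minimal sketch of a bar plot, assuming illustrative category positions and heights:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]   # bar positions (illustrative)
y = [3, 7, 2, 5, 8]   # bar heights, one per position

bars = plt.bar(x, y)  # returns a container holding one rectangle per bar
plt.show()
```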
Scatter Plot
To draw a scatter plot, all you need to do is replace bar() with scatter() in the previous code.
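For instance, a sketch using the same kind of made-up sample values as a bar plot would, with only the plotting call changed:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [3, 7, 2, 5, 8]

points = plt.scatter(x, y)  # draws one marker per (x, y) pair
plt.show()
```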
Pie Chart
A pie plot is a bit different from the plots above. Line 4 is of particular interest, so take a look at the features there.
figsize sets the width and height of the figure in inches, and therefore its aspect ratio. You can set this to anything you like (e.g., (9,5)), but the official pandas docs note that pie plots are best drawn on square figures, that is, with an aspect ratio of 1.
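A small sketch of a pie plot drawn through pandas follows. The Series values and its name are assumptions for illustration, and the figsize argument sits on line 4, where a square (6, 6) figure keeps the pie circular.

```python
import pandas as pd
import matplotlib.pyplot as plt
sizes = pd.Series([40, 30, 20, 10], name="share")  # illustrative slice sizes
ax = sizes.plot.pie(figsize=(6, 6))                # square figure: aspect ratio of 1
plt.show()
```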
There are some parameters the pie chart has that are noteworthy:
labels - This can be used to give a label to each slice in the pie chart.
colors - This can be used to give predefined colors to each of the slices. You can specify colors either by name (e.g., “yellow”) or as a hex code (e.g., “#ebc713”).
See the example below:
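One way this could look is sketched here. The slice sizes, label names, and color choices are all hypothetical; the colors mix named and hex forms to show that both are accepted.

```python
import pandas as pd
import matplotlib.pyplot as plt

sizes = pd.Series([40, 30, 20, 10], name="share")  # illustrative slice sizes

ax = sizes.plot.pie(
    figsize=(6, 6),
    labels=["North", "South", "East", "West"],      # one label per slice
    colors=["yellow", "#ebc713", "green", "#1f77b4"],  # names and hex codes
)
plt.show()
```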
There are also other plot types, such as hist, area, and kde, that you can read more about in the pandas documentation.
Plot Formatting
The plots above have no title or axis labels. Here’s how to add them.
To add a title, include the code below in your Jupyter Notebook:
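A minimal sketch using pyplot's title() function, with placeholder data and a hypothetical title text:

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.plot(x, y)
title = plt.title("Sample Line Plot")  # hypothetical title; returns a Text object
plt.show()
```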
The x and y axes can be labelled with the xlabel() and ylabel() functions respectively:
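As a sketch, pyplot's xlabel() and ylabel() functions set the axis labels. The label texts and data below are placeholders.

```python
import matplotlib.pyplot as plt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]

plt.plot(x, y)
x_label = plt.xlabel("Day")           # hypothetical x-axis label
y_label = plt.ylabel("Items sold")    # hypothetical y-axis label
plt.show()
```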
Learning More
You can run Python’s built-in help() command in your notebook to get interactive assistance. To get more information on a particular object, you can use help(object).
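For example, passing a plotting function to help() prints its documentation in the cell output. The choice of plt.scatter here is arbitrary.

```python
import matplotlib.pyplot as plt

# Print the documentation for pyplot's scatter() function
help(plt.scatter)
```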
It’s also good practice to try drawing graphs using datasets from CSV files. Data visualization is a powerful way to analyze and communicate your findings, so it’s worth taking some time to build your skill.
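As a starting point, the sketch below reads CSV data with pandas and plots it as a bar chart. To stay self-contained it reads from an in-memory string; the column names and figures are invented, and with a real file you would pass its path to read_csv() instead.

```python
import io

import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical CSV content standing in for a file on disk
csv_text = io.StringIO(
    "month,sales\n"
    "Jan,120\n"
    "Feb,95\n"
    "Mar,140\n"
    "Apr,110\n"
)

df = pd.read_csv(csv_text)                        # load the data into a DataFrame
ax = df.plot(x="month", y="sales", kind="bar")    # one bar per row
plt.show()
```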