10 b] Apply code for scatter plot on animal statistics using matplotlib.
Let’s visualize the correlation between body mass and maximum longevity across several animal classes with the help of a scatter plot.
1. Open the Exercise3.03.ipynb Jupyter Notebook from the Chapter03 folder to implement this exercise. Navigate to the path of this file and type the following at the command-line terminal: jupyter-lab.
2. Import the necessary modules and enable plotting within the Jupyter Notebook:
# Import statements
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
3. Use pandas to read the data located in the Datasets folder:
# Load dataset
data = pd.read_csv('../../Datasets/anage_data.csv')
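Before filtering, it can help to see how many values are actually missing. The following is a minimal sketch, not part of the exercise: it reads a hypothetical three-row CSV from a string (the rows and the names csv_text, sample, and missing_per_column are made up for illustration) and counts the missing entries per column.

```python
import io
import pandas as pd

# Hypothetical three-row CSV standing in for anage_data.csv (values are made up)
csv_text = """Class,Maximum longevity (yrs),Body mass (g)
Aves,12.0,50.0
Mammalia,,4200.0
Reptilia,8.5,
"""

# Empty fields are parsed as NaN by read_csv
sample = pd.read_csv(io.StringIO(csv_text))

# Count missing values in each column
missing_per_column = sample.isna().sum()
print(missing_per_column)
```

Running the same isna().sum() call on the real data frame would show which columns are incomplete in the actual dataset.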
4. The given dataset is not complete. Filter the data so that you end up with samples containing both a body mass and a maximum longevity, then split the data by animal class. Here, np.isfinite() checks element-wise whether each value is finite, so rows with a missing (NaN) value in either column are dropped:
# Preprocessing
longevity = 'Maximum longevity (yrs)'
mass = 'Body mass (g)'
data = data[np.isfinite(data[longevity]) & np.isfinite(data[mass])]
# Split data according to class
amphibia = data[data['Class'] == 'Amphibia']
aves = data[data['Class'] == 'Aves']
mammalia = data[data['Class'] == 'Mammalia']
reptilia = data[data['Class'] == 'Reptilia']
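To see what the filter does, here is a small sketch on toy data. The data frame toy and its values are hypothetical; only the column names match the exercise. It also shows groupby as one possible alternative to writing a separate boolean selection per class.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for anage_data.csv (values are made up)
toy = pd.DataFrame({
    'Class': ['Aves', 'Aves', 'Mammalia', 'Reptilia'],
    'Maximum longevity (yrs)': [12.0, np.nan, 30.0, 8.5],
    'Body mass (g)': [50.0, 20.0, np.nan, 300.0],
})

longevity = 'Maximum longevity (yrs)'
mass = 'Body mass (g)'

# np.isfinite is False for NaN and for +/-inf, so the combined mask
# keeps only rows where both columns hold a usable number
mask = np.isfinite(toy[longevity]) & np.isfinite(toy[mass])
clean = toy[mask]

# One sub-DataFrame per class that still has rows after filtering
by_class = {name: group for name, group in clean.groupby('Class')}
print(clean)
```

Only the first and last rows survive here, because each of the other two has a NaN in one of the two columns.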
5. Create a scatter plot visualizing the correlation between the body mass and the maximum longevity. Use different colors to group data samples according to their class. Add a legend, labels, and a title. Use a log scale for both the x-axis and y-axis:
# Create figure
plt.figure(figsize=(10, 6), dpi=300)
# Create scatter plot
plt.scatter(amphibia[mass], amphibia[longevity], label='Amphibia')
plt.scatter(aves[mass], aves[longevity], label='Aves')
plt.scatter(mammalia[mass], mammalia[longevity], label='Mammalia')
plt.scatter(reptilia[mass], reptilia[longevity], label='Reptilia')
# Add legend
plt.legend()
# Log scale
ax = plt.gca()
ax.set_xscale('log')
ax.set_yscale('log')
# Add labels and a title (the title wording is a suggestion)
plt.xlabel('Body mass in grams')
plt.ylabel('Maximum longevity in years')
plt.title('Body mass vs. maximum longevity')
# Show plot
plt.show()
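The four nearly identical plt.scatter() calls could also be written as a loop. The sketch below is one possible refactoring, not the exercise's own solution: scatter_by_class is a hypothetical helper, and plain dictionaries with made-up values stand in for the per-class data frames so the example runs without the dataset.

```python
import matplotlib.pyplot as plt

def scatter_by_class(groups, x_col, y_col):
    """Draw one scatter series per class on log-log axes and return the Axes."""
    fig, ax = plt.subplots(figsize=(10, 6))
    for name, frame in groups.items():
        # Each call uses the next color in the default color cycle
        ax.scatter(frame[x_col], frame[y_col], label=name)
    ax.set_xscale('log')
    ax.set_yscale('log')
    ax.set_xlabel('Body mass in grams')
    ax.set_ylabel('Maximum longevity in years')
    ax.legend()
    return ax

# Hypothetical values, only to exercise the function
groups = {
    'Aves': {'Body mass (g)': [20.0, 1500.0],
             'Maximum longevity (yrs)': [9.0, 30.0]},
    'Mammalia': {'Body mass (g)': [25.0, 70000.0],
                 'Maximum longevity (yrs)': [4.0, 60.0]},
}
ax = scatter_by_class(groups, 'Body mass (g)', 'Maximum longevity (yrs)')
```

With the exercise's data, groups would instead map each class name to its filtered data frame, for example {'Amphibia': amphibia, 'Aves': aves, 'Mammalia': mammalia, 'Reptilia': reptilia}, followed by plt.show().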
