10 b] Write code to create a scatter plot of animal statistics using matplotlib.
Let’s use a scatter plot to visualize the correlation between body mass and maximum longevity for various animals.
- Open the Exercise3.03.ipynb Jupyter Notebook from the Chapter03 folder to implement this exercise. Navigate to the path of this file and type the following at the command-line terminal: jupyter-lab.
- Import the necessary modules and enable plotting within the Jupyter Notebook:
    # Import statements
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    %matplotlib inline
- Use pandas to read the data located in the Datasets folder:
    # Load dataset
    data = pd.read_csv('../../Datasets/anage_data.csv')
- The given dataset is incomplete. Filter the data so that only samples with both a body mass and a maximum longevity remain, then split the samples by animal class. Here, np.isfinite() checks element-wise whether each value is finite (that is, neither NaN nor infinite):
    # Preprocessing: keep only rows with a finite longevity and body mass
    longevity = 'Maximum longevity (yrs)'
    mass = 'Body mass (g)'
    data = data[np.isfinite(data[longevity]) & np.isfinite(data[mass])]
    # Split the data by class
    amphibia = data[data['Class'] == 'Amphibia']
    aves = data[data['Class'] == 'Aves']
    mammalia = data[data['Class'] == 'Mammalia']
    reptilia = data[data['Class'] == 'Reptilia']
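To see what the filtering step does, here is a minimal sketch on a small made-up table. The values and the names toy, mask, filtered, and toy_aves are assumptions for illustration and do not come from anage_data.csv; only the two column names match the exercise.

```python
import numpy as np
import pandas as pd

# Toy data, made up for illustration (not taken from anage_data.csv)
toy = pd.DataFrame({
    'Class': ['Aves', 'Aves', 'Mammalia', 'Reptilia'],
    'Body mass (g)': [20.0, np.nan, 5000.0, 150.0],
    'Maximum longevity (yrs)': [10.0, 15.0, np.nan, 30.0],
})

# np.isfinite returns False for NaN and for +/-inf, so the combined
# mask is True only for rows where both measurements are present
mask = (np.isfinite(toy['Body mass (g)'])
        & np.isfinite(toy['Maximum longevity (yrs)']))
filtered = toy[mask]

# Select a single class with a boolean comparison, as in the exercise
toy_aves = filtered[filtered['Class'] == 'Aves']

print(len(toy), len(filtered), len(toy_aves))  # prints: 4 2 1
```

Two of the four rows have a missing value and are dropped; of the two that remain, one belongs to the Aves class.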
- Create a scatter plot visualizing the correlation between the body mass and the maximum longevity. Use different colors to group data samples according to their class. Add a legend, labels, and a title. Use a log scale for both the x-axis and y-axis:
    # Create figure
    plt.figure(figsize=(10, 6), dpi=300)
    # Create scatter plot, one series per animal class
    plt.scatter(amphibia[mass], amphibia[longevity], label='Amphibia')
    plt.scatter(aves[mass], aves[longevity], label='Aves')
    plt.scatter(mammalia[mass], mammalia[longevity], label='Mammalia')
    plt.scatter(reptilia[mass], reptilia[longevity], label='Reptilia')
    # Add legend
    plt.legend()
    # Log scale on both axes
    ax = plt.gca()
    ax.set_xscale('log')
    ax.set_yscale('log')
    # Add labels
    plt.xlabel('Body mass in grams')
    plt.ylabel('Maximum longevity in years')
    # Add title (the title text here is a suggested wording)
    plt.title('Body mass vs. maximum longevity by animal class')
    # Show plot
    plt.show()
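The four near-identical scatter calls can also be written as a loop over the classes. The following is a sketch of that alternative, not the exercise's own solution: the helper name scatter_by_group and the sample values are hypothetical, and it uses matplotlib's object-oriented interface (plt.subplots) together with pandas groupby.

```python
import pandas as pd
import matplotlib.pyplot as plt


def scatter_by_group(df, x_col, y_col, group_col):
    """Draw one scatter series per group on log-log axes (hypothetical helper)."""
    fig, ax = plt.subplots(figsize=(10, 6))
    # groupby yields (group name, sub-DataFrame) pairs, sorted by group name
    for name, group in df.groupby(group_col):
        ax.scatter(group[x_col], group[y_col], label=name)
    ax.set_xscale('log')
    ax.set_yscale('log')
    ax.set_xlabel(x_col)
    ax.set_ylabel(y_col)
    ax.legend()
    return fig, ax


# Made-up sample values, for illustration only
sample = pd.DataFrame({
    'Class': ['Aves', 'Aves', 'Mammalia', 'Mammalia', 'Reptilia'],
    'Body mass (g)': [20.0, 1200.0, 35.0, 70000.0, 150.0],
    'Maximum longevity (yrs)': [10.0, 40.0, 4.0, 60.0, 30.0],
})

fig, ax = scatter_by_group(sample, 'Body mass (g)',
                           'Maximum longevity (yrs)', 'Class')
```

In a notebook with %matplotlib inline enabled, the figure is displayed when the cell finishes; in a plain script, call plt.show() afterwards. With the exercise's filtered data, the call would presumably be scatter_by_group(data, mass, longevity, 'Class'), which plots every class present in the file rather than only the four selected above.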