Introduction
Altair simplifies the process of turning data into beautiful, interactive charts. It's built on top of the Vega and Vega-Lite visualization grammars and offers a simple, intuitive, and consistent way to build a wide range of statistical visualizations. We'll use a dataset of cars to demonstrate two types of visualizations: a Bar Chart and a Scatter Plot. First, we'll compare the average horsepower of cars from different origins using a Bar Chart, which shows which region's cars are more powerful on average. Next, we'll use a Scatter Plot to examine how a car's weight relates to its miles per gallon, giving a view of fuel efficiency across car models.
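General Steps to Use the Altair Package in Google Colab
Step 1. Install Altair and altair_viewer (older Altair versions rely on altair_viewer for chart.show()) using the command below
!pip install altair altair_viewer
Step 2. Install the vega_datasets package, which provides the sample cars dataset
!pip install vega_datasets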
Bar chart using Altair
import altair as alt
from vega_datasets import data
# Load sample data
source = data.cars()
# Create a bar chart
chart = alt.Chart(source).mark_bar().encode(
    x='mean(Horsepower)',
    y='Origin',
    color='Origin'
)
# Display the chart (in a notebook such as Colab, ending the cell with chart on its own line also renders it)
chart.show()
Explanation
- We import Altair and a sample dataset (cars) from vega_datasets, a collection of datasets used for visualization examples.
- We create a bar chart using alt.Chart(source), where source is the DataFrame containing our data.
- We use mark_bar() to specify that we want a bar chart.
- In the encode method, we define the x-axis as the mean horsepower (mean(Horsepower)), the y-axis as the car origin (Origin), and color the bars by the car origin.
- Finally, we display the chart using chart.show().
Scatter Plot
import altair as alt
from vega_datasets import data
# Load sample data
source = data.cars()
# Create a scatter plot
scatter_plot = alt.Chart(source).mark_circle(size=60).encode(
    x='Weight_in_lbs',
    y='Miles_per_Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Weight_in_lbs', 'Miles_per_Gallon']
).interactive()
# Display the chart (in a notebook such as Colab, ending the cell with scatter_plot on its own line also renders it)
scatter_plot.show()
Explanation
- We import Altair and the cars dataset from vega_datasets.
- We create a scatter plot using alt.Chart(source). The source is the DataFrame containing our data.
- mark_circle(size=60) specifies that we want to use circles of size 60 to represent our data points.
- In the encode method, we define:
  - x='Weight_in_lbs': The x-axis represents the weight of the cars.
  - y='Miles_per_Gallon': The y-axis represents the miles per gallon.
  - color='Origin': The color of the points is determined by the car's origin.
  - tooltip=['Name', 'Origin', 'Weight_in_lbs', 'Miles_per_Gallon']: This adds a tooltip that displays the car's name, origin, weight, and MPG when you hover over a point.
- .interactive() makes the plot interactive, allowing you to zoom in and out and pan across the plot.
- Finally, scatter_plot.show() displays the chart.
Conclusion
With just a few steps, we've seen how Altair can help us easily turn car data into clear and engaging charts, making data analysis both fun and insightful.