Bar Chart and Scatter Plot with Altair

Introduction

Altair is a declarative statistical visualization library for Python. It is built on top of the Vega and Vega-Lite visualization grammars and offers a simple, consistent way to build a wide range of statistical visualizations, turning data into interactive charts with little code. We'll use a dataset of cars to demonstrate two types of visualization: a bar chart and a scatter plot. First, we'll compare the average horsepower of cars from different origins using a bar chart, which shows which region's cars are more powerful on average. Next, we'll use a scatter plot to examine how a car's weight relates to its miles per gallon, giving a view of the fuel efficiency of different car models.

General Steps to Use the Altair Package in Google Colab

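Step 1. Install Altair and the sample datasets package using the command below

!pip install altair vega_datasets

Step 2. Optionally, install Altair Viewer, which older Altair releases rely on for chart.show()

!pip install altair_viewer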

Bar Chart Using Altair

import altair as alt
from vega_datasets import data

# Load sample data
source = data.cars()

# Create a bar chart
chart = alt.Chart(source).mark_bar().encode(
    x='mean(Horsepower)',
    y='Origin',
    color='Origin'
)

# Display the chart
chart.show()

Explanation

  1. We import Altair and the data loader from vega_datasets, a package of sample datasets used in visualization examples, and load the cars dataset with data.cars().
  2. We create a bar chart using alt.Chart(source), where source is the DataFrame containing our data.
  3. We use mark_bar() to specify that we want a bar chart.
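  4. In the encode method, we define the x-axis as the mean horsepower (mean(Horsepower)), the y-axis as the car origin (Origin), and color the bars by the car origin. Because the aggregated value is on x and the category is on y, the bars are horizontal.
  5. Finally, we display the chart using chart.show(). In a notebook such as Colab, a chart also renders inline when it is the last expression in a cell, so ending the cell with chart is a common alternative.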

Scatter Plot

import altair as alt
from vega_datasets import data

# Load sample data
source = data.cars()

# Create a scatter plot
scatter_plot = alt.Chart(source).mark_circle(size=60).encode(
    x='Weight_in_lbs',
    y='Miles_per_Gallon',
    color='Origin',
    tooltip=['Name', 'Origin', 'Weight_in_lbs', 'Miles_per_Gallon']
).interactive()

# Display the chart
scatter_plot.show()

Explanation

  1. We import Altair and the cars dataset from vega_datasets.
  2. We create a scatter plot using alt.Chart(source). The source is the DataFrame containing our data.
  3. mark_circle(size=60) draws each data point as a circle. The size value sets the area of the marker (in square pixels), not its diameter.
  4. In the encode method, we define:
    • x='Weight_in_lbs': The x-axis represents the weight of the cars.
    • y='Miles_per_Gallon': The y-axis represents the miles per gallon.
    • color='Origin': The color of the points is determined by the car's origin.
    • tooltip=['Name', 'Origin', 'Weight_in_lbs', 'Miles_per_Gallon']: This adds a tooltip that displays the car's name, origin, weight, and MPG when you hover over a point.
  5. .interactive() makes the plot interactive, allowing you to zoom in and out and pan across the plot.
  6. Finally, scatter_plot.show() displays the chart. In a notebook, ending the cell with scatter_plot renders it inline as well.

Conclusion

With just a few lines of code, Altair turned the cars data into a bar chart comparing average horsepower by origin and an interactive scatter plot of weight against miles per gallon.

