Python Libraries for Machine Learning: Seaborn

Introduction 

 
In the previous chapter, we studied Matplotlib, its functions, and how to use them in Python.
 
In this chapter, we move on to the next important Python machine learning library: Seaborn.
 

What is Python Seaborn? 

 
Seaborn is a library for making statistical graphics in Python. It is built on top of matplotlib and closely integrated with pandas data structures.
 
Here is some of the functionality that seaborn offers:
  • A dataset-oriented API for examining relationships between multiple variables
  • Specialized support for using categorical variables to show observations or aggregate statistics
  • Options for visualizing univariate or bivariate distributions and for comparing them between subsets of data
  • Automatic estimation and plotting of linear regression models for different kinds of dependent variables
  • Convenient views onto the overall structure of complex datasets
  • High-level abstractions for structuring multi-plot grids that let you easily build complex visualizations
  • Concise control over matplotlib figure styling with several built-in themes
  • Tools for choosing colour palettes that faithfully reveal patterns in your data
Seaborn aims to make visualization a central part of exploring and understanding data. Its dataset-oriented plotting functions operate on dataframes and arrays containing whole datasets, and internally perform the semantic mapping and statistical aggregation needed to produce informative plots.
 
The official website is seaborn.pydata.org
 

Installing Seaborn 

 
1. Ubuntu/Linux
    sudo apt update -y
    sudo apt upgrade -y
    sudo apt install python3-tk python3-pip -y
    pip3 install seaborn
2. Anaconda Prompt
    conda install -c anaconda seaborn
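Once the install finishes, a quick way to confirm that Python can see the package is to look it up without importing it. The sketch below uses only the standard library; the helper name `is_installed` is made up for this example.

```python
import importlib.util


def is_installed(package_name):
    """Return True if Python can locate the named top-level package."""
    return importlib.util.find_spec(package_name) is not None


# "seaborn" should report True after a successful install.
for name in ("matplotlib", "pandas", "seaborn"):
    print(name, "installed:", is_installed(name))
```

If seaborn reports False, the usual cause is that pip installed it for a different Python interpreter than the one running the check.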

Difference Between Matplotlib and Seaborn 

 
Functionality
  Matplotlib: Mainly used for basic plotting. Typical Matplotlib visualizations are bars, pies, lines, scatter plots and so on.
  Seaborn: Provides a wider variety of visualization patterns. It needs less syntax and ships with attractive default themes. It specializes in statistical visualization and is the natural choice for summarizing data and showing its distribution.
Handling Multiple Figures
  Matplotlib: Multiple figures can be open at once, but they must be closed explicitly. plt.close() closes only the current figure; plt.close('all') closes all of them.
  Seaborn: Automates the creation of multiple figures, which can sometimes lead to out-of-memory (OOM) issues.
Visualization
  Matplotlib: A graphics package for data visualization in Python, well integrated with NumPy and Pandas. The pyplot module closely mirrors the MATLAB plotting commands, so MATLAB users can easily transition to plotting with Python.
  Seaborn: More tightly integrated with Pandas data frames. It extends Matplotlib for creating attractive graphics with a more straightforward set of functions.
Data Frames and Arrays
  Matplotlib: Works with data frames and arrays. Its pyplot interface is stateful: the current figure and axes are tracked for you, so plot()-like calls work without passing a figure or axes explicitly.
  Seaborn: Works with the dataset as a whole and is more intuitive than Matplotlib. relplot() is a figure-level entry point whose kind parameter selects the plot type ("scatter" or "line"); catplot() plays the same role for categorical plots such as bars. Seaborn is not stateful, so its plotting functions take the data as an argument.
Flexibility
  Matplotlib: Highly customizable and powerful.
  Seaborn: Avoids a lot of boilerplate by providing commonly used default themes.
Use Cases
  Matplotlib: Pandas plotting uses Matplotlib; it is a neat wrapper around it.
  Seaborn: Intended for more specific use cases. It uses Matplotlib under the hood and is designed for statistical plotting.
 

Seaborn Functions

 

1. seaborn.relplot()

 
Syntax
 
seaborn.relplot(x=None, y=None, hue=None, size=None, style=None, data=None, row=None, col=None, col_wrap=None, row_order=None, col_order=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, markers=None, dashes=None, style_order=None, legend='brief', kind='scatter', height=5, aspect=1, facet_kws=None, **kwargs) 
 
Figure-level interface for drawing relational plots onto a FacetGrid.
    import seaborn as sns
    sns.set(style="white")

    # Load the example mpg dataset
    mpg = sns.load_dataset("mpg")

    # Plot miles per gallon against horsepower with other semantics
    sns.relplot(x="horsepower", y="mpg", hue="origin", size="weight",
                sizes=(40, 400), alpha=.5, palette="muted",
                height=6, data=mpg)
Output 
 
relplot
 

2. seaborn.scatterplot()

 
Syntax
 
seaborn.scatterplot(x=None, y=None, hue=None, style=None, size=None, data=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, markers=True, style_order=None, x_bins=None, y_bins=None, units=None, estimator=None, ci=95, n_boot=1000, alpha='auto', x_jitter=None, y_jitter=None, legend='brief', ax=None, **kwargs)
 
Draws a scatter plot with the possibility of several semantic groupings. 
    import seaborn as sns
    sns.set()

    # Load the example planets dataset
    planets = sns.load_dataset("planets")

    cmap = sns.cubehelix_palette(rot=-.5, as_cmap=True)
    ax = sns.scatterplot(x="distance", y="orbital_period",
                         hue="year", size="mass",
                         palette=cmap, sizes=(100, 100),
                         data=planets)
Output
 
scatterplot

3. seaborn.lineplot() 

 
Syntax 
 
seaborn.lineplot(x=None, y=None, hue=None, size=None, style=None, data=None, palette=None, hue_order=None, hue_norm=None, sizes=None, size_order=None, size_norm=None, dashes=True, markers=None, style_order=None, units=None, estimator='mean', ci=95, n_boot=1000, sort=True, err_style='band', err_kws=None, legend='brief', ax=None, **kwargs) 
 
Draws a line plot with the possibility of several semantic groupings. 
    import numpy as np
    import pandas as pd
    import seaborn as sns
    sns.set(style="whitegrid")

    # Build four random-walk series, one value per day for a year
    rs = np.random.RandomState(365)
    values = rs.randn(365, 4).cumsum(axis=0)
    dates = pd.date_range("1 1 2016", periods=365, freq="D")
    data = pd.DataFrame(values, dates, columns=["A", "B", "C", "D"])

    # Smooth each series with a 10-day rolling mean
    data = data.rolling(10).mean()

    sns.lineplot(data=data, palette="tab10", linewidth=2.5)
Output
lineplot
 

4. seaborn.catplot()

 
Syntax 
 
seaborn.catplot(x=None, y=None, hue=None, data=None, row=None, col=None, col_wrap=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, order=None, hue_order=None, row_order=None, col_order=None, kind='strip', height=5, aspect=1, orient=None, color=None, palette=None, legend=True, legend_out=True, sharex=True, sharey=True, margin_titles=False, facet_kws=None, **kwargs) 
 
Figure-level interface for drawing categorical plots onto a FacetGrid.
    import seaborn as sns
    sns.set(style="whitegrid")

    # Load the example exercise dataset
    df = sns.load_dataset("exercise")

    # Draw a pointplot to show pulse as a function of three categorical factors
    g = sns.catplot(x="time", y="pulse", hue="diet", col="kind",
                    capsize=.6, palette="YlGnBu_d", height=6, aspect=.75,
                    kind="point", data=df)
    g.despine(left=True)
Output 
 
catplot
 

5. seaborn.stripplot() 

 
Syntax
 
seaborn.stripplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, jitter=True, dodge=False, orient=None, color=None, palette=None, size=5, edgecolor='gray', linewidth=0, ax=None, **kwargs) 
 
Draws a scatterplot where one variable is categorical.
    import pandas as pd
    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.set(style="whitegrid")
    iris = sns.load_dataset("iris")

    # "Melt" the dataset to "long-form" or "tidy" representation
    iris = pd.melt(iris, "species", var_name="measurement")

    # Initialize the figure
    f, ax = plt.subplots()
    sns.despine(bottom=True, left=True)

    # Show each observation with a scatterplot
    sns.stripplot(x="measurement", y="value", hue="species",
                  data=iris, dodge=True, jitter=True,
                  alpha=.25, zorder=1)

    # Show the conditional means
    sns.pointplot(x="measurement", y="value", hue="species",
                  data=iris, dodge=.532, join=False, palette="dark",
                  markers="d", scale=.75, ci=None)

    # Improve the legend
    handles, labels = ax.get_legend_handles_labels()
    ax.legend(handles[3:], labels[3:], title="species",
              handletextpad=0, columnspacing=1,
              loc="lower right", ncol=3, frameon=True)
Output
 
stripplot
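The "melt" step in the example is easier to follow on a tiny table. The sketch below applies the same pd.melt call to a two-row frame whose values are purely illustrative, so the change from wide form (one column per measurement) to long form (one row per measurement) is visible at a glance.

```python
import pandas as pd

# Illustrative wide-form data: one row per flower, one column per measurement
wide = pd.DataFrame({
    "species": ["setosa", "virginica"],
    "sepal_length": [5.1, 6.3],
    "petal_length": [1.4, 6.0],
})

# Keep "species" as the identifier; every other column becomes
# a (measurement, value) pair on its own row
long_form = pd.melt(wide, "species", var_name="measurement")
print(long_form)
```

The long-form frame is what lets a single x="measurement", y="value", hue="species" call address every measurement at once.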
  

6. seaborn.swarmplot()

 
Syntax
 
seaborn.swarmplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, dodge=False, orient=None, color=None, palette=None, size=5, edgecolor='gray', linewidth=0, ax=None, **kwargs) 
 
Draws a categorical scatterplot with non-overlapping points.
    import pandas as pd
    import seaborn as sns
    sns.set(style="whitegrid", palette="muted")

    # Load the example iris dataset
    iris = sns.load_dataset("iris")

    # "Melt" the dataset to "long-form" or "tidy" representation
    iris = pd.melt(iris, "species", var_name="measurement")

    # Draw a categorical scatterplot to show each observation
    sns.swarmplot(x="value", y="measurement", hue="species",
                  palette=["r", "c", "y"], data=iris)
Output
 
swarmplot
  

7. seaborn.boxplot() 

 
Syntax
 
seaborn.boxplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8, dodge=True, fliersize=5, linewidth=None, whis=1.5, notch=False, ax=None, **kwargs)
 
Draws a box plot to show distributions with respect to categories.
    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.set(style="ticks")

    # Initialize the figure with a logarithmic x axis
    f, ax = plt.subplots(figsize=(7, 6))
    ax.set_xscale("log")

    # Load the example planets dataset
    planets = sns.load_dataset("planets")

    # Plot the distance with horizontal boxes; whis=[0, 100] extends
    # the whiskers to the full range of the data
    sns.boxplot(x="distance", y="method", data=planets,
                whis=[0, 100], palette="vlag")

    # Add in points to show each observation
    sns.swarmplot(x="distance", y="method", data=planets,
                  size=2, color=".6", linewidth=0)

    # Tweak the visual presentation
    ax.xaxis.grid(True)
    ax.set(ylabel="")
    sns.despine(trim=True, left=True)
Output
 
boxplot
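To make the whis=1.5 default in the signature more concrete, the following sketch computes the numbers a box plot is conventionally built from: the quartiles, the whisker ends, and the points treated as outliers. It uses only the standard library. The function name `box_stats` and the sample values are made up, and the "inclusive" quantile method is assumed to be close to what the plotting libraries compute, not a copy of their code.

```python
import statistics


def box_stats(values, whis=1.5):
    """Summarize data the way a conventional box plot does."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whis * iqr
    high_fence = q3 + whis * iqr
    # Whiskers stop at the most extreme points still inside the fences
    inside = [v for v in data if low_fence <= v <= high_fence]
    outliers = [v for v in data if v < low_fence or v > high_fence]
    return {
        "q1": q1, "median": median, "q3": q3,
        "whisker_low": inside[0], "whisker_high": inside[-1],
        "outliers": outliers,
    }


stats = box_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```

Here the value 30 lies beyond the upper fence, so it would be drawn as an individual outlier point instead of stretching the whisker.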
  

8. seaborn.violinplot()

 
Syntax
 
seaborn.violinplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, bw='scott', cut=2, scale='area', scale_hue=True, gridsize=100, width=0.8, inner='box', split=False, dodge=True, orient=None, linewidth=None, color=None, palette=None, saturation=0.75, ax=None, **kwargs)
 
Draws a combination of boxplot and kernel density estimate.
    import seaborn as sns
    import matplotlib.pyplot as plt
    sns.set(style="whitegrid")

    # Load the example dataset of brain network correlations
    df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)

    # Pull out a specific subset of networks
    used_networks = [1, 3, 4, 5, 6, 7, 8, 11, 12, 13, 16, 17]
    used_columns = (df.columns.get_level_values("network")
                              .astype(float)
                              .isin(used_networks))
    df = df.loc[:, used_columns]

    # Compute the correlation matrix and average over networks
    corr_df = df.corr().groupby(level="network").mean()
    corr_df.index = corr_df.index.astype(int)
    corr_df = corr_df.sort_index().T

    # Set up the matplotlib figure
    f, ax = plt.subplots(figsize=(11, 6))

    # Draw a violinplot with a narrower bandwidth than the default
    sns.violinplot(data=corr_df, palette="Set3", bw=.2, cut=1, linewidth=1)

    # Finalize the figure
    ax.set(ylim=(-.7, 1.05))
    sns.despine(left=True, bottom=True)
Output
 
violinplot
 

9. seaborn.boxenplot() 

 
Syntax
 
seaborn.boxenplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, width=0.8, dodge=True, k_depth='proportion', linewidth=None, scale='exponential', outlier_prop=None, ax=None, **kwargs)
 
Draws an enhanced box plot for larger datasets.
    import seaborn as sns
    sns.set(style="whitegrid")

    diamonds = sns.load_dataset("diamonds")

    # Clarity grades ordered from lowest to highest
    clarity_ranking = ["I1", "SI2", "SI1", "VS2", "VS1", "VVS2", "VVS1", "IF"]

    sns.boxenplot(x="clarity", y="carat",
                  color="g", order=clarity_ranking,
                  scale="linear", data=diamonds)
Output
 
boxenplot
 

10. seaborn.pointplot()

 
Syntax
 
seaborn.pointplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, markers='o', linestyles='-', dodge=False, join=True, scale=1, orient=None, color=None, palette=None, errwidth=None, capsize=None, ax=None, **kwargs)
 
Shows point estimates and confidence intervals using scatter plot glyphs.
    import seaborn as sns
    sns.set(style="whitegrid")

    # Load the example Titanic dataset
    titanic = sns.load_dataset("titanic")

    # Set up a grid to plot survival probability against several variables
    g = sns.PairGrid(titanic, y_vars="survived",
                     x_vars=["class", "sex"],
                     height=5, aspect=.5)

    # Draw a seaborn pointplot onto each Axes
    g.map(sns.pointplot, scale=1.3, errwidth=4, color="xkcd:plum")
    g.set(ylim=(0, 1))
    sns.despine(fig=g.fig, left=True)
Output 
 
pointplot
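The ci=95 and n_boot=1000 parameters in the signature refer to a bootstrap confidence interval around the point estimate. As a rough sketch of the idea (not seaborn's own implementation), the standard-library code below resamples the data with replacement, takes the mean of each resample, and reads the interval off the percentiles of those means. The function name `bootstrap_ci`, the seed, and the pulse readings are all invented for the example.

```python
import random
import statistics


def bootstrap_ci(values, n_boot=1000, ci=95, seed=0):
    """Percentile bootstrap confidence interval for the mean."""
    rng = random.Random(seed)
    n = len(values)
    # Resample with replacement and record the mean of each resample
    boot_means = sorted(
        statistics.fmean(rng.choices(values, k=n)) for _ in range(n_boot)
    )
    tail = (100 - ci) / 200  # fraction trimmed from each end
    low = boot_means[round(tail * n_boot)]
    high = boot_means[round((1 - tail) * n_boot) - 1]
    return low, high


pulse = [72, 75, 78, 80, 81, 83, 85, 88, 90, 95]  # made-up readings
low, high = bootstrap_ci(pulse)
print(f"mean={statistics.fmean(pulse):.1f}, 95% CI=({low:.1f}, {high:.1f})")
```

The error bars drawn by pointplot() and barplot() play the same role as the (low, high) pair here: they show how much the estimate would be expected to move if the data were collected again.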
 

11. seaborn.barplot()

 
Syntax 
 
seaborn.barplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, estimator=<function mean>, ci=95, n_boot=1000, units=None, orient=None, color=None, palette=None, saturation=0.75, errcolor='.26', errwidth=None, capsize=None, dodge=True, ax=None, **kwargs)
 
Shows point estimates and confidence intervals as rectangular bars.
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt
    sns.set(style="white", context="talk")
    rs = np.random.RandomState(8)

    # Set up the matplotlib figure
    f, (ax1, ax2, ax3) = plt.subplots(3, 1, figsize=(7, 5), sharex=True)

    # Generate some sequential data
    x = np.array(list("ABCDEFGHIJ"))
    y1 = np.arange(1, 11)
    sns.barplot(x=x, y=y1, palette="rocket", ax=ax1)
    ax1.axhline(0, color="k", clip_on=False)
    ax1.set_ylabel("Sequential")

    # Center the data to make it diverging
    y2 = y1 - 5.5
    sns.barplot(x=x, y=y2, palette="vlag", ax=ax2)
    ax2.axhline(0, color="k", clip_on=False)
    ax2.set_ylabel("Diverging")

    # Randomly reorder the data to make it qualitative
    y3 = rs.choice(y1, len(y1), replace=False)
    sns.barplot(x=x, y=y3, palette="deep", ax=ax3)
    ax3.axhline(0, color="k", clip_on=False)
    ax3.set_ylabel("Qualitative")

    # Finalize the plot
    sns.despine(bottom=True)
    plt.setp(f.axes, yticks=[])
    plt.tight_layout(h_pad=2)
Output
 
barplot
 

12. seaborn.countplot()

 
Syntax
 
seaborn.countplot(x=None, y=None, hue=None, data=None, order=None, hue_order=None, orient=None, color=None, palette=None, saturation=0.75, dodge=True, ax=None, **kwargs)
 
Shows the counts of observations in each categorical bin using bars.
    import seaborn as sns
    sns.set(style="darkgrid")
    titanic = sns.load_dataset("titanic")

    # Count passengers per class, split by "who" and faceted by survival
    g = sns.catplot(x="class", hue="who", col="survived",
                    data=titanic, kind="count",
                    height=4, aspect=.7)
Output
 
countplot
  

13. seaborn.jointplot() 

 
Syntax
 
seaborn.jointplot(x, y, data=None, kind='scatter', stat_func=None, color=None, height=6, ratio=5, space=0.2, dropna=True, xlim=None, ylim=None, joint_kws=None, marginal_kws=None, annot_kws=None, **kwargs)
 
Draws a plot of two variables with bivariate and univariate graphs.
    import numpy as np
    import seaborn as sns
    sns.set(style="ticks")

    # Generate two correlated random variables
    rs = np.random.RandomState(11)
    x = rs.gamma(1, size=500)
    y = -.5 * x + rs.normal(size=500)

    # Hexbin plot in the center with marginal histograms
    sns.jointplot(x=x, y=y, kind="hex", color="#4CB391")
Output
 
jointplot
 

14. seaborn.pairplot() 

 
Syntax 
 
seaborn.pairplot(data, hue=None, hue_order=None, palette=None, vars=None, x_vars=None, y_vars=None, kind='scatter', diag_kind='auto', markers=None, height=2.5, aspect=1, dropna=True, plot_kws=None, diag_kws=None, grid_kws=None, size=None)
 
Plots pairwise relationships in a dataset.
    import seaborn as sns
    sns.set(style="ticks")

    df = sns.load_dataset("iris")
    sns.pairplot(df, hue="species")
Output
 
pairplot
 

15. seaborn.distplot()

 
Syntax 
 
seaborn.distplot(a, bins=None, hist=True, kde=True, rug=False, fit=None, hist_kws=None, kde_kws=None, rug_kws=None, fit_kws=None, color=None, vertical=False, norm_hist=False, axlabel=None, label=None, ax=None)
 
Flexibly plots a univariate distribution of observations. Note that distplot() is deprecated as of seaborn 0.11 in favor of histplot() and displot().
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.set(style="white", palette="muted", color_codes=True)
    rs = np.random.RandomState(10)

    # Set up the matplotlib figure
    f, axes = plt.subplots(2, 2, figsize=(7, 7), sharex=True)
    sns.despine(left=True)

    # Generate a random univariate dataset
    d = rs.normal(size=100)

    # Plot a simple histogram with binsize determined automatically
    sns.distplot(d, kde=False, color="b", ax=axes[0, 0])

    # Plot a kernel density estimate and rug plot
    sns.distplot(d, hist=False, rug=True, color="r", ax=axes[0, 1])

    # Plot a filled kernel density estimate
    sns.distplot(d, hist=False, color="g", kde_kws={"shade": True}, ax=axes[1, 0])

    # Plot a histogram and kernel density estimate
    sns.distplot(d, color="m", ax=axes[1, 1])

    plt.setp(axes, yticks=[])
    plt.tight_layout()
Output
 
distplot
 

16. seaborn.kdeplot() 

 
Syntax
 
seaborn.kdeplot(data, data2=None, shade=False, vertical=False, kernel='gau', bw='scott', gridsize=100, cut=3, clip=None, legend=True, cumulative=False, shade_lowest=True, cbar=False, cbar_ax=None, cbar_kws=None, ax=None, **kwargs)
 
Fits and plots a univariate or bivariate kernel density estimate.
    import numpy as np
    import seaborn as sns
    import matplotlib.pyplot as plt

    sns.set(style="dark")
    rs = np.random.RandomState(500)

    # Set up the matplotlib figure
    f, axes = plt.subplots(3, 3, figsize=(9, 9), sharex=True, sharey=True)

    # Rotate the starting point around the cubehelix hue circle
    for ax, s in zip(axes.flat, np.linspace(0, 3, 10)):

        # Create a cubehelix colormap to use with kdeplot
        cmap = sns.cubehelix_palette(start=s, light=1, as_cmap=True)

        # Generate and plot a random bivariate dataset
        x, y = rs.randn(2, 50)
        sns.kdeplot(x, y, cmap=cmap, shade=True, cut=5, ax=ax)
        ax.set(xlim=(-3, 3), ylim=(-3, 3))

    f.tight_layout()
Output
 
kdeplot
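For readers who want to see what a kernel density estimate actually is, here is a minimal one-dimensional sketch in plain Python: a Gaussian bump is centered on every sample, and the bumps are averaged. The default bandwidth follows a simple Scott-style rule (sample standard deviation times n to the power -1/5), which is an assumption made for this example; seaborn's own bandwidth calculation may differ in detail. The function name `gaussian_kde` and the sample values are illustrative.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that evaluates a Gaussian KDE of the samples."""
    n = len(samples)
    if bandwidth is None:
        # Scott-style rule of thumb (assumed here for illustration)
        bandwidth = statistics.stdev(samples) * n ** (-1 / 5)
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per sample, evaluated at x
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


samples = [1.0, 1.5, 2.0, 2.2, 3.1, 4.0, 4.2, 5.0]  # made-up data
density = gaussian_kde(samples)

# A density should integrate to roughly 1; check with a simple Riemann sum
step = 0.01
grid = [i * step for i in range(-1000, 1601)]
area = sum(density(x) for x in grid) * step
print(f"density at 2.0 = {density(2.0):.3f}, total area = {area:.3f}")
```

A smaller bandwidth makes the curve follow individual samples more closely, while a larger one smooths them together, which is the trade-off the bw parameter controls.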
 

17. seaborn.rugplot() 

 
Syntax 
 
seaborn.rugplot(a, height=0.05, axis='x', ax=None, **kwargs)
 
Plots data points in an array as sticks on an axis.
    import numpy as np
    import matplotlib.pyplot as plt
    import seaborn as sns

    # Two normal clusters, one shifted by 5
    sample = np.hstack((np.random.randn(300), np.random.randn(200) + 5))
    fig, ax = plt.subplots(figsize=(8, 4))

    # Draw the density curve with a rug of individual observations
    sns.distplot(sample, rug=True, hist=False, rug_kws={"color": "g"},
                 kde_kws={"color": "k", "lw": 3}, ax=ax)
    plt.show()
Output
 
rugplot 
 

18. seaborn.lmplot() 

 
Syntax
 
seaborn.lmplot(x, y, data, hue=None, col=None, row=None, palette=None, col_wrap=None, height=5, aspect=1, markers='o', sharex=True, sharey=True, hue_order=None, col_order=None, row_order=None, legend=True, legend_out=True, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=False, x_jitter=None, y_jitter=None, scatter_kws=None, line_kws=None, size=None)
 
Plots data and regression model fits across a FacetGrid.
    import seaborn as sns
    sns.set()

    # Load the iris dataset
    iris = sns.load_dataset("iris")

    # Plot sepal width as a function of sepal length for each species
    g = sns.lmplot(x="sepal_length", y="sepal_width", hue="species",
                   truncate=True, height=5, data=iris)

    # Use more informative axis labels than are provided by default
    g.set_axis_labels("Sepal length (cm)", "Sepal width (cm)")
Output
 
lmplot
 

19. seaborn.regplot()  

 
Syntax
 
seaborn.regplot(x, y, data=None, x_estimator=None, x_bins=None, x_ci='ci', scatter=True, fit_reg=True, ci=95, n_boot=1000, units=None, order=1, logistic=False, lowess=False, robust=False, logx=False, x_partial=None, y_partial=None, truncate=False, dropna=True, x_jitter=None, y_jitter=None, label=None, color=None, marker='o', scatter_kws=None, line_kws=None, ax=None)
 
Plots data and a linear regression model fit.
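    import seaborn as sns
    sns.set(color_codes=True)

    # Load the example tips dataset
    tips = sns.load_dataset("tips")

    # Plot tip against total bill with a fitted regression line
    ax = sns.regplot(x="total_bill", y="tip", data=tips, marker="+")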
Output 
 
regplot  
 

20. seaborn.residplot() 

 
Syntax 
 
seaborn.residplot(x, y, data=None, lowess=False, x_partial=None, y_partial=None, order=1, robust=False, dropna=True, label=None, color=None, scatter_kws=None, line_kws=None, ax=None)
 
Plots the residuals of linear regression.
    import numpy as np
    import seaborn as sns
    sns.set(style="whitegrid")

    # Make an example dataset with y ~ x
    rs = np.random.RandomState(10)
    x = rs.normal(2, 1, 75)
    y = 2 + 1.5 * x + rs.normal(1, 2, 75)

    # Plot the residuals after fitting a linear model
    sns.residplot(x=x, y=y, lowess=True, color="g")
Output
 
residplot
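A residual is simply the observed y minus the value predicted by the fitted line. The sketch below fits an ordinary least-squares line by hand and lists the residuals. It is a plain-Python illustration of the quantity being plotted, not seaborn's code, and the function name `linear_fit` and the data points are made up.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares slope and intercept for y ~ x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]  # roughly y = 2x, with small noise

slope, intercept = linear_fit(xs, ys)

# Residual = observed value minus fitted value
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]
print(f"slope={slope:.2f}, intercept={intercept:.2f}")
print("residuals:", [round(r, 3) for r in residuals])
```

If the linear model is adequate, the residuals scatter around zero with no visible trend; a curved pattern in a residual plot suggests the relationship is not linear.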
 

21. seaborn.heatmap() 

 
Syntax 
 
seaborn.heatmap(data, vmin=None, vmax=None, cmap=None, center=None, robust=False, annot=None, fmt='.2g', annot_kws=None, linewidths=0, linecolor='white', cbar=True, cbar_kws=None, cbar_ax=None, square=False, xticklabels='auto', yticklabels='auto', mask=None, ax=None, **kwargs)
 
Plots rectangular data as a color-encoded matrix.
    import matplotlib.pyplot as plt
    import seaborn as sns
    sns.set()

    # Load the example flights dataset and pivot it to wide form
    flights_long = sns.load_dataset("flights")
    flights = flights_long.pivot(index="month", columns="year", values="passengers")

    # Draw a heatmap with the numeric values in each cell
    f, ax = plt.subplots(figsize=(9, 7))
    sns.heatmap(flights, annot=True, fmt="d", linewidths=.5, ax=ax)
Output 
 
heatmap
 

22. seaborn.clustermap() 

 
Syntax
 
seaborn.clustermap(data, pivot_kws=None, method='average', metric='euclidean', z_score=None, standard_scale=None, figsize=None, cbar_kws=None, row_cluster=True, col_cluster=True, row_linkage=None, col_linkage=None, row_colors=None, col_colors=None, mask=None, **kwargs)
 
Plots a matrix dataset as a hierarchically-clustered heatmap.
    import pandas as pd
    import seaborn as sns
    sns.set()

    # Load the brain networks example dataset
    df = sns.load_dataset("brain_networks", header=[0, 1, 2], index_col=0)

    # Select a subset of the networks
    used_networks = [1, 5, 6, 7, 8, 12, 13, 17]
    used_columns = (df.columns.get_level_values("network")
                              .astype(int)
                              .isin(used_networks))
    df = df.loc[:, used_columns]

    # Create a categorical palette to identify the networks
    network_pal = sns.husl_palette(8, s=.45)
    network_lut = dict(zip(map(str, used_networks), network_pal))

    # Convert the palette to vectors that will be drawn on the side of the matrix
    networks = df.columns.get_level_values("network")
    network_colors = pd.Series(networks, index=df.columns).map(network_lut)

    # Draw the full plot
    sns.clustermap(df.corr(), center=0, cmap="vlag",
                   row_colors=network_colors, col_colors=network_colors,
                   linewidths=.75, figsize=(8, 8))
Output
 
clustermap
                 

Conclusion 

 
In this chapter, we studied Seaborn. In the next chapter, we will learn about TensorFlow.
 
TensorFlow is a widely used library, primarily for dataflow and differentiable programming across a range of tasks.
Author
Rohit Gupta