Plotnine: A Python Port of R’s ggplot2

People who love Python but rely on R’s ggplot2 for visualization might want to explore Plotnine. Plotnine is a Python implementation of ggplot2’s grammar of graphics, and its API closely mirrors the R original. I also love that it works directly with Pandas DataFrames, so it fits neatly into the usual data analysis workflow. Below are a few examples that demonstrate its power and basic usage.

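```
from sklearn import datasets
from plotnine import *
import pandas as pd

# Load the Boston housing data; this step was missing from the snippet.
# Note: load_boston() was removed in scikit-learn 1.2, so an older release is needed.
data = datasets.load_boston()

df = pd.DataFrame(
    data['data'],
    columns=data['feature_names']
)
df['target'] = data['target']

# Preview the first four feature columns plus the target on four random rows.
df[list(df.columns)[0:4] + ["target"]].sample(4)
```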
CRIM ZN INDUS CHAS target
256 0.01538 90.0 3.75 0.0 44.0
23 0.98843 0.0 8.14 0.0 14.5
50 0.08873 21.0 5.64 0.0 19.7
464 7.83932 0.0 18.10 0.0 21.4

Example 1: A simple density plot for the target variable

```
(
    ggplot(df)
    + geom_density(aes("target"))
)
```

Example 2: Density Plot with bells and whistles

• `theme_...`: Use a theme to style the chart. Many themes are available, ranging from professional looking to informal. I particularly enjoy using `theme_xkcd`.
• `theme(figure_size=(...))`: To change the size of the figure, set the `figure_size` argument.
• `np.log(...)`: Notice that I am converting the target variable to log scale. You can use any other expression to transform the values.
```
import numpy as np

(
    # The aes string is an expression, so the log transform is applied on the fly.
    ggplot(df, aes("np.log(target)"))
    + geom_density()
    + theme_seaborn()
    + xlab("House Price (Log Scale)") + ylab("Density")
    + theme(figure_size=(10, 5))
)
```

Example 3: Using facet_wrap

One of the best features of ggplot2 is `facet_wrap`. It lets you render multiple plots side by side, one per value of a variable. Luckily, `facet_wrap` is available as part of the plotnine library. Note that it expects the data in long format, which is why the DataFrame is melted first.

```
# Reshape to long format: one row per (variable, value) pair.
meltedDF = pd.melt(df, value_vars=['AGE', 'target', 'TAX'])
(
    ggplot(meltedDF, aes("np.log(value)", fill="variable"))
    + geom_density(alpha=0.5)
    + theme_xkcd()
    + facet_wrap("~variable")  # one panel per original column
    + theme(figure_size=(20, 5))
    + xlab("Value (Log Scale)") + ylab("Density")
)
```
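
To make the reshaping step concrete, here is a small sketch of what `pd.melt` does before the data reaches `facet_wrap`. The `wide` DataFrame and its values are made up for illustration; only the column names match the example above.

```python
import pandas as pd

# Hypothetical wide-format data: one column per variable, made-up values.
wide = pd.DataFrame({
    "AGE": [60.0, 80.0],
    "TAX": [300.0, 250.0],
    "target": [24.0, 21.0],
})

# melt stacks the selected columns into two columns:
# "variable" holds the original column name, "value" holds the number.
long_df = pd.melt(wide, value_vars=["AGE", "target", "TAX"])
print(long_df)
```

Each of the two input rows contributes one output row per melted column, so the result has six rows, and the `variable` column is what `facet_wrap("~variable")` splits on.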

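A few other tips:
1. Hide the legend: `scale_color_discrete(guide=False)` (use the scale that matches the aesthetic, for example `scale_fill_discrete` for `fill`)
2. Change the legend title: `scale_fill_discrete(name="New Title")`
3. Rotate the x-axis labels: `theme(axis_text_x=element_text(rotation=90, hjust=1))`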

Note:
An alternative port of ggplot2 is the `ggplot` package from yhat. I tried it a long time ago and it was missing a lot of critical features. I haven’t tried it in recent months. Let me know if you have any recent experience with it.