I would like to share a small but very useful tool I discovered this year: the violin plot!

# The Intro

## Violin Plot

Violin plots are a nice tool that you can find in many different visualisation libraries. What do they do? They show and compare distributions. In my opinion, they are one of the better ways to compare distributions. Here is a quick example:

As you can see here, you can easily compare the values across the four categories (Thur, Fri, Sat, Sun).
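For readers who want to reproduce a plot like this one, here is a minimal sketch with Seaborn. The category names come from the example above; the `total_bill` column name and the numbers themselves are assumptions, generated randomly so that the snippet runs without downloading anything.

```python
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn as sns

# Synthetic, made-up data: one skewed sample of "total_bill" values per day.
rng = np.random.default_rng(0)
days = ["Thur", "Fri", "Sat", "Sun"]
frames = [
    pd.DataFrame({"day": day, "total_bill": rng.gamma(shape=4.0, scale=scale, size=80)})
    for day, scale in zip(days, [4.0, 4.5, 5.5, 5.0])
]
df = pd.concat(frames, ignore_index=True)

# One violin per category, drawn in the order given by `days`.
fig, ax = plt.subplots(figsize=(6, 4))
sns.violinplot(data=df, x="day", y="total_bill", order=days, ax=ax)
ax.set_title("total_bill per day (synthetic data)")
fig.tight_layout()
# plt.show()  # uncomment when running interactively
```

Swapping `df` for your own long-form table (one row per observation, one column for the category, one for the value) should be all that is needed.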

More precisely, the violin plot is a mix of a box plot and a kernel density plot.

### The Set list

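- The thin bar represents the whiskers of the box plot. It is often described as a 95% confidence interval, but in Seaborn it covers the points that lie within 1.5 times the interquartile range of the quartiles.
- The thick bar represents the interquartile range, from the first to the third quartile.
- The white dot represents the median (the value in the middle).
- The top and bottom points represent roughly your min and max. By default Seaborn extends the density a little past the extreme values; setting `cut=0` stops it exactly at the min and max.
- The coloured area represents the density, i.e. the distribution of the population.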
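To make the elements of this list concrete, here is a small sketch that computes them by hand with the standard library only. The function names (`violin_summary`, `gaussian_kde`), the 1.5 × IQR whisker rule, and the fixed bandwidth are my own assumptions for illustration; plotting libraries may use different quantile and bandwidth rules.

```python
import math
import statistics


def violin_summary(values, whisker=1.5):
    """Box-plot part of a violin: quartiles, median, whiskers, extremes."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker * iqr
    high_fence = q3 + whisker * iqr
    return {
        "min": data[0],
        # Whiskers stop at the most extreme points still inside the fences.
        "whisker_low": min(v for v in data if v >= low_fence),
        "q1": q1,
        "median": median,
        "q3": q3,
        "whisker_high": max(v for v in data if v <= high_fence),
        "max": data[-1],
    }


def gaussian_kde(values, x, bandwidth):
    """Density part of a violin: average of Gaussian bumps centred on each point."""
    norm = len(values) * bandwidth * math.sqrt(2 * math.pi)
    return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values) / norm


sample = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # made-up values with one outlier
summary = violin_summary(sample)
print(summary)
print(gaussian_kde(sample, 4.5, bandwidth=1.0))
```

With this sample, the value 30 lies beyond the upper whisker, which is the kind of outlier the thin bar does not reach.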

# The Masterpiece

I have used them for hyper-parameter tuning. When you do hyper-parameter tuning using grid search from scikit-learn, you get an array of results. If you simply look at the absolute minimum, you can easily end up with an anomaly or a corner case. But by looking at the trend, you can get a much better result:

- Values will tend to be better in general.
- The spread between good and bad results will be smaller.

In the above picture, it is clear that if you pick a value above 3, you end up with highly variable results. Even if one of the values > 3 can potentially give you a better result, you get no stability, and the chance of getting something bad is high.

In this case, the value 3 is very good for two reasons. First, it gets a better score than 1 and 2. Second, its distribution is a lot more compact, so you can expect a small variance in your results.
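Here is a rough sketch of how the per-fold scores can be pulled out of a scikit-learn grid search so that each parameter value gets its own distribution. The model (`DecisionTreeRegressor`), the tuned parameter (`max_depth`), the synthetic dataset, and the scoring are assumptions chosen for illustration, not the setup behind the picture above. Note that scikit-learn scores are "higher is better", so an error is reported as a negative number.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.tree import DecisionTreeRegressor

# Synthetic regression problem, used only for illustration.
X, y = make_regression(n_samples=200, n_features=5, noise=10.0, random_state=0)

# Repeated k-fold gives 5 x 4 = 20 scores per parameter value.
cv = RepeatedKFold(n_splits=5, n_repeats=4, random_state=0)
search = GridSearchCV(
    DecisionTreeRegressor(random_state=0),
    param_grid={"max_depth": [1, 2, 3, 4, 5, 6]},
    scoring="neg_mean_squared_error",
    cv=cv,
)
search.fit(X, y)

# Long-form table: one row per (parameter value, split) pair.
results = search.cv_results_
rows = [
    {"max_depth": params["max_depth"], "score": results[f"split{split}_test_score"][i]}
    for i, params in enumerate(results["params"])
    for split in range(search.n_splits_)
]
scores = pd.DataFrame(rows)

# Median and spread per value: the numbers a violin plot lets you read at a glance.
summary = scores.groupby("max_depth")["score"].quantile([0.25, 0.5, 0.75]).unstack()
summary.columns = ["q1", "median", "q3"]
summary["iqr"] = summary["q3"] - summary["q1"]
print(summary)
```

Passing `scores` to `sns.violinplot(data=scores, x="max_depth", y="score")` would then draw one violin per candidate value, and a good pick is one with both a high median and a small `iqr`.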

## Where to buy tickets

You can get violin plots in most of the popular visualization libraries. Here is a short list:

- Seaborn: The one I have used in this article.
- Plotly: Another very popular lib.
- Matplotlib: I do not like this one, personal taste.
- ggplot2: For R lovers 😉
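For comparison with the Seaborn version, this is roughly what the plain Matplotlib equivalent looks like, using `Axes.violinplot`. The three groups and their labels are made-up data for illustration.

```python
import matplotlib.pyplot as plt
import numpy as np

# Made-up data: three groups with different centres and spreads.
rng = np.random.default_rng(1)
groups = [
    rng.normal(loc=mean, scale=sd, size=100)
    for mean, sd in [(0.0, 1.0), (1.0, 0.5), (2.0, 2.0)]
]

fig, ax = plt.subplots()
# Matplotlib takes a sequence of arrays and places the violins at x = 1, 2, 3.
parts = ax.violinplot(groups, showmedians=True, showextrema=True)
ax.set_xticks([1, 2, 3])
ax.set_xticklabels(["A", "B", "C"])
ax.set_ylabel("value")
# plt.show()  # uncomment when running interactively
```

Matplotlib draws only the density, the extremes, and (if asked) the median here; the inner box plot that Seaborn adds by default has to be drawn separately.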

# The critics

In conclusion, this is a very nice tool when you need to compare distributions. I honestly think it is undervalued and underused.

Happy plotting,
