DVC Day 27: Practical Applications

(This post is part of my 30-day Data Visualization Challenge; you can follow along using the ‘challenge’ tag!)

As we finish out the 30 days, I’ll be using an example of work I did to test a hypothesis at Automattic. We currently provide live chat support to two cohorts of customers: the folks who purchase WordPress.com Business, and customers who have purchased any upgrade at all (mostly domains and WordPress.com Premium). There has been a longstanding assumption that our live chats with Business customers run longer, since they have access to Ecommerce options as well as no-cost access to our entire library of Premium Themes.

So, I ported our live chat data out of Olark and into R, and threw together a box plot:

[Image: box plot of chat_duration by group_title]

– If this looks wrong somehow, that’s because it is: the boxes are so small that they are flattened. All we really see are the massive upward outliers.
– This does nothing to help us decide which type of chat tends to run longer. Our Business folks are on the left here, and our Paid customers are on the right.
– The next step is figuring out how to change this display so we can see what those boxes look like in a zoomed-in view.


> library(ggplot2)
> mydata = read.csv("~/olark_april_2015.csv")
> p = ggplot(mydata, aes(group_title, chat_duration))
> p + geom_boxplot()
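
For the zoomed-in view mentioned above, one option in ggplot2 is coord_cartesian(), which narrows the visible y-range without throwing away any rows. The snippet below is only a rough sketch and not what I ran on the real chats: the data frame is made-up stand-in data that reuses the group_title and chat_duration column names, and the 95th-percentile cut-off for the view is an arbitrary assumption.

```r
library(ggplot2)

# Made-up stand-in data, NOT the real Olark export. Both groups are drawn
# from the same skewed distribution, plus two artificial extreme values,
# so the plain box plot comes out flattened like the one above.
set.seed(1)
mydata <- data.frame(
  group_title   = rep(c("Business", "Paid"), each = 200),
  chat_duration = rexp(400, rate = 1 / 600)
)
mydata$chat_duration[c(1, 201)] <- c(40000, 55000)  # artificial outliers

p <- ggplot(mydata, aes(group_title, chat_duration)) + geom_boxplot()

# Arbitrary choice for the top of the zoomed view (an assumption).
upper <- unname(quantile(mydata$chat_duration, 0.95))

# coord_cartesian() changes only what is visible; the quartiles are still
# computed from every row, including the outliers that end up off-screen.
p_zoom <- p + coord_cartesian(ylim = c(0, upper))
print(p_zoom)
```

The reason to reach for coord_cartesian() here rather than ylim() or scale_y_continuous(limits = ...) is that limits set on the scale drop out-of-range values before the box plot statistics are calculated, which would shift the boxes themselves instead of just zooming in on them.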
