As we finish out the 30 days, I’ll be using an example of work I did to test a hypothesis at Automattic. We currently provide live chat support to two cohorts of customers: the folks who purchase WordPress.com Business, and customers who have purchased any upgrade at all (mostly domains and WordPress.com Premium). There has been a longstanding assumption that our live chats with Business customers run longer, since those customers have access to Ecommerce options as well as no-cost access to our entire library of Premium Themes.
So, I ported our live chat data out of Olark and into R, and threw together a box plot:
– If this looks wrong somehow, that’s because it is: our box is so small as to be flattened. All we really see are the massive upward outliers.
– This does nothing to help us decide which type of chat tends to run longer. Our Business folks are on the left here, and our Paid customers are on the right.
– The next step is figuring out how to change this display so we can see what those boxes look like in a zoomed-in view.
> library(ggplot2)
> mydata = read.csv("~/olark_april_2015.csv")
> p = ggplot(mydata, aes(group_title, chat_duration))
> p + geom_boxplot()
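One possible way to get that zoomed-in view is sketched below. It is only an illustration, not necessarily the approach taken with the real data: the `demo` data frame is made-up stand-in data (the Olark export is not reproduced here), and the cutoff `upper` is an assumption chosen to sit just above the tallest whisker. The sketch uses `coord_cartesian()`, which crops the visible area without dropping rows, so the boxes are still computed from every chat. Setting limits with `ylim()` would instead discard the out-of-range chats before the statistics are calculated.

```r
library(ggplot2)

# Made-up stand-in for the chat export: two groups with a long right tail.
# Durations are in arbitrary units; none of these numbers are real.
set.seed(42)
demo <- data.frame(
  group_title   = rep(c("Business", "Paid"), each = 200),
  chat_duration = c(rexp(200, rate = 1 / 600), rexp(200, rate = 1 / 500))
)
# Add a couple of extreme chats so the boxes get flattened at full scale.
demo$chat_duration[c(1, 201)] <- c(60000, 45000)

p <- ggplot(demo, aes(group_title, chat_duration)) + geom_boxplot()

# Find the highest upper whisker across the groups (5th value of boxplot.stats).
upper <- max(tapply(demo$chat_duration, demo$group_title,
                    function(x) boxplot.stats(x)$stats[5]))

# Crop the view to just above that whisker; the data itself is untouched.
p_zoom <- p + coord_cartesian(ylim = c(0, upper * 1.05))
print(p_zoom)
```

With the real data, the same idea would apply to `mydata` and `p` from the snippet above, with the y-axis limit picked by eye or from the whiskers as here.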