DVC Day 29: Almost There!

(This post is part of my 30-day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

Here is the final graph I presented to my colleagues when discussing the difference in chat durations between our Paid customers and our Business customers.

[Image: Screen Shot 2015-05-11 at 3.51.39 PM]

Thoughts:

– This is more effective than the box-and-whisker graph because it shows that while Paid and Business chats have roughly the same median duration, the distribution of chat durations is not the same – note how the Business chats bump out on the longer end. Very interesting.
– Note also that the duration axis is now on a log scale (the natural log of chat_duration) – this compresses some of those huge outliers so they no longer stretch the plot.
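
To make the first point concrete, here is a minimal sketch in base R that compares the median against the upper percentiles for each group. It runs on made-up data, not the real export: the lognormal parameters and the names `paid`, `business`, `chats`, and `q_by_group` are assumptions chosen only so that both groups share a median while one has a longer tail.

```r
# Synthetic stand-in data (NOT the real chat export): two lognormal samples
# with the same median but a wider spread for the "Business" group.
set.seed(42)
n <- 2000
paid     <- rlnorm(n, meanlog = log(300), sdlog = 0.6)
business <- rlnorm(n, meanlog = log(300), sdlog = 1.0)

chats <- data.frame(
  group_title   = rep(c("Business", "Paid"), each = n),
  chat_duration = c(business, paid)
)

# Median, 90th and 99th percentile of chat duration for each group.
# Rows are the percentiles, columns are the groups.
q_by_group <- sapply(
  split(chats$chat_duration, chats$group_title),
  quantile,
  probs = c(0.5, 0.9, 0.99)
)
print(round(q_by_group))

# The log transform pulls the extreme values in, which is why the
# density plot is drawn on log(chat_duration) rather than raw duration.
print(range(chats$chat_duration))
print(range(log(chats$chat_duration)))
```

With data shaped like this, the 50% row comes out close for both groups while the 90% and 99% rows diverge, which is the pattern a box plot's median line hides and a density plot makes visible.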


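> library(ggplot2)
> mydata = read.csv("~/olark_april_2015.csv")
> # Plot the natural log of chat duration to compress the huge outliers
> q = ggplot(mydata, aes(log(chat_duration)))
> # alpha is a fixed setting, so it belongs outside aes(); inside aes() it is treated as data and adds a stray legend entry
> q + geom_density(aes(fill=factor(group_title, labels=c("Business","Paid"))), alpha=1/4) + ylab("% of Total Chats")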
