DVC Day 2

(This post is part of my 30-day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

Yesterday, I plotted the distribution of price in the diamonds dataset. One of the cons was that it showed how prices were distributed but gave no hint of any relationships that might help us understand _why_ the prices were the way they were.

To add some more depth, I’ve plotted price on the y-axis and carat on the x-axis:

[Image: scatter plot of price vs. carat for the diamonds dataset]

If you’ve spent any time with this dataset (or R tutorials) you’ve likely seen this visualization before.

Pros:
– Gives us more context about what might be driving price
– Has some interesting vertical separations
– Looks like it may indicate a trend

Cons:
– The density of points makes it hard to tell whether a dot is one data point deep or 300 data points deep
– It bothers me that “price” is vertical
– What are those vertical separations about?
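One way to start probing that last question is to count how many diamonds sit at each carat value. The snippet below is only a rough sketch: the variable names and the 0.95 to 1.05 window around one carat are my own illustrative choices, not something taken from the plot.

```r
library(ggplot2)  # provides the diamonds dataset

# Count how many diamonds share each exact carat value, most common first
carat_counts <- sort(table(diamonds$carat), decreasing = TRUE)

# Show the ten most common carat values
print(head(carat_counts, 10))

# Compare how many diamonds fall just below versus just at or above 1 carat
# (the window width of 0.05 is an arbitrary choice for illustration)
below_one <- sum(diamonds$carat >= 0.95 & diamonds$carat < 1.00)
at_one    <- sum(diamonds$carat >= 1.00 & diamonds$carat < 1.05)
print(c(below = below_one, at_or_above = at_one))
```

If the counts pile up at or just above round carat values, that would be consistent with the vertical bands in the scatter plot, though it would not by itself explain why they occur.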

Code:

library(ggplot2)
# Scatter plot of price against carat (the diamonds dataset ships with ggplot2)
qplot(carat, price, data = diamonds)
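To get at the density problem from the cons list, here is a possible variation on the qplot() call above. It is a sketch rather than a definitive fix: it uses the longer ggplot() form (recent ggplot2 releases deprecate qplot()), and the alpha value of 0.05 and the 100 bins are guesses that would likely need tuning.

```r
library(ggplot2)

# Same scatter plot, with semi-transparent points so that
# heavily overplotted regions appear darker than sparse ones
p_alpha <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05)

# Alternative: divide the plane into rectangular bins and
# color each bin by the number of diamonds that fall in it
p_bins <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 100)

print(p_alpha)
print(p_bins)
```

The transparent version keeps the look of the original plot, while the binned version puts an explicit count on each region, which speaks more directly to the "one deep or 300 deep" question.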

2 thoughts on “DVC Day 2”

  1. 2 for 2! Vertical separations are interesting. Seems like they occur at regular 1/2 carat intervals, with 1/2 and 2 1/2 being a little less defined and 3 simply lined-up dots. Guessing it is a psychological/marketing thing. Like a 1.899 carat does not sound as impressive as a 2.001 carat, now does it? Probably can’t tell the difference by looking at it, but it would be more desirable to buy and sell because it sounds better. Just guessing.

    1. I really like your thinking, Chris! There are so many things at play with diamonds, but you’re right, I’m not even sure what a carat is but I prefer 2 to 1.98 🙂
