Yesterday, I plotted the distribution of price in the diamonds dataset. One of the cons was that the plot showed how prices were distributed, but gave no hint of the relationships that might help us understand _why_ the prices look the way they do.
To add some more depth, I’ve plotted price on the y-axis and carat on the x-axis:
If you’ve spent any time with this dataset (or R tutorials) you’ve likely seen this visualization before.
Pros:
– Gives us more context about what might be driving price
– Has some interesting vertical separations
– Looks like it may indicate a trend
Cons:
– The density of points makes it hard to tell whether a dot is one data point deep or 300 data points deep
– It bothers me that “price” is vertical
– What are those vertical separations about?
library(ggplot2)
qplot(carat, price, data=diamonds)
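Two of the open questions above (how deep is each dot, and what the vertical separations are) can be poked at with a few more lines. This is only a rough sketch, not part of the original plot: the alpha value of 0.05 and the 100 bins are arbitrary starting points I picked for illustration, and the names p_alpha, p_bins and carat_counts are made up here. It uses ggplot() rather than qplot(), but draws the same carat-versus-price scatter.

```r
library(ggplot2)

# Overplotting, option 1: make each point mostly transparent so that
# regions with many stacked points read darker than isolated ones.
# alpha = 0.05 is an assumed starting value; tune it by eye.
p_alpha <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point(alpha = 0.05)

# Overplotting, option 2: bin the carat/price plane into a grid and
# colour each cell by the number of diamonds that fall inside it.
# bins = 100 is likewise an assumption, not a recommendation.
p_bins <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_bin2d(bins = 100)

# Vertical separations: count how many diamonds share each distinct
# carat value, sorted so the most common weights come first.
carat_counts <- sort(table(diamonds$carat), decreasing = TRUE)
print(head(carat_counts, 10))

print(p_alpha)
print(p_bins)
```

If the most common carat values turn out to line up with the vertical bands in the scatter, that would suggest the separations come from diamonds clustering at particular weights, but the counts alone do not explain why they cluster there.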