Author: Simon

The Future of the Toyota Production System

“What Akio Toyoda feared the company lost when it was growing so fast was the time to struggle and learn,” said Liker, who met with Toyoda in November. “He felt Toyota got big-company disease and was too busy getting product out.”

via ‘Gods’ edging out robots at Toyota facility | The Japan Times.

There always seem to be lessons to learn from Toyota. It’s interesting to see a perspective on running a business that looks at the downside of speed and the downside of a product focus.

DVC Day 9: Bin What?

(This Post is part of my 30 day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

To test our earlier hypothesis that the vertical striations in our data were due to a preference for “whole” carat numbers (or at least more readable numbers), we can look at a single-variable histogram:

[Image: histogram of diamond carat values, binwidth .01]

Thoughts:
– One interesting thing about this chart is the importance of binwidth, which sets how wide each bin is and therefore the resolution of the histogram. For instance, here’s this same chart with a binwidth of .15 rather than .01. It loses a lot of the utility of the chart above, because the spikes at individual carat values get merged into much wider bars.
– It might be interesting to display a second variable here in a way other than on the y-axis – as a color maybe.

Code:

library(ggplot2)
qplot(carat, data=diamonds, geom="histogram", binwidth=.01, xlim=c(0,3))
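
As a rough sketch of the binwidth point above, the same histogram can be built with ggplot() at both bin widths, and the most common carat values can be counted directly. The object names (base, fine, coarse, by.cut, top.carats) are only illustrative, and filling the bars by cut is just one assumption about what showing a second variable as a color might look like:

```r
library(ggplot2)

# Shared base: carat on the x-axis, zoomed to the 0-3 carat range.
base <- ggplot(diamonds, aes(x = carat)) + coord_cartesian(xlim = c(0, 3))

# Same histogram at two resolutions: 0.01-carat bins vs 0.15-carat bins.
fine   <- base + geom_histogram(binwidth = 0.01)
coarse <- base + geom_histogram(binwidth = 0.15)

# One way to show a second variable: fill each bar by cut
# (an assumption for illustration; any factor in the data would do).
by.cut <- ggplot(diamonds, aes(x = carat, fill = cut)) +
  geom_histogram(binwidth = 0.01) +
  coord_cartesian(xlim = c(0, 3))

# Count the ten most common carat values to check the "readable numbers" idea.
top.carats <- head(sort(table(diamonds$carat), decreasing = TRUE), 10)
top.carats
```

Typing fine, coarse or by.cut at the console draws the corresponding chart. coord_cartesian() zooms the view without dropping rows, which differs slightly from the xlim argument in the qplot call above.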

DVC Day 8: Messin’ with (More) Geoms

(This Post is part of my 30 day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

In playing a bit more with the qplot geom option, I spent some time with the “jitter” geom, which has nothing to do with coffee, as it turns out. Jittering is a neat method to fight against the same sort of dot density that we saw earlier in the challenge: it adds a small random offset to each point’s position, so points that would otherwise stack on top of each other spread out, which makes a visualization more readable. Here’s this same visualization without the jittering.

[Image: jittered plot of price/carat by diamond color, colored by clarity]

Thoughts:
– The more I do this, the more I realize I don’t know about diamonds.
– The more I do this, the more I realize I don’t yet understand about R and visualizing data. It’s exciting!
– There’s a consistent pattern to the clarity layers that we see, repeating what looks like 3 times, yellow, green, blue, pink, and then again, and then a third time, with pink sort of stretching skyward. What’s that about?
– The “J” color continues to be interesting to me – why is it so jumbled up when the others seem to be at least somewhat orderly? It also reaffirms our previous findings, where we noticed that “J” diamonds seemed to be outliers (in a bad way) on the price vs. carat chart.

Code:


library(ggplot2)
qplot(color, price/carat, color=clarity, data=diamonds, geom="jitter")
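
To make the jittering idea a little more concrete, here is a minimal sketch that builds the plot with and without jitter, plus a tiny base R example of what jittering does to a set of identical values. The width and height settings and the object names are illustrative assumptions, not the settings behind the image above:

```r
library(ggplot2)

# Shared mapping: color grade on the x-axis, price per carat on the y-axis.
base <- ggplot(diamonds, aes(x = color, y = price / carat, color = clarity))

# Without jitter: every point in a color grade sits on one vertical line.
stacked <- base + geom_point()

# With jitter: each point gets a small random horizontal offset.
# width = 0.4 and height = 0 are illustrative choices.
jittered <- base + geom_jitter(width = 0.4, height = 0)

# The same idea in base R: jitter() adds uniform noise within +/- amount.
set.seed(1)
x <- rep(1, 5)
x.jittered <- jitter(x, amount = 0.4)
x.jittered
```

Setting height = 0 keeps the price/carat values exact and only spreads points sideways within each color grade, which is one reasonable choice when the y-axis carries the measurement of interest.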

DVC Day 7: Messin’ with Geoms

(This Post is part of my 30 day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

In noodling around with the different options of the qplot function (and there are plenty), I found myself going back and forth on the geom option. Here’s one of the possible inputs, smooth, which takes us from yesterday’s graph to one of just very smooth lines, with shading indicating the confidence interval around each smoothed line:

[Image: smoothed price vs. carat lines for J-color diamonds, one line per clarity]

Thoughts:
– This is a really interesting example of another case where we trade some visual precision for more visual utility. For example, that same graph using an arguably more precise plotting of lines looks like a total, and useless, mess.
– The green line is particularly interesting, since it appears to plateau at a certain point, about the same place where it becomes the only remaining clarity.

Code:


library(ggplot2)
only.j <- subset(diamonds, color=="J")
j <- qplot(carat, price, data=only.j, color=clarity, geom=c("smooth"))
j
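
For comparison, here is a hedged sketch of the same idea written with ggplot() layers, showing the smoothed lines with and without the shaded band next to the “more precise” line plot mentioned in the thoughts above. In geom_smooth() the band is a confidence interval around the fitted line (95% by default). The object names are illustrative:

```r
library(ggplot2)

# Keep only the J-colored diamonds, as in the post.
only.j <- subset(diamonds, color == "J")
base <- ggplot(only.j, aes(x = carat, y = price, color = clarity))

# Smoothed fit per clarity grade; the ribbon is the confidence interval.
smoothed <- base + geom_smooth(se = TRUE, level = 0.95)

# Same fit with the ribbon switched off.
no.band <- base + geom_smooth(se = FALSE)

# The "more precise" alternative: connect every observation with a line.
raw.lines <- base + geom_line()
```

Typing smoothed, no.band or raw.lines at the console draws each version, which makes the precision versus utility trade-off easy to see side by side.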

DVC Day 6

(This Post is part of my 30 day Data Visualization Challenge – you can follow along using the ‘challenge’ tag!)

After taking a look at the different colors of diamonds in a sample, I noticed that the diamonds colored “J” appeared to be unusual outliers. Following this tack, I created a subset of the larger diamond data set that contained _only_ the J colored diamonds – then, plotted that on the same carat/price graph we’ve seen, but with color now indicating the diamond’s clarity:

[Image: price vs. carat for J-color diamonds, colored by clarity]

Thoughts:
– I also added a title, and started using variable names as I build around a data frame, which makes it much easier.
– We can see at least one of those vertical striations that we saw in the original data set.
– It looks like the outliers at the low-price, high-carat end of the J-colored diamonds are larger but less valuable than their peers.
– This graph is a bit muddy, but we can for sure see what look like trends in clarity correlating with price as we go from orange to green to pink/purple.

Code:


library(ggplot2)
only.j <- subset(diamonds, color=="J")
j <- qplot(carat, price, data=only.j, color=clarity, size=I(1.5))
j <- j + ggtitle("J-Color Diamond Clarity & Pricing")
j
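
One way to check the “J diamonds are outliers” impression with numbers rather than by eye is to compare the median price per carat across color grades, and to count how many J diamonds fall into each clarity grade. This is only a sketch: the ppc column and the other object names are made up for illustration, and the median is just one possible summary:

```r
library(ggplot2)  # provides the diamonds data set

# Work on a copy so the original data frame is left untouched.
d <- as.data.frame(diamonds)
d$ppc <- d$price / d$carat  # price per carat (hypothetical helper column)

# Median price per carat within each color grade (D through J).
ppc.by.color <- aggregate(ppc ~ color, data = d, FUN = median)
ppc.by.color

# How many J diamonds there are in each clarity grade.
only.j <- subset(d, color == "J")
j.clarity.counts <- table(only.j$clarity)
j.clarity.counts
```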