The Pitch for Looker at Automattic

Automattic, and WordPress.com, the part of Automattic where I work, have a lot of data.

A really tremendous amount of information: so much information it’s passed the point of being helpful, and has started to become a hindrance.

Where do you find the information you’re looking for? How do you know which potential source of information is correct? Do you want the numbers that accounting uses? The numbers that the billing system uses? Maybe a combination of the two – but wait, shouldn’t those numbers be the same?

One of the big goals, part of the vision of my current team, Marketing Data, is to figure out how we can be better ambassadors of our information to our colleagues. How we can serve as translators, or maybe sherpas (sherpii?), on this journey we’re all on together.

The importance of this goal is kind of abstract: we hire a lot of folks who are really brilliant, who shine in their areas of expertise and are poised to create explosive value for their team, their division, and the company at large.

For many of these folks, their background isn’t particularly technical: they may be super Excel savvy (which is, I believe, equivalent to programming in many ways!), but when it comes to directly querying and manipulating raw data with SQL or other querying languages, it’s too much – and I am sympathetic to that.

The way I see it, we’re putting two obstacles between these folks and their ability to realize their own greatness. Those obstacles limit the velocity of their impact: how quickly they’re able to go from curiosity to insight to revenue change.

The first obstacle is technical: it doesn’t make sense for them to learn the ins and outs of our data and also become fluent in querying languages. So, they have to make requests of data professionals or software engineers to get the information into their hands that will allow them to maximize their value – and in some sense, their own personal growth.

The second obstacle is social: no good colleague wants to feel like they’re wasting the time of the folks they work with. But this is the way that we’re making our comrades feel – maybe not intentionally, but implicitly. If, as my colleague Demet often says, only one out of any ten A/B tests will produce results, and those tests have to be analyzed by hand, the nine without results will leave the person who requested the analysis feeling a bit sheepish.

When there exists a piece of friction between curiosity and insight, when a professional has to ask a question of an analyst or engineer before they can validate their curiosity and pursue insight, we attach a tiny (sometimes not so tiny) cost onto that question. Great marketers ask tons of questions, because they know that it’s only through curiosity and exploration that they can find the kinds of insights they need to discover explosive growth.

When that friction exists because they can’t explore their curiosity directly, but only through other folks, we’re doing them a disservice, and reducing their ability to do their work.

Therefore, any good pursuit of that overall goal – being good ambassadors or translators for our data to less-technical members of our professional community – will work to break down those two obstacles.

Enter Looker.

We tried a few different Business Intelligence tools, and there were a few contenders for our attention, but I kept hearing about the potential of Looker and, especially, the importance of its modeling layer.

It took me a shamefully long time to understand what this thing is – I here humbly tip my hat to Matt Mazur and Stephen Levin and the rest of the good folks in the #Measure Slack channel for being so patient and generous with their explanations – and it turns out they were right, the particular features of the modeling layer really are a game changer.

Here’s why:

Imagine that you’re onboarding a new member of your data team: part of that onboarding will be cultural and social but a big part of it will be about relationships between different pieces of data – probably mostly tables but maybe also different sources. You’d say things like:

This table contains every user id, hopefully only once, along with some facts about each customer.

This one is every transaction, with a receipt id, and a total paid amount. The user id is in here too, but many times, since a single customer can have many transactions. If we want to join this table to that table, we need to remember that many-to-one type relationship, or we’ll have problems.

… and so on. In a sense you’re trying to take all of your own understanding about your data, and the way that it all clicks together to form a cohesive whole, and communicate that understanding into someone else’s brain.
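To make the many-to-one warning concrete, here is a minimal sketch in Python using the standard library's `sqlite3`. The table names, column names, and numbers are all made up for illustration (in particular `account_credit` is a hypothetical user-level fact), and this is not our actual schema. It shows how joining the one-row-per-user table to the many-rows-per-user table quietly double-counts a user-level value, and one way to avoid that.

```python
import sqlite3

# Hypothetical toy schema: one row per user, many transactions per user.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE users (user_id INTEGER PRIMARY KEY, account_credit REAL);
    CREATE TABLE transactions (
        receipt_id INTEGER PRIMARY KEY,
        user_id INTEGER,
        total_paid REAL
    );
    INSERT INTO users VALUES (1, 10.0), (2, 5.0);
    INSERT INTO transactions VALUES (100, 1, 20.0), (101, 1, 30.0), (102, 2, 15.0);
""")

# Naive: join first, then sum a user-level column.
# User 1 has two transactions, so their credit is counted twice.
naive_credit = conn.execute("""
    SELECT SUM(u.account_credit)
    FROM users u
    JOIN transactions t ON t.user_id = u.user_id
""").fetchone()[0]

# Correct total, taken straight from the one-row-per-user table.
correct_credit = conn.execute(
    "SELECT SUM(account_credit) FROM users"
).fetchone()[0]

# Safer join: collapse transactions to one row per user before joining,
# so the join is one-to-one and nothing fans out.
per_user = conn.execute("""
    SELECT u.user_id, u.account_credit, COALESCE(p.total_paid, 0)
    FROM users u
    LEFT JOIN (
        SELECT user_id, SUM(total_paid) AS total_paid
        FROM transactions
        GROUP BY user_id
    ) p ON p.user_id = u.user_id
    ORDER BY u.user_id
""").fetchall()

print(naive_credit, correct_credit)  # the naive figure is inflated
print(per_user)
```

The naive query does not fail or warn; it just returns a bigger number. That is exactly the kind of thing a new hire has to be told about.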

In a Matrix style future, you’d be able to plug a plug directly into your new hire and sort of transmit that understanding directly into their head, so they could behave as though they had all of your hard-won knowledge – along with everyone else who made use of that data, because, why not?

Well, that’s sort of the modeling layer.

The modeling layer is a way of defining, through code, all of the relationships and nuances of your existing data buildout, and then presenting them to folks in a useful way, with those relationships in place. They can ask questions and explore the data as though they had the sum total understanding of everyone who built the modeling layer.
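As a rough illustration of what "defining relationships through code" can mean, here is a toy sketch in Python. It is not LookML and not how Looker is implemented; the `MODEL` structure, the `Join` class, the `build_query` helper, and every table and field name are hypothetical. The idea is only that the joins and field definitions are written down once, and the person asking the question picks fields by name.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Join:
    table: str
    on: str
    relationship: str  # how the base table relates to the joined table


# Hypothetical model: the knowledge that would otherwise live in someone's head.
MODEL = {
    "base": "transactions",
    "joins": [
        Join("users", "transactions.user_id = users.user_id", "many_to_one"),
    ],
    "dimensions": {"user_id": "users.user_id"},
    "measures": {
        "total_paid": "SUM(transactions.total_paid)",
        "transaction_count": "COUNT(transactions.receipt_id)",
    },
}


def build_query(model, dimensions, measures):
    """Turn field names picked by a user into SQL, using the model's joins."""
    unknown = [d for d in dimensions if d not in model["dimensions"]]
    unknown += [m for m in measures if m not in model["measures"]]
    if unknown:
        raise KeyError(f"not defined in the model: {unknown}")

    # This toy only allows joins that cannot multiply rows of the base table.
    for join in model["joins"]:
        if join.relationship != "many_to_one":
            raise ValueError(f"unsupported relationship on {join.table}")

    select = [f'{model["dimensions"][d]} AS {d}' for d in dimensions]
    select += [f'{model["measures"][m]} AS {m}' for m in measures]

    lines = ["SELECT " + ", ".join(select), f'FROM {model["base"]}']
    lines += [f"JOIN {j.table} ON {j.on}" for j in model["joins"]]
    if dimensions:
        group_by = ", ".join(model["dimensions"][d] for d in dimensions)
        lines.append("GROUP BY " + group_by)
    return "\n".join(lines)


# Someone asks for "total paid per user" without writing any SQL.
sql = build_query(MODEL, ["user_id"], ["total_paid"])
print(sql)
```

The person choosing `user_id` and `total_paid` never has to know which tables are involved or how they join; that knowledge sits in the model.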

In some sense you already have a modeling layer – it’s just in your head, and can only be shared by explaining it to other folks. What Looker does is it gives everyone on your team – everyone in your company – the same powers as Rogue from X-Men.

The modeling layer can literally act like a superpower. Folks don’t have to understand how the data is stored, how the tables relate to one another, they don’t have to wake up in a cold sweat with the date/time doc from Apache Hive seared into their subconscious (Is month MM or mm??) – they just have to use a nice, clean GUI with solid built-in viz. And, they don’t have to ask anyone for help.

Looker gets us past our two big obstacles – once your data is modeled (which is not easy or fast but it is worth it), there is no longer a technical requirement for folks to explore the data, and there is, in the majority of cases, no social obstacle either.

Thus far it has been the right choice for us, and it’s something I look forward to working with for a long time.
