# Source & Medium: A Medium Sized Dilemma

Subtitle: Source, Medium, Attribution, Stale Information, and The Future of Data

Here’s our situation – we want to be able to slice reporting and dashboards by a number of dimensions, including source and medium.

MARDAT (the team I’m lucky enough to be working with) is working to make this kind of thing a simple exercise in curiosity and (dare I say) wonder. It’s really interesting to me, and has become more and more clear over the last year or so, how important enabling curiosity is. One of the great things that Google Analytics and other business intelligence tools can do is open the door to exploration and semi-indulgent curiosity fulfillment.

You can imagine, if you’re a somewhat non-technical member of a marketing or business development team, you’re really good at a lot of things. Your experience gives you a sense of intuition and interest in the information collected by and measured by your company’s tools.

If the only way you can access that information is by placing a request for another person to go do 30 minutes, two hours, or three hours of work, that represents friction and latency in the process. You’re going to find yourself disinclined to place that kind of request unless you’re fairly certain that there’s a win there – it pushes back on curiosity. It reduces your ability to access and leverage your expertise.

This is a bad thing!

That’s a little bit of a digression – let’s talk about Source and Medium. Source and Medium are defined pretty readily by most blogs and tools: these are buckets that we place our incoming traffic in. Wherever people were right before they arrived at our websites, that’s Source and Medium.

We assign other things too – campaign name, keyword, all sorts of things. My dilemma here actually applies to the entire category of things we tag our customers with, but it’s quicker to just say, Source and Medium.

Broadly, Source is the origin (Google, another website, Twitter, and so forth) and Medium is the category (organic, referral, etc.) – if this is all new to you, I recommend taking a spin through this Quora thread for a little more context.

What I am struggling with is this: for a site like WordPress.com, where folks may come and go many times before signing up, and may enjoy our free product for a while before making a purchase, at what point do you say, “OK, THIS is the Source and Medium for this person!”

Put another way: when you make a report, say, for all sales in May, and you say to the report, “Split up all sales by Source and Medium,” what do you want that split to tell you?

Here are some things it might tell you:

• The source and medium for the very first page view we can attribute back to that customer, regardless of how long ago that page view was.
• The source and medium for a view of a page we consider an entry page (landing pages, home page, etc), regardless of how long ago that page view was.
• The source and medium for the very first page view, within a certain window of time (7 days, 30 days, 1 year).
• The source and medium for the first entry page (landing page, homepage) within a certain window of time (7 days, 30 days, 1 year).
• The source and medium for the visit that resulted in a signup, rather than the first ever visit.
• The source and medium for the visit that resulted in a conversion, rather than the first ever visit.
• The source and medium for an arrival based on some other criteria (first arrival of all time OR first arrival since being idle for 30 days, something like that)

It feels like at some point Source and Medium should go bad, right? If someone came to the site seven years ago, via Friendster or Plurk or something, signed up for a free site, and then came back last week via AdWords, we wouldn’t want to assign Friendster | Referral to that sale, right?

Maybe we have to create more dynamic Source/Medium assignation: have one for “First Arrival,” one for “Signup,” one for “Purchase.” Maybe even something like Source/Medium for “Return After 60+ Days Idle”
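
To make a few of those options concrete, here is a minimal Python sketch of how they could be computed for a single customer. Everything in it is hypothetical: the `Visit` record, the `first_touch` and `last_touch` helpers, the dates, and the `google` / `cpc` labels standing in for an AdWords click are assumptions for illustration, not a description of how our data is actually stored.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Visit:
    """One arrival at the site, tagged with where it came from (hypothetical shape)."""
    at: datetime
    source: str
    medium: str


def first_touch(visits: List[Visit], before: datetime,
                window: Optional[timedelta] = None) -> Optional[Visit]:
    """Earliest visit at or before `before`, optionally limited to a lookback window."""
    eligible = [
        v for v in visits
        if v.at <= before and (window is None or v.at >= before - window)
    ]
    return min(eligible, key=lambda v: v.at, default=None)


def last_touch(visits: List[Visit], before: datetime) -> Optional[Visit]:
    """Most recent visit at or before `before` (the visit that led to the event)."""
    eligible = [v for v in visits if v.at <= before]
    return max(eligible, key=lambda v: v.at, default=None)


# Made-up history: an old referral, then a paid click shortly before a purchase.
visits = [
    Visit(datetime(2011, 5, 2), "friendster", "referral"),
    Visit(datetime(2018, 5, 20), "google", "cpc"),
]
purchase_at = datetime(2018, 5, 21)

all_time = first_touch(visits, purchase_at)
windowed = first_touch(visits, purchase_at, window=timedelta(days=30))
latest = last_touch(visits, purchase_at)

for name, visit in [("first ever", all_time), ("first in 30 days", windowed), ("last", latest)]:
    print(f"{name}: {visit.source} | {visit.medium}")
```

Same customer, same sale, three defensible answers: the all-time first touch credits Friendster, while the 30-day window and the last touch both credit the paid click. Which one a report should show is exactly the dilemma.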

In the long run, it feels like having a sense of what sources are driving each of those behaviors more or less effectively would be helpful, and could help build insights – but I also feel a little crazy: does no one else have this problem with Source and Medium?

# Cogitating on Return on Ad Spend – AKA ROAS

I’m still pretty new to this whole marketing thing: I’ve been a part of Automattic’s marketing efforts for just over a year, and I feel like I’m still learning: the pace of education hasn’t slowed down even a bit.

One of the things that was a real challenge for me was getting to understand the language of the work, especially given our interactions with a number of outside vendors and agencies: the number of acronyms, shorthand and unusual usage of otherwise common words is a huge part of the advertising world, and it serves many purposes.

The import of accessible language is probably something I should save for its own post: I think that, especially in a highly interdependent company like Automattic, opaque language, complex jargon, and inscrutable acronyms are more of a hindrance than a help, and in fact likely do us harm, given the way that we, as humans, myself included, want to feel smart, and powerful, and it can be very attractive to nod along rather than ask hard questions.

If you’ve been following this blog for a little while, you know that measurement and the implications of measurement are things that I think about – here’s a piece about metrics generally.

My broad position on metrics is, they’re reductive, necessarily and usefully so, and need to be understood as means rather than as ends.

All that to say, we should also be careful not to treat our metrics as being perhaps more reductive than they really are, or to behave as though what we are measuring is simple, when in fact it is not simple at all.

Taking something complex and making it simple enough to be useful – that’s the essential core of all measurement. Taking something complex and acting like it is something simple is another thing entirely, and a very easy way to increase your overall Lifetime Error Rate.

This brings us to Return on Ad Spend, sometimes shortened to ROAS. Return on Ad Spend can be calculated like this:

$\frac{r}{S}$

…with $r$ being revenue and $S$ being spend. Generally the output is represented either by a ratio like 3:1, where for every dollar you spend on advertising, you get three dollars’ worth of revenue, or with a percentage – 3:1 would be represented as 300%.
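
As a quick sketch of that arithmetic in Python: the `roas`, `as_ratio`, and `as_percent` names are made up for this example, and the \$300 and \$100 figures are illustrative numbers chosen to reproduce the 3:1 case.

```python
def roas(revenue: float, spend: float) -> float:
    """Return on Ad Spend: revenue divided by spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return revenue / spend


def as_ratio(value: float) -> str:
    """Format a ROAS value as a ratio like '3:1'."""
    return f"{value:g}:1"


def as_percent(value: float) -> str:
    """Format a ROAS value as a percentage like '300%'."""
    return f"{value:.0%}"


# Illustrative numbers: $300 of revenue on $100 of ad spend.
example = roas(revenue=300.0, spend=100.0)
print(as_ratio(example), "or", as_percent(example))  # 3:1 or 300%
```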

It looks pretty simple. It’s generally referred to as being very simple, or easy, that kind of thing. Which, well, it is, at least on the face of it.

(The rest of this post is about the sometimes hidden complexities of ROAS. If you want to learn more about using the metric in a tactical way, John at Ignite Visibility has a great write-up on how to calculate and break out ROAS, as well as some wrinkles about attribution, which I recommend if that’s what you’re looking for. Here’s a link)

Let’s talk about this metric: ROAS. The name holds a lot of promise, right? Return on Ad Spend: something everyone who spends money on ads wants to learn, the dream of marketers everywhere. How much are we taking back in, for the amount we are putting out?

The trick of ROAS is, we have built in a set of assumptions: specifically, that the numbers we put in represent the whole of each of those categories. The trouble here is that there are only very specific parts of the marketing spend where that is a safe assumption: low-funnel type tactics, especially for e-commerce companies shipping physical products.

In these situations, for these companies, ROAS tends to be a clean metric: you have a very clear picture of where you are spending money, and each transaction has a straightforward, static revenue.

The trick is that for SaaS companies, ROAS can become much more complicated. Imagine your company sells a single product, some type of Helpful Business Software, and it retails for \$100 / year. If you run some numbers, you find that you spend on average \$50 in ads to get a customer – this looks good, right? We can say we have 200% ROAS and call it a day.

Of course, one of the great advantages of having Data is that we are able to record it, and then see how it changes over time, and try to do the sorts of things in our business that move the needle in our desired direction.

For a SaaS company, two of the metrics that you live or die by are Customer Lifetime Value (sometimes called CLV or LTV) and the dreaded Churn Rate – astute readers will note that these two metrics are inextricably linked. Briefly: LTV is the amount of revenue that your business can expect to make from a given customer, and the dreaded Churn Rate is the expected share of customers who will cancel in a given period (generally expressed as a rate out of 100, represented as a percent, like: “Our Dreaded Churn Rate is a spooky 13%!”)

A savvy SaaS marketing analyst will use the expected lifetime value of a customer in the top of the fraction up there, to determine Return on Ad Spend – for two great reasons. FIRST, because it is more accurate: if you’re looking to determine the total return it makes more sense to use LTV than simply the ticket price. SECOND, because it will make her look better in her reporting.

Consider: for this same sale of our Helpful Business Software, our expected LTV isn’t \$100, which is the annual cost of our product, but rather, \$200. This doubles our ROAS. This is great news!

(It’s not really news at all though, right? We’re not actually improving either our ads or our product, we just used a more accurate number. Metrics are means!)

One wrinkle, though, is that now we’re not really using that equation above anymore – we’re using something more like:

$\frac{LTV}{S}$

If you’ve ever spent any time trying to calculate your customers’ lifetime value, you know that this has suddenly become a much more complicated metric.
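
Here is a rough Python sketch of why. It leans on a common back-of-the-envelope shortcut, LTV ≈ annual price ÷ annual churn rate, which assumes churn stays constant forever. That shortcut, the `simple_ltv` and `ltv_roas` names, and the 50% churn figure are all assumptions for illustration (50% is picked only because it reproduces the \$200 LTV from the example); a real LTV model has many more moving parts.

```python
def simple_ltv(annual_price: float, annual_churn: float) -> float:
    """Back-of-the-envelope LTV, assuming a constant annual churn rate."""
    if not 0 < annual_churn <= 1:
        raise ValueError("annual_churn must be in (0, 1]")
    return annual_price / annual_churn


def ltv_roas(ltv: float, spend: float) -> float:
    """ROAS with lifetime value, rather than ticket price, on top of the fraction."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    return ltv / spend


price = 100.0   # annual price of the Helpful Business Software
spend = 50.0    # average ad spend to win one customer

# Same ads, same spend: only the churn assumption changes.
for churn in (0.5, 1.0):
    ltv = simple_ltv(price, churn)
    print(f"churn {churn:.0%}: LTV ${ltv:.0f}, ROAS {ltv_roas(ltv, spend):.0%}")
```

Nothing about the advertising changes between those two rows. The only input that moves is churn, which lives in a completely different part of the business.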

Once we start to bring more complicated ingredients, things like LTV, into our ROAS pie, ROAS moves from being a static sort of snapshot into a metric that is much more dependent on other parts of the business to be successful.

In the above example, imagine if your company has had a disastrous year, and your Dreaded Churn Rate has skyrocketed, driving your LTV down to below \$50 (due to, let’s say, sweeping customer refunds and growing customer support costs) – now our ROAS is below 100%, since we are still spending \$50 in ads per customer, even though literally nothing has changed on the advertising side. In this situation, ROAS becomes a larger aggregate metric, telling us something about the business at large.

This brings to mind a larger question: do we want ROAS to be a heartbeat metric, an indicator of the business overall? Or do we want it to be what it was about a thousand words ago, a simple snapshot of how our advertising efforts are going?

As we move away from direct retail e-commerce businesses into more complex companies, and up what’s called the advertising funnel, ROAS becomes additionally tricky, not because the equation itself becomes more complicated, but because we start to introduce uncertainty, and even worse than that, we introduce unequal uncertainty.

Generally, you know how much you’ve spent. This is true even for less measurable marketing efforts, things like event sponsorships, branding, and so forth. What you decide to include is a little bit of a wrinkle: do you include agency fees? Payroll?

The uncertainty comes into play in the revenue piece, and this is why ROAS as a metric starts to break down as we move up the funnel, because the lower part of your fraction, your spend, stays certain, while the upper part, the revenue, becomes increasingly uncertain, which makes the output more and more difficult to use in a reliable way.

This is a problem that crops up a lot in marketing metrics, and something I’ve been thinking on quite a lot: we often will compare or do arithmetic on numbers which have wildly different underlying levels of base uncertainty, sometimes to our detriment, maybe sometimes to our advantage.
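
One way to see the problem is to carry the uncertainty through the division rather than dropping it. The Python sketch below is only an illustration: the `roas_range` helper and every number in it are invented, and it treats spend as exactly known while revenue is only known to lie somewhere in a range.

```python
from typing import Tuple


def roas_range(revenue_low: float, revenue_high: float, spend: float) -> Tuple[float, float]:
    """Lowest and highest ROAS consistent with a revenue range and a known spend."""
    if spend <= 0:
        raise ValueError("spend must be positive")
    if revenue_low > revenue_high:
        raise ValueError("revenue_low must not exceed revenue_high")
    return revenue_low / spend, revenue_high / spend


# Two made-up campaigns with the same spend and the same midpoint revenue ($300).
low_funnel = roas_range(revenue_low=290.0, revenue_high=310.0, spend=100.0)
upper_funnel = roas_range(revenue_low=100.0, revenue_high=500.0, spend=100.0)

for name, (low, high) in [("low funnel", low_funnel), ("upper funnel", upper_funnel)]:
    print(f"{name}: ROAS somewhere between {low:.0%} and {high:.0%}")
```

Both campaigns would report a headline ROAS of 300% if we only used the midpoint, but one of those numbers is something you could plan around and the other could mean anything from breaking even to a huge win.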

I’ve been working with ROAS quite a lot, and trying to really get my teeth into it, and my brain around its under-the-surface complexity. For most businesses today, ROAS is useful, but it is not as simple as it looks.

This is where I ask you to add something in the comments! What metrics are stuck in your craw this week? Do you think I spend too much time trying to become certain about uncertainty? Let me know!

# Working Remotely and Tangible Craft

Before I started working at Automattic (the folks behind WordPress.com and the Jetpack WordPress platform), I had a career in what I call progressive coffee: high end coffee, well made, ethically sourced, that kind of thing.

(I’ve written a little about my journey from a hospitality job to a job with a tech company here and here, if you’re curious about that)

While I was working at Seven Stars Bakery in Providence, Rhode Island, our biggest day of the week was Saturday, especially Saturday morning. One part of my job there was designing the layout of the employee space, as well as building and improving the training programs. Spending my Saturday mornings at the busiest location on the busiest day was a great way to ensure that my work was successful, and was adding value to the company and customers when the rubber hit the road.

During one of these shifts, I’d typically clock in 14,000+ steps: it is borderline insane to think about, now, when I struggle to get in 10,000 steps in a whole day!

Preparing really excellent espresso, tasting coffee to decide what to bring on as an offering: these are inherently and importantly physical and sensorial activities: awareness of the body and what it’s telling you is part and parcel of finding success in these pursuits. You have to not just heat the milk, you have to listen, to watch it, to gauge the temperature of the pitcher against your hand. You spend a lot of time focused on and attending to your sense inputs, using your body and its inputs in increasingly focused and demanding ways.

The combination of focused precise work (which being a quality-focused barista in a busy espresso bar absolutely is) with 14,000 steps meant that the days were cognitively and physically exhausting.

Moving away from this career into what I do now, working for a software company, and more specifically working fully remotely, was a real shift, in some ways I expected (I didn’t have to work on Saturdays anymore!) and in some ways I did not expect.

One of the unexpected changes was that I found I was attracted to hobbies that were much more manual: gardening at first, and more recently woodworking.

Over time I’ve come to realize that the part of me that is fulfilled by building things and planting vegetables is the same part of me that was fulfilled by those busy Saturdays: there’s a value to using the body, to getting to know what your body can do and how many amazing things you can teach it to accomplish.

Part of it, too, is that manual pursuits, physical crafts, impose humility on the practitioner: you cannot Google how to cut a perfect dovetail.

Well, that’s not true – you can Google it, and get lots of results! But, you can’t Google how to actually do it, like Neo in the Matrix. Once the saw hits the lumber, the truth comes out. There is no quick route: if you want to make beautiful things out of wood, you have to spend a lot of time making pretty ugly stuff first.

You simply have to put the miles on: there aren’t any shortcuts.

(This is also true for another new pursuit of mine – Brazilian Jiu Jitsu, but you’d have to substitute “get choked out by strangers” for “make ugly stuff” – different but the same!)

Like everything, becoming someone who works with code means re-learning everything I’ve known to be true again, in a new pursuit. I realized and came to appreciate the need for exposure, for time with my saws and chisels and on the mats – and at the outset it seemed so different from writing code. Coding, for the beginner, many times does feel like there are short cuts, with Stack Exchange, with other folks’ code or libraries, etc.

It turns out that what all those folks on Stack Exchange have been saying all this time (while being mostly ignored) is true: copy-pasting a solution that works is different from understanding the problem, from getting a deep sense of the solution and how to pursue it. It’s like buying someone else’s chest of drawers rather than building it yourself. It’s different from having the answer in your bones.

The more I learn, the more I learn that everything is the same. For me, I’ve learned that remote work is an amazing way to work, and writing software and doing data work is special, and important, and so satisfying: but the brain and the body, or my brain and body at least, really need that opportunity to do work in the physical space, to hold something, to engage in craft that doesn’t happen on my laptop screen.

If you are a remote worker, and you don’t have a hobby or pursuit that takes you off of your computer and into the garage or the gym – you should give it a try!

# Your Classic New Year’s Post

2017 was a huge year for me and the Doc and the kids!

My kids turned three and one, and are starting to become little people with their own opinions and positions – they are also developing a complex relationship with one another which I had for some reason totally failed to expect until it came into sudden and sharp focus.

My wife, the Doc, has received tenure! Associate Professor no more, she has a new name badge on her office door and a continued bright professional future. I am so proud of her, even as this accomplishment is staggering for me to consider: she’s spent practically the whole of her professional life working toward this goal. When she started graduate school, ostensibly the first step toward tenure, I was still an undergraduate. This is the culmination of more work than all of the work I’ve done. She’s amazing.

I moved into this new role at WordPress.com – from leading a Happiness Team, to now this more data focused role with our (still pretty new!) Marketing team. The job title is Data Analyst but, like any job at a startup, I have plenty of hats. Adjusting my mindset from leading humans to writing SQL has been a challenge, but a great opportunity for growth as well.

2018 is shaping up to be an exciting year: it will mark my fifth anniversary at WordPress.com and Automattic, which I’ll be able to celebrate by taking a 3-month paid sabbatical in late summer / early fall.

I think, in 2018, I would like to try to spend some time leaning into being more of a light bulb: I’ve found success in focus, in great focus and goal-setting and the fairly standard practices there.

I’ve always felt a certain drive to be significant, to try to be a Big Fish in whatever capacity I can – in 2018 I think I’ll try to be OK as a Small Fish. Doing my work well, contributing thoughtfully without fanfare. Rather than shouting, focus on listening. Try to really embrace the role of the student. That sort of thing.

# Fatigue: Emotional and Intellectual

I know not everyone follows my LinkedIn profile with rapt attention. That’s OK – I don’t follow your LinkedIn profile very closely, either.

So, you might not know: I’ve moved into a different-but-not-so-different role at Automattic (the folks behind WordPress.com, WooCommerce, Jetpack, and a heap of other great stuff).

I was previously leading a support team, and have since moved into a role that we call a Data Analyst, on the Marketing team.

If you’re familiar at all with Automattic’s naming policies, both for jobs and teams, yes, this is an outlier in the direction of the mundane in both cases. I went from being a Happiness Engineer on Team Athens (and also technically on Team Redwood) to being a Data Analyst on Customer Activation.

100% consistent with Automattic standard, though, are the many and varied hats that come with this role: at other companies my day to day work could be described as Marketing, Growth Engineering, SEO, SEM, Pay Per Click, Customer Success, Data Science, even a little bit of database architecture. It’s a lot!

(As a sidebar, I think the job duties and title change may make it seem like I’m making a career change or otherwise sort of changing direction – let me assure you, my focus continues to be on building explosive value for our customers. I’m expanding my tool set – not changing my approach.)

Today was a cognitively demanding day. Working remotely means taking on a lot of responsibility for the structure and organization of one’s work – I’m still figuring out how to do that the best way I can, in this new role. It also means being disciplined enough to push back distractions, which are constantly at the ready in any browser window.

Spending hours looking at databases, considering queries, performance, outputs, accuracy – this is work, it’s real work, and it builds fatigue. A day of work, focused, attentive work, can certainly leave me in need of a deep breath and a long walk.

(I personally find it especially hard to think critically and well about SQL, statistics, databases, and so forth, near the end of my work day. It’s like I’m running out of gas.)

What struck me today was how different this kind of fatigue feels, especially compared to the kind of fatigue I’d feel after a tough day leading a team of Happiness Engineers. I’ve reduced them into two distinct types for the title – Emotional and Intellectual – but I’m sure there is some overlap, maybe some days more than others.

Maybe the difference is, in the lead role, the fatigue comes from trying to serve others, trying to hold them and their full personhood in your mind, whereas in this analyst role the fatigue comes from the intensely individual and personal kind of focus it takes to do it well, to take it seriously.

It does feel different to say to my wife, “I had a hard day – I couldn’t get the data types to reconcile the way I wanted,” rather than “I had a hard day – I think I really let some people down.”

Maybe they’re not different. Maybe I’m different.