Subtitle: Source, Medium, Attribution, Stale Information, and The Future of Data
Here’s our situation – we want to be able to slice reporting and dashboards by a number of dimensions, including source and medium.
MARDAT (the team I’m lucky enough to be working with) is working to make this kind of thing a simple exercise in curiosity and (dare I say) wonder. Over the last year or so, it has become more and more clear to me how important enabling curiosity is. One of the great things that Google Analytics and other business intelligence tools can do is open the door to exploration and semi-indulgent curiosity fulfillment.
Imagine you’re a somewhat non-technical member of a marketing or business development team. You’re really good at a lot of things, and your experience gives you a sense of intuition about, and interest in, the information collected and measured by your company’s tools.
If the only way you can access that information is by placing a request for another person to go do 30 minutes, two hours, or three hours of work, that represents friction and latency in the process. You’re going to find yourself disinclined to place that kind of request unless you’re fairly certain that there’s a win there. It pushes back on curiosity, and it reduces your ability to access and leverage your expertise.
This is a bad thing!
That’s a little bit of a digression – let’s talk about Source and Medium. Source and Medium are defined pretty readily by most blogs and tools: these are buckets that we place our incoming traffic in. Wherever people were right before they arrived at our websites, that’s Source and Medium.
We assign other things too – campaign name, keyword, all sorts of things. My dilemma here actually applies to the entire category of things we tag our customers with, but it’s quicker to just say “Source and Medium.”
Broadly, Source is the origin (Google, another website, Twitter, and so forth) and Medium is the category (organic, referral, etc.). If this is all new to you, I recommend taking a spin through this Quora thread for a little more context.
What I am struggling with is this: for a site like WordPress.com, where folks may come and go many times before signing up, and may enjoy our free product for a while before making a purchase, at what point do you say, “OK, THIS is the Source and Medium for this person!”?
Put another way: when you make a report, say, for all sales in May, and you say to the report, “Split up all sales by Source and Medium,” what do you want that split to tell you?
Here are some things it might tell you:
- The source and medium for the very first page view we can attribute back to that customer, regardless of how long ago that page view was.
- The source and medium for a view of a page we consider an entry page (landing pages, homepage, etc.), regardless of how long ago that page view was.
- The source and medium for the very first page view within a certain window of time (7 days, 30 days, 1 year).
- The source and medium for the first entry page (landing page, homepage) within a certain window of time (7 days, 30 days, 1 year).
- The source and medium for the visit that resulted in a signup, rather than the first ever visit.
- The source and medium for the visit that resulted in a conversion, rather than the first ever visit.
- The source and medium for an arrival based on some other criteria (first arrival of all time OR first arrival since being idle for 30 days, something like that).
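To make the difference between these options concrete, here is a rough Python sketch of a few of them. Everything in it is hypothetical: the `Visit` record, the function names, and the idea that "the visit that resulted in a conversion" can be approximated as the last visit at or before the sale are my own assumptions for illustration, not how any particular analytics tool does it.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass(frozen=True)
class Visit:
    """Hypothetical record of one arrival at the site."""
    at: datetime
    source: str
    medium: str
    is_entry_page: bool = False


def first_touch(visits: List[Visit], before: datetime,
                window: Optional[timedelta] = None,
                entry_only: bool = False) -> Optional[Visit]:
    """Earliest visit at or before `before`, optionally limited to a
    lookback window and/or to entry pages only."""
    candidates = [
        v for v in visits
        if v.at <= before
        and (window is None or v.at >= before - window)
        and (not entry_only or v.is_entry_page)
    ]
    return min(candidates, key=lambda v: v.at, default=None)


def converting_touch(visits: List[Visit], before: datetime) -> Optional[Visit]:
    """Assumption: the converting visit is the last one at or before `before`."""
    candidates = [v for v in visits if v.at <= before]
    return max(candidates, key=lambda v: v.at, default=None)


def first_touch_since_idle(visits: List[Visit], before: datetime,
                           idle: timedelta = timedelta(days=30)) -> Optional[Visit]:
    """Walk backwards from the latest visit until we hit a gap longer than
    `idle`; return the first visit of that most recent active stretch."""
    ordered = sorted((v for v in visits if v.at <= before), key=lambda v: v.at)
    if not ordered:
        return None
    start = ordered[-1]
    for prev in reversed(ordered[:-1]):
        if start.at - prev.at > idle:
            break
        start = prev
    return start


# Made-up history for one customer, with a sale on 2019-05-20.
visits = [
    Visit(datetime(2012, 5, 1), "friendster", "referral", is_entry_page=True),
    Visit(datetime(2019, 5, 2), "twitter", "referral", is_entry_page=True),
    Visit(datetime(2019, 5, 14), "google", "cpc"),
]
sale = datetime(2019, 5, 20)

for label, visit in [
    ("first touch, all time", first_touch(visits, sale)),
    ("first touch, 30 days", first_touch(visits, sale, window=timedelta(days=30))),
    ("converting touch", converting_touch(visits, sale)),
    ("first touch since idle", first_touch_since_idle(visits, sale)),
]:
    print(label, "->", visit.source, "|", visit.medium)
```

With this made-up history, the same sale lands in three different buckets depending on the rule: friendster | referral for all-time first touch, twitter | referral for both the 30-day window and the since-idle rule, and google | cpc for the converting visit. That spread is exactly the ambiguity I'm wrestling with.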
It feels like at some point Source and Medium should go bad, right? If someone came to the site seven years ago, via Friendster or Plurk or something, signed up for a free site, and then came back last week via AdWords, we wouldn’t want to assign Friendster | Referral to that sale, right?
Maybe we have to create more dynamic Source/Medium assignment: have one for “First Arrival,” one for “Signup,” one for “Purchase.” Maybe even something like Source/Medium for “Return After 60+ Days Idle.”
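As a thought experiment, that kind of per-milestone tagging could look something like the sketch below. The `Arrival` record, the `event` field, and the label names are all invented for illustration. I'm also assuming each arrival already knows whether a signup or purchase happened during that visit, which is a big simplification.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Arrival:
    """Hypothetical record of one visit and what (if anything) happened in it."""
    at: datetime
    source: str
    medium: str
    event: Optional[str] = None  # "signup", "purchase", or None


def milestone_attribution(
    arrivals: List[Arrival],
    idle: timedelta = timedelta(days=60),
) -> Dict[str, Tuple[str, str]]:
    """Assign a separate (source, medium) pair to each milestone."""
    labels: Dict[str, Tuple[str, str]] = {}
    previous: Optional[Arrival] = None
    for arrival in sorted(arrivals, key=lambda a: a.at):
        tag = (arrival.source, arrival.medium)
        if previous is None:
            labels["first_arrival"] = tag
        elif arrival.at - previous.at >= idle:
            # The most recent return after a long idle gap wins.
            labels["return_after_idle"] = tag
        if arrival.event in ("signup", "purchase"):
            # Keep the first signup and the first purchase only.
            labels.setdefault(arrival.event, tag)
        previous = arrival
    return labels


# Made-up history: an old Friendster signup, then a recent AdWords purchase.
history = [
    Arrival(datetime(2012, 5, 1), "friendster", "referral", event="signup"),
    Arrival(datetime(2019, 5, 14), "google", "cpc", event="purchase"),
]
print(milestone_attribution(history))
```

For this history, "first_arrival" and "signup" both come out as friendster | referral, while "return_after_idle" and "purchase" come out as google | cpc, so the May sale would be credited to AdWords without throwing away where the customer originally came from.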
In the long run, it feels like having a sense of what sources are driving each of those behaviors more or less effectively would be helpful, and could help build insights – but I also feel a little crazy: does no one else have this problem with Source and Medium?