
Data For Data's Sake

Anecdote

Recently I moved to a new team that was looking to improve how they use data and metrics to report on their outcomes. The team's metrics had historically been pretty light, largely focused on launches or showing up at industry events. While it was clear the team had plenty of inputs to show that, yes, they were doing their jobs, we didn't really have a good answer to "so what?" or why leaders should invest in creating these inputs.

I was asked to join a task force to help this team update their "page zero". This is a cover page used in quarterly reporting that gives leaders a snapshot of high-level performance, which is then covered in greater depth with commentary on the following pages. The page zero, or P0, as it's often called, is important to get right. I've been part of numerous reviews where there was more commentary on what's included on the P0 than on deriving any insight or action from it. This is the wrong conversation to have with your leaders and stakeholders. The design of the page should make it easy to discuss the data and insights. If you're focusing on how the information is presented rather than on the information itself, you're having the wrong conversation at that stage.

My colleague set up a meeting with a senior person on our data team to help guide how we would get this built. I was expecting a conversation about how to use data to effectively articulate the team's work, understand the feasibility of specific metrics, and generally design a P0 that helps us tell a story about what our team is doing and why it matters.

Instead, it was a barrage of questions about data accessibility.

"Is this number available? How do I get it? Who do I talk to? Can you just run a report for me this once?"

These were the types of questions being asked. When I pushed the team member to explain why we needed a given metric, the question was brushed off. Knowing we could access a number was more important than understanding whether we needed it or whether it told us anything useful. Simply putting numbers on the page, to "show something", was treated as the higher-value exercise, rather than adding any meaningful insight into the team's performance.

While knowing what's available and accessible is good, it doesn't matter if you don't know why you need to show it at all.

Using Data Responsibly

I feel pretty strongly about how data is used at work. Putting a number to something makes it tangible: the shift from qualitative to quantitative brings an element of gravitas to what's being shared.

It sounds a lot more serious to say "we received 3,104 complaints about our customer service, making up 84% of all complaints received" than to say "a lot of customers seem unhappy with our customer service team". Quantifying gives scope and helps people understand the problem more objectively. What looks like a big portion of fish and chips to me may look like a small portion to you, but we can make that judgment for ourselves if we know how many ounces of fish are served on our plate.

Why do we care so much about this? Data should help us get closer to the truth so we can have a shared foundation to work from. This helps us debate and determine a plan of action, and also see if what we do has meaningful impact on the number.

I generally break down measures like this:

Input measures

Input measures define the actions the team took. You may hear these called "activities" in the sales world. In recruiting, this may be the number of interviews scheduled each week, the number of InMails sent on LinkedIn, or the dollars invested in job postings. It grounds the team in the initial action that was taken.

Input metrics are a bit useless on their own. These are usually raw numbers that go up or down depending on what the team does with their time.

While measuring activities and actions can help you answer "Is my team producing at an acceptable rate?", it doesn't tell you much beyond that, and they are seldom the thing to focus on without knowing what's happening downstream.

For example, I could tell you that I adopted the Jeff Skilling diet and ate a pair of Twinkies and drank three Diet Cokes every day. The way we know this is a bad diet is by looking further downstream: did my waist size increase? Has my health cratered since adopting this diet? We can intuitively make this assumption because many of us know that eating unhealthy food will negatively impact our health. Diet is not the only factor, but it is an input we can control that shapes our health outcomes.

Guardrail measures

Guardrail measures define an acceptable level of performance. For example, going back to recruiting, maybe the input measure is the number of interviews scheduled, but the guardrail metric is the percentage of candidates that make it to the final interview stage. This could help us understand whether we are interviewing the right people and finding good talent. Guardrails contextualize the input measures and ensure that chasing volume doesn't tank quality.

NPS or other survey measures can be good guardrails, since they assess whether the quality of the input was perceived to be satisfactory. When I was recruiting, I liked to survey my hiring managers and interviewers to gauge what I was like to work with and whether they felt they were meeting good candidates. This gave me feedback to improve and helped validate whether my inputs mattered (e.g. am I moving too slowly? Am I doing too much?) or whether the outcomes I was driving looked good but had negative side effects (e.g. is my time to hire great, but interviewers complain that I'm inflexible about scheduling?).

In general, you want to find the nexus point of acceptable performance in a guardrail metric and the appropriate volume of an input metric. Similarly, you want to look at guardrails in relation to output metrics to ensure that your outputs are being achieved in a high quality, sustainable fashion.
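To make the pairing concrete, here is a minimal sketch of reporting an input next to its guardrail. The data, field names, and the 10% threshold are all hypothetical; the point is simply that volume only gets reported alongside its quality check.

```python
# Hypothetical weekly recruiting data: interviews scheduled (input)
# and how many of those candidates reached the final stage (guardrail).
weeks = [
    {"week": "2024-W01", "interviews_scheduled": 40, "reached_final": 6},
    {"week": "2024-W02", "interviews_scheduled": 55, "reached_final": 5},
    {"week": "2024-W03", "interviews_scheduled": 70, "reached_final": 4},
]

for w in weeks:
    # Guardrail: share of interviewed candidates who made the final stage.
    final_rate = w["reached_final"] / w["interviews_scheduled"]
    # 10% is an arbitrary, illustrative threshold for "acceptable".
    flag = "volume up, quality slipping?" if final_rate < 0.10 else "ok"
    print(f'{w["week"]}: {w["interviews_scheduled"]} interviews, '
          f'{final_rate:.0%} to final stage ({flag})')
```

In this made-up data the input keeps climbing while the guardrail erodes, which is exactly the conversation the pairing is meant to surface.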

Output measures

This is the result of the actions taken. Outputs are, in my opinion, one of the toughest things to measure effectively. Unlike inputs, which are often directly attributed to the person who took an action, outputs can have many more influences, which makes them difficult to measure causally.

For example, it may be easy to measure that I built ten training courses last year, but I may struggle to articulate whether taking the training was correlated with improved sales team performance the following quarter -- there are multiple factors at play.

In recruiting, outputs are typically more straightforward (offer acceptance rate is a common one), but in other disciplines it can be more challenging. In learning and development, it can be tough to measure training impact. For performance management, it can be tough to attribute improvements in performance ratings to coaching provided to the manager or the employee. There are multiple factors at play, and sometimes you'll need to lean on simple correlations to guide you directionally, even if it isn't 100% the truth.

Attributing input measures to output measures is an interesting space, and teams can model lift in different ways using multiple factors. I think there is opportunity to do this internally for teams to help get to the root of what actions contributed most to an outcome.

For example, I moved to Seattle from New York. There were a number of factors that impacted my decision: climate, cost, family, and so forth, but the factor that played the biggest role and deserves the most credit was having a job offer and relocation money. Relying on qualitative factors to get to the why of any particular outcome is inherently biased, since I may misattribute success if asked, especially depending on who is asking me and how I want them to perceive me.

If I am a salesperson who just had a record quarter against quota, I may be asked what led to my improvement: I may be embarrassed to say the real reason and instead deflect or attribute success to a strong leader who had a conversation with me (I gotta get that promo, after all), when that may not be the actual reason at all. Without investment in data science here, you'll often be left looking at rough correlations (e.g. three of five employees on Team A hit quota and viewed the training, while the other two didn't view the training: we can assume the training played some role in helping them hit quota, even if they also did many other things differently, too).
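As a sketch of that kind of rough correlation work, here is how the Team A example might be checked: compare quota attainment between people who viewed the training and people who did not. The data is made up to mirror the example, and this is a directional lift check, not a causal claim.

```python
# Hypothetical Team A data mirroring the example: did each rep view the
# training, and did they hit quota?
team_a = [
    {"rep": "A1", "viewed_training": True,  "hit_quota": True},
    {"rep": "A2", "viewed_training": True,  "hit_quota": True},
    {"rep": "A3", "viewed_training": True,  "hit_quota": True},
    {"rep": "A4", "viewed_training": False, "hit_quota": False},
    {"rep": "A5", "viewed_training": False, "hit_quota": False},
]

def hit_rate(reps):
    # Share of reps in the group who hit quota.
    return sum(r["hit_quota"] for r in reps) / len(reps) if reps else 0.0

viewers = [r for r in team_a if r["viewed_training"]]
non_viewers = [r for r in team_a if not r["viewed_training"]]

# A directional signal only: many other differences between the groups
# could explain the gap.
print(f"Hit quota (viewed training):      {hit_rate(viewers):.0%}")
print(f"Hit quota (didn't view training): {hit_rate(non_viewers):.0%}")
```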

As you can imagine, it's easy to corrupt the use case for data and instead find numbers that don't tell you much of anything, but that "look good". It can get the team focusing on the wrong things or misattributing success.

Data should be used to tell a story of what the team did, how good it was, and the outcomes it led to. Getting crisp on this will help teams rally around the best way to spend their time and how to do their jobs effectively to inform the right metrics.

This isn't just a matter of doing the deep work to find and create the right measures, it is also a cultural one. Companies need to be honest with themselves and be okay if things don't look as rosy as they'd like. Having the space to make mistakes and discuss things that aren't working well will help improve performance and prevent bigger catastrophes downstream.

Thus, it's important to define the metrics that are used, why they are used, how they influence each other, and what outcomes they lead to. While we may not always drive revenue directly, we know there are lots of factors that lead to it in a business. Even doing some basic correlation work will help get the team to some level of truth, but I feel confident most companies aren't doing this with great regularity. If you can get your team operating at that level and using it to inform direction, you'll be in good shape.

Defining Metrics

Defining the team's metrics, or at least identifying which metrics the team helps drive, can be a lengthy exercise. It can also be challenging if the metrics don't exist, require more complex calculations, rely on data housed in multiple sources, or depend on data that is unreliable.

Using the above, we can start to categorize the different metrics we care about across different functions. Below are some questions I'd ask if I was getting started with this for my team.

What do we do?
How do we know it's good?
Do we have a hypothesis as to why, or what we influence?
What else would we want to look at?
Can we get the inputs to inform this measure?
What would it take to measure it?

Beyond identifying measures, we also need to develop a sense of what good looks like, and get the team rallied around that. There are different ways we can express data to make its overall quality appear more obvious.

Baseball analytics are notoriously sophisticated, and the newer sabermetrics (as they're called) give deeper insight into performance than the traditional "baseball card" stats show. For example, I couldn't tell you offhand whether a .250 batting average was good in the 2023 season. However, through a more advanced metric like "weighted runs created plus" (wRC+), I can get a number that articulates the player's overall offensive value relative to the rest of the league. A .250 batting average may be below average in 1998 but above average in 2024. wRC+ is indexed so that 100 is league average: a lower number is below average, a higher number is above average. This gives a quick look at overall offensive impact, and the more traditional metrics can then contextualize and color how the player made that impact and where we may want to provide additional coaching or support.
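The real wRC+ calculation involves weighted run values and park adjustments, but the indexing idea itself is simple: divide a rate by the league average and multiply by 100. A rough sketch of that normalization, with made-up numbers:

```python
# Illustration of indexing to a league average of 100. This is NOT the
# actual wRC+ formula (which uses weighted run values and park factors);
# it only shows how "100 = average" style metrics are constructed.
league_avg_runs_created_per_pa = 0.115   # hypothetical league rate
player_runs_created_per_pa = 0.132       # hypothetical player rate

index = player_runs_created_per_pa / league_avg_runs_created_per_pa * 100
print(f"Index vs. league: {index:.0f}")  # ~115 -> about 15% above average
```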

At work, we also want to make sure we use multiple numbers together to paint a fuller picture of our performance so we know where we have opportunities to improve and articulate what's working well.

You should start with simple measures that are easily accessible to build this muscle. Most applicant tracking systems, learning management systems, and HRIS tools will offer some degree of basic reporting and metrics that give you a useful level of insight. As you get more sophisticated, you'll likely want to work with business intelligence engineers to create dashboards and new metrics that go deeper than what these systems provide on their own.

For example, my applicant tracking system may be able to tell me how many people sourced from referrals joined the company each year and the percentage of hires that came from this source. What it can't tell me is if these people stayed longer than the average employee or achieved better performance ratings each year. We'd need to join these two data sets together and create a view that articulates this. We may learn that referrals are a great source for finding people who get through the interview, but that they often churn faster or have lower performance ratings. Visualizing the data this way helps stimulate conversations and give folks visibility into the big picture, which helps shape the strategy.
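A sketch of what that join might look like with pandas, assuming hypothetical exports: one from the applicant tracking system with each hire's source, and one from the HRIS with tenure and latest performance rating. The column names and figures are illustrative, not taken from any specific system.

```python
import pandas as pd

# Hypothetical ATS export: one row per hire, with the sourcing channel.
ats = pd.DataFrame({
    "employee_id": [101, 102, 103, 104],
    "hire_year":   [2022, 2022, 2023, 2023],
    "source":      ["referral", "job_board", "referral", "sourced"],
})

# Hypothetical HRIS export: tenure and latest performance rating.
hris = pd.DataFrame({
    "employee_id":   [101, 102, 103, 104],
    "tenure_months": [9, 26, 14, 11],
    "latest_rating": [3, 4, 2, 4],
})

# Join the two systems on a shared employee identifier.
joined = ats.merge(hris, on="employee_id")

# Compare referrals against everyone else on retention and performance.
summary = (
    joined
    .assign(is_referral=joined["source"].eq("referral"))
    .groupby("is_referral")[["tenure_months", "latest_rating"]]
    .mean()
)
print(summary)
```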

With any measure, you'll also need a gut read on what good looks like and what's achievable. You'll need to forecast trends year over year or, in the absence of historical data, use your intuition and test hypotheses to gauge what good looks like. If you can find them, industry benchmarks can be a useful measuring stick to start with, as can research by a trusted organization on these topics. "How should I feel about this number?" is a question you need to be able to answer, and having a point of comparison from a validated source, or enough insight to articulate why the number is or isn't good, will help stimulate discussion about action.
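One lightweight way to answer "how should I feel about this number?" is to always present it next to the prior period and whatever benchmark you trust. A small sketch, with hypothetical figures:

```python
# Hypothetical offer acceptance rate, shown against last year and an
# external benchmark so the number arrives with context attached.
current, prior, benchmark = 0.78, 0.83, 0.85

print(f"Offer acceptance rate: {current:.0%}")
print(f"Year over year change: {current - prior:+.1%}")
print(f"Gap to benchmark:      {current - benchmark:+.1%}")
```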

This is especially critical when dealing with stakeholders who don't know your business and have preconceived notions of what the numbers need to look like. I remember getting into an argument with a CEO about certain recruiting metrics. They were focused on how many phone screens they took, but missed that pushing that number higher didn't matter if the success rate was low -- I could set them up with phone screens all day (if that's my goal metric), but it would be clear that I'm wasting everyone's time by doing so. Showing the full picture and giving insight into what acceptable looks like helps the team understand your space and enables them to ask better questions or offer suggestions on how to improve things.

Why This Matters

It's easy to fall into the trap of just using data because you have it. We interviewed 4312 candidates! 923 employees took our training! 38% of our employees got the highest performance rating!

This may sound fantastic (or awful!), but you have to measure what matters and show why certain things make a difference. Measuring should be a continuous, long-term activity that gives the team shared focus. Avoid pulling in data for data's sake, and instead ask what you actually want to show and how it will help you make better decisions and drive the right conversations.

Metrics can be scary. A lot of people have told me they wanted to get out of sales because the number followed them everywhere and became the statement of their worth. It's a lot easier to hide if you aren't being measured or don't have clear metrics.

Similarly, over-indexing on metrics can warp incentives. If I only care about the number of people who take my training courses, am I incentivized to actually make the content good or ensure the right people take the training? I should assess volume, but I also need to assess quality and impact (our guardrail metrics). This will help guide the strategy.

Identifying, defining, and reviewing inputs, guardrails, and outputs will help you build a stronger mental model of what your team is working on, why it matters, and how it plays a role in the big picture. This will help justify decisions and advocate for change.

The purpose of rigorously tracking and presenting data is to get the insights to drive discussion and strategy on which actions to take to make improvements to the business. Showing a number just to show you have a number won't achieve this.