How to do self-service analytics, the right way

A guide based on failures

Self-service analytics has become a popular trend (and a popular whipping post on data Twitter) in recent years, allowing businesses to easily access and analyze their own data without needing data teams and engineers to respond to every ad-hoc question.

However, with this increased accessibility comes the potential for mistakes that can lead to inaccurate conclusions and poor decision-making. In this post, I’ll highlight some common mistakes made while rolling out self-service analytics and offer some potential ways to mitigate disaster.

Before you go pound the table for self-service analytics at your startup, remember that, like everything else you do with data, it will never be perfect. But if you don’t try, your technical teams will continue to be underwater, your #askanalytics Slack channel will continue to light up, and your business teams will continue to find whatever avenue they can to get to data, whether you like it or not.

Set the culture, early and often

Before standing up self-service analytics, you must be intentional and enforce an analytical mindset from the top. You can’t just claim to be data-driven; you have to lead by example. For founders, define company objectives and tie data to them. Sure, you can use OKRs or any other Silicon Valley framework, and there’s plenty of content out there if you want to learn more about them. But the frameworks are not the point. The point is to constantly and visibly commit to being data-oriented and to hold yourself and others accountable.

For product and engineering leaders, define the metrics you want to impact early on and make sure you have a way to track the impact of your features on those metrics. The last thing you want to do is ship a feature, celebrate the feature, and then find out a few weeks later that you actually can’t track its quantitative impact. This is a slippery slope to a product culture that celebrates shipping and not impact. As a PM at Flexport, I loved to make Slack announcements when new features shipped, but it felt 10x better to follow up a few weeks later with charts that showed that my team’s hard work was moving the needle.

For GTM leaders, enforce good hygiene in your systems, demand quantitative proof wherever possible, and rally your teams around your numbers. If you don’t have access to what you need, clearly communicate this to your technical counterparts. Don’t create bad habits as workarounds. Confronting data gaps early is almost always easier than reversing those habits later on.

Define your metrics

How many meetings have you been in that go something like this?

  1. Presenter: “With this project, we’re planning on increasing X metric by 10% from 75% to 85%”
  2. VP: “How is that metric calculated?”
  3. Presenter: “It’s calculated using A * B / C.”
  4. VP: “Why doesn’t it include D? We’ve been using A * B / C + D on our team.”
  5. Entire meeting devolves into a heated debate over how to calculate X metric

If you don’t want a shitshow like this on your hands, it’s critical that you define a list of the most important metrics, not just for your own team but together with the other teams that have partial ownership over these metrics.

Creating and sharing your metrics definitions is an easy way to look good in your first month on the job. Sure, it might seem like a pedantic exercise, but it will save you so much pain later.
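To make this concrete, here is a minimal sketch of what a shared metric definition could look like if you kept it in code rather than in people's heads. Everything in it is hypothetical: the metric name, the owner, and the inputs A, B, and C simply mirror the meeting example above. The idea is that the formula and the calculation live in one place, so nobody can quietly add a "+ D".

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass(frozen=True)
class MetricDefinition:
    """One agreed-upon definition of a metric (all fields are illustrative)."""
    name: str
    formula: str   # human-readable formula shown to stakeholders
    owner: str     # team accountable for the definition
    compute: Callable[[Dict[str, float]], float]


# Hypothetical metric mirroring the "A * B / C" example in the text.
X_METRIC = MetricDefinition(
    name="x_metric",
    formula="A * B / C",
    owner="growth-team",
    compute=lambda v: v["A"] * v["B"] / v["C"],
)

# A single registry that every team reads from.
REGISTRY: Dict[str, MetricDefinition] = {X_METRIC.name: X_METRIC}


def compute_metric(name: str, inputs: Dict[str, float]) -> float:
    """Look up the shared definition and apply it to the given inputs."""
    return REGISTRY[name].compute(inputs)
```

Calling compute_metric("x_metric", {"A": 3.0, "B": 5.0, "C": 20.0}) applies the one agreed formula. A shared doc does the same job for most teams; the code version just makes the definition harder to drift from.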

Clean your data

Garbage in, garbage out. We've all heard it. But it's true - insights are only as reliable as the quality of the data itself.

Look out for things like duplicate records in your systems first. These are most often the result of a lack of unique identifiers. Make sure you’re referencing unique, system-generated IDs instead of user-submitted strings wherever possible, and create rules that prevent the creation of new duplicate records. Ask your engineers about implementing unique key constraints if you see duplicates in your product data.
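If you want to see what a unique key constraint actually does, here is a rough sketch using Python's built-in sqlite3 module and an in-memory database. The customers table, its columns, and the choice of a normalized email as the unique key are assumptions made for illustration; your engineers will know which fields should be unique in your product data.

```python
import sqlite3


def build_customers_table() -> sqlite3.Connection:
    """Create an in-memory table with a system-generated ID and a unique key."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE customers (
            id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- system-generated ID
            email TEXT NOT NULL UNIQUE,               -- rule that blocks duplicates
            name  TEXT NOT NULL
        )
        """
    )
    return conn


def add_customer(conn: sqlite3.Connection, email: str, name: str) -> bool:
    """Insert a customer; return False if the database rejects a duplicate."""
    normalized = email.strip().lower()  # avoid " Ada@Example.com " vs "ada@example.com"
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO customers (email, name) VALUES (?, ?)",
                (normalized, name),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

The second attempt to add the same person fails at the database level, which is the point: the rule is enforced where the data lives instead of relying on everyone remembering to check first.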

Be sure to check things like free text, too. Structure your inputs using single-select options whenever possible, as it will save you from having to flatten multi-select submissions or parse free text submissions for reporting purposes.
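As a small illustration of the single-select idea, the sketch below restricts a field to a fixed set of options. The field name (lead source) and its three options are made up; the same pattern applies to a dropdown in your CRM or a validated form field.

```python
from enum import Enum


class LeadSource(Enum):
    """Hypothetical single-select options for a 'lead source' field."""
    WEBINAR = "webinar"
    REFERRAL = "referral"
    OUTBOUND = "outbound"


def parse_lead_source(raw: str) -> LeadSource:
    """Map a submitted value to one allowed option, or raise ValueError."""
    return LeadSource(raw.strip().lower())
```

Anything outside the list is rejected at entry time, so nobody has to parse "Referal (from Bob??)" in a report six months later.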

Lastly, you want to ensure you have reliable, universal IDs you can use to join across systems. If you don’t, ask your engineering teams for help.
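One quick way to find out whether your IDs really are universal is to check which records fail to match across two systems. The sketch below assumes each system can export rows as dictionaries that share an account_id field; the system names (CRM and billing) and the field name are placeholders.

```python
from typing import Dict, Iterable, List


def find_unmatched_ids(
    crm_rows: Iterable[dict],
    billing_rows: Iterable[dict],
    key: str = "account_id",
) -> Dict[str, List]:
    """Return the IDs that appear in one system but not the other."""
    crm_ids = {row[key] for row in crm_rows}
    billing_ids = {row[key] for row in billing_rows}
    return {
        "only_in_crm": sorted(crm_ids - billing_ids),
        "only_in_billing": sorted(billing_ids - crm_ids),
    }
```

If both lists come back empty, joins on that key are safe. If not, the output is a concrete list you can hand to engineering instead of a vague "the numbers don't match".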

It’s important to note that while following these best practices will make it easier for your team to work with data, the unfortunate reality is that your data will never be perfect. Set this expectation with your teams while stressing the value of intentionally configuring systems, paying off data quality debt consistently, and prioritizing based on business value.

Create data models for easier exploration

Now that your metrics are defined and your data is clean enough to get started, focus on creating data models that your business teams can easily explore.

To get started, try to work with your business teams who are exporting data into sheets already for analysis and look at the tables and columns they’re using most often. Instead of forcing them to INDEX-MATCH across the same tables every time, try joining them beforehand in the form of a warehouse view, dbt model, or saved SQL query. Your business teams can then easily work with reliable data instead of manual joins or repetitive questions.
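Here is a rough sketch of the "join it beforehand" idea as a warehouse view, again using Python's built-in sqlite3 module so it runs anywhere. The accounts and orders tables, their columns, and the sample rows are invented for the example; in practice the view (or dbt model) would sit on top of your real tables.

```python
import sqlite3


def build_demo_warehouse() -> sqlite3.Connection:
    """Create two hypothetical tables and a view that pre-joins them."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE accounts (
            account_id INTEGER PRIMARY KEY,
            name       TEXT,
            segment    TEXT
        );
        CREATE TABLE orders (
            order_id   INTEGER PRIMARY KEY,
            account_id INTEGER REFERENCES accounts(account_id),
            amount     REAL
        );
        INSERT INTO accounts VALUES (1, 'Acme', 'Enterprise'), (2, 'Globex', 'SMB');
        INSERT INTO orders VALUES (10, 1, 500.0), (11, 1, 250.0), (12, 2, 100.0);

        -- The join business users used to redo by hand with INDEX-MATCH.
        CREATE VIEW account_orders AS
        SELECT o.order_id, a.account_id, a.name AS account_name, a.segment, o.amount
        FROM orders AS o
        JOIN accounts AS a ON a.account_id = o.account_id;
        """
    )
    return conn


def revenue_by_segment(conn: sqlite3.Connection) -> dict:
    """Query the pre-joined view; no manual join needed."""
    rows = conn.execute(
        "SELECT segment, SUM(amount) FROM account_orders "
        "GROUP BY segment ORDER BY segment"
    ).fetchall()
    return dict(rows)
```

Business users only ever touch account_orders, so the join logic is written once and stays consistent across every analysis built on it.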

Make your data easy to find

You have metrics and models ready to go. You should now create a go-to place for your stakeholders to find, understand, and ask questions about this data.

For starters, try saving these metrics and models and defining them in Notion, Google Docs, Sheets, or whatever. Give your team instructions on how to use the data and highlight common pitfalls.

Try also to place definitions next to your charts on your team’s most visible KPI dashboards so your team can easily understand what they’re looking at.

If you don’t have a dedicated Slack channel to chat through analytics asks, create one. This way, you can track new questions that can be solved scalably through new data models or improvements to existing models.

Use the channel as a forum for driving best practices and continue to point back to the data catalog and existing dashboards. The last thing you want is to keep answering one-off requests with one-off SQL queries and CSV exports. This will result in data sprawl, making your existing models and dashboards harder to find. Remember, this is an ongoing process, and you’ll need to enforce habits on your team for self-service analytics to work.

Know your audience

Self-service analytics for everyone is the dream, but the reality is that there are varying degrees of self-service that you should consider for different personas in the org. Some roles will naturally be much more comfortable analyzing datasets than others because of their general data literacy, knowledge of the data model, and understanding of how the underlying tools are configured.

Instead of rolling out the same interface and process for everyone, consider empowering the more data-literate, analytically minded roles within your organization to act as intermediaries between engineers or the data team and the average salesperson. For example, look to give your GTM Ops, Biz Ops, Finance Ops, Chiefs of Staff, and other like-minded roles the ability to maintain the data catalog, data models, and dashboards for their respective teams. Empower them to be the support line for all data questions in their line of business and partner with them to improve the systems and pipelines you own.

Impossible is nothing

Self-service analytics has become a bit of a meme on data Twitter because of the number of embarrassing mistakes “non-technical” stakeholders can make when given unfettered access to data. Will stupid mistakes happen? Yes. Will you get frustrated by repetitive questions? Yes. Will you want to give up? Probably. Should you? Absolutely not.

Shit will happen, but your team will be better off trying the methods outlined in this post. In my opinion, the worst thing you can do is keep relying on your current methods of one-off favors. Slow, uninformed decision-making is the enemy of any startup trying to run circles around its competition.

If you’re looking for a way to jumpstart your self-service mission, check out Canvas. Unlike other BI tools, Canvas connects directly to your database and >150 SaaS tools, so you don’t need to procure a separate warehouse or ELT tools.

Once your data is centralized, Canvas allows your business teams to explore data, build data models, and share beautiful dashboards with just their spreadsheet skills. Even better, every spreadsheet action generates extensible, editable SQL for your technical teams to jump in if there are issues.

Sign up for a 14-day free trial here or email me at [email protected] if you have questions about Canvas or are just looking for advice. I’d love to hear from you.