What is data ownership and why is it important?

I’ve recently been tasked with redefining our core product KPIs. I’ve had to look through outdated data models to track KPIs like depletion rate, quantity available, and total allocations. While doing so, I’ve seen a lot of deprecated logic and old datasets being used. When this happens, I have to message many different team members, looking for someone who has been with the company long enough to understand why this logic or dataset was created in the first place.

Not knowing who to turn to when you come across a data asset is a huge pain point for both the data team and business stakeholders. It requires a lot of communication, most of which leaves you going around in circles. Nobody wants to spend hours trying to get a simple answer to their data questions.

This is exactly why ownership of data assets is so important. Assigning or taking ownership can save everyone within your organization time and frustration.

What is data ownership?

Data ownership specifies which individual or team to turn to when you have questions about a data asset or need to request it be changed. It eliminates the back-and-forth that occurs when trying to find someone with knowledge of a particular asset. With ownership, someone’s name is publicly assigned to an asset, communicating to the business that this person either understands the data’s business context or is the technical brains behind it. This is typically specified in some sort of data catalog.

Why is data ownership important?

Data ownership is important for three main reasons: data governance, easier cross-team collaboration, and accountability. Let’s discuss these further.

Data governance

Data governance refers to the accessibility, usability, and security of your data. Ownership primarily focuses on security and usability. When assigning an owner to your data assets, you are giving that person the role of protecting the data and ensuring it’s of the best quality. This person may control who does and doesn’t have access to this asset. For example, only the finance team may need access to a dataset containing sensitive information, like credit card numbers. This asset’s owner would be the one to restrict access for other teams and ensure it doesn’t get into the wrong hands.

By assigning owners to your assets, you are also ensuring your datasets are actually usable. Ideally, when someone is assigned to monitor them and ensure they remain reliable for the business, they won’t go days or weeks without being updated. Leaving all governance to one person or team makes it a difficult job, but spreading it across multiple owners (all experts in their domains) keeps assets secure and usable.
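The finance-only example above can be sketched in a few lines of Python. This is a minimal illustration, not any particular catalog's API: the `DataAsset` class, its fields, and the asset and user names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class DataAsset:
    """Hypothetical record for one data asset and who may read it."""
    name: str
    owner: str  # the person accountable for the asset
    allowed_teams: set[str] = field(default_factory=set)

    def grant_access(self, requested_by: str, team: str) -> None:
        # Only the owner may change who can read the asset.
        if requested_by != self.owner:
            raise PermissionError(f"{requested_by} does not own {self.name}")
        self.allowed_teams.add(team)

    def can_read(self, team: str) -> bool:
        return team in self.allowed_teams


# The owner opens the sensitive dataset to finance only.
payments = DataAsset(name="customer_payment_details", owner="dana")
payments.grant_access(requested_by="dana", team="finance")
```

In a real setup the access rules would live in your warehouse or catalog permissions, but the idea is the same: one named person decides who gets in.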

Easier collaboration across teams

Ownership eliminates the need for countless Slack messages and meetings to help track down the person with the most domain knowledge of an asset. The person’s name is explicitly assigned to the asset, making it easy to know who to reach out to. This creates an instant path of communication between you and the asset owner.

Accountability

How many times has a problem slipped through the cracks because you thought someone else on your team was handling it? This is basically the bystander effect of the business world. Nobody works on a problem because they assume someone else already has it covered.

When assets don’t have owners, data quality problems such as stale data pipelines, outdated logic, and duplicates can pile up. Assigning an owner holds one person accountable for maintaining the quality of the asset. When someone is held accountable, there’s no guesswork about who should be maintaining what.

Implementing ownership

Now that you know ownership is worth prioritizing, how should you implement it within your organization?

Make a list of all your data assets, including data sources, data models, and dashboards.

It’s important that you understand how many data assets you have within your organization. This is a good time to take inventory and maybe even delete assets you know are no longer in use or are outdated. When I did this, I found lots of dashboards that hadn’t been touched in years, along with reports highlighting the same metrics as others. You always want to maintain as clean a data environment as possible. Having outdated or duplicated assets only creates more clutter and makes it harder to find those that are useful.
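If your inventory records when each asset was last used, a short script can flag candidates for cleanup. The sketch below assumes a hypothetical inventory format (a list of dictionaries with `name`, `type`, and `last_used` keys) and an arbitrary one-year threshold; adapt both to whatever your tools actually export.

```python
from datetime import date, timedelta


def find_stale_assets(inventory, today, max_age_days=365):
    """Return the names of assets not used within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [asset["name"] for asset in inventory if asset["last_used"] < cutoff]


# Hypothetical inventory; the names and dates are made up for illustration.
inventory = [
    {"name": "weekly_sales_dashboard", "type": "dashboard", "last_used": date(2024, 5, 2)},
    {"name": "legacy_allocations_report", "type": "dashboard", "last_used": date(2021, 1, 15)},
    {"name": "orders_model", "type": "data model", "last_used": date(2024, 5, 30)},
]

stale = find_stale_assets(inventory, today=date(2024, 6, 1))
```

A flagged asset is only a candidate for deletion. It is still worth confirming with the relevant team before removing it.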

Determine who has the most knowledge of a particular asset.

Once you have made your list, categorize the assets by business domain. You may have assets that fall under product, marketing, finance, or engineering. This will help you further narrow down potential asset owners. In the beginning, you will have to ask around your organization to find the person who created a certain source or dashboard, or who understands the data best. It may be helpful to schedule a meeting with each team and go through all their related assets.
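Grouping the list by domain also shows where the gaps are. As a rough sketch, assuming each asset is a dictionary with hypothetical `name`, `domain`, and `owner` keys, you could list the assets that still need an owner per team before each meeting:

```python
from collections import defaultdict


def unowned_by_domain(assets):
    """Group assets that have no owner yet by their business domain."""
    gaps = defaultdict(list)
    for asset in assets:
        if not asset.get("owner"):
            gaps[asset["domain"]].append(asset["name"])
    return dict(gaps)


# Hypothetical assets; a missing or empty owner means nobody is assigned yet.
assets = [
    {"name": "depletion_rate", "domain": "product", "owner": None},
    {"name": "total_allocations", "domain": "product", "owner": "sam"},
    {"name": "campaign_spend", "domain": "marketing", "owner": ""},
]

agenda = unowned_by_domain(assets)
```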

In some cases, it may be necessary to assign two owners — one with knowledge of the business context and another with knowledge of the dataset design. In systems like Y42, you can assign an “owner” and an “expert”. The “expert” in this case would be the person who best understands how the data relates to the business as a whole. Y42 also has a field that tells you who created the asset in the first place. This is particularly helpful for understanding the purpose or logic behind the asset.
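To make the owner, expert, and creator roles concrete, here is a minimal sketch of what such a record could look like. It is not Y42's data model or API. The `OwnershipRecord` class and the `contact_for` helper are hypothetical, and the fallback order is just one reasonable choice.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class OwnershipRecord:
    """Hypothetical ownership metadata for a single asset."""
    asset: str
    owner: Optional[str] = None       # knows the dataset design
    expert: Optional[str] = None      # knows the business context
    created_by: Optional[str] = None  # knows why the asset exists


def contact_for(record: OwnershipRecord, topic: str) -> Optional[str]:
    """Pick who to ask first, falling back to the other roles if one is empty."""
    if topic == "business":
        preferred = [record.expert, record.owner, record.created_by]
    else:
        preferred = [record.owner, record.created_by, record.expert]
    return next((person for person in preferred if person), None)


record = OwnershipRecord(asset="quantity_available", owner="sam", expert="priya")
business_contact = contact_for(record, "business")
technical_contact = contact_for(record, "technical")
```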

Onboard a data catalog or Modern DataOps Cloud like Y42, which already has the ownership feature baked in.

Lastly, you need a system that supports ownership. A data catalog is typically the best tool for this, as ownership is almost always included. Make sure you look around for one that fits your needs and allows your entire organization to access your assets.

A Modern DataOps Cloud like Y42 is great because it already has a data catalog component with extensive features for defining owners. The platform allows you to “activate” owners by creating triggers with built-in logic. For example, an owner could be notified if something in the dataset breaks or if someone writes a comment. As I mentioned above, you can define an expert as well as an owner. This gives you more flexibility and control over how you want to define ownership within your organization.
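Conceptually, "activating" an owner means routing certain events on an asset to that person. The sketch below shows the idea in plain Python. It does not use Y42's actual trigger feature, and the event names, the `OwnedAsset` class, and the `notify` callback are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class OwnedAsset:
    """Hypothetical asset with an owner and the events they care about."""
    name: str
    owner: str
    subscribed_events: set[str] = field(
        default_factory=lambda: {"run_failed", "comment_added"}
    )


def handle_event(asset: OwnedAsset, event: str, notify: Callable[[str, str], None]) -> bool:
    """Notify the owner if the event is one they subscribed to."""
    if event in asset.subscribed_events:
        notify(asset.owner, f"{event} on {asset.name}")
        return True
    return False


# Collect notifications in a list instead of sending real messages.
sent = []
asset = OwnedAsset(name="depletion_rate_model", owner="sam")
handle_event(asset, "run_failed", lambda who, message: sent.append((who, message)))
```

In practice the `notify` callback would post to email or chat, but keeping it as a parameter makes the routing logic easy to test on its own.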

Increasing efficiency with data ownership

Prioritizing data ownership within your organization leads to higher-quality data, more time for tasks that have the most impact, and better collaboration across teams.

Let’s revisit the scenario I discussed in the introduction. If owners had been assigned to the product datasets I was rebuilding, I could have gone straight to those people and asked the necessary questions. That would have saved me hours of time and energy in getting the answers I needed to create the new datasets.