In 2011, Marc Andreessen famously said that software is eating the world, referring to the impact software would have on many industries. Today, AI is “eating the world” as it transforms sectors and enables new business models. Yet data energizes AI, and companies must consume their data cost-effectively and efficiently before they can even think about obtaining value from big internal data and AI initiatives.
Financial holding company Capital One is well aware of this. Over the past decade, Capital One has steadily transformed its business from pure banking and finance into one that is also a technology services provider. Its new line of business, Capital One Software, provides data management and related applications to other enterprises.
Its initial offering, Slingshot, promises to help businesses get the most out of the data platform Snowflake. We felt this made Patrick Barch, senior director of product management at Capital One Software, the perfect person to discuss how CxOs can cost-effectively optimize their organization’s data use.
During his six years at Capital One, Barch has worked on various aspects of Capital One’s digital transformation efforts. For the past year and a half, his focus has been leading product management for Slingshot, Capital One’s tool that helps businesses adopt Snowflake, automate data governance processes, and manage cloud costs.
Here, DigitalCxO explores Capital One’s digital transformation and shares a Q&A with Barch on how organizations can get the most out of their data without breaking their budget.
Capital One’s business transformation is an interesting one: the company has spent more than a decade reimagining itself as a technology provider (a quest that may have begun with its 2012 $9 billion acquisition of ING Bank). Capital One realized that the leaders in the banking industry are going to be, at their core, great technology companies. Over time, Capital One began increasingly embracing technology, including cloud computing, APIs, and microservices.
“About seven years ago, we declared that we would shut down our owned and operated data centers and go all in on the public cloud. That was a big, big audacious objective for us. But we knew that to succeed as a technology company, we needed to use our data best, which meant cloud,” Barch says.
And move to cloud Capital One certainly did. The technology team within Capital One began preparing for that migration by re-architecting their data ecosystem in the cloud. As Barch explains, they embraced AWS and Snowflake and successfully pulled the plug on their last data center in 2020. “Along the way, we had to build several products and platforms to enable a highly regulated company to operate at scale in the cloud. The market just wasn’t quite there yet,” he says regarding the off-the-shelf and open-source tools of the time.
“The tools we had to build internally enabled us to scale our ability to support data, data consumption, data governance and infrastructure management. One of the interesting parts about the cloud is that it makes it easy to empower your users with data. You can store more data. And you can compute more data,” he explains.
That’s because cloud compute scales on demand, almost instantly and with few practical limits. That also presents a bit of a challenge: If a company wants to empower its users with data, it must do it in a way that won’t lead to unexpected costs. “One of the first challenges we had to overcome was empowering all our analysts and data scientists without breaking the bank,” he says.
Not surprisingly, as Capital One shared its transformation journey through interviews, technology conferences, and its blog, other businesses started asking questions. “They noted that our challenges and objectives sounded like what they were going through. They asked if we would consider making some of those products and platforms available. And so, we dabbled with open source for a couple of years,” he says.
Since then, Capital One has produced several successful open-source projects that both Amazon and Microsoft have contributed to. Then, in 2022, Capital One formally announced the creation of Capital One Software. “Capital One Software is a new line of business dedicated to bringing our cloud and data management products to market,” Barch explains.
Because Barch is steeped in those efforts, we had several questions for him regarding how enterprises can optimize the value they receive from their data:
DigitalCxO: Thanks so much for taking the time to discuss data governance strategies with us. I want to discuss how the cloud makes data widely available to businesses and consumers. Still, that data becomes costly when companies don’t manage it correctly, such as when they try to store, manage, and even parse everything. How can businesses manage their data and get more out of it while keeping their costs in line?
Barch: The first thing I’ll say, and probably the most critical, is that yesterday’s models of managing your data ecosystem will no longer work. In the old days, when a central team could manage the company’s most important data, that team could plan and budget for data spending for the year. That doesn’t work anymore.
That’s because you’ll never realize the value of your data investments by leveraging an old [centralized] ecosystem model. So, the first thing you must know is that the model that you need to employ at your company has changed. You’re going to have to federate.
Now, you can’t just go and fully decentralize and turn the keys to the kingdom over to all of the individual teams closest to the business. First, there’s privacy legislation popping up everywhere, and you don’t want to introduce unnecessary risk to your business. Second, if they’re given a choice between getting their job done and saving money, the business teams will always pick getting their job done as quickly as possible.
If you want to have any semblance of well-managed data budgets, you must enforce a central set of standards for data risk and cost governance that will apply to everyone.
At Capital One, we use the term federated. Teams operate under a central set of standards, which enables them to work independently to get their job done.
DigitalCxO: How does that work in practice?
Barch: I’ll give you some examples. I’ll start with data and then move into cost. Every data set at the company must adhere to a central set of policies: metadata curation policies and data quality checks. And every data set at the company needs to be protected with an entitlement that describes the riskiness of the data set. Those policies are created centrally. They are deployed in a central set of tools. And then, those tools are turned over to the various teams responsible for publishing and consuming data. Those teams know that as long as they go through our central data discovery and access request workflows, they adhere to enterprise governance because it’s built into the workflow.
On the cost side, our Snowflake platform team, for example, developed some guidelines early on. For instance, our Dev and QA environments can never provision anything larger than a small warehouse.
They can set policies that dictate when ad hoc queries can run. They can’t enable that data warehouse to scale beyond a medium size. They’ve been able to define those policies centrally, and then our federated line of business tech teams use those central tools to manage their infrastructure. That way, the business teams can move at their own pace. They can move at the speed of business. But our tech organization and finance teams can likewise trust that best practices are being followed and that what we spend at the end of the month will be predictable.
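To make the guardrail idea concrete, here is a minimal Python sketch of a centrally defined size cap that federated teams would be checked against before provisioning. It is an illustration only, not Capital One’s or Snowflake’s implementation. The size names, the `MAX_SIZE_BY_ENV` table, the `WarehouseRequest` type, and the `adhoc` environment are all hypothetical; only the "dev and QA never larger than small" and "no scaling beyond medium" rules come from the interview.

```python
from dataclasses import dataclass

# Warehouse sizes in ascending order (illustrative names).
SIZES = ["xsmall", "small", "medium", "large", "xlarge"]

# Hypothetical central policy: the largest size each environment may provision.
# The dev/qa caps follow the interview; the "adhoc" entry is an assumption.
MAX_SIZE_BY_ENV = {"dev": "small", "qa": "small", "adhoc": "medium"}


@dataclass(frozen=True)
class WarehouseRequest:
    team: str
    environment: str
    size: str


def is_allowed(request: WarehouseRequest) -> bool:
    """Return True if the requested size is within the environment's cap."""
    cap = MAX_SIZE_BY_ENV.get(request.environment)
    if cap is None or request.size not in SIZES:
        return False  # unknown environment or size: deny by default
    return SIZES.index(request.size) <= SIZES.index(cap)


if __name__ == "__main__":
    print(is_allowed(WarehouseRequest("cards", "dev", "small")))   # within cap
    print(is_allowed(WarehouseRequest("cards", "dev", "medium")))  # over cap
```

Because the policy lives in one table, a central team can change a cap once while each business team keeps provisioning on its own schedule.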
DigitalCxO: And compliance is actively monitored?
Barch: Many reports enable our central team to view spending by business unit and any sub-business unit. We enable teams to tag various infrastructure components with projects or outcomes if needed so they can allocate costs that way. They can reactively see how much they’re spending within the guidelines that have been set. But we also send alert notifications if we notice cost spikes week over week or month over month. We help teams identify potential trouble spots by surfacing which users are incurring the most cost and which queries are incurring the most cost. And we’re rolling out some additional functionality in the coming months that will not just surface those trouble spots but help people act and give people more power over solving them.
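The monitoring Barch describes can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not Slingshot’s logic: the function names, the 25% spike threshold, and the shape of the input data (spend per business unit, and per-query cost records tagged with a user) are all hypothetical.

```python
from collections import defaultdict


def find_cost_spikes(previous, current, threshold=0.25):
    """Return {unit: fractional_increase} for units whose spend grew by more
    than `threshold` between two periods (for example, week over week).

    Units with no spend in the previous period are skipped, because a
    percentage increase is undefined for them.
    """
    spikes = {}
    for unit, now in current.items():
        before = previous.get(unit, 0.0)
        if before > 0:
            increase = (now - before) / before
            if increase > threshold:
                spikes[unit] = increase
    return spikes


def top_spenders(query_costs, n=3):
    """Given (user, cost) records, return the n users with the highest total cost."""
    totals = defaultdict(float)
    for user, cost in query_costs:
        totals[user] += cost
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)[:n]


if __name__ == "__main__":
    last_week = {"cards": 100.0, "auto": 200.0}
    this_week = {"cards": 150.0, "auto": 210.0}
    print(find_cost_spikes(last_week, this_week))
    print(top_spenders([("ana", 5.0), ("bo", 2.0), ("ana", 1.0)], n=1))
```

In practice the inputs would come from whatever usage and billing data the platform exposes, and the threshold would be tuned per business unit.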
DigitalCxO: Why don’t companies realize the value of their data in legacy centralized models?
Barch: It creates bottlenecks. You get value out of your data by enabling the really smart analysts and scientists you pay a lot of money to be creative and look for new insights and patterns in datasets you may not realize have underlying value. But when you focus on governing and tightly controlling access to a small number of datasets, you limit the universe of information your smart people have access to. And you make it way too difficult for them to do what you’re paying them for: generating new insights that will bring your business forward.
DigitalCxO: It makes sense. If you unleash their creativity and ability to access the data and do what they want with it, they will find interesting things. What are some other things companies do wrong that increase their data costs?
Barch: Some companies will over-index on enablement. That creates another problem around data and security risks and cost overruns. Operating in the cloud is about striking that perfect balance between empowerment and governance.
My short, pithy statement here is to slope governance based on risk. Not every dataset is created equal. Not every use case is created equal. You don’t need to put as much governance and control around temporary user data or staging tables as you do for data going into a Wall Street financial report.
I’ll give you an example here. We enable our teams to create new insights by playing around with data in a well-managed sandbox environment that’s relatively lightly governed. They can only store data for 30 days, because that’s when data comes into scope for some privacy legislation. Those sandbox environments are still controlled by the appropriate data access policies that detail who can access what. But there are not many rules set around what analysts can do in their sandbox space, provided they’re adhering to all relevant legislation. They can promote their work from their user sandbox to a shared collaboration space when they realize they’ve done something interesting and want to share it with others on their team.
Now others on their team with relevant access can see their data product, collaborate on it, and start thinking about how they might want to use that insight in a model or report. At that point, we ask our teams to provide more information about what they are doing. They have to register their data and provide basic data quality information. They have to keep track of who the owner is, and the governance gets stepped up.
If those teams identify that what they’ve made is starting to drive impact for our business, we will want to harden this process. We enable the process creator to promote it to production, where the script is governed and stored in GitHub for change control purposes. The data product fully adheres to our enterprise data quality standards, enterprise lineage standards, data protection standards, and metadata curation standards.
Going through that process from the sandbox environment to full production is governed by a workflow that includes all of the enterprise’s security controls.
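The idea of sloping governance based on risk can be pictured as a set of tiers, each with its own checklist that must be complete before work is promoted. The Python sketch below is one hedged way to model that and is not Capital One’s workflow. The tier names, the requirement labels, and the helper functions are hypothetical; the requirements themselves and the 30-day sandbox limit are taken from Barch’s description.

```python
from enum import IntEnum


class Tier(IntEnum):
    SANDBOX = 1
    COLLABORATION = 2
    PRODUCTION = 3


# Hypothetical labels for the controls described at each tier.
REQUIREMENTS = {
    Tier.SANDBOX: set(),
    Tier.COLLABORATION: {"registered", "owner", "quality_info"},
    Tier.PRODUCTION: {
        "registered", "owner", "quality_info",
        "version_controlled", "lineage", "data_protection", "metadata_curated",
    },
}

SANDBOX_RETENTION_DAYS = 30  # sandbox data may only be stored this long


def missing_for_promotion(completed: set, target: Tier) -> set:
    """Return the controls still missing before promotion to `target`."""
    return REQUIREMENTS[target] - completed


def sandbox_data_expired(age_days: int) -> bool:
    """Return True once sandbox data has outlived the retention window."""
    return age_days > SANDBOX_RETENTION_DAYS


if __name__ == "__main__":
    done = {"registered", "owner"}
    print(missing_for_promotion(done, Tier.COLLABORATION))
    print(sorted(missing_for_promotion(done, Tier.PRODUCTION)))
```

A promotion workflow could refuse to proceed while `missing_for_promotion` returns a non-empty set, so governance is stepped up only as the work moves closer to production.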
DigitalCxO: With all that in place and the proper guardrails and governance, how you maintain your cost structures is clear. However, what are typically the things that spin costs out of control for organizations?
Barch: The first is right-sizing your compute. Ensure you’re not over-provisioning when you don’t need to, but also that you’re not under-provisioning where you need to make the right investments. Simply staying on top of right-sizing your warehouses and their scaling capacity for the job at hand is one of the first places to start.
Second, keep an eye on your workloads. In the old days, if somebody wrote a bad SQL query, the only impact was delayed performance for everybody else who was also running queries on that warehouse. Nowadays, the impact of a bad SQL query could be that the warehouse never sleeps or that it scales up. And before you know it, you’re spending more than you thought. So make sure that you’re identifying and optimizing your troublesome queries, and that you’re doing it in a reasonably proactive way. That doesn’t mean at the end of the month or once a quarter. It means you’re doing it every day.
The third is to empower your teams with information. It’s interesting how quickly people will fix bad behavior when you put together a report highlighting it. So putting in place a suitable federated ownership model, highlighting bad behavior in reports, using dashboards, and even using gamification strategies to hold those federated teams accountable for what they’re doing is another approach you can take.
DigitalCxO: What are some of the most important lessons you’ve learned?
Barch: I would say one of our biggest learnings is the importance of defining the right patterns for your business early and accounting for your business teams’ needs early.
You can’t do this if you try to approach this with the attitude that you have one enterprise standard, and everyone must fit their process into the standard. That’s not going to be successful, or you’re certainly not going to be as successful as you can be.
Instead, find champion teams in your lines of business that account for the different vectors of complexity you’ll need to solve. Define the right patterns and standards with those champion teams and then roll those patterns and standards out to the rest of the company.
But you need to start with some design work up front and decide how your access and governance policies should work. During this work, you’ll determine how to allocate your budget at the end of each month, decide what variables to use to forecast your budget and define your ecosystem’s architecture and how it should function. It’s essential because this is an integrated ecosystem that touches a lot of different teams.
Fortunately, once you’ve established those initial patterns, it becomes a process of showcasing your patterns and standards to the rest of the organization.
This is all a lot of work, but when you’re done, your organization will find it is getting more value from its data while keeping data costs in line.