Know Your Business (KYB): The next frontier of knowing everything about your customers
Nyca has been focused on digital identity since our inception in 2014. Managing fraud and preventing illicit use of the financial system is a core objective of every participant in financial services. For businesses that serve other businesses, being certain of who owns the business and what their motivations are is still extremely challenging. In this article we’ll explain key KYB vocabulary, explore major trends, dissect the current ecosystem of players, and highlight innovators shaking up the market.
What is KYB, and how does it relate to KYC, fraud, and risk?
Know Your Business (KYB), an extension of Know Your Customer (KYC) processes, focuses on verifying identity and assessing risk of business customers at the time of onboarding and continuously throughout the business customer lifecycle.
KYB, fraud, and risk are interconnected, but not exactly the same. KYB traditionally focuses on regulatory compliance, which can sometimes feel like a “check-the-box” activity to avoid penalties and fines. However, much of the data collected for KYB is also highly relevant to fraud and risk (more on the key data elements later).
To accurately detect fraud, it’s important to know your business’s identity, but also pay attention to a number of other real-time fraud signals (transaction details, behavioral/device information, etc.).
To underwrite risk (i.e., credit risk), it’s important to know your business’s identity, including history of repayment (usually via credit scores) and expected future cash flows (usually through historical cash flow data).
The devil is in the details.
KYC regulations emerged in response to a growing trend of tax evasion via secret foreign bank accounts and bank policies against retaining records of their clients’ use of accounts. In 1970, the first US money laundering guidelines were created with the landmark Bank Secrecy Act (BSA). Among other things, the BSA requires US financial institutions to maintain certain records and report certain kinds of customer activities, including high value transactions, to the federal Financial Crimes Enforcement Network (FinCEN), a bureau of the Treasury. The BSA is also why we fill out a form indicating whether we are carrying currency or monetary instruments worth more than $10,000 any time we enter or leave the US.
The BSA has been amended many, many times since enactment. The BSA as an anti-terrorism tool emerged after the law was amended in the late 1980s/early 1990s to require transaction monitoring and to impose an affirmative obligation to report on suspicious transactions by filing Suspicious Activity Reports (SARs). After the September 11, 2001 terrorist attacks, requirements on verifying the identity of customers were strengthened through Title III of the 2001 USA PATRIOT Act.
Jurisdictions around the world have adopted similar laws, such as the multiple iterations of the Anti-Money Laundering Directives (AMLD) in the EU. In fact, there is an entire intergovernmental organization, the Financial Action Task Force (FATF), that identifies ways to improve the global anti-money laundering (AML)/counter-terrorism financing (CFT) regime.
However, up until recently these laws did not require that financial institutions subject their business relationships to the same AML/CFT scrutiny as their consumer relationships. On the heels of the Panama Papers, FinCEN closed this blind spot in 2016 with the introduction of Customer Due Diligence (CDD) Requirements for Financial Institutions, which introduced a KYB requirement of identifying and verifying the identity of the beneficial owners of companies opening accounts (defined as controlling owners or individuals owning 25%+ of the company). Then in 2021, Congress passed the Anti-Money Laundering Act of 2020 (AMLA), which included a sweeping BSA overhaul and revamp of the CDD Rule that contained:
The Corporate Transparency Act (CTA), which established uniform beneficial ownership reporting requirements for most companies (starting Jan 2024) and created a secure database for beneficial ownership information at FinCEN
Stricter penalties for BSA/AML violations
Increased incentives for whistleblowers
Greater public/private and international cooperation on financial crime matters
Furthermore, this idea of requiring companies to collect information about their commercial counterparties has been expanding from financial crimes to other activities. In June 2023, the INFORM Consumers Act took effect, requiring online marketplaces to collect information from third-party sellers within 10 days of their obtaining high-volume status.
It’s worth noting that behind these regulations in the US is a chaotic, fragmented data landscape. A few other countries are ahead of the curve in figuring out a scalable way to extract data. For example, the UK Companies House serves as a centralized national registry on business registration, ownership, and financial data.
Why should anyone care about knowing the business that they do business with? The trends below illustrate why KYB has a meaningful impact on business performance.
1. Fraud and cyber risk continue to grow
A recent large scale example of this was the PPP program run by the government during the COVID pandemic. 15%+ of COVID PPP applications were fraudulent – totaling $76B of $800B disbursed.
Business synthetic identities (e.g., fake Amazon storefronts) and the use of shell companies are on the rise. For example, if a Russian state-sponsored hacker wants to open a US bank account, they might go through intermediaries or shell companies. While those intermediaries might pass compliance checks, more sophisticated fraud detection tools would be needed to understand the source and use of funds and prevent fraud or sanctions violations.
Additionally, structural shifts are introducing new challenges. Ubiquitous real-time payments (RTP/FedNow in the US) require the ability to monitor and react in real time, or else payments will be held for days while compliance teams investigate for AML risk. Momentum to create in-country payment infrastructure following SWIFT’s ban of some Russian banks increases siloed data problems.
2. The number of small businesses has been steadily increasing
As of 2021 there were 330m SMBs globally and 32m in the US.
Additionally, there are 5m+ new business applications in the US annually. SMBs are harder to verify than large enterprises for a number of reasons. 29% have no website and 80% are owner-operated with no employees. For these sole proprietors, most of the KYB process behaves like a traditional KYC process. Reporting requirements are also much more limited for SMBs compared to public companies.
3. SMBs expect but are not always receiving consumer-like digital experiences
Setting up business banking accounts can take up to 100 days, whereas today most consumer bank accounts can be opened in minutes. And the business case for speed is clear - fintechs, including Mercury, Brex, and Arc, captured a lion’s share of deposits following the Silicon Valley Bank collapse. Money center banks, especially Chase, also attracted rapid SMB inflows during the 2023 period of regional bank collapses (including Signature Bank and First Republic). Poor KYB infrastructure inhibits the ability of B2B businesses to move fast and be open-minded about who they work with.
These very real concerns impact not just financial institutions, but also software vendors, retailers, social media websites, travel/hospitality providers, and many other types of B2B businesses.
Similar to the KYC landscape, the number of KYB players is extensive. Above is a non-exhaustive list of established growth-stage and legacy players in the space. Broadly, players fall into three buckets: data/service provider, platform, or orchestration/case management tool.
Data/service providers offer a unique and proprietary data set or data aggregation layer. Risk and compliance problems are ultimately data science problems, and more quality data = better outcomes.
Banks and other FIs performing KYB due diligence typically run validations against the data sets below:
Some of these data sets are extremely commoditized (e.g., sanctions list from OFAC), with small incremental enrichment opportunity. Many players cleanse, aggregate, and resell these raw data sources. For example, Middesk built direct integrations with each Secretary of State, handled much of the entity resolution process on their end (i.e., de-duplicating businesses with similar names), and exposed that data via an API.
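To make the entity resolution step concrete, here is a minimal sketch of de-duplicating business records with similar names. It is not how Middesk or any other vendor actually does it: the record fields, the suffix list, the 0.9 similarity threshold, and the function names are all assumptions chosen for illustration, and production systems typically also match on addresses, registration numbers, and officers.

```python
import re
from difflib import SequenceMatcher

# Legal suffixes ignored when comparing names (illustrative, not exhaustive).
LEGAL_SUFFIXES = {"inc", "incorporated", "llc", "ltd", "limited",
                  "corp", "corporation", "co", "company"}


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop legal suffixes."""
    tokens = re.sub(r"[^a-z0-9 ]", " ", name.lower()).split()
    return " ".join(t for t in tokens if t not in LEGAL_SUFFIXES)


def same_entity(a: dict, b: dict, threshold: float = 0.9) -> bool:
    """Assume two records match if they share a state of registration
    and their normalized names are near-identical."""
    if a["state"] != b["state"]:
        return False
    ratio = SequenceMatcher(
        None, normalize_name(a["name"]), normalize_name(b["name"])
    ).ratio()
    return ratio >= threshold


def deduplicate(records: list[dict]) -> list[list[dict]]:
    """Greedily group records; each group stands for one resolved entity."""
    groups: list[list[dict]] = []
    for record in records:
        for group in groups:
            # Compare against the first record in each existing group.
            if same_entity(group[0], record):
                group.append(record)
                break
        else:
            groups.append([record])
    return groups


# Hypothetical registry rows, as they might arrive from different sources.
records = [
    {"name": "Acme Widgets, Inc.", "state": "DE"},
    {"name": "ACME WIDGETS INC", "state": "DE"},
    {"name": "Acme Widgets LLC", "state": "CA"},
    {"name": "Apex Wholesale Co.", "state": "DE"},
]
groups = deduplicate(records)
```

In this sketch the two Delaware "Acme Widgets" rows collapse into one entity, while the California LLC stays separate because the state of registration differs. Whether that is the right call is exactly the kind of judgment that makes entity resolution hard.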
Other data elements, like criminal records, are more proprietary in that they require near real-time analysis of news (i.e., adverse media monitoring). Vendors like Dow Jones / Ripjar, LexisNexis, Refinitiv World-Check, Regulatory Data Corp (owned by Moody's Analytics), and ComplyAdvantage scrape millions of news articles daily and tie those to individuals/companies to provide adverse media monitoring as a service.
Some vendors are known in the industry to be better than others at providing certain attributes. In addition to quality/accuracy, speed can be a differentiator. Some of these lists, like political leaders, are fairly static. On the other hand, fraud happens quickly and often, so low latency updates around criminal records can be immensely valuable for proactive detection. But more frequently updating data sources are generally more expensive, so depending on needs and use case, a cheaper, more static option might be warranted.
There are also best-in-breed service providers that offer more dynamic services. Jumio, Onfido, Veriff, and AU10TIX are known for identity verification, Prove for mobile phone verification, Emailage (owned by LexisNexis) for email verification, SentiLink for synthetic identity, and so on. Many of these companies operate in KYC as well as KYB; particularly for very small businesses, the distinction between an individual—the owner—and the business is often nebulous.
In addition to these verification services based on historical data (“extrinsic signals”), several players, including Sardine, Biocatch, and Neuro-ID, are innovating on “intrinsic signals”, such as behavioral biometrics/analytics. As Sardine CEO Soups puts it, with a huge amount of data online, it’s easy for fraudsters to train LLMs to copy someone’s face and voice and pass normal identity verification processes. Furthermore, today fake businesses are properly registered, and stolen identities pass KYC checks. In the future, device & behavior data might very well provide the strongest risk signals.
Many companies advertise themselves as end-to-end KYB solutions, but from our conversations with 30+ senior executives and founders in the space, very few can offer a one-size-fits-all solution. Some of these platforms (e.g., Unit21, Persona) have taken an open approach, where they enable customers to build out a combination platter of their own data/services and data/services from partners integrated via API. Others, like Trulioo, have binding agreements with the data vendors they use. The space in general has become quite incestuous, with many platforms and vendors using one another.
Orchestration can be thought of as the “brain” in the process - it provides a systematic way to manage workflows throughout the customer lifecycle, usually with an emphasis on the acquisition and onboarding funnel. Orchestration tools will pre-integrate with multiple data sources and provide a low-code/no-code ability to design decisioning rules and workflow sequences. These workflows might also involve asynchronous steps, including cases for manual review or filing reports.
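As a rough illustration of what "decisioning rules and workflow sequences" can look like under the hood, here is a minimal sketch of a rules pipeline. The `Application` fields, the rule names, the 0.7 fraud threshold, and the three outcomes are assumptions for this example and do not reflect any specific vendor's product; real orchestration tools expose this logic through low-code/no-code builders rather than hand-written code.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Application:
    """Signals gathered from (hypothetical) pre-integrated data sources."""
    business_name: str
    registered: bool      # e.g., result of a business registry lookup
    sanctions_hit: bool   # e.g., result of a watchlist screen
    fraud_score: float    # 0.0 (low risk) to 1.0 (high risk)


# A rule returns a decision, or None when it has no opinion.
Rule = Callable[[Application], Optional[str]]


def sanctions_rule(app: Application) -> Optional[str]:
    return "deny" if app.sanctions_hit else None


def registration_rule(app: Application) -> Optional[str]:
    # Unregistered businesses go to a human analyst (an asynchronous step).
    return "manual_review" if not app.registered else None


def fraud_rule(app: Application) -> Optional[str]:
    return "manual_review" if app.fraud_score >= 0.7 else None


# The workflow sequence: rules run in order and the first decision wins.
RULES: List[Rule] = [sanctions_rule, registration_rule, fraud_rule]


def decide(app: Application, rules: List[Rule] = RULES) -> str:
    for rule in rules:
        outcome = rule(app)
        if outcome is not None:
            return outcome
    return "approve"


clean = Application("Acme Widgets", registered=True, sanctions_hit=False, fraud_score=0.1)
risky = Application("Shell Co", registered=True, sanctions_hit=False, fraud_score=0.9)
blocked = Application("Bad Actor Ltd", registered=True, sanctions_hit=True, fraud_score=0.2)
```

Because the rules are just an ordered list, swapping a data vendor in or out means changing one rule rather than rebuilding the workflow.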
Orchestration becomes a more acute need when 3+ data sources are being combined, or when workflows/logic/data vendors are being changed frequently. Ultimately, because credit and fraud exposure for SMB accounts is much higher than for consumer accounts (e.g., for credit cards: $50-200k vs $3-4k), businesses serving SMBs are more likely to invest in multiple risk tools. Additionally, point solutions become “stale” over time, and buyers may want to rip and replace a specific vendor as technology matures. If a central orchestration platform continues to build integrations with new, best of breed players, this reduces the tech effort required by the buyer to switch solutions in and out.
Many vendors have pivoted to focus on orchestration because of the long-term stickiness and ability to own the end customer relationship (making it easier to upsell/cross-sell other solutions). A competing tension is that many large enterprises are using a platform/orchestration engine to launch something quickly, try out many vendors, and then go direct to vendors and build the orchestration themselves to save on costs and own the relationship with the data provider (i.e., to negotiate volume discounts and request customization). A key point here is that orchestration in itself is not a moat without value-add on top; many orchestrators build down over time in light of this challenge.
While many of the platform players also offer orchestration, the players in our market map labeled as pure play orchestration generally do not offer their own data sources aside from custom risk scores. Instead, their go-to-market is a fully open, integration-first model, focused on orchestrating a specific part of the customer risk lifecycle. Generally speaking, Alloy is preferred for account opening, Unit21 specializes in continuous screening/monitoring, and Hummingbird is focused on manual investigations and SAR filing.
Alloy in particular has become a leader in KYB orchestration due to its strong pipeline of ecosystem partners, giving its clients both flexibility to integrate with partners of their choosing and also guidance in picking best of breed partners through “best practice bundles”. In addition, Alloy has enabled other end-to-end providers to succeed; examples include MANTL, which provides digital account-opening-in-a-box for community banks and credit unions, and Sila Money, which provides payment infrastructure as a service. Alloy also enables toggling on and off data sources with ease, which can be helpful in periods of large volatility (e.g., influx of applications during COVID pandemic).
Areas of Opportunity
Who will win out on the platform side? When/where is the consolidation coming?
The $12B+ market for KYB services is large and growing, but on the other hand the data is largely commoditized. Next-generation KYB players will turn lemons into lemonade via novel approaches that make the onboarding and continuous monitoring processes more seamless, less error prone, and tailored to the use case at hand. Additionally, no company has a chance at becoming a comprehensive KYB platform without having a stellar wedge that incentivizes switching from legacy providers. We also think M&A is a necessary evil on a quest to become a comprehensive KYB platform. Legacy players like LexisNexis and Moody’s Analytics have been buying up hot KYC/KYB startups for the last decade.
We think this is possible through four wedges (and we would love to speak with anyone building a company in one of these areas!).
A long tail of alternative, disparate data sets exists (e.g., TrueBiz for website data, Verdata for consumer complaint data, Mesh for professional licenses, among others). These will not displace core due diligence requirements like verifying registration data or screening watchlists, but they may serve as a point of amplification around risk scores. We expect opportunities to develop for entrepreneurs to experiment with a broad range of novel data types for specific use cases - for example, UPS data to confirm receipt of goods - as these data sources become more accessible.
In a previous installment of Fintech Fundamentals, we talk about the need for a fourth bureau in consumer lending: a trusted third party to provide clean data from new sources (and at lower cost) is long overdue. Some startups are focused on the same problem statement for KYB data. For example, Osiris is combining traditional KYB with product adjacencies, focusing on proprietary data sources for fraud, sole proprietor verification, beneficial ownership information, industry (NAICS/SIC), people mapping, and operational status signals - all while allowing the business owners themselves to claim their profile in a “Google Business Profile” fashion.
Cross-border data aggregation
An extension of the “more data” problem is access to global data sets. Many incumbents do provide this, but startups have an opportunity to aggregate more cleanly across languages, currencies, and other nuances that create differences between countries. For example, ShuftiPro can verify ID documents in 150+ languages, including Arabic. AiPrise is an S22 YC company attempting to serve fintechs who are more global in nature with no-code orchestration workflow capabilities combined with massive global data coverage (claims to have 500M+ verifiable businesses on the platform!).
A common adage in fintech is new solutions to old problems. The rapid advances of AI/ML continue raising the bar for new tools that can solve old problems. Coris offers a suite of tools to address gaps in Stripe’s merchant risk platform faced by vertical SaaS companies and PayFacs, including merchant classification using GPT-4 and ongoing merchant monitoring. MinervaAI offers enhanced due diligence in a single, context-driven search (as opposed to endless scrolling through Google pages). Sandbar offers off-the-shelf rules and holistic models to cut down false positives in transaction monitoring, which can be >95% of alerts. Greenlite wants to use AI agents to fully automate some of the responsibilities of KYC/KYB analysts. A number of companies are also attempting to make the Ultimate Beneficial Ownership (UBO) visualization and drill down process more efficient.
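To show why the UBO drill-down is tedious by hand, here is a minimal sketch that looks through layered ownership to find individuals at or above the 25% ownership threshold from the CDD Rule. The company names, the ownership figures, and the function name are hypothetical; the sketch assumes an ownership structure with no circular holdings and ignores the separate "control" prong of the beneficial ownership definition.

```python
# Ownership threshold from the CDD Rule (25%+ of the company).
UBO_THRESHOLD = 0.25


def effective_ownership(entity: str, owners: dict[str, dict[str, float]]) -> dict[str, float]:
    """Return each individual's total (direct plus indirect) stake in `entity`.

    `owners` maps a company to its direct owners and their fractional shares.
    Any owner that is itself a key in `owners` is treated as a company and
    looked through; anything else is treated as an individual.
    Assumes no circular ownership.
    """
    totals: dict[str, float] = {}
    for owner, share in owners.get(entity, {}).items():
        if owner in owners:
            # Intermediate company: scale its owners' stakes by its share.
            for person, indirect in effective_ownership(owner, owners).items():
                totals[person] = totals.get(person, 0.0) + share * indirect
        else:
            totals[owner] = totals.get(owner, 0.0) + share
    return totals


# Hypothetical structure: OpCo is 60% owned by HoldCo and 40% by Alice;
# HoldCo is owned by Bob (50%), Carol (30%), and Alice (20%).
owners = {
    "OpCo": {"HoldCo": 0.60, "Alice": 0.40},
    "HoldCo": {"Bob": 0.50, "Carol": 0.30, "Alice": 0.20},
}

stakes = effective_ownership("OpCo", owners)
beneficial_owners = sorted(
    person for person, stake in stakes.items() if stake >= UBO_THRESHOLD
)
```

Here Bob never appears on OpCo's own cap table, yet his indirect stake through HoldCo (0.60 × 0.50) clears the threshold, while Carol's (0.60 × 0.30) does not. Real structures span many more layers and jurisdictions, which is where visualization tools earn their keep.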
Better orchestration capabilities
We do believe that orchestration is ultimately not an area that banks/FIs should worry about building in-house because it is not a point of competitive differentiation visible to end customers. With that in mind, we believe orchestration is not a “winner-take-all” market, and many alternative solutions have begun to emerge. Parallel Markets provides a fully orchestrated business onboarding process and creates reusable business identities to reduce duplicative onboarding work for SMBs across different FIs. Oscilar provides a lightweight alternative to Alloy, whereas Sliderule and Effectiv offer more customization and analytics. Ballerine is experimenting with an open source model. Marble offers a low code, real-time rules engine to make it possible to monitor and risk score events besides transactions, such as logins, signups, or new accounts, and maintain a robust audit trail. Footprint orchestrates KYC/KYB, security, and fraud while offloading the security risk of storing PII through vaulting infrastructure. Dotfile, Bits Technology, Ondorse, Detected, ComplyCube, and several others are focused on the European market.
Who will be the next unicorn in the space? There will always be an emerging “next-generation” of advanced data and service providers. As we highlighted earlier, many of the underlying data sets are commodities, so pure KYB data providers will erode over time due to margin pressure from low-cost competitors. To get to $100m+ in ARR and achieve the holy grail of a comprehensive KYB solution, a few qualities are essential:
A defensible data moat, which can come from:
Unique data sets and lack of reliance on traditional data points
Give-to-get models with network effects, built on modern data infrastructure that enables fast feature changes and ML enhancement
An opinionated “open-platform strategy” allowing buyers to orchestrate end-to-end KYB workflows combining in-house assets with third party data/services
Architecture that intentionally combines product adjacencies (credit, fraud) and enables customer segment adjacencies (non-FIs)
Relentless execution and opportunistic M&A