Databricks Ventures

Everything you need to know about Databricks Ventures: their three-pronged ecosystem investment thesis, notable portfolio companies, typical check size, and how to position your startup for funding.

Databricks Ventures is the strategic investment arm of Databricks, the San Francisco-based data analytics company that reached a $62 billion valuation following a $10 billion funding round in late 2024. Founded in 2021 and led by Andrew Ferguson, the venture arm deploys capital from Databricks' balance sheet rather than a traditional LP-backed fund structure, making it uniquely positioned as both a financial and strategic investor in the data, analytics, and AI ecosystem.

The firm has backed more than 50 startups in roughly four years of operation, making approximately 15 new investments per year across a three-pronged investment thesis: ecosystem software companies that enrich the Databricks environment, system integrators that help enterprises implement the platform, and AI-focused startups through the Databricks AI Fund launched in May 2024.

This comprehensive guide covers everything founders need to know about securing funding from Databricks Ventures, including their specific investment criteria, real portfolio companies, typical check sizes, and proven strategies for pitching your startup to the team.

Key Takeaways

  • Databricks Ventures is the strategic investment arm of Databricks ($62B+ valuation), deploying capital from the company's balance sheet rather than a traditional VC fund.
  • Typical check size: $1M to $10M per investment for seed through Series B, with a preference for participating rather than leading in Series A and later rounds.
  • Three-pronged thesis: (1) Ecosystem software enriching Databricks, (2) System integrators for enterprise implementation, (3) AI-focused startups via the AI Fund.
  • Notable portfolio includes Anomalo, Cleanlab, Glean, Mistral AI, Perplexity, Nimble, Unstructured, Labelbox, Matillion, and Lovelytics.
  • Portfolio companies gain access to Databricks' enterprise customer base, Delta Sharing integration, technical experts, and cloud marketplace distribution.
  • Databricks Ventures is more focused on strategic ecosystem fit than on valuation discipline when making investment decisions.

Investment Focus & Thesis

Databricks Ventures operates with a deliberately structured three-pronged investment thesis that reflects the company's broader ecosystem strategy.

The first pillar covers ecosystem software companies that build products tightly integrated with the Databricks Lakehouse platform. These companies use Databricks APIs, Delta Sharing, or other platform primitives to deliver complementary functionality in data quality, governance, MLOps, or analytics tooling. The strategic value comes from enriching the platform offering and increasing customer stickiness.

The second pillar focuses on system integrators and consulting partners that help enterprise customers implement Databricks at scale. These investments are less about financial returns and more about building a robust delivery ecosystem that makes Databricks the default choice for enterprise data transformation projects.

The third pillar, launched as the Databricks AI Fund in May 2024, targets AI startups that are building on top of or alongside the Databricks platform. Alongside equity investment, early-stage support (the $1M+ in prizes offered through the Built-On Databricks Startup Challenge, plus credits, mentorship, and expert access) is aimed at pre-seed and seed-stage companies using or enabling AI in connection with Databricks. The AI Fund has already backed six companies: Anomalo, Cleanlab, Glean, Mistral AI, Perplexity, and Unstructured.

Unlike traditional venture funds, Databricks Ventures is explicit that valuation discipline takes a back seat to strategic fit. As Andrew Ferguson has stated, the team is 'more focused on whether it makes strategic sense to get into the round than on the valuation.' This creates an unusual opportunity for founders: strategic value can justify a higher price than pure financial logic would support.

Recent Investment Activity

Databricks Ventures has maintained a consistent investment pace of roughly 15 new investments per year since its founding in 2021, with over 50 companies now in the portfolio. The firm is notably active in the Israeli market, having backed Nimble's $47 million Series B in February 2026 to help enterprise AI systems access real-time web data through Delta Sharing integration.

The launch of the AI Fund in mid-2024 marked a deliberate shift toward earlier-stage AI companies. The Built-On Databricks Startup Challenge, announced in April 2025, offers $1M+ in prizes for innovative startups building B2B applications on Databricks, further signaling the firm's commitment to the developer ecosystem.

Databricks Ventures has been particularly active at the intersection of enterprise AI and proprietary data infrastructure. The rise of agentic AI applications, supported by Databricks' Lakebase (a managed Postgres database for operational data) and Apps (enterprise application deployment) offerings, has become a key area of focus for new investments. Ferguson has publicly predicted that 2026 will be the year enterprises begin consolidating their AI vendors, making deep platform integration a strategic necessity.

In addition to new investments, Databricks Ventures actively supports its portfolio through follow-on rounds, particularly for companies demonstrating strong ecosystem traction and platform integration success.

Notable Portfolio Companies

Anomalo: Data quality monitoring platform that uses machine learning to automatically detect anomalies and data issues in enterprise data pipelines, integrated with Databricks for automated alerting and remediation workflows.

Cleanlab: Enterprise data cleaning platform that automatically identifies and corrects errors in structured and unstructured datasets, helping data teams maintain high-quality training data for AI models on Databricks.

Glean: Enterprise AI search and knowledge discovery platform that connects to organizational data sources including Databricks, enabling natural language querying across an entire company's data assets.

Mistral AI: French AI company developing large language models and AI infrastructure, one of Databricks Ventures' higher-profile bets in the foundational AI model layer.

Perplexity: AI-powered search engine that represents the consumer-facing end of the AI ecosystem thesis, providing an alternative approach to information discovery that complements data analytics workflows.

Nimble: Israeli startup that automates real-time web data ingestion into enterprise AI systems, integrating with Databricks and Microsoft Azure through Delta Sharing to close enterprise AI data gaps.

Unstructured: Data preprocessing platform purpose-built for AI workloads, helping enterprises transform unstructured data into formats optimized for LLM fine-tuning and retrieval-augmented generation on Databricks.

Labelbox: Enterprise training data platform for AI model development, providing data labeling, curation, and governance capabilities that integrate with Databricks' MLflow for end-to-end model lifecycle management.

Matillion: Data transformation and ELT platform that enables data teams to load, transform, and orchestrate data pipelines into Databricks, widely used by enterprises migrating from legacy data warehouses.

Lovelytics: Data and AI consultancy that specializes in implementing Databricks solutions for enterprise customers, representing the systems integrator pillar of the investment thesis.

What Databricks Ventures Looks For

Ecosystem integration depth is the single most important factor in a Databricks Ventures investment decision. The team evaluates whether a company's technology meaningfully uses Databricks platform primitives, not just whether it could theoretically integrate someday. Founders should be prepared to demonstrate working API integrations, Delta Sharing connections, or native MLflow compatibility.

Data quality and AI readiness has become an increasingly important evaluation criterion. With the rise of LLMs and agentic AI workflows, Databricks Ventures looks favorably on companies that help enterprises get their data in order before feeding it into AI systems. Data observability, data quality monitoring, and data governance solutions that reduce AI hallucinations are particularly relevant.

Founding team depth in data infrastructure matters significantly. The team can evaluate technical architecture quickly and values founders who have clearly thought through how their product fits within the enterprise data stack. Prior experience building and operating data infrastructure at scale is a meaningful signal.

Enterprise traction within the Databricks customer base creates a powerful positive feedback loop. Companies that can point to referenceable enterprise customers already using both Databricks and their product have a significantly higher likelihood of getting funded.

Business model clarity around consumption-based pricing is valued. As enterprises increasingly move toward consumption-based procurement for AI and data tools, companies whose pricing model aligns with platform usage (as opposed to flat seat licenses) tend to have stronger retention and expansion metrics.

Competitive moats must be defensible beyond the Databricks relationship. The fund does not require exclusivity and wants to see that companies can win independent of the Databricks connection, even while benefiting from the partnership.

How to Connect With Databricks Ventures

Warm introductions remain the highest-conversion path to a Databricks Ventures meeting. Trusted investors with data infrastructure focus, existing Databricks ecosystem participants, and industry experts who can vouch for both the team and the technical integration approach are the most effective referral sources.

The Built-On Databricks Startup Challenge offers a structured path to visibility for early-stage companies building on the platform. The competition awards $1M+ in prizes and provides direct access to Databricks engineers, product managers, and the ventures team. Winning or placing highly in the competition has led directly to investment discussions.

The Databricks AI Accelerator, launched in September 2025 with backing from Databricks Ventures, is another entry point for pre-seed and seed-stage AI startups. The program offers funding, credits, mentorship, and expert access. Companies accepted into the accelerator gain regular touchpoints with the ventures team and structured pathways to follow-on investment.

Cold outreach through the Databricks website is less effective but viable if you can clearly articulate your platform integration approach. Your pitch should lead with your specific Databricks integration (which APIs, which platform features) and show evidence of product-market fit in a data-centric workflow.

When preparing for a meeting, be ready to discuss your technical architecture in detail. The Databricks Ventures team includes people who can evaluate your data model, your pipeline design, and your integration approach at a deep technical level. Vague answers about 'eventually integrating with Databricks' will not advance the conversation.

Follow-up discipline matters. The fund evaluates many companies and decision timelines can extend beyond initial expectations. Brief, milestone-focused updates (new enterprise customer, new platform integration shipped, significant metric improvement) maintain momentum without being pushy.

Financial Preparation for Strategic Investors

Founders pursuing Databricks Ventures should prepare financials that reflect consumption-based revenue dynamics. Strategic investors will scrutinize your ARR composition, net revenue retention, and the degree to which your revenue is recurring versus variable based on data volume or query execution.
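The two metrics named above can be sketched in a few lines of Python. This is a minimal illustration, not a method attributed to Databricks Ventures: the function names, the split of ARR into "committed" and "usage-based" buckets, and every figure are hypothetical, and your own definitions should match how your contracts are actually structured.

```python
def net_revenue_retention(starting_arr, expansion, contraction, churn):
    """Net revenue retention for a fixed customer cohort over one period.

    All inputs are ARR amounts for customers who existed at the start of
    the period; revenue from newly acquired customers is excluded.
    """
    if starting_arr <= 0:
        raise ValueError("starting_arr must be positive")
    return (starting_arr + expansion - contraction - churn) / starting_arr


def recurring_share(committed_arr, usage_arr):
    """Fraction of ARR that is contractually committed rather than usage-based."""
    total = committed_arr + usage_arr
    if total <= 0:
        raise ValueError("total ARR must be positive")
    return committed_arr / total


# Hypothetical cohort: $1.0M starting ARR, $250K expansion,
# $50K downgrades, $80K lost to churn.
nrr = net_revenue_retention(1_000_000, 250_000, 50_000, 80_000)
share = recurring_share(committed_arr=800_000, usage_arr=200_000)
print(f"NRR: {nrr:.0%}, committed share of ARR: {share:.0%}")
```

Reporting both numbers side by side shows an investor how much of your retention depends on variable usage rather than committed contracts.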

Unit economics that demonstrate efficiency at scale carry significant weight. Databricks Ventures has visibility into how enterprise software companies in the data ecosystem perform, and your gross margin profile, LTV/CAC ratio, and path to profitability should be grounded in realistic benchmarks from comparable companies.
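As a rough sketch of the unit economics mentioned above, the following Python uses one common simplified LTV formula (annual gross profit per account divided by annual churn). The formula choice, function names, and all input figures are assumptions for illustration; other LTV definitions exist and may suit your business better.

```python
def lifetime_value(annual_revenue_per_account, gross_margin, annual_churn_rate):
    """Simplified LTV: annual gross profit per account / annual churn rate."""
    if not 0 < annual_churn_rate <= 1:
        raise ValueError("annual_churn_rate must be in (0, 1]")
    return annual_revenue_per_account * gross_margin / annual_churn_rate


def ltv_to_cac(ltv, cac):
    """Ratio of lifetime value to customer acquisition cost."""
    if cac <= 0:
        raise ValueError("cac must be positive")
    return ltv / cac


def cac_payback_months(cac, annual_revenue_per_account, gross_margin):
    """Months of gross profit needed to recover acquisition cost."""
    monthly_gross_profit = annual_revenue_per_account * gross_margin / 12
    return cac / monthly_gross_profit


# Hypothetical enterprise account: $120K per year, 75% gross margin,
# 10% annual churn, $300K fully loaded acquisition cost.
ltv = lifetime_value(120_000, 0.75, 0.10)
print(f"LTV/CAC: {ltv_to_cac(ltv, 300_000):.1f}")
print(f"CAC payback: {cac_payback_months(300_000, 120_000, 0.75):.0f} months")
```

Stating the churn and margin assumptions explicitly makes it easier to compare your figures against benchmarks from similar companies.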

Financial projections should reflect an honest view of enterprise sales cycles. Data infrastructure procurement typically involves longer evaluation periods than consumer applications, and your revenue model should account for 6-12 month enterprise sales cycles with appropriate staging assumptions.
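One way to build that sales-cycle lag into a projection is to shift each month's opened pipeline forward by the cycle length before counting it as closed ARR. The sketch below assumes a single fixed cycle length and a flat win rate, both hypothetical simplifications; real models usually stage deals by probability and vary cycle length by segment.

```python
def staged_new_arr(pipeline_by_month, win_rate, cycle_months):
    """Project closed ARR per month by lagging opened pipeline.

    pipeline_by_month[i] is the ARR value of opportunities opened in
    month i. Deals are assumed to close exactly cycle_months later at
    the given win rate; deals closing beyond the horizon are dropped.
    """
    horizon = len(pipeline_by_month)
    closed = [0.0] * horizon
    for month, opened in enumerate(pipeline_by_month):
        close_month = month + cycle_months
        if close_month < horizon:
            closed[close_month] += opened * win_rate
    return closed


# Hypothetical: $400K of pipeline opened every month for a year,
# 25% win rate, 9-month sales cycle.
closed = staged_new_arr([400_000] * 12, win_rate=0.25, cycle_months=9)
print(closed)
print(f"New ARR closed in year one: ${sum(closed):,.0f}")
```

With a 9-month cycle, only pipeline opened in the first three months closes within the year, which is why projections that ignore the lag tend to overstate first-year revenue.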

Understanding your compute and infrastructure costs as a percentage of revenue is especially important for data infrastructure companies. Investors will ask about your cost structure and want to see that you can scale without linear cost growth.
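A simple check for this is whether infrastructure cost as a share of revenue falls as revenue grows. The Python sketch below is illustrative only; the function names and the three periods of figures are hypothetical.

```python
def infra_cost_ratio(revenue, infra_cost):
    """Compute and infrastructure cost as a fraction of revenue."""
    if revenue <= 0:
        raise ValueError("revenue must be positive")
    return infra_cost / revenue


def scales_sublinearly(periods):
    """True if the cost ratio strictly declines across consecutive periods.

    periods is a chronological list of (revenue, infra_cost) tuples.
    """
    ratios = [infra_cost_ratio(rev, cost) for rev, cost in periods]
    return all(later < earlier for earlier, later in zip(ratios, ratios[1:]))


# Hypothetical annual figures: revenue doubles each year while
# infrastructure cost grows more slowly.
history = [(1_000_000, 350_000), (2_000_000, 600_000), (4_000_000, 1_000_000)]
for revenue, cost in history:
    print(f"revenue ${revenue:,}: infra cost {infra_cost_ratio(revenue, cost):.0%}")
print("Sub-linear cost growth:", scales_sublinearly(history))
```

If the ratio is flat or rising, expect questions about which architectural changes would bend the cost curve.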

Working with a fractional CFO experienced in data infrastructure fundraising can meaningfully improve your positioning. Databricks Ventures is looking for founders who understand the economics of the data ecosystem, not just the technology. Financial fluency paired with technical depth carries real weight in the room.

Pro Tip

When pitching Databricks Ventures, lead with your specific platform integration rather than your general AI story. Show them exactly which Databricks APIs you use, how Delta Sharing or MLflow is part of your product architecture, and what the shared customer experience looks like. The ventures team evaluates ecosystem fit before market size, and a clear, concrete integration story answers their primary question faster than any slide about TAM.

Frequently Asked Questions

What industries does Databricks Ventures focus on?

Databricks Ventures focuses exclusively on the data, analytics, and AI ecosystem. The fund invests in companies building data infrastructure, ML platforms, AI applications, data quality and governance tools, and system integrators implementing Databricks for enterprise customers. Geographic focus includes the US, Israel, and other major startup markets.

What stage companies does Databricks Ventures invest in?

Databricks Ventures invests across the spectrum from pre-seed through growth stage. The AI Fund targets pre-seed and seed companies, while the broader fund invests in seed through Series B. The firm generally participates rather than leads from Series A onward, so a lead investment is most likely at seed or, occasionally, early in a Series A.

What is Databricks Ventures' typical check size?

Databricks Ventures typically invests $1M to $10M per deal depending on stage and strategic importance. The firm prefers to participate rather than lead in Series A and later rounds. Separately, the Built-On Databricks Startup Challenge offers $1M+ in prizes for early-stage companies.

How do I apply to Databricks Ventures?

The most effective approach is a warm introduction from an existing Databricks ecosystem participant, a trusted investor with data infrastructure focus, or an industry expert. You can also apply through the Built-On Databricks Startup Challenge or the Databricks AI Accelerator program, both of which provide direct access to the ventures team.

What does Databricks Ventures look for in founders?

Databricks Ventures looks for founders with deep expertise in data infrastructure, ML operations, or AI applications, and a clear technical vision for how their product integrates with the Databricks platform. Prior experience building or operating large-scale data systems is valued, as is evidence of product-market fit within enterprise data workflows.

Does Databricks Ventures lead rounds or follow?

Databricks Ventures is most likely to lead or co-lead at seed and, occasionally, early in a Series A. From Series A onward it generally prefers to participate rather than lead. The firm's participation decisions are driven more by strategic ecosystem considerations than by valuation discipline.

How long does Databricks Ventures' due diligence process take?

The due diligence process varies based on deal complexity and the team's current volume, but the firm is known for making decisions relatively quickly for companies with clear strategic fit. Founders should expect several weeks from initial meeting to term sheet for straightforward opportunities, with more complex technical evaluations taking longer.

What should I prepare before meeting with Databricks Ventures?

Prepare a detailed technical architecture overview showing your specific Databricks platform integration, evidence of product-market fit with enterprise customers, a financial model reflecting consumption-based revenue dynamics, and a clear explanation of your competitive moat independent of the Databricks relationship. Be ready to discuss your data model, pipeline design, and MLflow compatibility in technical depth.
