Databricks

Reviewed by 7wData

Profile

Unified data and AI platform combining data engineering, analytics, and foundation model training on open formats.

Databricks was founded in 2013 by seven UC Berkeley AMPLab researchers (Ali Ghodsi, Andy Konwinski, Arsalan Tavakoli-Shiraji, Ion Stoica, Matei Zaharia, Patrick Wendell, and Reynold Xin) who built the original Apache Spark. The company commercializes Spark and has since built out a vertically integrated stack for enterprise data and AI. As of February 2026, Databricks reported a $5.4 billion annualized revenue run-rate, up 65 percent year-over-year, and served over 20,000 organizations globally.

The company disclosed 800+ customers each consuming above $1 million annually and 70+ above $10 million. More than 60 percent of Fortune 500 companies use Databricks. The platform unifies three traditional silos: data engineering (ELT pipelines), analytics (SQL), and AI (model training and inference).

Delta Lake, an open storage format, underpins the lakehouse and offers transactional reliability on object storage. Unity Catalog layers governance across data, models, and AI agents. Recent strategic moves reveal an aggressive push toward agentic AI systems.

In May 2025, Databricks acquired Neon, a serverless Postgres company, for approximately $1 billion. The following month, at its Data + AI Summit in June 2025, the company announced Lakebase, a fully managed Postgres database integrated with the lakehouse. Mosaic AI, which grew out of the 2023 acquisition of MosaicML for $1.3 billion, now drives a $1.4 billion AI revenue line.

The company also released MLflow 3.0 and Agent Bricks for building production AI agents. On the funding side, Databricks completed a Series L round in December 2025 at a $134 billion valuation, raising $4 billion in equity. In January 2026, JPMorgan led a $1.8 billion debt round, signaling IPO readiness.

CEO Ali Ghodsi has indicated a potential S-1 filing in the second half of 2026. If Databricks goes public at its current valuation, it would be the largest enterprise software IPO on record. Headcount stands at 7,000+, which reflects disciplined hiring relative to revenue growth.

Who buys this

  • Fortune 500 enterprises running analytics and predictive ML on multi-cloud infrastructure
  • AI labs and GenAI startups fine-tuning models on proprietary data (OpenAI, Adobe, Replit via Neon)
  • Mid-market data teams migrating from data warehouses to lakehouse for cost and governance
  • Financial services and insurance firms using feature engineering and real-time ML for risk scoring

Publicly disclosed clients

  • Salesforce
  • Shell
  • HP
  • Comcast
  • Rivian
  • Fox Sports
  • OpenAI (via Neon)
  • Adobe

Strengths and what to watch

Strengths

  • Deepest investment in open formats: owns Delta Lake, acquired Tabular (founded by the original Iceberg creators) in 2024, and now supports native Iceberg alongside Delta
  • 65% YoY revenue growth with positive free cash flow and 140%+ net dollar retention across 20,000+ customers, including 800+ paying $1M+ annually
  • Vertical bundling of AI infrastructure: lakehouse + Postgres (via Neon) + model training (Mosaic AI) + observability (MLflow) reduces customer switching cost

Watch for

  • IPO execution and timing risk: planned H2 2026 filing depends on equity market appetite; $1.8B debt raise could become a liability if IPO window closes
  • AI talent retention: Naveen Rao, AI Chief, departed as the company prepares for public markets, raising questions about AI product strategy continuity
  • Business user adoption plateau: complexity remains a barrier; most reported issues relate to administrative overhead and cluster management vs. simpler warehouse alternatives

Key Information

Industry: Data Lake / Lakehouse
Founded: 2013
Employees: 5,001-10,000
Headquarters: San Francisco, CA

Frequently Asked Questions

What is Databricks?

Databricks is a unified data and AI platform that combines data engineering, analytics, and foundation model training on open formats like Delta Lake. Founded in 2013 by UC Berkeley researchers who built Apache Spark, it serves over 20,000 organizations globally, including more than 60 percent of Fortune 500 companies.

How large is Databricks?

Databricks reported a $5.4 billion annualized revenue run-rate in February 2026, growing 65 percent year-over-year. The company serves over 20,000 organizations, with 800+ customers spending above $1 million annually and 70+ spending above $10 million. It closed a $4 billion Series L round at a $134 billion valuation in December 2025.

What is Delta Lake?

Delta Lake is Databricks's open storage format that underpins its lakehouse architecture. It offers transactional reliability on object storage, addressing key limitations of traditional data lakes. Databricks owns Delta Lake and in 2024 also acquired Tabular, the company founded by the original creators of Iceberg, so the platform now supports both open formats natively.

What is Lakebase?

Lakebase is Databricks's fully managed Postgres database integrated with its lakehouse platform, announced at the June 2025 Data + AI Summit. It follows the approximately $1 billion acquisition of Neon, a serverless Postgres company, in May 2025. Lakebase extends Databricks's vertical bundling strategy by combining structured database capabilities with lakehouse flexibility for enterprise users.

Does Databricks support AI model training?

Yes. Databricks acquired MosaicML for $1.3 billion in 2023, and the resulting Mosaic AI offering now drives a $1.4 billion AI revenue line. The company released MLflow 3.0 and Agent Bricks for building production AI agents. Its vertical stack now bundles lakehouse infrastructure, model training, and observability for enterprise AI workloads.

Is Databricks going public?

CEO Ali Ghodsi indicated a potential S-1 filing in the second half of 2026. In January 2026, JPMorgan led a $1.8 billion debt round, signaling IPO readiness. At its current $134 billion valuation, Databricks would represent the largest enterprise software IPO on record if it goes public as planned.

How Databricks compares

Direct head-to-head against 3 competitors. Picked by 7wData.

This company

Databricks

Positioning
Unified data and AI platform combining data engineering, analytics, and foundation model training on open formats.
Customer segments
Fortune 500 enterprises running analytics and predictive ML on multi-cloud infrastructure
Strengths
Deepest investment in open formats: owns Delta Lake, acquired Tabular (founded by the original Iceberg creators) in 2024, and now supports native Iceberg alongside Delta
Watch for
IPO execution and timing risk: planned H2 2026 filing depends on equity market appetite; $1.8B debt raise could become a liability if IPO window closes
Recent moves
Data + AI Summit 2025: Databricks announces Lakebase GA, Agent Bricks, MLflow 3.0 with 20,000+ attendees

Snowflake

Positioning
Warehouse-first AI Data Cloud competing on multi-cloud data sharing, SQL governance, and agentic AI workload expansion.
Customer segments
Fortune 2000 enterprises in financial services, healthcare, and retail needing governed multi-cloud analytics and SQL-first workloads.
Strengths
Zero-copy data sharing across organizations via Snowflake Marketplace, enabling cross-company data collaboration without data movement or duplication.
Watch for
Consumption-based pricing creates unpredictable bills at scale; customers report budgeting difficulty when query volumes spike beyond planned windows.
Recent moves
Acquired Observe, an AI-powered observability platform, for approximately $1 billion, announced January 2026.

Microsoft Fabric

Positioning
Unified SaaS analytics platform bundling data engineering, warehouse, real-time analytics, and Power BI under one Azure-native capacity license.
Customer segments
Azure-committed enterprises with existing Microsoft licenses seeking to consolidate data engineering and BI under one vendor contract.
Strengths
Single capacity-unit pricing covers data engineering, warehouse, BI, and real-time analytics together, reducing per-product licensing and procurement overhead.
Watch for
Fine-grained resource governance is immature; documented noisy-neighbor problems affect multi-workload deployments; alerting mechanisms remain underdeveloped as of 2025.
Recent moves
Acquired Osmos, an agentic data engineering startup, to add autonomous data preparation to Fabric, announced January 5, 2026.

Google BigQuery

Positioning
Serverless SQL analytics on Google Cloud, repositioned as a data-to-AI layer with embedded Gemini and native ML execution.
Customer segments
Google Cloud-native enterprises and data engineering teams running petabyte-scale serverless analytics without cluster provisioning or infrastructure management.
Strengths
Serverless execution model: no cluster provisioning, automatic scaling to petabytes, and pay-per-query pricing that removes the infrastructure overhead Databricks requires.
Watch for
Unpredictable query costs at scale; 37 percent of customers report the pricing model does not scale adequately, per 2025 review data.
Recent moves
Launched managed Iceberg tables GA and cross-cloud lakehouse extending analytics to AWS and Azure, announced at Google Cloud Next 2026.

Sources

  1. www.databricks.com — Product portfolio (Lakebase, Unity Catalog, Genie, Agent Bricks), customer base scale and industries served
  2. www.cnbc.com — Neon acquisition announcement, valuation, strategic rationale around AI agents
  3. www.databricks.com — February 2026 revenue metrics ($5.4B run-rate), YoY growth (65%), customer count and tiers, Series L details, debt financing
  4. www.databricks.com — June 2025 Data + AI Summit announcements including Lakebase, MLflow 3.0, Agent Bricks, keynote speakers
  5. techcrunch.com — Neon acquisition details, founding year, employee count, major Neon customers
  6. www.prnewswire.com — Data + AI Summit 2025 details and product announcements
  7. sacra.com — Revenue growth trajectory, ARR metrics, funding history, financial profile
  8. towardsdatascience.com — Competitive positioning relative to Snowflake, platform maturity concerns, market challenges