Apache Kafka

Distributed event-streaming platform for real-time data pipelines.

Reviewed by 7wData

Publisher review

Apache Kafka is an open-source distributed event streaming platform created by LinkedIn engineers in 2011 and now maintained by the Apache Software Foundation. It provides durable, append-only logs that allow multiple consumers to replay events at different offsets, delivering messages with latencies as low as 2ms while scaling to handle trillions of messages daily across petabytes of data.

Kafka is the reference streaming platform in data engineering, with over 80% of Fortune 100 companies relying on it for mission-critical pipelines. In 2026, the platform continues to evolve: Kafka 4.0 introduced Queues (KIP-932), adding share groups as an alternative to consumer groups for queue-like semantics, and the roadmap includes two-phase commit support (KIP-939) to improve transactional guarantees.

The platform's value lies in its role as an "event backbone": a single source of truth for real-time data across an organization. Teams use it with Kafka Streams for in-app stream processing, Kafka Connect for integrating hundreds of data sources and destinations, and Schema Registry (via Confluent) for data governance. The community remains active, with 61 contributors making 2,450+ commits in 2025.

However, Kafka carries real trade-offs. Self-hosted clusters demand substantial operational overhead: production deployments typically require a minimum of 2.3 full-time equivalent operations staff to manage broker lifecycle, partitioning strategy, consumer lag monitoring, and incident response. Managed alternatives like Confluent Cloud can cost 8–24x more than self-hosted, with pricing stacking compute, storage, and egress fees on top of a base platform charge. Kafka Streams, the native stream processing library, excels at basic transformations but struggles with stateful operations (joins, aggregations); teams doing complex analytics often turn to Apache Flink instead.

The core decision most teams face is whether the operational complexity and cost of Kafka are justified by the business value of real-time, replay-capable event logs.

How it works

  1. Event log storage with replay

    Append-only logs allow consumers to read from any offset, enabling historical replay and multi-speed consumption.

  2. Kafka Streams

    Embedded stream processing library for stateless transformations and filters, plus aggregations, without separate cluster infrastructure.

  3. Kafka Connect

    Framework with 100+ pre-built connectors integrating databases, cloud services, and data warehouses as sources and sinks.

  4. Consumer groups with automatic rebalancing

    Coordinate parallel consumption of partitions across instances with built-in failover when members join or leave.

  5. Horizontal scalability

    Production clusters routinely handle trillions of messages daily across petabytes of data and thousands of partitions.

  6. Exactly-once semantics

    Transactional writes prevent message duplication during failures, though implementation adds complexity and overhead.
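
The first and fourth mechanisms above (an append-only log read by offset, and partitions shared out across a consumer group) can be illustrated with a small in-memory sketch. This is a toy model, not Kafka's implementation: the `Topic` class, the `assign_round_robin` helper, and the use of `crc32` for key hashing are all assumptions made for illustration (Kafka's default partitioner for keyed records uses murmur2).

```python
from collections import defaultdict
from zlib import crc32


class Topic:
    """Toy in-memory topic: one append-only list per partition (hypothetical)."""

    def __init__(self, num_partitions):
        self.partitions = [[] for _ in range(num_partitions)]

    def append(self, key, value):
        # Records with the same key always land in the same partition.
        # crc32 is a stand-in hash chosen for this sketch.
        partition = crc32(key.encode("utf-8")) % len(self.partitions)
        self.partitions[partition].append((key, value))
        offset = len(self.partitions[partition]) - 1
        return partition, offset

    def read(self, partition, offset):
        # Reading never removes records, so any consumer can replay
        # from any offset, at its own pace.
        return self.partitions[partition][offset:]


def assign_round_robin(num_partitions, members):
    """Spread partitions over the sorted members of one consumer group."""
    ordered = sorted(members)
    assignment = defaultdict(list)
    for partition in range(num_partitions):
        assignment[ordered[partition % len(ordered)]].append(partition)
    return dict(assignment)


topic = Topic(num_partitions=3)
partition, first_offset = topic.append("order-42", "created")
topic.append("order-42", "paid")

# Two readers at different offsets see different slices of the same log.
from_start = topic.read(partition, first_offset)
latest_only = topic.read(partition, first_offset + 1)

# When a member leaves, recomputing the assignment hands its partitions over.
before = assign_round_robin(3, ["consumer-a", "consumer-b"])
after = assign_round_robin(3, ["consumer-a"])
```

In this sketch a "rebalance" is simply a recomputation of the assignment when group membership changes; real consumer groups coordinate that through the brokers.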

Strengths and trade-offs

Strengths

  • Industry standard trusted by 80%+ of Fortune 100 for mission-critical pipelines.
  • Scales massively with 2ms latencies and proven throughput of trillions of messages daily.
  • Active open-source community (2,450+ commits in 2025; KIP-driven feature roadmap).

Trade-offs

  • Operational complexity requires deep expertise; self-hosted production typically needs 2.3+ FTE operations staff for broker management and incident response.
  • Managed services (Confluent Cloud) cost 8–24x more than self-hosted; pricing stacks usage fees on top of base platform charges.
  • Kafka Streams struggles with stateful operations (joins, aggregations); teams needing complex analytics often switch to Apache Flink.

Pricing context

Apache Kafka itself is free and open-source. Self-hosted clusters cost approximately $571–$8,820 monthly for 3–30 brokers (excludes operational FTE). Confluent Cloud operates on pay-as-you-go pricing (compute, storage, egress, platform fees); a 9-broker cluster runs roughly $3,649–$4,000+ monthly.

Amazon MSK is priced similarly to Confluent Cloud. A March 2026 cost comparison shows that self-hosting offers the lowest per-message cost but the highest operational burden, while Confluent Cloud simplifies management but increases total cost of ownership. Teams often choose self-hosted Kafka with tools like Strimzi to balance cost and control.
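
To put the quoted ranges on a common footing, the figures can be normalized to a rough per-broker monthly cost. The sketch below uses only the numbers stated above and assumes that the low end of each price range pairs with the low end of the broker range (and likewise for the high end); the `per_broker` helper is hypothetical, and staffing costs are excluded, as they are in the quoted figures.

```python
def per_broker(monthly_cost, brokers):
    """Rough monthly cost per broker, in dollars, rounded to cents."""
    return round(monthly_cost / brokers, 2)


# Self-hosted: $571 for 3 brokers, $8,820 for 30 brokers (assumed pairing).
self_hosted_small = per_broker(571, 3)     # 190.33
self_hosted_large = per_broker(8820, 30)   # 294.0

# Confluent Cloud: low end of the quoted range for a 9-broker cluster.
confluent_low = per_broker(3649, 9)        # 405.44
```

These are infrastructure-only figures; adding the operations staff a self-hosted cluster needs would change the comparison.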

Getting started with Apache Kafka

  1. Download and configure broker

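    Download Apache Kafka from kafka.apache.org, extract the archive, and verify that Java is installed. Kafka 4.0 and later run in KRaft mode only (ZooKeeper support was removed), so set node.id (which replaces the older broker.id), process.roles, log.dirs, and the listener ports in the broker's properties file. Format the storage directory with kafka-storage.sh, then start the broker using kafka-server-start.sh. This creates a local single-broker cluster for evaluation.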

  2. Connect a data source

    Choose a pre-built Kafka Connect connector matching your data source (database, API, cloud service, or data warehouse). Deploy the connector using connect-standalone.sh or connect-distributed.sh. Provide credentials and source configuration; the connector begins producing messages to a Kafka topic.

  3. Define topics and partitions

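    Create a topic using kafka-topics.sh, with --partitions to set how many partitions the topic has (the upper bound on how many consumers in one group can read in parallel) and --replication-factor to set how many brokers hold a copy of each partition. On a single-broker evaluation cluster the replication factor cannot exceed 1. Topics organize events by entity (customers, orders, transactions). Partitioning enables horizontal scaling across brokers; replication lets messages survive broker failures.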

  4. Produce and consume messages

    Use kafka-console-producer.sh to write test events to your topic. Then use kafka-console-consumer.sh to read them back, verifying messages flow end-to-end. Confirm consumers can read from different offsets to validate the append-only log and replay capability that define Kafka's value.

  5. Track lag and plan operations

    Track consumer lag (difference between log end offset and consumer offset) using kafka-consumer-groups.sh. Monitor broker disk usage and message throughput. Plan for operational overhead: production clusters typically need dedicated staff to manage broker lifecycle, partitioning changes, and incident response.
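
The lag definition in step 5 (log end offset minus consumer offset) reduces to a per-partition subtraction. The sketch below assumes the offsets have already been collected, for example from the output of kafka-consumer-groups.sh; the `consumer_lag` function and the sample numbers are hypothetical, and treating a partition with no committed offset as starting from 0 is a simplification.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: log end offset minus the group's committed offset."""
    lag = {}
    for partition, end_offset in log_end_offsets.items():
        # Assumption: a partition with no committed offset starts from 0.
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end_offset - committed, 0)
    return lag


# Hypothetical offsets for a two-partition topic.
lag = consumer_lag(
    log_end_offsets={0: 100, 1: 250},
    committed_offsets={0: 100, 1: 200},
)
total_lag = sum(lag.values())
```

A total that keeps growing over time suggests the group is consuming more slowly than producers are writing.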

Frequently Asked Questions

What is Apache Kafka?

Apache Kafka is an open-source distributed event streaming platform created by LinkedIn in 2011. It provides append-only logs allowing multiple consumers to replay events at different offsets, delivering messages with latencies as low as 2ms while scaling to handle trillions of messages daily.

How much does Apache Kafka cost?

Apache Kafka itself is free and open-source. Self-hosted clusters cost approximately $571–$8,820 monthly for 3–30 brokers, excluding operational staffing. Confluent Cloud operates on pay-as-you-go pricing with a 9-broker cluster running roughly $3,649–$4,000+ monthly. Managed services typically cost 8–24x more than self-hosted.

How many people does it take to run a Kafka cluster?

Production Kafka deployments typically require a minimum of 2.3 full-time equivalent operations staff to manage broker lifecycle, partitioning strategy, consumer lag monitoring, and incident response. This significant operational overhead is a key trade-off teams must consider against the cost savings of self-hosting.

What is Kafka Streams?

Kafka Streams is an embedded stream processing library for transformations, filters, and aggregations that runs inside your application, without requiring separate cluster infrastructure. Stateless operations are its strong suit; it struggles with heavier stateful operations such as joins and complex aggregations, and teams needing advanced analytics often switch to Apache Flink instead.
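
The difference between stateless and stateful processing can be sketched in plain Python. This is not the Kafka Streams API (which is a Java library); the function, class, and event fields below are hypothetical and only illustrate why stateful operations need a store that must be kept and recovered, while stateless ones do not.

```python
from collections import Counter


def stateless_pipeline(events):
    """Filter then map: each event is handled alone, with no memory."""
    for event in events:
        if event["amount"] > 0:
            yield {"customer": event["customer"], "cents": event["amount"] * 100}


class CountByCustomer:
    """Stateful aggregation: the result depends on every earlier event."""

    def __init__(self):
        # Stand-in for a state store; it would have to survive restarts.
        self.store = Counter()

    def process(self, event):
        self.store[event["customer"]] += 1
        return event["customer"], self.store[event["customer"]]


events = [
    {"customer": "alice", "amount": 5},
    {"customer": "bob", "amount": 0},
    {"customer": "alice", "amount": 7},
]

mapped = list(stateless_pipeline(events))
counter = CountByCustomer()
running_counts = [counter.process(event) for event in events]
```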

Does Kafka guarantee message delivery?

Kafka offers exactly-once semantics through idempotent producers and transactional writes, which prevent message duplication during failures. However, implementing this feature adds significant complexity and operational overhead. The 2026 roadmap includes KIP-939 for two-phase commit support to improve transactional guarantees further. In practice, most teams weigh stronger delivery guarantees against operational simplicity.
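
One ingredient of duplicate prevention is that the broker can recognize a retried write. Kafka's idempotent producer does this with a producer ID and per-partition sequence numbers; the sketch below is a heavily simplified, hypothetical model of that idea (the `PartitionLog` class is not a Kafka API) and does not cover transactions spanning multiple partitions.

```python
class PartitionLog:
    """Toy partition that drops retried writes using sequence numbers."""

    def __init__(self):
        self.records = []
        self.last_sequence = {}  # producer_id -> last accepted sequence

    def append(self, producer_id, sequence, value):
        last = self.last_sequence.get(producer_id, -1)
        if sequence <= last:
            # A retry of something already written: acknowledge, do not store.
            return False
        if sequence != last + 1:
            # A gap means an earlier write was lost; refuse to reorder.
            raise ValueError("out-of-order sequence number")
        self.records.append(value)
        self.last_sequence[producer_id] = sequence
        return True


log = PartitionLog()
first = log.append("producer-1", 0, "payment-received")
# The producer never saw the acknowledgement and sends the same write again.
retry = log.append("producer-1", 0, "payment-received")
second = log.append("producer-1", 1, "payment-settled")
```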

Why do 80% of Fortune 100 companies use Kafka?

Kafka serves as the event backbone for real-time data across organizations, enabling 80%+ of Fortune 100 companies to build mission-critical pipelines. Its proven ability to handle trillions of messages daily with 2ms latencies, combined with ecosystem tools like Kafka Connect and built-in features like consumer groups, makes it the industry standard.

Integrations

Confluent, AWS MSK, Flink, Spark

How Apache Kafka compares

Direct head-to-head against 3 competitors. Picked by 7wData.

This tool

Apache Kafka

Pricing: Free and open-source. Self-hosted clusters cost approximately $571–$8,820 monthly for 3–30 brokers (excludes operational FTE).
Target: Teams that need a real-time, replay-capable event backbone and can staff the operations work to run it.
Deployment: Self-hosted.
Strength: Industry standard trusted by 80%+ of Fortune 100 for mission-critical pipelines.
Watch for: Operational complexity requires deep expertise; self-hosted production typically needs 2.3+ FTE operations staff for broker management and incident response.

Confluent Cloud

Pricing: $0.014-$0.050/GB ingress/egress; Flink at $0.21/CFU-hour; production clusters $385-$3,000+/month; enterprise $10,000+/month.
Target: Enterprises running Kafka wanting managed brokers, Flink, and data governance without dedicated operations staff.
Deployment: SaaS; dedicated, BYOC, and serverless tiers on AWS, GCP, Azure.
Strength: Packages managed Kafka, managed Flink, and Stream Governance Advanced together in a single cloud service.
Watch for: Pricing stacks compute, storage, and egress fees on top of base platform charge; total cost runs 8-24x self-hosted Kafka.

Redpanda

Pricing: Free tier ($300 credit, 30 days); BYOC from ~$143/month plus cloud costs; Enterprise is contact sales.
Target: Engineering teams wanting Kafka API compatibility without JVM or ZooKeeper operations, prioritizing lower latency.
Deployment: Open-source; SaaS serverless (AWS only); BYOC; self-managed Enterprise on-prem.
Strength: Single-binary C++ architecture eliminates JVM and ZooKeeper, cutting latency and hardware footprint vs. standard Kafka.
Watch for: Connector ecosystem offers 300+ plugins vs. Kafka's 1,000+; enterprise governance requires more manual configuration than Confluent.

Amazon Kinesis Data Streams

Pricing: On-Demand Standard: $0.08/GB in, $0.04/GB out, $0.04/stream-hour; Advantage mode cuts rates 60% with 25 MB/s minimum.
Target: AWS-native teams needing managed event streaming without broker operations, especially those integrated with Lambda or Redshift.
Deployment: SaaS, AWS-only; no on-prem or multi-cloud deployment option.
Strength: Native integration with Lambda, S3, Redshift, and the full AWS service catalog, with no broker configuration required.
Watch for: No Kafka API compatibility; migrating to or from Kafka requires a full application rewrite, creating hard AWS vendor lock-in.

Sources

Reporting on this tool draws on these publicly available sources.

  1. kafka.apache.org — Kafka architecture, features, scalability, and current version (4.2); KIP roadmap for 2025–2026.
  2. axonops.com — Detailed cost comparison (March 2026) for self-hosted, Amazon MSK, and Confluent Cloud; staffing requirements (2.3 FTE minimum).
  3. www.tinybird.co — Main Kafka alternatives and trade-offs; insight that most teams seeking alternatives actually need analytics serving, not event streaming.
  4. www.getorchestra.io — Kafka operational challenges: resource consumption, message ordering constraints, security configuration complexity, exactly-once semantics overhead.
  5. www.conduktor.io — Kafka Streams vs Flink trade-offs; Streams excels at basic operations but struggles with stateful operations and complex analytics.
  6. developers.redhat.com — Community activity (2,450+ commits, 61 contributors in 2025); KIP-932 (Queues) and KIP-939 (2PC) roadmap items.