MongoDB protocol - Cosmos DB (Core, SQL)
Kafka protocol - Event Hubs (AMQP, HTTP)

Here’s the hierarchy in Kafka:


Cluster

  • A Kafka cluster is the whole deployment.
  • It’s made of multiple brokers (servers).
  • All brokers together form the cluster and share metadata about topics, partitions, leaders, replicas, etc.
  • Example: A production Kafka setup might have 10 brokers forming one cluster.

Topic

  • A topic is a logical category or stream of messages (like a table in a DB).
  • Producers write messages into a topic; consumers read from it.
  • Each topic is split into partitions for scaling.
  • Example: Topic = orders, where all order events are stored.

Partition

  • A partition is a single, ordered log inside a topic.
  • Each message in a partition has a unique offset (sequential ID).
  • Ordering is guaranteed only within a partition, not across all partitions in a topic.
  • Each partition has:
    • One leader replica (handles all writes and, by default, all reads).
    • Zero or more follower replicas (replicate from the leader for fault tolerance).
  • Example: Topic orders with 6 partitions means there are 6 independent logs, possibly spread across different brokers.
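
To make the offset and ordering rules concrete, here is a minimal in-memory sketch in Python. It is a toy model, not Kafka client code: the `Topic` class and its methods are hypothetical, and CRC32 stands in for the murmur2 hash that real Kafka clients use by default. The point is only that records with the same key land in the same partition and keep their order there.

```python
from __future__ import annotations

import zlib


class Topic:
    """Toy model of a topic: a fixed number of append-only partition logs."""

    def __init__(self, name: str, num_partitions: int) -> None:
        self.name = name
        self.partitions: list[list[tuple[str, str]]] = [[] for _ in range(num_partitions)]

    def partition_for(self, key: str) -> int:
        # Stand-in hash (assumption): real Kafka clients default to murmur2, not CRC32.
        return zlib.crc32(key.encode("utf-8")) % len(self.partitions)

    def append(self, key: str, value: str) -> tuple[int, int]:
        """Append a record and return (partition, offset)."""
        partition = self.partition_for(key)
        log = self.partitions[partition]
        log.append((key, value))
        # The offset is simply the record's position in that partition's log.
        return partition, len(log) - 1

    def read(self, partition: int, from_offset: int = 0) -> list[tuple[int, str, str]]:
        """Return (offset, key, value) records starting at from_offset."""
        log = self.partitions[partition]
        return [(off, k, v) for off, (k, v) in enumerate(log) if off >= from_offset]


orders = Topic("orders", num_partitions=6)
events = [
    ("customer-1", "created"),
    ("customer-2", "created"),
    ("customer-1", "paid"),
    ("customer-1", "shipped"),
]
for key, value in events:
    orders.append(key, value)

# All events for one key share a partition, so their relative order is preserved.
p = orders.partition_for("customer-1")
customer_1_events = [v for _, k, v in orders.read(p) if k == "customer-1"]
print(customer_1_events)  # ['created', 'paid', 'shipped']
```

Reading with a larger `from_offset` is the toy equivalent of a consumer resuming from a saved position, and reading from 0 is a full replay.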

Hierarchy

Cluster
 ├── Brokers (servers that host partition replicas)
 └── Topics (logical streams, e.g., orders, payments)
      └── Partitions (ordered logs, unit of scaling & replication, spread across brokers)

👉 In short:

  • Cluster = all brokers.
  • Topic = logical stream/category.
  • Partition = actual log slice where messages live.
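
A rough sketch of how one topic's partitions can end up spread across brokers. This is a simplified round-robin placement written for illustration only; the `assign_replicas` function is hypothetical and is not Kafka's actual assignment algorithm. The first broker in each list is treated as the leader here, and the rest are followers.

```python
from __future__ import annotations


def assign_replicas(
    num_partitions: int, brokers: list[int], replication_factor: int
) -> dict[int, list[int]]:
    """Map each partition to a list of broker IDs (first entry = leader in this sketch)."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    assignment: dict[int, list[int]] = {}
    for partition in range(num_partitions):
        # Shift the starting broker by one per partition so leaders rotate.
        assignment[partition] = [
            brokers[(partition + r) % len(brokers)] for r in range(replication_factor)
        ]
    return assignment


# Topic "orders": 6 partitions on a 3-broker cluster, 2 replicas each.
layout = assign_replicas(num_partitions=6, brokers=[1, 2, 3], replication_factor=2)
for partition, replicas in layout.items():
    leader, followers = replicas[0], replicas[1:]
    print(f"orders-{partition}: leader=broker {leader}, followers={followers}")
```

Each broker ends up leading two of the six partitions, which is why adding partitions (and brokers) is how a topic scales.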

Azure Event Hubs is the closest thing to Kafka-as-a-service on Azure.
It is a native Azure service rather than hosted Apache Kafka, but:

  • It’s conceptually very close to Apache Kafka:
    • Partitioned, high-throughput event ingestion
    • Producers push data in
    • Consumers read data out with offsets & checkpoints
    • Retention window (can replay old data within that window)
  • Microsoft even provides a Kafka-compatible endpoint in Event Hubs.
    👉 Meaning: You can point your Kafka producers/consumers at Event Hubs without running your own Kafka cluster.
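
As a sketch of what "pointing a Kafka client at Event Hubs" looks like, the helper below builds the client settings for the Kafka endpoint: port 9093, SASL_SSL with the PLAIN mechanism, the literal username `$ConnectionString`, and the namespace connection string as the password. The function name, the `my-namespace` value, and the connection string are placeholders (assumptions), and the keys are written in the librdkafka style used by clients such as confluent-kafka. No client library is imported here, so treat it as a config outline rather than a tested integration.

```python
from __future__ import annotations


def event_hubs_kafka_config(namespace: str, connection_string: str) -> dict[str, str]:
    """Build Kafka client settings for an Event Hubs namespace (hypothetical helper)."""
    return {
        # The Kafka endpoint of an Event Hubs namespace listens on port 9093.
        "bootstrap.servers": f"{namespace}.servicebus.windows.net:9093",
        "security.protocol": "SASL_SSL",
        "sasl.mechanism": "PLAIN",
        # The username is this literal string, not an account name.
        "sasl.username": "$ConnectionString",
        "sasl.password": connection_string,
    }


# Placeholder values: substitute your own namespace and connection string.
config = event_hubs_kafka_config(
    namespace="my-namespace",
    connection_string="Endpoint=sb://my-namespace.servicebus.windows.net/;SharedAccessKeyName=<name>;SharedAccessKey=<key>",
)
print(config["bootstrap.servers"])  # my-namespace.servicebus.windows.net:9093
```

With a librdkafka-based client, a dict like this would typically be passed to the producer or consumer constructor, and the Kafka topic name maps to the event hub name.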

So:

  • Event Hubs ≈ Azure’s Kafka (for telemetry + streaming pipelines)
  • Event Grid ≈ Azure’s Pub/Sub fabric (for lightweight event-driven workflows)

⚡ Example:

  • If you’re building an IoT telemetry pipeline, you’d use Event Hubs (like Kafka).
  • If you want to trigger a Function when a blob is created, you’d use Event Grid.
