Apache Kafka is more than a message broker: with Kafka Streams, a client library that ships with Kafka, it also works as a stream processing engine. You can join, filter, and aggregate data in real time without an external database. This guide explores the DSL (KStream vs KTable) and stateful operations.
KStream vs KTable
Understanding this duality is core to Kafka Streams.
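- KStream: An unbounded stream of events in which every record is an independent fact (insert-only log). Example: credit card transactions.
- KTable: A snapshot of the latest value per key, in which a new record overwrites the previous one for that key (changelog). Example: user account balance.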
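To make the difference concrete, here is a minimal plain-Java sketch that reads the same three records both ways. It does not use the Kafka Streams API. The class name `StreamTableDuality`, the helper methods, and the sample balances are all hypothetical and serve only to illustrate the idea.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of the stream/table duality; no Kafka dependency. */
final class StreamTableDuality {

    /** Stream view: every record is kept, in arrival order. */
    static List<Map.Entry<String, Long>> asStream(List<Map.Entry<String, Long>> records) {
        return new ArrayList<>(records);
    }

    /** Table view: a later record for the same key overwrites the earlier one. */
    static Map<String, Long> asTable(List<Map.Entry<String, Long>> records) {
        Map<String, Long> latest = new LinkedHashMap<>();
        for (Map.Entry<String, Long> record : records) {
            latest.put(record.getKey(), record.getValue());
        }
        return latest;
    }

    public static void main(String[] args) {
        // Hypothetical (accountId, balance) records
        List<Map.Entry<String, Long>> records = List.of(
                Map.entry("alice", 100L),
                Map.entry("bob", 50L),
                Map.entry("alice", 120L));

        System.out.println("stream: " + asStream(records).size() + " records");
        System.out.println("table:  " + asTable(records));
    }
}
```

Read as a stream, the input holds three events. Read as a table, it holds two rows, and the row for alice carries only her latest balance.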
Joining Streams
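import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.*;

StreamsBuilder builder = new StreamsBuilder();
// Both topics must share the same key (assumed here to be the customer ID) for the join to match
KStream<String, Order> orders = builder.stream("orders", Consumed.with(Serdes.String(), new OrderSerde()));
KTable<String, Customer> customers = builder.table("customers", Consumed.with(Serdes.String(), new CustomerSerde()));
// Inner join: enrich each order with the current customer record for its key
KStream<String, EnrichedOrder> enriched = orders.join(customers,
    (order, customer) -> new EnrichedOrder(order, customer),
    Joined.with(Serdes.String(), new OrderSerde(), new CustomerSerde())
);
// Written with the default serdes unless a Produced is supplied
enriched.to("enriched-orders");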
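The semantics of this stream-table inner join can be sketched in plain Java. The model below is a simplified illustration, not how Kafka Streams is implemented. It replays records in one fixed order, whereas the real join works per partition and synchronizes on record timestamps. The class `StreamTableJoinModel`, the `Rec` type, and the sample values are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Simplified model of a KStream-KTable inner join; no Kafka dependency. */
final class StreamTableJoinModel {

    /** One input record; topic is either "customers" or "orders". */
    static final class Rec {
        final String topic;
        final String key;
        final String value;

        Rec(String topic, String key, String value) {
            this.topic = topic;
            this.key = key;
            this.value = value;
        }
    }

    /** Replays records in order and returns the enriched orders an inner join would emit. */
    static List<String> innerJoin(List<Rec> input) {
        Map<String, String> customers = new HashMap<>(); // table side: latest value per key
        List<String> enriched = new ArrayList<>();
        for (Rec r : input) {
            if (r.topic.equals("customers")) {
                customers.put(r.key, r.value); // table update: changes state, emits nothing
            } else {
                String customer = customers.get(r.key);
                if (customer != null) { // inner join: orders without a match are dropped
                    enriched.add(r.value + " for " + customer);
                }
            }
        }
        return enriched;
    }

    public static void main(String[] args) {
        List<Rec> input = List.of(
                new Rec("orders", "c1", "order-1"),     // no customer yet: dropped
                new Rec("customers", "c1", "Alice"),
                new Rec("orders", "c1", "order-2"),
                new Rec("customers", "c1", "Alice B."), // update: order-2 is not re-emitted
                new Rec("orders", "c1", "order-3"));
        System.out.println(innerJoin(input));
    }
}
```

Only records on the stream side trigger output, and each order sees whatever the table holds at that moment. If unmatched orders should be kept rather than dropped, `KStream#leftJoin` passes a null customer to the joiner instead.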
Key Takeaways
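- State is kept locally on each instance, in RocksDB by default, and backed up to changelog topics in Kafka for recovery.
- Scales horizontally by adding application instances (consumers); parallelism is bounded by the number of input partitions.
- Joins require co-partitioned inputs: both topics must use the same key, the same number of partitions, and the same partitioning strategy.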
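The co-partitioning requirement is easier to see with a small sketch. The code below assumes a partitioner of the form hash(key) mod partition count. It uses `String.hashCode()` as a stand-in hash, which is not what Kafka's default partitioner uses (that one hashes the serialized key bytes with murmur2). The class `CoPartitioningCheck`, its methods, and the toy keys are hypothetical.

```java
import java.util.List;

/** Toy model of key-based partitioning, used to show why partition counts must match. */
final class CoPartitioningCheck {

    /** Stand-in partitioner: hash(key) mod numPartitions. Not Kafka's actual hash. */
    static int partitionFor(String key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    /** True if every key lands on the same partition number in both topics. */
    static boolean coPartitioned(List<String> keys, int leftPartitions, int rightPartitions) {
        for (String key : keys) {
            if (partitionFor(key, leftPartitions) != partitionFor(key, rightPartitions)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> keys = List.of("a", "b", "c", "d"); // toy keys

        // Same partition count on both sides: every key lines up
        System.out.println("6 vs 6 partitions: " + coPartitioned(keys, 6, 6));

        // Different partition counts: "d" maps to partition 4 on one side and 0 on the other
        System.out.println("6 vs 4 partitions: " + coPartitioned(keys, 6, 4));
    }
}
```

Each task joins only the partitions with the same number from both topics. A key that maps to different partition numbers on the two sides would therefore never meet its counterpart.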