How Kafka Stores Messages Internally
Recently I was exploring RabbitMQ and Kafka for some POCs. I found that Kafka provides much higher throughput (the ability to process a large number of messages in a short amount of time) than RabbitMQ (not RabbitMQ Streams). So I started exploring how Kafka stores messages to achieve this. In this post we are going to discuss how commit-log-based message queues store data to provide high throughput.
A Little Correction
Most people think that Kafka is a queue, but that's wrong. At its very core, Kafka is a distributed commit log. It is not like a traditional queue data structure where you push and consume data from different ends.
So What is Commit Log?
A commit log is one of the simplest yet most powerful data storage concepts used in distributed systems. At its core, a commit log is an append-only sequence of records stored on disk. Every new piece of data is simply appended to the end of the log, and nothing is modified or removed from the middle.
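To make the idea concrete, here is a minimal sketch of an append-only log in Python. This is purely illustrative (an in-memory list, not Kafka's actual implementation): records can only be appended at the end and read back, never modified or removed from the middle.

```python
# A minimal append-only log sketch (illustrative, not Kafka's actual code).
class CommitLog:
    def __init__(self):
        self._records = []

    def append(self, record: bytes) -> int:
        """Append a record at the end; return the position it was assigned."""
        self._records.append(record)
        return len(self._records) - 1

    def read(self, position: int) -> bytes:
        return self._records[position]

log = CommitLog()
log.append(b"Message A")  # position 0
log.append(b"Message B")  # position 1
```

Notice there is no update or delete method at all: the append-only restriction is what makes the rest of the design (sequential writes, simple indexing) possible.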
Offsets
Before we talk about how messages are stored using store and index files, we need to understand one important concept in commit logs: offsets.
An offset is a unique identifier assigned to every message written to the log. Instead of identifying messages using IDs or keys, the system simply assigns an incrementing number to each message as it is appended.
For example, if a log starts empty, the first few messages written to it might look like this:
Offset 0 → Message A
Offset 1 → Message B
Offset 2 → Message C
Offset 3 → Message D

Where Does Kafka Store Its Data?
Kafka and other commit logs usually store all their data directly on disk. They do not use a database for this. Instead, Kafka relies on operating system optimisations for high throughput. That's why Kafka's throughput depends heavily on how fast your disk is.
How Messages Are Stored in a Commit Log
Now that we understand what a commit log is, let's look at how messages are actually stored on disk.
At a high level, a commit log stores data using two main components:
- The store file – where the actual message bytes are written.
- The index file – which maps offsets to positions in the store file.
These two structures allow the system to append messages sequentially while still supporting fast lookups.
The Store File
The store file is where the actual message payload is written. Messages are written sequentially to the end of the file in an append-only fashion. However, simply writing messages one after another would create a problem: when reading data back, how would the system know where one message ends and the next begins?
To solve this, each message is stored with a small header. Before writing the message payload, the system first writes the length of the message. So the data layout inside the store file looks like this:
| message length | message bytes |
| message length | message bytes |
| message length | message bytes |

When reading a message, the system first reads the length and then reads the corresponding number of bytes. This makes it possible to efficiently traverse the log.
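A sketch of this length-prefixed layout, using a 4-byte big-endian length header (the header size and the file name are assumptions for illustration, not Kafka's exact on-disk format):

```python
import struct

def append_message(f, payload: bytes) -> int:
    """Append one length-prefixed record; return the byte position where it starts."""
    f.seek(0, 2)                              # seek to end: the log is append-only
    position = f.tell()
    f.write(struct.pack(">I", len(payload)))  # write the message length header
    f.write(payload)                          # then the message bytes
    return position

def read_message(f, position: int) -> bytes:
    """Read one record given its starting byte position."""
    f.seek(position)
    (length,) = struct.unpack(">I", f.read(4))  # read the length first...
    return f.read(length)                       # ...then exactly that many bytes

with open("00000000.store", "w+b") as f:
    pos_a = append_message(f, b"Message A")
    pos_b = append_message(f, b"Message B")
    msg_a = read_message(f, pos_a)
```

Because every record carries its own length, a reader can also walk the whole file from the start, hopping from one record to the next without any separators.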
The Index File
While sequential writes are great for performance, reading messages directly from the store file would be slow if we had to scan from the beginning every time. This is where the index file comes in.
The index stores a mapping between:
- Offset → Position in the store file
Each index entry contains two pieces of information:
- The offset of the message within the segment
- The byte position where that message begins in the store file
Conceptually it looks like this:
| offset | position |
| offset | position |
| offset | position |

When a message is appended:
- The message is written to the store file.
- The system records the byte position where the message was written.
- An entry is added to the index mapping the offset to that position.
This allows the system to jump directly to the correct location in the store file when reading a message.
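The three append steps above can be sketched by pairing a store file with an index. A plain dict stands in for the index file here (Kafka persists it as a separate file on disk); the file name is an assumption for illustration:

```python
import struct

class Segment:
    def __init__(self, path: str):
        self.store = open(path, "w+b")
        self.index = {}        # maps offset -> byte position in the store file
        self.next_offset = 0

    def append(self, payload: bytes) -> int:
        self.store.seek(0, 2)
        position = self.store.tell()              # step 2: record the byte position
        self.store.write(struct.pack(">I", len(payload)))
        self.store.write(payload)                 # step 1: write to the store file
        offset = self.next_offset
        self.index[offset] = position             # step 3: add the index entry
        self.next_offset += 1
        return offset

    def read(self, offset: int) -> bytes:
        position = self.index[offset]             # jump directly, no scanning
        self.store.seek(position)
        (length,) = struct.unpack(">I", self.store.read(4))
        return self.store.read(length)

seg = Segment("00000000.store")
off = seg.append(b"hello")
```

The read path never scans the store file: one index lookup gives the exact byte position, then a single seek and two reads return the message.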
Segments
If a commit log stored all messages in a single growing file, the file would eventually become extremely large and difficult to manage. To solve this problem, commit logs divide the log into multiple smaller files called segments.
Each segment is responsible for storing a continuous range of offsets. A segment typically consists of two files: a store file and an index file. We talked about both of them earlier.
Example
Suppose a log starts with offset 0 and each segment can store 1000 messages.
Segment 0 (base offset: 0)
Offsets: 0 - 999
Segment 1 (base offset: 1000)
Offsets: 1000 - 1999
Segment 2 (base offset: 2000)
Offsets: 2000 - 2999

If a consumer wants to read message offset 1450, the system can quickly determine that this offset belongs to the segment with base offset 1000, and then use that segment's index file to find the exact position of the message in the store file.
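Finding the right segment is a binary search over the sorted base offsets: the segment that contains an offset is the one with the largest base offset less than or equal to it. A sketch using the example's fixed 1000-message segments (the fixed size is just for illustration; real segments roll over by size or time):

```python
import bisect

base_offsets = [0, 1000, 2000]   # sorted base offsets of the live segments

def segment_for(offset: int) -> int:
    """Return the base offset of the segment containing `offset`."""
    i = bisect.bisect_right(base_offsets, offset) - 1
    return base_offsets[i]

segment_for(1450)  # -> 1000: open segment 1's index, then its store file
```

This two-step lookup (find the segment, then use its index) is what keeps reads fast even when the log holds millions of messages across many files.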
Why This Design Is Fast
This storage model provides high throughput for several reasons. First, writes are sequential appends, which are extremely efficient for disks.
Second, the system separates data storage from indexing, allowing fast lookups without disrupting sequential writes.
Third, the index file can be memory mapped. Memory mapping lets you map a file into the process's address space so you can access it like an in-memory array, with the operating system paging data in and out as needed. This makes index lookups and updates very efficient.
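A sketch of memory-mapping an index file, assuming fixed 8-byte entries (a 4-byte relative offset plus a 4-byte position; the real Kafka index format differs in detail). Once mapped, entry n lives at bytes [n*8, n*8+8) and is read with ordinary array-style indexing, no read() system call per lookup:

```python
import mmap
import struct

with open("00000000.index", "w+b") as f:
    # Write two example entries: (relative offset, byte position in the store file).
    f.write(struct.pack(">II", 0, 0))
    f.write(struct.pack(">II", 1, 13))
    f.flush()

    mm = mmap.mmap(f.fileno(), 0)      # map the whole file into memory
    rel, pos = struct.unpack(">II", mm[8:16])  # entry 1, accessed like an array slice
    mm.close()
```

Fixed-size entries are what make this work: locating entry n is pure arithmetic, so a lookup inside a mapped index is just a memory access.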
Before You Go
If you made it this far, thank you.
I usually write about backend engineering, distributed systems, and things I learn while working on real problems. Not theory — mostly practical stuff that I wish someone had explained to me earlier.
I run a free newsletter where I share these kinds of write-ups. No spam. Just occasional backend engineering notes.