To satisfy a read, the DataStax Enterprise (DSE) database must combine results from the active memtable and potentially multiple SSTables.
The database processes a read in several stages:
- Check the memtable
- Check the row cache (which holds frequently accessed rows), if enabled
- Check the Bloom filter
- Find the partition offset in the partition index in the chunk cache/memory
- If any required index data is not present in the cache, pull it from disk
- Read the data from the uncompressed chunk cache
- If any required data chunk is not present in the cache:
  - Locate the data on disk using the compression offset map
  - Fetch the data from the SSTable on disk into the chunk cache
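The stages above can be sketched in a few lines of Python. This is a minimal illustration, not DSE's implementation: the names `SSTable`, `read_partition`, and `merge` are hypothetical, the Bloom filter and partition index are replaced by a plain dictionary lookup, and fragments are assumed to be merged column by column with the latest write timestamp winning.

```python
def merge(fragments):
    """Merge column -> (timestamp, value) fragments; the newest timestamp wins (assumed rule)."""
    merged = {}
    for fragment in fragments:
        for column, (timestamp, value) in fragment.items():
            if column not in merged or timestamp > merged[column][0]:
                merged[column] = (timestamp, value)
    return merged


class SSTable:
    """Hypothetical stand-in for one on-disk SSTable."""

    def __init__(self, partitions):
        self.partitions = partitions  # partition key -> fragment, pretend this is on disk
        self.disk_reads = 0

    def might_contain(self, key):
        # Stand-in for the Bloom filter check; a real filter can return false positives.
        return key in self.partitions

    def read(self, key, chunk_cache):
        cache_key = (id(self), key)
        if cache_key not in chunk_cache:  # chunk cache miss: fetch from disk
            self.disk_reads += 1
            chunk_cache[cache_key] = self.partitions.get(key, {})
        return chunk_cache[cache_key]


def read_partition(key, memtable, sstables, chunk_cache, row_cache=None):
    fragments = [memtable.get(key, {})]              # check the memtable
    if row_cache is not None and key in row_cache:   # check the row cache, if enabled
        fragments.append(row_cache[key])
    else:
        from_sstables = merge(
            table.read(key, chunk_cache)             # chunk cache first, then disk
            for table in sstables
            if table.might_contain(key)              # Bloom filter rules SSTables out
        )
        if row_cache is not None:
            row_cache[key] = from_sstables           # save merged rows for later reads
        fragments.append(from_sstables)
    return merge(fragments)
```

For example, reading a key that exists in the memtable and in two of three SSTables touches only those two SSTables, and a second read of the same key is served from the chunk cache without further disk reads.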
Row cache:
- If the row cache is enabled, it keeps in memory a subset of the partition data stored on disk in the SSTables
- In DSE 5.0 and later, the row cache is stored entirely in off-heap memory, using an implementation that relieves garbage collection pressure in the Java Virtual Machine (JVM)
- The rows stored in the row cache are frequently accessed rows that are merged from the SSTables and saved to the row cache as they are accessed
When the desired partition data is not found in the row cache, the Bloom filter is checked.
Bloom filter:
- The DSE database checks the Bloom filter to discover which SSTables are likely to have the requested partition data. The filter can return false positives, but never false negatives
- If the Bloom filter does not rule out an SSTable, the DSE database checks the partition index
- The Bloom filter is stored in off-heap memory, and grows to approximately 1-2 gigabytes (GB) per billion partitions
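The behavior described above can be illustrated with a small Bloom filter. This is a sketch under stated assumptions: the class name `BloomFilter`, the SHA-256 based double hashing, and the chosen sizes are illustrative only and are not the hashing scheme or parameters that DSE uses. The helper `bits_per_partition` simply restates the sizing figure above, taking 1 GB as 10^9 bytes.

```python
import hashlib


class BloomFilter:
    """Illustrative Bloom filter; the hashing scheme is an assumption, not DSE's."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions from two 64-bit halves of one digest.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key):
        for pos in self._positions(key):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos // 8] & (1 << (pos % 8)) for pos in self._positions(key))


def bits_per_partition(filter_bytes, partitions):
    """Average number of filter bits available per partition key."""
    return filter_bytes * 8 / partitions
```

Every key that was added is always reported as possibly present, while a key that was never added is usually, but not always, ruled out. By the arithmetic in `bits_per_partition`, 1-2 GB per billion partitions works out to roughly 8 to 16 bits per partition.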
Partition index:
- The partition index maps partition keys to a row index and, if required, supports iteration from a partially specified partition position
- The partition index trie data structure uses unique byte-ordered partition key prefixes to point to one of the following:
  - A row index, for tables that have wide partitions
  - The data position in a file, for tables whose partitions include only a few rows or a single row
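To make the idea of unique key prefixes concrete, the following sketch builds a toy trie that stores only the shortest prefix distinguishing each key from its neighbors in byte order. The names `build_index` and `lookup`, the nested-dictionary layout, and the `("row_index", ...)` and `("data", ...)` targets are hypothetical; the sketch also assumes that no key is a prefix of another key. Because only prefixes are stored, a lookup returns a candidate position, and this sketch assumes the caller verifies the full key when it reads the data.

```python
def common_prefix_len(a, b):
    """Number of leading bytes that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def build_index(entries):
    """Build a trie (nested dicts) keyed on the shortest unique prefix of each key.

    entries maps a key (bytes) to a target such as ("row_index", offset)
    or ("data", position). Leaves are tuples; inner nodes are dicts.
    """
    root = {}
    keys = sorted(entries)  # byte order
    for i, key in enumerate(keys):
        neighbours = keys[max(i - 1, 0):i] + keys[i + 1:i + 2]
        shared = max((common_prefix_len(key, n) for n in neighbours), default=0)
        if shared >= len(key):
            raise ValueError("keys must be non-empty and prefix-free")
        prefix = key[:shared + 1]  # one byte past the longest shared prefix
        node = root
        for byte in prefix[:-1]:
            node = node.setdefault(byte, {})
        node[prefix[-1]] = entries[key]
    return root


def lookup(root, key):
    """Return the candidate target for key, or None if no stored prefix matches."""
    node = root
    for byte in key:
        child = node.get(byte)
        if child is None:
            return None
        if not isinstance(child, dict):
            return child  # reached a leaf; the rest of the key is not stored
        node = child
    return None
```

With the keys `b"apple"`, `b"apricot"`, and `b"banana"`, the trie stores only `app`, `apr`, and `b`. A lookup for `b"blueberry"` therefore also lands on the `banana` entry, which is why the full key has to be checked against the data that the index points to.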
Uncompressed data in chunk cache (memory):
- The chunk cache buffers data before it is compressed and written to an SSTable. On a read, the uncompressed data in the chunk cache is checked first. If the data is compressed and on disk, the process uses the compression offset map to locate it on disk and decompresses it into memory.
Compression offset map:
- The compression offset map stores pointers to the exact location on disk where the desired partition data will be found
- The map is stored in off-heap memory and is accessed by either the partition key cache or the partition index
- After the compression offset map identifies the disk location, the desired compressed partition data is fetched from the correct SSTable(s)
- The more data that is compressed, the greater the number of compressed blocks and the larger the compression offset map
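A compression offset map can be sketched as a list holding the starting offset of each compressed block. The example below is a simplified illustration, not DSE's file format: `write_compressed` and `read_at` are hypothetical names, `zlib` stands in for whichever compressor the table uses, and the 64 KB chunk length is an assumed value.

```python
import zlib

CHUNK_LENGTH = 64 * 1024  # assumed uncompressed chunk size, for illustration only


def write_compressed(data, chunk_length=CHUNK_LENGTH):
    """Compress data chunk by chunk; return the compressed bytes and the offset map."""
    offsets = []
    out = bytearray()
    for start in range(0, len(data), chunk_length):
        offsets.append(len(out))  # where this compressed chunk begins
        out += zlib.compress(data[start:start + chunk_length])
    return bytes(out), offsets


def read_at(compressed, offsets, position, length, chunk_length=CHUNK_LENGTH):
    """Read length bytes starting at an uncompressed position, using the offset map."""
    result = bytearray()
    while length > 0:
        index = position // chunk_length  # which chunk holds this position
        if index >= len(offsets):
            break  # position is past the end of the data
        start = offsets[index]
        end = offsets[index + 1] if index + 1 < len(offsets) else len(compressed)
        chunk = zlib.decompress(compressed[start:end])
        within = position % chunk_length
        piece = chunk[within:within + length]
        if not piece:
            break
        result += piece
        position += len(piece)
        length -= len(piece)
    return bytes(result)
```

Only the chunks that overlap the requested range are decompressed. The map holds one entry per chunk, so it grows with the amount of data written and shrinks as the chunk length increases.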