
How are indexes stored and updated in Cassandra?

Secondary indexes filter tables for data stored in non-primary key columns. They are used to query a table using a column that cannot normally be queried.

A secondary index can be built on a column in a table. The index is stored locally on each node in a hidden table and is built by a background process. If a query includes both a partition key condition and a condition on the indexed column, it runs efficiently because it can be directed to the single partition that holds the data, on the node that owns that partition.

If a secondary index is used in a query that is not restricted to a particular partition key, every node must be queried, so read latency can become prohibitive. A query with these parameters is allowed only if the query option ALLOW FILTERING is used. This option is not appropriate for production environments, and it does not guarantee trouble-free indexing.
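The difference between the two query shapes described above can be illustrated with a toy model. This is a minimal sketch in Python, not Cassandra code: the `Node` and `Cluster` classes, the byte-sum placement function, and the `city` column are all hypothetical and exist only to show that a partition-restricted lookup touches one node's local index, while an unrestricted lookup has to consult every node.

```python
from collections import defaultdict


class Node:
    """Toy node: holds base rows plus a local index over one column."""

    def __init__(self, name):
        self.name = name
        self.rows = {}                 # partition key -> row dict
        self.index = defaultdict(set)  # indexed value -> partition keys on this node

    def insert(self, pk, row, indexed_col):
        self.rows[pk] = row
        self.index[row[indexed_col]].add(pk)


class Cluster:
    """Toy cluster: each partition key lives on exactly one node (no replication)."""

    def __init__(self, node_names, indexed_col):
        self.nodes = [Node(n) for n in node_names]
        self.indexed_col = indexed_col

    def owner(self, pk):
        # Hypothetical placement rule (byte sum modulo node count);
        # a real cluster uses a partitioner and token ranges instead.
        return self.nodes[sum(pk.encode()) % len(self.nodes)]

    def insert(self, pk, row):
        self.owner(pk).insert(pk, row, self.indexed_col)

    def query(self, value, pk=None):
        """Return (matching rows, names of the nodes that were contacted)."""
        # With a partition key, only the owning node is asked;
        # without one, every node's local index must be checked.
        targets = [self.owner(pk)] if pk is not None else self.nodes
        hits = []
        for node in targets:
            for key in node.index.get(value, ()):
                if pk is None or key == pk:
                    hits.append(node.rows[key])
        return hits, [n.name for n in targets]


cluster = Cluster(["n1", "n2", "n3"], indexed_col="city")
cluster.insert("alice", {"name": "alice", "city": "Paris"})
cluster.insert("bob", {"name": "bob", "city": "Paris"})

_, contacted = cluster.query("Paris", pk="alice")
print(len(contacted))  # 1: only the node that owns the partition

_, contacted = cluster.query("Paris")
print(len(contacted))  # 3: every node is consulted
```

In this model the cost of the unrestricted query grows with the number of nodes, which is the behavior the paragraph above warns about.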

When a column is updated, the index is updated as well.

If the old column value still exists in the memtable, which typically occurs when updating a small set of rows repeatedly, DataStax Enterprise (DSE) removes the corresponding obsolete index entry; otherwise, the old entry remains to be purged by compaction. If a read sees a stale index entry before compaction purges it, the reader thread invalidates it.
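The update behavior described above can be sketched as a small simulation. This is a simplified model under stated assumptions, not the actual DSE implementation: the `IndexedTable` class, its single `sstable` dictionary, and the method names are hypothetical. It shows the three paths from the text: immediate removal when the old value is still in the memtable, a stale entry left behind otherwise, and cleanup by either a read or compaction.

```python
class IndexedTable:
    """Toy table with one indexed column, a memtable, and flushed data."""

    def __init__(self):
        self.memtable = {}  # pk -> value, recent writes still in memory
        self.sstable = {}   # pk -> value, data already flushed to disk
        self.index = set()  # (indexed value, pk) index entries

    def current(self, pk):
        # The memtable holds the newest value when the key is present there.
        return self.memtable.get(pk, self.sstable.get(pk))

    def write(self, pk, value):
        if pk in self.memtable:
            # Old value is at hand: drop its index entry immediately.
            self.index.discard((self.memtable[pk], pk))
        # Otherwise any old index entry is left behind as stale.
        self.memtable[pk] = value
        self.index.add((value, pk))

    def flush(self):
        self.sstable.update(self.memtable)
        self.memtable.clear()

    def read_by_index(self, value):
        matches = []
        for entry in [e for e in self.index if e[0] == value]:
            pk = entry[1]
            if self.current(pk) == value:
                matches.append(pk)
            else:
                # The reader found a stale entry and invalidates it.
                self.index.discard(entry)
        return sorted(matches)

    def compact(self):
        # Compaction purges every index entry that no longer matches the data.
        self.index = {(v, pk) for (v, pk) in self.index if self.current(pk) == v}


table = IndexedTable()
table.write("u1", "red")
table.write("u1", "blue")    # "red" entry removed at once (still in memtable)
table.flush()
table.write("u1", "green")   # "blue" entry stays behind as stale
print(table.read_by_index("blue"))  # [] and the stale entry is invalidated
```

Repeated updates to the same row before a flush leave no stale entries in this model, which matches the "small set of rows updated repeatedly" case mentioned in the text.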

  • Secondary indexes can impact performance greatly.
  • The index table is stored on each node in a cluster, so a query involving a secondary index can rapidly become a performance problem if multiple nodes are accessed.
  • DSE Search can provide a more robust solution for indexing data.

Do not add a secondary index and a search index to the same table.

