Cassandra queries explained (partition, clustering, requirements) + FREE TRAINING!

One thing I’ve struggled with is query requirements, meaning which key columns a WHERE clause has to include. I never fully understood them until taking the free course at https://datastaxacademy.elogiclearning.com/

 

First, the partition key versus the clustering columns. Together they make up the primary key; the partition key is only its first part, not another name for the whole thing:

(partitioningkey, optional_clusteringkey1, optional_clusteringkey2)

The partition key can itself be made of several columns (i.e., a composite), which is marked by an inner pair of parentheses:

( ( partitioning_key1, partitioning_key2), optional_clusteringkey1, etc)

 

So you probably understand that if you wish to look up certain rows, your query must specify the partition key they live under.

 

So for a table with this key definition (colA, colB, colC), where colA is the partition key, you must query something like this:

select * from table where colA = 'something';

 

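This applies to composite partition keys as well! Consider this key definition: ( (colA, colX), colB, colC )

select * from table where colA = 'something' and colX = 'something';

You have to specify every column of the partition key, not just one of them!! Cassandra hashes all of them together to find the partition, so colA alone is not enough to locate the data.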
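
To make that concrete, here is a minimal sketch of a table with a composite partition key. The table name demo_composite and the val column are made up for illustration, and it assumes you have already picked a keyspace with USE.

```cql
-- Hypothetical table: (colA, colX) together form the partition key,
-- colB and colC are clustering columns. Unquoted names are case-insensitive.
CREATE TABLE IF NOT EXISTS demo_composite (
    colA text,
    colX text,
    colB text,
    colC text,
    val  text,
    PRIMARY KEY ((colA, colX), colB, colC)
);

-- Works: every partition key column is restricted with "=".
SELECT * FROM demo_composite WHERE colA = 'something' AND colX = 'something';

-- Not accepted as written, because colX is missing:
-- SELECT * FROM demo_composite WHERE colA = 'something';
```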

 

However, I never understood clustering columns correctly. Again using the key definition (colA, colB, colC):

If you specify all the key columns, of course it works fine:

select * from table where colA = 'something' and colB = 'something' and colC = 'something';

If you specify them in order and only leave off the trailing ones, it also works fine, because within a partition the rows are sorted on disk by colB and THEN colC, which allows the following to work:

select * from table where colA = 'something' and colB = 'something';
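
Because of that on-disk ordering, a range on the last clustering column you restrict should also work. Here is a small sketch of the idea; the table name demo_by_b_c, the val column, and the literal values are made up for illustration.

```cql
-- Hypothetical table with the key definition (colA, colB, colC).
CREATE TABLE IF NOT EXISTS demo_by_b_c (
    colA text,
    colB text,
    colC text,
    val  text,
    PRIMARY KEY (colA, colB, colC)
);

-- Partition key plus the first clustering column: one contiguous slice.
SELECT * FROM demo_by_b_c WHERE colA = 'something' AND colB = 'something';

-- Range on colB: still one contiguous slice of the partition.
SELECT * FROM demo_by_b_c WHERE colA = 'something' AND colB >= 'a' AND colB < 'n';

-- Equality on colB, then a range on colC inside that group.
SELECT * FROM demo_by_b_c
 WHERE colA = 'something' AND colB = 'something' AND colC > 'something';
```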

If you try to skip one of the clustering columns, it will not work, because Cassandra would have to dive into every value of the skipped column and search each one for the later value. This would be very expensive.

Depending on your Cassandra version, adding ALLOW FILTERING may make the query run anyway, but it shouldn’t be done even if that works. If you have to enable filtering, you are doing it wrong! You need to create a new table instead, with the data grouped the way you want to pull it.

For instance, the following will not work!

select * from table where colA = 'something' and colC = 'something';
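
A minimal sketch of the "new table" approach, assuming the same columns as before: put colC ahead of colB in the clustering order so the query restricts a proper prefix of the key. The table name demo_by_c and the val column are hypothetical, and your application would have to write each row to both tables itself.

```cql
-- Hypothetical second table: same data, but clustered by colC first.
CREATE TABLE IF NOT EXISTS demo_by_c (
    colA text,
    colC text,
    colB text,
    val  text,
    PRIMARY KEY (colA, colC, colB)
);

-- Now colC is the first clustering column, so no filtering is needed.
SELECT * FROM demo_by_c WHERE colA = 'something' AND colC = 'something';
```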

I’m really glad I got that behind me. It all felt like voodoo when I was playing around with cqlsh trying to learn Cassandra.