patternModerate
Cassandra - Query a column with collection type
Viewed 0 times
columnwithquerycollectiontypecassandra
Problem
I am pretty new to cassandra, so pardon me if this turns out to be a silly question.
I have a table structure as below
I wanted to know, whether I can execute a query which gives matching records from the
For instance, if I insert values in the table as below
Will I be able to fetch it as
Please Help.
I have a table structure as below
CREATE TABLE data_points (
id text PRIMARY KEY,
created_at timestamp,
previous_event_id varchar,
properties map
);I wanted to know, whether I can execute a query which gives matching records from the
map type fields.For instance, if I insert values in the table as below
INSERT INTO datapoints (id, properties) VALUES ('1', { 'fruit' : 'apple', 'band' : 'Beatles' });Will I be able to fetch it as
SELECT * from data_points WHERE properties.band='Beatles'; Please Help.
Solution
You can index collection types in cassandra 2.1 and later. You are after:
Detailed example:
Word of warning, secondary indexes don't scale out well as they use a scatter/gather algorithm to find what you need, if you plan to use them for heavy tagging it might be better to denormalize the
Further reading:
SELECT * FROM WHERE CONTAINS Detailed example:
cqlsh> USE ks;
cqlsh:ks> CREATE TABLE data_points (
id text PRIMARY KEY,
created_at timestamp,
previous_event_id varchar,
properties map
);
cqlsh:ks> create index on data_points (properties);
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('1', { 'fruit' : 'apple', 'band' : 'Beatles' });
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('2', { 'fruit' : 'cherry', 'band' : 'Beatles' });
cqlsh:ks> SELECT * FROM data_points WHERE properties CONTAINS 'Beatles';
id | created_at | previous_event_id | properties
----+------------+-------------------+----------------------------------------
2 | null | null | {'band': 'Beatles', 'fruit': 'cherry'}
1 | null | null | {'band': 'Beatles', 'fruit': 'apple'}
(2 rows)Word of warning, secondary indexes don't scale out well as they use a scatter/gather algorithm to find what you need, if you plan to use them for heavy tagging it might be better to denormalize the
properties field int a separate table and carry out multiple queries.Further reading:
- Is it possible to use cql to query collections in a row?
Code Snippets
cqlsh> USE ks;
cqlsh:ks> CREATE TABLE data_points (
id text PRIMARY KEY,
created_at timestamp,
previous_event_id varchar,
properties map<text,text>
);
cqlsh:ks> create index on data_points (properties);
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('1', { 'fruit' : 'apple', 'band' : 'Beatles' });
cqlsh:ks> INSERT INTO data_points (id, properties) VALUES ('2', { 'fruit' : 'cherry', 'band' : 'Beatles' });
cqlsh:ks> SELECT * FROM data_points WHERE properties CONTAINS 'Beatles';
id | created_at | previous_event_id | properties
----+------------+-------------------+----------------------------------------
2 | null | null | {'band': 'Beatles', 'fruit': 'cherry'}
1 | null | null | {'band': 'Beatles', 'fruit': 'apple'}
(2 rows)Context
StackExchange Database Administrators Q#55165, answer score: 17
Revisions (0)
No revisions yet.