HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

Postgres nested hstore select query

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
postgresqueryhstorenestedselect

Problem

I have a carts table with items hstore column. An example entry in this column is:

carts.items row#1 = {
                     "1614" => {:quantity=>"100", :price_cents=>1655},
                     "1938" => {:quantity=>"50", :price_cents=>1955},
                     "1983" => {:quantity=>"100", :price_cents=>2255},
                     "1322" => {:quantity=>"10", :price_cents=>4455},
                     "1691" => {:quantity=>"25", :price_cents=>1055},
                     "1734" => {:quantity=>"20", :price_cents=>1255}
                    }

carts.items row#2 = {"1614"=>{:quantity=>"50", :price_cents=>1655}}


So my carts table would look like this:

id   | items  
---------+-------
    1    |   {"1614"=>{:quantity=>"100", :price_cents=>1655}, "1938" => {:quantity=>"50", :price_cents=>1955},"1983"=>{:quantity=>"100", :price_cents=>2255},"1322"=>{:quantity=>"10", :price_cents=>4455},"1691"=>{:quantity=>"25", :price_cents=>1055},"1734"=>{:quantity=>"20", :price_cents=>1255}}
    2    |   {"1614"=>{:quantity=>"50", :price_cents=>1655}}


You will notice that there is one duplicate id (1614) in the hash, but its quantity is different.

I want to write a query that will return a table with the item id counts and the total quantity. It should look like this:

item_id | count | total 
---------+-------+------
   1614  |   2   |  150
   1938  |   1   |  50
   1983  |   1   |  50
   1322  |   1   |  100


Here is the query that I am working with:

SELECT  
skeys(carts.items) as item_ids,
COUNT(*) AS count,
svals(carts.items) as items
FROM carts
GROUP BY
skeys(carts.items),
svals(carts.items)


It returns:

item_id | count | total 
---------+-------+------
   1614  |   1   |  {:quantity=>100}
   1614  |   1   |  {:quantity=>50}
   1938  |   1   |  {:quantity=>50}
   1983  |   1   |  {:quantity=>50}
   1322  |   1   |  {:quantity=>100}


I aslo have tried:

```
SELECT key, count(*) FROM
(SELECT (each(items)).key FROM carts) AS stat
GROUP

Solution

Original question (dupes within hstore value)

I think the root cause of your problems is explained in this quote from the manual:

Each key in an hstore is unique. If you declare an hstore with
duplicate keys, only one will be stored in the hstore and there is no
guarantee as to which will be kept.

Bold emphasis mine.

You are mistaken in assuming that you could have the same key twice in a single hstore value.

I can also not reproduce your count of 2. I get a count of 1 for the key 1614. Start by not using count as column name. It is a reserved word (but allowed in Postgres).

I get (tested with Postgres 9.1.9):

WITH carts(items) AS (
   SELECT '"1614"=>{:quantity=>"100"}, "1938"=>{:quantity=>"50"}, "1614"=>{:quantity=>"50"}, "1983"=>{:quantity=>"100"}, "1322"=>{:quantity=>"10"}, "1691"=>{:quantity=>"25"}, "1614"=>{:quantity=>"77"}, "1734"=>{:quantity=>"20"}'::hstore
    )
SELECT key, count(*) AS ct
FROM  (SELECT (each(items)).key FROM carts) AS stat
GROUP  BY key
ORDER  BY ct DESC, key;


Result:
key | ct
------+----
1322 | 1
1614 | 1
1691 | 1
1734 | 1
1938 | 1
1983 | 1

Updated question (dupes across multiple rows)

To aggregate values you need a subquery (or CTE).

Simplified test case:

CREATE TEMP TABLE carts(c_id serial, items hstore); 
INSERT INTO carts(items) VALUES
  ('"1614"=>"100", "1938"=>"50", "1983"=>"100", "1322"=>"10", "1691"=>"25", "1734"=>"20"')
, ('"1614"=>"50"');


Query:

SELECT item_ids, count(*) AS ct, sum(items) AS sum_items
FROM  (
   SELECT (each(items)).key AS item_ids
        , (each(items)).value::int AS items  -- assuming values can be cast to int
   FROM   carts
   ) sub
GROUP  BY 1           -- ordinal reference is just notational convenience
ORDER  BY 2 DESC;


Result:
item_ids | ct | sum_items
----------+----+-----------
1614 | 2 | 150
1734 | 1 | 20
1691 | 1 | 25
1983 | 1 | 100
1938 | 1 | 50
1322 | 1 | 10

Code Snippets

WITH carts(items) AS (
   SELECT '"1614"=>{:quantity=>"100"}, "1938"=>{:quantity=>"50"}, "1614"=>{:quantity=>"50"}, "1983"=>{:quantity=>"100"}, "1322"=>{:quantity=>"10"}, "1691"=>{:quantity=>"25"}, "1614"=>{:quantity=>"77"}, "1734"=>{:quantity=>"20"}'::hstore
    )
SELECT key, count(*) AS ct
FROM  (SELECT (each(items)).key FROM carts) AS stat
GROUP  BY key
ORDER  BY ct DESC, key;
CREATE TEMP TABLE carts(c_id serial, items hstore); 
INSERT INTO carts(items) VALUES
  ('"1614"=>"100", "1938"=>"50", "1983"=>"100", "1322"=>"10", "1691"=>"25", "1734"=>"20"')
, ('"1614"=>"50"');
SELECT item_ids, count(*) AS ct, sum(items) AS sum_items
FROM  (
   SELECT (each(items)).key AS item_ids
        , (each(items)).value::int AS items  -- assuming values can be cast to int
   FROM   carts
   ) sub
GROUP  BY 1           -- ordinal reference is just notational convenience
ORDER  BY 2 DESC;

Context

StackExchange Database Administrators Q#48905, answer score: 3

Revisions (0)

No revisions yet.