snippetsqlMinor
How to do this binary appearance analysis
Viewed 0 times
thisanalysisappearancebinaryhow
Problem
I have a table named
An example of
What I want to do is generate a binary appearance chart, where it shows a
All the possibilities are on columns, and all the
results:integer: id
json: websitesAn example of
websites column is:[{"url": "http://google.com"}, {"url": "http://example.com"}]What I want to do is generate a binary appearance chart, where it shows a
1 or a 0 depending on if ANY url in the JSON blob is contained within another set.All the possibilities are on columns, and all the
results are as rows.result_id | wikipedia.org | google.com | example.com
10 1 1 1
11 0 1 0
12 0 0 0
13 1 0 0- For it to return
1, ANY url in the JSON array object (with key->>{'url'}) must match the domain
- The URLs are full urls so ideally I would like to use regex to check if its a match (i can't do a full string match because I only care about the HOST of the URL). I'm thinking something like
/https?:\/\/(example\.com)(/|?|$)/
Solution
Assuming data type
Exact matches are simpler, since you can use the built-in
The cast to integer only to meet your desired form of
Pattern matching is more sophisticated, since there is no direct
db<>fiddle here
Replace
Related:
jsonb and current Postgres 10.Exact matches are simpler, since you can use the built-in
jsonb contains operator @>:SELECT id
, (websites @> '[{"url": "http://wikipedia.org"}]')::int AS "wikipedia.org"
, (websites @> '[{"url": "http://google.com"}]' )::int AS "google.com"
, (websites @> '[{"url": "http://example.com"}]' )::int AS "example.com"
FROM results
ORDER BY id;The cast to integer only to meet your desired form of
1/0. Or omit the cast and just use the boolean t/f instead.Pattern matching is more sophisticated, since there is no direct
jsonb operator:SELECT id
, count(*) FILTER (WHERE url ~ '^https?://wikipedia\.org\M') AS "wikipedia.org"
, count(*) FILTER (WHERE url ~ '^https?://google\.com\M') AS "google.com"
, count(*) FILTER (WHERE url ~ '^https?://example\.com\M') AS "example.com"
FROM results r
LEFT JOIN LATERAL (
SELECT elem->>'url' AS url
FROM jsonb_array_elements(r.websites) w(elem)
) w ON true
GROUP BY id
ORDER BY id;db<>fiddle here
Replace
count(*) FILTER ... with bool_or(url ~ '^https?://wikipedia\.org\M') etc. if URLs are never contained more than once and/or you are not interested in the actual count.Related:
- Search for nested values in jsonb array with greater operator
- Indexed range comparison of array key's value in row inside JSONB
- Return counts for multiple ranges in a single SELECT statement
- Collapsing a column to yes/no on query
- What is the difference between LATERAL and a subquery in PostgreSQL?
Code Snippets
SELECT id
, (websites @> '[{"url": "http://wikipedia.org"}]')::int AS "wikipedia.org"
, (websites @> '[{"url": "http://google.com"}]' )::int AS "google.com"
, (websites @> '[{"url": "http://example.com"}]' )::int AS "example.com"
FROM results
ORDER BY id;SELECT id
, count(*) FILTER (WHERE url ~ '^https?://wikipedia\.org\M') AS "wikipedia.org"
, count(*) FILTER (WHERE url ~ '^https?://google\.com\M') AS "google.com"
, count(*) FILTER (WHERE url ~ '^https?://example\.com\M') AS "example.com"
FROM results r
LEFT JOIN LATERAL (
SELECT elem->>'url' AS url
FROM jsonb_array_elements(r.websites) w(elem)
) w ON true
GROUP BY id
ORDER BY id;Context
StackExchange Database Administrators Q#205220, answer score: 4
Revisions (0)
No revisions yet.