HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to pivot data in SQLite

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
pivotdatasqlitehow

Problem

I've a table which contains text data containing a bunch of translations for different language. Each translation is for a specific label.

I need to generate a pivot table so to get quickly what's missing.

An example of the records is

1, en, hello
1, fr, bonjour
1, es, hola
2, en, how are you
2, fr, 
3, es, come es stas


Although translations should be always there for all language, I'm not 100% sure this is the case. So missing fields have to be considered.

The desired outcome is this

|ID|EN|FR|ES|
|1|hello|bonjour|hola|
|2|how are you| |come es stas|


The challenge I have is that the column order may not always be the same when the database is populated, so in theory I should have a dynamic list of fields.

There is not a direct PIVOT function in SQLite, so I started experimenting with the group_concat obtaining a comma separated string.

SELECT DISTINCT language, group_concat(word, ',') OVER (PARTITION BY language) AS group_concat
FROM vocabulary;


I can ran past the results later on in Python if needed; the issue is that any missing value is not appending an empty item thus shifting all the concatenation by n, thus making this solution not valid.

I have attempted also to use the filter clause in the select predicate (though this mean hardcoding the columns), but I was not able to succeed.

Any idea on how this can be achieved?

Solution

I was able to arrive to a semi-satisfactory solution, though the only thing missing is a dynamic column selection.

SELECT id, 
       MAX(CASE WHEN "language" == 'it' THEN word END) as 'it',
       MAX(CASE WHEN "language" == 'en' THEN word END) as 'en',
       MAX(CASE WHEN "language" == 'ru' THEN word END) as 'ru'
FROM (select t.*,
             row_number() over (partition by language order by id) as seq
      from vocabulary t
     ) t
GROUP BY t.seq;


Regrettably SQLite doesn't support dynamic queries, so these have to be constructed via the callee, in my case a Python script.

Code Snippets

SELECT id, 
       MAX(CASE WHEN "language" == 'it' THEN word END) as 'it',
       MAX(CASE WHEN "language" == 'en' THEN word END) as 'en',
       MAX(CASE WHEN "language" == 'ru' THEN word END) as 'ru'
FROM (select t.*,
             row_number() over (partition by language order by id) as seq
      from vocabulary t
     ) t
GROUP BY t.seq;

Context

StackExchange Database Administrators Q#266614, answer score: 3

Revisions (0)

No revisions yet.