HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

gin index 'tid' array does not match table array

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
arraymatchgindoestidindexnottable

Problem

I created the following table and inserted some values into it as follows:

CREATE TABLE query_all_lexeme (
        payload text,
        normalized tsvector GENERATED ALWAYS AS (to_tsvector('english', payload)) STORED
    );
    
INSERT INTO query_all_lexeme (payload)
    VALUES ('fat cats ate rats');

INSERT INTO query_all_lexeme (payload)
    VALUES ('summarize the functions and operators that are provided for full text searching');

INSERT INTO query_all_lexeme (payload)
    VALUES ('Constructs a phrase query');

INSERT INTO query_all_lexeme (payload)
SELECT
    'Constructs a phrase query this is a test'
FROM
    generate_series(1, 10000);

INSERT INTO query_all_lexeme (payload)
SELECT
    'Constructs a phrase query this is a test'
FROM
    generate_series(1, 100);


Then I created a gin index:

CREATE INDEX query_all_lexeme_vector ON query_all_lexeme USING gin (normalized);


And then I run the query below to get gin index info:

SELECT * FROM
        gin_metapage_info (get_raw_page ('query_all_lexeme_vector', 0)) \gx


Result:

+-[ RECORD 1 ]-----+------------+
| pending_head     | 4294967295 |
| pending_tail     | 4294967295 |
| tail_free_size   | 0          |
| n_pending_pages  | 0          |
| n_pending_tuples | 0          |
| n_total_pages    | 14         |
| n_entry_pages    | 1          |
| n_data_pages     | 12         |
| n_entries        | 15         |
| version          | 2          |
+------------------+------------+


WITH cte AS (
        SELECT
            flags,
            p
        FROM
            generate_series(1, 13) AS p,
            gin_page_opaque_info (get_raw_page ('query_all_lexeme_vector', p)))
    SELECT
        array_agg(p)
    FROM
        cte
    WHERE
        flags::text = '{data,leaf,compressed}';


return

+----------------------+
|      array_agg       |
+----------------------+
| {3,4,6,7,9,10,12,13} |
+----------------------+


In the following query, I should expect at least one r

Solution

Your logic would only apply if one index entry were guaranteed to contain all the ctids for a given lexeme. There is no such guarantee, and can't be because index tuples have a strictly bounded size which is far less than enough to contain all possible ctids.

Maybe you could switch from array equality to overlaps or contains (&&, @>).

Context

StackExchange Database Administrators Q#313766, answer score: 2

Revisions (0)

No revisions yet.