snippetsqlMajor
How do I decompose ctid into page and row numbers?
Viewed 0 times
decomposeintonumberspagehowandrowctid
Problem
Each row in a table has a system column
ctid | id
:---- | -:
(0,1) | 1
(0,2) | 2
dbfiddle here
What's the best way of getting just the page number as from the
The only way I can think of is very ugly.
ctid of type tid that represents the physical location of the row:create table t(id serial);
insert into t default values;
insert into t default values;select ctid
, id
from t;ctid | id
:---- | -:
(0,1) | 1
(0,2) | 2
dbfiddle here
What's the best way of getting just the page number as from the
ctid in the most appropriate type (eg integer, bigint or numeric(1000,0))?The only way I can think of is very ugly.
Solution
SELECT (ctid::text::point)[0]::bigint AS block_number FROM t;db<>fiddle here
@bma suggested something similar in his comment. Here is a ...
Rationale for the type
ctid is of type tid (tuple identifier), called ItemPointer in the C-language source code. The manual:This is the data type of the system column
ctid. A tuple ID is apair (block number, tuple index within block) that identifies the
physical location of the row within its table.
Bold emphasis mine. And:
(
ItemPointer, also known as CTID)A block is 8 KB in standard installations. Maximum Table Size is 32 TB. It follows logically that block numbers must accommodate at least a maximum of (fixed per @Daniel's comment):
SELECT (2^45 / 2^13)::int -- = 2^32 = 4294967294Which would fit into an unsigned
integer. On further investigation I found in the source code that ...blocks are numbered sequentially, 0 to 0xFFFFFFFE.
Bold emphasis mine. Which confirms the first calculation:
SELECT 'xFFFFFFFE'::bit(32)::int8 -- max page number: 4294967294Postgres uses signed integer and is therefore one bit short. I couldn't pin down, yet, whether the text representation is shifted to accommodate signed integer. Until somebody can clear this up, I would fall back to
bigint, which works in any case.Cast
There is no registered cast for the
tid type in Postgres 9.3 (still true in Postgres 13):SELECT *
FROM pg_cast
WHERE castsource = 'tid'::regtype
OR casttarget = 'tid'::regtype;
castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
(0 rows)You can still cast to
text. There is a text representation for every type in Postgres:Another important exception is that "automatic I/O conversion casts",
those performed using a data type's own I/O functions to convert to or
from text or other string types, are not explicitly represented in
pg_cast.The text representation matches that of a point, which consists of two
float8 numbers, that cast is lossless.You can access the first number of a point with index 0. Cast to
bigint. Voilá.Performance
I ran a quick test in Postgres 9.4 on a table with 30k rows (best of 5) with a couple of expressions that came to mind, including your original:
SELECT (ctid::text::point)[0]::int -- 25 ms
, right(split_part(ctid::text, ',', 1), -1)::int -- 28 ms
, ltrim(split_part(ctid::text, ',', 1), '(')::int -- 29 ms
, (ctid::text::t_tid).page_number -- 31 ms
, (translate(ctid::text,'()', '{}')::int[])[1] -- 45 ms
, (replace(replace(ctid::text,'(','{'),')','}')::int[])[1] -- 51 ms
, substring(right(ctid::text, -1), '^\d+')::int -- 52 ms
, substring(ctid::text, '^\((\d+),')::int -- 143 ms
FROM tbl;int instead of bigint, mostly irrelevant for the purpose of the test. I eventually repeated the test in Postgres 13 with bigint on a table with 50k rows. Results are largely the same!The cast to
t_tid builds on a user-defined composite type, like @Jake commented.The gist of it: Casting tends to be faster than string manipulation. Regular expressions are expensive. The above solution is shortest and fastest.
Code Snippets
SELECT (ctid::text::point)[0]::bigint AS block_number FROM t;SELECT (2^45 / 2^13)::int -- = 2^32 = 4294967294SELECT 'xFFFFFFFE'::bit(32)::int8 -- max page number: 4294967294SELECT *
FROM pg_cast
WHERE castsource = 'tid'::regtype
OR casttarget = 'tid'::regtype;
castsource | casttarget | castfunc | castcontext | castmethod
------------+------------+----------+-------------+------------
(0 rows)SELECT (ctid::text::point)[0]::int -- 25 ms
, right(split_part(ctid::text, ',', 1), -1)::int -- 28 ms
, ltrim(split_part(ctid::text, ',', 1), '(')::int -- 29 ms
, (ctid::text::t_tid).page_number -- 31 ms
, (translate(ctid::text,'()', '{}')::int[])[1] -- 45 ms
, (replace(replace(ctid::text,'(','{'),')','}')::int[])[1] -- 51 ms
, substring(right(ctid::text, -1), '^\d+')::int -- 52 ms
, substring(ctid::text, '^\((\d+),')::int -- 143 ms
FROM tbl;Context
StackExchange Database Administrators Q#65964, answer score: 33
Revisions (0)
No revisions yet.