HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlModerate

Calculate row size and max row size for a table

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
sizemaxcalculateforandrowtable

Problem

Is there any way of calculating the number of bytes occupied by the table? I know that you can get some information from information_schema.tables but that information is not accurate enough.

What actually required is the number of bytes according to the definition of the table for InnoDB only and collation could also be considered as utf-8-general-ci.

For example, a table test is as following

create table test   (
   col1 varchar(25),
   col2 int,
   col3 varchar(3),
   col4 char(15),
   col5 datetime    
);


I would require to know the total row size that can be accumulated in one row, according to the types of columns in the table.

Found a some sort of similar solution for MS SQL Server, but need its MySQL version

Solution

Here are my two cents using SQL and querying the INFORMATION_SCHEMA.COLUMNS table, based on the MySQL documentation 'Data type storage requirements'.

Note: this is not be entirely accurate, as ENUM and SET column sizes depend on the number of enumeration values, respectively set members. Here I used the "worst-case" values (2 and 8 bytes respectively), this already gives a good hint about the row size. One might parse the INFORMATION_SCHEMA.COLUMNS.COLUMN_TYPE to get the number of enum values/set members for such columns and then compute the amount of bytes based on what the documentation says.

SELECT col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME, SUM(col_sizes.col_size) AS EST_MAX_ROW_SIZE
FROM (
    SELECT 
        cols.TABLE_SCHEMA, 
        cols.TABLE_NAME, 
        cols.COLUMN_NAME,
        CASE cols.DATA_TYPE
            WHEN 'tinyint' THEN 1
            WHEN 'smallint' THEN 2
            WHEN 'mediumint' THEN 3
            WHEN 'int' THEN 4
            WHEN 'bigint' THEN 8
            WHEN 'float' THEN IF(cols.NUMERIC_PRECISION > 24, 8, 4)
            WHEN 'double' THEN 8
            WHEN 'decimal' THEN ((cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE) DIV 9)*4  + (cols.NUMERIC_SCALE DIV 9)*4 + CEIL(MOD(cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE,9)/2) + CEIL(MOD(cols.NUMERIC_SCALE,9)/2)
            WHEN 'bit' THEN (cols.NUMERIC_PRECISION + 7) DIV 8
            WHEN 'year' THEN 1
            WHEN 'date' THEN 3
            WHEN 'time' THEN 3 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'datetime' THEN 5 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'timestamp' THEN 4 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'char' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'binary' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'varchar' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'varbinary' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'tinyblob' THEN 9
            WHEN 'tinytext' THEN 9
            WHEN 'blob' THEN 10
            WHEN 'text' THEN 10
            WHEN 'mediumblob' THEN 11
            WHEN 'mediumtext' THEN 11
            WHEN 'longblob' THEN 12
            WHEN 'longtext' THEN 12
            WHEN 'enum' THEN 2
            WHEN 'set' THEN 8
            ELSE 0
        END AS col_size
    FROM INFORMATION_SCHEMA.COLUMNS cols
) AS col_sizes
GROUP BY col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME

Code Snippets

SELECT col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME, SUM(col_sizes.col_size) AS EST_MAX_ROW_SIZE
FROM (
    SELECT 
        cols.TABLE_SCHEMA, 
        cols.TABLE_NAME, 
        cols.COLUMN_NAME,
        CASE cols.DATA_TYPE
            WHEN 'tinyint' THEN 1
            WHEN 'smallint' THEN 2
            WHEN 'mediumint' THEN 3
            WHEN 'int' THEN 4
            WHEN 'bigint' THEN 8
            WHEN 'float' THEN IF(cols.NUMERIC_PRECISION > 24, 8, 4)
            WHEN 'double' THEN 8
            WHEN 'decimal' THEN ((cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE) DIV 9)*4  + (cols.NUMERIC_SCALE DIV 9)*4 + CEIL(MOD(cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE,9)/2) + CEIL(MOD(cols.NUMERIC_SCALE,9)/2)
            WHEN 'bit' THEN (cols.NUMERIC_PRECISION + 7) DIV 8
            WHEN 'year' THEN 1
            WHEN 'date' THEN 3
            WHEN 'time' THEN 3 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'datetime' THEN 5 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'timestamp' THEN 4 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'char' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'binary' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'varchar' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'varbinary' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'tinyblob' THEN 9
            WHEN 'tinytext' THEN 9
            WHEN 'blob' THEN 10
            WHEN 'text' THEN 10
            WHEN 'mediumblob' THEN 11
            WHEN 'mediumtext' THEN 11
            WHEN 'longblob' THEN 12
            WHEN 'longtext' THEN 12
            WHEN 'enum' THEN 2
            WHEN 'set' THEN 8
            ELSE 0
        END AS col_size
    FROM INFORMATION_SCHEMA.COLUMNS cols
) AS col_sizes
GROUP BY col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME

Context

StackExchange Database Administrators Q#114471, answer score: 10

Revisions (0)

No revisions yet.