snippetsqlCritical
How to estimate/predict data size and index size of a table in MySQL
Viewed 0 times
estimatesizepredictmysqlhowandindexdatatable
Problem
I am finding what is the best way to estimate size of a table for that I have studied lot of blogs and forums but unable to find any accurate answer
For example, we have a table City with InnoDB engine,lets say in future (in next 1 year) it will have 1 million of records so what will be the estimated data size and index size of that table in this period.
UPDATE
What will be the estimated upper bound (Max size of table) with 1 million records and how can we estimate it.
For example, we have a table City with InnoDB engine,lets say in future (in next 1 year) it will have 1 million of records so what will be the estimated data size and index size of that table in this period.
mysql> desc City;
+-------------+----------+------+-----+---------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-------------+----------+------+-----+---------+----------------+
| ID | int(11) | NO | PRI | NULL | auto_increment |
| Name | char(35) | NO | | | |
| CountryCode | char(3) | NO | MUL | | |
| District | char(20) | NO | | | |
| Population | int(11) | NO | | 0 | |
+-------------+----------+------+-----+---------+----------------+
5 rows in set (0.03 sec)UPDATE
What will be the estimated upper bound (Max size of table) with 1 million records and how can we estimate it.
Solution
Given the table description, I see
For a million rows, that would 77,000,000 bytes (73.43 MB)
As for measuring the table, for a given table mydb.mytable, you can run this query
To measure all tables grouped by Database and Storage Engine
Run these queries and you can track changes in database/engine disk usage.
Give it a Try !!!
- 66 bytes per row of data
- 4 bytes per row for the primary key
- 7 bytes per row for country code index
- 3 bytes for the country
- 4 bytes for Clustered Key attached to the country code
- Total of 77 bytes of data and keys
- This does not factoring housekeeping for BTREEs or Tablespace Fragmentation
For a million rows, that would 77,000,000 bytes (73.43 MB)
As for measuring the table, for a given table mydb.mytable, you can run this query
SELECT
CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',SUBSTR(units,pw1*2+1,2)) DATSIZE,
CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',SUBSTR(units,pw2*2+1,2)) NDXSIZE,
CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',SUBSTR(units,pw3*2+1,2)) TBLSIZE
FROM
(
SELECT DAT,NDX,TBL,IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
FROM
(
SELECT data_length DAT,index_length NDX,data_length+index_length TBL,
FLOOR(LOG(IF(data_length=0,1,data_length))/LOG(1024)) px,
FLOOR(LOG(IF(index_length=0,1,index_length))/LOG(1024)) py,
FLOOR(LOG(data_length+index_length)/LOG(1024)) pz
FROM information_schema.tables
WHERE table_schema='mydb'
AND table_name='mytable'
) AA
) A,(SELECT 'B KBMBGBTB' units) B;To measure all tables grouped by Database and Storage Engine
SELECT
IF(ISNULL(DB)+ISNULL(ENGINE)=2,'Database Total',
CONCAT(DB,' ',IFNULL(ENGINE,'Total'))) "Reported Statistic",
LPAD(CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',
SUBSTR(units,pw1*2+1,2)),17,' ') "Data Size",
LPAD(CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',
SUBSTR(units,pw2*2+1,2)),17,' ') "Index Size",
LPAD(CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',
SUBSTR(units,pw3*2+1,2)),17,' ') "Total Size"
FROM
(
SELECT DB,ENGINE,DAT,NDX,TBL,
IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
FROM
(SELECT *,
FLOOR(LOG(IF(DAT=0,1,DAT))/LOG(1024)) px,
FLOOR(LOG(IF(NDX=0,1,NDX))/LOG(1024)) py,
FLOOR(LOG(IF(TBL=0,1,TBL))/LOG(1024)) pz
FROM
(SELECT
DB,ENGINE,
SUM(data_length) DAT,
SUM(index_length) NDX,
SUM(data_length+index_length) TBL
FROM
(
SELECT table_schema DB,ENGINE,data_length,index_length FROM
information_schema.tables WHERE table_schema NOT IN
('information_schema','performance_schema','mysql')
AND ENGINE IS NOT NULL
) AAA GROUP BY DB,ENGINE WITH ROLLUP
) AAA) AA) A,(SELECT ' BKBMBGBTB' units) B;Run these queries and you can track changes in database/engine disk usage.
Give it a Try !!!
Code Snippets
SELECT
CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',SUBSTR(units,pw1*2+1,2)) DATSIZE,
CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',SUBSTR(units,pw2*2+1,2)) NDXSIZE,
CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',SUBSTR(units,pw3*2+1,2)) TBLSIZE
FROM
(
SELECT DAT,NDX,TBL,IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
FROM
(
SELECT data_length DAT,index_length NDX,data_length+index_length TBL,
FLOOR(LOG(IF(data_length=0,1,data_length))/LOG(1024)) px,
FLOOR(LOG(IF(index_length=0,1,index_length))/LOG(1024)) py,
FLOOR(LOG(data_length+index_length)/LOG(1024)) pz
FROM information_schema.tables
WHERE table_schema='mydb'
AND table_name='mytable'
) AA
) A,(SELECT 'B KBMBGBTB' units) B;SELECT
IF(ISNULL(DB)+ISNULL(ENGINE)=2,'Database Total',
CONCAT(DB,' ',IFNULL(ENGINE,'Total'))) "Reported Statistic",
LPAD(CONCAT(FORMAT(DAT/POWER(1024,pw1),2),' ',
SUBSTR(units,pw1*2+1,2)),17,' ') "Data Size",
LPAD(CONCAT(FORMAT(NDX/POWER(1024,pw2),2),' ',
SUBSTR(units,pw2*2+1,2)),17,' ') "Index Size",
LPAD(CONCAT(FORMAT(TBL/POWER(1024,pw3),2),' ',
SUBSTR(units,pw3*2+1,2)),17,' ') "Total Size"
FROM
(
SELECT DB,ENGINE,DAT,NDX,TBL,
IF(px>4,4,px) pw1,IF(py>4,4,py) pw2,IF(pz>4,4,pz) pw3
FROM
(SELECT *,
FLOOR(LOG(IF(DAT=0,1,DAT))/LOG(1024)) px,
FLOOR(LOG(IF(NDX=0,1,NDX))/LOG(1024)) py,
FLOOR(LOG(IF(TBL=0,1,TBL))/LOG(1024)) pz
FROM
(SELECT
DB,ENGINE,
SUM(data_length) DAT,
SUM(index_length) NDX,
SUM(data_length+index_length) TBL
FROM
(
SELECT table_schema DB,ENGINE,data_length,index_length FROM
information_schema.tables WHERE table_schema NOT IN
('information_schema','performance_schema','mysql')
AND ENGINE IS NOT NULL
) AAA GROUP BY DB,ENGINE WITH ROLLUP
) AAA) AA) A,(SELECT ' BKBMBGBTB' units) B;Context
StackExchange Database Administrators Q#46069, answer score: 68
Revisions (0)
No revisions yet.