HiveBrain v1.2.0
Get Started
← Back to all entries
snippetsqlMinor

How to see percentage of existing values for all columns in mysql table?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
valuescolumnsalltableseemysqlforhowexistingpercentage

Problem

I have three large-ish tables (~20 columns each) and I want to know what percentage of rows have a value for each column.

For instance, this table:

ID   | TITLE               | SERVING     | NOTE
-------------------------------------------------------------------
6716 |  Yummy Molasses ... | 1 cookie    |
3765 |  Rosemary-Red Wi... |             |
5178 |  Stuffed Avocado... |             |
6025 |  Amazing Pea Sou... | about 1 cup |
4412 |  Overnight Oatme... | 1 cup       | Note: Steel-cut oats...


Might yield results similar to this:

ID   | TITLE               | SERVING     | NOTE
-------------------------------------------------------------------
100% | 100%                | 60%         | 20%


This is for an internal report, so the format isn't too important. If I only have the count() instead of the percent, that's fine. Having the column name would be nice, but not necessary. Etc.

I'm not very familiar with queries that involve more meta questions like gathering table names or working over all the columns in a table, so I'm interested to see what people suggest.

Solution

I have a Dynamic SQL solution

PROPOSED QUERY

SET group_concat_max_len = 1048576;
SET @GivenDB = 'mydb';
SET @GivenTable = 'mytable';
SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
')*100/COUNT(1) ',column_name,'_pct')),
' FROM ',table_schema,'.',table_name) sqlstmt
INTO @sql FROM information_schema.columns
WHERE table_schema=@GivenDB
AND table_name=@GivenTable;
PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;


EXAMPLE

Let's use a sample table I made up for another question

mysql> use timyash
Database changed
mysql> show tables;
+-------------------+
| Tables_in_timyash |
+-------------------+
| mytable           |
+-------------------+
1 row in set (0.00 sec)

mysql> desc mytable;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| id           | int(11) | NO   | PRI | NULL    | auto_increment |
| transmitted  | int(11) | NO   |     | NULL    |                |
| connect_time | int(11) | NO   |     | NULL    |                |
+--------------+---------+------+-----+---------+----------------+
3 rows in set (0.02 sec)

mysql> select * from mytable;
+----+-------------+--------------+
| id | transmitted | connect_time |
+----+-------------+--------------+
|  1 |     2650131 |       117987 |
|  2 |     6465178 |        78073 |
|  3 |    25905117 |       159268 |
|  4 |    59178089 |       410282 |
|  5 |    73502942 |       596408 |
|  6 |    75695427 |       683045 |
|  7 |    77576167 |       740379 |
+----+-------------+--------------+
7 rows in set (0.00 sec)


mysql>

PROPOSED QUERY DISPLAYED

mysql> SET group_concat_max_len = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenDB = 'timyash';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenTable = 'mytable';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
    -> ')*100/COUNT(1) ',column_name,'_pct')),
    -> ' FROM ',table_schema,'.',table_name) sqlstmt
    -> INTO @sql FROM information_schema.columns
    -> WHERE table_schema=@GivenDB
    -> AND table_name=@GivenTable;
Query OK, 1 row affected (0.01 sec)

mysql> SELECT @sql\G
*************************** 1. row ***************************
@sql: SELECT COUNT(id)*100/COUNT(1) id_pct,COUNT(transmitted)*100/COUNT(1) transmitted_pct,COUNT(connect_time)*100/COUNT(1) connect_time_pct FROM timyash.mytable
1 row in set (0.00 sec)

mysql>


PROPOSED QUERY EXECUTED

mysql> SET group_concat_max_len = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenDB = 'timyash';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenTable = 'mytable';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
    -> ')*100/COUNT(1) ',column_name,'_pct')),
    -> ' FROM ',table_schema,'.',table_name) sqlstmt
    -> INTO @sql FROM information_schema.columns
    -> WHERE table_schema=@GivenDB
    -> AND table_name=@GivenTable;
Query OK, 1 row affected (0.02 sec)

mysql> PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;
Query OK, 0 rows affected (0.00 sec)
Statement prepared

+----------+-----------------+------------------+
| id_pct   | transmitted_pct | connect_time_pct |
+----------+-----------------+------------------+
| 100.0000 |        100.0000 |         100.0000 |
+----------+-----------------+------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql>


PROPOSED QUERY EXECUTED WITH NULL DATA

Let's change two rows to have NULL column data

mysql> alter table mytable modify column connect_time INT NULL;
Query OK, 0 rows affected (0.66 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> update mytable set connect_time = 1/0 WHERE id in (2,5);
Query OK, 2 rows affected (0.06 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select * from mytable;
+----+-------------+--------------+
| id | transmitted | connect_time |
+----+-------------+--------------+
|  1 |     2650131 |       117987 |
|  2 |     6465178 |         NULL |
|  3 |    25905117 |       159268 |
|  4 |    59178089 |       410282 |
|  5 |    73502942 |         NULL |
|  6 |    75695427 |       683045 |
|  7 |    77576167 |       740379 |
+----+-------------+--------------+
7 rows in set (0.00 sec)

mysql>


Let's run it again

```
mysql> SET group_concat_max_len = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenDB = 'timyash';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenTable = 'mytable';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
-> ')*100/COUNT(1) ',column_name,'_pct')),
-> ' FROM ',table_schema,'.',table_name) sqlstmt
-> INTO @sql FROM information_schema.columns
-> WHERE table_schema=@GivenDB
-> AND table_name=@GivenTable;
Query OK, 1 row affected (0.03 sec

Code Snippets

SET group_concat_max_len = 1048576;
SET @GivenDB = 'mydb';
SET @GivenTable = 'mytable';
SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
')*100/COUNT(1) ',column_name,'_pct')),
' FROM ',table_schema,'.',table_name) sqlstmt
INTO @sql FROM information_schema.columns
WHERE table_schema=@GivenDB
AND table_name=@GivenTable;
PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;
mysql> use timyash
Database changed
mysql> show tables;
+-------------------+
| Tables_in_timyash |
+-------------------+
| mytable           |
+-------------------+
1 row in set (0.00 sec)

mysql> desc mytable;
+--------------+---------+------+-----+---------+----------------+
| Field        | Type    | Null | Key | Default | Extra          |
+--------------+---------+------+-----+---------+----------------+
| id           | int(11) | NO   | PRI | NULL    | auto_increment |
| transmitted  | int(11) | NO   |     | NULL    |                |
| connect_time | int(11) | NO   |     | NULL    |                |
+--------------+---------+------+-----+---------+----------------+
3 rows in set (0.02 sec)

mysql> select * from mytable;
+----+-------------+--------------+
| id | transmitted | connect_time |
+----+-------------+--------------+
|  1 |     2650131 |       117987 |
|  2 |     6465178 |        78073 |
|  3 |    25905117 |       159268 |
|  4 |    59178089 |       410282 |
|  5 |    73502942 |       596408 |
|  6 |    75695427 |       683045 |
|  7 |    77576167 |       740379 |
+----+-------------+--------------+
7 rows in set (0.00 sec)
mysql> SET group_concat_max_len = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenDB = 'timyash';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenTable = 'mytable';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
    -> ')*100/COUNT(1) ',column_name,'_pct')),
    -> ' FROM ',table_schema,'.',table_name) sqlstmt
    -> INTO @sql FROM information_schema.columns
    -> WHERE table_schema=@GivenDB
    -> AND table_name=@GivenTable;
Query OK, 1 row affected (0.01 sec)

mysql> SELECT @sql\G
*************************** 1. row ***************************
@sql: SELECT COUNT(id)*100/COUNT(1) id_pct,COUNT(transmitted)*100/COUNT(1) transmitted_pct,COUNT(connect_time)*100/COUNT(1) connect_time_pct FROM timyash.mytable
1 row in set (0.00 sec)

mysql>
mysql> SET group_concat_max_len = 1048576;
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenDB = 'timyash';
Query OK, 0 rows affected (0.00 sec)

mysql> SET @GivenTable = 'mytable';
Query OK, 0 rows affected (0.00 sec)

mysql> SELECT CONCAT('SELECT ',GROUP_CONCAT(CONCAT('COUNT(',column_name,
    -> ')*100/COUNT(1) ',column_name,'_pct')),
    -> ' FROM ',table_schema,'.',table_name) sqlstmt
    -> INTO @sql FROM information_schema.columns
    -> WHERE table_schema=@GivenDB
    -> AND table_name=@GivenTable;
Query OK, 1 row affected (0.02 sec)

mysql> PREPARE s FROM @sql; EXECUTE s; DEALLOCATE PREPARE s;
Query OK, 0 rows affected (0.00 sec)
Statement prepared

+----------+-----------------+------------------+
| id_pct   | transmitted_pct | connect_time_pct |
+----------+-----------------+------------------+
| 100.0000 |        100.0000 |         100.0000 |
+----------+-----------------+------------------+
1 row in set (0.00 sec)

Query OK, 0 rows affected (0.00 sec)

mysql>
mysql> alter table mytable modify column connect_time INT NULL;
Query OK, 0 rows affected (0.66 sec)
Records: 0  Duplicates: 0  Warnings: 0

mysql> update mytable set connect_time = 1/0 WHERE id in (2,5);
Query OK, 2 rows affected (0.06 sec)
Rows matched: 2  Changed: 2  Warnings: 0

mysql> select * from mytable;
+----+-------------+--------------+
| id | transmitted | connect_time |
+----+-------------+--------------+
|  1 |     2650131 |       117987 |
|  2 |     6465178 |         NULL |
|  3 |    25905117 |       159268 |
|  4 |    59178089 |       410282 |
|  5 |    73502942 |         NULL |
|  6 |    75695427 |       683045 |
|  7 |    77576167 |       740379 |
+----+-------------+--------------+
7 rows in set (0.00 sec)

mysql>

Context

StackExchange Database Administrators Q#87179, answer score: 3

Revisions (0)

No revisions yet.