Numerical data with lots of NULLs: 6NF or document database?
Problem
I have a source of numerical data, mostly comprised of sensor readings, physical quantities and flags, each associated with a datetime.

The data has the following characteristics:
- There are hundreds of fields, and new unexpected fields may pop up at any time. We need to record those too.
- For every row, only a small subset of the fields is available.
- Sometimes a field is available but corrupted, i.e. NaN, and we need to record that.
- Data is written once, updated almost never and read relatively often.
- We need to do range queries on the data, both on the date and on the actual numbers.
- We don't want to lose any data.
- It doesn't need to be a distributed system.

I'm wondering what the best way to store these data would be.

Solution 1: One big table
A single huge table would be a bad idea, because it would be full of NULLs and inconvenient:

| id  | timetag                 | sensor1   | sensor2        | flag1 | ... |
-----------------------------------------------------------------------------
| 943 | 2016-12-08 10:27:00.000 | 1.2323523 | -23123.5346234 | 0     | ... |
| 944 | 2016-12-08 10:27:00.038 | NULL      | 163.6151345    | 1     | ... |
| 945 | 2016-12-08 10:27:02.000 | 1.2477801 | NULL           | 3     | ... |
| ... |

Moreover, it would be impossible to distinguish between "recorded but NaN" NULLs and "not recorded" NULLs.

Solution 2: 6th normal form
The first solution that comes to mind is to use 6NF, creating one table for each field (sketched after this list).
This is clean in theory, but:
- queries would require many, many JOINs at the same time;
- new tables would have to be created on the fly when a new field arrives for the first time;
- fields that are always paired, for example longitude and latitude, would have to be automatically detected and their tables merged for efficiency;
- if at some point one of the columns in a "merged" table (e.g. lon in the lon-lat example above) starts arriving on its own, the merged table would have to be split again.
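To make the trade-off concrete, here is a minimal sketch of what the one-table-per-field layout could look like; the table and column names are only illustrative, not taken from the question:

-- One narrow table per field, keyed by the reading's timetag.
CREATE TABLE sensor1
( timetag timestamp primary key,
  value   float           -- a NULL value could stand for "recorded but NaN"
) ;

CREATE TABLE sensor2
( timetag timestamp primary key,
  value   float
) ;

-- Reassembling a "wide" row already costs one join per field:
SELECT timetag,
       s1.value AS sensor1,
       s2.value AS sensor2
FROM sensor1 AS s1
FULL JOIN sensor2 AS s2 USING (timetag) ;

With hundreds of fields, a query like the one above would need hundreds of joins, which is the first objection listed.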
Solution
JSONB might be a good option. If the TYPES of your flags and sensors are always the same (let's say they are all floats and ints), you could also use another strategy:
CREATE TABLE tb
( id           serial primary key,       -- surrogate key for the observation
  timetag      timestamp default now(),
  sensor_nr    integer,                  -- which sensor this row refers to, if any
  sensor_value float,                    -- NULL here means "recorded but NaN"
  flag_nr      integer,                  -- which flag this row refers to, if any
  flag_value   integer
) ;

If you have real values for sensors 2 and 1000, you would then do:
INSERT INTO
    tb
    (sensor_nr, sensor_value)
VALUES
    (2, 123.456),
    (1000, 0.123) ;

Or, if there are flags:
INSERT INTO
    tb
    (sensor_nr, sensor_value, flag_nr, flag_value)
VALUES
    (1000, 123.456, 1000, 0),
    (2, 234.567, 2, 1) ;

The NaN value would be represented by NULL, and "no reading" would be represented simply by the non-existence of a row for that timetag/sensor_nr.
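As an illustration of that convention (the values are made up, not from the original answer): a corrupted reading keeps its row but stores NULL, while a sensor that was not read at all simply gets no row:

-- Sensor 2 was read but returned NaN: keep the row, store NULL as the value.
INSERT INTO
    tb
    (sensor_nr, sensor_value)
VALUES
    (2, NULL) ;

-- Sensor 1000 was not read at this time: nothing is inserted for it.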
If you need frequent range SELECTs of different kinds, you would index by timetag, and also by (sensor_nr, sensor_value). The insert costs would be relatively high, but the SELECTs could be fast.
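A sketch of what those indexes and a typical range query could look like; the index names and the example predicate are illustrative, not part of the original answer:

CREATE INDEX tb_timetag_idx ON tb (timetag) ;
CREATE INDEX tb_sensor_idx  ON tb (sensor_nr, sensor_value) ;

-- Range query on both the date and the value:
-- e.g. readings of sensor 2 above 100 during a given hour.
SELECT timetag, sensor_value
FROM tb
WHERE sensor_nr = 2
  AND sensor_value > 100.0
  AND timetag BETWEEN '2016-12-08 10:00' AND '2016-12-08 11:00' ;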
If the "new unexpected fields that pop at any type" are of different types (i.e.: you have some co-ordinate pairs (float,float) and not just simple floats) this approach won't be flexible enough. In that occasion, probably JSON(B) and the new indices is probably your best alternative; at the cost of losing (some) type safety.
Code Snippets
CREATE TABLE tb
(id serial primary key,
timetag timestamp default now(),
sensor_nr integer,
sensor_value float,
flag_nr integer,
flag_value integer
) ;

INSERT INTO
    tb
    (sensor_nr, sensor_value)
VALUES
    (2, 123.456),
    (1000, 0.123) ;

INSERT INTO
    tb
    (sensor_nr, sensor_value, flag_nr, flag_value)
VALUES
    (1000, 123.456, 1000, 0),
    (2, 234.567, 2, 1) ;

Context
StackExchange Database Administrators Q#157538, answer score: 3