HiveBrain v1.2.0
Get Started
← Back to all entries
snippetMinor

How can I allow part of a key to be null and enforce uniqueness of records with nulls?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
canenforcewithnullpartrecordsallowhowanduniqueness

Problem

I have an Access table called Engagement with four fields:

Emp_id, Year, Week, Act_id


It records when an employee is/was engaged in an activity.
Each field is Number and each makes up part of the composite primary key.
The semantics of the application was that each entry (each Emp_id-Year-Week-Act_id combination) must be unique.
That is, while an employee will generally have different year-weeks and different activities, an employee can sometimes engage in the same activity on different year-weeks or even engage in different activities in the same year-week. Nulls were not allowed. Everything worked.

Now I need to expand/modify the semantics to allow unknown – or more aptly, not disclosed – Year-Week values associated with any given employee-activity.
Of course, attempting to enter a row with empty Year-Week results in
"Index or primary key cannot contain a Null value".
So I need a change to the table design.

One thing I tried was to convert the primary key index into a non-primary index by turning Primary off (and leaving Unique on) in the Indexes window.
This correctly prevents duplicate records where Year-Week values are non-empty -- but it allows duplicate records where Year-Week are empty.

For example, using the above non-primary, unique index, the following data are allowed:

Emp_id  Year  Week  Act_id
     7  2014    12      31  } Same activity,
     7  2015    22      31  } different dates.
     7  2015    33      32
     7  2015    40      33  } Same dates,
     7  2015    40      34  } different activities.
     7  2016     2      36
     7                  38  } Different activities,
     7                  39  } undisclosed dates.


And none of the following additions are subsequently allowed:

Emp_id  Year  Week  Act_id
     7  2014    12      31  } Both records are 
     7  2015    33      32  } duplicates of above.


So far so good (behavior matches requirements). However, both the following additions are subseque

Solution

The strongest advice I can give is to keep the primary key immutable unless it is absolutely unfeasible for your use-case. You appear to be describing 2 different datasets.

  • User-Actions taken during a given Period ( defined by the composite key of Year-Week )



  • User-Actions taken during an unknown Period



My initial impression is that the unknown period must needs be a different dataset. Surely if you don't know when an action has taken place you also do not yet know enough to move it from its staging environment to the table where User-Action-Period is the unique identifier. I don't know MS Access well enough to comment on the specifics of null keying in that environment, but I strongly recommend defining your data model for yourself in such a way that you know if you need to include the "equivalent-to-null" keyed records on the same table.

From your description, it seems that the problem occurs when attempting to insert multiple instances of a User-Action where the Period occurred is not known. Once again, if you do not know enough to uniquely identify the committed time of the User-Action, it is appropriate to stage the metrics you are recording about the same User-Action elsewhere until you can appropriately identify the relationship this User-Action has to the other similar User-Actions that occurred at another key-able time Period.

I see two paths forward:
If only one User-Action with "unknown" time-key is permissible

Example: The action is "pending" or "to-be-completed". The user will always be submitting data for a future / not-completed period.

Solution: Pick your favorite method of keying the time period as "equivalent-to-null" and make the key columns not-nullable. You may continue to store these datasets in the same table

Gotcha: If the user submits an action with a committed period while an unknown period exists for the same User-Action, you may run into a situation where data is committed where the period is known but other data already committed for an unknown period ( which happens to be the same one ) is left in the "unknown period" state and lost for that User-Action-Time record and the user doesn't immediately know why.
Multiple User-Actions with "unknown" time-keys are permissible

Example: Users are able to uniquely identify the same action type with a different identifier for an unknown Period.

Solution: You must define another way to key the data ( where the time the action occurs is a metric of the other multi-keyed record ). Allow the users to submit data to this separately stored dataset and merge it into your User-Action-Time period dataset using appropriate logic. It's more work, but there's no way around it if the datasets are bona-fide separate.

Context

StackExchange Database Administrators Q#141241, answer score: 6

Revisions (0)

No revisions yet.