patternMinor
Best way to design this mileage table?
Viewed 0 times
thisdesignwaymileagetablebest
Problem
I will be populating a Miles Per Gallon (MPG) table. It's coming from an odometer source.
It's currently set up as so:
There seems to be some redundancy. The
Should we have those
What indexes would work best? There's about 3,000 trucks (records) inserted in a batch every week.
I'm using SQL Server 2008 R2.
Edit: A sample query that would I'd be using
It's currently set up as so:
id (primary_key)
, truck_num
, start_date
, end_date
, start_miles
, end_miles
, start_fuel
, end_fuel
, miles
, gals
, mpgThere seems to be some redundancy. The
miles is (end_miles - start_miles), ditto for gals.Should we have those
miles and gals columns precalculated and stored in the database? It would definitely make querying easier, but at the expense of space. Same question for having the mpg calculated. A computed column would slow things down, no?What indexes would work best? There's about 3,000 trucks (records) inserted in a batch every week.
I'm using SQL Server 2008 R2.
Edit: A sample query that would I'd be using
-- find average mpg for since ytd
select m.truck_num, avg(mpg)
from mpg m
join truck t on t.truck_num = m.truck_num
where start_date >= @begin_of_year and end_date <= @today
group by truck_numSolution
- Computed columns are your friend. Use them to capture simple calculations you're going to do anyway, and to guarantee that the computed values are correct.
- Persist the results if you want to index or filter on them; don't persist them if you just need to pull the value once in a while.
- Capture all data constraints using
CHECKconstraints.
Here is a pseudo-schema definition:
id PRIMARY KEY
, truck_num
, start_date
, end_date CHECK (end_date > start_date)
, start_miles CHECK (start_miles >= 0)
, end_miles CHECK (end_miles > start_miles)
-- what if they refill the tank?
, start_gals CHECK (start_gals >= 0)
, end_gals CHECK (end_gals = 0)
-- all these should be computed
, miles = end_miles - start_miles
, gals = start_gals - end_gals
, mpg = miles/galsAs for indexing the table, here's what I think:
- Cluster on
start_date ASC. This will satisfy yourWHEREclause. You are always inserting data by increasingstart_date, meaning your writes will always be sequential under this clustering scheme. You are also always querying bystart_dateso you satisfy your biggest query pattern as well. (3,000 inserts per week is nothing. Because you have such a low volume of inserts, you could even cluster onstart_date ASC, end_date ASC.)
- Create a non-clustered index on
truck_numandINCLUDE mpg. This should satisfy yourSELECT,JOIN, andGROUP BYclauses. If you want toORDER BY mpg, then makempgpart of the index key aftertruck_numinstead of justINCLUDE-ing it.
When you're done, test your setup as follows:
- Create a test table
- Pump it full of test data
- Create the indexes
- Update statistics
- Run your most common queries
- Check their plans and run times
Code Snippets
id PRIMARY KEY
, truck_num
, start_date
, end_date CHECK (end_date > start_date)
, start_miles CHECK (start_miles >= 0)
, end_miles CHECK (end_miles > start_miles)
-- what if they refill the tank?
, start_gals CHECK (start_gals >= 0)
, end_gals CHECK (end_gals < start_gals AND end_gals >= 0)
-- all these should be computed
, miles = end_miles - start_miles
, gals = start_gals - end_gals
, mpg = miles/galsContext
StackExchange Database Administrators Q#6836, answer score: 7
Revisions (0)
No revisions yet.