snippetMinor
How to model when you could have many (thousands of) attributes to a table (product)?
Viewed 0 times
howyoutablecouldproductattributesmanywhenthousandsmodel
Problem
We have a main table, let's say, Product. It has several general attributes like 'Product Name', Manufacturer ID, etc. As of now, all these attributes are implemented in the data model as columns to the Product.
Now we have a new requirement where there could be thousands of attributes. Color, size, weight, dimensions, style, etc. Each of these attributes have pre-defined values. Like color can only be ('green', 'yellow', 'red'). The value set is defined for each attribute. But there could be thousands of attributes and each attribute could have thousands of pre-defined values.
So, for this new feature, we defined these attributes as 'row values' instead of 'columns'.
Something like this:
Product Table
ProductID Name Date_Created
1 Candy some date
2 Shirt some date
3 Pants some date
Attribute Table
AttributeID Name
1 Color
2 Dimension
3 Weight
AttributeValue Table
AttributeValueID AttributeID Value
1 1 Green
2 1 Yellow
3 1 Red
4 2 11x11
5 2 12x12
6 2 13x13
7 3 1 lb
8 3 2 lb
9 3 3 lb
Now to create association between Product Table and Attribute table, we have an association table called 'ProductAttributes'
ProductAttributes Table
ProductAttributeId ProductID AttributeID AttributeValueID
1 1 1 1
2 1 2 4
3 1 3 9
4 2 1 2
5 2 3 8
6 3 3 8
Does this m
Now we have a new requirement where there could be thousands of attributes. Color, size, weight, dimensions, style, etc. Each of these attributes have pre-defined values. Like color can only be ('green', 'yellow', 'red'). The value set is defined for each attribute. But there could be thousands of attributes and each attribute could have thousands of pre-defined values.
So, for this new feature, we defined these attributes as 'row values' instead of 'columns'.
Something like this:
Product Table
ProductID Name Date_Created
1 Candy some date
2 Shirt some date
3 Pants some date
Attribute Table
AttributeID Name
1 Color
2 Dimension
3 Weight
AttributeValue Table
AttributeValueID AttributeID Value
1 1 Green
2 1 Yellow
3 1 Red
4 2 11x11
5 2 12x12
6 2 13x13
7 3 1 lb
8 3 2 lb
9 3 3 lb
Now to create association between Product Table and Attribute table, we have an association table called 'ProductAttributes'
ProductAttributes Table
ProductAttributeId ProductID AttributeID AttributeValueID
1 1 1 1
2 1 2 4
3 1 3 9
4 2 1 2
5 2 3 8
6 3 3 8
Does this m
Solution
What's the question? Does this make sense? Yea, sure.
Example query to get the product name and attributes for a product:
Index-wise, make sure you've got the usual primary key indexes. You'll also need an index on ProductAttributes.productID. You'll need an index on Attribute.name if you're planning on referring to this field in a query condition (eg get products which have a value for attribute 'weight')
You don't really need the AttributeID in the ProductAttributes table, as you've got the AttributeValueID, whose table then links to ProductAttributes. But having it there could conceivably act as a short cut if you were frequently getting a list of Attributes for a product without their values:
@Colin'tHart - you may be right about it not being a good design, but you'll need to explain why for your argument to have any weight.
- A product could have more than one attribute Fine
- A product could have more than one of the same attribute. A shirt is both 'brown' and 'green' Um. If you say so. Sounds a bit strange but that's up to you, I guess.
- All Attributes and its Values have to be pre-defined (loaded from text file) Again, sounds a bit strange that numerical fields like weight and length have to have pre-defined values. I personally go for a table of pre-defined values and storing non-predefined values separately, but it makes the structure a LOT more complex. Up to you if you want (or need) to do it this way, really.
- Once the data is loaded (once every few months), there is heavy SELECTing. So the model needs to be read-optimized. If it's properly indexed then the fact that you've got a squillion rows shouldn't impact on performance so much.
Example query to get the product name and attributes for a product:
SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name,
ProductAttributes.AttributeValueID,
AttributeValue.AttributeValueID, AttributeValue.Value,
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234Index-wise, make sure you've got the usual primary key indexes. You'll also need an index on ProductAttributes.productID. You'll need an index on Attribute.name if you're planning on referring to this field in a query condition (eg get products which have a value for attribute 'weight')
You don't really need the AttributeID in the ProductAttributes table, as you've got the AttributeValueID, whose table then links to ProductAttributes. But having it there could conceivably act as a short cut if you were frequently getting a list of Attributes for a product without their values:
SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
--THIS JOIN IS NOW NOT REQUIRED... JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
--THIS JOIN CAN BE REPLACED BY THE ONE BELOW... JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
JOIN Attribute on ProductAttributes.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234@Colin'tHart - you may be right about it not being a good design, but you'll need to explain why for your argument to have any weight.
Code Snippets
SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name,
ProductAttributes.AttributeValueID,
AttributeValue.AttributeValueID, AttributeValue.Value,
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
--THIS JOIN IS NOW NOT REQUIRED... JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
--THIS JOIN CAN BE REPLACED BY THE ONE BELOW... JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
JOIN Attribute on ProductAttributes.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234Context
StackExchange Database Administrators Q#71655, answer score: 2
Revisions (0)
No revisions yet.