patternsqlMinor
Choosing MyISAM over InnoDB for these project requirements; and long term options
Viewed 0 times
theserequirementsinnodblongtermoptionsprojectformyisamand
Problem
Sorry for the long post, but I had to give as much info as possible to make this very vague question more specific.
My project's aim is to let users search a (huge) database of variety of products.
The most common usecase is:
I have three main tables 'products', 'features_enum' and 'features'. It is very important to let the data-entry users, create new 'features' on the fly, for the products - hence I am using EAV (anti)pattern.
Here are the structures of the tables:
A sample format of my main search query:
My Experimentation So Far:
Due to my limited knowledge and experience in DBs, I hit a wall with theoretical planning. So, I generated a large set of test data to simply see what will work. All three tables had 500,000 test rows. Here are the avg. run times of the main search query:
Test machine - Pentium 4 1.9GHz, 1.5GB RAM, IDE HDD, Win7.
I have basically not done anything to optimize
My project's aim is to let users search a (huge) database of variety of products.
- Each product is present under a category.
- Each product will have 10 to 100 'specs' or 'features' by which the users will search.
The most common usecase is:
- User clicks on a category; then clicks various sub-categories if required.
- User starts off with 1 or 2 criteria and searches for products.
- User then keeps adding more criteria to the search to narrow down on the product.
I have three main tables 'products', 'features_enum' and 'features'. It is very important to let the data-entry users, create new 'features' on the fly, for the products - hence I am using EAV (anti)pattern.
Here are the structures of the tables:
'products'
ID(PK), TITLE, CATEGORY
(Indexed by CATEGORY)
'features_enum'
ID(PK), TITLE
'features'
P_ID, F_ID, VAL
(Indexed by P_ID and then F_ID)A sample format of my main search query:
SELECT
p.ID,
p.TITLE PROD_TITLE,
fe.TITLE FEATURE_TITLE,
f.VAL
FROM
products p, features f, features_enum fe
WHERE
p.CATEGORY = 57 AND
p.ID = f.P_ID AND
f.F_ID = fe.ID AND
(
(f.F_ID = 1 AND f.VAL = 'Val1') AND
(f.F_ID = 2 AND f.VAL = 'Val2') AND
...
(f.F_ID = N AND f.VAL = 'ValN') AND
)My Experimentation So Far:
Due to my limited knowledge and experience in DBs, I hit a wall with theoretical planning. So, I generated a large set of test data to simply see what will work. All three tables had 500,000 test rows. Here are the avg. run times of the main search query:
- InnoDB without indexing: 90s.
- InnoDB with indexing: 15s. 0.3s after buffer pool size increase
- MyISAM without indexing: 9s.
- MyISAM with indexing: 0.7s.
- MyISAM with indexing + FIXED row type: 0.16s.
Test machine - Pentium 4 1.9GHz, 1.5GB RAM, IDE HDD, Win7.
I have basically not done anything to optimize
Solution
InnoDB and MyISAM each have their strengths and weaknesses.
If you have enough RAM, I would choose InnoDB because it caches data and index pages in the Buffer Pool. MyISAM only caches index pages in the Key Cache.
MyISAM tables experience full table locks for each INSERT, UPDATE, and DELETE. MyISAM tables always require disk access for data.
InnoDB tables always incur disk I/O in the following areas:
SUMMARY
In an environment with the following:
I would always choose InnoDB. Please check out my other post about InnoDB over MyISAM : When to switch from MyISAM to InnoDB?
When would I every choose MyISAM?
Under the following scenario
Disk I/O wise, MyISAM has a slight edge with
On the other hand, InnoDB has to interact with multiple files (ibdata1, serveral read/write threads upon the .ibd of the InnoDB table).
May 03, 2012: Which is faster, InnoDB or MyISAM?
Sep 20, 2011: Best of MyISAM and InnoDB
If you have enough RAM, I would choose InnoDB because it caches data and index pages in the Buffer Pool. MyISAM only caches index pages in the Key Cache.
MyISAM tables experience full table locks for each INSERT, UPDATE, and DELETE. MyISAM tables always require disk access for data.
InnoDB tables always incur disk I/O in the following areas:
- Double Write Buffer : Changes are posted in ibdata1 to avoid OS caching
- Insert Buffer : Changes to Secondary (non-Unique) Indexes as posted in ibdata1
- Data and Indexes
- With innodb_file_per_table = 0, changes are written to ibdata1
- With innodb_file_per_table = 1, changes are written to
.ibdtablespace file. Read I/O against ibdata1 still necessary to crosscheck table metadata
SUMMARY
In an environment with the following:
- heavy writes
- heavy reads
- tons of RAM
- heavy connections
I would always choose InnoDB. Please check out my other post about InnoDB over MyISAM : When to switch from MyISAM to InnoDB?
When would I every choose MyISAM?
Under the following scenario
- Using MySQL Replication
- Master with all InnoDB
- Slave with all tables converted to MyISAM
ALTER TABLE ... ROW_FORMAT=Fixedfor all tables on the Slave
Disk I/O wise, MyISAM has a slight edge with
ROW_FORMAT-Fixed because you only interact with one file, the .MYD file. The row size is completely predictable because VARCHAR is treated as CHAR this shortening access time for data retrieval.May 10, 2011: What is the performance impact of using CHAR vs VARCHAR on a fixed-size field?
Mar 25, 2011: Performance implications of MySQL VARCHAR sizes
On the other hand, InnoDB has to interact with multiple files (ibdata1, serveral read/write threads upon the .ibd of the InnoDB table).
Context
StackExchange Database Administrators Q#24951, answer score: 4
Revisions (0)
No revisions yet.