gotchaMinor
Why does data get corrupted in "large" text files - Reasons to use database
Viewed 0 times
corruptedwhytextdatabasegetlargefilesdoesreasonsuse
Problem
I am just starting to learn about databases and using mysql. I read that databases are more reliable than having large ascii text files and that ascii text files are more prone to corruption as they get bigger.
Can someone explain why this is the case and how databases safeguard against this in a high level? Thank you.
Can someone explain why this is the case and how databases safeguard against this in a high level? Thank you.
Solution
I can only speak to MS SQL Server not MySQL I'm afraid. SQL Server breaks up a large database into small pieces called
So what happens once you have some corruption? Well in a text file you won't realize that the corruption exists until you happen to be looking at the corrupted portion of the file or the file will no longer open. This means that over time more and more corruption will build up over time. In SQL Server each page has a
Which leads us to repairing the corruption. In a text file if your network admin is taking proper backups then you can hopefully restore your file back to before the corruption. If you caught the corruption early enough. If you didn't there may not be a backup old enough. Also your only option is to restore the entire file. This means any changes made to your file since the corruption occurred will be lost. Databases provide a number of restoration methods. If you are taking regular backups you can of course just restore the database. This isn't much different than your text file. However at least in SQL Server there are also options to restore parts of a database even down to the
Here is some additional reading for you.
SQL Skills postings on page checksums.
Understanding Pages and Extents
DBCC CHECKDB and database integrity
Sorry this is MS SQL Server and not MySQL but I imagine some of the principals are the same or at least very similar.
PAGES that are 8k in size. There are a couple of advantages here. With a large text file any time you make a chance you re-write the entire file. With a SQL Server when you make changes you only write down the PAGES that have changed. This means that instead of writing megabytes an gigabytes or even terabytes you are writing down information in 8K chunks. Less writing means less chance of causing corruption.So what happens once you have some corruption? Well in a text file you won't realize that the corruption exists until you happen to be looking at the corrupted portion of the file or the file will no longer open. This means that over time more and more corruption will build up over time. In SQL Server each page has a
CHECKSUM built in. This CHECKSUM is checked under several conditions. Every time that page is written, when backups are taken with the WITH CHECKSUM option and when DBCC CHECKDB is run. As a side note DBCC CHECKDB checks for a number of different possibilities of corruption not just the CHECKSUM. Because of all this if you (or your DBA) is careful they will find the corruption more quickly.Which leads us to repairing the corruption. In a text file if your network admin is taking proper backups then you can hopefully restore your file back to before the corruption. If you caught the corruption early enough. If you didn't there may not be a backup old enough. Also your only option is to restore the entire file. This means any changes made to your file since the corruption occurred will be lost. Databases provide a number of restoration methods. If you are taking regular backups you can of course just restore the database. This isn't much different than your text file. However at least in SQL Server there are also options to restore parts of a database even down to the
PAGE level. And in fact with certain types of high availability there is even the option of automatically fixing a page as soon as corruption occurs.Here is some additional reading for you.
SQL Skills postings on page checksums.
Understanding Pages and Extents
DBCC CHECKDB and database integrity
Sorry this is MS SQL Server and not MySQL but I imagine some of the principals are the same or at least very similar.
Context
StackExchange Database Administrators Q#58494, answer score: 5
Revisions (0)
No revisions yet.