HiveBrain v1.2.0
Get Started
← Back to all entries
patternsqlMinor

What happens when you update a VARCHAR cell with a longer string?

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
cellupdateyouwhatwithlongervarcharwhenhappensstring

Problem

Say I have a column of type VARCHAR(32) in a mySQL database.

On Spetmeber 1st I decide to store the string "tea", but on September 2nd I decide to update it to "coffee".

Clearly, if all the records in this table were squished right up against each other, and we tried to lengthen one record by 3 bytes, then all the records that appeared after this one would need to shift down by 3 bytes. Of course, this is ridiculous; there is no way any DBMS would ever resort to shifting thousands of possible entries.

So what exactly does mySQL do in this eventuality? Does it behave the same way for TEXT and BLOB types?

EDIT:
After reading this a day later, I realized that this question was fairly ambiguous. Here is an example that I hope will clear things up:

Say I have a table, fav_drinks with two columns:

  • user_id, which is an INTEGER



  • drink, which is a VARCHAR(32)



Pretend that this table is stored like this in memory:

[1,"juice",2,"tea",3,"soda",4,"hot chocolate"]


That is, all records are stored sequentially one after the other. If we need to update user 2's favourite drink from "tea" to "coffee", in theory we would need to shift down the entries for users 3 and 4. Of course, I don't think this is what would happen in a real database.

So, to reiterate the question, how does mySQL manage this specific case where one table entry suddenly requires more memory?

Solution

MySQL's InnoDB stores rows in blocks of 16KB each. Within a block there could be a few big rows or hundreds on narrow rows. Almost always, there is some spare room in a block. When a column in a row is changed, the row is rewritten in the same block (if possible) and things are shuffled around within the block to make room. (The exact details are buried in the code.)

If so much text (or blob or whatever) is added, that there is not room in the block, then there is a "block split". This is where the rows in the original block are split between two blocks.

Blocks are stored in a B+Tree arrangement of blocks. (See Wikipedia.) Therefore, the blocks are not necessarily stored consecutively, but they can be referenced consecutively. Thus, a block split in a billion-row table is about the same small amount of work as in a hundred-row table.

Another aspect is the "MVCC" that goes on to allow for multiple transactions to touch the same rows at the same time. This leads to not just replacing a row with the modified value, but actually hanging onto the previous copy of the row until the transaction that needs it is finished. But again, the spare room in a block, plus block splits, plus the BTree organization handles this.

So, your next question is about deleting? Well, if the text is shrunken a lot, or a bunch of rows are DELETEd, then it seems like a block might be shrunken a lot, possibly even emptied? Yes. In this case, InnoDB checks to see if two adjacent blocks are small enough to combine, and does so, thereby freeing up one of the blocks for future use. (This freed up block is kept in the tablespace and not given back to the OS, hence ibdata1 never shrinks; etc.)

All of that stuff happens under the covers, and we don't need to worry about it.

Really large records: For example a row with a MEDIUMTEXT column with 1MB of data won't fit in 16KB. What to do? Well, the bulky columns are actually stored in other blocks, separate from the main part of the record. This leads to some confusing limitations on the maximum size of a row and some unexpected performance issues. Again, this is handled transparently, and the unit of allocation is 16KB.

While I am at it, note that the structure of an INDEX is essentially identical to that of a table -- a BTree of 16KB blocks that play block split/merge games under the covers as the index entries are added/deleted for row inserts/deletes.

Context

StackExchange Database Administrators Q#177807, answer score: 5

Revisions (0)

No revisions yet.