HiveBrain v1.2.0
Get Started
← Back to all entries
patternjavaMinor

Oracle 11g: performance improvements of inserts

Submitted by: @import:stackexchange-dba··
0
Viewed 0 times
11ginsertsperformanceoracleimprovements

Problem

I have a table of 500 millions of rows (and growing)

I did the following to improve performance of inserts:

On database side:

  • dropped all indexes and constraints



  • disabled logging



On application side:

  • switched from JPA managed entities to native insert queries, added APPEND Oracle hint to the query



  • tried to commit in batches per 1k/2k/3k of rows



  • tried to write in parallel (multiple threads, thread count = to core count on server) to one table



This gave me about 300 rows per second

Additionally tried:

  • write in parallel in batches to multiple tables (to group then back results using UNION)



This gave me about 1k rows per second, but on empty tables. But when I filled tables with dummy data (200 of millions each), speed of inserts dropped to 250 - 300 per second.

Could anyone suggest what else can I do to speed-up inserts? Basically I want to understand what is (what could be) the bottleneck first.

UPD:
Table is partitioned by insert date, table has about 60 columns - most of columns are VARCHAR2(2000 BYTE)

Solution

Just saw the update, 60-col table with mostly VARCHAR(2k) fields -- that is (potentially) a monster table.

First things first...

You have to understand your bottleneck FIRST. On the app side, go all the way back to your single-threaded batch-insert solution (1/2/3k at a time) and begin running it and login to the DB machine and run a 'top' -- see how much time the DB process is taking AND how much (if any) wa% time the machine is showing.

If top is showing you ANY wa% time, that means your DB is I/O bound and you likely need to consider multiple DB machines (shards) or consider throwing SSDs on the host machine.

That's it; your research stops here. It doesn't matter how much CPU the DB was taking or how saturated your app client was; if you are hitting I/O latency issues on the host DB, that is as fast as it will EVER go for you.

TIP If hardware changes are out of the question, depending on the filesystem you are running (Linux) you can try disabling logging or metadata writing for the DB to slightly improve performance at the filesystem level. You can do something similar on NTFS, but this will only give you a little boost. This won't be 2x.

Now, second things second...

Let's say you had next to no wa% time but your CPU is pegged fully by the DB process. Your only option now is to introduce more DB machines (shards) and split the work.

Again, you're done with your research if this is the case. Nothing you can do to tweak the CPU to go faster.

Lastly, third things... third...

Let's say the DB isn't doing much of anything. Then, go to the client machine running the batch insert and check the CPU load -- is it pegged? If so, fire up some more machines doing the exact same batch inserts and see if you can get a linear ramp.

If the CPU isn't pegged, fire up some more threads on the same machine until it is pegged and see how the DB scales.

I think you may have already tried that, so my guess is that either your client host was already pegged (and more threads isn't going to make a difference) or the DB was already hitting its limit and can't scale any farther.

Addendum

Doing raw inserts on an unindexed table that has no garbage in it is essentially an APPEND operation which should be going as fast as the disk can handle the writes.

Creating more tables on the same host machine isn't going to help, if anything it will increase your disk seeks (to get to the other tables on disk to append to) and will slow things down.

It is critical to figure out that bottleneck first, then we can optimize the hell out of it.

Hope that helps! Keep us posted.

Context

StackExchange Database Administrators Q#37557, answer score: 5

Revisions (0)

No revisions yet.