patternModerate
Best way to delete very large recordset in Oracle
Viewed 0 times
recordsetdeletewaylargeveryoraclebest
Problem
I manage an application which has a very large (nearly 1TB of data with more than 500 million rows in one table) Oracle database back end. The database doesn't really do anything (no SProcs, no triggers or anything) it's just a data store.
Every month we are required to purge records from the two of the main tables. The criteria for the purge varies and is a combination of row age and a couple of status fields. We typically end up purging between 10 and 50 million rows per month (we add about 3-5 million rows a week via imports).
Currently we have to do this delete in batches of about 50,000 rows (ie. delete 50000, comit, delete 50000, commit, repeat). Attempting to delete the entire batch all at one time makes the database unresponsive for about an hour (depending on the # of rows). Deleting the rows in batches like this is very rough on the system and we typically have to do it "as time permits" over the course of a week; allowing the script to run continuously can result in the a performance degradation that is unacceptable to the user.
I believe that this kind of batch deleting also degrades index performance and has other impacts that eventually cause the performance of the database to degrade. There are 34 indexes on just one table, and the index data size is actually larger than the data itself.
Here is the script that one of our IT people uses to do this purge:
This database must be up 99.99999% and we've only got a 2 day maintenance window once a year.
I'm looking for a better method for removing these records, but I've yet to find any. Any suggestions?
Every month we are required to purge records from the two of the main tables. The criteria for the purge varies and is a combination of row age and a couple of status fields. We typically end up purging between 10 and 50 million rows per month (we add about 3-5 million rows a week via imports).
Currently we have to do this delete in batches of about 50,000 rows (ie. delete 50000, comit, delete 50000, commit, repeat). Attempting to delete the entire batch all at one time makes the database unresponsive for about an hour (depending on the # of rows). Deleting the rows in batches like this is very rough on the system and we typically have to do it "as time permits" over the course of a week; allowing the script to run continuously can result in the a performance degradation that is unacceptable to the user.
I believe that this kind of batch deleting also degrades index performance and has other impacts that eventually cause the performance of the database to degrade. There are 34 indexes on just one table, and the index data size is actually larger than the data itself.
Here is the script that one of our IT people uses to do this purge:
BEGIN
LOOP
delete FROM tbl_raw
where dist_event_date < to_date('[date]','mm/dd/yyyy') and rownum < 50000;
exit when SQL%rowcount < 49999;
commit;
END LOOP;
commit;
END;This database must be up 99.99999% and we've only got a 2 day maintenance window once a year.
I'm looking for a better method for removing these records, but I've yet to find any. Any suggestions?
Solution
The logic with 'A' and 'B' might be "hidden" behind a virtual column on which you could do the partitioning:
alter session set nls_date_format = 'yyyy-mm-dd';
drop table tq84_partitioned_table;
create table tq84_partitioned_table (
status varchar2(1) not null check (status in ('A', 'B')),
date_a date not null,
date_b date not null,
date_too_old date as
( case status
when 'A' then add_months(date_a, -7*12)
when 'B' then date_b
end
) virtual,
data varchar2(100)
)
partition by range (date_too_old)
(
partition p_before_2000_10 values less than (date '2000-10-01'),
partition p_before_2000_11 values less than (date '2000-11-01'),
partition p_before_2000_12 values less than (date '2000-12-01'),
--
partition p_before_2001_01 values less than (date '2001-01-01'),
partition p_before_2001_02 values less than (date '2001-02-01'),
partition p_before_2001_03 values less than (date '2001-03-01'),
partition p_before_2001_04 values less than (date '2001-04-01'),
partition p_before_2001_05 values less than (date '2001-05-01'),
partition p_before_2001_06 values less than (date '2001-06-01'),
-- and so on and so forth..
partition p_ values less than (maxvalue)
);
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('B', date '2008-04-14', date '2000-05-17',
'B and 2000-05-17 is older than 10 yrs, must be deleted');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('B', date '1999-09-19', date '2004-02-12',
'B and 2004-02-12 is younger than 10 yrs, must be kept');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('A', date '2000-06-16', date '2010-01-01',
'A and 2000-06-16 is older than 3 yrs, must be deleted');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('A', date '2009-06-09', date '1999-08-28',
'A and 2009-06-09 is younger than 3 yrs, must be kept');
select * from tq84_partitioned_table order by date_too_old;
-- drop partitions older than 10 or 3 years, respectively:
alter table tq84_partitioned_table drop partition p_before_2000_10;
alter table tq84_partitioned_table drop partition p_before_2000_11;
alter table tq84_partitioned_table drop partition p2000_12;
select * from tq84_partitioned_table order by date_too_old;Code Snippets
alter session set nls_date_format = 'yyyy-mm-dd';
drop table tq84_partitioned_table;
create table tq84_partitioned_table (
status varchar2(1) not null check (status in ('A', 'B')),
date_a date not null,
date_b date not null,
date_too_old date as
( case status
when 'A' then add_months(date_a, -7*12)
when 'B' then date_b
end
) virtual,
data varchar2(100)
)
partition by range (date_too_old)
(
partition p_before_2000_10 values less than (date '2000-10-01'),
partition p_before_2000_11 values less than (date '2000-11-01'),
partition p_before_2000_12 values less than (date '2000-12-01'),
--
partition p_before_2001_01 values less than (date '2001-01-01'),
partition p_before_2001_02 values less than (date '2001-02-01'),
partition p_before_2001_03 values less than (date '2001-03-01'),
partition p_before_2001_04 values less than (date '2001-04-01'),
partition p_before_2001_05 values less than (date '2001-05-01'),
partition p_before_2001_06 values less than (date '2001-06-01'),
-- and so on and so forth..
partition p_ values less than (maxvalue)
);
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('B', date '2008-04-14', date '2000-05-17',
'B and 2000-05-17 is older than 10 yrs, must be deleted');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('B', date '1999-09-19', date '2004-02-12',
'B and 2004-02-12 is younger than 10 yrs, must be kept');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('A', date '2000-06-16', date '2010-01-01',
'A and 2000-06-16 is older than 3 yrs, must be deleted');
insert into tq84_partitioned_table (status, date_a, date_b, data) values
('A', date '2009-06-09', date '1999-08-28',
'A and 2009-06-09 is younger than 3 yrs, must be kept');
select * from tq84_partitioned_table order by date_too_old;
-- drop partitions older than 10 or 3 years, respectively:
alter table tq84_partitioned_table drop partition p_before_2000_10;
alter table tq84_partitioned_table drop partition p_before_2000_11;
alter table tq84_partitioned_table drop partition p2000_12;
select * from tq84_partitioned_table order by date_too_old;Context
StackExchange Database Administrators Q#710, answer score: 18
Revisions (0)
No revisions yet.