Bulk database update/insert from CSV file
Problem
I am implementing an application-specific feature that imports data from one database to another.
I have a CSV file containing, say, 10,000 rows. These rows need to be inserted into or updated in the database: a row that already exists in the database must be updated, and a row that does not must be inserted.
One possible solution is to read the file line by line, check whether each entry exists in the database, and build insert/update queries accordingly. But this process may take a long time to create and execute the queries, and sometimes my CSV file has millions of records.
Is there any faster way to achieve this?
Solution
There's a nice technology available in Oracle called External Tables. In your scenario, you could access your external plain-text data from within the database using External Tables and update your existing data with the SQL statements you love and are used to: INSERT, MERGE, and so on.
In most cases, using Oracle-supplied utilities is the best way to perform ETL. And because your question sounds more like an administrative one, I suggest you look at my previous post on DBA Stack Exchange, "Update Oracle database from CSV".
UPDATE: This approach works well for reading external data into the database. Generally, you only need to define the external data format when you process a plain-text file with a new format. Once the external table is created, you can query it just like a real database table. Whenever there is new data to import, you simply replace the underlying file(s) on the fly, with no need to recreate the external table(s). Since an external table can be queried like any other database table, you can write SQL statements that populate other database tables from it.
The overhead of using External Tables is usually lower than that of techniques you would implement manually, because this technology was designed with performance in mind, taking the Oracle Database architecture into account.
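To make this concrete, here is a minimal sketch of the approach. The table names, column layout, file name, and the DATA_DIR directory object are all hypothetical stand-ins; adjust them to your actual schema and file location.

```sql
-- Hypothetical setup: a directory object pointing at the folder that
-- holds the CSV file (requires the CREATE ANY DIRECTORY privilege).
CREATE DIRECTORY data_dir AS '/path/to/csv/files';

-- External table mapped over the CSV file; no data is loaded here,
-- the file itself is read on each query.
CREATE TABLE products_ext (
  id    NUMBER,
  name  VARCHAR2(100),
  price NUMBER
)
ORGANIZATION EXTERNAL (
  TYPE ORACLE_LOADER
  DEFAULT DIRECTORY data_dir
  ACCESS PARAMETERS (
    RECORDS DELIMITED BY NEWLINE
    FIELDS TERMINATED BY ','
    MISSING FIELD VALUES ARE NULL
  )
  LOCATION ('input.csv')
);

-- One MERGE statement performs the whole upsert: rows whose id already
-- exists in the target table are updated, all others are inserted.
MERGE INTO products t
USING products_ext s
ON (t.id = s.id)
WHEN MATCHED THEN
  UPDATE SET t.name = s.name, t.price = s.price
WHEN NOT MATCHED THEN
  INSERT (id, name, price)
  VALUES (s.id, s.name, s.price);
```

When a new file arrives, you just overwrite input.csv and rerun the MERGE; the external table definition stays untouched.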
Context
StackExchange Database Administrators Q#46517, answer score: 7