patternphpMinor
Enhance CSV file database import
Viewed 0 times
filecsvdatabaseenhanceimport
Problem
I'm using the script below to import a large .csv file to my database. If the table is empty the process takes about 5 minutes to finish on a local machine. If I'm using the file to update existing values on the same table it takes more than 15 minutes to finish. My .csv file contains about 35,000 rows.
How can I speed up the process?
```
if ( $request->get( $_POST["action"] ) == "import" ) {
$file = $upload->file_upload( "import", "media/import" );
if ( file_exists( DIR_UPLOAD_PHOTO . "/media/import/" . $file ) ) {
$file = DIR_UPLOAD_PHOTO . "/media/import/" . $file;
$handle = fopen( $file, "r" );
if ( $handle ) {
$lines = explode( "\r", fread( $handle, filesize( $file ) ) );
}
$total_array = count( $array );
$x = 0;
foreach ( $lines as $line ) {
if ( $x >= 1 ) {
$data = explode( "|", $line );
$titlu = trim( addslashes( $data[0] ) );
$alias = $this->generate_seo_link( $titlu );
$gramaj = trim( $data[1] );
$greutate = trim( $data[2] );
$pret_total = trim( $data[3] );
$pret_redus = trim( $data[4] );
$poza = trim( $data[5] );
$pret_unitar = trim( $data[6] );
$categorie = trim( $data[7] );
$brand = trim( addslashes( $data[8] ) );
$descriere = trim( addslashes( $data[9] ) );
$vizibil = trim( $data[10] );
$cod = trim( $data[11] );
$nou = trim( $data[12] );
$cant_variabila = trim( $data[13] );
$congelat = trim( $data[14] );
$tva = trim( $data[15] );
How can I speed up the process?
```
if ( $request->get( $_POST["action"] ) == "import" ) {
$file = $upload->file_upload( "import", "media/import" );
if ( file_exists( DIR_UPLOAD_PHOTO . "/media/import/" . $file ) ) {
$file = DIR_UPLOAD_PHOTO . "/media/import/" . $file;
$handle = fopen( $file, "r" );
if ( $handle ) {
$lines = explode( "\r", fread( $handle, filesize( $file ) ) );
}
$total_array = count( $array );
$x = 0;
foreach ( $lines as $line ) {
if ( $x >= 1 ) {
$data = explode( "|", $line );
$titlu = trim( addslashes( $data[0] ) );
$alias = $this->generate_seo_link( $titlu );
$gramaj = trim( $data[1] );
$greutate = trim( $data[2] );
$pret_total = trim( $data[3] );
$pret_redus = trim( $data[4] );
$poza = trim( $data[5] );
$pret_unitar = trim( $data[6] );
$categorie = trim( $data[7] );
$brand = trim( addslashes( $data[8] ) );
$descriere = trim( addslashes( $data[9] ) );
$vizibil = trim( $data[10] );
$cod = trim( $data[11] );
$nou = trim( $data[12] );
$cant_variabila = trim( $data[13] );
$congelat = trim( $data[14] );
$tva = trim( $data[15] );
Solution
Well this might just be my shortest post to date:
Why? Why bother parsing, and manually inserting each and every row individually if you can use something like:
Check the full syntax, and possibilities here. It contains references to how you can handle duplicate keys (hint:
Update, some actual code review:
Having said all that, I think one thing about your code just looked too worrying not to mention: you're obviously using a DB abstraction layer of sorts, or some extension. Please, make sure to use prepared statements. It might also be worth looking into
This snippet does, pretty much, the same thing as your code does, only it's a lot shorter, safer and cleaner... The DB extension I've used here is
IF one of the inserts fail, an exception should be thrown (unless you've set the errmode to silent). If this happens, catch the exception and call
There's also the issue of a rather weird:
You also have this weird way of adding
Just leave out the else, it's noise. It's code-smell.
You also increment
Why? Why bother parsing, and manually inserting each and every row individually if you can use something like:
LOAD DATA INFILE "/path/to/data.csv"
INTO TABLE produse
(titlu, alias, gramaj) -- add all column names here
COLUMNS TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r' -- \r or \n, seems to be \r in your case for some reason
IGNORE 1 LINES; -- if your CSV contains a line containing field names or somethingCheck the full syntax, and possibilities here. It contains references to how you can handle duplicate keys (hint:
REPLACE is what you're probably after).Update, some actual code review:
Having said all that, I think one thing about your code just looked too worrying not to mention: you're obviously using a DB abstraction layer of sorts, or some extension. Please, make sure to use prepared statements. It might also be worth looking into
fgetcsv, to extract data from CSV files in the future. Instead of that pesky fread + explode mess, you can simply write:$handle = fopen('/path/to/data.csv', 'r');
$delimiter = '|';//your delimiter
$db->beginTransaction();//use transactions if you want
//create a prepared statement for the inserts
$stmt = $db->prepare('INSERT INTO my.table (field1, field2, field3) VALUES (:val1, :val2, :val3)');
//set the bind keys we're going to use
$bindKeys = array(':val1', ':val2', ':val3');
while ($line = fgetcsv($handle, 1024, $delimiter))
{
$line = array_map('trim', $line);//trim all of the values in one line
$bind = array_combine($bindKeys, array_values($line));//set the keys for the bind
$stmt->execute($bind);//insert the lot
$stmt->closeCursor();//optional, depending on the db, this might be required
}
$db->commit();//save changes...
fclose($handle);//
$db = null;//close DB connection
echo 'All done!', PHP_EOL;This snippet does, pretty much, the same thing as your code does, only it's a lot shorter, safer and cleaner... The DB extension I've used here is
PDO, in case you were wondering, but you can do the same thing with mysqli_*, too.IF one of the inserts fail, an exception should be thrown (unless you've set the errmode to silent). If this happens, catch the exception and call
$db->rollBack(); to avoid storing partial data.There's also the issue of a rather weird:
$cod = trim($data[11]);//no spaces possible
if ( $cod != "" && $cod != " " )
//second condition is IMPOSIBURUYou also have this weird way of adding
else clauses where they really don't serve any purpose:} else {
$alias = $alias;//??
}Just leave out the else, it's noise. It's code-smell.
You also increment
$x to work out if you're processing the first line (first element of $lines) or not. Why bother, why not drop the first element in the array, and process the rest as a whole:array_shift($lines);//removes $lines[0]
foreach ($lines as $line)
{
//process rest
}Code Snippets
LOAD DATA INFILE "/path/to/data.csv"
INTO TABLE produse
(titlu, alias, gramaj) -- add all column names here
COLUMNS TERMINATED BY '|'
OPTIONALLY ENCLOSED BY '"'
ESCAPED BY '\\'
LINES TERMINATED BY '\r' -- \r or \n, seems to be \r in your case for some reason
IGNORE 1 LINES; -- if your CSV contains a line containing field names or something$handle = fopen('/path/to/data.csv', 'r');
$delimiter = '|';//your delimiter
$db->beginTransaction();//use transactions if you want
//create a prepared statement for the inserts
$stmt = $db->prepare('INSERT INTO my.table (field1, field2, field3) VALUES (:val1, :val2, :val3)');
//set the bind keys we're going to use
$bindKeys = array(':val1', ':val2', ':val3');
while ($line = fgetcsv($handle, 1024, $delimiter))
{
$line = array_map('trim', $line);//trim all of the values in one line
$bind = array_combine($bindKeys, array_values($line));//set the keys for the bind
$stmt->execute($bind);//insert the lot
$stmt->closeCursor();//optional, depending on the db, this might be required
}
$db->commit();//save changes...
fclose($handle);//
$db = null;//close DB connection
echo 'All done!', PHP_EOL;$cod = trim($data[11]);//no spaces possible
if ( $cod != "" && $cod != " " )
//second condition is IMPOSIBURU} else {
$alias = $alias;//??
}array_shift($lines);//removes $lines[0]
foreach ($lines as $line)
{
//process rest
}Context
StackExchange Code Review Q#68976, answer score: 3
Revisions (0)
No revisions yet.