patternphpMinor
Removing image records if no physical file exists
Viewed 0 times
fileimageremovingrecordsexistsphysical
Problem
I have a working script that selects image fields in all tables and empty their values if the physical file doesnt exist.
The script is working fine, but it takes time. I was wondering if there is a smarter (more optimized) way to get the same function.
$query1 = "SELECT table_name,column_name
FROM information_schema.columns
WHERE table_schema='schemaname' AND column_name like '%image%' or column_name='video'";
$result1 = mysql_query($query1) or die(mysql_error() . " -- " . $query1);
while($row1 = mysql_fetch_row($result1)){
if (!strpos($row1[0],'backup') > 0){
$sql = "Select COLUMN_NAME FROM information_schema.columns WHERE TABLE_NAME = '".$row1[0]."' AND EXTRA = 'auto_increment'";
$resultcol = mysql_query($sql);
$rowcol = mysql_fetch_row($resultcol);
$query2 = "SELECT " . $row1[1] . ", " .$rowcol[0] . "
FROM " . $row1[0] . "
WHERE " . $row1[1] . " != '' AND " . $row1[1] . " IS NOT NULL
";
echo $query2 . "";
$result2 = mysql_query($query2) or die(mysql_error() . " -- " . $query2);
while ($rowdb = mysql_fetch_row($result2)){
if (!strpos($rowdb[0],'facebook') > 0 && !file_exists($img_root.'/'.$rowdb[0])){
$sql = "UPDATE ".$row1[0]." SET ". $row1[1] . " = '' WHERE " . $rowcol[0]. "= ".$rowdb[1];
echo $sql . "";
$delete_count++;
//mysql_query("UPDATE ".$row1[0]." SET ". $row1[1] . " = '' WHERE id = ".$row1["id"]);
}
}
}
}The script is working fine, but it takes time. I was wondering if there is a smarter (more optimized) way to get the same function.
Solution
The
Each query that you issue has a rather large overhead: you have to communicate the query to the server; the server has to parse, plan, and execute the query; the results have to be returned to your application. Therefore, you want to avoid running any query in a loop. You're much better off executing fewer complex queries than many simple queries.
First, note that your
Let's clarify the naming.
Notionally, you want to execute this query for each result returned by that
Of course, there are two problems with that idea:
-
You have to execute many
This is a strong indication that your database schema is poorly designed. There should not be a proliferation of tables! You should also not be doing version control or backups by creating tables named
A solution to the first problem is to create a temporary table that lists all images on your website's filesystem, then write the
A solution to the second problem is to use a stored procedure with a loop that executes dynamically composed SQL queries with a cursor.
Anyway, this solution is rather nasty (and untested!) — all as a result of your poor database schema design.
mysql_* functions are deprecated. Use mysqli or PDO instead.Each query that you issue has a rather large overhead: you have to communicate the query to the server; the server has to parse, plan, and execute the query; the results have to be returned to your application. Therefore, you want to avoid running any query in a loop. You're much better off executing fewer complex queries than many simple queries.
First, note that your
$query1 has a bug: AND has higher precedence than OR, so you need parentheses. Also, you eventually ignore any column named …backup, so you might as well filter those out immediately.SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');Let's clarify the naming.
- Your
$row1[0]is what I calltable_name
- Your
$row1[1]is what I callimage_col
Notionally, you want to execute this query for each result returned by that
$schema query.UPDATE `$schema['table_name']`
SET `$schema['image_col']` = ''
WHERE
`$schema['image_col']` != '' AND `$schema['image_col']` IS NOT NULL
AND `$s['image_col']` NOT LIKE '%facebook%'
AND NOT file_exists('$img_root/$schema['image_col']')Of course, there are two problems with that idea:
- There is no
file_exists()function in MySQL (and furthermore, if such a function existed, it would be wrong to run it on the database server rather than on the application server, since they could be on different machines with different filesystems).
-
You have to execute many
UPDATEs, not just one. Furthermore, the UPDATE queries are dynamic — the table and column names vary.This is a strong indication that your database schema is poorly designed. There should not be a proliferation of tables! You should also not be doing version control or backups by creating tables named
…backup. I encourage you to follow up with these issues on http://dba.stackexchange.com.A solution to the first problem is to create a temporary table that lists all images on your website's filesystem, then write the
UPDATE with an anti-join.$db = new mysqli(…);
$db->query('CREATE TEMPORARY TABLE fs_images ( filename VARCHAR(128) PRIMARY KEY );');
$ins = $db->prepare('INSERT INTO fs_images VALUES (?)');
$file_it = new RecursiveDirectoryIterator($img_root);
foreach (new RecursiveIteratorIterator($file_it) as $f) {
$ins->bind_param('s', basename($f));
$ins->execute();
}
$ins->close();A solution to the second problem is to use a stored procedure with a loop that executes dynamically composed SQL queries with a cursor.
DELIMITER $
CREATE PROCEDURE clear_images()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE table_name VARCHAR(64);
DECLARE image_col VARCHAR(64);
DECLARE schema_cursor CURSOR FOR
SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN schema_cursor;
schema_loop: LOOP
FETCH schema_cursor INTO table_name, image_col;
IF done THEN
LEAVE schema_loop;
END IF;
PREPARE stmt FROM CONCAT(
'UPDATE `', @table_name, '`',
' SET `', @image_col, '` = ''''',
' WHERE',
' `', @image_col, '` != '''' AND `', @image_col, '` IS NOT NULL',
' AND `', @image_col, '` NOT LIKE ''%facebook%''',
' AND `', @image_col, '` NOT IN (SELECT filename FROM fs_images);'
);
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END LOOP;
END$
DELIMITER ;Anyway, this solution is rather nasty (and untested!) — all as a result of your poor database schema design.
Code Snippets
SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');UPDATE `$schema['table_name']`
SET `$schema['image_col']` = ''
WHERE
`$schema['image_col']` != '' AND `$schema['image_col']` IS NOT NULL
AND `$s['image_col']` NOT LIKE '%facebook%'
AND NOT file_exists('$img_root/$schema['image_col']')$db = new mysqli(…);
$db->query('CREATE TEMPORARY TABLE fs_images ( filename VARCHAR(128) PRIMARY KEY );');
$ins = $db->prepare('INSERT INTO fs_images VALUES (?)');
$file_it = new RecursiveDirectoryIterator($img_root);
foreach (new RecursiveIteratorIterator($file_it) as $f) {
$ins->bind_param('s', basename($f));
$ins->execute();
}
$ins->close();DELIMITER $$
CREATE PROCEDURE clear_images()
BEGIN
DECLARE done INT DEFAULT FALSE;
DECLARE table_name VARCHAR(64);
DECLARE image_col VARCHAR(64);
DECLARE schema_cursor CURSOR FOR
SELECT TABLE_NAME AS table_name, COLUMN_NAME AS image_col
FROM INFORMATION_SCHEMA.COLUMNS
WHERE
TABLE_SCHEMA = 'schemaname'
AND TABLE_NAME NOT LIKE '%backup%'
AND (COLUMN_NAME LIKE '%image%' OR COLUMN_NAME = 'video');
DECLARE CONTINUE HANDLER FOR NOT FOUND SET done = TRUE;
OPEN schema_cursor;
schema_loop: LOOP
FETCH schema_cursor INTO table_name, image_col;
IF done THEN
LEAVE schema_loop;
END IF;
PREPARE stmt FROM CONCAT(
'UPDATE `', @table_name, '`',
' SET `', @image_col, '` = ''''',
' WHERE',
' `', @image_col, '` != '''' AND `', @image_col, '` IS NOT NULL',
' AND `', @image_col, '` NOT LIKE ''%facebook%''',
' AND `', @image_col, '` NOT IN (SELECT filename FROM fs_images);'
);
EXECUTE stmt;
DEALLOCATE PREPARE stmt;
END LOOP;
END$$
DELIMITER ;Context
StackExchange Code Review Q#44161, answer score: 2
Revisions (0)
No revisions yet.