Close

October 26, 2012

Magento Orphan File Cleaner

While working on a client’s Magento CE cart, we noticed that their diskspace had jumped astronomically over the past few weeks. Coincidentally, this started right around the time our C# dot net SOAP Magento product feed started running and had a few “bumps” a long the way. Apparently, when the Dot Net service failed its attempt and uploading images, those images were still stored in the Magento /media/catalog/product folder. Orphaned Magento product images! Gradually this built up over time and we amassed close to 2GB of unused images!

There was a plugin called Image clean (Here), but everytime we tried to install it on Magento CE 1.6.0, it would cause our Admin interface to die a horrible death.

After tracking down a few other script ideas, we came up with our own orphan file cleaner. Essentially, the script just crawls the media/catalog/product directory recursively, and tries to match up each image to their respective meta data tags in the database. A few posts said that this should be found in the “catalog_product_entity_varchar” table, but we also found images in the “catalog_product_entity_media_gallery ” table. Our script tries to look for it in either of those tables and if it is not found, it deletes the image from the system.

***** Warning *****, this should ONLY be run on your development or TEST server. Never ever run anything that removes files from a production server without first testing it.

The script worked very well for us and was able to cut our image store in half. We look forward to any comments or tweaks that anyone may have to make it better. Hope it helps!

<?php
 
$root_folder='/home/mehome/www/media/catalog/product';
$host        =    "localhost";
$user        =    "mehome_prod";
$pass        =    "mepassntrealofcourse";
$db    =    "mehome_prod";
 
    $con=mysql_connect($host,$user,$pass) ;
	mysql_select_db($db);
 
 
//imageFound:  Efficient?  Heck no.  Does it work well in a pinch?  Yes.
function imageFound($image){
	global $con;
	$sql="SELECT count(*) as tcnt FROM catalog_product_entity_media_gallery WHERE upper(value) = upper('".$image."')";
	if(!$result = mysql_query($sql,$con)){
		die($sql." FAILED".mysql_error());
	}
	$row = mysql_fetch_assoc($result);
	$count=$row['tcnt'];
	if($count>0) return true;
 
	$sql="SELECT count(*) as tcnt FROM catalog_product_entity_varchar WHERE upper(value) = upper('".$image."')";
	if(!$result = mysql_query($sql,$con)){
		die($sql." FAILED".mysql_error());
	}
	$row = mysql_fetch_assoc($result);
	$count=$row['tcnt'];
 
	if($count>0) return true;
	else{
		return false;
	}
}
 
 
//Get our RDI for our directory.
$it = new RecursiveDirectoryIterator($root_folder);
 
//Keep going: matching files and removeing non-matched files.
foreach(new RecursiveIteratorIterator($it) as $file) {
	//strip the file path, so we can match it up in Magento
	$cleanfile=str_replace($root_folder,"",$file);
	if(imageFound($cleanfile)==true){
	}else{
		echo "\n Ruh, roh.  Not found!";
		echo $cleanfile . "## REMOVEING ### \n";
		unlink($file);
	}
}