Aug 28, 2011

Removing unnecessary files using PowerShell

Nowadays many people (including me) save a whole heap of files on their hard disks: Word documents, Excel sheets, PowerPoint presentations, pictures taken, pictures received from others, white papers (sometimes these are Word documents) and code files (including SQL). Then you have Acrobat Reader files, e-books, video and audio files. Most of them take up a lot of space. On top of that, I also keep multiple versions of them.

Managing the files is a problem. Who likes the idea of losing files to hardware corruption? A simple solution is to take backups, and external hard disks are a cheap and handy way to do that.
I am not sure how this works for you, but for me it is a problem. I have downloaded the same documents multiple times, and each time they have gone into a different location. I have also taken backups at different stages; sometimes I removed the files from my hard disk afterwards and sometimes I didn't. As a result I now have the same files in multiple backup folders and zip files, and it is a painful process to identify them one by one and delete the unnecessary ones.
So I wrote a PowerShell script which goes through all my backups and identifies the identical files.
I hope this is useful to you.

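$path = "l:\complete"
# Take files only (no folders) from every subfolder, sorted so that equal names sit next to each other
$d = get-childitem $path -recurse | where-object { -not $_.PSIsContainer }
$dd = @($d | sort-object Name,DirectoryName)
$j =0
$totalspace =0
for ($i=0; $i -lt $dd.Length-1; $i++)
{
    # Same name and same size: treat the two neighbours as a possible duplicate pair
    if (($dd[$i].Name -eq $dd[$i+1].Name) -and ($dd[$i].Length -eq $dd[$i+1].Length))
    {
      $line = "FC /B /LB1  """ +$dd[$i].FullName + """  """+ $dd[$i+1].FullName+"""  >>L:\Comparefiles.txt"
      $line                              # print the command so it can be copied into a .bat file
      $j++                               # count the pairs found
      $totalspace += $dd[$i].Length      # space taken by the extra copy
    }
}
 "FilesFound "+ $j+ "; totalSpace "+ $totalspace 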

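The loop above only compares neighbours in the sorted list. As a rough alternative sketch (not what I originally ran), Group-Object can collect every file that shares a name and a size into one group, which also covers three or more copies of the same file. The function name Find-DuplicateCandidate is made up for illustration.

```powershell
# Hypothetical helper (the name is illustrative, not from the original script).
# Returns one group per name/size combination that occurs more than once.
function Find-DuplicateCandidate {
    param([string]$Path)

    Get-ChildItem $Path -Recurse |
        Where-Object { -not $_.PSIsContainer } |   # files only
        Group-Object Name, Length |                # same name and same size
        Where-Object { $_.Count -gt 1 }            # keep groups with 2+ files
}
```

Calling Find-DuplicateCandidate "l:\complete" would return the groups; the files of each group are in its Group property. Same name and size still only makes them candidates, so a content check such as FC is needed before deleting anything.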
First I compared the files using FC (the File Compare utility that comes with Windows). Later, with a slight modification, I got rid of one file of each pair by moving it. MOVE /Y overwrites the second file with the first, so only one copy is left. It needs only one change in the script: change the line that calls FC so that it calls MOVE instead:
 $line = "MOVE /Y """ +$dd[$i].FullName + """  """+ $dd[$i+1].FullName+""""
The results were copied into a .bat file (a text document with the .bat extension) and, after removing the lines for the pairs where I needed both files, the .bat file was executed.
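If you would rather skip the .bat file, the check and the move can be done in PowerShell itself. The following is only a sketch under a few assumptions: it needs PowerShell 4.0 or later (for Get-FileHash), it compares SHA256 hashes rather than using FC, and it moves the extra copy into a separate folder instead of overwriting the other file. The function name Move-IfIdentical and its parameters are made up for illustration.

```powershell
# Hypothetical helper (name and parameters are illustrative).
# Moves $Duplicate into the folder $Quarantine only when both files have the same hash.
# Assumes PowerShell 4.0+ because of Get-FileHash.
function Move-IfIdentical {
    param(
        [string]$Keep,        # file that stays where it is
        [string]$Duplicate,   # file that is moved when it is identical
        [string]$Quarantine   # existing folder that receives the duplicate
    )

    $h1 = (Get-FileHash -LiteralPath $Keep -Algorithm SHA256).Hash
    $h2 = (Get-FileHash -LiteralPath $Duplicate -Algorithm SHA256).Hash

    if ($h1 -eq $h2) {
        Move-Item -LiteralPath $Duplicate -Destination $Quarantine
        return $true    # contents matched, file was moved
    }
    return $false       # contents differ, nothing was touched
}
```

Moving into a quarantine folder rather than deleting leaves room to undo a mistake. Move-Item reports an error if a file with the same name is already in that folder, so each duplicate may need its own subfolder or a new name.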
You can further improve the code in various ways:
1.  You can restrict the search to the files you need with parameters and filters:
$d = get-childitem $path -include *.PDF -recurse | where-object {$_.Length -gt 1000000}
2.  You can add further conditions to the comparison. To see which attributes are available, you can try:
$d[0] |select *
When I tried it, this is the answer I got:
PSPath            : Microsoft.PowerShell.Core\FileSystem::L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD\1_SQL2K5 Troubleshooting guide.doc
PSParentPath      : Microsoft.PowerShell.Core\FileSystem::L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD
PSChildName       : 1_SQL2K5 Troubleshooting guide.doc
PSDrive           : L
PSProvider        : Microsoft.PowerShell.Core\FileSystem
PSIsContainer     : False
VersionInfo       : File:             L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD\1_SQL2K5 Troubleshooting guide.doc
                    Debug:            False
                    Patched:          False
                    PreRelease:       False
                    PrivateBuild:     False
                    SpecialBuild:     False
BaseName          : 1_SQL2K5 Troubleshooting guide
Mode              : -a---
Name              : 1_SQL2K5 Troubleshooting guide.doc
Length            : 44032
DirectoryName     : L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD
Directory         : L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD
IsReadOnly        : False
Exists            : True
FullName          : L:\Complete\E\Z\Other Useful Docs\SQL2K5 Install Docs\OLD\1_SQL2K5 Troubleshooting guide.doc
Extension         : .doc
CreationTime      : 29/9/09 01:56:28 PM
CreationTimeUtc   : 29/9/09 08:26:28 AM
LastAccessTime    : 14/10/09 06:55:56 AM
LastAccessTimeUtc : 14/10/09 01:25:56 AM
LastWriteTime     : 27/4/07 10:04:00 PM
LastWriteTimeUtc  : 27/4/07 04:34:00 PM
Attributes        : Archive
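Any property in this listing can be added to the comparison. As a small sketch, assuming you also want the last write time to match, the check could be wrapped in a helper like the one below. The function name Test-SameFileInfo is made up for illustration.

```powershell
# Hypothetical helper (the name is illustrative).
# Compares two file objects on name, size and last write time (UTC).
function Test-SameFileInfo {
    param($First, $Second)

    ($First.Name -eq $Second.Name) -and
    ($First.Length -eq $Second.Length) -and
    ($First.LastWriteTimeUtc -eq $Second.LastWriteTimeUtc)
}
```

In the loop, the if condition could call this helper with the two neighbouring files in place of the two -eq checks. The stricter test will skip copies that were downloaded again at a later time, since those usually carry a different write time.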

I hope this post helps you to manage your files too.
