Sunday, September 16, 2007

Duplicate file finder

Oh no, not another duplicate file finder! There are already hundreds of these, and still I had to hack together my own, since I couldn't find a single free one that kept hashes in a database, both to make subsequent searches quick and to cope with lots of files. So I wrote dupito (stupid name, I know). It uses SQL Server Compact Edition to store and query the hashes (SHA-512). I tried SQLite at first, but it was way too slow... and since I used Castle ActiveRecord, switching to SQL Server CE was just a matter of configuration.

Anyway, usage is as follows: when called without arguments, it indexes files in the current directory and subdirectories, then prints a list of duplicate files.
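To give an idea of what the default run does, here is a minimal sketch in Python. It is not dupito's actual code: the table layout and the names `sha512_of`, `open_db`, `index_tree` and `list_duplicates` are made up for illustration, and it uses Python's built-in `sqlite3` module only because it ships with the standard library (not because SQLite turned out to be a good fit for me). It also assumes every file is readable and that the database file itself does not live under the indexed directory.

```python
import hashlib
import os
import sqlite3


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def open_db(db_path):
    """Open (or create) the hash database. The schema is an assumption."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files (path TEXT PRIMARY KEY, hash TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX IF NOT EXISTS idx_files_hash ON files (hash)")
    return conn


def index_tree(conn, root):
    """Hash every file under root that is not in the database yet."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.abspath(os.path.join(dirpath, name))
            known = conn.execute(
                "SELECT 1 FROM files WHERE path = ?", (path,)
            ).fetchone()
            if known is None:
                conn.execute(
                    "INSERT INTO files (path, hash) VALUES (?, ?)",
                    (path, sha512_of(path)),
                )
    conn.commit()


def list_duplicates(conn):
    """Return one list of paths per hash that occurs more than once."""
    rows = conn.execute(
        "SELECT hash, path FROM files WHERE hash IN "
        "(SELECT hash FROM files GROUP BY hash HAVING COUNT(*) > 1) "
        "ORDER BY hash, path"
    ).fetchall()
    groups = {}
    for file_hash, path in rows:
        groups.setdefault(file_hash, []).append(path)
    return list(groups.values())
```

Because already-known paths are skipped, a second run over the same tree only hashes new files, which is the whole point of keeping the hashes in a database.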

Command-line arguments:

  • c: cleans up database, deleting rows that reference nonexistent files
  • r: rehashes all files in database 
  • l: lists duplicate files currently in database
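
The c and r options could look roughly like this in Python. Again, this is only a sketch and not dupito's real implementation: the `files(path, hash)` table and the function names `clean` and `rehash` are assumptions, and skipping files that have disappeared during a rehash is my guess at sensible behavior.

```python
import hashlib
import os
import sqlite3


def sha512_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large files are never fully loaded into memory."""
    digest = hashlib.sha512()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def clean(conn):
    """'c': delete rows whose file no longer exists; return how many were removed."""
    paths = [row[0] for row in conn.execute("SELECT path FROM files")]
    missing = [(p,) for p in paths if not os.path.exists(p)]
    conn.executemany("DELETE FROM files WHERE path = ?", missing)
    conn.commit()
    return len(missing)


def rehash(conn):
    """'r': recompute the hash of every file still on disk (missing ones are skipped)."""
    paths = [row[0] for row in conn.execute("SELECT path FROM files")]
    for p in paths:
        if os.path.isfile(p):
            conn.execute(
                "UPDATE files SET hash = ? WHERE path = ?", (sha512_of(p), p)
            )
    conn.commit()
```

Running `clean` before `rehash` keeps the database from accumulating rows for files that were moved or deleted since the last indexing run.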

And here it is.

(As a side note, the exe is pretty big because I merged in Castle.ActiveRecord.dll and all of its dependencies...)
