Friday, January 04, 2008

Photo organisation

I have a system for storing my photos that I'm largely pleased with: I simply put them in folders whose names sort alphanumerically by date, e.g. "2007-03-22 Visit to Spain".
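A quick sketch of why this naming works: because the year, month and day are zero-padded and run from most to least significant, a plain string sort is also a date sort. Only the first folder name below is real; the other two are made up for illustration.

```perl
use strict;
use warnings;

# Folder names in the "YYYY-MM-DD Description" style.
# The second and third names are hypothetical examples.
my @folders = (
    "2007-03-22 Visit to Spain",
    "2006-12-25 Christmas",
    "2007-01-01 New Year",
);

# A plain string sort puts them in date order, with no date parsing needed.
my @sorted = sort @folders;

print "$_\n" for @sorted;
```

This is the same ordering any file browser gives when it sorts by name.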

However, with 400 such folders containing almost 20,000 photos, plus a small number of folders left over from my old system, like "Beach" or "Kids Photos" or even "Unsorted", it was time to do something about it.

I like the Cygwin toolset, which is basically a full Unix toolset (grep, vi, sed, etc.) for Windows. It also comes with Perl, and a program called 'exif'. The exif program prints all the metadata for an image to the console, as shown at the bottom of this post. That gives me the date and time each photo was taken, so I can rename by date and sort by time.
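To make the date extraction concrete, here is a minimal sketch that pulls the date and time out of one line of exif output. The helper name parse_exif_date is made up for this example, and the sample line is modelled on the output at the bottom of this post; the pattern does not care whether a column separator sits between the tag and the value.

```perl
use strict;
use warnings;

# Hypothetical helper: find the first "Date and Time" line in exif output.
# Returns ( "YYYY-MM-DD", "HH:MM:SS" ), or an empty list if nothing matches.
sub parse_exif_date {
    my ($exif_text) = @_;
    my ( $date, $time ) =
      $exif_text =~ /Date and Time.*?(\d{4}:\d\d:\d\d)\s(\d\d:\d\d:\d\d)/;
    return unless defined $date;
    $date =~ tr/:/-/;    # 2007:05:03 becomes 2007-05-03
    return ( $date, $time );
}

# Sample line modelled on the exif output shown in this post.
my ( $date, $time ) =
  parse_exif_date("Date and Time (origi|2007:05:03 19:43:16\n");
print "$date $time\n";
```

Swapping the colons for dashes gives a date that is both a legal folder name on Windows and sorts correctly as text.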

I wrote a small Perl script to handle this for me, and now I can quickly take a set of photos in a folder called "Kids Photos" and have the script drop them into one folder per day. Then I can rename the folders however suits. Should have done this a long time ago.

After renaming all my non-conforming photos and folders, I made top-level folders called 2002, 2003... etc. to limit the number of folders in the top level. It's an awful lot tidier now.
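The year grouping is simple enough to sketch as well. This is only an illustration of the idea, not something I ran: the helper name year_folder_for is invented here, and the loop just prints where each folder would go rather than moving anything.

```perl
use strict;
use warnings;

# Hypothetical helper: map a date-named folder to its top-level year folder.
# Returns undef for names that do not start with YYYY-MM-DD.
sub year_folder_for {
    my ($folder) = @_;
    return $folder =~ /^(\d{4})-\d\d-\d\d\b/ ? $1 : undef;
}

# Print-only plan: say where each folder would go, move nothing.
foreach my $folder ( "2007-03-22 Visit to Spain", "Kids Photos" ) {
    my $year = year_folder_for($folder);
    if ( defined $year ) {
        print "$folder -> $year/$folder\n";
    }
    else {
        print "$folder -> (left alone, no date prefix)\n";
    }
}
```

Folders without a date prefix are deliberately skipped, so anything still in the old naming scheme stays put until it has been renamed.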

Disclaimer - if you find anything useful here, that's great, but it's all at your own risk. If you don't understand it, don't run it! Back up your photos before you start ;)

Script - should anyone want it:



# always
use strict;
use warnings;
use Getopt::Long;
use File::Copy qw(move);

# -p: print what would be done without creating or moving anything
my $print_only = 0;
GetOptions( "p" => \$print_only );
if ($print_only)
{
    print STDERR "Running in print only mode.\n";
}

# first scan all data
my %file_data;
my %dates;

print STDERR "Reading file info...";
foreach my $file (@ARGV)
{
    next if ( $file !~ /\.jpg$/i );
    print STDERR ".";

    # list-form pipe open, so odd characters in file names never reach a shell
    open( my $exif_fh, '-|', 'exif', $file )
      or die "cannot run exif on $file: $!";
    my $exif = do { local $/; <$exif_fh> };
    close($exif_fh);
    $exif = '' if !defined $exif;

    # the "Date and Time" line holds YYYY:MM:DD HH:MM:SS
    my ( $date, $time ) =
      $exif =~ /Date and Time.*?(20\d\d:\d\d:\d\d)\s(\d\d:\d\d:\d\d)/;
    if ( !$date )
    {
        print STDERR "\nNot enough exif info for file $file\n";
        next;
    }

    # replace : with -
    $date =~ s/:/-/g;

    # store the file data for later
    $file_data{$file}->{'date'}      = $date;
    $file_data{$file}->{'time'}      = $time;
    $file_data{$file}->{'date_time'} = $date . " " . $time;
    $dates{$date}++;
}
print STDERR "Done\n";

# create one directory per date
if ( !$print_only )
{
    foreach my $date ( keys %dates )
    {
        next if -d $date;
        mkdir($date) or die "cannot create directory $date: $!";
    }
}

# move files in shooting order; the counter runs across the whole batch
my $counter = 0;
foreach my $file (
    sort { $file_data{$a}->{'date_time'} cmp $file_data{$b}->{'date_time'} }
    keys(%file_data)
  )
{
    $counter++;

    my $date        = $file_data{$file}->{'date'};
    my $print_count = sprintf( "%03d", $counter );
    my $target      = "$date/$date $print_count.jpg";
    if ( -e $target )
    {
        die "cannot overwrite target $target";
    }

    print "mv '$file' '$target'\n";
    if ( !$print_only )
    {
        move( $file, $target ) or die "cannot move $file to $target: $!";
    }
}
__DATA__


Exif example output:


EXIF tags in '2007-05-03 Coastal Drive 001.jpg' ('Intel' byte order):
--------------------+----------------------------------------------------------
Tag                 |Value
--------------------+----------------------------------------------------------
Manufacturer        |KONICA MINOLTA CAMERA, Inc.
Model               |DiMAGE G400
Orientation         |top - left
x-Resolution        |72.00
y-Resolution        |72.00
Resolution Unit     |Inch
YCbCr Positioning   |centered
Compression         |JPEG compression
Orientation         |top - left
x-Resolution        |72.00
y-Resolution        |72.00
Resolution Unit     |Inch
Exposure Time       |1/13 sec.
FNumber             |f/4.7
ISO Speed Ratings   |50
Exif Version        |Exif Version 2.2
Date and Time (origi|2007:05:03 19:43:16
Date and Time (digit|2007:05:03 19:43:16
ComponentsConfigurat|Y Cb Cr -
Compressed Bits per |3.40
Brightness          |31/10
Exposure Bias       |0.0
MaxApertureValue    |3.00
Metering Mode       |Center-Weighted Average
Light Source        |0
Flash               |Flash did not fire, compulsatory flash mode.
Focal Length        |5.6 mm
Maker Note          |688 bytes unknown data
FlashPixVersion     |FlashPix Version 1.0
Color Space         |sRGB
PixelXDimension     |2272
PixelYDimension     |1704
Custom Rendered     |Normal process
Exposure Mode       |Auto exposure
White Balance       |Auto white balance
Digital Zoom Ratio  |0.00
Focal Length In 35mm|34
Scene Capture Type  |Night scene
Gain Control        |Normal
Contrast            |Normal
Saturation          |Normal
Sharpness           |Normal
Subject Distance Ran|Unknown
InteroperabilityInde|R98
InteroperabilityVers|
--------------------+----------------------------------------------------------
EXIF data contains a thumbnail (3168 bytes).
