2.16.2016

The Hard Drive Shuffle. Here today, here tomorrow.


We're in the season now. The jobs are flying through the door. And I'm watching the "gas gauges" on the hard drives starting to head toward zero. One of the hard drives on the system is a WD 2TB USB 2.0 drive that's delivered faithful service since 2008. As I ponder higher math I think that means we've been using it going on eight years. Pretty astounding. We've got one other that's nearly the same vintage. Never a sick day or complaint. But, at some point, a hard drive deserves to go into retirement...

I recently bought two more drives. They are WD, 4 TB USB 3 drives. I have high hopes for them as well. They'll join two earlier 4 TB hard drives and two, fast, Firewire 800, 7200, 2 TB G-Tech drives. I use the G-Techs for editing video projects.

When I retire a drive I make sure that critical files are backed up in two other places. Drives this old will have two different sets of DVDs as file back-up and, as I noticed when looking through the directories, a lot of the files are also duplicated between the two "more mature" hard drives.

The other thing I do is to print out the directory for each hard drive and attach it to the chassis. That way, if all other back-ups fail, I'll have a decent chance of finding older files on those rare client re-requests. To preemptively answer all my geek-readers, "yes, I know I am supposed to spin up the retired drives about once a month to keep them in working order!" Thank you.
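For anyone who wants to automate the printed directory, here is a minimal sketch in Python. It assumes the drive is mounted at some path you supply; the function name and the two-level depth limit are illustrative choices, not anything from my actual workflow.

```python
import os
from pathlib import Path

def list_directory_tree(root, max_depth=2):
    """Return indented text lines naming the folders under root.

    Only folders down to max_depth levels are listed, each with a count
    of the files it holds directly, so the printout stays short enough
    to tape to a drive chassis.
    """
    root = Path(root)
    lines = [str(root)]
    for current, dirs, files in os.walk(root):
        dirs.sort()  # in-place sort gives a stable, alphabetical walk
        depth = len(Path(current).relative_to(root).parts)
        if depth >= max_depth:
            dirs[:] = []  # emptying the list stops os.walk going deeper
        if depth > 0:
            name = Path(current).name
            lines.append("  " * depth + f"{name}/ ({len(files)} file(s))")
    return lines
```

Joining the returned lines with newlines and writing them to a text file gives something you can print. The path you pass in (a volume name, a drive letter) depends entirely on your own system.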

So, I currently have about 20 Terabytes of storage hanging off my machine and that should last me through 2016 but I wonder if there's not a better way to do this. I don't mean some big, RAID rack but something more elegant... Hypercube memory? What's out there that we'll be using in five years while we laugh derisively about those "old hard drives..." ????

On another topic: 

I have been using the Sigma 24-35mm f2.0 Art Lens on a D750 for the last two days and I must remark about two things. One is that the lens seems to be as sharp and nano-acuity saturated as one could have hoped for. That's a plus. Second; after using one inch and M4:3 cameras, and their attendant lenses, I have to remark that when used on a full frame camera, even at 24mm, there is not a whole lot of depth of field when you use the lens wide open at close quarters. I found myself dialing down the aperture more often than not. But that's because of the job. Left to my own devices I am looking forward to making some work with both forced perspective AND shallow depth of field. It's a whole new ballgame around here...

Having tasted the 24-35mm Art lens and the 50mm Art lens I understand the value proposition.

One more day at St. Gabriel's and then I get a little down time. Which I will fill with post production....





18 comments:

atmtx said...

I'm up to 17TB connected to my system. All individual external drives. I don't trust RAID, I've seen too many RAID failures.

Kirk, Photographer/Writer said...

Thanks for the corroboration. My experience as well. Lunch?

Herman said...

I do trust RAID, at least of the right type. You should have a RAID 5+1 setup in addition to the normal backups.

RAID is not a backup solution, it is just a more stable filesystem than a single drive.

You can get very nice small cubes these days that you can fill with drives to do this RAID thing. The working speed might surprise you. (though for video projects working on a large SSD will still be more pleasant).
(not cheap)

In any case the key message is still: backup.
Even at work (a datacenter with large multi-layer storage: fast disk <-> slow disk <-> tapes) we still back up, because you cannot trust running filesystems.

TMJ said...

I, too, like WD, and have just bought two external 5TB USB 3.0 drives. Very quick, inaudible, run cool. RAID is oversold for individuals/small enterprises. Having said that, I have a Synology NAS box running RAID 1, but just as a media server (backed up elsewhere).

cfw said...

Have you considered this?

http://www.southampton.ac.uk/news/2016/02/5d-data-storage-update.page

Wally said...

Check out FreeNAS for backup. Use old computers and the FreeNAS software to create quasi off-the-shelf storage. You do need to buy drives, but you can pick up an old PC chassis and power supply (600 watts is probably what you need) and roll your own. From the reviews, this looks like a cost-effective alternative to RAID systems.

Michael Meissner said...

I tend to view hard disk drives as consumables, where you have them for awhile until they get too small or fail, and you go on to the next year's drive. Now, me personally, all of my photos fit into a single drive, and as the drive fills up, I move on to the next generation which is generally much bigger.

And every couple of years the way the storage connects to the computer changes (i.e. SCSI, IDE, SATA, etc. or USB-2, USB-3, eSATA, etc.), and you need to start moving files while you can get both types on a single computer.

In terms of backups, you want to get into a way that you do the backup as part of your natural work flow. I have a master computer that holds all of the photos and videos, and I have copies on the other computers I work at. My work flow is to copy the changing files from the master location to the computer I'm working at, do any editing, and then copy it back. This way, if I decide in the middle of processing, that I want to take a completely different tack or I deleted files by accident, I just refresh the files from the master computer. As a programmer, I have scripted this so I don't have to manually think, get the files and put the files.
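A scripted "get the files / put the files" step along those lines might look like the following Python sketch. It is not the actual script described above, just one plausible shape for it: a one-way copy of files that are missing or newer at the source. The function name and the 2-second tolerance are assumptions (the tolerance allows for FAT-formatted drives, which store modification times at 2-second granularity).

```python
import shutil
from pathlib import Path

def sync_newer(src, dst, tolerance=2.0):
    """Copy files from src to dst when dst lacks them or holds an older copy.

    Returns the relative paths that were copied. Nothing is ever deleted
    from dst, so an accidental delete on one side cannot propagate.
    """
    src, dst = Path(src), Path(dst)
    copied = []
    for path in sorted(src.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(src)
        target = dst / rel
        # Copy when the target is absent or clearly older than the source.
        if (not target.exists()
                or path.stat().st_mtime - target.stat().st_mtime > tolerance):
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)  # copy2 also carries over the mtime
            copied.append(rel.as_posix())
    return copied
```

Calling it once with the master folder as `src` fetches the files, and calling it again with the arguments swapped puts the edited files back. Because it never deletes, refreshing from the master after a mistake still works.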

In terms of older jobs, only you can say how often clients come back to you wanting copies from the vaults. My guess is, in general 90% of the time people don't want the old stuff. Yes, it happens, but it may be only a few clients that need it spanning multiple years.

It is always useful to do what if type scenarios. For example, what if the drive you are working on crashes right now. Would you lose critical data? Now, for a single individual, it may be better to have backups done at specific phases of the job. If the disk crashed during editing, you could always just start over, and redo the edits (assuming you had the originals preserved). This is the main place where RAID helps you. RAID does not help if you delete files by accident, and it normally doesn't cover backup of old files.

The next scenario is what happens if your house/studio burns down. Assuming that family/studio dog are safe, but the computers are now molten slag, will you be able to finish whatever job you are working on with a new computer? This is where off-site backups that are done as a routine procedure help. It doesn't help if you only do backups on a monthly basis if you lose the files you have shot in the last 2 weeks. Here something like having 2 removable disks, each of which contains the files you are working on, helps. You have one disk on the computer, and sync the files to that disk. As part of your routine, you take the disk out, and store it off site, replacing it with the disk you have off site, and put that disk in the computer. To be even more paranoid, you might have 5 disks that you rotate in order. Storing it off site might mean just keeping it in your car, at your house if you have a separate studio, or it could mean a bank safe deposit box.

Beyond that, you get into what happens if you have an area wide disaster. It may be that if a hurricane comes through and destroys most of Austin, that it won't matter if you don't have files from some client from 2 years ago. Sometimes you can get too paranoid about what if scenarios.

The counter argument is sometimes what you are saving is valuable to future generations. One of the stories that floated around after the terrorist attacks on 9/11 was Jacques Lowe who had taken many photos over the years of JFK, and had stored his work in the underground JP Morgan bank vaults at the World Trade Center (http://www.theguardian.com/artanddesign/photography-blog/2013/sep/27/john-f-kennedy-jacques-lowe-photography).

To paraphrase the old saw about Chicago politics, backup early, backup often. However, you want a balance, so that you aren't dominated by the what if scenarios.

Michael Meissner said...

Note, I had to cut the following paragraph off of my previous posting due to size:



Depending on your circumstances, offsite backups via the internet/cloud may be useful. However, I tend to think it doesn't scale all that well. If your internet provider has a cap on the bandwidth and you start paying a lot more if you go over the cap, you don't want to be copying a lot of files to your backups. At one point, my home internet was done via a cell phone internet provider, and if I went over 5GB a month, they started charging. I had to be more sparing in what I copied to my external web server during that time period (which I use as my ultimate backup). Even if you don't have a cap on bandwidth, if you come home from a job and have shot say 32 gigabytes, it may take a lot of time to copy that via the internet.
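To put rough numbers on that, here is a small Python sketch of the arithmetic. The 32 GB shoot and the 5 GB monthly cap come from the paragraph above; the upload speed you pass in is an assumption you would replace with your own measured figure, and the function names are made up for illustration.

```python
import math

def upload_hours(size_gb, upload_mbps):
    """Hours needed to send size_gb gigabytes at upload_mbps megabits/second.

    Uses decimal units (1 GB = 1000**3 bytes) and ignores protocol
    overhead, so real transfers will take somewhat longer.
    """
    bits = size_gb * 8 * 1000**3
    seconds = bits / (upload_mbps * 1000**2)
    return seconds / 3600

def months_under_cap(size_gb, cap_gb_per_month):
    """Whole months needed to move size_gb without exceeding a monthly cap."""
    return math.ceil(size_gb / cap_gb_per_month)
```

At an assumed 5 Mbps uplink, `upload_hours(32, 5)` comes out to a little over 14 hours, and `months_under_cap(32, 5)` says a single 32 GB job would take 7 months to trickle out under a 5 GB cap.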

PittsburghDog said...

I've been using a Synology Diskstation since 2007. We upgraded to a 4 drive model in 2011 and it has worked like a champ. We run in a RAID 5+1 for protection. We've had several disk failures, but it has never been a problem, because we just pull out the defective drive, pop in the new one and it rebuilds the volume. You can even use it while it is rebuilding. We also have external HDDs that we use to always have backups offsite. With a NAS it is very easy to expand your storage by simply taking out an old hard drive (i.e. 2TB) and replacing it with something larger.

The Synology is compatible with Time Machine backup and those with 4 or more drives are very fast. The company has been around a long time so their software is very stable. We use it for our business, as an iTunes server, for streaming movies to our AppleTV and various other uses. I sound like a salesman, but like most Apple products, it just works. I'm not a systems admin or a "tech" guy. The Synology does most of the work, which is good because I don't have time to mess around with it.

MikeR said...

RAID 1 (mirroring) or RAID 5 (parity) is intended to keep your system running even if a disk fails. A backup is still necessary. Either daily, or continuous.

Archival backup drives should be in the native file system. You don't want the good stuff to be held hostage by any vendor's backup software.

My corporate IT background bleeds over into the way I back up my home computers. I use a multi-layer approach: an attached external 3 TB drive as target for daily backup software. NAS storage from two different makers, for critical directory trees. A rotating set of portable drives, one always off-site.

Kirk in PDX said...

Just a quick note, RAID 5 is no longer considered a best practice. With the drive sizes we have now (2TB and up), the chance of a failed rebuild goes way up (http://www.zdnet.com/article/has-raid5-stopped-working/). Best practice is RAID 10, but that doesn't replace backups. A disk array seems unnecessary in this case.
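The usual back-of-the-envelope version of that rebuild argument can be sketched in a few lines of Python. It assumes an unrecoverable read error rate of 1 per 10^14 bits, a figure commonly quoted on consumer drive spec sheets and not one stated in this comment, and it treats errors as independent, which real drives may not obey. Take it as a rough illustration rather than a prediction.

```python
import math

def rebuild_error_probability(surviving_drives, drive_tb, ure_per_bit=1e-14):
    """Chance of at least one unrecoverable read error during a rebuild.

    A RAID 5 rebuild must read every surviving drive end to end, so the
    number of bits read grows with both drive count and drive size.
    """
    bits_read = surviving_drives * drive_tb * 1e12 * 8
    # 1 - (1 - p)**n, computed via log1p/exp to stay accurate for tiny p.
    return 1.0 - math.exp(bits_read * math.log1p(-ure_per_bit))
```

Under those assumptions, a 4-drive array of 2 TB disks (three survivors to read) gives `rebuild_error_probability(3, 2)` of roughly 38%, and the figure climbs as the drives get larger.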

The rule of thumb I use is that for any data you don't want to lose, you need to have at least three copies of it. Working data, a backup and an off-site backup (to prevent loss to fire, flood or theft, etc.).

rexdeaver said...

Is 360 TB enough?

http://www.diyphotography.net/360tb-disc-that-lasts-13-8-billion-years-is-this-the-future-of-data-storage/

Joe said...

A RAID 10 stack would be optimum, more TB, faster, and generally more likely to avoid catastrophic failure and data loss than single drives.

Older forms of RAID are more troublesome, but RAID 10 or RAID 1 run disks in parallel, so that data is always written to two totally independent disk sets. Now that large capacity hard disks are so inexpensive, there's little reason to avoid using that storage level redundancy.

RAID 10 is now easy to build with top-quality bare RAID 10 enclosures costing a few hundred to several hundred dollars, plus the cost of the WD drives. (Their new Red series, especially the Enterprise grade drives, seem a very good blend of speed and long-term reliability.)

I've been building networks and file servers (including photo-only storage) since 1990, and this seems to be the best way to go at this time.

Also, direct external SATA connections (eSATA) for any drive tend to be a lot faster when moving large chunks of data than USB 3, which tends to bog down pretty quickly. Most mid-to-upper end PCs now have eSATA connections on their rear, and occasionally front, panels into which an eSATA connection to a RAID enclosure can be made.

eSATA even works well for single drives used to back up data and then take the backup off-premises. I've timed making the same 2TB photo backup with the same WD drive and drive attachment and the eSATA connection to the same drive finishes about 3-4 times faster than when using the USB 3 connection approach.

Dave V said...

I am trying to be disciplined about offsite backup to a disk that is a DIFFERENT brand than the main backup. (My techie friend says that this is best practice.)

I am considering renting a safety deposit box. For less than $30 I can get a 3x5x22 at a local credit union. That would hold a lot of properly labelled thumb drives/SD cards.

On the output side of the equation I have begun reserving 14 bit raw for circumstances where I need to do a lot of pushing. Candidates for 14 bit raw include sunsets where I want to crank the shadows so that I can see texture in the rocks. The difference between 14 and 12 bit raw is very, very subtle, but the file size really adds up on a hard disk. Most studio stuff looks great with the jpegs right out of the camera.

Robert Hudyma said...

Word of caution: a colleague of mine configured a NAS (network attached storage) device and stored all his photos on it. It was a RAID configuration so he felt safe. Then the power supply failed (it was a Dell server), it failed in such a way that it burned out all his hard drives with an over-voltage. All the drives were fried.

He was devastated, and didn't have a huge fortune to try and recover the drives (I would have tried to get replacement circuit boards for each drive to see if that worked but that is a crap shoot at best).

Multiple copies, off-site, and testing that your back-ups really work is the only way to go.
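One way to test that a back-up really works is to compare checksums of the originals against the copies. The Python sketch below assumes the back-up is a plain folder copy in the native file system (not a vendor archive format); the function names are illustrative.

```python
import hashlib
from pathlib import Path

def file_digest(path, chunk_size=1024 * 1024):
    """SHA-256 of a file, read in chunks so large raw files fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def verify_backup(original, backup):
    """List files under original that are missing or different in backup.

    Returns (relative_path, reason) tuples; an empty list means every
    original file has a byte-identical copy in the backup.
    """
    original, backup = Path(original), Path(backup)
    problems = []
    for path in sorted(original.rglob("*")):
        if not path.is_file():
            continue
        rel = path.relative_to(original)
        copy = backup / rel
        if not copy.is_file():
            problems.append((rel.as_posix(), "missing"))
        elif file_digest(path) != file_digest(copy):
            problems.append((rel.as_posix(), "differs"))
    return problems
```

Reading every byte of both copies is slow on terabyte drives, but that is the point: it proves the back-up drive can actually deliver the files, not just that it lists them.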

I have synchronized computers at home and at work and if either fails I can clone the drives in about 6 hours. I also have external drives that have everything and that is updated monthly. Again identical copies at home and at work.

Edward Richards said...

Seagate is now selling an 8TB drive designed for backup and cloud storage use for less than $220. I have one running and it works fine. These can reduce the size of your drive herd. I use Crashplan for cloud backup. You can buy unlimited storage for a reasonable amount. It takes a while to get things uploaded, but like rust, it runs in the background and gets the job done. Since most of your content is stable through time, it should not matter how long it takes to load. If you ever need to recover the data, Crashplan will load it on a hard drive and overnight that to you for a reasonable fee.

Unknown said...

it's a very bad idea to rely on just one device

it's best to keep it simple, within budget, and use a tiered back-up method

tier 1: something like a WD My Book Duo (set up as RAID 1, mirrored) as the primary external disk, always attached to the laptop/desktop. Have a 3rd identical bare disk on hand in case a single disk fails. this provides a right-now level of protection. for those on a deadline, this is a reasonable level of protection. simpler and cheaper than RAID 5 or RAID 10. RAID 1 reads quickly (writes run at roughly single-disk speed), is cheaper, and provides real-time data protection

tier 2: a hard disk shuffle is a good plan

tier 3: maybe DVD or cloud? both DVDs and the cloud can fail. If the internet connection goes belly up, no access to the cloud.

tier 4: additionally, use an uninterruptible power supply (UPS) on any external disks and desktop. Power fluctuations can be just as bad. Bad power = data corruption = totally lost data.

no matter what plan is used, sometimes bad things happen to good data.