The amount of data stored on file servers continues to grow at an alarming rate, more than 20% per year according to analysts. Users generate more files, save larger files, download more files, and keep email attachments indefinitely. In the past, when disk space was constrained, users spent time housekeeping and disposing of unneeded files. Now that most desktops ship with 320 GB or larger drives, users have little time or motivation to clean up the network storage they use. Even though hardware storage costs per gigabyte keep dropping, the cost of managing storage is skyrocketing.
The result: most organizations have dealt with this explosive growth by adding expensive disk options to their network storage devices (e.g. NetApp, Dell, EMC). When budgets were loose, it was a simple, no-brainer solution: call the vendor and order an upgrade whenever space was depleting faster than expected.
Welcome to the “new normal”. Budgets are tight. Organizations have been forced to get creative: manually managing their storage devices to make room for new data, cleaning up existing files, and relocating files to other, cheaper locations.
Every organization is making do with less: less money, fewer people. The “new normal” also means more storage emergencies: running out of space, failed backups, devices reaching end of life, hardware crashes, and more.
Automatic storage upgrades used to be the norm. Buy more storage every year, change out the box to the latest-greatest every three years, and buy bigger backup capabilities as needed.
Now there are a number of storage options, both hardware and software, that let organizations handle storage growth at significantly lower cost than their previous choices.
The reality of data: age and value
Think about it: the reason storage growth is a problem is that you are saving all those user files that users believe they need instant access to… forever. Year after year the data keeps piling up.
When a user leaves your organization, a manager or colleague saves that user’s entire collection of files into a folder, just in case someone someday needs a critical piece of information buried in their “stuff”. This consumes enormous amounts of space quickly. In an organization with ordinary employee turnover and reasonably sized My Documents folders (the author of this paper has accumulated 100+ GB over 5+ years), terabytes, even petabytes, of orphaned data pile up faster than you would believe.
Then there are those “important” shared folders: old financial data, old projects, marketing graphics, contracts, diagrams, and so on that must be kept “just in case” and “when we need them”. You may also have data you must retain indefinitely for regulatory reasons. These files have real value and need to be easy to find, but they do not need to sit on expensive storage.
More reality of data: garbage and junk
File servers are also cluttered with useless files and data that has nothing to do with the organization:
• Five+ years of annual memos… for each employee
• iTunes libraries
• Downloaded videos
• Copies of programs and drivers – for obsolete hardware
• Copies of copies of copies of copies…
How can you clean up your users’ bad habits? Do you have time to browse directories and write a few scripts to recover some space? Are you backing up the junk over and over again? Without an automated solution, there is no practical way to enforce file retention policies.
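Even before any tooling is in place, a few lines of scripting can show the scale of the problem. The sketch below is purely illustrative: the share path, the extension list, and the five-year threshold are assumptions to adjust for your own environment.

```python
import os
import time

# Assumptions for illustration only: change the share path and the
# extensions you consider "junk" to match your environment.
SHARE_ROOT = r"\\fileserver\users"        # hypothetical UNC path
JUNK_EXTENSIONS = {".mp3", ".mp4", ".avi", ".iso"}
OLD_THRESHOLD = 5 * 365 * 24 * 3600       # ~5 years, in seconds

junk_bytes = 0
old_bytes = 0
now = time.time()

for dirpath, _dirnames, filenames in os.walk(SHARE_ROOT):
    for name in filenames:
        path = os.path.join(dirpath, name)
        try:
            st = os.stat(path)
        except OSError:
            continue                      # skip files we cannot read
        if os.path.splitext(name)[1].lower() in JUNK_EXTENSIONS:
            junk_bytes += st.st_size
        if now - st.st_mtime > OLD_THRESHOLD:
            old_bytes += st.st_size

print(f"Junk file types: {junk_bytes / 2**30:.1f} GB")
print(f"Files unmodified for 5+ years: {old_bytes / 2**30:.1f} GB")
```

A one-off script like this can reveal the problem, but it cannot enforce a policy night after night.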
What you don’t know… you don’t know
If you can’t see the problem, does it really exist?
Yes, it does, and it’s bigger than you can imagine. There are millions and millions of files on your file servers, and you have little idea what they are, who put them there, how long they need to be kept, or whether they can be deleted or moved.
How can you make decisions about something you can’t really see in detail?
It’s unbelievable how many files on production file servers:
• are more than five years old
• can no longer be used with existing hardware or software
• are still faithfully backed up, over and over
Emotional retention versus reality
Users “need” full access to every last file, instantly and without searching… just in case.
If you could monitor which files a user accesses by the age of those files, you would see a graph similar to this:
[Chart: File Access Frequency by Age of File in Years]
The older a file is, the less it gets accessed. Our studies show that less than 0.1% (one tenth of one percent) of files older than one year are ever accessed more than once.
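You can approximate this view on your own servers by bucketing files by the time since they were last accessed. A rough sketch follows; the share path is a placeholder, and note that last-access timestamps are only as good as your servers' atime tracking, which is often disabled or relaxed by default.

```python
import os
import time
from collections import Counter

SHARE_ROOT = r"\\fileserver\shared"    # hypothetical path, adjust as needed
YEAR = 365 * 24 * 3600
now = time.time()
buckets = Counter()                    # years since last access -> bytes

for dirpath, _dirs, files in os.walk(SHARE_ROOT):
    for name in files:
        try:
            st = os.stat(os.path.join(dirpath, name))
        except OSError:
            continue
        years = int((now - st.st_atime) // YEAR)
        buckets[years] += st.st_size

for years in sorted(buckets):
    gb = buckets[years] / 2**30
    print(f"Last accessed {years}-{years + 1} years ago: {gb:.1f} GB")
```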
The reality is:
• a user’s perception of the value of their data may or may not decrease over time
• users will not take the time to clean out files that are no longer needed; there is no time for housekeeping
• users don’t know or care which storage device their files are on
• retrieval times of sub-second to a few seconds for rarely accessed files are not a factor in user satisfaction
Compliance and Clean up
How do you know whether users are violating your organization’s policies? Stuffing your servers with music files, videos, and pictures takes up expensive space. Are there copyright infringement issues?
Backups and Deduplication
Of course you do backups. And you test the backups to make sure you can recover, right?
Backup software and hardware have come a long way. Instead of a full backup every night, you are probably doing delta or incremental backups: starting from a baseline full backup, you back up only the files that have changed each day. Every so often you take a new full backup and start the cycle again.
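The idea behind an incremental pass is simple: compare each file's modification time against the time of the last full backup and copy only what has changed. Here is a stripped-down sketch of that selection logic; the paths and the timestamp handling are illustrative assumptions, not a substitute for real backup software.

```python
import os
import shutil
import time

SOURCE = r"\\fileserver\shared"                   # hypothetical source share
BACKUP_DIR = r"D:\backup\incremental"             # hypothetical target
LAST_FULL_BACKUP = time.time() - 7 * 24 * 3600    # e.g. one week ago

for dirpath, _dirs, files in os.walk(SOURCE):
    for name in files:
        src = os.path.join(dirpath, name)
        try:
            if os.path.getmtime(src) <= LAST_FULL_BACKUP:
                continue                          # unchanged since the full backup
        except OSError:
            continue
        rel = os.path.relpath(src, SOURCE)
        dst = os.path.join(BACKUP_DIR, rel)
        os.makedirs(os.path.dirname(dst), exist_ok=True)
        shutil.copy2(src, dst)                    # copy the file with its metadata
```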
Backups are great when you need to recover from a disaster. They are not, however, the best way to recover individual files or folders that users have deleted.
Deduplication software and hardware go a step further: they try to match data blocks against previously stored blocks to reduce the amount of data backed up even more. The hashing and collision-handling algorithms keep improving, making the process more accurate and efficient.
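The core idea can be shown in a few lines: split each file into blocks, hash each block, and store a block only if its hash has not been seen before. Real products use far more sophisticated chunking, on-disk block stores, and collision handling; the following is only a sketch with an assumed fixed block size.

```python
import hashlib

BLOCK_SIZE = 64 * 1024          # 64 KB fixed-size blocks (illustrative choice)
stored_blocks = {}              # hash -> block data (a real store lives on disk)

def dedup_file(path):
    """Return the list of block hashes that describe this file."""
    recipe = []
    with open(path, "rb") as f:
        while True:
            block = f.read(BLOCK_SIZE)
            if not block:
                break
            digest = hashlib.sha256(block).hexdigest()
            if digest not in stored_blocks:
                stored_blocks[digest] = block     # new data: store it once
            recipe.append(digest)                 # a duplicate costs only a hash
    return recipe
```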
But you are still repeatedly backing up garbage and junk, along with files that haven’t changed or been accessed in a long time. The current technology has simply made you more efficient at executing a bad process.
Storage Tiers and HSM
As files get older, just move them to cheaper storage. Some tiering and HSM solutions are seamless to users; others give users a search capability to locate older files. These solutions are the right approach, except they are expensive. Remember the new normal? There is no budget for these fancy devices.
The approach itself is sound, and it can be done with software.
Software File Archiving
There are very effective and economical solutions to your storage problems that do not require additional hardware; File Archiver is one example. In fact, if you implement a software file archiving strategy, you can postpone hardware add-ons and upgrades for a long, long time.
The goal is to keep your most used files on your high-speed (higher-cost) storage devices and move everything else somewhere cheaper.
Leverage what you already have
Software file archiving creates virtual volumes on less expensive storage hardware: SAN devices, SATA drives on older servers, and so on. The software moves files from expensive storage to lower-cost storage according to simple rules, e.g. file age, folder location, file type. These rules run nightly or weekly during off hours.
All file types are eligible for archiving and no client software is needed on any user’s computer.
Deduplication occurs across all file servers.
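Conceptually, an archiving rule is just a predicate over a file's metadata. The sketch below shows how such rules might be expressed and evaluated; the rule values, paths, and function names are examples invented for illustration, not the product's actual rule engine.

```python
import os
import time

# Example rules, for illustration only: archive anything older than three
# years, or large files of certain types.
RULES = [
    {"older_than_days": 3 * 365},
    {"extensions": {".psd", ".tif", ".bak"}, "min_size": 50 * 2**20},
]

def matches(path, rule, now=None):
    """Return True if the file at `path` satisfies a single archiving rule."""
    now = now or time.time()
    st = os.stat(path)
    if "older_than_days" in rule:
        if now - st.st_mtime < rule["older_than_days"] * 86400:
            return False
    if "extensions" in rule:
        if os.path.splitext(path)[1].lower() not in rule["extensions"]:
            return False
    if "min_size" in rule:
        if st.st_size < rule["min_size"]:
            return False
    return True

def archive_candidates(root):
    """Yield files under `root` that match at least one rule."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                if any(matches(path, rule) for rule in RULES):
                    yield path
            except OSError:
                continue                  # unreadable file: skip it
```

A nightly scheduled task (Task Scheduler or cron) would feed the matching files to the archive mover.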
Seamless to users
File Archiver software leaves a tiny 1 KB shortcut, a “stub”, in the original file location that keeps the file’s icon and all of its security settings. The user clicks on the file as they normally would, and behind the scenes the file is retrieved from its archived location and opened for the user.
If a user changes the file and saves it, the file stays in its original location until the new version again meets the rule criteria for archiving.
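The stub mechanism itself is specific to each product; commercial tools typically rely on file-system features such as reparse points or a filter driver so the recall is completely transparent. Purely to illustrate the bookkeeping, here is a toy sketch that moves a file to an assumed archive volume and leaves a small pointer file behind.

```python
import os
import shutil

ARCHIVE_ROOT = r"E:\archive"   # hypothetical low-cost volume

def archive_with_stub(path, source_root):
    """Move `path` into the archive and leave a tiny pointer file behind."""
    rel = os.path.relpath(path, source_root)
    target = os.path.join(ARCHIVE_ROOT, rel)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    shutil.move(path, target)                      # relocate the real data
    # A real product leaves a stub the operating system resolves on open;
    # this toy version just records where the file went.
    with open(path + ".stub", "w", encoding="utf-8") as stub:
        stub.write(target)
```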
The result: you level off the growth of your expensive storage.
As new data is added to your file servers, your policies are working behind the scenes to archive, move or delete older files. Your expensive storage is preserved for your most important and most accessed data.
Backup times are reduced
Your expensive storage devices are no longer cluttered with old, unneeded files. Backing up the virtual volumes is even better: static, unchanging data is compacted so that each virtual volume needs to be backed up only once, because the data inside will not change.
Restore Files or Folders – in Seconds
If you or your help desk needs to restore a “lost” or accidentally deleted file or folder, you just search the archive and click Restore. You can put back the complete file or folder, or just a shortcut.
Easier and faster disaster recovery
If you have a storage hardware failure, you can restore all the files’ shortcuts from the archive repository almost instantly. If 150,000 files that take up a terabyte have been switched to 1 KB stubs, you only have to restore roughly 150 MB to get back into full operation.
Delete the junk – get in compliance
You can manually delete any files that are no longer needed or shouldn’t be on your file systems: prohibited material such as MP3s, videos, and so on. Better yet, set up daily policies that delete the junk files before you make the backups.
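Such a policy can start as small as a scheduled script with a dry-run mode, so you can review what would be removed before enabling deletion. The extensions and path below are assumptions for illustration only.

```python
import os

SHARE_ROOT = r"\\fileserver\shared"          # hypothetical share
PROHIBITED = {".mp3", ".m4a", ".avi", ".mkv", ".mov"}
DRY_RUN = True                               # flip to False only after review

for dirpath, _dirs, files in os.walk(SHARE_ROOT):
    for name in files:
        if os.path.splitext(name)[1].lower() in PROHIBITED:
            path = os.path.join(dirpath, name)
            if DRY_RUN:
                print("would delete:", path)
            else:
                os.remove(path)
```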
Consolidating File Servers? Get more data on a single server with shortcuts to lower cost storage!
So what are you waiting for? Software file archiving is easy to install, low in cost, and independent of your hardware storage devices.
You can be saving space the same day you install a software file archiving solution.
You will take care of immediate storage problems and put yourself in a position to control your storage costs going forward.
How do you get started?
The first step is to understand what you have!
You need visibility: solid, accurate information on how much space is being wasted, how much duplication exists, and how usage breaks down by file type, location, owner, and age.
Put in the File Analyzer reporting solution. You will get browser-based, drill-down reports that let you view your files by server, owner, size, type, name, and more, to pinpoint the usage and abuse of your storage. Find old files, unneeded files, and duplicate files, across all servers and within existing folder structures.
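Until a reporting product is in place, even a short script gives a first cut of the same picture: space broken down by file type and by top-level folder (which, on a home-directory share, roughly maps to owner). The path below is a placeholder, and this is only a stopgap sketch, not a replacement for the drill-down reports described above.

```python
import os
from collections import Counter

SHARE_ROOT = r"\\fileserver\users"    # hypothetical share
by_type = Counter()
by_top_folder = Counter()

for dirpath, _dirs, files in os.walk(SHARE_ROOT):
    rel = os.path.relpath(dirpath, SHARE_ROOT)
    top = rel.split(os.sep)[0] if rel != "." else "(root)"
    for name in files:
        try:
            size = os.path.getsize(os.path.join(dirpath, name))
        except OSError:
            continue
        by_type[os.path.splitext(name)[1].lower() or "(none)"] += size
        by_top_folder[top] += size

for label, counter in (("By file type", by_type), ("By top-level folder", by_top_folder)):
    print(label)
    for key, total in counter.most_common(10):
        print(f"  {key:20s} {total / 2**30:8.1f} GB")
```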