Background and Challenges
For most companies, data, especially unstructured data, continues to grow by 50 percent annually. The cost of storing, protecting and managing more information every year has often pushed IT departments to their limits. A myriad of hardware and software strategies and solutions are emerging to help IT managers address these issues, and effective implementation of tiered storage is an important one. EaseFilter Inc. has developed the EaseTag Tiered Storage SDK, which can help you seamlessly migrate your storage to the cloud.
Tiered Cloud Storage
Tiered storage is an underlying principle of ILM (information lifecycle management). It is a storage networking method where data is stored on various types of media based on performance, availability and recovery requirements. For example, data intended for restoration in the event of data loss or corruption could be stored locally for fast recovery, while data kept for regulatory purposes could be archived to lower-cost disks.
Today's tiered storage infrastructures range from a simple two-tier architecture consisting of SCSI or Fibre Channel attached disk and tape to cloud storage. Regardless of the method of tiering, organizations are looking to tiered storage and ILM to lower costs and improve operational efficiency.
Implementing tiered storage infrastructures can dramatically decrease the cost associated with achieving an RPO (recovery point objective) and RTO (recovery time objective) of zero. Classification of data can provide different RPOs and RTOs based on application and business requirements. Policy-based data migration ensures that the right data is in the right place at the right time.
Using the cloud as a storage tier
Public cloud computing enables users and IT departments to deploy applications without having to make capital investments in computer hardware. With storage forming an increasing part of on-premises budgets, public cloud storage also provides a way to convert storage costs to an operational expense rather than a capital expense.
Using the cloud as a storage tier, data can first be moved to a ‘warm’ archive tier of higher-performance disk, where it can still be accessed quickly to meet RPO and RTO SLAs. As you retain archives for longer, older data can then be moved to a ‘cold’ archive tier with better economics. (This is similar to the tiered storage cost/performance model offered by Amazon S3 with its “warm” Standard tier, “cold” Infrequent Access tier and “frozen” Glacier tier.)
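As a rough illustration of the warm/cold split described above, the following Python sketch picks an archive tier from how long an archive has been retained. The one-year threshold and the function name are assumptions made for this example; they are not values from any cloud provider or from the EaseTag SDK.

```python
from datetime import datetime, timedelta

# Hypothetical threshold; a real value depends on your RPO/RTO SLAs.
COLD_AFTER = timedelta(days=365)


def choose_archive_tier(archived_at: datetime, now: datetime) -> str:
    """Return 'warm' for recent archives and 'cold' for older ones."""
    age = now - archived_at
    return "cold" if age >= COLD_AFTER else "warm"
```

An archive created one month ago would stay in the warm tier, while one retained for two years would be a candidate for the cold tier.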
EaseTag Tiered Storage Filter Driver SDK
The EaseTag Tiered Storage Filter Driver SDK implements hierarchical storage management (HSM), a data storage technique that automatically moves data between high-cost and low-cost storage media, such as network attached storage (NAS), optical discs and cloud storage. A stub is created for each migrated file and replaces it on the fast disk drives. On the local system, a stub file looks and acts like a regular file. When a user application accesses a migrated file stub, the Windows operating system transparently directs the file access request to the EaseTag Tiered Storage SDK. The EaseTag driver then sends the request to the remote site to retrieve the data from the repository to which it was migrated (see Figure 1).
Figure 1. Tiered Storage Data Flow Chart
The automated tiered storage can integrate with existing applications, without affecting the original data and programs. Without any modification of existing applications, the local storage can automatically be extended to the network storage.
Tiered storage can be widely used in telecommunications, government, oil, medical and other industries. It is the first choice for medical PACS (Picture Archiving and Communication System, a medical imaging storage and transmission system): much of the data in such applications is rarely accessed, so it is transferred to less expensive network storage. When users and applications access the stub files in local storage, the process is completely transparent; the system automatically restores the data from the network storage server back into the stub file. Because network attached storage is scalable, tiered storage products give users a virtually unlimited online data space.
The main advantages of EaseTag Tiered Storage are:
- Lower storage costs. If you have two terabytes of expensive server storage and 50% of the data is never or rarely accessed, EaseTag Tiered Storage lets you transfer one terabyte of data to NAS or cloud storage and free one terabyte of space on the SAN RAID storage.
- Maximize available server disk space. Reclaim storage space without disrupting users. Set up policies to automatically remove older files from file servers, cleaning up disk space, and replace them with an intelligent shortcut "stub" that invisibly retrieves the original file from the archive.
- Improve efficiency. When users need migrated data, it can be accessed transparently in real time. If you need to recover a file that was accidentally deleted or modified in error, you can restore the file, or even a whole folder, from the repository.
- Reduce server backup and recovery time. Only frequently used files need to be backed up.
- Improve data security. Data on the server can be encrypted and accessed through the storage management software, so only authorized users can access the data, and access activities can be logged.
- Remove duplicate data. The storage server keeps only a single instance of each file.
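The single-instance idea in the last item can be sketched in a few lines of Python. This text does not describe how EaseTag detects duplicates, so the content hashing with SHA-256 used here is an assumption, and the `SingleInstanceStore` class is a hypothetical in-memory stand-in for a real repository.

```python
import hashlib


class SingleInstanceStore:
    """Toy in-memory repository that keeps one copy per unique content."""

    def __init__(self):
        self._blobs = {}  # content digest -> file bytes
        self._names = {}  # file name -> content digest

    def put(self, name: str, data: bytes) -> str:
        """Store a file; identical content is kept only once."""
        digest = hashlib.sha256(data).hexdigest()
        self._blobs.setdefault(digest, data)
        self._names[name] = digest
        return digest

    def get(self, name: str) -> bytes:
        """Return the content stored under a file name."""
        return self._blobs[self._names[name]]

    def stored_bytes(self) -> int:
        """Total bytes physically held, after removing duplicates."""
        return sum(len(blob) for blob in self._blobs.values())
```

Storing the same content under two different names increases `stored_bytes()` only once, while both names remain readable through `get()`.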
Best-practice cloud storage deployment procedures
The following sections provide some best-practice procedures to assist you in planning and implementing your cloud storage deployment.
1. Set up migration policies:
a) Migrate and stub files based on file type, for example, process only .JPG, .PPT, .BMP, .GIF, .TIF, .PDF, and .PSD files.
b) Migrate and stub files based on file size, for example, process only files larger than 10 MB.
c) Migrate and stub files based on file time stamp, for example, process only files whose creation, modification or last access time is more than 60 days old.
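The three policy types above can be expressed as a simple filter. The Python sketch below is illustrative only: the `MigrationPolicy` and `should_migrate` names are hypothetical and not part of the EaseTag SDK, and combining the three rules with a logical AND is an assumption, since the policies could equally be applied one at a time.

```python
import os
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class MigrationPolicy:
    # Default values mirror the examples given in the list above.
    extensions: frozenset = frozenset(
        {".jpg", ".ppt", ".bmp", ".gif", ".tif", ".pdf", ".psd"}
    )
    min_size_bytes: int = 10 * 1024 * 1024  # 10 MB
    min_age_days: int = 60


def should_migrate(path, size_bytes, last_access_epoch, policy, now=None):
    """Return True if a file matches the type, size and age rules."""
    now = time.time() if now is None else now
    extension = os.path.splitext(path)[1].lower()
    age_days = (now - last_access_epoch) / 86400  # seconds per day
    return (
        extension in policy.extensions
        and size_bytes > policy.min_size_bytes
        and age_days > policy.min_age_days
    )
```

A 20 MB PDF last accessed 90 days ago would be selected, while a small or recently used file would stay on local storage.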
The following example shows how files are migrated.
The volume ArchiveVolume has a total capacity of 3.34 TB. It contains a folder called ArchiveFolder with two very large files, one of 100 GB and one of 10 GB, so the total space on disk is 110 GB (see Figures 2 and 3).
Figure 2. The volume and folder properties before migration
Figure 3. The file property before migration
2. Migrate data and create stub files
After the files have been transferred to the cloud data center, they are converted to stub files (see Figure 5). Each stub file takes only 4 KB of physical space, while the reported file size does not change. Comparing Figure 2 with Figure 4 shows that the volume regains 110 GB of free space, because the physical space that held the original file data has been released. The stub files are marked with the offline icon, which tells backup and antivirus software not to read them.
Figure 4. The volume and folder properties after migration
Figure 5. The file property after migration
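The space reclaimed in this example can be checked with a little arithmetic. The Python sketch below assumes binary units (1 GB = 1024^3 bytes), as Windows reports them, and a 4 KB physical footprint per stub; the function name is made up for the illustration.

```python
GB = 1024 ** 3
KB = 1024
STUB_SIZE = 4 * KB  # physical space taken by one stub file


def reclaimed_bytes(file_sizes):
    """Space freed when each file is replaced by a stub."""
    return sum(size - STUB_SIZE for size in file_sizes if size > STUB_SIZE)


sizes = [100 * GB, 10 * GB]  # the two files in ArchiveFolder
freed = reclaimed_bytes(sizes)
print(f"{freed / GB:.2f} GB freed")  # prints "110.00 GB freed"
```

The two stubs together occupy only 8 KB, which is negligible next to the 110 GB that is released.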
3. Access stub files or restore files
When a user application accesses a stub file, the EaseTag Tiered Storage SDK can read back just a block of the file's data. This method is especially suited to large files, because it is much faster than restoring the whole file. For example, if the application only wants to read 64 KB at offset 0 of a 100 GB file, only those 64 KB are returned. For applications that need to read the whole file, the SDK can also restore the whole file on the first read.
4. Data disposition
When files are no longer needed, they can be disposed of from the repository. The data is then deleted permanently.
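The difference between reading a block and restoring the whole file can be simulated in Python. This is a minimal sketch with an in-memory stand-in for the remote repository; the `Repository` and `StubFile` classes and the `restore_on_first_read` flag are hypothetical names for this illustration and do not reflect the actual EaseTag SDK interface.

```python
class Repository:
    """Stand-in for the remote store; holds file contents in memory."""

    def __init__(self, files):
        self._files = dict(files)
        self.bytes_transferred = 0  # counts data sent back to the client

    def size(self, name):
        return len(self._files[name])

    def read_range(self, name, offset, length):
        data = self._files[name][offset:offset + length]
        self.bytes_transferred += len(data)
        return data


class StubFile:
    """Local placeholder that fetches data from the repository on demand."""

    def __init__(self, name, repo, restore_on_first_read=False):
        self.name = name
        self.repo = repo
        self.restore_on_first_read = restore_on_first_read
        self._local = None  # full contents once the file is restored

    def read(self, offset, length):
        if self._local is not None:
            # Already restored: serve the read locally.
            return self._local[offset:offset + length]
        if self.restore_on_first_read:
            # Pull the whole file back once, then serve from local data.
            total = self.repo.size(self.name)
            self._local = self.repo.read_range(self.name, 0, total)
            return self._local[offset:offset + length]
        # Block read: fetch only the requested range.
        return self.repo.read_range(self.name, offset, length)
```

With block reads, a request for 64 KB transfers only 64 KB from the repository. With `restore_on_first_read=True`, the first read transfers the entire file and later reads cause no further transfers.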
We specialize in file system filter driver development. We architect, implement and test file system filter drivers for a wide range of functionality. We can offer several levels of assistance to meet your specific needs:
- Provide consulting services for your existing file system filter driver.
- Customize the SDK to meet your requirements.
- Create your own filter driver with our source code.