The cloud storage migration challenges
A cloud storage migration is the process of moving some or all of a company's local data into the cloud, usually onto infrastructure operated by a cloud service provider such as AWS or Azure. The main challenge is carrying out the migration with minimal disruption to normal operation, at the lowest cost, and in the shortest time. If your data becomes inaccessible to users during a migration, you risk impacting your business operations.
For most small to medium-sized companies, the biggest challenge of a cloud storage migration is redeveloping applications to use cloud storage, an expense many of them cannot afford. The CloudTier Tiered Storage SDK uses cloud storage as a second tier and automatically moves data between local storage and the cloud. Your application does not need to change: it accesses cloud storage transparently, just like local storage.
Cloud Storage Migration Benefits With CloudTier Tiered Storage SDK
Tiered Storage File System Filter Driver Technology
Cloud storage tiering is implemented with a tiered storage file system filter driver. A file system filter driver intercepts requests targeted at a file system or another file system filter driver. By intercepting the request before it reaches its intended target, the filter driver can extend or replace functionality provided by the original target of the request. File system filtering services are available through the Filter Manager in Windows. The CloudTier tiered storage filter driver intercepts file I/O to the local storage and redirects it to the remote cloud storage, using the file system filtering functionality provided by the Filter Manager framework.
Hierarchical storage management
The CloudTier Tiered Storage Filter Driver SDK implements hierarchical storage management (HSM), a data storage technique that automatically moves data between high-cost local disk and low-cost remote cloud storage. The SDK helps simplify the migration process by providing transparent file access to the remote storage. Using the cloud as a storage tier, data can first be moved to a ‘warm’ archive tier of higher-performance disk, where it can still be accessed quickly to meet RPO and RTO SLAs. As you retain archives for longer, older data can then be moved to a ‘cold’ archive tier with better economics. (This is similar to the tiered storage cost/performance model offered by Amazon S3 with its “warm” Standard tier, “cold” Infrequent Access tier and “frozen” Glacier tier.)
Transparent cloud storage migration
The CloudTier SDK creates a stub file when a local file is migrated to cloud storage. On local storage, a stub file looks and acts like a regular file but takes up no physical space for its data. The stub file keeps all the properties and security settings of the original file, along with embedded custom metadata. When a user application accesses a migrated stub file, the CloudTier driver retrieves the data from the remote storage and returns it to the application. The whole process is transparent to the application.
Cloud-Based Archiving Solution
Cloud-based archiving software can save you time and money, and is affordable, flexible, and scalable for small to medium-sized businesses. Set rules in the system to automatically archive files to the cloud that meet custom, organization-specific criteria. Your cloud archive storage can be accessed immediately, exactly when your organization needs it. Users have access to stubs (ghost files) for immediate and seamless access to data in the cloud without switching to a different screen or view.
With the data tiering technology provided by the CloudTier SDK, migrating data into the cloud has never been simpler. Your existing application does not need to change, you do not need to develop a new application in the cloud, and the current infrastructure stays as it is. There is therefore no application redevelopment cost for the cloud storage migration, which can dramatically reduce both the expense and the time of a cloud storage deployment.
High Availability with CloudTier Tiered Storage Technology
The CloudTier Tiered Storage configuration provides a high availability infrastructure by maintaining the local storage and remote cloud storage environments and synchronously writing to them during storage operations. This ensures that, from an application or end-user perspective, there is no downtime as there is a seamless transition to the cloud storage in case the primary storage fails.
In terms of SLA numbers, CloudTier Tiered Storage can help you achieve a recovery point objective (RPO) of zero and a recovery time objective (RTO) of less than 1 second. This feature ensures that your cloud environment is resilient, safe from service disruptions, and able to host critical workloads as well as data migration processes without requiring expensive HA setup on the application side.
The Advantages of CloudTier Tiered Storage
- Lower storage costs: If you have two terabytes of expensive server storage where 50% of the data is never or rarely accessed, CloudTier Tiered Storage lets you transfer one terabyte of data to NAS or cloud storage and free one terabyte of SAN RAID storage.
- Maximize available server disk space: Reclaim storage space without disrupting users. Set up policies to automatically remove older files from file servers, cleaning up disk space, and replace them with an intelligent shortcut "stub" that invisibly retrieves the original file from the archive.
- Improve efficiency: When users need the data, it can be accessed transparently in real time. If you need to recover a file that was accidentally deleted or modified in error, you can restore the file, or even a whole folder, from the repository.
- Reduce server backup and recovery time: Only frequently used files need to be backed up.
- Improve data security: Data on the server can be encrypted and accessed only through the storage management software, so only authorized users can reach it, and access activity can be logged.
- Remove duplicate data: The storage server keeps only a single instance of duplicated data.
Best Practices for Tiered Storage
Tiered storage is widely used in telecommunications, government, oil, medical, and other industries. It is a common choice for medical PACS (Picture Archiving and Communication System) deployments, where much of the data is rarely accessed and can be transferred to less expensive network storage. Access to the stub files in local storage is completely transparent to users and applications: the system automatically restores the data to the stub file from the remote cloud storage. Because cloud storage is scalable, tiered storage products give users a virtually unlimited online data space.
Cloud Storage Migration Procedures
The following sections provide some best-practice procedures to assist you in planning and implementing your cloud storage deployment.
Set up migration policies:
- Migrate and stub files based on file type, for example process only .JPG, .PPT, .BMP, .GIF, .TIF, .PDF, and .PSD files.
- Migrate and stub files based on file size, for example process only files larger than 10 MB.
- Migrate and stub files based on file time stamp, for example process only files whose creation, modified, or last access time is more than 60 days old.
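The policies above can be sketched as a simple predicate. This is a minimal illustration, not part of the CloudTier SDK: the `FileInfo` and `MigrationPolicy` types and the `shouldMigrate` function are hypothetical names, and the sketch assumes that a file must satisfy all three criteria at once and that age is measured from the last access time.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical description of a candidate file; not an SDK type.
struct FileInfo {
    std::string name;           // file name including its extension
    std::uint64_t sizeBytes;    // logical file size
    int daysSinceLastAccess;    // age measured from the last access time
};

// Hypothetical policy combining the three criteria listed above.
struct MigrationPolicy {
    std::vector<std::string> extensions;  // lower-case, with leading dot
    std::uint64_t minSizeBytes;           // migrate only files larger than this
    int minAgeDays;                       // migrate only files older than this
};

// Returns the lower-cased extension of a file name, or "" if there is none.
std::string lowerExtension(const std::string& name) {
    const std::string::size_type dot = name.find_last_of('.');
    if (dot == std::string::npos) return "";
    std::string ext = name.substr(dot);
    std::transform(ext.begin(), ext.end(), ext.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return ext;
}

// Assumption: a file qualifies only when it passes all three checks.
// A real deployment might apply each rule independently instead.
bool shouldMigrate(const FileInfo& file, const MigrationPolicy& policy) {
    assert(policy.minAgeDays >= 0);
    const std::string ext = lowerExtension(file.name);
    const bool typeMatches =
        std::find(policy.extensions.begin(), policy.extensions.end(), ext) !=
        policy.extensions.end();
    return typeMatches &&
           file.sizeBytes > policy.minSizeBytes &&
           file.daysSinceLastAccess > policy.minAgeDays;
}
```

With a policy of `{".jpg", ".pdf"}`, 10 MB, and 60 days, a 50 MB `scan.PDF` last accessed 90 days ago would qualify, while a 1 KB PDF or a recently opened one would not.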
The example below shows how to migrate files.
The volume ArchiveVolume has a total capacity of 3.34 TB. It contains a folder called ArchiveFolder with two large files, one of 100 GB and one of 10 GB, for a total size on disk of 110 GB (see Figures 2 and 3).
Figure 2. The volume and folder properties before migration
Figure 3. The file property before migration
Data migration and stub file creation
After the files are transferred to the cloud data center, they are converted to stub files (see Figure 5). Each stub file takes only 4 KB of physical space, while its reported file size is unchanged. Comparing Figure 2 and Figure 4 shows that the volume regains 110 GB of free space, because the physical data space of the original files was released. The stub files are marked with the offline icon, which tells backup software and anti-virus software not to read them.
After the local files are migrated to the cloud, there are two storage tiers: local storage is tier 0 and cloud storage is tier 1. Moving data from tier 0 to tier 1 is transparent to the application.
Figure 4. The volume and folder properties after migration
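The space gained in this example can be checked with a little arithmetic. The sketch below is illustrative only: `reclaimedBytes` is a hypothetical helper, and it assumes each file's size on disk equals its logical size and that every stub occupies exactly 4 KB.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

constexpr std::uint64_t KB = 1024ull;
constexpr std::uint64_t GB = 1024ull * 1024 * 1024;

// Bytes freed on the local volume when each file is replaced by a stub
// that occupies stubBytes of physical space.
std::uint64_t reclaimedBytes(const std::vector<std::uint64_t>& fileSizes,
                             std::uint64_t stubBytes) {
    std::uint64_t freed = 0;
    for (std::uint64_t size : fileSizes) {
        assert(size >= stubBytes);  // precondition: file is at least stub-sized
        freed += size - stubBytes;
    }
    return freed;
}
```

For the 100 GB and 10 GB files, `reclaimedBytes({100 * GB, 10 * GB}, 4 * KB)` gives 110 GB minus 8 KB, which rounds to the 110 GB shown in the volume properties.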
The API to create a stub file:
BOOL CreateStubFile( LPCTSTR fileName, LONGLONG fileSize, ULONG fileAttributes, ULONG tagDataLength, BYTE* tagData, BOOL overwriteIfExist, PHANDLE pHandle )
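The `tagData` and `tagDataLength` parameters appear to carry the custom metadata embedded in the stub file. The sketch below shows one way an application might pack and unpack such a buffer. The layout (a 4-byte little-endian length followed by a remote object key) and the helper names `buildTagData` and `parseTagData` are assumptions for illustration and are not defined by the SDK.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical tag layout: 4-byte little-endian key length, then the key bytes.
std::vector<std::uint8_t> buildTagData(const std::string& remoteKey) {
    assert(!remoteKey.empty());
    const std::uint32_t len = static_cast<std::uint32_t>(remoteKey.size());
    std::vector<std::uint8_t> tag;
    tag.reserve(4 + remoteKey.size());
    for (int shift = 0; shift < 32; shift += 8) {
        tag.push_back(static_cast<std::uint8_t>((len >> shift) & 0xFF));
    }
    tag.insert(tag.end(), remoteKey.begin(), remoteKey.end());
    return tag;
}

// Recovers the remote object key from a buffer produced by buildTagData.
std::string parseTagData(const std::vector<std::uint8_t>& tag) {
    assert(tag.size() >= 4);
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        len |= static_cast<std::uint32_t>(tag[i]) << (8 * i);
    }
    assert(tag.size() == 4 + static_cast<std::size_t>(len));
    return std::string(tag.begin() + 4, tag.end());
}
```

Under these assumptions, the buffer's `data()` pointer and `size()` would be passed as `tagData` and `tagDataLength` when calling `CreateStubFile`, and the same layout would be parsed when the stub is later read.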
Figure 5. The file property after migration
Stub file reading and stub file rehydration
Reading a stub file requires enabling the CloudTier filter driver service with the API below:
BOOL RegisterMessageCallback( ULONG ThreadCount, Proto_Message_Callback MessageCallback, Proto_Disconnect_Callback DisconnectCallback )
After the service is enabled, every read or write request to a stub file invokes the MessageCallback function. In the callback function you can return a single block of the file's data or the whole file. Block reading is especially well suited to large files, because it performs much better than restoring the whole file. For example, if the application wants to read only 64 KB at offset 0 of a 100 GB file, only those 64 KB are returned.
Another option is to rehydrate the stub file on demand at the first read. When a stub file is rehydrated, all of the file's data is written back to it, and the stub file is restored to a normal physical file. Rehydration moves the data from storage tier 1 back to storage tier 0. This process is also completely transparent to the application.
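The block-reading idea can be sketched as a range calculation: given the requested offset and length, fetch only the bytes that exist within the file. This is a minimal sketch under stated assumptions. `ByteRange` and `clampRead` are hypothetical names, and the actual callback message format is defined by the SDK and not shown here.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical byte range to request from the remote storage.
struct ByteRange {
    std::uint64_t offset;
    std::uint64_t length;
};

// Limits a read request to the bytes that actually exist in the file.
ByteRange clampRead(std::uint64_t fileSize, std::uint64_t offset,
                    std::uint64_t length) {
    if (offset >= fileSize) {
        return ByteRange{offset, 0};  // read starts at or past end of file
    }
    const ByteRange range{offset, std::min(length, fileSize - offset)};
    assert(range.offset + range.length <= fileSize);  // never read past the end
    return range;
}
```

For a 100 GB file, a 64 KB read at offset 0 yields a range of exactly 64 KB, so only that block needs to be downloaded rather than the whole file.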
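Rehydration on first read can be modeled as follows. This is an in-memory sketch only: the `TieredFile` class and its `fetchAll` callback are hypothetical stand-ins for the stub file and for whatever downloads the full file from tier 1, and they are not SDK types.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Hypothetical in-memory model of a local file that starts out as a stub.
class TieredFile {
public:
    // fetchAll stands in for the download of the full file from tier 1.
    explicit TieredFile(std::function<Bytes()> fetchAll)
        : fetchAll_(std::move(fetchAll)) {
        assert(fetchAll_ != nullptr);
    }

    bool isStub() const { return isStub_; }

    // The first read rehydrates the whole file; later reads are served locally.
    const Bytes& read() {
        if (isStub_) {
            data_ = fetchAll_();
            isStub_ = false;
        }
        return data_;
    }

private:
    std::function<Bytes()> fetchAll_;
    Bytes data_;
    bool isStub_ = true;
};
```

The trade-off against block reading is that the first access pays for the full download, while every later access runs at local speed with no remote call.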
The data disposition
When files are no longer needed, they can be disposed of from the repository. The data is then deleted permanently.
About EaseFilter Inc. (https://www.easefilter.com)
We specialize in file system filter driver development. We architect, implement and test file system filter drivers for a wide range of functionality. We can offer several levels of assistance to meet your specific needs:
· Provide consulting service for your existing file system filter driver.
· Customize the SDK to meet your requirement.
· Create your own filter driver with our source code.