The challenges of connecting to public cloud storage
Enterprises are increasingly adopting cloud storage because they need more capacity, elastic capacity, and a better way to manage storage costs over time. The growing volume of enterprise data is proving too difficult for IT departments to manage in their own data centers alone, and migrating and managing data storage in the cloud can offer significant value to the business. A cloud storage migration is when a company moves some or all of its local data into the cloud, usually onto infrastructure provided by a cloud service provider such as AWS or Azure.
The main challenge of a cloud storage migration is carrying it out with minimal disruption to normal operation, at the lowest cost, and over the shortest period of time. If your data becomes inaccessible to users during a migration, you risk impacting your business operations.
For most small to medium-sized companies, the biggest challenge is redeveloping applications to use cloud storage, an expense many of them cannot afford. CloudTier Cloud Connect provides a complete solution for connecting transparently to Amazon S3 storage and Azure storage. CloudTier uses cloud storage as a second tier and automatically moves data between local and cloud storage, so your applications need no changes and can access cloud storage just as they access local storage.
Tiered Storage File System Filter Driver Technology
Cloud storage tiering is implemented with a tiered storage file system filter driver. A file system filter driver intercepts requests targeted at a file system or at another file system filter driver. By intercepting a request before it reaches its intended target, the filter driver can extend or replace the functionality provided by the original target. On Windows, file system filtering services are available through the Filter Manager. The CloudTier tiered storage filter driver uses the filtering functionality provided by the Filter Manager framework to intercept file I/O to local storage and redirect it to remote cloud storage.
Hierarchical storage management
CloudTier Cloud Storage Connect is a data storage technique that automatically moves data between high-cost local disk and low-cost remote cloud storage. It can help simplify the migration process by providing transparent file access to the remote storage. Using the cloud as a storage tier, data can first be moved to a ‘warm’ archive tier of higher-performance disk, where it can still be accessed quickly to meet RPO and RTO SLAs. As you retain archives for longer, older data can then be moved to a ‘cold’ archive tier with better economics. (This is similar to the tiered storage cost/performance model offered by Amazon S3 with its “warm” Standard tier, “cold” Infrequent Access tier and “frozen” Glacier tier.)
Integrate your existing on-premises applications with remote cloud storage transparently
Our CloudTier Cloud Storage Connect service connects an on-premises software appliance with cloud-based storage, integrating your existing on-premises applications with the remote cloud storage infrastructure in a seamless, secure, and transparent fashion. Migrating your on-premises files to remote cloud storage causes no interruption, and you don't need to change your existing applications or infrastructure.
- Set up cloud migration policies based on file type, file size, and file attributes.
- Create a stub file, based on the policies, after a file has been migrated to cloud storage; this frees up space on on-premises storage.
- Access cloud storage transparently: your local application reads the stub file as it would read the original file.
- Move data back from remote cloud storage to local storage transparently, rehydrating the stub file for recently accessed files based on the policies.
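As a rough illustration of the first step, a migration policy check could look like the sketch below. The `MigrationPolicy` and `FileInfo` types, their field names, and the `shouldMigrate` function are hypothetical and are not part of the CloudTier API; only the two attribute bit values are taken from the Win32 `FILE_ATTRIBUTE_*` constants.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical policy record; the field names are illustrative only and
// are not taken from the CloudTier API.
struct MigrationPolicy {
    std::vector<std::string> extensions;   // e.g. {".bak", ".iso"}; empty matches any type
    std::uint64_t minFileSize = 0;         // migrate only files at least this large (bytes)
    std::uint32_t excludedAttributes = 0;  // skip files carrying any of these attribute bits
};

// Hypothetical description of a candidate file.
struct FileInfo {
    std::string name;
    std::uint64_t size = 0;
    std::uint32_t attributes = 0;
};

// Same values as the Win32 FILE_ATTRIBUTE_HIDDEN and FILE_ATTRIBUTE_SYSTEM bits.
constexpr std::uint32_t kAttrHidden = 0x2;
constexpr std::uint32_t kAttrSystem = 0x4;

// Returns true when `name` ends with `suffix` (case-sensitive for brevity).
bool endsWith(const std::string& name, const std::string& suffix) {
    return name.size() >= suffix.size() &&
           name.compare(name.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Decides whether a file qualifies for migration under the given policy:
// it must be large enough, carry no excluded attribute, and match a listed
// extension (or any extension when the list is empty).
bool shouldMigrate(const FileInfo& file, const MigrationPolicy& policy) {
    if (file.size < policy.minFileSize) return false;
    if ((file.attributes & policy.excludedAttributes) != 0) return false;
    if (policy.extensions.empty()) return true;
    for (const auto& ext : policy.extensions) {
        if (endsWith(file.name, ext)) return true;
    }
    return false;
}
```

A file that passes such a check would then be uploaded and replaced by a stub file; a file that fails it stays on local storage untouched.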
How to connect to Amazon S3
- Make sure you have an S3 key pair. You will need both the access key ID and the secret access key in order to continue. You can get them from the AWS console website.
- Select the Amazon_S3 cloud provider name, then click the "Add Site" button to create a new site for the Amazon S3 connection.
- Enter a site name, enter your access key ID and secret access key in the text boxes, and choose the region.
- Check the enable upload multiple parts box if you want to use parallel upload tasks for a file.
- Check the enable parallel download box if you want to use parallel download tasks for a file.
- Set the number of parallel tasks for upload or download.
- After filling in all the fields, click apply to save the settings.
- Click test connection to check that your settings are correct.
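To make the parallel upload option more concrete, the sketch below shows one way a file could be divided into byte ranges that separate tasks upload independently. It is an illustration only: `PartRange` and `splitIntoParts` are hypothetical names, and how CloudTier actually splits files is not described here. Note that Amazon S3 multipart uploads require every part except the last to be at least 5 MiB and allow at most 10,000 parts, so the part size should be chosen accordingly.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// One contiguous byte range of the file, to be uploaded by a single task.
struct PartRange {
    int partNumber;        // 1-based, the way S3 multipart uploads number parts
    std::uint64_t offset;  // start of the range within the file
    std::uint64_t length;  // number of bytes in the range
};

// Splits a file of `fileSize` bytes into consecutive ranges of at most
// `partSize` bytes. The last range holds whatever remains. Returns an empty
// list when either argument is zero.
std::vector<PartRange> splitIntoParts(std::uint64_t fileSize, std::uint64_t partSize) {
    std::vector<PartRange> parts;
    if (partSize == 0) return parts;
    int number = 1;
    for (std::uint64_t offset = 0; offset < fileSize; offset += partSize) {
        parts.push_back({number++, offset, std::min(partSize, fileSize - offset)});
    }
    return parts;
}
```

The configured number of parallel tasks would then bound how many of these ranges are in flight at the same time.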
How to connect to Microsoft Azure Storage
- Get your connection string from the Microsoft Azure portal dashboard website.
- Select the AzureStorage cloud provider name, then click the "Add Site" button to create a new site for the Azure storage connection.
- Enter a site name, then enter your connection string in the text box.
- Check the enable upload multiple blobs box if you want to use parallel upload tasks for a file.
- Check the enable parallel download box if you want to use parallel download tasks for a file.
- Set the number of parallel tasks for upload or download.
- After filling in all the fields, click apply to save the settings.
- Click test connection to check that your settings are correct.
CloudTier Cloud storage explorer
With the cloud storage explorer, you can seamlessly view and manage your files across different cloud storage providers, and easily download or upload files through the intuitive interface shown below.
Create a stub file
A stub file is a file with the sparse file and reparse point attributes set; you can attach your own custom tag data to it. A stub file looks and acts like a regular file. It has the same file attributes as the original physical file (file size, creation time, last write time, last access time) and keeps the original file's security settings. The difference between a stub file and a normal physical file is that the stub file takes up no physical space on disk, so its size on disk appears as 0 KB. Below is the stub file screenshot:
The API to create a stub file
BOOL CreateStubFile( LPCTSTR fileName, LONGLONG fileSize, ULONG fileAttributes, ULONG tagDataLength, BYTE* tagData, BOOL overwriteIfExist, PHANDLE pHandle )
Read the stub file
To read a stub file, you first need to register the stub file callback service. Once the filter driver service has started, the CloudTier filter driver handles all I/O requests for stub files: when you read from or write to a stub file, the MessageCallback function is invoked with the MessageSend data structure.
BOOL RegisterMessageCallback( ULONG ThreadCount, Proto_Message_Callback MessageCallback, Proto_Disconnect_Callback DisconnectCallback )
Handle the callback function
- Return block data. When the application reads the stub file at a specific offset and length, you only need to return the requested block of data. Block reads are especially well suited to large files: if the application only wants to read 64 KB at offset 0 of a 100 GB file, you only need to return those 64 KB to the application, so performance is much better.
- Return a cache file. When the application reads the stub file, you download all of the data to a cache file and return the cache file name to the file system filter driver. The filter driver reads the data from the cache file and returns it to the application.
- Rehydrate the stub file on first read. When the application reads the stub file, you download all of the data to a cache file and return the cache file name to the file system filter driver. The filter driver reads the data from the cache file and writes it back to the stub file, restoring the stub file to a normal physical file. Rehydration moves the data from storage tier 1 back to storage tier 0, and the process is completely transparent to the application.
public struct MessageSendData
{
    public uint MessageId; //the sequential number of this request.
    public IntPtr FileObject; //the address of FileObject, equivalent to a file handle; it is unique per file stream open.
    public IntPtr FsContext; //the address of FsContext; it is unique per file.
    public uint MessageType; //the I/O request type.
    public uint ProcessId; //the process ID for the process associated with the thread that originally requested the I/O operation.
    public uint ThreadId; //the ID of the thread which requested the I/O operation.
    public long Offset; //the read/write offset.
    public uint Length; //the read/write length.
    public long FileSize; //the size of the file for the I/O operation.
    public long TransactionTime; //the transaction time in UTC of this request.
    public long CreationTime; //the creation time in UTC of the file.
    public long LastAccessTime; //the last access time in UTC of the file.
    public long LastWriteTime; //the last write time in UTC of the file.
    public uint FileAttributes; //the file attributes.
    public uint DesiredAccess; //the DesiredAccess for file open, please reference the CreateFile Windows API.
    public uint Disposition; //the Disposition for file open, please reference the CreateFile Windows API.
    public uint SharedAccess; //the SharedAccess for file open, please reference the CreateFile Windows API.
    public uint CreateOptions; //the CreateOptions for file open, please reference the CreateFile Windows API.
    public uint CreateStatus; //the CreateStatus after the file was opened, please reference the CreateFile Windows API.
    public uint InfoClass; //the information class or security information.
    public uint Status; //the I/O status returned from the file system.
    public uint FileNameLength; //the file name length in bytes.
    [MarshalAs(UnmanagedType.ByValTStr, SizeConst = MAX_FILE_NAME_LENGTH)]
    public string FileName; //the file name of the I/O operation.
    public uint SidLength; //the length of the security identifier.
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = MAX_SID_LENGTH)]
    public byte[] Sid; //the security identifier data.
    public uint DataBufferLength; //the data buffer length.
    [MarshalAs(UnmanagedType.ByValArray, SizeConst = MAX_MESSAGE_LENGTH)]
    public byte[] DataBuffer; //the data buffer which contains read/write/query information/set information data.
    public uint VerificationNumber; //the verification number which verifies the data structure integrity.
}
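The three reply strategies can be sketched in a few lines. The example below is a simplified illustration and not the CloudTier callback itself: `ReadRequest` only mirrors the Offset, Length, and FileSize fields of MessageSendData, while `ReplyKind`, `clampReadLength`, `chooseReply`, and the size threshold are hypothetical. It assumes a handler that serves block reads for large files and falls back to a cache file or rehydration for small ones.

```cpp
#include <algorithm>
#include <cstdint>

// Simplified stand-in for the Offset, Length, and FileSize fields of
// MessageSendData; all other fields are omitted for brevity.
struct ReadRequest {
    std::int64_t offset;
    std::uint32_t length;
    std::int64_t fileSize;
};

// Hypothetical names for the three reply strategies described above.
enum class ReplyKind { BlockData, CacheFile, Rehydrate };

// Number of bytes to serve for a block read. The request is clamped at end
// of file, and a read starting at or past end of file yields zero bytes.
std::uint32_t clampReadLength(const ReadRequest& req) {
    if (req.offset < 0 || req.offset >= req.fileSize) return 0;
    const std::int64_t remaining = req.fileSize - req.offset;
    return static_cast<std::uint32_t>(
        std::min<std::int64_t>(req.length, remaining));
}

// Assumed selection rule (not taken from the product): files at or above
// `blockReadThreshold` bytes are served block by block, smaller files are
// downloaded whole and either rehydrated or served from a cache file.
ReplyKind chooseReply(const ReadRequest& req,
                      std::int64_t blockReadThreshold,
                      bool rehydrateOnFirstRead) {
    if (req.fileSize >= blockReadThreshold) return ReplyKind::BlockData;
    return rehydrateOnFirstRead ? ReplyKind::Rehydrate : ReplyKind::CacheFile;
}
```

For the 100 GB example above, a 64 KB read at offset 0 maps to a block reply of exactly 64 KB, so only that range would need to be fetched from cloud storage.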
Cloud Storage Gateway
To connect to and manage cloud storage, we implement a cloud provider class for each cloud provider using the interface below. We have implemented all of this functionality for Amazon S3 and Azure Storage with the .NET SDK, and you can add code for more cloud providers here.
public abstract class CloudProvider : IDisposable
{
    public abstract bool GetDirFileList();
    public abstract bool AsyncDeleteFile();
    public abstract bool AsyncDownload();
    public abstract bool AsyncUpload();
    public abstract bool AsyncMakeDir();
    public abstract bool AsyncRenameFile();
    public abstract bool AsyncDownloadDirectoryList();
    public abstract bool IsDirectoryExist(string directoryName);
}
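The provider abstraction can be illustrated with a small in-memory stand-in, written here in C++ rather than with the .NET SDK. This is a sketch under assumptions: the method names echo the interface above, but the parameters, the synchronous behavior, and the `InMemoryProvider` class are hypothetical and are not the actual CloudTier signatures. Directories are modeled as key prefixes, which is how object stores such as Amazon S3 represent folders.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

using Bytes = std::vector<std::uint8_t>;

// Simplified, synchronous counterpart of the provider interface. The
// parameter lists are assumptions made for this sketch.
class CloudProvider {
public:
    virtual ~CloudProvider() = default;
    virtual bool Upload(const std::string& path, const Bytes& data) = 0;
    virtual bool Download(const std::string& path, Bytes& out) = 0;
    virtual bool DeleteFile(const std::string& path) = 0;
    virtual bool IsDirectoryExist(const std::string& directoryName) = 0;
};

// Hypothetical provider that keeps objects in memory, useful for testing
// callers without a cloud account.
class InMemoryProvider : public CloudProvider {
public:
    bool Upload(const std::string& path, const Bytes& data) override {
        files_[path] = data;
        return true;
    }
    bool Download(const std::string& path, Bytes& out) override {
        auto it = files_.find(path);
        if (it == files_.end()) return false;
        out = it->second;
        return true;
    }
    bool DeleteFile(const std::string& path) override {
        return files_.erase(path) > 0;
    }
    // A directory exists when at least one stored key starts with
    // "<directoryName>/". Keys are sorted, so the first key not less than
    // the prefix is the only candidate that needs checking.
    bool IsDirectoryExist(const std::string& directoryName) override {
        std::string prefix = directoryName;
        if (prefix.empty() || prefix.back() != '/') prefix += '/';
        auto it = files_.lower_bound(prefix);
        return it != files_.end() &&
               it->first.compare(0, prefix.size(), prefix) == 0;
    }
private:
    std::map<std::string, Bytes> files_;
};
```

Code written against the base class can be exercised with the in-memory provider first and then pointed at a real Amazon S3 or Azure Storage implementation.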