IT executives are looking for ways to improve storage management and maximize their storage investment. Hierarchical Storage Management (HSM), a common approach in mainframe environments for many years, is now getting attention in client/server environments. In addition to the benefits that can be derived from HSM and other leading storage technologies, active archiving is recognized as a cost-effective strategy that solves the problem of excessive database growth for the long term. In fact, active archiving can work within the framework of HSM systems to enable a best-practice "staged" approach to enterprise data management.
What do HSM and Active Archiving Offer?
HSM systems are designed to reduce storage and administration costs while keeping data available. This technology intelligently migrates files along a hierarchy of storage devices ranked in terms of cost per unit of storage, speed of access and retrieval, and available capacity. The HSM solution manages files based on the rules specified by the storage system administrator. Once these rules are defined, the HSM system manages file migration and de-migration automatically, for example by moving files of a certain age or type to a near-line storage device.
Although various products differ, HSM generally classifies data according to a three-tier architecture.
* Level One data (hot data) must be readily available. Hot data is kept on the network server.
* Level Two data (warm data) is accessed periodically. Warm data is migrated to a near-line storage device, such as a low-cost, high-capacity hard drive or an optical disk jukebox.
* Level Three data (cold data) is infrequently used, but is kept for business or legal reasons. Cold data is archived in a tape storage library.
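As a rough sketch of the three-tier classification above, assume the administrator's rule is based only on the time since a file was last accessed. The 30-day and 365-day thresholds and the function name are hypothetical and are not taken from any HSM product.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds; in practice the storage administrator defines these rules.
WARM_AFTER = timedelta(days=30)
COLD_AFTER = timedelta(days=365)


def classify_tier(last_access: datetime, now: datetime) -> int:
    """Return 1 (hot), 2 (warm) or 3 (cold) from the time since last access."""
    idle = now - last_access
    if idle >= COLD_AFTER:
        return 3  # Level Three: tape storage library
    if idle >= WARM_AFTER:
        return 2  # Level Two: near-line storage device
    return 1      # Level One: network server
```

A real rule set would typically also consider file type, as the text notes; age alone is used here to keep the sketch short.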
Every time HSM migrates a file, it places a "stub" on the network server that points to the file location on the near-line storage device or tape. This pointer system is completely transparent to the user, who can view and access the file as if it were still stored on the server. When a user requests a file that has been migrated, the HSM system "de-migrates" the file by copying it to the appropriate higher level for easy access. The response time for accessing this data depends on the level of the HSM hierarchy at which the file is stored and on the type of medium used to store it.
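The stub mechanism can be illustrated with a small in-memory model. This is a sketch under simplifying assumptions, not how any particular HSM product is implemented: the class names `Stub` and `HsmStore`, the tier labels, and the use of dictionaries in place of real storage devices are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Stub:
    """Placeholder left on the server; records which lower tier holds the file."""
    tier: str  # for example "nearline" or "tape"


@dataclass
class HsmStore:
    server: dict = field(default_factory=dict)  # name -> bytes or Stub
    lower: dict = field(default_factory=dict)   # (tier, name) -> bytes

    def migrate(self, name: str, tier: str) -> None:
        """Move a file's contents to a lower tier and leave a stub on the server."""
        entry = self.server[name]
        if isinstance(entry, Stub):
            raise ValueError(f"{name} has already been migrated")
        self.lower[(tier, name)] = entry
        self.server[name] = Stub(tier)

    def read(self, name: str) -> bytes:
        """Return the file's contents, de-migrating first if only a stub is present."""
        entry = self.server[name]
        if isinstance(entry, Stub):
            # De-migration: copy the data back to the server level.
            entry = self.lower[(entry.tier, name)]
            self.server[name] = entry
        return entry
```

The caller of `read` never sees the stub, which mirrors the transparency described above; the only visible difference in a real system would be the longer response time for migrated files.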
Active archiving is similar to HSM, providing users the ability to manage rarely accessed data. However, unlike traditional archiving, active archiving is a proven technology that safely archives and removes nonessential data from complex relational databases with 100 percent accuracy. Going well beyond the traditional definition of archiving, active archiving allows companies to select and remove precise sets of rarely used data or "active reference data" from a production database and save the data to an Archive File, referentially intact and complete. Archive Files can be saved on the most convenient and cost-effective storage media where the data remains "active" for easy access and possible restoration when needed. These capabilities keep archived data accessible and dramatically reduce database overload, allowing companies to reduce storage requirements, improve application response time and reallocate current capacity to support more users and transactions.
It's true that active archiving and HSM both address the problem of explosive data growth by moving data to more cost-effective storage devices. However, active archiving is designed for relational data, while HSM is best suited for other types of data such as document files, bit maps, and video clips. Although HSM is ideal for managing these types of data, the technology is poorly matched for managing relational database tables, which can be very large.
Safely and Accurately Managing Relational Data
Traditionally, IT organizations have been reluctant to remove relational data from production databases because of inherent difficulties. Accidentally deleting essential data could bring mission-critical systems to a halt. Another major concern is the need to quickly locate and access data once it is archived and removed from the production database. Responding to audits, lawsuits, government or security investigations, as well as answering customer questions, may require access to archived data and possibly restoring it to production.
Active archiving provides a "best practice" strategy for managing relational database growth. This proven technology cleanly and safely archives and removes precise subsets of relational data accurately and completely, while maintaining the integrity and business context of the data. The active archiving process saves not only the data, but also the metadata describing tables, columns and relationships used to create the archive, ensuring that archived data can always be accessed and restored if needed, even if the data model has changed.
Active archiving technology is unique because it understands and intelligently processes data relationships, regardless of the complexity. This understanding is obtained from a database catalog and from supplemental relationships defined to a shared directory. These capabilities make it easy to define, retrieve and manipulate complex, referentially intact sets of related data from multiple tables, without writing low-level extract routines.
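To make the idea of a referentially intact archive concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not the mechanism of any active archiving product. The two-table schema (`customers` and `orders`), the function name and the layout of the returned dictionary are assumptions made for illustration, and the relationship is hard-coded rather than read from a catalog.

```python
import sqlite3


def archive_customers(conn: sqlite3.Connection, customer_ids: list) -> dict:
    """Copy the given customers and their orders into an archive, then delete them."""
    marks = ",".join("?" * len(customer_ids))
    selections = {
        "customers": f"SELECT * FROM customers WHERE id IN ({marks})",
        "orders": f"SELECT * FROM orders WHERE customer_id IN ({marks})",
    }
    archive = {
        "columns": {},  # metadata: column names per table
        # metadata: the relationship used to build the archive (hard-coded here)
        "relationships": [["orders.customer_id", "customers.id"]],
        "rows": {},
    }
    for table, sql in selections.items():
        cur = conn.execute(sql, customer_ids)
        archive["columns"][table] = [col[0] for col in cur.description]
        archive["rows"][table] = [list(row) for row in cur.fetchall()]
    # Remove child rows first, then parents, inside a single transaction.
    with conn:
        conn.execute(f"DELETE FROM orders WHERE customer_id IN ({marks})", customer_ids)
        conn.execute(f"DELETE FROM customers WHERE id IN ({marks})", customer_ids)
    return archive
```

Because the column names and the relationship travel with the rows, the archive can still be interpreted after the production schema changes, which is the point the text makes about saving metadata alongside the data.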
Improving Performance, Availability and ROI
By separating mission-critical data from non-critical data, companies can safely reduce the size of overloaded databases by as much as 50 percent, and sometimes more, during the initial archive. Significant improvements in application performance and availability are realized immediately. Response time is faster and access to decision-making information is easier. Service levels improve and productivity is enhanced. Expensive upgrades and maintenance fees can be deferred or eliminated, reducing operating costs. Ongoing active archiving (daily, weekly or monthly) helps manage data growth and keeps applications operating at peak performance.
Active archiving keeps databases streamlined, significantly reducing the time and resources needed to rebuild the database when disaster strikes. The recovery process can be staged by recovering mission-critical data first and nonessential data later, if necessary. This strategy enables IT organizations to maintain databases at a manageable size that allows them to meet their disaster recovery service level agreements. Software upgrades and routine backup and restore procedures will also take much less time.
Active archiving is database independent, allowing IT organizations to manage archived data across the leading relational database platforms (Oracle, DB2/UDB, Sybase, SQL Server, and Informix). It is possible to archive and restore data from one database platform to another or archive data from multiple relational databases to create one or more enterprise Archive Files. If the DBMS platform changes, then active archiving will continue to support the enterprise data management strategy through the transition.
Active Archiving Enhances the Value of HSM Storage
While HSM is file-based, active archiving is row-based. HSM manages relational data at the table or dataset level (a relational database table is physically stored as a file), and active archiving manages relational data at the row or record level. Active archiving works within the framework of HSM technology and serves as a gateway for HSM to manage subsets of rarely accessed relational data at the row level. This capability enhances the value of HSM storage.
Although HSM can migrate relational database tables up and down the HSM hierarchy, based on the last reference to any data in a database table or file, the size of the production database does not change. Because it's likely that users will need to access a small part of the database at least once during the period designated by the system administrator, the entire relational database file will probably continue to reside at Level One storage.
For example, a customer database table is usually accessed on a regular basis, which would keep it at Level One. Although only a subset of this data remains "hot" (that is, current customers), the entire dataset must be kept on the network server, including historical customer data that is rarely, if ever, needed. This limitation minimizes the potential benefit of HSM, degrades application performance and limits the availability of mission-critical CRM applications.
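The difference between file-level and row-level views can be shown with a toy calculation. The sketch assumes each row carries a hypothetical last-activity date; the function names are illustrative only.

```python
from datetime import date


def file_last_reference(row_dates: list) -> date:
    """HSM sees only the file, so its last reference is the newest access to any row."""
    return max(row_dates)


def rows_eligible_for_archive(row_dates: list, cutoff: date) -> list:
    """A row-level view: indexes of rows not touched since the cutoff date."""
    return [i for i, d in enumerate(row_dates) if d < cutoff]
```

With three rows last touched in 2020, 2021 and 2024, the file as a whole looks recently used because of the single 2024 row, yet two of the three rows are older than a 2023 cutoff and could be handled at the row level.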
Active archiving addresses this problem by safely removing historical customer data from the Level One online database at the row level and storing it in a compressed format in one or more Archive Files, which are then managed by HSM. Immediate benefits include faster customer response time, improved call agent productivity, and the ability to manage larger call volumes while reducing the average cost per call.
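As a sketch of the "compressed format" step, the archived rows could be serialized and compressed before HSM takes over the resulting file. JSON and gzip are stand-ins chosen for illustration; the source does not say which format an Archive File uses, and the function names and dictionary layout are hypothetical.

```python
import gzip
import json


def pack_archive(archive: dict) -> bytes:
    """Serialize an archive dictionary to JSON and compress it with gzip."""
    text = json.dumps(archive, sort_keys=True)
    return gzip.compress(text.encode("utf-8"))


def unpack_archive(blob: bytes) -> dict:
    """Reverse pack_archive so archived rows can be browsed or restored."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))
```

The bytes returned by `pack_archive` could be written to an ordinary file, which an HSM system would then treat like any other flat file when applying its migration rules.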
Companies that have already deployed HSM understand the benefits of "staged" data management for flat files. The HSM migration rules can be defined to manage "warm" and "cold" Archive Files. By combining HSM and active archiving, IT organizations benefit from both streamlined relational databases and HSM-managed Archive Files. Routine active archiving can be implemented across the enterprise, with or without an HSM solution, to achieve a greater return on investment from the existing IT infrastructure.