Wednesday, June 20, 2007

Database Growth and Solutions Part III

This is the final post in our series on Database Growth and Solutions for that growth. Today we focus on Database Archiving and Hierarchical Storage Management (HSM).

Database Archiving

Database archiving moves seldom-accessed data off to various storage options while keeping that data easy to access and preserving referential integrity. It also makes it easy to remove the data once its retention requirements have been met.

First you need to determine what data you wish to archive and what your retention and availability requirements are for that data. How long are you willing to wait to retrieve historical data? Can you view that data through a medium other than your current application? Are you at risk if you do not archive your data?

Among the options for archiving data are:
  • Backups/snapshots of the database, which you keep available for as long as you need them. Unfortunately, each one is just a single snapshot and does not address a rolling archive strategy.
  • For certain data, you can export the data and keep it as a dmp file or extract the data into a CSV file.
  • You can build a datamart/datawarehouse or another reporting database and relocate the data.
  • Third party tools that will extract the pertinent data, file/store it and then provide the ability to remove it from the source database. Many third party vendors provide these tools and they are constantly improving their products to be able to execute the archiving capability against most modules in the E-Business Suite.
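
To make the export-then-remove idea concrete, here is a minimal sketch of a home-grown approach using Python's built-in sqlite3 and csv modules. The `orders` and `order_lines` tables, the cutoff date and the function names are all hypothetical, invented for illustration; a real E-Business Suite schema is far more involved, which is exactly why the third party tools exist. The sketch copies old parent rows and their child rows to CSV, then deletes children before parents in a single transaction so referential integrity is never broken.

```python
import csv
import io
import sqlite3
from datetime import date


def archive_orders(conn, cutoff, out):
    """Copy orders older than `cutoff` (plus their lines) to CSV, then delete them.

    Table and column names are hypothetical; adapt them to your own schema.
    Returns the number of parent rows archived.
    """
    cur = conn.cursor()
    # Parent rows whose date is before the cutoff (ISO dates sort as text).
    orders = cur.execute(
        "SELECT id, order_date FROM orders WHERE order_date < ? ORDER BY id",
        (cutoff.isoformat(),),
    ).fetchall()
    if not orders:
        return 0
    ids = [row[0] for row in orders]
    marks = ",".join("?" * len(ids))
    # Child rows that belong to those parents.
    lines = cur.execute(
        f"SELECT id, order_id, item FROM order_lines "
        f"WHERE order_id IN ({marks}) ORDER BY id",
        ids,
    ).fetchall()
    # Write the archive first, so nothing is deleted before it is saved.
    writer = csv.writer(out)
    for row in orders:
        writer.writerow(("order", *row))
    for row in lines:
        writer.writerow(("line", *row))
    # Delete children before parents, in one transaction, so the
    # foreign key is never violated and a failure rolls everything back.
    with conn:
        conn.execute(f"DELETE FROM order_lines WHERE order_id IN ({marks})", ids)
        conn.execute(f"DELETE FROM orders WHERE id IN ({marks})", ids)
    return len(orders)


def demo():
    """Build a tiny in-memory database and archive everything before 2005."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, order_date TEXT)")
    conn.execute(
        "CREATE TABLE order_lines (id INTEGER PRIMARY KEY, "
        "order_id INTEGER REFERENCES orders(id), item TEXT)"
    )
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, "2001-03-15"), (2, "2007-01-10")],
    )
    conn.executemany(
        "INSERT INTO order_lines VALUES (?, ?, ?)",
        [(1, 1, "widget"), (2, 1, "gasket"), (3, 2, "valve")],
    )
    conn.commit()
    out = io.StringIO()
    archived = archive_orders(conn, date(2005, 1, 1), out)
    remaining = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
    return archived, remaining, out.getvalue()
```

Even in this toy case, the delete order and the transaction matter. With dozens of interrelated tables, working out that order by hand is where most of the effort goes.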

If you back up data that you may not need for several years, make certain you take into consideration the software and platform it currently resides on. You may want to keep a backup copy of the OS software and application/database binaries filed away safely as well. Otherwise you may be forced to rebuild an environment to retrieve that data, only to find out you can’t get that OS and software anymore.

In contrast to partitioning, archived data can be fully removed from the source database once it has been moved to the media/format you have chosen (see Hierarchical Storage Management below). Depending on the complexity of your data, this is where a custom-developed archiving solution becomes a great deal more difficult to implement. Again, it may be easier and more cost effective to use a third party vendor to implement your strategy.

Hierarchical Storage Management (HSM)

It’s time to realize that not all data is equal. Some data is business critical and needs to be accessed in milliseconds. Much of the data we accumulate, however, is not so critical, nor does it require the same level of access. So ask yourself this hard question: how much of the data you keep on expensive, highly redundant storage is rarely accessed and not business critical? That’s where HSM comes in.

Hierarchical Storage Management (HSM) views all data as being in some phase of its “lifecycle”. Like most of us, a piece of data is born, serves some purpose and slowly declines in value to the organization. That’s not a very happy thought for human beings, but for data we can be less emotional.

A typical data lifecycle would include points where it is transactional, referential, historical, auditable and, finally, disposable. Transactional data is business critical and highly relevant to operations. It requires high speed access and experiences high incidences of retrieval. On the other end of the lifecycle, auditable data requires lower access speeds and also low incidences of retrieval. Plus it may be read-only at this point in its life. So why store both in the same storage environment and hassle with the performance degradation?
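
As a rough illustration of how that categorization might be automated, here is a small Python sketch that assigns a data set to one of the lifecycle phases above. The `DataSet` fields and every threshold in it are assumptions made up for this example, not recommendations; you would tune them to your own access statistics and retention rules.

```python
from dataclasses import dataclass


@dataclass
class DataSet:
    """Hypothetical summary of one table or data set."""
    name: str
    age_days: int          # age of the newest row
    reads_per_month: int   # observed retrievals
    retention_days: int    # how long the business must keep it


def lifecycle_phase(ds):
    """Return a lifecycle phase name; all thresholds are illustrative only."""
    if ds.age_days > ds.retention_days:
        return "disposable"      # past retention, candidate for deletion
    if ds.reads_per_month >= 1000:
        return "transactional"   # hot data, needs the fastest storage
    if ds.reads_per_month >= 50:
        return "referential"     # still looked up regularly
    if ds.reads_per_month >= 1:
        return "historical"      # occasional reporting access
    return "auditable"           # kept only because it must be retained
```

For example, under these made-up thresholds a data set that is 30 days old with 5,000 reads a month lands in the transactional phase, while a four-year-old data set with no reads and a seven-year retention requirement lands in the auditable phase.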

Here are some points to get you started:
  • Evaluate your data store, categorizing various types of data into one of the data lifecycle phases. How often is it accessed? How fast is it needed? What value does it have? Who owns it? How many users require it?
  • Consider where the data is on your storage platforms. Could it be more efficiently stored elsewhere? You probably would cringe at the thought but there is probably some data that needs to be relegated to microfiche and much that can be archived to tape and stored.
  • Evaluate your Service Level Agreements for data management with your stakeholders. Help them see the value of HSM.
  • Evaluate legal requirements for data storage: does the data need to be easily accessible, or merely accessible in some format? Is there any data (historical mailnotes come to mind) that must be available in the case of a lawsuit, but legally has to be available only on paper, for the purposes of discovery?
  • Consider the options for HSM tiers of data storage. Here are the most popular:
  1. Tape Backups
  2. Microfiche
  3. Secondary data storage (lower-cost and slower storage)
  4. Printed
  5. Optical Disk
  6. Delete it
  • Explore the HSM options from storage vendors.
  • Publish your HSM policy and ensure the buy-in of the data owners.

We recommend that you start where you can show big storage wins quickly.

Please respond if you have other solutions or comments that we could add to a future post.
