Regulatory compliance and business risk play a significant role in the implementation methodologies of corporate information systems, and the compliance and business risks associated with these systems are, in general, well known. As part of the implementation process, however, many of these systems will be populated with legacy data, and the compliance and business risks of migrating that legacy data and content into a new system are not necessarily understood. In this context, the risks associated with data migrations are a direct result of migration error. Further, industry testing strategies to mitigate such risk (more specifically, data migration error) lack consistency and are far from deterministic. This is the first of two articles that present thoughts and recommendations on how such a testing strategy can be designed.
Valiance Partners has tested hundreds of data and content migrations, primarily in FDA-regulated industries (pharmaceuticals, medical devices, biotechnologies and food products) and also in the auto and manufacturing industries. The information presented in this article includes some of the lessons learned from our clients' quality control experience and the actual error history from testing the migrations of hundreds of thousands of fields and terabytes of content.
The recommended approach to designing a migration testing strategy is to document each risk and its likelihood of occurrence, and then to define how that risk will be mitigated via testing as appropriate. Identifying risk is tricky, and much of the process will be specific to the system that needs to be migrated. Understanding the type of data being migrated and the characteristics of the destination system is a good starting point. It should also be recognized that most migrations will encounter unexpected types of error. Here are a few actual examples:
If these risks or error conditions can be predetermined, designing the testing strategy can be straightforward. Because migrations yield many examples of "needle-in-a-haystack" error conditions, however, designing a testing strategy can become a complex affair. In an attempt to create a comprehensive list of error conditions, Valiance began logging and categorizing the actual error conditions it encountered. The list below presents these error categories and a few related error conditions:
A complete list of error categories and associated error conditions will be published at Valiance shortly.
If mitigating the risk involved with data migrations involves appropriate testing to minimize migration error, what are the options?
The de facto approach to testing data and content migrations relies upon sampling, where a random subset of the data or content is selected and inspected to ensure the migration was completed "as designed." Those who have tested migrations using this approach are familiar with the typical iterative test, debug and retest method, where subsequent executions of the testing process reveal different error conditions as new samples are reviewed.
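A sampling check of this kind can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular tool: the function name `sample_and_compare` and the record layout (dictionaries keyed by record ID) are hypothetical, and the migration is assumed to copy field values unchanged, whereas a real migration would compare against its own transformation rules.

```python
import random


def sample_and_compare(source, target, sample_size, seed=None):
    """Compare a random sample of source records with their migrated copies.

    Assumption: `source` and `target` are dicts mapping a record ID to a
    dict of field values, and fields are expected to migrate unchanged.
    Returns a list of (record_id, field, expected, actual) mismatches;
    a record missing from the target is reported with field=None.
    """
    rng = random.Random(seed)
    # Draw the sample from a sorted ID list so a given seed is repeatable.
    ids = rng.sample(sorted(source), min(sample_size, len(source)))
    mismatches = []
    for record_id in ids:
        migrated = target.get(record_id)
        if migrated is None:
            mismatches.append((record_id, None, source[record_id], None))
            continue
        for field, expected in source[record_id].items():
            actual = migrated.get(field)
            if actual != expected:
                mismatches.append((record_id, field, expected, actual))
    return mismatches


if __name__ == "__main__":
    # Hypothetical data: 1,000 records with a single corrupted field.
    source = {i: {"title": f"Doc {i}", "status": "approved"} for i in range(1000)}
    target = {i: dict(fields) for i, fields in source.items()}
    target[42]["status"] = "draft"

    # Each run draws a different sample, so the error may or may not surface.
    for run in range(3):
        found = sample_and_compare(source, target, sample_size=50, seed=run)
        print(f"run {run}: {len(found)} mismatch(es) found")
```

Because each run inspects a different subset, a rare error can be missed in one pass and caught in the next, which is the iterative behavior described above.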
Sampling works, but it relies upon an acceptable level of error and an assumption pertaining to repeatability. An acceptable level of error implies that less than 100 percent of the data will be migrated without error, and the level of error is inversely proportional to the number of samples tested (refer to sampling standards such as ANSI/ASQ Z1.4). As for the assumption of repeatability, the fact that many migrations require four, five or more iterations of testing with differing results implies that one of the key tenets of sampling is not upheld, i.e., "nonconformities occur randomly and with statistical independence..."
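The relationship between sample size and the error that can go undetected can be illustrated with a simple calculation. The sketch below is not the ANSI/ASQ Z1.4 procedure; it assumes the idealized case in which errors occur randomly and independently at a fixed rate, so the chance that a sample of n records contains at least one error is 1 - (1 - p)^n. The function names are hypothetical.

```python
import math


def detection_probability(error_rate, sample_size):
    """Chance that a random sample contains at least one erroneous record.

    Assumes errors are independent and occur at a fixed `error_rate`
    (a fraction between 0 and 1); this is a simplification.
    """
    return 1.0 - (1.0 - error_rate) ** sample_size


def samples_needed(error_rate, confidence):
    """Smallest sample size that detects an error with the given confidence."""
    if not (0.0 < error_rate < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("error_rate and confidence must be between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - error_rate))


if __name__ == "__main__":
    # If 1 percent of records are wrong, how many must be sampled to be
    # 95 percent sure of seeing at least one of them?
    print(samples_needed(0.01, 0.95))  # 299
    # A rarer error needs a far larger sample for the same confidence.
    print(samples_needed(0.0001, 0.95))
```

Under these assumptions, the rarer the error condition, the larger the sample needed to find it, and when errors are clustered rather than independent, even this estimate no longer holds.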
Depending on specific requirements, sampling may have a role in a well-defined migration testing strategy. But what are the alternative approaches to sampling that may be more appropriate for other testing scenarios?
Valiance's next article, scheduled for publication in January, will describe the various options for data migration testing and provide a set of recommendations to create a data migration testing strategy that minimizes the chance of error for a specific migration.