Zephyrnet Logo

Becoming a Prized Data Warehouse and Data Integration Tester

Date:

Click to learn more about author Wayne Yaddow.

Data warehouse (DW) testers with data integration QA
skills are in demand.

Data warehouse disciplines and architectures are well established and often discussed in the press, books, and conferences. They have become a standard necessity for most modern organizations. Each business often uses one or more data warehouse systems to make company decisions every day. In a word, the data warehouse is a standard component of business infrastructures.

New business models, constant technological progress, and ever-changing legal regulations require that companies replace their business applications. As a side effect, there are demands for integrating data from existing source applications to target applications – often supported by a data warehouse. Doing so calls for a stringent data integration process model combined with well-defined quality assurance measures.

Every year, companies lose millions as a result of
inaccurate and missing data in their operational databases. This severe issue
corrupts data warehouses, causing them and reliant applications to fail.

As a discipline, data warehouse quality assurance
covers much more than technology. It includes roles and organizational
structures, monitoring, measuring, reporting, and remediating data warehouse
issues.

Data warehouse testers’ technical skill requirements
are unique and demanding, but the rewards can be many. For those who enjoy
testing, hold a strong interest in Data Quality, have a knack for coding
queries, data warehouse, and data integration, DW testing may be a significant
career step.

To get started or further enhance the know-how needed
as a data warehouse tester, several learning stages may be necessary. They are
listed below with suggestions for gaining the required knowledge and
experience.

Understanding Data Warehousing Concepts  

QA analysts should understand the terminology used in
data warehousing, the basic flow of DW data, and routine testing tasks on an
ETL QA project. Data models and data mapping documents are standard means for
expressing DW requirements and architecture. Therefore, the ability to comprehend
DW requirements, then create test scenarios, is a must. Ralph Kimball and Joe
Caserta’s book “The Data Warehouse ETL Toolkit” (Wiley Publishing) and Andy
Oppel’s “Databases: A Beginner’s Guide” (McGraw Hill Publishing) are good
starting points. These books offer an overview as well as some detail about the
related architectures and terminology. Larry English’s “Improving Data Warehouse and Business Information Quality” (Wiley
Publishing) offers several chapters on assessing and assuring Data Quality.

Figure 1 displays a basic representative data warehouse implementation – from the identification of source data (lower left) to report and portal reporting (upper left). In between, several typical phases of the end-to-end data warehouse development process are depicted, such as source extract to staging, dimension data load to the operational data store (ODS), fact data to the data warehouse, and report and portal functions extracting data for display and reporting. Figure 1 illustrates that all ETL programs and data movements should be verified throughout the end-to-end QA process.

Figure 1: Where testing is required in a data warehouse structure

Grasping the Challenges of DW Testing

Testers should understand the challenges that data
warehouse testing frequently presents.

  • Data
    models and data mapping documents represent many aspects of project
    requirements; familiarity with these artifacts is essential to data integration
    testing success.
  • There’s
    a large variety of Data Management systems, including Oracle, Microsoft, and
    IBM. More than a few enterprise data warehouses employ multiple database
    products from these and other vendors.
  • Databases
    are often so large as to necessitate tests based on data sampling. Choosing the
    best methods of data selection is a necessary skill.
  • Few
    test tools exist that cover the many needs for data integration testing,
    particularly tools to aid in testing complex data transformations. Therefore,
    testers should be creative with manual testing while learning and implementing
    as many available tools as possible. With high expectations, but qualified
    staff in short supply, QA managers should seek powerful yet easy-to-use tools
    that can be quickly deployed without overhauling the budget.
  • An
    extensive understanding of SQL queries, data profiling methods, Excel, and DB
    editors is essential. Fortunately, e-learning, books, and classes are available
    for these needs.
  • The
    ability to assess readiness for data warehouse testing during and after test
    planning is complete.

There are many more
challenges and solutions for those challenges. You can learn about them in vendor
classes or online from those offering data warehouse tester training.

Planning for DW Tests and Test Cases

Initiating and implementing a data warehouse test-planning
process may be a new experience for DW test candidates. You can ease this
effort through reviews of existing data warehouse “test planning templates,”  “master test plans,” “test strategies/approaches,”
or data integration “end-to-end” test plans. Through observations of DW
planning templates and associated checklists, people seeking to become data
warehouse testers will learn from these to create effective strategies of their
own. In these documents, testing topics and test scenarios – unique to data
warehouse QA – will be discovered.

  • Common
    ETL goals and objectives as checklists for test coverage
  • Examples
    of defects frequently found during ETL testing; writing test cases to find them
  • Recommended
    ETL test scenarios and test cases
  • Estimating
    DW test resources and schedules
  • Source
    to target data profiling – what to look for
  • Formal
    QA entry and exit criteria for data build deployment
  • Examples
    of ETL test scenarios
  • Functional
    testing (e.g., security, performance, ETL error logs, regression testing
    guidelines)
  • Assessing
    test plans and test readiness

Using data warehousing as an example, Figure 2 illustrates the primary checkpoints (testing points) in an end-to-end data integration testing process. The figure demonstrates how data should be verified as it is extracted from sources, transformed for loads into target databases, and aggregated for loads into data marts. Only after data owners and other stakeholders confirm that a data integration was successful can the whole process be considered complete and ready for production.

Figure 2: Checkpoints that are necessary to audit and verify Data Quality in data integration projects. A data warehouse integration example is depicted here.

Planning and Managing Test Data

Those new to data warehouse testing should learn how
best to address the complications often encountered when planning, then
selecting, test data for the data warehouse QA effort. The variety of choices
for data selection, and challenges associated with each, must be learned so
that testers can make wise choices.

Characteristics
of a Test Data Management Strategy

  • Generate an
    adequate quantity of required test data from all data sources
  • Create synthetic
    data for any missing data (e.g., that needed for negative testing)
  • Carefully
    schedule test data extraction and load process
  • Reduce security
    risk in test data for sensitive/personal content
  • Manage test
    environments for test data preparation and cleanup – gain DBA and configuration
    management support
  • Provide access
    control to test data sets that multiple QA users consume

Critical
Challenges to Test Data Planning

  • Availability of essential test data to
    verify and validate all business uses cases and rules
  • Identity of all source tables,
    the constraints, and dependencies
  • Understanding the range of possible values for
    various fields (including all boundary values)
  • Ability to create volumes of test data from
    heterogeneous data sources that are needed in the test environments
  • The number of resources (e.g.,
    people, tools) to develop traceability for business requirements à
    to test cases à
    to test data

Tester Skills for DW Testing

Staffing and training the QA team are among the essential
steps in the test planning process. Testers’ skills can be developed so that a job
description is easily prepared and candidates effectively interviewed. Critical
DW QA needs, skills, and experiences should be outlined so that QA team members
can be chosen and prepared for all facets of testing.

The following is a data warehouse testing job
description that represents what organizations are often seeking.

  • Solid understanding
    of data warehousing concepts, architectures, and processes
  • Multi-years of
    professional experience testing large databases
  • Talent to
    collaborate with product, project, data engineering, and analytics team to
    provide input on QA resourcing needs and timelines
  • Experiences with
    relational and dimensional database structures
  • Experience with
    ETL & BI test planning and hands-on testing
  • Understanding of
    ETL mapping documents for developing test cases
  • Formal training
    and experiences with SQL queries: Oracle / SQL Server / DB2
  • Exposure to
    testing across all layers of a data warehouse and/or data integration platforms
  • Experiences with
    SQL query development and query execution scripts based on ETL mapping
    documents
  • Experience in
    testing data transformations from source to target mappings
  • Aptitude for
    researching and troubleshooting root causes of ETL issues
  • Ability to use a
    variety of DB editors, Excel, and MS Access for analysis of test query results
  • Solid in ETL
    data validation processes (e.g., Informatica/DataStage/SSIS)
  • Experiences with
    end-to-end data validation for ETL and BI systems 
  • Ability to
    conduct various testing including DW unit, functional, integration, regression,
    performance, and post-production testing
  • Strong in BI report
    validation in Cognos/Business Objects/Microstrategy/SSRS BI environments
  • Ability to work
    with subject matter experts to resolve gaps/questions in requirements
  • Aspirations to
    assist developers in recreating test failures leading to problem resolutions
  • Solid
    understanding of business intelligence (BI) dashboard tools such as OBIEE
  • Experience
    partnering with data stewards, data architects, and software engineers
  • Usage and
    testing skills with reporting and analytics tools (e.g., OBIEE, Tableau)
  • Experiences with
    scripting and automation of data testing  

Concluding Remarks

Data warehousing projects are known to be risky. They can fail for many reasons: poor Data Architecture, inconsistently defined data, inability to integrate data from varied data sources, missing and inaccurate data values, insufficient data transformation testing, inconsistent use of data fields, unacceptable query performance, and more.

DW project risks will be decreased
when staffed with well-trained and motivated testers who serve their projects
from early on and during the entire data warehouse development.

Checkout PrimeXBT
Trade with the Official CFD Partners of AC Milan
The Easiest Way to Way To Trade Crypto.
Source: https://www.dataversity.net/becoming-a-prized-data-warehouse-and-data-integration-tester/

spot_img

Latest Intelligence

spot_img

Chat with us

Hi there! How can I help you?