Business Intelligence (BI) and data warehousing (DW) are separate entities serving distinct functions in organizations. The data warehouse, representing the enterprise information architecture, serving to achieve a “single version of the truth”, is still necessary today to support the user-facing BI system in the DW/BI solution. Perhaps once operational systems can be completely standardized, reduced to simply transactional data including a history for both master and transaction data, and can put out flawless data, we will be able to run BI tools directly against them. The reality is that the traditional data warehouse is still mostly necessary to support BI.
Once IT has created, populated, and validated the accuracy of the data in the data warehouse, the goal is to enable business users to access and analyze their data without much further IT involvement, yielding self-service BI. Self-service BI sitting on top of quality information should ideally result in better business decision making, enabling executives and managers to act.
The primary objectives of the data warehouse are that of data management and delivery. Data management, which includes data analysis, encompasses the bulk of the effort in that it involves standardizing and integrating data from multiple sources. Additionally, the ability to share data and reuse processes is a principal goal of a mature DW/BI environment.
Knowing that the data warehouse is tied to BI, there are several components that frame the overall BI initiative.
The first aspect in establishing a data warehouse, and the reason to practically have a BI initiative in the first place, is the analysis of business requirements for analytics capabilities. The first challenge is in the prioritization of the strategic analytical requirements, which usually provides the highest return on investment over operational reporting requirements, typically providing a lower ROI. Key factors to consider that impact the overall effort and scope of the project include: assessing organizational readiness, effective communication of upcoming changes to all affected, managing scope of users changing their own initial requirements and breaking apart the project into manageable phases with easily attainable milestones.
Data Warehouse Design
Mapping key business requirements to benefits achieved by BI is where key metrics to be improved upon need to be determined. Beneficial changes to KPIs such as increased revenue, improved profit margins and reduced costs need to be considered in the data infrastructure requirements. A risk mitigation strategy should be built into the overall equation.
Business value can be achieved by addressing those high value, high priority BI requirements and translating them into a sound data design and/or model.
Data analysis and administration are a major part of the effort in the design phase since activities such as data integration, data profiling, data modeling, collaborating with data owners, data governance, data cleansing, resolving data integrity issues and disputes, and other custodial responsibilities such as data validations must take place to achieve a robust design. Data anomalies, misunderstandings, and disagreements about the data may surface among users, which must be addressed.
The first aspect of database design includes the database design schema, which is documented as a physical data model which shows tables, columns, primary and foreign keys, and indices. It is either a star schema, a snowflake schema, or a relational/non-multidimensional schema.
The second aspect is the physical design document, which describes dataset placement, index placement, etc. Other aspects are the script files themselves, which include database data definition language (DDL) and data control language (DCL) statements.
ETL (Extract, Transform, Load) is the most complicated process in a DW/BI environment because it must coordinate the population of all data warehouse target databases. The purpose of the ETL process is to extract source data from disparate sources and move it into the data warehouse target databases while simultaneously standardizing and integrating the data. The most important aspect of ETL design is the source to target mapping document showing all data transformations.
Data Warehouse Build
A proven and sound data warehouse development methodology combined with a collaborative approach with the goal of giving ownership of the BI application to the business people has proven to be most successful. By building a more self-service oriented data architecture, the user community becomes incrementally empowered and productivity rises.
Implementing the back-end ETL process is usually done using an ETL tool such as SQL Server Integration Services (SSIS). Depending on the condition of your source files (integrity, location, history), the complexity of the required transformations and the functional limitations of the ETL tool, you may have to write some custom code to enrich the ETL tool functionality.
BI Architecture and Modeling
Standard procedures, guidelines, and conventions must be in place during the BI modeling phase. Establishing modeling policies is crucial for the DW and BI components and the relationships among them to function properly. The BI model gives way to the business applications which are the business processes that access data from the data warehouse through reports, ad-hoc queries, business dashboards, and so on. The latter allow business execs to act on the information to effect change in the company.
Ongoing management of the interdependent components of the BI initiative should overlap all phases of the data warehouse project to derive the highest business value. Often, companies are internally working on multiple projects in parallel, making it a challenge to manage the DW/BI project while respecting timelines, milestones, and realigning requirements with changing business needs. Until you know what the data warehouse effort will be, you cannot create a project plan and commit to any deadline.
After the new DW/BI project is implemented in production, the project team performs a post-implementation review. Lessons are learned from every project and there is usually opportunity to streamline the development approach by practicing more process re-use, avoiding redundant tasks, improving communication, improving estimation methods, identifying gaps in infrastructure, and defining critical success factors for the next project.
As much as it is a risk, the goal of DW/BI is not to add to the data chaos. It is to turn “data anarchy” from disparate sources and operational systems into data with business value and trustworthy information with maximum flexibility and reusability. This requires a methodological approach to developing the data warehouse and BI applications sitting atop.
The result should be a store of consistent, clean, standardized, trustworthy data, flexible enough to evolve that will be accessed by the BI application and delivered with timeliness to the user community for analysis, i.e. a “single version of the truth”.
Check out our Services page for more information and be on the lookout for updates on more details of how we work with our customers to build their Analytics Infrastructure, Architecture and Model.
Or, in the meantime, contact us for help with implementing your datawarehouse for BI.