|
My online series of articles has focused on the need for businesses to "get serious" about their approach to developing an enterprise business intelligence (BI) and data warehousing (DW) capability. When pursuing this capability, it is important to adopt a holistic view, followed by disciplined investment and execution. To develop the future vision for this capability, you should consider seven interrelated areas:
1. Strategy
2. People
3. Process
4. Metrics
5. Applications
6. Data
7. Architecture
This is the second column that will explore the key considerations of the "Data" focus area. In Part 1 of my column on data, I discussed sourcing and integration considerations, but there are many other data-related topics to consider.
Some of the most challenging but important data considerations relate to the areas of:
Lately the industry has grouped these data management disciplines under the umbrella topic of enterprise information management. This mostly recognizes that they are not just concerns of BI/DW: they need to be addressed across all enterprise systems, and they require the involvement of businesspeople from across the enterprise, too.
Master Data Management
"Why can't I get these reports to roll up to the same regions?"
This question represents a common frustration of business users when tackling enterprise BI and DW. Often, source systems have been designed in isolation from each other, with each system addressing the needs of a particular process or function. When you attempt to pull data together from multiple systems, you will quickly find situations where data values don't match up when you would expect them to.
This often occurs with master and reference data. Examples of master data are your lists of customers, products, employees, company locations and suppliers. Examples of reference data are things that describe or categorize these, such as lists of regions, industries, product types and employee types. These master and reference data lists typically have hierarchical relationships, such as product families, which are often represented differently between systems, too.
Because BI reporting and analysis needs won't stop while these data issues are all fixed in the sources, a tactical approach to solving this problem involves creating a mapping/cross-referencing capability that is applied as the data is brought into the BI/DW databases. This provides a mechanism to map the data to some form of enterprise standard so that it can be reported in a consistent way. This mapping process can also be used to add new reference data values or new hierarchy rollup levels to support specific reporting needs.
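The tactical mapping approach described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not a prescribed design: the source system names (`crm`, `erp`), the region codes, the `REGION_XREF` and `REGION_ROLLUP` tables, and the `UNMAPPED` placeholder are all hypothetical.

```python
from collections import defaultdict

# Hypothetical cross-reference table:
# (source system, source region code) -> enterprise standard region.
REGION_XREF = {
    ("crm", "NE"): "Northeast",
    ("crm", "SE"): "Southeast",
    ("erp", "R01"): "Northeast",
    ("erp", "R02"): "Southeast",
}

# Hypothetical extra rollup level added in the BI/DW layer for reporting.
REGION_ROLLUP = {"Northeast": "East", "Southeast": "East"}

# Placeholder for source values that have no mapping yet.
UNMAPPED = "UNMAPPED"


def conform(record):
    """Return a copy of a source record with the enterprise region and rollup attached."""
    key = (record["source"], record["region_code"])
    region = REGION_XREF.get(key, UNMAPPED)
    super_region = REGION_ROLLUP.get(region, UNMAPPED)
    return {**record, "region": region, "super_region": super_region}


def sales_by_region(records):
    """Total the amounts by enterprise region after conforming each record."""
    totals = defaultdict(float)
    for rec in map(conform, records):
        totals[rec["region"]] += rec["amount"]
    return dict(totals)


# Illustrative rows from two source systems; "R09" has no mapping.
rows = [
    {"source": "crm", "region_code": "NE", "amount": 100.0},
    {"source": "erp", "region_code": "R01", "amount": 50.0},
    {"source": "erp", "region_code": "R09", "amount": 5.0},
]
print(sales_by_region(rows))
```

In this sketch the two systems' different codes for the same region roll up together, while the unmapped code lands in a visible `UNMAPPED` bucket rather than being silently dropped, so that someone can follow up on it.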
Creating a mapping of master data values between sources is a good start, but it also creates its own set of issues. For example, if people are used to seeing the product hierarchy one way in reports from the source system they use the most, and then see it differently in reports from the enterprise BI/DW applications, their confusion may actually increase, even though you were trying to eliminate confusion by mapping everything to an enterprise standard. Additionally, you'll need to assign people who are responsible for keeping the mappings up to date as the master/reference data values change, and establish business processes that enforce periodic review of the mappings.
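One way to support the ownership and periodic review described above is to record an owner and a last-reviewed date on each mapping entry, then report on entries that are overdue and on source values that have no mapping. The sketch below assumes a hypothetical entry layout, hypothetical owner names and dates, and an arbitrary 180-day review window; none of these come from a particular tool.

```python
from datetime import date, timedelta

# Hypothetical mapping entries, each with a responsible owner and a last-reviewed date.
MAPPINGS = [
    {"source": "crm", "code": "NE", "standard": "Northeast",
     "owner": "sales_ops", "reviewed": date(2024, 1, 10)},
    {"source": "erp", "code": "R01", "standard": "Northeast",
     "owner": "finance", "reviewed": date(2023, 3, 1)},
]


def overdue_reviews(mappings, today, max_age_days=180):
    """Return mappings whose last review is older than the allowed window."""
    cutoff = today - timedelta(days=max_age_days)
    return [m for m in mappings if m["reviewed"] < cutoff]


def unmapped_values(mappings, observed):
    """Return (source, code) pairs seen in source data that have no mapping entry."""
    known = {(m["source"], m["code"]) for m in mappings}
    return sorted(set(observed) - known)


# Illustrative inputs: a review date and the codes observed in a recent load.
today = date(2024, 3, 1)
observed = [("crm", "NE"), ("erp", "R01"), ("erp", "R09")]

for m in overdue_reviews(MAPPINGS, today):
    print("Review overdue:", m["source"], m["code"], "owner:", m["owner"])
for source, code in unmapped_values(MAPPINGS, observed):
    print("No mapping for:", source, code)
```

A report like this does not replace the business process itself. It only gives the people responsible for the mappings a regular list of what needs attention.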
The ideal solution to this problem is "bigger than BI": it means addressing the issue in the source systems themselves. This solution involves:
1. Getting business groups to standardize the lists of master data values.
This requires establishing a data governance structure (people and processes). Data governance activities must involve the "right" businesspeople, who can make the appropriate decisions about data values, and must have the backing of higher-level executives who can support the decisions recommended by the data governance team members and "break ties" when necessary. Typically, data governance includes subgroups of data stewards who care the most about particular data domains (for example, products). These stewardship subgroups should have cross-company representation from the relevant business functions and the related business processes and applications, so that any recommendations related to that data domain are understood and agreed to by the most appropriate people.
|