Have any of you ever played the arcade game Whac-A-Mole? For those of you who are unfamiliar with it, a typical Whac-A-Mole machine consists of a large, waist-level cabinet with five holes in its top and a large, soft, black mallet. Each hole contains a single plastic mole and the machinery necessary to move it up and down. Once the game starts, the moles begin to pop up from their holes at random. The object of the game is to force the individual moles back into their holes by hitting them directly on the head with the mallet, thereby adding to the player’s score.
I bring up Whac-A-Mole because anytime I start down the path of what is supposed to be a new data integration project, a number of other ‘like’ projects inevitably pop up, seemingly out of nowhere. Trying to control them is often nearly impossible. It’s always interesting to me to watch the amazement of project team members when they realize that, had there been communication and coordination across the company, they would have known about these other projects. But usually there isn’t, and so what is happening across an organization’s data landscape is often poorly understood by the very business areas for which access to high-quality data is critically important.
Visibility into, and transparency of, an organization’s data landscape – from an enterprise perspective – is critical to the success of everything from data integration/SOA to data governance to master data management to data quality to security and privacy compliance. You can’t govern what you don’t know you have; you can’t secure and protect what you don’t know you have (or you spend too much on security because you can’t properly take a risk-based approach); you can’t integrate and optimize what you don’t know you have; and it’s hard to develop master data if you don’t know where all the data of a certain type lives in the organization. And in a world where customer intimacy and experience rule the day, if you aren’t aware of and connecting all the data you have on your customers, could you be doing more harm than good?
Capturing information about your organization’s data landscape – and its interconnected ecosystem – is a fairly straightforward process, though a time-consuming one if it has never been done before. But it’s more than a simple inventory of assets. A good deal of other metadata about the landscape needs to be collected and understood before that inventory delivers game-changing value.
Other important items to know about your data landscape include:
- The alignment of data assets to the organization, functional areas, processes, and services
- Assignment of stewardship responsibilities
- Authoritative data sources
- Data quality metrics – accuracy, integrity, currency
- Data security information – criticality, integrity, availability, access rights
- Definitions, rules, policies, standards, compliance environment
- Relationships and upstream/downstream impacts
- Information exchange mechanisms
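To make the list above concrete, here is a minimal sketch of what a single inventory record might look like. The `DataAsset` class, its field names, and the `missing_metadata` helper are illustrative assumptions, not a standard schema; a real metadata repository would likely model these items in more detail.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DataAsset:
    """One entry in a hypothetical data landscape inventory."""
    name: str
    business_area: str                      # alignment to organization / functional area
    processes: List[str] = field(default_factory=list)   # processes and services supported
    steward: Optional[str] = None           # who holds stewardship responsibility
    is_authoritative: bool = False          # authoritative source for its data type?
    quality: Dict[str, float] = field(default_factory=dict)   # e.g. accuracy, currency
    security: Dict[str, str] = field(default_factory=dict)    # e.g. criticality, access rights
    policies: List[str] = field(default_factory=list)    # definitions, rules, standards
    upstream: List[str] = field(default_factory=list)    # assets this one depends on
    downstream: List[str] = field(default_factory=list)  # assets that depend on this one
    exchange_mechanisms: List[str] = field(default_factory=list)  # e.g. API, batch file


def missing_metadata(asset: DataAsset) -> List[str]:
    """List which key metadata items have not yet been captured for an asset."""
    gaps = []
    if asset.steward is None:
        gaps.append("steward")
    if not asset.quality:
        gaps.append("quality")
    if not asset.security:
        gaps.append("security")
    return gaps


# Example: a newly discovered asset with only the basics recorded.
crm = DataAsset(name="crm_customers", business_area="Sales")
print(missing_metadata(crm))
```

Starting with a record that allows empty fields fits the "start small" approach: an asset can be registered as soon as it is discovered, and the gaps reported by a check like `missing_metadata` show what still needs to be collected.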
Strong enterprise architecture practices and a metadata repository can help link data assets to the business functions and services those assets support. This lets you slice and dice the inventory in ways that support data governance (DG), standards, policies, data quality (DQ), portfolio management, and performance management. It also supports an understanding of how data assets are interconnected across the environment, including with external parties.
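As a rough illustration of that kind of slicing, the sketch below groups assets by business area and walks downstream relationships to see what a change might touch. The `INVENTORY` contents, asset names, and both function names are made up for the example; an actual repository would supply this data.

```python
from collections import defaultdict, deque

# Hypothetical inventory: asset name -> (business area, downstream assets)
INVENTORY = {
    "crm_customers": ("Sales", ["customer_master"]),
    "billing_accounts": ("Finance", ["customer_master"]),
    "customer_master": ("Enterprise", ["marketing_mart", "support_portal"]),
    "marketing_mart": ("Marketing", []),
    "support_portal": ("Service", []),
}


def assets_by_area(inventory):
    """Group asset names by the business area they are aligned to."""
    grouped = defaultdict(list)
    for name, (area, _downstream) in inventory.items():
        grouped[area].append(name)
    return dict(grouped)


def downstream_impact(inventory, start):
    """Return every asset reachable downstream of `start` (breadth-first walk)."""
    seen = set()
    queue = deque([start])
    while queue:
        current = queue.popleft()
        _area, children = inventory.get(current, ("", []))
        for child in children:
            if child not in seen and child != start:
                seen.add(child)
                queue.append(child)
    return sorted(seen)


print(assets_by_area(INVENTORY))
print(downstream_impact(INVENTORY, "crm_customers"))
```

In this toy data, a change to `crm_customers` reaches the customer master and, through it, the marketing and support systems, which is the sort of cross-project dependency that otherwise tends to surface only after the fact.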
Increased agility and speed-to-market are just a couple of the benefits of having full visibility into the enterprise data landscape. With that visibility, an organization can also:
- Identify opportunities and gaps
- Understand risks
- Optimize data and identify and reduce redundancies
- Understand impact risks of changes ‘to the assembly line’
- Define authorities and responsibilities
- Improve overall return on investment in enterprise data assets
If your organization hasn’t done an inventory of its data assets, it can be overwhelming to know where to start. Start small, in manageable chunks, in ways that make sense for your organization, based on need and priority.