Data Maturity in a Social Business and Big Data World

Dion Hinchcliffe, executive vice president of strategy at Dachis Group, defines social business in his book Social Business By Design as the intentional use of social media to drive meaningful, strategic business outcomes. Companies are leveraging social media platforms and data to drive business, inform consumer engagement, and enhance and expand their analytical capabilities. Watching Twitter feeds? Check. Monitoring Facebook and Pinterest? Yep. Building internal collaboration platforms to more tightly integrate your business partners? Of course.

To harness the transformative power that social business and social business analytics promise, companies need to integrate information from multiple data sources, both structured and unstructured. It is critical, then, to have a strong data governance foundation in place, as well as an infrastructure that can quickly consume, integrate, analyze, and distribute this new information. Incompatible standards and formats of data in different sources can prevent the integration of data and the more sophisticated analytics that create value.

A company’s ability to strongly leverage social media as a social business will be infinitely enhanced by having a strong foundational data and technology infrastructure, along with data governance policies and processes for integrating social media data sets.

The figure below overlays Hinchcliffe’s social business maturity model (in red, with four of eight community management competencies shown in gold) with a traditional data governance maturity model (shown in blue) and technology maturity model (in orange).

DM Maturity in a Social Business

Implementing cross-channel customer engagement or enriching in-house data with purchased behavioral/lifestyle data WITHOUT master data and a master data management system already in place would require hours of manual manipulation by employees, leaving little time for the actual analysis of data. Additionally, services such as alerts and recommendations could not be delivered accurately (potentially risking a privacy violation) without a master profile of the customer. Likewise, an organization’s internal infrastructure (beyond big data clusters) must be sophisticated enough to move data throughout the organization, when and where it’s needed.
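To make the point about master profiles concrete, here is a minimal sketch (not any particular MDM product) of consolidating the same customer's records from several channels into a single "golden" record. The sample records, the exact-email match rule, and the first-non-empty survivorship rule are simplified, hypothetical stand-ins for the fuzzy matching and richer survivorship logic a real MDM system would apply.

```python
from collections import defaultdict

# Hypothetical records for the same customer arriving from different channels.
records = [
    {"source": "web",    "email": "pat@example.com", "name": "Pat Smith", "zip": "22102"},
    {"source": "store",  "email": "pat@example.com", "name": "P. Smith",  "zip": None},
    {"source": "social", "email": "pat@example.com", "name": "Pat Smith", "zip": "22102"},
]

def build_master_profiles(records):
    """Group records by a shared key and keep the most complete value per field."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["email"]].append(rec)  # naive match rule: exact email

    masters = {}
    for email, recs in grouped.items():
        master = {"email": email, "sources": [r["source"] for r in recs]}
        for field in ("name", "zip"):
            # naive survivorship rule: first non-empty value wins
            master[field] = next((r[field] for r in recs if r.get(field)), None)
        masters[email] = master
    return masters

print(build_master_profiles(records))
```

With even this crude consolidation, an alert or recommendation can be keyed to one customer profile rather than to three partial, channel-specific views of the same person.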

While the rush to social business and big data certainly is on, smart data companies are also investing in foundational data management, data governance, and technology architecture to support their long-term vision.

Data and Trust – Thoughts from the World Economic Forum’s Global Agenda Outlook 2013

In the recently released “Global Agenda Outlook 2013” from the World Economic Forum, one of the main topics tackled as part of the agenda is titled “Thriving in a Hyperconnected World.” The main premise is that the physical and digital worlds are merging rapidly, and institutions and leaders are not prepared to deal with it. Not only are the technologies evolving, but the amount of data being generated is unprecedented, and it will only grow.

Two of the major components of this “hyperconnectedness” that the WEF discusses are data and trust. Marc Davis of Microsoft's Online Services Division frames it nicely: “[Big data] is a question of the structure of the digital society and digital economy, what it means to be a person, who has what rights to see and use what information, and for what purposes might they use it.”

Globally, countries and industries are wrestling with the policy, economic, and regulatory structures (on top of the technical interoperability challenges) needed to control the flow and sharing of data, particularly personal data. Yet there is virtually nothing done today that does not have a data component. There are huge societal benefits to the amount of data generated today, as well as potentially enormous, even life-threatening, drawbacks if that data is not managed properly, is collected erroneously, or is inappropriately shared.

There are many reasons why we each give up data: to open a bank account, to purchase a vehicle, to get healthcare treatment, to find people to date, to unlock a badge from our favorite gaming site. But in these instances, we make a conscious choice to give up certain pieces of data and information about ourselves.

But we don’t know what the internal data quality practices are of the companies to whom we give data; we don’t know how they manage their cyber security practices; we don’t know how their internal access and authentication controls are managed; we don’t know if the company has the ability to do tagging at the data element level to fortify its privacy compliance protocols; we don’t know to whom the company resells our data; we don’t know if the company’s legitimate business partners with legitimate access to our data are also protecting our data with the same degree of integrity.

Knowing what I know about the limited capabilities of federal and state governments here in the U.S. to actually integrate and share data, I’m far less concerned with ‘Big Brother’ than I am with Amazon and Apple (both of whom seem to do a far more effective and efficient job of managing my data) doing something creepy with my data (like recommending that I purchase a Justin Bieber CD).

Trust frameworks, transparency, policies, accountabilities: these are all steps on the right path to building trust. Engendering trust among people and society in how data is collected, managed, and used requires sophistication (in technology, in policies and regulations, and in economic models) far beyond where many organizations and institutions are today. Unfortunately, policy will never keep up with the speed of technological innovation, so it may take a while to get to trust.

Most importantly, however, individuals need to take responsibility for their data: becoming educated about it, learning how to control it, and being given more controls over it (especially when it’s in the hands of institutions). This part of the discussion is largely absent from the overall debate and needs to be given its due attention.

Thoughts about how to move this individual responsibility discussion forward?

Identity, Data, Privacy and Security – Tumbling Together

For over a decade, the Federal Government has had numerous efforts and initiatives on identity and access management (IAM). These efforts morphed into identity, credential, and access management (with of course its own acronym, ICAM), underscoring a fundamental principle of …

Open Data – Is Anyone Really Using It?

Over the past four years, there has been an unprecedented push for government transparency at the federal and state levels. The statistics are really quite amazing, considering it’s the government that we’re talking about. There are almost 400,000 raw and geospatial data sets on Data.gov, including some that leverage semantic web and RDF (Resource Description Framework) technology for linked open data sets. Thirty-five states provide other raw open data sets online (http://www.data.gov/opendatasites). Many cities and counties are joining in this effort, though some are reticent due to the revenue they lose by not being able to sell these data sets to data aggregators and research institutions. Close to 1,300 apps have been created from the federal data sets alone, and cities across the country add to that total through the local hackathons and codeathons that regularly take place.

Data is a critical asset of the governmental ecosystem, and we’re actually seeing governments at all levels (and internationally) making that data available (to the extent that it is permissible and they can do so while remaining in compliance with federal and state privacy policies). Don’t get me wrong, there is still a lot more that can be – and will be – done, but the past three years have really been momentous.

But what feels really absent to me in the big data discussions that everyone seems so enamored with is what the private sector is actually doing to incorporate and leverage these governmental open data sets with other “externally” sourceable data (in which I include data from credit reporting agencies, data aggregators, social media, etc.). I recently attended the Strata + Hadoop conference in New York City. There were very few government participants, and I heard very little about how the private sector is using and leveraging open data. While marketing firms have used census data for years, I still struggle to find good examples of how healthcare organizations, banks, logistics companies, or energy companies are incorporating this data to enrich and add insight to the data sets they already have.
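As a simple, hypothetical illustration of the kind of enrichment I have in mind, the sketch below joins an in-house customer table to a public data set (say, median household income by ZIP code exported from a census or Data.gov source) so analysts can segment customers by the demographics of where they live. The tables, columns, and income bands are made up; the point is how little code the join itself requires once both sets are in a common format.

```python
import pandas as pd

# Hypothetical in-house customer data.
customers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "zip": ["22102", "10027", "60614"],
    "ltv": [1200.0, 340.0, 860.0],
})

# Hypothetical open data set, e.g. median household income by ZIP code
# downloaded as CSV from a census or Data.gov source.
income_by_zip = pd.DataFrame({
    "zip": ["22102", "10027", "60614"],
    "median_income": [134000, 52000, 91000],
})

# Enrich the in-house data with the open data set, then segment on it.
enriched = customers.merge(income_by_zip, on="zip", how="left")
enriched["income_band"] = pd.cut(
    enriched["median_income"],
    bins=[0, 60000, 100000, float("inf")],
    labels=["low", "mid", "high"],
)
print(enriched)
```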

Maybe they are, and are just being quiet about it. Or maybe, outside the DC beltway, companies aren’t really paying that much attention to what the Feds are doing with their data assets. Or maybe they don’t trust the quality of the data from the Feds, just as they don’t trust most other things coming from the federal government.

In some ways, it seems like the Feds have been really promoting these open data efforts. Certainly the start-up community has noticed and continues to churn out new apps to take advantage of these data sets. The “brand” – open data – is certainly locked in and recognized worldwide.

But what about established companies, of all sizes? Are they getting the message? Do they understand the value in those datasets? Is there anything that can be done to further communicate and message this to the private sector? Do we even care? I don’t have answers, I’m just raising the questions based on my observations.

I guess I am a believer that publishing open data for the sake of being transparent isn’t necessarily all that helpful to anyone other than those organizations and non-profits who are focused on ensuring that “we-the-people” aren’t getting screwed. And certainly, there is some base value in that. But the real leverage points, the real utility of open data is when it can be combined with other data and actually USED to solve problems or answer questions or help companies innovate.

We are still in the early maturity stages of open data and big data. I look forward to the growth in this area, not just from the government side, but from private industry in terms of how it incorporates and leverages open data sets. In the meantime, I will hope that industry really starts to appreciate the value in what government is giving it so freely. Maybe in this Darwinistic environment in which we live, the companies that get it will leverage the data, drive innovation, and rise to the top. The others? Well, they won’t.

How Data Management and Data Governance Support Big Data ROI

According to Aberdeen Group (source: Data Management for BI: Fueling the Analytical Engine with High-Octane Information), best-in-class companies take 12 days on average to integrate new data sources into their analytical systems; industry-average companies take 60 days; and laggards, 143 days. If you’re an average company (raise your hand if you think you are), it will take you about two months to integrate new data. Do you have that much time? Can you afford to react that slowly to customer or market opportunities? For laggard companies, this really means they’re dead in the water.

Traditional, legacy, siloed systems are heavy: incompatible standards and data formats slow or prevent the quick integration of existing data sets with each other or with new data. This inability to quickly and effectively integrate new data sets for real-time or predictive analytics limits the ability of organizations to pursue new opportunities, support customer needs, and drive insights. In short, it limits an organization’s ability to be proactive in revenue-generating situations.

Part of the reason for this delay, and for the inability to drive the ‘last-mile’ analytics that companies need, is a lack of data management processes. No one needs the myriad of data across organizations and ecosystems explained to them anymore: transactional data, operational data, analytical repositories, social media data, mobile device data, sensor data, structured and unstructured data. With some exceptions, a lack of data is usually not an organization’s main problem.

The main problem is how to manage, integrate, and distribute it (appropriately) so that organizations can nimbly exploit data and opportunities. The Data Management Association International (DAMA) provides a framework that takes a holistic approach to understanding the data and information needs of the enterprise and its stakeholders. Most of my readers are already aware of the DAMA ‘wheel’, which highlights ten areas of data management, with data governance as the center point. In addition to the Data Warehousing & Business Intelligence Management piece of the wheel, other key areas of data management that matter for data integration, analytics, and optimization include data governance, data architecture, master data management, metadata management, and data security.

From a big data perspective, these areas help provide answers to the following questions (a minimal sketch of a metadata catalog that supports the first few appears after the list):

  • What data do we have? Where are gaps in data that we need?
  • What data is intellectual property for us that can help us exploit new opportunities?
  • How do we integrate the right data together?
  • How do these data sets relate to each other?
  • Do we have all of the data about this (fill in the blank – person, event, thing, etc.)?
  • What are the permissible purposes of the data? Can we link and leverage these disparate data sets together and still be in regulatory compliance?
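To make the first few questions concrete, here is a minimal, hypothetical sketch of a metadata catalog: each data set is registered with an owner, a sensitivity classification, and its permissible purposes, so that “what data do we have?” and “may we use it for this purpose?” become simple lookups rather than archaeology. The entries, fields, and purpose check are illustrative only, not a reference to any particular catalog product.

```python
from dataclasses import dataclass, field

@dataclass
class DataSetEntry:
    """One registered data set in a (hypothetical) enterprise metadata catalog."""
    name: str
    owner: str
    domain: str                       # e.g. "customer", "sensor", "social"
    sensitivity: str                  # e.g. "public", "internal", "pii"
    permissible_purposes: set = field(default_factory=set)

catalog = [
    DataSetEntry("crm_contacts", "Sales Ops", "customer", "pii", {"service", "marketing"}),
    DataSetEntry("twitter_mentions", "Digital Team", "social", "public", {"marketing", "analytics"}),
    DataSetEntry("meter_readings", "Field Ops", "sensor", "internal", {"operations", "analytics"}),
]

def datasets_for_purpose(catalog, purpose):
    """Answer 'which data sets may we use for this purpose?' with a simple lookup."""
    return [d.name for d in catalog if purpose in d.permissible_purposes]

print(datasets_for_purpose(catalog, "marketing"))  # ['crm_contacts', 'twitter_mentions']
```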

Data-driven, data-centric (best-in-class) organizations consider data needs early and often in the business strategy process. They are not in reactive mode, waiting until after IT has architected and implemented a solution to determine what the reporting, analytics, and big data opportunities may be. Understanding the business strategy and business needs drives strong data management and data governance; data management and data governance, in turn, allow the strong management of data assets, so that those assets can be leveraged for big data purposes in ways that optimize benefit and return on investment to the organization.

Data management maturity supports big data maturity by providing the policies, processes, and infrastructure to quickly consume, integrate, analyze, and distribute high quality, trusted data to the user’s (employee, executive, customer, business partner) point of touch so that insights can be derived and action taken as rapidly as possible.

Beyond Quality And Security – The Importance Of Establishing Control Points For Information Management Across The Organization

Strong data management doesn’t just begin on the back end, when the data actually hits a database. It begins long before that, early in the data lifecycle, and across many areas of the organization.

One of the crucial elements in becoming a data-centric organization is changing the culture so that people think about data through a variety of prisms. Strategies come down from top management; specific goals and objectives then get developed. The next questions should be: What data do we need to support those goals, objectives, and programs? Does that data already exist in the organization, or do we have a gap? For the gaps, how do we close them, and how do we ensure tightness and alignment with existing data management strategies?

There are a number of control points that come out of this scenario:

  • Strategic planning – What data do we need to measure success?
  • Goals and objectives – What KPIs and metrics are important? What type of reporting and dashboards are required? Do we have all the data that we need for reporting and metrics measurements? Do we trust in the quality and integrity of the data that we need for reporting? If not, what gaps do we need to close to build trust?
  • Budgeting and financing – What controls have we implemented to support the optimization of our data investments across the entire enterprise? Are we aligning various programs across the organization such that we reduce data silos and redundancy, and optimize information sharing and infrastructure development where possible? Does someone have stated authority and responsibility for overseeing this planning and budgeting?
  • Business case development – What data do we have in-house (presumes a knowledge of all enterprise data assets) to support new programs or applications? Can we leverage these in-house data sets for this purpose (compliance/regulatory check-point)? How do we close the data gaps we have (can we capture data via existing sources? Do we need to purchase data from 3rd parties? Are there open data sets that are leverageable?)
  • Requirements gathering – Where are the authoritative sources of the different data sets we need? Are we leveraging organizational reference and master data?
  • Build vs. buy decisioning – If we build something in-house, how can we maximize previous infrastructure investments in data, hardware, middleware, and exchange mechanisms so as to minimize duplication or silo building? Buying a solution means building in checkpoints for ensuring ease of integration and data extraction.
  • Contracts and Procurement – What language do we have in our contracts to enforce compliance or alignment with internal data management and data security policies? Do we always get a data dictionary? Do we ask vendors to provide us with mappings to our conceptual and logical data models? Do we ensure data quality levels (for certain types of acquisitions)? Who actually owns the data? If we’re outsourcing our data, what are our access rights for transactional, analytical, regulatory, and recovery purposes? (A sketch of a minimal data dictionary entry follows this list.)
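For the data dictionary question in particular, even a lightweight, machine-readable entry per field goes a long way. The hypothetical example below sketches the minimum a contract could require a vendor to deliver for each field: name, type, definition, allowed values, the system of record, and a mapping back to the buyer's logical data model. The field names and structure are illustrative assumptions, not a standard.

```python
# A hypothetical, minimal data dictionary entry that a contract could require
# a vendor to deliver for every field in a purchased or outsourced data set.
data_dictionary_entry = {
    "field_name": "customer_status",
    "data_type": "string",
    "definition": "Current lifecycle state of the customer account.",
    "allowed_values": ["prospect", "active", "suspended", "closed"],
    "system_of_record": "vendor_crm.accounts",
    "maps_to_logical_model": "Party.AccountStatus",  # ties back to our logical model
    "contains_pii": False,
}
```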

Organizations that think this way are truly data-centric. Not only do they understand data as an asset, but they also try to protect it from dilution and look for the multiplier effect on their data investment by improving the leverageability of data across the organization and its ecosystem.

Applying Entrepreneurial Principles to Data Governance

I recently read a wonderful Harvard Business Review article by Daniel Isenberg entitled “Planting Entrepreneurial Innovation in Inner Cities” (June 5, 2012). While on the surface it had nothing to do with data governance, it hit me as I got further into it that, from an organizational perspective, there are many similarities between new and young data governance efforts and entrepreneurial ventures.

Many data governance efforts start out as entrepreneurial feats, engineered by a few people with a vision on a shoestring budget. Perhaps they have an “angel” investor – an executive sponsor with enough vision to provide some capital and a few people to see what can happen. The goal, of course, is to get enough wins (customers) under their belt to build a business plan and take it to an internal “venture capitalist” for full funding, more resources, and support for expansion. Data governance teams have qualities similar to entrepreneurs in the amount of time, energy, creativity, and dedication to their vision it takes to build the program out.

So, here is a synopsis of Isenberg’s major principles about fostering an environment of entrepreneurship. Think about how these can be applied in your organization.

Develop an inclusive vision of high-growth entrepreneurship – “It is a reality that a small number of extraordinary entrepreneurial successes have a disproportionately stimulating effect on the environment for entrepreneurship… But, counter this with a strong message to entrepreneurs that they need to play a role in community building. …you need to tirelessly communicate a coherent message to all of the stakeholders and residents, highlighting the entrepreneurial benefits…”

The application of this to your data governance effort is pretty straightforward. Find and nurture relationships with those who are most excited about the business value of data governance and can create the most impact. But make sure they understand that they need to support and foster data governance through community building and by sharing what they’ve learned with others. The big difference between data governance efforts and entrepreneurship is that as data governance efforts across an organization expand and mature, everyone should win, not just a few.

Use best processes, not best practices – “We are a ‘platform’ not a program. An ecosystem exists in nature when numerous species of flora and fauna interact in a dynamic, self-adjusting balancing act. You need to provide a broad platform to support the inclusive vision, for all to interact with each other in innovative ways. Best processes are more important than best practices. One element of ‘best process’ in fostering entrepreneurship ecosystems is experimentation. Experiment. Test. Invent.”

Through collaboration and community-building efforts, data governance efforts continue to build out the platform and portfolio of best process language, products and services that enable the organization’s data ecosystem to thrive and innovate. And, don’t be afraid to try out new things and see if they work. Processes, standards, definitions, policies – these can all be tweaked over time if necessary.

Define principles, not clusters – “Innovation, creativity, design, sustainability, experimentation, entrepreneurship, inclusiveness: these are example principles to be infused into the city’s collective consciousness.  It is the entrepreneur’s job, not City Hall’s or that of a consulting firm, to learn how to identify opportunity, usually where most people think it doesn’t exist. Many of the great opportunities defy definition and lie in the creative “inter-sectors”: health care and the environment; real estate development and information technology and cleantech; education and mobile communications.”

Classically, this is why data governance and data management are based on enterprise architecture and take an enterprise view of data. Trying to solve data issues in silos or divisions can move the ball forward – usually in terms of efficiencies. But to truly be innovative, connections and unlikely combinations across silos, divisions, and even ecosystems need to occur. Visibility into all enterprise data assets, identifying authoritative data sources, and providing high quality data: these are some examples of data governance principles that can support innovation in an organization.

Invest time, not money: “Nothing is free… [but] better to spend your energy persuading the stakeholders that it is worth their while to make those investments… investment is seen as enlightened self-interest.”

Yes, data governance and data management take time AND money. But the fact is that you want all enterprise stakeholders invested in the outcome and success of the program, because they depend on quality data to succeed. If they all chip in and have a stake in the game, they will be more interested in helping you succeed.

Fight the battle for talent, not capital: “Make your city an amazing place for the most talented entrepreneurs, innovators and creative people to come to seek their fortunes, to live, work, and play in.”

Data governance isn’t exactly a city, but the concepts of community-building still apply. Set up internal communities using social media to allow folks to come together virtually and share ideas. Spend time trumpeting successes and encouraging the cross-pollination of ideas. Set up your program so it enables success and innovation in the organization by tying it in with key strategic initiatives that have employees talking.

One final point. Isenberg’s article specifically discussed entrepreneurship in inner cities. He described “inner city” in this way: “remember just a decade ago when the term ‘inner city’ basically meant ‘dead city’, conjuring up images of destruction, dereliction and despair? Today, inner cities are ‘in’ – innovative, hip hotbeds of convenient culture, commerce and connection.”

Sounds a lot like the world of data to me.