Data Maturity in a Social Business and Big Data World

Dion Hinchcliffe, executive vice president of strategy at Dachis Group, defines social business in his book Social Business By Design as the intentional use of social media to drive meaningful, strategic business outcomes. Companies are leveraging social media platforms and data to drive business, inform consumer engagement, and enhance and expand their analytical capabilities. Watching Twitter feeds? Check. Monitoring Facebook and Pinterest? Yep. Building internal collaboration platforms to more tightly integrate your business partners? Of course.

To harness the transformative power that social business and social business analytics promise, companies need to integrate information from multiple data sources, both structured and unstructured. It is critical, then, to have a strong data governance foundation in place, as well as an infrastructure that can quickly consume, integrate, analyze, and distribute this new information. Incompatible standards and formats of data in different sources can prevent the integration of data and the more sophisticated analytics that create value.

A company’s ability to strongly leverage social media as a social business will be infinitely enhanced by having a strong foundational data and technology infrastructure, along with data governance policies and processes for integrating social media data sets.

The figure below overlays Hinchcliffe’s social business maturity model (in red, with four of eight community management competencies shown in gold) with a traditional data governance maturity model (shown in blue) and technology maturity model (in orange).

DM Maturity in a Social Business

Implementing cross-channel customer engagement or enriching in-house data with purchased behavioral/lifestyle data WITHOUT master data and a master data management system already in place would require hours of manual manipulation by employees, leaving little time for the actual analysis of data. Additionally, services such as alerts and recommendations could not be delivered accurately (potentially risking a privacy violation) without a master profile of the customer. Likewise, an organization’s internal infrastructure (beyond big data clusters) must be sophisticated enough to move data throughout the organization, when and where it’s needed.
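To make the point about master profiles concrete, here is a minimal sketch (not any particular MDM product) of consolidating the same customer's records from several channels into a single "golden" record. The sample records, the exact-email match rule, and the first-non-empty survivorship rule are simplified, hypothetical stand-ins for the fuzzy matching and richer survivorship logic a real MDM system would apply.

```python
from collections import defaultdict

# Hypothetical records for the same customer arriving from different channels.
records = [
    {"source": "web",    "email": "pat@example.com", "name": "Pat Smith", "zip": "22102"},
    {"source": "store",  "email": "pat@example.com", "name": "P. Smith",  "zip": None},
    {"source": "social", "email": "pat@example.com", "name": "Pat Smith", "zip": "22102"},
]

def build_master_profiles(records):
    """Group records by a shared key and keep the most complete value per field."""
    grouped = defaultdict(list)
    for rec in records:
        grouped[rec["email"]].append(rec)  # naive match rule: exact email

    masters = {}
    for email, recs in grouped.items():
        master = {"email": email, "sources": [r["source"] for r in recs]}
        for field in ("name", "zip"):
            # naive survivorship rule: first non-empty value wins
            master[field] = next((r[field] for r in recs if r.get(field)), None)
        masters[email] = master
    return masters

print(build_master_profiles(records))
```

With even this crude consolidation, an alert or recommendation can be keyed to one customer profile rather than to three partial, channel-specific views of the same person.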

While the rush to social business and big data certainly is on, smart data companies are also investing in foundational data management, data governance, and technology architecture to support their long-term vision.

Data and Trust – Thoughts from the World Economic Forum’s Global Agenda Outlook 2013

In the recently released “Global Agenda Outlook 2013” from the World Economic Forum, one of the main topics tackled as part of the agenda is titled “Thriving in a Hyperconnected World.” The main premise is that the physical and digital worlds are merging rapidly, and institutions and leaders are not prepared to deal with it. Not only are the technologies evolving, but the amount of data being generated is unprecedented, and it will only grow.

Two of the major components of this “hyperconnectedness” that the WEF discusses are data and trust. Marc Davis of Microsoft's Online Services Division frames it nicely: “[Big data] is a question of the structure of the digital society and digital economy, what it means to be a person, who has what rights to see and use what information, and for what purposes might they use it.”

Globally, countries and industries are wrestling with the policy, economic, and regulatory structures (on top of the technical interoperability challenges) needed to control the flow and sharing of data, particularly personal data. Yet there is virtually nothing done today that does not have a data component. There are huge societal benefits to the amount of data generated today, as well as potentially enormous, even life-threatening, drawbacks if that data is not managed properly, is collected erroneously, or is inappropriately shared.

There are many reasons why we each give up data: to open a bank account, to purchase a vehicle, to get healthcare treatment, to find people to date, to unlock a badge from our favorite gaming site. But in these instances, we make a conscious choice to give up certain pieces of data and information about ourselves.

But we don’t know what the internal data quality practices are of the companies to whom we give data; we don’t know how they manage their cyber security practices; we don’t know how their internal access and authentication controls are managed; we don’t know if the company has the ability to do tagging at the data element level to fortify its privacy compliance protocols; we don’t know to whom the company resells our data; we don’t know if the company’s legitimate business partners with legitimate access to our data are also protecting our data with the same degree of integrity.

Knowing what I know about the limited capabilities of federal and state governments here in the U.S. to actually integrate and share data, I’m far less concerned with ‘Big Brother’ than I am with Amazon and Apple (both of whom seem to do a far more effective and efficient job of managing my data) doing something creepy with my data (like recommending that I purchase a Justin Bieber CD).

Trust frameworks, transparency, policies, accountabilities: these are all steps on the right path to building trust. Engendering trust among people and society in how data is collected, managed, and used requires sophistication (in technology, in policies and regulations, and in economic models) far beyond where many organizations and institutions are today. Unfortunately, policy will never keep up with the speed of technological innovation, so it may take a while to get to trust.

Most importantly, however, individuals need to take responsibility for their data: becoming educated about it, learning how to control it, and being given more controls over it (especially when it’s in the hands of institutions). This part of the discussion is largely absent from the overall debate and needs to be given its due attention.

Thoughts about how to move this individual responsibility discussion forward?

Identity, Data, Privacy and Security – Tumbling Together

For over a decade, the Federal Government has had numerous efforts and initiatives on identity and access management (IAM). These efforts morphed into identity, credential, and access management (with of course its own acronym, ICAM), underscoring a fundamental principle of …

Open Data – Is Anyone Really Using It?

Over the past four years, there has been an unprecedented push for government transparency at the federal and state levels. The statistics are really quite amazing, considering it’s the government that we’re talking about. There are almost 400,000 raw and geospatial data sets on Data.gov, including some that leverage semantic web and RDF (Resource Description Framework) technology for linked open data sets. Thirty-five states provide other raw open data sets online (http://www.data.gov/opendatasites). Many cities and counties are joining in this effort, though some are reticent due to the revenue they lose by not being able to sell these data sets to data aggregators and research institutions. Close to 1,300 apps have been created from the federal data sets alone, and cities across the country add to that total through the local hackathons and codeathons that regularly take place.

Data is a critical asset of the governmental ecosystem, and we’re actually seeing governments at all levels (and internationally) making that data available (to the extent that it is permissible and they can do so while remaining in compliance with federal and state privacy policies). Don’t get me wrong, there is still a lot more that can be – and will be – done, but the past three years have really been momentous.

But what feels really absent to me in the big data discussions that everyone seems so enamored with is what the private sector is actually doing to incorporate and leverage these governmental open data sets with other “externally” sourceable data (in which I include data from credit reporting agencies, data aggregators, social media, etc.). I recently attended the Strata + Hadoop conference in New York City. There were very few government participants, and I heard very little about how the private sector is using and leveraging open data. While marketing firms have used census data for years, I still struggle to find good examples of how healthcare organizations, banks, logistics companies, or energy companies are incorporating this data to enrich and add insight to the data sets they already have.
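As a simple, hypothetical illustration of the kind of enrichment I have in mind, the sketch below joins an in-house customer table to a public data set (say, median household income by ZIP code exported from a census or Data.gov source) so analysts can segment customers by the demographics of where they live. The tables, columns, and income bands are made up; the point is how little code the join itself requires once both sets are in a common format.

```python
import pandas as pd

# Hypothetical in-house customer data.
customers = pd.DataFrame({
    "customer_id": [101, 102, 103],
    "zip": ["22102", "10027", "60614"],
    "ltv": [1200.0, 340.0, 860.0],
})

# Hypothetical open data set, e.g. median household income by ZIP code
# downloaded as CSV from a census or Data.gov source.
income_by_zip = pd.DataFrame({
    "zip": ["22102", "10027", "60614"],
    "median_income": [134000, 52000, 91000],
})

# Enrich the in-house data with the open data set, then segment on it.
enriched = customers.merge(income_by_zip, on="zip", how="left")
enriched["income_band"] = pd.cut(
    enriched["median_income"],
    bins=[0, 60000, 100000, float("inf")],
    labels=["low", "mid", "high"],
)
print(enriched)
```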

Maybe they are, and are just being quiet about it. Or maybe, outside the DC beltway, companies aren’t really paying that much attention to what the Feds are doing with their data assets. Or maybe they don’t trust the quality of the data from the Feds, just as they don’t trust most other things coming from the federal government.

In some ways, it seems like the Feds have been really promoting these open data efforts. Certainly the start-up community has noticed and continues to churn out new apps to take advantage of these data sets. The “brand” – open data – is certainly locked in and recognized worldwide.

But what about established companies, of all sizes? Are they getting the message? Do they understand the value in those datasets? Is there anything that can be done to further communicate and message this to the private sector? Do we even care? I don’t have answers, I’m just raising the questions based on my observations.

I guess I am a believer that publishing open data for the sake of being transparent isn’t necessarily all that helpful to anyone other than those organizations and non-profits who are focused on ensuring that “we-the-people” aren’t getting screwed. And certainly, there is some base value in that. But the real leverage points, the real utility of open data is when it can be combined with other data and actually USED to solve problems or answer questions or help companies innovate.

We are still in the early maturity stages of open data and big data. I look forward to the growth in this area, not just from the government side, but from private industry in terms of how it incorporates and leverages open data sets. In the meantime, I will hope that industry really starts to appreciate the value in what government is giving it so freely. Maybe in this Darwinistic environment in which we live, the companies that get it will leverage the data, drive innovation, and rise to the top. The others? Well, they won’t.

How Data Management and Data Governance Support Big Data ROI

According to Aberdeen Group (source: Data Management for BI: Fueling the Analytical Engine with High-Octane Information), best-in-class companies take 12 days on average to integrate new data sources into their analytical systems; industry-average companies take 60 days; and laggards, 143 days. If you’re an average company (raise your hand if you think you are), it will take you about two months to integrate new data. Do you have that much time? Can you afford to react that slowly to customer or market opportunities? For laggard companies, this really means they’re dead in the water.

Traditional, legacy, siloed systems are heavy: incompatible standards and data formats slow or prevent the quick integration of existing data sets with each other or with new data. This inability to quickly and effectively integrate new data sets for real-time or predictive analytics limits the ability of organizations to pursue new opportunities, support customer needs, and drive insights. In short, it limits an organization’s ability to be proactive in revenue-generating situations.

Part of the reason for this delay, and for the inability to drive the ‘last-mile’ analytics that companies need, is a lack of data management processes. No one needs the myriad of data across organizations and ecosystems explained to them anymore: transactional data, operational data, analytical repositories, social media data, mobile device data, sensor data, structured and unstructured data. With some exceptions, a lack of data is usually not an organization’s main problem.

The main problem is how to manage, integrate, and distribute it (appropriately) so that organizations can nimbly exploit data and opportunities. The Data Management Association International (DAMA) provides a framework that takes a holistic approach to understanding the data and information needs of the enterprise and its stakeholders. Most of my readers are already aware of the DAMA ‘wheel’, which highlights ten areas of data management, with data governance as the center point. In addition to the Data Warehousing & Business Intelligence Management piece of the wheel, other key areas of data management that matter for data integration, analytics, and optimization include data governance, data architecture, master data management, metadata management, and data security.

From a big data perspective, these areas help provide answers to the following questions (a minimal sketch of a metadata catalog that supports the first few appears after the list):

  • What data do we have? Where are gaps in data that we need?
  • What data is intellectual property for us that can help us exploit new opportunities?
  • How do we integrate the right data together?
  • How do these data sets relate to each other?
  • Do we have all of the data about this (fill in the blank – person, event, thing, etc.)?
  • What are the permissible purposes of the data? Can we link and leverage these disparate data sets together and still be in regulatory compliance?
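To make the first few questions concrete, here is a minimal, hypothetical sketch of a metadata catalog: each data set is registered with an owner, a sensitivity classification, and its permissible purposes, so that “what data do we have?” and “may we use it for this purpose?” become simple lookups rather than archaeology. The entries, fields, and purpose check are illustrative only, not a reference to any particular catalog product.

```python
from dataclasses import dataclass, field

@dataclass
class DataSetEntry:
    """One registered data set in a (hypothetical) enterprise metadata catalog."""
    name: str
    owner: str
    domain: str                       # e.g. "customer", "sensor", "social"
    sensitivity: str                  # e.g. "public", "internal", "pii"
    permissible_purposes: set = field(default_factory=set)

catalog = [
    DataSetEntry("crm_contacts", "Sales Ops", "customer", "pii", {"service", "marketing"}),
    DataSetEntry("twitter_mentions", "Digital Team", "social", "public", {"marketing", "analytics"}),
    DataSetEntry("meter_readings", "Field Ops", "sensor", "internal", {"operations", "analytics"}),
]

def datasets_for_purpose(catalog, purpose):
    """Answer 'which data sets may we use for this purpose?' with a simple lookup."""
    return [d.name for d in catalog if purpose in d.permissible_purposes]

print(datasets_for_purpose(catalog, "marketing"))  # ['crm_contacts', 'twitter_mentions']
```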

Data-driven, data-centric (best-in-class) organizations consider data needs early and often in the business strategy process. They are not in reactive mode, waiting until after IT has architected and implemented a solution to determine what the reporting, analytics, and big data opportunities may be. Understanding the business strategy and business needs drives strong data management and data governance; data management and data governance, in turn, allow the strong management of data assets, so that those assets can be leveraged for big data purposes in ways that optimize benefit and return on investment to the organization.

Data management maturity supports big data maturity by providing the policies, processes, and infrastructure to quickly consume, integrate, analyze, and distribute high quality, trusted data to the user’s (employee, executive, customer, business partner) point of touch so that insights can be derived and action taken as rapidly as possible.

Beyond Quality And Security – The Importance Of Establishing Control Points For Information Management Across The Organization

Strong data management doesn’t just begin on the back end, when the data actually hits a database. It begins long before that, early in the data lifecycle, and across many areas of the organization.

One of the crucial elements in becoming a data-centric organization is changing the culture so that people think about data through a variety of prisms. Strategies come down from top management; specific goals and objectives then get developed. The next questions should be: What data do we need to support those goals, objectives, and programs? Does that data already exist in the organization, or do we have a gap? For the gaps, how do we close them, and how do we ensure tightness and alignment with existing data management strategies?

There are a number of control points that come out of this scenario:

  • Strategic planning – What data do we need to measure success?
  • Goals and objectives – What KPIs and metrics are important? What type of reporting and dashboards are required? Do we have all the data that we need for reporting and metrics measurements? Do we trust in the quality and integrity of the data that we need for reporting? If not, what gaps do we need to close to build trust?
  • Budgeting and financing – What controls have we implemented to support the optimization of our data investments across the entire enterprise? Are we aligning various programs across the organization such that we reduce data silos and redundancy, and optimize information sharing and infrastructure development where possible? Does someone have stated authority and responsibility for overseeing this planning and budgeting?
  • Business case development – What data do we have in-house (presumes a knowledge of all enterprise data assets) to support new programs or applications? Can we leverage these in-house data sets for this purpose (compliance/regulatory check-point)? How do we close the data gaps we have (can we capture data via existing sources? Do we need to purchase data from 3rd parties? Are there open data sets that are leverageable?)
  • Requirements gathering – Where are the authoritative sources of the different data sets we need? Are we leveraging organizational reference and master data?
  • Build vs. buy decisioning – If we build something in-house, how can we maximize previous infrastructure investments in data, hardware, middleware, and exchange mechanisms so as to minimize duplication or silo building? Buying a solution means building in checkpoints for ensuring ease of integration and data extraction.
  • Contracts and Procurement – What language do we have in our contracts to enforce compliance or alignment with internal data management and data security policies? Do we always get a data dictionary? Do we ask vendors to provide us with mappings to our conceptual and logical data models? Do we ensure data quality levels (for certain types of acquisitions)? Who actually owns the data? If we’re outsourcing our data, what are our access rights for transactional, analytical, regulatory, and recovery purposes? (A sketch of a minimal data dictionary entry follows this list.)
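For the data dictionary question in particular, even a lightweight, machine-readable entry per field goes a long way. The hypothetical example below sketches the minimum a contract could require a vendor to deliver for each field: name, type, definition, allowed values, the system of record, and a mapping back to the buyer's logical data model. The field names and structure are illustrative assumptions, not a standard.

```python
# A hypothetical, minimal data dictionary entry that a contract could require
# a vendor to deliver for every field in a purchased or outsourced data set.
data_dictionary_entry = {
    "field_name": "customer_status",
    "data_type": "string",
    "definition": "Current lifecycle state of the customer account.",
    "allowed_values": ["prospect", "active", "suspended", "closed"],
    "system_of_record": "vendor_crm.accounts",
    "maps_to_logical_model": "Party.AccountStatus",  # ties back to our logical model
    "contains_pii": False,
}
```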

Organizations that think this way are truly data-centric. Not only do they understand data as an asset, but they also try to protect it from dilution and look for the multiplier effect on their data investment by improving the leverageability of data across the organization and its ecosystem.

Applying Entrepreneurial Principles to Data Governance

I recently read a wonderful Harvard Business Review article by Daniel Isenberg entitled “Planting Entrepreneurial Innovation in Inner Cities” (June 5, 2012). While on the surface it had nothing to do with data governance, it hit me as I got further into it that, from an organizational perspective, there are many similarities between new and young data governance efforts and entrepreneurial ventures.

Many data governance efforts start out as entrepreneurial feats, engineered by a few people with a vision on a shoestring budget. Perhaps they have an “angel” investor – an executive sponsor with enough vision to provide some capital and a few people to see what can happen. The goal, of course, is to get enough wins (customers) under their belt to build a business plan and take it to an internal “venture capitalist” for full funding, more resources, and support for expansion. Data governance teams have qualities similar to entrepreneurs in the amount of time, energy, creativity, and dedication to their vision it takes to build the program out.

So, here is a synopsis of Isenberg’s major principles about fostering an environment of entrepreneurship. Think about how these can be applied in your organization.

Develop an inclusive vision of high-growth entrepreneurship – “It is a reality that a small number of extraordinary entrepreneurial successes have a disproportionately stimulating effect on the environment for entrepreneurship… But, counter this with a strong message to entrepreneurs that they need to play a role in community building. …you need to tirelessly communicate a coherent message to all of the stakeholders and residents, highlighting the entrepreneurial benefits…”

The application of this to your data governance effort is pretty straightforward. Find and nurture relationships with those who are most excited about the business value of data governance and can create the most impact. But make sure they understand that they need to support and foster data governance through community building and by sharing what they’ve learned with others. The big difference between data governance efforts and entrepreneurship is that as data governance efforts across an organization expand and mature, everyone should win, not just a few.

Use best processes, not best practices – “We are a ‘platform’ not a program. An ecosystem exists in nature when numerous species of flora and fauna interact in a dynamic, self-adjusting balancing act. You need to provide a broad platform to support the inclusive vision, for all to interact with each other in innovative ways. Best processes are more important than best practices. One element of ‘best process’ in fostering entrepreneurship ecosystems is experimentation. Experiment. Test. Invent.”

Through collaboration and community-building efforts, data governance efforts continue to build out the platform and portfolio of best process language, products and services that enable the organization’s data ecosystem to thrive and innovate. And, don’t be afraid to try out new things and see if they work. Processes, standards, definitions, policies – these can all be tweaked over time if necessary.

Define principles, not clusters – “Innovation, creativity, design, sustainability, experimentation, entrepreneurship, inclusiveness: these are example principles to be infused into the city’s collective consciousness.  It is the entrepreneur’s job, not City Hall’s or that of a consulting firm, to learn how to identify opportunity, usually where most people think it doesn’t exist. Many of the great opportunities defy definition and lie in the creative “inter-sectors”: health care and the environment; real estate development and information technology and cleantech; education and mobile communications.”

Classically, this is why data governance and data management are based on enterprise architecture and take an enterprise view of data. Trying to solve data issues in silos or divisions can move the ball forward – usually in terms of efficiencies. But to truly be innovative, connections and unlikely combinations across silos, divisions, and even ecosystems need to occur. Visibility into all enterprise data assets, identifying authoritative data sources, and providing high quality data: these are some examples of data governance principles that can support innovation in an organization.

Invest time, not money: “Nothing is free… [but] better to spend your energy persuading the stakeholders that it is worth their while to make those investments… investment is seen as enlightened self-interest.”

Yes, data governance and data management take time AND money. But the fact is that you want all enterprise stakeholders invested in the outcome and success of the program, because they depend on quality data to succeed. If they all chip in and have a stake in the game, they will be more interested in helping you succeed.

Fight the battle for talent, not capital: “Make your city an amazing place for the most talented entrepreneurs, innovators and creative people to come to seek their fortunes, to live, work, and play in.”

Data governance isn’t exactly a city, but the concepts of community-building still apply. Set up internal communities using social media to allow folks to come together virtually and share ideas. Spend time trumpeting successes and encouraging the cross-pollination of ideas. Set up your program so it enables success and innovation in the organization by tying it in with key strategic initiatives that have employees talking.

One final point. Isenberg’s article specifically discussed entrepreneurship in inner cities. He described “inner city” in this way: “remember just a decade ago when the term ‘inner city’ basically meant ‘dead city’, conjuring up images of destruction, dereliction and despair? Today, inner cities are ‘in’ – innovative, hip hotbeds of convenient culture, commerce and connection.”

Sounds a lot like the world of data to me.