Data Maturity in a Social Business and Big Data World

Dion Hinchcliffe, executive vice-president of strategy at Dachis Group, defines social business in his book Social Business By Design as the intentional use of social media to drive meaningful, strategic business outcomes. Companies are leveraging social media platforms and data to drive business, inform consumer engagement, and enhance and expand their analytical capabilities. Watching Twitter feeds? Check. Monitoring Facebook and Pinterest? Yep. Building internal collaboration platforms to more tightly integrate your business partners? Of course.

To harness the transformative power that social business and social business analytics promise, companies need to integrate information from multiple data sources, both structured and unstructured. It is critical, then, to have both a strong data governance foundation in place and an infrastructure that can quickly consume, integrate, analyze, and distribute this new information. Incompatible standards and formats across sources can prevent the integration of data and the more sophisticated analytics that create value.

A company’s ability to fully leverage social media as a social business will be greatly enhanced by a strong foundational data and technology infrastructure, along with data governance policies and processes for integrating social media data sets.

The figure below overlays Hinchcliffe’s social business maturity model (in red, with four of eight community management competencies shown in gold) with a traditional data governance maturity model (shown in blue) and technology maturity model (in orange).

DM Maturity in a Social Business

Implementing cross-channel customer engagement or enriching in-house data with purchased behavioral/lifestyle data WITHOUT already having master data and a master data management system in place would require hours of manual manipulation by employees, leaving little time for the actual analysis of data. Additionally, services such as alerts and recommendations could not be delivered accurately (potentially risking a privacy violation) without a master profile of the customer. Likewise, an organization’s internal infrastructure (beyond big data clusters) must also be sophisticated enough to move data throughout the organization, when and where it’s needed.
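To make concrete why a master profile matters, here is a minimal Python sketch of the matching-and-merging step at the heart of master data management. The matching rule (a shared email address) and all records are invented for illustration; production MDM systems use far richer probabilistic matching across many attributes.

```python
from dataclasses import dataclass, field

@dataclass
class MasterProfile:
    """Golden record for one customer, assembled from channel-level records."""
    master_id: str
    emails: set = field(default_factory=set)
    channels: dict = field(default_factory=dict)  # channel name -> source record id

def match_and_merge(records, profiles):
    """Toy matching rule: records sharing a normalized email address
    resolve to one master profile. Without this step, the same person
    looks like two unrelated customers to every downstream system."""
    by_email = {}
    for p in profiles:
        for e in p.emails:
            by_email[e] = p
    for rec in records:
        email = rec["email"].strip().lower()
        profile = by_email.get(email)
        if profile is None:
            profile = MasterProfile(master_id=f"M{len(profiles) + 1:04d}", emails={email})
            profiles.append(profile)
            by_email[email] = profile
        profile.channels[rec["channel"]] = rec["id"]
    return profiles

# Records for the same person arriving from two different channels:
records = [
    {"id": "web-17", "channel": "web", "email": "Ana@example.com"},
    {"id": "tw-903", "channel": "twitter", "email": "ana@example.com "},
]
profiles = match_and_merge(records, [])
print(len(profiles))                 # 1 -- both records resolve to one master profile
print(sorted(profiles[0].channels))  # ['twitter', 'web']
```

With that single golden record in place, a recommendation or alert service can act on the customer's full cross-channel history instead of a fragment of it.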

While the rush to social business and big data certainly is on, smart data companies are also investing in foundational data management, data governance, and technology architecture to support their long-term vision.


Data and Trust – Thoughts from the World Economic Forum’s Global Agenda Outlook 2013

In the recently released “Global Agenda Outlook 2013” by the World Economic Forum, one of the main topics tackled as part of the agenda is titled “Thriving in a Hyperconnected World.” The main premise is that the physical and digital worlds are merging rapidly, and institutions and leaders are not prepared to deal with it. Not only are the technologies evolving, but the amount of data being generated is completely unprecedented, and it will only grow.

Two of the major components of this “hyperconnectedness” that the WEF discusses are data and trust. Marc Davis with Microsoft Online Service Division frames it nicely: “[Big data] is a question of the structure of the digital society and digital economy, what it means to be a person, who has what rights to see and use what information, and for what purposes might they use it.”

Globally, countries and industries are working through the policy, economic, and regulatory structures (on top of the technical interoperability challenges) needed to control the flow and sharing of data, particularly personal data. Yet there is virtually nothing done today without a data component. There are huge societal benefits to the amount of data generated today, as well as potentially enormous (and even life-threatening) drawbacks if this data is not managed properly, is collected erroneously, or is inappropriately shared.

There are many reasons why we each give data up – to open a bank account, to purchase a vehicle, to get healthcare treatment, to find people to date, to unlock a badge from our favorite gaming site. But in these instances, we make a conscious choice to give up certain pieces of data and information about ourselves.

But we know very little about the companies to whom we give that data. We don’t know:

  • what their internal data quality practices are;
  • how they manage their cybersecurity practices;
  • how their internal access and authentication controls are managed;
  • whether they have the ability to do tagging at the data element level to fortify their privacy compliance protocols;
  • to whom they resell our data; and
  • whether their legitimate business partners with legitimate access to our data also protect it with the same degree of integrity.

Knowing what I know about the actual limited capabilities of federal and state governments here in the U.S. to integrate and share data, I’m far less concerned with ‘Big Brother’ than I am with Amazon and Apple (both of whom seem to do a far more effective and efficient job of managing my data correctly) doing something creepy with my data, like recommending I purchase a Justin Bieber CD.

Trust frameworks, transparency, policies, accountabilities – these are all steps on the right path to building trust. Engendering trust among people and society in how data is collected, managed, and used requires degrees of sophistication (in technology, in policies and regulations, and in economic models) far beyond where many organizations and institutions are today. Unfortunately, policy will never keep up with the speed of technological innovation, so it may take a while to get to trust.

Most importantly, however, individuals need to take responsibility for their data: becoming educated about their data, learning how to control it, and being given more controls over it (especially when it’s in the hands of institutions). This part of the discussion is largely absent from the overall debate, and it needs to be given its due attention.

Thoughts about how to move this individual responsibility discussion forward?

Open Data – Is Anyone Really Using It?

Over the past four years, there has been an unprecedented push for government transparency at the federal and state levels. The statistics are really quite amazing, considering it’s the government we’re talking about. There are almost 400,000 raw and geospatial data sets on Data.gov, including some that leverage semantic web and RDF (Resource Description Framework) technology for linked open data sets. Thirty-five states provide other raw open data sets online (http://www.data.gov/opendatasites). Many cities and counties are joining this effort, though some are reticent because of the revenue they would lose by no longer selling these data sets to data aggregators and research institutions. Close to 1,300 apps have been created using the federal data sets alone, and cities across the country add to that total with the local hackathons and codeathons that regularly take place.
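The linked-data idea behind those RDF data sets can be illustrated in a few lines of Python: every fact is a (subject, predicate, object) triple, and shared identifiers let independently published data sets join up. All the URIs and data sets below are invented for illustration, not real Data.gov identifiers.

```python
# Each fact is a (subject, predicate, object) triple. Two hypothetical
# data sets both reference the same place URI, which is what makes
# "linked" open data linkable.
triples = [
    ("ex:dataset/air-quality", "dc:publisher", "ex:agency/epa"),
    ("ex:dataset/air-quality", "dc:spatial", "ex:place/chicago"),
    ("ex:dataset/asthma-rates", "dc:spatial", "ex:place/chicago"),
]

def objects(subject, predicate, graph):
    """All objects for a given subject/predicate pair."""
    return [o for s, p, o in graph if s == subject and p == predicate]

def subjects_with(predicate, obj, graph):
    """All subjects linked to obj via predicate -- a join across data sets."""
    return [s for s, p, o in graph if p == predicate and o == obj]

# Because both data sets point at the same place URI, they can be combined:
linked = subjects_with("dc:spatial", "ex:place/chicago", triples)
print(linked)  # ['ex:dataset/air-quality', 'ex:dataset/asthma-rates']
```

That shared-identifier join is exactly the enrichment opportunity the rest of this post argues the private sector is leaving on the table.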

Data is a critical asset of the governmental ecosystem, and we’re actually seeing governments at all levels (and internationally) making that data available, to the extent they are able while remaining in compliance with federal and state privacy policies. Don’t get me wrong, there is still a lot more that can be – and will be – done, but the past three years have really been momentous.

But what feels really absent to me in the big data discussions everyone seems so enamored with is what the private sector is actually doing to incorporate and leverage these governmental open data sets alongside other externally sourceable data (in which I include data from credit reporting agencies, data aggregators, social media, etc.). I recently attended the Strata + Hadoop conference in New York City. There were very few government participants, and I heard very little about how the private sector is using and leveraging open data. While marketing firms have used census data for years, I still struggle to find good examples of how healthcare organizations or banks or logistics companies or energy companies are incorporating this data to enrich and add insight to the data sets they already have.

Maybe they are, but are just being quiet about it. Or maybe, outside the DC beltway, companies aren’t really paying that much attention to what the Feds are doing with their data assets. Or maybe they don’t trust the quality of the data from the Feds, just as they don’t trust most other things coming from the Federal government.

In some ways, it seems like the Feds have been really promoting these open data efforts. Certainly the start-up community has noticed and continues to churn out new apps to take advantage of these data sets. The “brand” – open data – is certainly locked in and is recognized worldwide.

But what about established companies, of all sizes? Are they getting the message? Do they understand the value in those datasets? Is there anything that can be done to further communicate and message this to the private sector? Do we even care? I don’t have answers, I’m just raising the questions based on my observations.

I guess I am a believer that publishing open data purely for the sake of transparency isn’t necessarily all that helpful to anyone other than those organizations and non-profits focused on ensuring that “we the people” aren’t getting screwed. And certainly, there is some base value in that. But the real leverage point, the real utility of open data, comes when it can be combined with other data and actually USED to solve problems, answer questions, or help companies innovate.

We are still in the early maturity stages of open data and big data. I look forward to the growth in this area, not just from the government side, but from private industry in terms of how it incorporates and leverages open data sets. In the meantime, I will hope that industry really starts to appreciate the value in what government is giving it so freely. Maybe in this Darwinian environment in which we live, the companies that get it will leverage the data, drive innovation, and rise to the top. The others? Well, they won’t.

How Data Management and Data Governance Support Big Data ROI

According to Aberdeen Group (source: Data Management for BI: Fueling the Analytical Engine with High-Octane Information), best-in-class companies take 12 days on average to integrate new data sources into their analytical systems; industry-average companies take 60 days; and laggards take 143 days. If you’re an average company (raise your hand if you think you are), it will take you around two months to integrate new data. Do you have that much time? Can you afford to react that slowly to customer or market opportunities? For laggard companies, this really means they’re dead in the water.

Traditional legacy, siloed systems are heavy: incompatible standards and data formats slow or prevent quick integration of existing data sets, or of existing data with new data. This inability to quickly and effectively integrate new data sets for real-time or predictive analytics limits the ability of organizations to pursue new opportunities, support customer needs, and drive insights. In short, it limits an organization’s ability to be proactive in revenue-generating situations.
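A tiny Python sketch of why incompatible formats slow integration: two hypothetical silos describe the same order with different field names, date formats, and ID conventions, and the records only line up after per-source normalization. Every field name and row here is invented for illustration.

```python
from datetime import datetime

# Two hypothetical silos describing the same order with incompatible
# conventions: different field names, date formats, and ID padding.
legacy_rows = [{"CUST_NO": "0042", "ORDER_DT": "03/15/2013", "AMT": "19.99"}]
crm_rows = [{"customer_id": "42", "ordered_at": "2013-03-15", "amount": 19.99}]

def normalize_legacy(row):
    """Map the legacy silo's conventions onto a common schema."""
    return {
        "customer_id": str(int(row["CUST_NO"])),  # drop zero padding
        "order_date": datetime.strptime(row["ORDER_DT"], "%m/%d/%Y").date().isoformat(),
        "amount": float(row["AMT"]),
    }

def normalize_crm(row):
    """The CRM silo is closer to the common schema but still needs mapping."""
    return {
        "customer_id": str(row["customer_id"]),
        "order_date": row["ordered_at"],
        "amount": float(row["amount"]),
    }

combined = [normalize_legacy(r) for r in legacy_rows] + [normalize_crm(r) for r in crm_rows]
print(combined[0] == combined[1])  # True -- both silos now describe the same event
```

Multiply this mapping work by dozens of silos and hundreds of fields, and the 60-to-143-day integration lag cited above stops being surprising.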

Part of the reason for this delay, and for the inability to drive the ‘last-mile’ analytics that companies need, is the lack of data management processes. No one needs the myriad forms of data across organizations and ecosystems explained anymore: transactional data, operational data, analytical repositories, social media data, mobile device data, sensor data, and structured and unstructured data. A lack of data (with some exceptions) is usually not an organization’s main problem.

The main problem is how to manage, integrate, and distribute it (appropriately) so that organizations can nimbly and agilely exploit data and opportunities. The Data Management Association International (DAMA) provides a framework that takes a holistic approach to understanding the data and information needs of the enterprise and its stakeholders. Most of my readers are already aware of the DAMA ‘wheel’, which highlights ten areas of data management with data governance as the center point. In addition to the Data Warehousing & Business Intelligence Management piece of the wheel, other key areas of data management important to data integration, analytics, and optimization include data governance, data architecture, master data management, metadata management, and data security.

From a big data perspective, these areas help provide answers to the following questions:

  • What data do we have? Where are gaps in data that we need?
  • What data is intellectual property for us that can help us exploit new opportunities?
  • How do we integrate the right data together?
  • How do these data sets relate to each other?
  • Do we have all of the data about this (fill in the blank – person, event, thing, etc.)?
  • What are the permissible purposes of the data? Can we link and leverage these disparate data sets together and still be in regulatory compliance?
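The last question, on permissible purposes, is one place where the element-level tagging mentioned earlier pays off. Below is a hedged Python sketch of the idea: tag each data element with its allowed purposes and check a requested use before linking data sets. The purpose names and fields are invented; in practice these policies come from contracts and regulation, not code.

```python
# Hypothetical element-level purpose tags: each data element maps to the
# set of purposes for which it may be used.
element_purposes = {
    "email":            {"service", "marketing"},
    "purchase_history": {"service", "analytics"},
    "health_claims":    {"service"},  # the most restricted element
}

def permitted_fields(fields, purpose):
    """Return only the elements whose tags allow the requested purpose."""
    return [f for f in fields if purpose in element_purposes.get(f, set())]

def can_link(fields, purpose):
    """A linked data set is compliant only if every element allows the purpose."""
    return all(purpose in element_purposes.get(f, set()) for f in fields)

print(permitted_fields(["email", "health_claims"], "marketing"))     # ['email']
print(can_link(["email", "purchase_history"], "service"))            # True
print(can_link(["purchase_history", "health_claims"], "analytics"))  # False
```

Unknown elements default to no permissions, so an untagged field can never silently leak into a linked data set; that fail-closed default is the design choice that makes the tagging useful for compliance.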

Data-driven, data-centric (best-in-class) organizations consider data needs early and often in the business strategy process. They are not in reactive mode, determining what the reporting, analytics, and big data opportunities may be only after IT has architected and implemented a solution. Understanding the business strategy and business needs drives strong data management and data governance. Data management and data governance, in turn, allow strong management of data assets, so those assets can be leveraged for big data purposes in ways that optimize benefit and return on investment to the organization.

Data management maturity supports big data maturity by providing the policies, processes, and infrastructure to quickly consume, integrate, analyze, and distribute high-quality, trusted data to the user’s (employee, executive, customer, business partner) point of touch, so that insights can be derived and action taken as rapidly as possible.