Data Maturity in a Social Business and Big Data World

Dion Hinchcliffe, the executive vice-president of strategy at Dachis Group, in his book Social Business By Design, defines social business as the intentional use of social media to drive meaningful, strategic business outcomes. Companies are leveraging social media platforms and data to drive business, inform consumer engagement, and enhance and expand their analytical capability. Watching Twitter feeds? Check. Monitoring Facebook and Pinterest? Yep. Building internal collaboration platforms to more tightly integrate your business partners? Of course.

To harness the transformative power that social business and social business analytics promise, companies need to integrate information from multiple data sources. This includes both structured and unstructured data. It is critical, then, to have both a strong data governance foundation in place and an infrastructure that can quickly consume, integrate, analyze, and distribute this new information. Incompatible standards and formats of data in different sources can prevent the integration of data and the more sophisticated analytics that create value.

A company’s ability to leverage social media as a social business will be greatly enhanced by having a strong foundational data and technology infrastructure, along with data governance policies and processes for integrating social media data sets.

The figure below overlays Hinchcliffe’s social business maturity model (in red, with four of eight community management competencies shown in gold) with a traditional data governance maturity model (shown in blue) and technology maturity model (in orange).

[Figure: DM Maturity in a Social Business]

Implementing cross-channel customer engagement or enriching in-house data with purchased behavioral/lifestyle data WITHOUT already having master data and a master data management system in place would require hours of manual manipulation on the part of employees, leaving little time for the actual analysis of data. Additionally, services such as alerts and recommendations could not be delivered accurately (thus potentially risking a privacy violation) without a master profile of the customer. Likewise, an organization’s internal infrastructure (beyond big data clusters) must also be sophisticated enough to move data throughout the organization, when and where it’s needed.
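To make the master-profile point a little more concrete, here is a minimal sketch. It assumes hypothetical channel records that all carry an email address, and a hypothetical `build_master_profiles` helper; every field name is illustrative, and real master data management relies on far richer matching and survivorship rules than a normalized email key.

```python
from collections import defaultdict

# Hypothetical records from three channels; all field names are illustrative.
crm = [{"email": "Ann@Example.com", "name": "Ann Lee"}]
web = [{"email": "ann@example.com ", "last_page": "/pricing"}]
purchased = [{"email": "ann@example.com", "lifestyle": "outdoors"}]


def normalize_key(email: str) -> str:
    """Normalize the matching key so trivially different spellings line up."""
    return email.strip().lower()


def build_master_profiles(*sources):
    """Merge records that share a normalized email into one profile per customer."""
    profiles = defaultdict(dict)
    for source in sources:
        for record in source:
            key = normalize_key(record["email"])
            for field, value in record.items():
                if field != "email":
                    # Earlier sources win; later sources only fill gaps.
                    profiles[key].setdefault(field, value)
    return dict(profiles)


profiles = build_master_profiles(crm, web, purchased)
```

Without the shared key, the three records above would look like three different customers, which is exactly the manual reconciliation work that eats up analysts' time.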

While the rush to social business and big data certainly is on, smart data companies are also investing in foundational data management, data governance, and technology architecture to support their long-term vision.

Open Data – Is Anyone Really Using It?

Over the past four years, there has been an unprecedented push for government transparency at the federal and state levels. The statistics are really quite amazing, considering it’s the government that we’re talking about. There are almost 400,000 raw and geospatial data sets available online, including some that leverage semantic web and RDF (resource description framework) technology for linked open data sets. Thirty-five states provide other raw open data sets online. Many cities and counties are joining in this effort, though some are reluctant because of the revenue they would lose by no longer being able to sell these data sets to data aggregators and research institutions. There are close to 1,300 apps that have been created through the use of the federal data sets alone, and cities across the country add to that total with the local hackathons and codeathons that regularly take place.

Data is a critical asset of the governmental ecosystem, and we’re actually seeing governments at all levels (and internationally) making that data available (to the extent that it is permissible and they can remain in compliance with federal and state privacy policies). Don’t get me wrong, there is still a lot more that can be – and will be – done, but the past three years have really been momentous.

But, what feels really absent to me in the big data discussions that everyone seems to be so enamored with, is what the private sector is actually doing to incorporate and leverage these governmental open data sets with other “externally” source-able data (in this I include data from credit reporting agencies, data aggregators, social media, etc.). I recently attended the Strata + Hadoop conference in New York City. There were very few government participants. And, I heard very little about how the private sector is using and leveraging open data. While marketing firms for years have used census data, I still struggle to find good examples of how healthcare organizations or banks or logistics companies or energy companies are incorporating and using this data to enrich and add insights to existing data sets they have.

Maybe they are, but are just being quiet about it. Or maybe outside the DC beltway, companies aren’t really paying that much attention to what the Feds are doing with their data assets. Or, maybe they don’t trust the quality of the data from the Feds, just like they don’t trust most other things coming from the Federal government.

In some ways, it seems like the Feds have been really promoting these open data efforts. Certainly the start-up community has noticed and continues to churn out new apps to take advantage of these data sets. The “brand” – open data – is certainly locked in and is recognized worldwide.

But what about established companies, of all sizes? Are they getting the message? Do they understand the value in those datasets? Is there anything that can be done to further communicate and message this to the private sector? Do we even care? I don’t have answers, I’m just raising the questions based on my observations.

I guess I am a believer that publishing open data for the sake of being transparent isn’t necessarily all that helpful to anyone other than those organizations and non-profits who are focused on ensuring that “we-the-people” aren’t getting screwed. And certainly, there is some base value in that. But the real leverage point, the real utility of open data, comes when it can be combined with other data and actually USED to solve problems or answer questions or help companies innovate.

We are still in the early maturity stages of open data and big data. I look forward to the growth in this area, not just from the government side, but from private industry in terms of how they incorporate and leverage open data sets. In the meantime, I will hope that industry really starts to appreciate the value in what government is giving them so freely. Maybe in this Darwinian environment in which we live, the companies that get it will leverage the data, drive innovation, and rise to the top. The others? Well, they won’t.

How Data Management and Data Governance Support Big Data ROI

According to Aberdeen Group (source: Data Management for BI: Fueling the Analytical Engine with High-Octane Information), best-in-class companies take 12 days on average to integrate new data sources into their analytical systems; industry average companies take 60 days; and, laggards 143 days. If you’re an average company (raise your hand if you think you are), it will take you up to 2 months to integrate new data. Do you have that much time? Can you afford to react that slowly to customer or market opportunities? For laggard companies, this really means they’re dead in the water.
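A quick back-of-the-envelope calculation puts those gaps in perspective. This is only a sketch using the three averages cited above; the variable names are my own.

```python
# Average days to integrate a new data source, as cited from Aberdeen Group above.
days_to_integrate = {"best-in-class": 12, "industry average": 60, "laggard": 143}

baseline = days_to_integrate["best-in-class"]

# How many times longer each tier takes than best-in-class.
ratios = {tier: days / baseline for tier, days in days_to_integrate.items()}

for tier, days in days_to_integrate.items():
    behind = days - baseline
    print(f"{tier}: {days} days, {ratios[tier]:.1f}x best-in-class, {behind} days behind")
```

On these figures, an average company takes five times as long as a best-in-class company, and a laggard roughly twelve times as long.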

Traditional legacy, siloed systems are heavy: incompatible standards and data formats slow or prevent the quick integration of existing data, or of existing data with new data. This inability to quickly and effectively integrate new data sets for either real-time or predictive analytics limits the ability of organizations to pursue new opportunities, support customer needs, and drive insights. In short, it limits an organization’s ability to be proactive in revenue-generating situations.

Part of the reason for this time delay and inability to drive the ‘last-mile’ analytics that companies need is the lack of data management processes. No one needs to explain anymore the myriad types of data across organizations and ecosystems: transactional data, operational data, analytical repositories, social media data, mobile device data, sensor data, and structured and unstructured data. Usually a lack of data (with some exceptions) is not the main problem for an organization.

The main problem is how to manage it, integrate it, and distribute it (appropriately) so that organizations can nimbly exploit data and opportunities. The Data Management Association International (DAMA) provides a framework that is a holistic approach to understanding the data and information needs of the enterprise and its stakeholders. Most of my readers are already aware of the DAMA ‘wheel’, which highlights ten areas of data management, with data governance as the center point. In addition to the Data Warehousing & Business Intelligence Management pieces of the wheel, other key areas of data management that are important to data integration, analytics and optimization include: data governance, data architecture, master data management, metadata management, and data security.

From a big data perspective, these areas help provide answers to the following questions:

  • What data do we have? Where are gaps in data that we need?
  • What data is intellectual property for us that can help us exploit new opportunities?
  • How do we integrate the right data together?
  • How do these data sets relate to each other?
  • Do we have all of the data about this (fill in the blank – person, event, thing, etc.)?
  • What are the permissible purposes of the data? Can we link and leverage these disparate data sets together and still be in regulatory compliance?
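A couple of these questions become simple lookups once data assets are actually catalogued. The following is a minimal sketch, assuming a hypothetical in-memory catalog in which each data set records its owner and permitted purposes; the data set names, purposes, and the `missing_data_sets` and `can_link` helpers are all illustrative, not part of any standard.

```python
# Hypothetical catalog: each data set lists the purposes it may be used for.
catalog = {
    "claims": {"owner": "operations", "permitted": {"billing", "fraud_detection"}},
    "social_feed": {"owner": "marketing", "permitted": {"marketing", "fraud_detection"}},
    "credit_report": {"owner": "risk", "permitted": {"underwriting"}},
}


def missing_data_sets(needed, catalog):
    """Return the needed data sets that are not in the catalog (the gaps)."""
    return sorted(set(needed) - set(catalog))


def can_link(data_sets, purpose, catalog):
    """True only if every data set is catalogued and permits the given purpose."""
    return all(
        name in catalog and purpose in catalog[name]["permitted"]
        for name in data_sets
    )


gaps = missing_data_sets(["claims", "sensor"], catalog)
fraud_ok = can_link(["claims", "social_feed"], "fraud_detection", catalog)
credit_ok = can_link(["claims", "credit_report"], "fraud_detection", catalog)
```

Here `gaps` flags the data set we do not hold, `fraud_ok` shows two sets that may be combined for the stated purpose, and `credit_ok` shows a combination that the catalog would block. The hard part in practice is not the lookup but keeping such a catalog complete and trusted, which is what data governance is for.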

Data-driven, data-centric organizations (best-in-class) consider data needs early and often in the business strategy process. They do not wait in reactive mode until IT has architected and implemented a solution, and only then determine what the reporting, analytics, and big data opportunities may be. Understanding the business strategy and business needs drives strong data management and data governance. Data management and data governance allow the strong management of data assets, so those assets are leverageable for big data purposes in ways that optimize benefit and return on investment to the organization.

Data management maturity supports big data maturity by providing the policies, processes, and infrastructure to quickly consume, integrate, analyze, and distribute high quality, trusted data to the user’s (employee, executive, customer, business partner) point of touch so that insights can be derived and action taken as rapidly as possible.

Beyond Quality And Security – The Importance Of Establishing Control Points For Information Management Across The Organization

Strong data management doesn’t just begin on the back end, when the data actually hits a database. It begins long before that, early in the data lifecycle, and across many areas of the organization.

One of the crucial elements in becoming a data-centric organization is a cultural change: building the habit of thinking about data through a variety of prisms. Strategies come down from top management; specific goals and objectives then get developed. The next questions should be: what data do we need to support those goals, objectives, programs, etc.? Does that data already exist in the organization or do we have a gap? For the gaps, how do we close them and how do we ensure tightness and alignment with existing data management strategies?

There are a number of control points that come out of this scenario:

  • Strategic planning – What data do we need to measure success?
  • Goals and objectives – What KPIs and metrics are important? What types of reporting and dashboards are required? Do we have all the data that we need for reporting and metrics measurements? Do we trust the quality and integrity of the data that we need for reporting? If not, what gaps do we need to close to build trust?
  • Budgeting and financing – What controls have we implemented to support the optimization of our data investments across the entire enterprise? Are we aligning various programs across the organization such that we reduce data silos and redundancy, and optimize information sharing and infrastructure development where possible? Does someone have stated authority and responsibility for overseeing this planning and budgeting?
  • Business case development – What data do we have in-house (presumes a knowledge of all enterprise data assets) to support new programs or applications? Can we leverage these in-house data sets for this purpose (compliance/regulatory check-point)? How do we close the data gaps we have (can we capture data via existing sources? Do we need to purchase data from 3rd parties? Are there open data sets that are leverageable?)
  • Requirements gathering – Where are the authoritative sources of the different data sets we need? Are we leveraging organizational reference and master data?
  • Build vs. buy decisioning – If we build something in-house, how can we maximize previous infrastructure investments in data, hardware, middleware, and exchange mechanisms so as to minimize duplication or silo building? Buying a solution means building in checkpoints for ensuring ease of integration and data extraction.
  • Contracts and Procurement – What language do we have in our contracts to enforce compliance or alignment with internal data management and data security policies? Do we always get a data dictionary? Do we ask vendors to provide us mapping to our conceptual and logical data models? Do we ensure data quality levels (for certain types of acquisitions)? Who actually owns the data? If we’re outsourcing our data, what are our access rights for transactional, analytical, regulatory, and recovery purposes?

Organizations that think this way are truly data-centric organizations. Not only do they understand data as an asset, but they also try to protect it from dilution and look for the multiplier effect on their data investment by improving the leverageability of data across the organization and its ecosystem.

Applying Entrepreneurial Principles to Data Governance

I recently read a wonderful article by Daniel Isenberg in the Harvard Business Review entitled “Planting Entrepreneurial Innovation in Inner Cities” (June 5, 2012). While on the surface it had nothing to do with data governance, it hit me as I got further into it that, from an organizational perspective, there are many similarities between new and young data governance efforts and entrepreneurial ventures.

Many data governance efforts start out as entrepreneurial feats, engineered by a few people with a vision and built on a shoestring budget. Perhaps they have an “angel” investor – an executive sponsor with enough vision to provide some capital and a few people to see what can happen. The goal, of course, is to get enough wins (customers) under their belt to build a business plan and take it to an internal “venture capitalist” for full funding, more resources, and support for expansion. The data governance teams have similar qualities to entrepreneurs in terms of the amount of time, energy, creativity and dedication to their vision it takes to build the program out.

So, here is a synopsis of Isenberg’s major principles about fostering an environment of entrepreneurship. Think about how these can be applied in your organization.

Develop an inclusive vision of high growth entrepreneurship – “It is a reality that a small number of extraordinary entrepreneurial successes have a disproportionately stimulating effect on the environment for entrepreneurship… But, counter this with a strong message to entrepreneurs that they need to play a role in community building. …you need to tirelessly communicate a coherent message to all of the stakeholders and residents, highlighting the entrepreneurial benefits…”

The application of this to your data governance effort is pretty straightforward. Find and nurture relationships with those who are most excited about the business value of data governance and can create the most impact. But, make sure they understand that they need to support and foster data governance through community building and sharing what they’ve learned with others.  The big difference with data governance efforts versus entrepreneurship, is that as data governance efforts across an organization expand and mature, everyone should win, not just a few.

Use best processes, not best practices – “We are a ‘platform’ not a program. An ecosystem exists in nature when numerous species of flora and fauna interact in a dynamic, self-adjusting balancing act. You need to provide a broad platform to support the inclusive vision, for all to interact with each other in innovative ways. Best processes are more important than best practices. One element of “best process” in fostering entrepreneurship ecosystems is experimentation. Experiment. Test. Invent.”

Through collaboration and community-building efforts, data governance efforts continue to build out the platform and portfolio of best process language, products and services that enable the organization’s data ecosystem to thrive and innovate. And, don’t be afraid to try out new things and see if they work. Processes, standards, definitions, policies – these can all be tweaked over time if necessary.

Define principles, not clusters – “Innovation, creativity, design, sustainability, experimentation, entrepreneurship, inclusiveness: these are example principles to be infused into the city’s collective consciousness.  It is the entrepreneur’s job, not City Hall’s or that of a consulting firm, to learn how to identify opportunity, usually where most people think it doesn’t exist. Many of the great opportunities defy definition and lie in the creative “inter-sectors”: health care and the environment; real estate development and information technology and cleantech; education and mobile communications.”

Classically, this is why data governance and data management are based on enterprise architecture and take an enterprise view of data. Trying to solve data issues in silos or divisions can move the ball forward – usually in terms of efficiencies. But to truly be innovative, connections and unlikely combinations across silos, divisions, and even ecosystems need to occur. Visibility into all enterprise data assets, identifying authoritative data sources, and providing high quality data: these are some examples of data governance principles that can support innovation in an organization.

Invest time, not money – “Nothing is free… [but] better to spend your energy persuading the stakeholders that it is worth their while to make those investments…investment is seen as enlightened self-interest.”

Yes, data governance and data management take time AND money. But the fact is that you do want all enterprise stakeholders invested in the outcome and success of the program – because they are dependent on quality data to succeed. If they all chip in and have a stake in the game, they will be more interested in helping you succeed.

Fight the battle for talent, not capital – “Make your city an amazing place for the most talented entrepreneurs, innovators and creative people to come to seek their fortunes, to live, work, and play in.”

Data governance isn’t exactly a city, but the concepts of community-building still apply. Set up internal communities using social media to allow folks to come together virtually and share ideas. Spend time trumpeting successes and encouraging the cross-pollination of ideas. Set up your program so it enables success and innovation in the organization by tying it in with key strategic initiatives that have employees talking.

One final point. Isenberg’s article specifically discussed entrepreneurship in inner cities. He described “inner city” in this way – “remember just a decade ago when the term ‘inner city’ basically meant ‘dead city’, conjuring up images of destruction, dereliction and despair? Today, inner cities are “in” – innovative, hip hotbeds of convenient culture, commerce and connection.”

Sounds a lot like the world of data to me.

Davos and Data – Please Don’t Forget the Basics!!

The annual World Economic Forum recently ended in Davos (one year, I WILL get an invitation to attend this!!).  Those of you who follow my Twitter feed know that data was a big topic at the WEF this year. There were several sessions on the topic, and a report titled “Big Data, Big Impact: New Possibilities for International Development” was released.  The report focuses specifically on the impact the collection and proper application of big data (particularly from mobile devices) can have on financial services, education, agriculture and health care.

Yesterday, the WEF’s Global Agenda Council on Emerging Technologies released its list of top 10 emerging technologies for 2012.  Number one on that list is Informatics for adding value to information, which the Council further explained as:

“The quantity of information now available to individuals and organizations is unprecedented in human history, and the rate of information generation continues to grow exponentially. Yet, the sheer volume of information is in danger of creating more noise than value, and as a result limiting its effective use. Innovations in how information is organized, mined and processed hold the key to filtering out the noise and using the growing wealth of global information to address emerging challenges.”

Informatics beat out some very cool scientific areas such as synthetic biology, nanoscale design of materials, and high energy density power systems. Data has gone mainstream.

In everyone’s rush to jump on the ‘big data’ bandwagon, the ‘informatics’ bandwagon, the ‘unstructured data’ bandwagon, there are foundational items that need to be addressed if organizations are going to see the kinds of payoffs they should be having, rather than adding this to the list of trendy things that didn’t work out.

#1 – Have a plan. An enterprise information management strategy is absolutely necessary. Your business has a strategic plan (hopefully). There is no way any business today can operate or innovate without using and leveraging data, so there should be a plan around the capture, usage, maintenance, distribution, security, and disposition of your corporate data assets.

#2 – Someone should have ultimate responsibility and authority for data. This is not the CIO. This is not the CTO. This is not the IT team. This is someone who is charged with the responsibility of managing data from the enterprise perspective, who represents the business, who sits on the executive leadership team, who makes the executive decisions, and whose ass is on the line for the overall quality, integrity, and optimization of those data assets.

#3 – There must be an investment in data. This investment should be in the form of people, dollars, training, and technology.

If the foundational items aren’t done, what your company will have is still a bunch of siloed data of questionable quality – you’ll just have more of it.

Starting an Enterprise Data Program From Scratch, Part 2

In my initial blog on December 11, I kicked off dataTrending with a discussion on building an enterprise information management (EIM) program from scratch, as well as what the role of the Chief Data Officer (CDO) should be. We faced two big challenges at the State of Colorado with regard to EIM. The first was the operational authority and scope of the CDO and program. This blog picks up where the first left off and deals with the second challenge: how to build value quickly in an organization with no history or reference point for this type of work.

The State literally had no history of enterprise architecture or data management principles and policies beyond individual agencies.  Creating value quickly to both build momentum and to increase support among the skeptics would be critical. There was an abundance of opportunities and work to be done. How should we start and how do we prioritize opportunities? How do we organize the work and develop a framework that is repeatable and sustainable across a $19B organization? How do we manage work across multiple swim lanes – governance, policy and process, change management, technology and tools – at once? How does one mobilize an organization to start thinking and acting differently about its data? The cultural and trust issues can be real impediments to success and need to be addressed both head on and with diplomacy.

The most critical factor was to align and prioritize this work with the strategic needs, opportunities, and key business drivers of the State. What was important from the executive management’s (the Governor, Legislature, and Agency Directors) perspective? What were their top problems and issues they were trying to solve that could be supported by our program? This is the only way, in my opinion, an EIM program can truly add value and keep support.

The three prisms through which we approached our work were legislative, governance, and operations.

Through a series of four laws over two years, we established the state’s intent around data sharing and information management; explicitly gave agencies permission to share data (unless a federal or other state law expressly prohibited sharing of certain data); and, established a governance board that would retain continuity through administration changes.

On the governance side, we purposefully asked for a mix of business, technology, and financial representatives from the agency participants to ensure that business was driving priorities. Meetings were held monthly, and a very actionable set of deliverables was developed and worked through so progress could be quickly seen.

Operationally, we adopted enterprise architecture and data management frameworks to ensure our approach had a roadmap and followed industry best practices. We created an enterprise data strategy, and work priorities were developed through input from key stakeholders across the organization and the governance board.

One of our primary business drivers was enterprise information sharing to inform policy making, resource decisioning, and program management. We identified three primary communities of interest with major information sharing initiatives across the state agencies and leveraged those projects to begin building out the portfolio of policies, processes, procedures, technologies, and tools that we would need to support enterprise initiatives. Into the project budgets we added line items for key tasks, activities, or people to support the work, such as an enterprise architecture tool, business analysts, and data systems inventorying.

Instead of trying to tackle the entire data management framework, we identified four major areas that would support the information sharing business driver to begin our policy and standards work. And then finally, we just dove in and started the hard work. It was a major effort of entrepreneurship in a government environment. It was not always easy, and not everything we did was a huge success. But we had executive support and were able to make progress iteratively, showing small wins that built into bigger successes.