Source: dqindia.com
Seventy percent of companies have reported minimal or no impact from Artificial Intelligence projects, according to a survey by MIT and Boston Consulting Group.
There are a number of reasons for this, including a lack of focus on cultural change and training as an organisation adapts to new working practices, but the most important factor is poor data. This encompasses everything from inadequate data architecture and discovery to modelling, quality, and governance.
To use an analogy: if artificial intelligence is the “icing on the cake”, then data is the cake itself.
At some point over the next 12 months, with the global recession constraining budgets for every organisation, Chief Information Officers and Chief Data Officers will need to demonstrate a clear return on investment for their AI projects and provide evidence of measurable results.
To date, AI projects have often been as much about showcasing innovation and cutting-edge digital transformation through their mere presence as about delivering results. With increased financial and operational scrutiny, this will no longer be the case, so CIOs/CDOs will need to “get it right” or risk funding for AI projects being curtailed.
Below are five data-related areas of focus that organisations should seek specialist guidance on to “get data ready” and enable success for these artificial intelligence projects.
Data Architecture
There is no more valuable role right now in any organisation than a data architect, and the best architects will understand that the end output from their role is to unlock value from data.
Data architects understand the organisation’s strategy and the business problems to be solved, and also have the technical insight to get their hands dirty in the data itself. They know what data is needed and how that data is to be integrated by various systems. They then set out clear guidance for everything from governance to security. This role is not the same as IT architecture, and organisations often make the mistake of leaving it to the CTO or a traditional IT transformation provider rather than bringing in the specialist support required to achieve success.
Data Access
Once the architecture is sound from a business perspective (rather than just an IT perspective), there are a huge number of data sources to feed the system and these will need careful management. The greater the richness of data from various sources the better the output, yet without management, an organisation becomes overloaded with messy, inaccurate, or incomplete data of different types and quality.
These data sources include enterprise data silos (in whatever format, from databases to spreadsheets on someone’s hard drive), open-source data such as social media, government data, and data from Internet of Things sensors.
However, the real challenge is that these are all separate data silos that cannot be moved into one convenient data warehouse for data scientists to extract, transform, and load. The growth of data catalogs demonstrates that they are quickly becoming a “must-have” for CIOs/CDOs in any organisation seeking to solve the problem of multiple data silos. The more advanced data catalogs offer a discovery capability, so users can find data that they didn’t even know existed.
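To make the catalog idea concrete, here is a minimal sketch in Python of a catalog that records where each data set lives and lets users search across silos without moving the data. Every name in it (CatalogEntry, DataCatalog, the example sources and owners) is hypothetical and chosen for illustration; real data catalog products offer far richer metadata and discovery features.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    """Metadata about one data set. The data itself stays in its silo."""
    name: str
    source: str          # hypothetical location of the silo
    owner: str
    description: str
    tags: tuple = ()


class DataCatalog:
    """A toy in-memory catalog with keyword-based discovery."""

    def __init__(self):
        self._entries = {}

    def register(self, entry):
        # Later registrations with the same name replace earlier ones.
        self._entries[entry.name] = entry

    def search(self, keyword):
        """Return entries whose name, description, or tags mention the keyword."""
        needle = keyword.lower()
        matches = (
            entry for entry in self._entries.values()
            if needle in entry.name.lower()
            or needle in entry.description.lower()
            or any(needle in tag.lower() for tag in entry.tags)
        )
        return sorted(matches, key=lambda entry: entry.name)


catalog = DataCatalog()
catalog.register(CatalogEntry(
    "crm_customers", "postgres://crm", "sales",
    "Customer master records", ("customer",)))
catalog.register(CatalogEntry(
    "churn_sheet", "file://finance/churn.xlsx", "finance",
    "Spreadsheet of churned customer accounts", ("churn",)))
catalog.register(CatalogEntry(
    "sensor_feed", "mqtt://plant", "ops",
    "IoT temperature readings", ("iot",)))

for entry in catalog.search("customer"):
    print(entry.name, "->", entry.source)
```

A search for "customer" surfaces both the CRM database and the finance spreadsheet, which is the kind of discovery across silos described above.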
Data Modelling
Data modelling is often seen as boring and is therefore overlooked, but if your organisation really wants to get value out of its data then this is a critical activity. Why wouldn’t it be, when data is now such a valuable asset in any organisation?
Time invested in data modelling ensures consistency of structure, terminology and standards throughout the organisation. Even the process of moving from a conceptual model to a physical data model encourages collaboration and agreement between all data stakeholders.
The data modelling process brings together business processes and aligns them with the data and IT communities. It defines the key components and the relationships between various sources of data and business workflows, saving the organisation time and cost as well as improving performance. Put simply, it enables everyone to understand how data is to be used within the organisation and, more importantly, how that data is turned into information, which delivers insight to an end-user.
Data Quality
Like any asset, data has a value that depends on how reliable it is to the organisation. Yet organisations often cannot, or do not want to, measure the cost that is directly attributable to poor-quality or missing data. Data quality is the standard to which an organisation’s data is accurate, timely and complete, as well as consistent with business rules.
Without good data quality, data itself cannot be relied upon for analytics or AI applications. This becomes even more important as the volume of data grows, along with the types of data arriving from different sources. Data quality is therefore an ongoing process that needs to be proactively managed through business rules and accepted by the entire organisation.
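As an illustration of managing quality through business rules, the following Python sketch checks records for completeness, timeliness and consistency with one rule. The record layout, the one-year freshness threshold and the list of allowed countries are all assumptions made for this example, not rules taken from any particular organisation. Accuracy is not checked here, since that would require comparison against a trusted reference source.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional


@dataclass
class CustomerRecord:
    # Hypothetical record layout, for illustration only.
    customer_id: str
    email: Optional[str]
    country: Optional[str]
    last_updated: date


REQUIRED_FIELDS = ("email", "country")   # completeness rule (assumed)
MAX_AGE = timedelta(days=365)            # timeliness rule (assumed)
ALLOWED_COUNTRIES = {"GB", "IN", "US"}   # business rule (assumed)


def quality_issues(record, today):
    """Return a list of rule violations for one record."""
    issues = []
    for name in REQUIRED_FIELDS:
        if not getattr(record, name):
            issues.append(f"missing {name}")
    if today - record.last_updated > MAX_AGE:
        issues.append("stale")
    if record.country and record.country not in ALLOWED_COUNTRIES:
        issues.append("country breaks business rule")
    return issues


def quality_score(records, today):
    """Fraction of records that pass every rule."""
    if not records:
        return 1.0
    clean = sum(1 for record in records if not quality_issues(record, today))
    return clean / len(records)


today = date(2024, 1, 1)
records = [
    CustomerRecord("c1", "a@example.com", "GB", date(2023, 12, 1)),
    CustomerRecord("c2", "b@example.com", "US", date(2020, 1, 1)),
    CustomerRecord("c3", None, "FR", date(2023, 12, 1)),
]
for record in records:
    print(record.customer_id, quality_issues(record, today))
print(f"score: {quality_score(records, today):.2f}")
```

Running checks like these on a schedule, and tracking the score over time, is one simple way to treat data quality as an ongoing process rather than a one-off clean-up.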
Data Governance
With the above four areas addressed, data governance provides the rules, enforcement and management of an organisation’s data as an asset. It ensures that those four areas are not one-off activities that fall by the wayside once completed.
Data is only useful for a certain lifespan. Good data governance accompanies the change in culture, delivered through training, that is required to complement AI projects. Data governance must be a proactive, whole-organisation activity that maintains a level of data access, quality, security and management, so that the organisation has data at the right time and of the right quality to turn into information, and therefore to create the data-driven insight on which business decisions can be made.
A strategy that addresses how an organisation manages the above five areas of data management is a great start to demonstrate that an organisation is thinking about its data as an asset. Artificial intelligence projects can then be far more successful in releasing the value of that data.