By: Mark Lewis, SVP Sales and Marketing
No matter how much a company invests in big data, analytics, or AI, or how carefully an initiative is rolled out, even a slight misstep in maintaining data integrity can undermine informed decision-making.
A minor slip-up on data integrity can bring mistrust, reputational damage, data security issues, uninformed decisions, and expensive regulatory risk. A Gartner research report published in 2018 estimated that poor data quality costs organizations an average of $15 million per year. Harvard Business Review estimates that $3.1 trillion overall is wasted each year on identifying and fixing data issues.
A 2020 survey of data leaders by the Business Performance Innovation (BPI) Network noted that a majority of Chief Data Officers (CDOs) are building and refining data lakes in the cloud to manage and analyze a growing volume and variety of structured, semi-structured, and unstructured data for use cases across their enterprises.
But a wide range of complexities remains around data quality, security, synchronization, migration, governance, and visibility. A major pitfall is trying to achieve too much with data analytics too quickly. CDOs wrestle with leveraging analytics to reduce risk and drive new revenue opportunities, including monetizing data. Another major challenge is data replication and migration: moving data efficiently and selectively between locations without compromising data integrity.
Integrity – what and how?
This is what makes data integrity an imperative rather than a choice for every data team and strategy. Integrity means that data remains reliable and trustworthy throughout its life cycle. Data in a database is verified for accuracy and for functioning as required, and validation confirms that data is not modified or corrupted unexpectedly during any access to the database.
Data integrity testing covers these areas. It detects changes or errors in stored data and ensures that data remains valid across relationships. It is a test for the consistency, conformity, and uniformity of data.
Here are some key ways to test these characteristics:
- Checking data compatibility with an older version of an operating system.
- Verifying that data is stored in the right place and in the right structure.
- Analysis of blank values or default values.
- Connections between front end and back end for all functionalities.
- Ability to keep up with ever-growing complexity and size of data.
- Entity integrity – examining each row of a table for a non-null, unique primary key.
- Domain integrity – checking that each data value falls within the set of values allowed for its column.
- Referential integrity – checking the relationship between foreign keys and the primary keys of related tables.
- Ascertaining the context and real-time consistency of every data value – cold and warm data alike.
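The entity, domain, and referential integrity checks above can be expressed as plain queries against a database. The sketch below uses Python's built-in sqlite3 module with a hypothetical two-table schema (customers and orders, both names invented for illustration) to flag null primary keys, out-of-domain values, and orphaned foreign keys:

```python
import sqlite3

# Hypothetical schema: customers (primary key) and orders
# (foreign key customer_id referencing customers.id).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customers (id INTEGER PRIMARY KEY, country TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
INSERT INTO customers VALUES (1, 'US'), (2, 'DE');
-- Order 11 references a customer that does not exist.
INSERT INTO orders VALUES (10, 1, 99.5), (11, 3, 20.0);
""")

# Entity integrity: every row must have a non-null primary key.
null_keys = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE id IS NULL"
).fetchone()[0]

# Domain integrity: each value must belong to the allowed set for its column.
bad_countries = conn.execute(
    "SELECT COUNT(*) FROM customers WHERE country NOT IN ('US', 'DE', 'FR')"
).fetchone()[0]

# Referential integrity: every non-null foreign key must match an
# existing primary key; unmatched rows are orphans.
orphans = conn.execute("""
    SELECT COUNT(*) FROM orders o
    LEFT JOIN customers c ON o.customer_id = c.id
    WHERE o.customer_id IS NOT NULL AND c.id IS NULL
""").fetchone()[0]

print(null_keys, bad_countries, orphans)  # prints "0 0 1"
```

In practice these rules are better enforced declaratively (NOT NULL, CHECK, and FOREIGN KEY constraints) so violations are rejected at write time; query-based checks like these remain useful for auditing data that arrived before the constraints existed or via bulk loads that bypassed them.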
Data integrity is an essential part of the testing and software development process. As a recent NewVantage Partners survey pointed out, almost 49 percent of CDOs hold primary responsibility for data within their firm. They should therefore build data integrity into their strategies so that data leads to tangible outcomes. By investing in data integrity in the right way and at the right time, they can mitigate risks, avert attacks, and reduce the compliance burden. It is a significant investment, and done well it can bring large gains on many dimensions.
Smart CDOs already do this. It is all about making the software foolproof, and it is what will make data investments ready for the power of analytics.