The Three Little Pigs is NOT a Fairy Tale

Published On: February 8, 2021 | Categories: Blog

Organizations Need Five Nines of Accuracy in Data Discovery and Classification for Privacy, Security, Data Governance, and Compliance

The Three Little Pigs is a story we have all heard, and it was one of my favorite childhood stories: three pigs build their houses out of different materials. And we all know the outcome! One day, a big bad wolf blows down the houses of the two pigs who built with straw and sticks, but he is unable to destroy the third pig’s house, which is made of bricks. Gosh, I am not sure how many times I must have read it to my daughters. It is arguable that all three pigs were strategic in their thinking and that none of them failed to plan. The big question is, “What were they trying to protect, against whom, and was it sustainable?”

Data is not just a driver of business innovation and growth; it also creates a wide and seemingly insurmountable threat surface, because data is spread across hybrid IT infrastructures, on-premises systems, and multi-cloud services. Organizations face increasing risks of financial liability as the number of data protection and privacy regulations grows internationally. The growing number of privacy incidents and data breaches is hurting brands and consumer confidence in these companies, not to mention exposing them to increased regulatory fines. Knowing accurately where all your data resides helps reduce these risks by letting you apply safety controls in the right places, and it enhances business continuity in case of a breach.

The challenge is that a large number of organizations today employ solutions that discover data only in known places; these solutions cannot identify dark data or data in motion within the organization. Moreover, these organizations need to determine which data in their vast data stores is business-critical, sensitive, or subject to regulation so that their tools can effectively manage and protect that data. To me, this sounds like building the house with straw and sticks. The truth is that a typical enterprise lacks an understanding of what sensitive data exists and where it sits within the organization, how it is used, who it is shared with and for what purpose, and how long it is needed. So, if an organization doesn’t have an accurate determination of its sensitive data, several questions arise: Am I really compliant with all the regulatory nuances? Have I applied all the security controls in the right places, or do I still have copies of data exposed across the organization?

Enter 2020 and beyond: An Era of Consumer “Trust” with Data Privacy and Protection

More than ever before, businesses today are legally responsible for safeguarding customers’ personal information residing in any part of the organization. According to Gartner, by 2023, 65% of the world’s population will have their personal information covered under privacy regulations, up from 10% today. With growing regulations, sophisticated data breaches, and a surge in demand from customers, organizations around the world need to make considerable investments in data privacy and protection while remaining focused on cost optimization. According to Gartner analysts [cited in The State of Privacy and Personal Data Protection, 2020-2022], the pace of privacy regulations accelerated through 2020 and has raised the stakes for organizations looking to standardize a global policy for handling personal data. A notable increase in data subject complaints indicates that expectations regarding privacy compliance have not normalized and continue to mount.

Customers are concerned with how their personal information is stored, who has access to it, and what safeguards are in place to protect their privacy. Many countries (and counting) across the globe have enacted some form of data privacy law to regulate how information is collected, how it is shared across the organization, and what control the data subject has over it. CISOs and Security & Risk Management leaders face enormous challenges in maintaining data privacy and security, because incidents such as noncompliance and ransomware may lead to fines and lawsuits. For organizations that have implemented, or are looking to implement, data discovery solutions that produce errors from unreliable data or misleading insights, the question to ask is, “Is good enough really good enough for your organization?”

The Next-Gen Solution for Sustainable Data Discovery

Knowing accurately where an organization’s sensitive data resides at any point in time makes it easier to judge how well the controls for managing risk are working, especially when large amounts of data move constantly across the organization. The flagship platform, Inventa™, is a cutting-edge, AI-based, sustainable data discovery and management platform that provides automated, near real-time discovery, mapping, and tracking of all sensitive data at enterprise scale with five nines (99.999%) of accuracy. It automatically discovers and analyzes all data usage and lineage across the enterprise, even if you have no idea what data you have or where it exists.
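To put five nines in perspective, here is a small back-of-the-envelope calculation. It is only a sketch: it assumes "accuracy" means the share of records classified correctly, and the inventory size of 100 million records is a made-up figure for illustration, not a number from any real deployment.

```python
def expected_errors(record_count: int, accuracy: float) -> float:
    """Expected number of misclassified records at a given accuracy (0 to 1).

    Assumption: accuracy is the fraction of records classified correctly.
    """
    return record_count * (1.0 - accuracy)


# Hypothetical inventory size, chosen only for illustration.
RECORDS = 100_000_000

for label, accuracy in [
    ("two nines (99%)", 0.99),
    ("three nines (99.9%)", 0.999),
    ("five nines (99.999%)", 0.99999),
]:
    errors = expected_errors(RECORDS, accuracy)
    print(f"{label}: about {errors:,.0f} records misclassified")
```

Under these assumptions, 99% accuracy leaves roughly a million records misclassified, while five nines leaves roughly a thousand. That gap is the difference between "good enough" and a list a team could realistically review by hand.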

Figure 1: A Bird’s Eye View

The platform assumes zero trust in data input and determines for itself where to scan in a given environment. It enables the organization to identify repositories, such as databases, applications, file systems, log files, cloud storage, and real-time streams of informational transactions, where sensitive data resides or may reside. The unique and proprietary passive network packet capture approach helps to identify sensitive data flowing through the organization. Finally, it analyzes and consolidates the data so that the user can see the data lineage and keep it up to date, respond to subject access requests, identify production data in non-production locations, and carry out other privacy, security, and data governance tasks.
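For readers who like to see the idea in miniature, the toy sketch below builds a tiny inventory of where sensitive-looking values turn up. It is not how Inventa™ works (the platform is described above as AI-based and using passive packet capture); it is a deliberately simple pattern-matching stand-in. The pattern names, the `find_sensitive` helper, and the sample repositories are all hypothetical.

```python
import re

# Illustrative patterns only; real classifiers need validation and context.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn_like": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def find_sensitive(repositories: dict[str, str]) -> dict[str, dict[str, int]]:
    """Map each repository name to counts of pattern matches in its text.

    Repositories with no matches are left out of the inventory.
    """
    inventory = {}
    for name, text in repositories.items():
        counts = {label: len(p.findall(text)) for label, p in PATTERNS.items()}
        counts = {label: n for label, n in counts.items() if n > 0}
        if counts:
            inventory[name] = counts
    return inventory


# Hypothetical repositories, standing in for files, tables, or log streams.
sample = {
    "prod_db_export.csv": "jane@example.com,123-45-6789",
    "test_env_notes.txt": "copied row: 123-45-6789",
    "readme.txt": "no personal data here",
}

inventory = find_sensitive(sample)
for repo, counts in inventory.items():
    print(repo, counts)
```

Even this toy version surfaces the kind of finding mentioned above: a production-style identifier sitting in a test environment file. What it cannot do is find repositories nobody told it about, which is exactly the gap automated discovery is meant to close.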

Summing Up

Sophisticated data breaches continue to rise and expose organizations to expensive fines and potential litigation. Additionally, there is a surge in demand for data subject access rights, such as data access or deletion requests and tracking of data transfers. Organizations need to adhere to the growing regulatory compliance requirements at all times, not just at a single point in time. Historically, organizations have hired consultants or deployed first-generation solutions that rely on manual techniques (for example, the user needs to point to every repository to be scanned) to conduct data discovery, mapping, and validation for compliance. As a result, data and analytics leaders spend time and money on mandatory activities such as regulatory compliance and incident response at scale.

The truth is that large amounts of data within the organization are unknown, in motion, and stored in more than just structured databases. Critical data can be anywhere: files, end-user applications, emails, Google Docs, images, and so on. Frankly, this is where the risk lies, as it makes it very difficult for an organization’s data guardians to locate critical data and build an inventory that gives a holistic view. Organizations use Inventa™ for sustained data discovery and classification to gain better visibility into and understanding of their sensitive data, to secure that data with appropriate controls and policies, and to support compliance, privacy, and ethical data use. With a multilayer machine learning analytic engine, the platform can “read and understand” the data and link all the pieces into a full picture, represented as both an inventory and a master catalog.

As we learned from the third pig, a house needs a firm foundation: organizations need an automated, sustainable, and highly accurate data discovery process. Then, when the big bad wolf comes, and he surely will, our sensitive data is protected across the organization.