Can data privacy and data intelligence coexist?
- by 7wData
We have to change how we approach customer data. Instead of the current “more is better” approach to data acquisition, we need to focus on collecting only the minimum amount our systems need to remain intelligent.
In a business climate where data is considered one of the most important resources for financial success, this may sound counter-intuitive. However, it’s a change businesses will need to make, and collecting less data is actually not as risky as it sounds.
The common assumption in business today is that the more data your systems have access to, the more intelligent they will be. This is not always the case, however. And even where it is, the inverse — that less data must thus equal less intelligence — is emphatically not true.
When the assumption prevails that more data is a competitive business differentiator, businesses are, in effect, incentivized to pursue ever more ways to gather data, often to disastrous effect.
Every day we see news about data breaches, leaks, and exposed vulnerabilities. We learn horror stories of identity theft and financial fraud, and we witness businesses suffering reputational damage, regulatory punishment, and consumer backlash because of their inability to protect the data they collect.
Privacy is only one of the problems associated with this overwhelming push for more data. There are also substantial costs associated with massive-scale data acquisition and management: computation costs, storage costs, operational costs, and more. We are in an era of Big Data, AI, and machine learning, and yet if data volume continues to be equated with system intelligence, these costs will continue to skyrocket.
Businesses today want to know absolutely everything they can about a customer. Customers, however, recoil at the idea of their every move being watched, recorded, processed, and analyzed. The more data businesses collect, the more exposed customers feel, and when customer data gets stolen, everyone loses. Everyone except the criminals.
But if we’re smarter about what data we collect and how we process and analyze it, we actually don’t need anywhere near as much data as we think we do.
The most crucial step is to move away from collecting and relying upon individual data and towards processing and analyzing aggregated data. For example, instead of analyzing data from a single IP address, we can look at the IP prefix, and in doing so, we can derive all the intelligence we need. The advantage of this approach is that the more we can process data at the group level, the less we need to know about individual users. While this may seem paradoxical, the truth is that we can derive more relevant intelligence, even as we require less data.
When we engage in feature engineering, a critical part of building advanced models, we can create features based on data aggregated over a specific period of time: for example, a feature to calculate the total amount of transactions processed from a particular device where the amount of each transaction exceeds a defined threshold. With this approach, we don’t need to know individual transaction amounts precisely.
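The two ideas above, grouping by IP prefix and building a windowed threshold feature, can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any particular production system: the /24 prefix length, the function names, the sample addresses and the event layout are all hypothetical, and the feature is read here as a count of over-threshold transactions, so each event carries only a yes/no flag rather than an amount.

```python
import ipaddress
from collections import Counter
from datetime import datetime


def ip_prefix(ip: str, prefix_len: int = 24) -> str:
    """Map an address to its network prefix (the /24 default is an assumption)."""
    network = ipaddress.ip_network(f"{ip}/{prefix_len}", strict=False)
    return str(network)


def events_per_prefix(ips, prefix_len: int = 24) -> Counter:
    """Count events per prefix instead of per individual address."""
    return Counter(ip_prefix(ip, prefix_len) for ip in ips)


def over_threshold_count(events, device_id, start, end) -> int:
    """Count a device's over-threshold transactions in the window [start, end).

    Each event is (device_id, timestamp, over_threshold). Only the boolean
    flag is stored, so the precise transaction amount is never needed.
    """
    return sum(
        1
        for dev, ts, over in events
        if dev == device_id and start <= ts < end and over
    )


# Hypothetical sample data (documentation-reserved address ranges).
ips = ["203.0.113.7", "203.0.113.200", "198.51.100.1"]
prefix_counts = events_per_prefix(ips)

events = [
    ("device-a", datetime(2024, 5, 1, 9, 0), True),
    ("device-a", datetime(2024, 5, 1, 9, 5), False),
    ("device-a", datetime(2024, 5, 2, 9, 0), True),   # outside the window below
    ("device-b", datetime(2024, 5, 1, 9, 0), True),
]
feature = over_threshold_count(
    events, "device-a", datetime(2024, 5, 1), datetime(2024, 5, 2)
)
```

In this sketch the two addresses in 203.0.113.0/24 collapse into a single group, and the device feature is computed without any transaction amount ever being recorded.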
Additionally, with holistic analysis conducted at the group level, we can uncover patterns, trends, and commonalities across actions and accounts that wouldn’t be discernible at the individual level. This enables us to glean a unique layer of valuable insights without having to delve further into individual accounts. The net result is less demand for individual data and greater overall intelligence. Derived data adds another layer of benefit, in that, from one single data point, we can determine multiple additional features that enable us to further refine results.
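As a small illustration of derived data, a single data point such as an event timestamp can yield several features. The article does not say which data points or features are meant, so the choice of a timestamp, the function name and the 9-to-5 business-hours window below are assumptions made purely for the example.

```python
from datetime import datetime


def derive_time_features(ts: datetime) -> dict:
    """Derive several features from one timestamp (illustrative choices only)."""
    weekday = ts.weekday()  # Monday == 0, Sunday == 6
    return {
        "hour_of_day": ts.hour,
        "day_of_week": weekday,
        "is_weekend": weekday >= 5,
        # The 09:00-17:00 window is an assumed definition of business hours.
        "is_business_hours": weekday < 5 and 9 <= ts.hour < 17,
    }


features = derive_time_features(datetime(2024, 5, 21, 17, 0))
```

None of these derived features requires collecting anything beyond the original timestamp.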