Now more than ever, unstructured data governance and innovation intersect as more organizations contend with surging unstructured data and the arrival of generative AI. In a recent presentation, Kon Leong, CEO and founder of ZL Technologies, shared some vital insights on managing unstructured data in large enterprises and using AI while ensuring compliance and control.
Drawing on extensive experience with major financial institutions worldwide, Leong discussed the problems and opportunities that unstructured data presents in the age of AI.
Here’s a summary of the key points:
The Unstructured Data Governance Challenge
Unstructured data differs from structured data. It includes emails, documents, content from collaboration tools, social media, and instant messages. Unlike structured ERP or SAP data, unstructured data is entirely human-generated. In large organizations, its volume easily rivals that of global giants like Google, with some banks managing tens of petabytes of such data.
"The proliferation of redundant copies significantly increases costs and compliance risks," Leong explained, adding that redundant data drives up costs and, even more so, complicates compliance. Unstructured data growth creates real governance challenges in areas such as e-discovery, compliance, records management, and privacy. Organizations must face these challenges if they are to unlock value from their data.
A New Approach Through Virtualization in Data Governance
Leong introduced a revolutionary way to reduce both the challenges caused by unstructured data and the cost of AI: virtualizing data. His idea centers on extracting insights from unstructured data without storing much of the original content. “Virtualization reduces storage costs by up to 90%,” Leong emphasized while outlining how this approach cuts legal risk and speeds analytics.
Key benefits include:
- Cost Savings: Storage costs can drop by up to 90% because most original documents no longer need to be retained.
- Risk Mitigation: Virtualization reduces legal exposure because sensitive information is no longer held in its original form.
- Efficiency Gains: Analytic processes run far faster. For example, insights from millions of documents that previously took weeks can be obtained in hours.
The Sandbox and The Beach
To illustrate modern analytics limitations, Leong introduced the 'sandbox' and 'beach' models, emphasizing the benefits of a unified governance landscape over siloed sampling.
“The sandbox approach limits your scope,” he said. Instead, he recommends that organizations “move to a 'beach' model: a single governance landscape. This eliminates the problems of sampling errors and inefficiencies.” Beyond easing storage and cost constraints, virtualization can also support comprehensive compliance.
Balancing Governance and Innovation
Governance always involves competing priorities among internal stakeholders. Legal teams want data minimization, while analytics groups want data retained to power AI models. Leong emphasized the need for a centralized data committee to resolve these conflicts and align organizational priorities.
“Without alignment, governance becomes a food fight between departments,” Leong said. He urged organizations to bake governance into the AI ecosystem so that compliance is addressed early. Balancing these priorities lets an organization uphold both innovation and regulatory integrity.
Practical Steps for Enterprises
Leong's presentation offered concrete ways for organizations to address these challenges:
“Start with a focused initiative,” Leong advised. “For example, address compliance for a specific region or department before scaling enterprise-wide.”
The practical steps for enterprises are:
- Start Small: Large initiatives can be overwhelming. Solve one specific pain point first, such as compliance for a subset of employees or geographies, before scaling the solution.
- Employ Retrieval-Augmented Generation: Keep the knowledge an AI system draws on in modular, retrievable building blocks outside the model, so they can be updated or replaced independently without retraining the whole system. This makes data deletion requests easier to accommodate.
- Integrate Governance into Analytics: Build governance into the start of every AI workflow so that projects meet legal, privacy, and compliance requirements while still achieving their analytic goals.
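To make the retrieval-augmented generation point concrete, here is a minimal sketch of why deletion requests become easier when knowledge lives in an index rather than in model weights. It is a toy illustration, not anything described in the presentation: the `DocumentIndex` class, the document IDs, and the word-overlap scoring are all hypothetical stand-ins for a real retrieval layer, and no language model is actually called.

```python
class DocumentIndex:
    """Toy retrieval index (hypothetical): documents live outside any model."""

    def __init__(self):
        self._docs = {}  # document ID -> text

    def add(self, doc_id, text):
        self._docs[doc_id] = text

    def delete(self, doc_id):
        # Honoring a deletion request is a single removal; nothing is retrained.
        self._docs.pop(doc_id, None)

    def retrieve(self, query, k=2):
        """Return up to k document IDs ranked by naive word overlap with the query."""
        query_words = set(query.lower().split())
        scored = []
        for doc_id, text in self._docs.items():
            overlap = len(query_words & set(text.lower().split()))
            if overlap:
                scored.append((overlap, doc_id))
        # Highest overlap first; ties broken by document ID for stable output.
        scored.sort(key=lambda pair: (-pair[0], pair[1]))
        return [doc_id for _, doc_id in scored[:k]]

    def build_prompt(self, question, k=2):
        """Assemble the context a generative model would receive (assumed format)."""
        context = "\n".join(self._docs[d] for d in self.retrieve(question, k))
        return f"Context:\n{context}\n\nQuestion: {question}"


index = DocumentIndex()
index.add("policy-1", "retention policy for trader emails is seven years")
index.add("memo-7", "employee jane doe requested erasure of personal emails")

before = index.retrieve("emails retention")
index.delete("memo-7")  # simulate a data deletion request
after = index.retrieve("emails retention")
prompt = index.build_prompt("emails retention")
```

After the `delete` call, the removed memo can no longer reach any prompt, which is the property the bullet above describes. A production system would use a real search or vector index and would also need to purge caches and backups.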
The Future of Unstructured Data Governance
Leong predicts that structured and unstructured data governance will converge in the next 3 to 5 years, allowing organizations to gain unified insights across diverse data sets. “This shift will redefine enterprise decision-making,” he said.
But he added that this depends on whether today's governance challenges can be overcome. Investing in robust data management frameworks is the path to securely realizing the full value of generative AI. As the scale and complexity of information grow, so do the stakes. Organizations that can innovate with a disciplined approach to governance are best positioned to prosper in the AI era.
Watch the full presentation to learn how your organization can navigate unstructured data challenges and thrive in the era of generative AI.