Zero-copy data sharing: securely share your data on your marketplace, without duplicating or moving it
Zero-copy data sharing transforms the way organizations share and use their data. It allows data from external sources to be explored and consumed securely, without the need for duplication. In this article, Coralie Lohéac, Lead Product Manager at Huwise, explains how deploying zero-copy data sharing within a data marketplace opens up new perspectives for value creation within organizations.
Hello Coralie. In a few words, how would you define zero-copy data sharing and the role it plays in a data marketplace?
Coralie: Zero-copy data sharing makes data accessible and discoverable within a data marketplace, while it remains stored in its original system. There’s no need to duplicate or move it. It ensures that data on the marketplace is secure, reliable, and up-to-date, while allowing business users, who are often less technical or unfamiliar with tools such as data lakes or data warehouses, to easily explore it through an intuitive interface designed for them.
What led you to develop this new data sharing mechanism in the platform?
Coralie: Many of our customers were hesitant about importing certain data assets into their data marketplace, because importing meant duplicating the data. This was especially true when the assets came from departments that wanted to keep control of their data and preserve it as a single source of truth. Zero-copy data sharing removes this constraint by leaving the data in its original location while making it accessible through the marketplace. It ensures reliability, consistency and security, while making data easy for everyone to access and manage.
Should you share all your data in a data marketplace in zero-copy mode?
Coralie: Not necessarily. Most of our customers opt for a hybrid model on their data marketplace, combining virtualized data (zero-copy data sharing), duplicated data (prepared with the platform’s processors to make it easier to understand), and metadata-only listings. Our goal is to give them every possible option so they can share as much data as possible. It’s then up to them to decide on the best approach, since they know their own data estate and needs best.
When can zero-copy data sharing mode be beneficial?
Coralie: Zero-copy data sharing in a data marketplace makes it possible to expand the range of data accessible to business users. As some data cannot be physically moved, this approach still enables it to be shared while providing key safeguards and benefits:
- Security and reliability: Data stays at its source, under the direct control of the teams that manage it. It therefore retains its freshness, reliability, and status as a single source of truth. This is the most popular benefit for our customers.
- Wider usage: Data becomes accessible to all profiles through the data marketplace, an intuitive platform for exploring data without the need to run SQL queries.
- Better data discovery: With AI-powered features like our multilingual semantic search engine or recommendations of similar data, users don’t miss data that’s relevant to their needs.
- Usage management: Sharing virtualized data within a data marketplace makes it possible to track who accesses it, what queries are made, and how data is being used. This enables organizations to understand usage and justify the impact and ROI of data sharing.
- Reduced environmental impact: By avoiding storage duplication, zero-copy data sharing reduces the organization’s carbon footprint. This is a key advantage for organizations committed to Corporate Social Responsibility (CSR)/Environmental, Social and Governance (ESG) strategies or that have to meet compliance requirements.
What criteria are used to share data in a zero-copy approach?
Coralie: The decision to share data in a zero-copy approach is based on multiple factors, including:
- The volume of data: The larger the dataset, the more attractive this approach becomes, because it avoids the costs of duplicate storage.
- Data sensitivity: Some data belongs to specific teams who want to retain ownership and ensure its reliability. Zero-copy data sharing makes it possible to keep this single source of truth while sharing it securely with a larger number of users on the data marketplace.
Can you give us a concrete example of how zero-copy delivers benefits in a data marketplace?
Coralie: Of course. At Huwise, we have enabled zero-copy mode for our platform’s usage and adoption data in our own internal data marketplace. This is data from our data lake, which previously was only accessible to our data teams, because they were the only ones who had the technical training or relevant licenses. Now this data is accessible in real time via self-service to our Product Managers (essentially our business users), without the need for them to go through our data teams. This saves a considerable amount of time for everyone.
Which data management tools offer zero-copy sharing?
Coralie: The zero-copy sharing mode is available for the main data lakes and data warehouses on the market, including Snowflake and Databricks.
What upcoming developments can we expect? More connectors, more AI-driven innovations…?