Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-2

This is a two-part blog series. The following is the second part.

In Part 1, we traced how data collaborations are being reimagined and laid out the conceptual foundations, from redefining consent through the Account Aggregator framework to recognizing the limits of consent. We explored how privacy-preserving frameworks like differential privacy protect individuals even when models are built from their data; how electronic contracts replace slow, manual agreements with enforceable digital rules; and how confidential clean rooms combine secure hardware and privacy guarantees to enable computation without revealing raw data.

In Part 2, we explore how these building blocks come together in practice.

The Connective Tissue: Data Collabs

Technology alone cannot guarantee privacy, fairness, or effective collaboration. Data-sharing ecosystems need institutional scaffolding — entities that can operationalize trust, manage relationships, and abstract away complexity for participants.

This is where Data Collaboratives (or Data Collabs for short) come in.

A Data Collab isn’t a regulator or a government body. Rather, it is a facilitator organization — a neutral yet entrepreneurial entity that enables, orchestrates, and sustains data collaborations using the DEPA Framework behind the scenes, following standards and processes set by trusted bodies such as a Self-Regulatory Organization (SRO) and a Technology Standards Organization (TSO).

You can think of a Data Collab as the connective tissue of a data ecosystem — linking data providers, data consumers, and service providers.

In practice, a Data Collab:

  1. Provides tools and interfaces for participants to register, onboard, sign electronic contracts, and set up secure collaboration environments such as Confidential Clean Rooms (CCRs).
  2. Signs agreements with data providers to clean, prepare, and catalogue datasets so that they can be safely shared with authorized data consumers.
  3. Manages the flow of value — usually collecting payments from data consumers and distributing them fairly to data providers, while covering operational costs.
  4. Assumes accountability for ensuring that all interactions, permissions, and computations are compliant with the DEPA rules and contractual terms.
  5. Adds value beyond infrastructure — offering domain expertise, workflow design, governance and audit support — streamlining data collaborations.

Data Collabs will likely take different forms depending on the domain they serve. For example, some might focus on oncology research, others on financial fraud detection or climate-risk modeling. Each field has its own kinds of data, privacy rules, and ways of working — so it is natural for Data Collabs to specialize.

Because running these collaborations requires significant operational and technical effort, most Data Collabs will probably be for-profit enterprises. At the same time, because they operate on open, interoperable digital public infrastructure like DEPA, they are not monopolistic platforms. Instead, they enable a competitive marketplace where multiple Data Collabs can coexist, offering participants better choices, fairer pricing, and higher-quality services.

In this way, Data Collabs create a persistent institutional layer for responsible data use, enabling long-term, multi-party cooperation that would be impractical to coordinate through ad hoc agreements.

A real-world example: Accelerating Drug Discovery

Imagine three pharmaceutical companies, each developing treatments for the same rare disease. Each has conducted clinical trials with a few hundred patients — but individually, none has enough data in quantity, diversity, or parameter richness to train a robust predictive model of treatment response. 

Much like pieces of a puzzle, valuable insights often emerge only when data from different sources fit together — yet no single party should hold or see the entire picture.

If these companies could combine their datasets, and enrich them with other sources like gene expression profiles, cell imaging results, or public molecular databases, they could uncover deeper patterns and dramatically speed up drug discovery.

But three major barriers stand in their way:

  1. Competitive concerns: Each company treats its clinical data as proprietary and doesn’t want to reveal it to others.
  2. Privacy regulations: Patients gave consent only to the company that ran their trial — not to share data across firms.
  3. Practical limits: Many patients can’t be re-contacted to renew consent, making manual legal processes infeasible.

This is where the DEPA Framework fits in. Here’s how it would work:

A Data Collab is formed for long-term drug discovery collaborations. It signs electronic contracts with each company, defining rights, responsibilities, and permitted use of data. It handles registration, onboarding, and compliance checks through standardized interfaces.

Electronic contracts set out the exact terms of collaboration — specifying each party’s role, the artefacts they contribute, and the rules that govern privacy, usage, and value-sharing.

Each company uploads its encrypted trial data or model into a Confidential Clean Room. Data inside the CCR is decrypted only after checks confirm that all security and compliance conditions are met.

Data is programmatically joined and enriched within the CCR, followed by AI model training using privacy-enhancing techniques like differential privacy, which mathematically bound the risk of re-identifying any patient.

Only the final trained model and its accompanying logs — never the underlying data — leave the CCR. The model can be decrypted solely by the authorized data consumer(s) (i.e. the modellers), protecting their trade secrets.

Auditors can review logs and trace the provenance of all artefacts at any time — via the DEPA AI Chain — to verify compliance and resolve disputes.
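The sequence above is essentially a gate-then-compute pattern: contract and attestation checks gate entry, computation happens on the pooled data inside the enclave, and only a noised aggregate plus an audit artefact leave. The toy Python sketch below illustrates that shape; every name and number is invented for illustration and is not part of any DEPA specification.

```python
import hashlib
import random

def clean_room_pipeline(datasets, contract_ok, epsilon):
    """Toy CCR flow: the join exists only inside this scope; only an
    aggregate 'model' and a provenance digest are returned."""
    if not contract_ok:                                  # e-contract / attestation gate
        raise PermissionError("compliance checks failed; data stays encrypted")
    pooled = [row for ds in datasets for row in ds]      # join happens inside the enclave
    # Difference of two Exp(epsilon) draws is Laplace noise of scale 1/epsilon
    noise = random.expovariate(epsilon) - random.expovariate(epsilon)
    model = {"mean_response": sum(pooled) / len(pooled) + noise}   # DP-noised output
    audit = hashlib.sha256(repr(sorted(pooled)).encode()).hexdigest()
    return model, audit                                  # raw rows never leave

# Three companies' (invented) trial response rates
trials = [[0.42, 0.51, 0.38], [0.47, 0.55], [0.40, 0.44, 0.49]]
model, audit_digest = clean_room_pipeline(trials, contract_ok=True, epsilon=0.5)
```

If the contract check fails, the function refuses to compute at all, mirroring how a CCR never decrypts data until every condition is verified.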

This framework delivers several benefits for all concerned stakeholders:

  • For society: Promising treatments reach patients faster, while a reusable governance and technology blueprint emerges for future biomedical collaborations. 
  • For the economy: A new data-driven economy is unlocked, enabling novel business interactions and boosting meaningful economic activity.
  • For companies: They can innovate together without exposing trade secrets or breaking regulatory rules, expanding what’s possible in research and development.
  • For regulators and auditors: Every transaction leaves a verifiable trail, simplifying oversight and boosting trust in the ecosystem.

Summing up

India’s journey toward responsible data use has been progressive and layered.

  • It began with the Account Aggregator framework — making consent Open, Revocable, Granular, Auditable, Notifying and Secure (ORGANS principle).
  • For model training and analytics, Privacy-Enhancing Technologies (PETs) — such as Differential Privacy — introduce mechanisms like the privacy budget to safeguard individuals while enabling learning.
  • To make collaboration faster and more reliable, Electronic Contracts replace traditional paper/PDF agreements with machine-readable, enforceable commitments — cutting through the friction of slow legal processes.
  • Confidential Clean Rooms (CCRs) operationalize these safeguards — enabling computation on sensitive data.
  • Finally, Data Collaboratives weave all these elements together — creating institutional and economic frameworks that make responsible, long-term data collaboration practical and sustainable.

This is the next frontier of Digital Public Infrastructure for AI — proving that protection and innovation are not opposites. With the right frameworks, we can have both.

Read Part 1: Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-1

Please note: The blog post is authored by our volunteers, Hari Subramanian and Sarang Galada

For more information, please visit: https://depa.world/

Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-1

This is a two part blog series. The following is the first part.

Every day, we generate vast amounts of digital data — withdrawing cash, visiting doctors, ordering groceries, using various mobile apps. These data trails have the potential to streamline services, personalize experiences, and drive breakthroughs in fields from medicine to finance. Yet they also carry risks: unfair profiling, intrusive targeting, and exposure of sensitive personal information.

This presents a fundamental challenge: How can we harness the value of data while preserving individual privacy?

Understanding Privacy

In the age of AI, privacy violations no longer just expose personal information. They erode autonomy and tilt power toward those who control data and algorithms. As AI systems harvest behavioral cues, digital footprints, and social networks, people lose control, not just over their information, but also over how they are profiled and influenced. This enables subtle yet pervasive forms of coercion, from tailored manipulation of choices to algorithmic exclusion from opportunities.

At scale, such surveillance dynamics erode trust and weaken democratic agency. In this era, privacy is not merely about secrecy, it is a precondition for freedom, dignity and meaningful participation in society.

Privacy is often mistaken for confidentiality, but it’s not simply about hiding information. Privacy is the property of not being able to identify individuals from the signals they produce. Confidentiality, on the other hand, is about limiting access to those signals in the first place. To protect privacy and confidentiality while respecting individual autonomy, we need strong control mechanisms that let people decide what data is shared, with whom, for what purpose, and for how long.

And privacy isn’t a one-time setting. Data moves through a lifecycle — it is collected, used, stored, reused, and eventually deleted. These protections must hold at every stage, or they are lost.

The Mechanics of Consent

Today, consent remains the most common mechanism for privacy — the basic control primitive intended to let people decide how their data is collected, shared, and used. The concept of consent actually predates the digital era — it began in a paper-based world, where signatures and written permissions served as the primary means of authorizing data use. 

It is important to distinguish between two kinds of consent:

  1. Consent to collect data – allowing an entity to initially gather your data (for example, an app accessing your camera).
  2. Consent to share data – granting permission for that data to be used or passed on for a specific purpose (for example, a bank sharing your salary details with a loan underwriter).

Our focus in this article is on consent to share data, since that is where both the greatest privacy challenges and the most meaningful opportunities for value creation lie.

Here is the problem with how consent is implemented today. Under frameworks like GDPR, consent is defined as a coarse-grained, blunt artifact. The same entity collects your data, gathers your consent, and enforces the rules around its use. For individuals, this typically means an all-or-nothing choice — share everything or nothing at all. And for innovators, it stifles the ability to responsibly explore new uses of data.

India’s Innovation: Unbundling Consent

When India designed its Account Aggregator system for financial data sharing, it chose a different path. Consent to share data was unbundled into two parts:

  • Collect consent: Managed by trusted intermediaries called Account Aggregators.
  • Enforce consent: Managed downstream by Financial Information Users (like banks or wealth advisors), under ecosystem oversight.

https://sahamati.org.in/what-is-account-aggregator/

At the heart of this design lies a set of principles that make consent Open, Revocable, Granular, Auditable, Notifying, and Secure or ORGANS for short.
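To make the ORGANS idea concrete, here is a toy Python sketch of what a consent artifact with these properties might look like. The class and field names are purely illustrative and do not reflect the actual Account Aggregator schema.

```python
from datetime import datetime, timedelta, timezone

class ConsentArtifact:
    """Toy consent record with ORGANS-style properties (names illustrative)."""

    def __init__(self, principal, recipient, fields, purpose, valid_for_days):
        self.principal = principal            # who is granting consent
        self.recipient = recipient            # who may receive the data
        self.fields = set(fields)             # Granular: only the named fields
        self.purpose = purpose
        self.expires = datetime.now(timezone.utc) + timedelta(days=valid_for_days)
        self.revoked = False
        self.audit_log = []                   # Auditable: every action is recorded

    def revoke(self):                         # Revocable at any time
        self.revoked = True
        self.audit_log.append(("revoked", datetime.now(timezone.utc)))

    def permits(self, recipient, field, purpose):
        now = datetime.now(timezone.utc)
        ok = (not self.revoked
              and recipient == self.recipient
              and field in self.fields        # only the granted fields
              and purpose == self.purpose
              and now < self.expires)
        self.audit_log.append(("checked", recipient, field, ok))  # Notifying hook
        return ok

consent = ConsentArtifact("alice", "lender-x", {"salary", "emi_history"},
                          purpose="loan-underwriting", valid_for_days=30)
consent.permits("lender-x", "salary", "loan-underwriting")   # allowed while valid
consent.revoke()
consent.permits("lender-x", "salary", "loan-underwriting")   # denied after revocation
```

The point of the sketch is that each ORGANS property becomes a mechanical check or a log entry, rather than a clause buried in terms and conditions.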

The Account Aggregator (AA) framework became the first manifestation of DEPA — the Data Empowerment and Protection Architecture. It is now India’s go-to model for user-consented data sharing between institutions, especially for straightforward data transfers and simple inference tasks.

Consent works well for inferences — one-time decisions like a bank checking your last six months of transactions to approve a loan. Yet, in practice, consent has well-known limits. People are asked to grant permission repeatedly, often through long, opaque terms they don’t fully understand, leading to consent fatigue and a loss of meaningful control.

These limitations become clearer when we move from individual decisions to model training and large-scale analytics, where algorithms learn patterns from millions of records. Seeking or managing consent at that scale is neither practical nor effective. 

What’s worse is that models can sometimes memorize sensitive data and inadvertently reveal it later. This highlights the need for new, complementary control primitives that uphold privacy and accountability even when explicit consent isn’t feasible.

Attempts at de-identification — the process of removing or masking identifiers to anonymize data — have significant limitations in practice. Although anonymization is meant to ensure that individuals cannot be re-identified, de-identification techniques are often reversible when datasets are combined with external information. As a result, such approaches offer only weak privacy guarantees, and numerous cases have shown how easily supposedly “anonymous” data can be linked back to individuals.

Privacy-preserving Algorithms: A New Control Primitive for Training and Analytics

To address these limits, a new class of algorithms has emerged under the broad umbrella of Privacy-Enhancing Technologies (PETs). Let us call these privacy-preserving algorithms, to differentiate them from other classes of PETs. They provide a spectrum of technical safeguards that preserve privacy while still enabling useful computation and collaboration on sensitive data.

Among these, Differential Privacy (DP), a mathematical framework for preserving individual privacy in datasets, stands out as a powerful privacy primitive for model training and data analysis.

The key idea: DP adds carefully calibrated noise to queries or model updates so that the results are statistically indistinguishable whether or not any single individual’s data is included. This ensures that nothing specific about an individual can be reliably inferred.

To make this guarantee rigorous, DP introduces the concept of a privacy budget (often represented by the parameters ε (epsilon) and δ (delta)):

  • Each query or training step “spends” some of this budget.
  • With more queries or training epochs, the cumulative privacy loss increases.
  • Once the budget is exhausted, no further queries or training is allowed, keeping the risk of re-identification mathematically bounded.

Think of this as a quantitative accounting system for privacy loss. Note, however, that DP comes with a utility tradeoff: adding calibrated noise can reduce model accuracy or data usefulness. Hence, depending on the use-case, the right privacy controls may be achieved through other privacy-preserving algorithms, or a combination thereof.
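As a rough illustration of the accounting system, here is a minimal Python sketch of the Laplace mechanism with a privacy budget. `PrivateCounter`, its parameters, and the sample data are all invented for this example; production DP libraries are considerably more careful.

```python
import random

class PrivateCounter:
    """Answers counting queries via the Laplace mechanism while
    tracking cumulative privacy loss against a fixed budget."""

    def __init__(self, data, total_epsilon):
        self.data = data
        self.remaining = total_epsilon        # the privacy budget

    def noisy_count(self, predicate, epsilon):
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted: query refused")
        self.remaining -= epsilon             # each query "spends" some budget
        true_count = sum(1 for row in self.data if predicate(row))
        # A count has sensitivity 1, so Laplace noise of scale 1/epsilon suffices;
        # the difference of two Exp(epsilon) draws is exactly such noise.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

ages = [34, 51, 29, 44, 63, 38]               # invented sample data
counter = PrivateCounter(ages, total_epsilon=1.0)
answer = counter.noisy_count(lambda a: a > 40, epsilon=0.5)  # spends half the budget
```

A small ε means large noise (strong privacy, low utility); a larger ε sharpens the answer but drains the budget faster — exactly the utility tradeoff described above.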

Electronic Contracts: Digitizing Trust

While privacy-preserving computation enables data to be used securely, participants still need clear agreements defining who may use it, for what purpose, or under what conditions. For such collaborations to function effectively, there must be a well-defined and enforceable contractual framework that specifies each party’s rights, obligations, and permissions.

The need for such a framework becomes even more pressing as organizations seek to unlock real value from data. No single dataset is enough; the most meaningful insights arise when information from multiple sources — hospitals, banks, labs, startups, or agencies — can be combined and analyzed responsibly. Yet each participant brings its own rules, contracts, and compliance obligations, creating a patchwork of agreements that are difficult to align.

Traditionally, contracts are legal documents — PDFs or paper agreements — written in human language, interpreted by lawyers, and enforced by institutions. They work well when a few parties are involved, but in modern data collaborations, this model quickly breaks down.

Today, every new collaboration means drafting, signing, and managing a maze of separate legal agreements, often in different formats, scattered across systems, and maintained by hand. With every participant added, the web of contracts grows bulkier, making coordination slow, expensive and error-prone. Every change or dispute requires human intervention and can take weeks or months to resolve.

This contractual friction has long been the viscous drag holding back scalable, compliant data collaboration. Not because trust is missing, but because it is buried under paperwork.

Electronic contracts transform this equation. They are machine-readable, digitally signed, and executable agreements that translate legal promises into enforceable code. Instead of being static documents, they are active digital objects that the DEPA orchestration layer can interpret and act upon — automatically initiating workflows, enforcing permissions, and ensuring compliance.

In effect, electronic contracts bridge law and computation.  They enable trust, automation, and accountability at digital speed, replacing manual paperwork with a system that can verify, execute, and audit commitments in real time.
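To see what "machine-readable and executable" might mean in practice, here is a deliberately simplified Python sketch of a contract object an orchestration layer could evaluate automatically. All field names and parties are hypothetical, not DEPA's actual contract format.

```python
from dataclasses import dataclass, field

@dataclass
class ElectronicContract:
    """Illustrative machine-readable data-sharing agreement (names hypothetical)."""
    provider: str
    consumer: str
    permitted_purposes: set
    max_epsilon: float                  # agreed ceiling on privacy loss
    signatures: set = field(default_factory=set)

    def sign(self, party):
        self.signatures.add(party)

    def authorises(self, party, purpose, epsilon):
        """A workflow engine can evaluate a request mechanically, in real time."""
        return ({self.provider, self.consumer} <= self.signatures
                and party == self.consumer
                and purpose in self.permitted_purposes
                and epsilon <= self.max_epsilon)

contract = ElectronicContract("pharma-a", "model-co",
                              {"model-training"}, max_epsilon=1.0)
contract.sign("pharma-a")
contract.sign("model-co")
contract.authorises("model-co", "model-training", 0.5)   # within the agreed terms
contract.authorises("model-co", "resale", 0.5)           # purpose was never granted
```

Because the terms are data rather than prose, a request outside the agreement is rejected instantly, with no lawyer in the loop — the friction described above disappears from the hot path.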

Confidential Clean Rooms (CCR)

To operationalize the above elements, we need infrastructure that embeds privacy and compliance mechanisms by design, while also supporting diverse collaboration modalities — from data analytics and model training to various forms of inference.

That’s where Confidential Clean Rooms (CCRs) come in. A CCR is a secure computing environment that allows organizations to collaborate on data without ever sharing it in plain form. You can think of it as a locked, monitored laboratory where data from multiple parties can be brought together for analysis — yet no participant, not even the operator of the lab, can peek inside.

At the heart of every CCR is Confidential Computing — a technology that uses Trusted Execution Environments (TEEs) built into modern processors.  When data enters a TEE, it is encrypted and isolated from the rest of the system, ensuring that even cloud providers or system administrators cannot access it. Computations run inside this protected enclave, and only verified results can leave. Each TEE also produces a cryptographic attestation, a proof that the computation was executed correctly and under the agreed conditions.
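Conceptually, an attestation is a signed measurement of the exact code an enclave will run, which a data provider verifies before releasing decryption keys. The Python sketch below imitates that handshake with an HMAC standing in for the hardware signature; real TEEs use processor-held keys and certificate chains, so every name here is illustrative.

```python
import hashlib
import hmac

ENCLAVE_KEY = b"stand-in for a processor-held attestation key"

def attest(code: bytes) -> str:
    """The enclave 'measures' the exact code it will execute."""
    return hmac.new(ENCLAVE_KEY, code, hashlib.sha256).hexdigest()

def release_data_key(quote: str, code: bytes, approved_code: bytes) -> bool:
    """A data provider releases keys only if the quote proves the enclave
    runs precisely the code the parties agreed to."""
    return code == approved_code and hmac.compare_digest(quote, attest(code))

approved = b"train_model_with_dp(epsilon=0.5)"
quote = attest(approved)
release_data_key(quote, approved, approved)              # agreed code: keys released
release_data_key(quote, b"exfiltrate_rows()", approved)  # tampered code: refused
```

The essential property is that trust attaches to the measured code, not to the operator: even a compromised administrator cannot obtain the keys without producing a valid quote for the approved computation.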

https://depa.world/training/architecture

On their own, CCRs provide secure execution. But when combined with two other DEPA primitives:

  1. Electronic Contracts, which specify who can use what data for what purpose, and
  2. Privacy-preserving algorithms, which provide mathematical controls about what information can or cannot leak,

they form a complete privacy-preserving data-sharing stack.

In essence, Confidential Clean Rooms (CCRs) enable confidential, techno-legal, and privacy-preserving computation on data. They make it possible to conduct large-scale data inference, analytics and modelling responsibly, without transferring raw data to any third party, and thereby eliminating the need for consent specifically for data sharing.

But technology alone doesn’t build ecosystems. Who brings this framework to life, abstracting away its complexity for everyday organizations? How might it help us confront our most urgent global challenges — in health, climate and finance? And how could it unlock entirely new kinds of enterprises, fueling a vibrant and responsible data economy for the Intelligence Age?

Data Collabs!

Read Part 2: Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-2

Please note: The blog post is authored by our volunteers, Hari Subramanian and Sarang Galada

For more information, please visit: https://depa.world/

Imagining Indian Cities: #1 ‘Creative Bangalore’

There is a need for a continuous conversation about the best way to shape the future of Indian cities. This conversation will take place across multiple cities, with each city learning from the others.

Therefore, it is proposed to hold an ‘Imagining Indian Cities’ Workshop annually in different Indian cities. The first took place in Bangalore from 10 to 15 March 2025, with the next two planned for Chennai and Pune.

These workshops will gather academics, practitioners and urban innovators for multi-day get-togethers. Half of each workshop will focus on the host city, and the other half on learnings from elsewhere.

This ‘Creative Bangalore’ Workshop was organised by the Indian think tank iSPIRT Foundation and supported by IISc/IUDX, IIHS and Dassault Systèmes. It brought together participants chosen to form a sustainable collective, capable of shedding light on a number of key questions and moving towards increasingly measurable contributions.
Initial key questions were:

  1. Genericity and reproducibility of the Creative Cities model developed by Patrick Cohendet
  2. Digital Urban Data, Digital Public Infrastructure (DPI) and territorial intelligence
  3. Placement of (Digital) Commons
  4. Digital representations of culture
  5. Digital representations of Wicked Problems

Results:

  • 5 full days, hosted by Bangalore International Centre (D1), IISc/India Urban Data Exchange (D2), Sabha (D3), Indian Institute of Human Settlement (D4) and Dassault Systèmes (D5);
  • More than 50 speakers, in person or online;
  • A rich repository of content, including speaker presentations, slides and photos: https://drive.google.com/drive/folders/1lxnbuhna3Hz_t49BOGvZG15_0dByi9Ki

As the AI race across the world heats up, a post: “India doesn’t wish to be just a trade colony of China or technology colony of the US”

To succeed at AI, we need a whole-of-nation approach involving deep-tech startups, enabling industrial policy and pre-commercial publicly-funded research.

When the Biden Administration released its AI Diffusion Executive Order a few weeks back restricting GPUs to countries, it became clear that having strategic autonomy in AI was of paramount importance to India.

Just being the use-case capital for AI wasn’t the right way to go.

India doesn’t wish to be a trade colony of China or the technology colony of the US.

What makes AI different is that it needs a whole-of-nation approach. To win at AI we need deep-tech startups, enabling industrial policy and pre-commercial publicly-funded research. It is only when all three come together that magic can happen.

Our resistance to the whole-of-nation approach is understandable. After all, our IT Services and SaaS industry came up without the whole-of-nation approach. So, many people thought that the same playbook would apply to AI.

China has proved with DeepSeek’s R1 and Moonshot AI’s Kimi k1.5 (both Chinese companies) that a whole-of-nation approach can have big payoffs. In India, this approach has worked for cryogenic engines, 4G/5G telecom equipment and India Stack. We do remarkable things when we set our mind to it!

Yes, we have lost some time due to the use-case capital camp. But all is not lost. The field is still young and many areas like neurosymbolic AI are very much open.

The Biden AI Diffusion order and Chinese successes have given new vigour to the whole-of-nation camp within government, the private sector and civil society. The debate is now over: you will see some good developments become visible in the coming months. #AI #StrategicAutonomy

Also see: https://www.moneycontrol.com/technology/deepseek-s-llm-success-triggers-big-debate-is-india-s-hesitation-a-strategic-mistake-article-12921811.html

Digital Public Infrastructures

This workshop was organized by the Indian think tank iSPIRT Foundation, French Embassy in India, Consulate General of France in Bangalore, and La French Tech in India, based on the following principles:

  • Gathering high level contributors from India and France: industrials, transdisciplinary academics, diplomats, officials, business founders, think tank members, technology makers;
  • Pushing a Workshop format (not an event, not a round table, not a scientific conference), organizing 3 different days with 3 different viewpoints:
    • Philosophical/epistemological/ human sciences,
    • Economical/techno-legal/social sciences/adoption,
    • Application domains and use cases (Health, Culture, Creative Cities, Agriculture);
  • Targeting recommendations toward the AI Action Summit (Paris, February 2025).

Results:

  • More than 80 speakers, 100 participants in person (in Bangalore or in Paris), 200 participants online;
  • 14 different countries represented from around the world (India, France, Canada, USA, Mexico, Guatemala, Brazil, Germany, Netherlands, Italy, Spain, Portugal, Belgium, Thailand);
  • An opening session featuring the Ambassador of India to France H.E. Mr. Jawed Ashraf, the French Digital Affairs Ambassador H.E. Mr. Henri Verdier, and the Consul General of France in Bangalore Mr. Marc Lamy;
  • A rich repository of content, including speaker presentations, slides and recordings (https://drive.google.com/drive/folders/1eTDbRgw1g8EOBtXS7uRim9I6gtQRbCZB).

Economic Transformation through AI: Key pillar to a large Indian Economy in Global Top 3

In the rapidly evolving landscape of the AI economy, the choices made today will reverberate for generations. As custodians of India’s future, we must recognize the urgency of embracing AI as a lynchpin of economic growth. The time to act is now!

In an era characterized by relentless technological advancement, a nation’s economic growth trajectory hinges on its ability to harness the power of artificial intelligence (AI). Goldman Sachs reported that generative AI could raise global GDP by 7%. By 2030, this AI-driven Intelligence Economy might add $15.7tn of new economic value, as per PwC research.

With its burgeoning tech industry, diverse and large data pool and remarkable human capital, India stands at the precipice of an economic transformation that could either propel it to global leadership or condemn it to follow in the wake of other trailblazers. As political decision-makers, the imperative to recognize and seize this opportunity cannot be overstated in view of India’s bid to become one of the top 3 economies of the world. The availability of the DEPA Training Cycle and the DPDP Bill passage through the Parliament open the door to immediate and strategic action via the creation of a large AI economy.

I. The AI Imperative for Global Competitiveness:

India’s demographic dividend of 900mn+ people is no secret but must be coupled with technological prowess to ensure a multiplier effect for sustained growth. As global economies increasingly pivot towards AI-driven industries, overlooking this shift risks consigning India to a secondary role on the global stage. To maintain competitiveness, India must embrace AI not merely as a tool but as the very foundation of its economic strategy going forward. It must ensure that it is not just a consumer of AI but a critical creator of AI. In fact, it must aim to emerge as one of the 3 AI superpowers in the world.

II. Safe AI Leadership Depends on Data

India’s DEPA Training makes privacy-preserving collaboration between Training Dataset Providers and Modelers (called Training Dataset Consumers) possible at a large scale, which is a critical element in the AI journey. The DEPA system does not rely on hard-to-implement enforcement of legal covenants around Anonymized Datasets, as is the case in countries like the US, where AI companies are fighting constant litigation. Instead, it depends on computational privacy guarantees in the use of aggregated datasets. This is core to enabling safe AI systems, built with reliable and traceable access to datasets. Such systems can then be deployed quickly, with the human alignment that India’s billion-plus users can provide. As India begins to unlock continental-scale datasets using this system, it will give rise to a vibrant ecosystem of AI Modelers. This dataset advantage in AI is not to be underestimated. By focusing on early Safe AI adoption, India can secure a foothold in these sectors, attracting global investment and cementing its position as an innovation hub whose AI innovations would be adopted by societies around the world.

III. Addressing Socioeconomic Disparities: Remote AI driven workflows & 5G

Harnessing AI’s potential can also serve as a powerful tool to address India’s socioeconomic disparities. AI-driven solutions can optimize resource allocation, improve public service delivery, reduce cost of access and create job opportunities across urban and rural areas. With massive 5G rollout, the possibility of digital global work aided by AI is here. It can dramatically bring income opportunity to rural and smaller cities, if we can bring in Indic language AI tools, which lower the bar for participating in the global workflows. By proactively leveraging AI to bridge gaps and enhance productivity, India’s leadership can demonstrate a commitment to inclusive growth and lay the foundation for a more equitable society. All the while reducing strains of growing urbanization, which might be disastrous for its overburdened large cities.

IV. The Gameplan for AI Leadership: Missing piece of compute clusters

DEPA Training will safely and responsibly unlock the collaboration between Indian Training Dataset Providers and Modelers. We have the talent already and the market scale to do Reinforcement Learning with Humans in the Loop. What we lack is tensor-scale computing enabled for industry, startups, academia and Government itself. The Government of India must address this by enabling the creation of many, not one, tensor-scale GPU cloud providers. There are many ways to do this: Challenge Grants, Viability-gap funding for cloud providers, and Matching Grants for Modelers. We favor the Matching Grants method for effectiveness, transparency, and competition. In addition, we must seek to create an AI-on-the-edge compute ecosystem for a strategic future.

V. Collaborative Diplomacy and Global Alliances:

AI does not recognize national borders, and collaboration is key to advancing the field. At the same time, we must recognize that Nvidia H100 boards are already on the US Export Control List for China. The US might leverage its muscle further at some time in the future. We must therefore have a strategic perspective in making our aggregate AI capability and datasets available to others based on a principle of reciprocity. We must build careful alliances with a broad set of players in US, EU and Asia that will accelerate India’s AI capabilities but also position the nation as a global AI thought leader.

VI. The Consequences of Inaction:

The consequences of neglecting AI’s potential are dire. India risks becoming a mere consumer of AI technologies, ceding economic leadership to countries that have embraced AI as a strategic priority. China, our neighbor, has famously vowed to be the sole AI superpower by 2030. This passivity could lead to missed opportunities, economic stagnation, and a loss of global influence. It may even result in India failing to breach the top 3 economies, as we might have to buy both oil and artificial brains, draining our resources for welfare schemes for our large population. That could risk demographic disaster instead of demographic dividend.

Conclusion: We need to act now!

In the rapidly evolving landscape of the AI economy, the choices made today will reverberate for generations. As custodians of India’s future, we must recognize the urgency of embracing AI as a lynchpin of economic growth. The time to act is now! We must catalyze innovation, ensure global competitiveness, and create a prosperous future where India’s leadership is defined not by its past but by its capacity to shape the AI-powered future world decisively.

Sharad Sharma is a co-founder of iSPIRT Foundation. Umakant Soni is the Chairman of AI Foundry and General Partner at ART Venture Fund.

Ready for India’s AI ambitions: We are now one step closer to having a modern regulation for and of AI

The passage of the Digital Personal Data Protection Bill 2023 (DPDP) by the Lok Sabha is significant in more ways than one. The Bill aims to enforce and promote lawful usage of digital personal data and stipulates how organisations and individuals should navigate privacy rights and handle personal data.

Creating effective mechanisms to enable data governance has become one of the top priorities for countries around the world. The challenge for policymakers is designing legal and regulatory frameworks that clearly lay down the rights of data principals and obligations for data fiduciaries.

The Digital Personal Data Protection Bill is a much-needed step in this direction, taken after months of deliberations and discussions. Such normative frameworks are critical to secure regulatory certainty for enterprises. However, innovative technical measures are required to support their operationalisation.

In the past couple of years, India has made significant strides in adopting a techno-legal approach to data governance. Through this approach, India is building technical infrastructure for authorising access to datasets that embed privacy and security principles in its design.

Data also lies at the heart of AI innovations that can address significant global challenges. India’s unique techno-legal approach to data governance is applicable across the life cycle of machine learning systems. It complements the country’s ambition of supporting its growing AI start-up ecosystem while providing privacy guarantees.

As part of India Stack, the launch of the Data Empowerment and Protection Architecture (DEPA) in 2017 was India’s paradigm-defining moment for the inference cycle of the machine learning life cycle. It proposed the setting up of Consent Managers (CMs), also known as Account Aggregators in the financial sector.

This approach, also mentioned in the current iteration of the DPDP (Chapter 2, [Sections 7-9]), ensures individuals can exercise control over their data and can provide revocable, granular, auditable, and secure consent for every piece of data using standard Application Programming Interfaces (APIs). The secure consent artefact records an individual’s consent for the stated purpose. It allows users to transfer their data from the businesses that hold it to those that need it to provide certain services, while ensuring purpose limitation. For instance, individuals can share the financial data residing with their banks with potential loan service providers to get the best loan package.
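To make the idea concrete, a consent artefact can be thought of as a structured record of who consented to share what, with whom, for which purpose, and until when. The sketch below is purely illustrative; the field and function names are hypothetical and do not follow the official DEPA/ReBIT schema:

```python
from datetime import datetime, timedelta, timezone
import uuid

def build_consent_artefact(data_principal, provider, consumer, purpose, days_valid=30):
    """Illustrative consent artefact: a revocable, auditable, purpose-limited
    record of a data-sharing grant. Field names are hypothetical, not the
    official DEPA/ReBIT specification."""
    now = datetime.now(timezone.utc)
    return {
        "artefact_id": str(uuid.uuid4()),     # unique, auditable identifier
        "data_principal": data_principal,     # whose data is being shared
        "data_provider": provider,            # e.g. the bank holding the data
        "data_consumer": consumer,            # e.g. the loan service provider
        "purpose": purpose,                   # purpose limitation
        "granted_at": now.isoformat(),
        "expires_at": (now + timedelta(days=days_valid)).isoformat(),
        "revocable": True,                    # consent can be withdrawn at any time
    }

# e.g. a user sharing bank data with a lender for loan underwriting
artefact = build_consent_artefact("user-123", "BankA", "LenderB", "loan-underwriting")
```

In a real deployment such an artefact would be digitally signed and exchanged over the standard APIs mentioned above, so that every access can be verified against the recorded purpose and validity window.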

DEPA is India’s attempt at securing a consent-based data-sharing framework. It has facilitated the financial inclusion of millions of its citizens. Eight of India’s largest banks were early adopters of the framework, starting in 2021. Currently, 415 entities, including CMs, Financial Information Providers, and Financial Information Users, participate across various stages of DEPA implementation.

However, the training cycle of an AI model demands substantially more data than the inference cycle if the model is to make accurate predictions. As such, there is a need for more such robust technical solutions that break open data silos and connect data providers with model developers while providing privacy and security guarantees to the individuals who are the real owners of the data.

With DEPA 2.0, India is already experimenting with a solution inspired by confidential computing: Confidential Clean Rooms, or CCRs. CCRs are hardware-protected secure computing environments where sensitive data can be accessed in an algorithmically controlled manner for model training.

These algorithms create an environment in which data can be used while ensuring that privacy and security guarantees for citizens are upheld and that data does not change hands. Techniques like differential privacy introduce controlled noise or randomness into the training process, making it harder to identify individuals or extract sensitive information.
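A minimal sketch of how noise enters training, in the style of DP-SGD (per-example gradient clipping followed by calibrated Gaussian noise); the parameter values are illustrative, not DEPA-prescribed:

```python
import numpy as np

def clip_and_noise(grad, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """Privatize one per-example gradient: bound any individual's influence
    by clipping the gradient's L2 norm, then add Gaussian noise scaled to
    that bound (the DP-SGD recipe, sketched with illustrative defaults)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    norm = np.linalg.norm(grad)
    clipped = grad * min(1.0, clip_norm / max(norm, 1e-12))  # cap one person's contribution
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=grad.shape)
    return clipped + noise

# A gradient with L2 norm 50 is scaled down to norm 1 before noise is added,
# so no single training example can dominate the model update.
g = np.array([30.0, 40.0])
private_g = clip_and_noise(g)
```

Because every example's contribution is bounded and masked by noise, an attacker inspecting the trained model gains only a provably limited amount of information about any one individual.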

To make CCRs work, model certifications and e-contracts are essential. A model sent to a CCR for training has to be certified to ensure it upholds privacy and security guidelines, and e-contracts are required to facilitate authorised and auditable access to datasets. For example, loan providers can authorise model developers to access a representative sample of the datasets residing with them via a CCR for model training. This arrangement is facilitated via e-contracts once the CCR verifies the validity of the model certification provided by the modeller.
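The admission logic described here can be sketched as a simple gatekeeping check; the function and field names below are hypothetical, not the actual DEPA 2.0 interfaces:

```python
def ccr_admit(certificate: dict, contract: dict, trusted_certifiers: set) -> bool:
    """Admit a model into the clean room only if (a) its certification was
    issued by a trusted body and is still valid, and (b) a signed e-contract
    authorises that exact model to access the requested dataset.
    All names here are illustrative sketches, not real DEPA APIs."""
    cert_ok = (certificate["issuer"] in trusted_certifiers
               and certificate["status"] == "valid")
    contract_ok = (contract["signed"]
                   and contract["model_id"] == certificate["model_id"]
                   and contract["dataset_id"] == certificate["dataset_requested"])
    return cert_ok and contract_ok

# e.g. a certified model requesting a lender's representative loan dataset
cert = {"issuer": "sro-certifier", "status": "valid",
        "model_id": "m1", "dataset_requested": "loans-sample"}
contract = {"signed": True, "model_id": "m1", "dataset_id": "loans-sample"}
admitted = ccr_admit(cert, contract, {"sro-certifier"})
```

The point of the two-part check is separation of duties: certification attests to the model's behaviour, while the e-contract records the data provider's authorisation, and the CCR enforces both before any data is touched.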

India’s significant progress with technical measures that are aligned with domestic legal frameworks gives it a head start in the AI innovation landscape. Countries across the globe are struggling to find solutions for facilitating personal data sharing for model development that prioritise security and privacy. Multiple lawsuits have recently been filed against OpenAI across numerous jurisdictions for allegedly using personal data unlawfully to train its models.

India’s unique approach to data governance, where technical and legal frameworks fit together like puzzle pieces and walk the thin line between promoting AI innovation and providing privacy guarantees, is well positioned to guide global approaches to data governance.

In a quiet and disciplined fashion, over the last six years, India has put the critical techno-legal pieces in place to become a significant AI player alongside the US and China. Like them, we have continental-scale data and the talent to shape our future. With the passage of the DPDP Bill, we are now one step closer to having modern regulatory tools for the effective regulation of AI and regulation for AI.

Co-Authored by Antara Vats and Sharad Sharma
A version of this was published on Financial Express, August 9th, 2023.