DEPA-Training: Tech Updates

We’ve rolled out some exciting updates for DEPA‑Training, making it easier to rapidly prototype and run diverse training scenarios — complete with electronic contracts, confidential clean rooms, privacy preservation and configurable training SDKs.


✨ What’s new

👉 GUI for end-to-end execution

👉 Step-by-step guide to create and run your own training scenarios

👉 New scenarios introduced for complex multi-party training: MRI brain tumor segmentation, credit default risk prediction


Before we dive in, let’s quickly recall what the Data Empowerment and Protection Architecture (DEPA) really is.

What is DEPA and why does it matter?

India Stack is evolving at population scale, enabling the flow of people (Aadhaar, eKYC, DigiLocker, DigiYatra, etc), money (UPI, OCEN), and information (DEPA and Account Aggregator) through Digital Public Infrastructure (DPI). DEPA is critical in this third layer as it enables the responsible flow of data between individuals and organisations for more complex tasks such as AI model training, AI inference and analytics. 

As the name suggests, DEPA rests on two key elements. The first is protection, founded on the bedrock of privacy, consent, accountability and purpose limitation of data. The second is empowerment, democratizing data access and enabling the ecosystem to responsibly innovate with it — for training AI models, personalizing products and services, advancing scientific research, and much more.

In light of emerging data protection laws such as the DPDP, GDPR, and others, there is a need for a framework that enables the responsible use of data — unlocking its value while ensuring regulatory compliance and serving the broader public interest.

Ultimately, DEPA solves for two core challenges at the heart of data sharing — Trust and Flow — keeping the rest open and flexible for innovation.

What is DEPA‑Training?

The vision behind DEPA for Training (aka DEPA‑Training) is simple: for India to be not only a consumer of AI, but also a producer of it, in a responsible and democratized manner.

AI’s first big leap came from public data. That well is running dry. Our belief is that for the next wave of AI innovation — smarter AI for healthcare, personalized finance, scientific discovery and more — proprietary data will be crucial. But today, that data is fragmented, locked in silos, and difficult to use — often running into challenges around privacy, compliance, and regulatory constraints.

Enter DEPA-Training — a techno-legal Digital Public Infrastructure (DPI) designed to enable secure, agile, and scalable AI model training on sensitive data. It does so by assembling a set of frontier technological primitives:

  • Confidential Clean Rooms (CCRs): Isolated compute environments that can cryptographically attest to their integrity, where data can be processed securely without external exposure.
  • Electronic Contracts: Code-enforced legal agreements between transacting parties that give data providers control over how their data is used, e.g. through purpose limitation, privacy safeguards and monetization.
  • Secure Training Sandbox: Modular and configurable sandboxes and SDKs for building privacy-preserving and compliant training pipelines across diverse model architectures and data types.
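To make the electronic-contract primitive concrete, here is a minimal sketch of a machine-readable contract with a code-enforced purpose-limitation and privacy-budget check. The field names and structure are illustrative assumptions, not the actual DEPA contract schema:

```python
# Illustrative contract; field names are hypothetical, not the DEPA schema.
contract = {
    "parties": {"provider": "hospital-a", "consumer": "model-builder-x"},
    "dataset": "mri-scans-2024",
    "permitted_purpose": "tumor-segmentation-training",
    "privacy": {"mechanism": "differential-privacy", "epsilon_budget": 3.0},
    "expiry": "2026-12-31",
}

def authorize(contract: dict, requested_purpose: str, epsilon_spent: float) -> bool:
    """Code-enforced purpose limitation and privacy-budget check."""
    return (
        requested_purpose == contract["permitted_purpose"]
        and epsilon_spent <= contract["privacy"]["epsilon_budget"]
    )

print(authorize(contract, "tumor-segmentation-training", 1.0))  # permitted
print(authorize(contract, "marketing-analytics", 1.0))          # refused
```

Because the agreement is structured data rather than a PDF, the clean room can evaluate checks like this automatically before any computation is allowed to run.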

What’s new in DEPA-Training?

Graphical user interface

We’ve introduced an interactive GUI that enables users to explore, configure, and execute DEPA-Training scenarios end to end. The application automatically discovers available scenarios in the repository and provides an intuitive interface to run them — eliminating the need for command-line interaction. A similar GUI workflow is also provided for contract signing.

Scenarios you can try out today

To bring DEPA-Training to life, we showcase a diverse set of scenarios that demonstrate what’s possible in practice. These examples illustrate pathways toward solving larger global challenges and span multiple data modalities (e.g., tabular, images), model paradigms (e.g., classical ML, MLPs, CNNs), and prediction tasks (e.g., regression, classification, image segmentation).

Disease Surveillance Modeling

Pandemics don’t wait. Timely, accurate data can save millions of lives. Yet most infection data is scattered, siloed, and too sensitive to share. With differential privacy, institutions can securely pool data to track virus spread, map risk patterns, and test interventions — powering real-time, data-driven epidemic response.

Example: COVID-19 scenario

Medical Image Modeling

From cancer to cardiovascular disease, from neurology to rare disorders — modern medicine increasingly depends on imaging. Yet medical images are among the hardest datasets to share, trapped in hospital silos and governed by strict privacy laws. DEPA makes it possible to combine imaging data across borders and institutions, unlocking AI models that are more accurate, generalizable, and equitable. This accelerates breakthroughs in diagnostics, improves treatment planning, and addresses one of healthcare’s biggest global challenges: scaling precision medicine while safeguarding patient trust.

Example: BraTS scenario 

Financial Credit Risk Modeling

Access to fair credit fuels economic growth, but risk assessment is often limited by partial data. By safely combining insights across financial institutions, DEPA enables more accurate credit scoring, reduces defaults, and strengthens financial stability — empowering individuals and businesses alike with better access to capital.

Example: Credit Risk scenario

Build your own Scenarios

A new step-by-step guide walks you through building and running your own DEPA-Training scenarios — making it easy to rapidly prototype and iterate with training use-cases of your own.

Currently, DEPA-Training supports the following training frameworks, libraries and file formats (more will be included soon):

  • Frameworks: PyTorch, Scikit‑Learn, XGBoost (LLM Finetuning to be added soon!)
  • Libraries: Opacus, PySpark, Pandas (HuggingFace support coming soon!)
  • Formats: ONNX, Safetensors, Parquet, CSV, HDF5, PNG (No pickle-based formats for security reasons)
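The exclusion of pickle-based formats is worth illustrating: unpickling can execute arbitrary code, which is unacceptable inside a confidential clean room. A minimal demonstration (the `Malicious` class is a contrived example):

```python
import pickle

class Malicious:
    def __reduce__(self):
        # pickle calls __reduce__ to decide how to reconstruct the object;
        # returning (callable, args) makes loading execute that callable.
        return (print, ("arbitrary code ran during deserialization",))

payload = pickle.dumps(Malicious())
pickle.loads(payload)  # loading the payload executes code and prints the message
```

Formats like Safetensors, ONNX, Parquet and CSV are pure data formats that are parsed without executing embedded code, which is why they are permitted.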

What’s in it for the ecosystem?

DEPA-Training democratizes responsible data sharing and model training for all!

  • Enterprises & Startups → Unlock the value of private data to build smarter products and services, while remaining compliant with data laws. Collaborate across organizations to create solutions that no single dataset could power.
  • Research Institutions → Pool data at scale to tackle grand challenges, drive scientific discovery, and advance knowledge for the public good.
  • Policy & Legal Experts → Shape the future of data governance by operationalizing privacy, consent, purpose limitation, and accountability in practice.
  • Builders & Researchers → Join us in co-creating this framework!

Get started

👉 Get your hands dirty: DEPA‑Training on GitHub 🛠️

👉 Explore the documentation: DEPA.World 📜
👉 Watch the Open Houses: YouTube Playlist 🎬

👉 Think big: What challenges has data privacy kept off-limits? What data has felt forever inaccessible? With DEPA-Training, those doors may finally open. 💡

Interested in contributing to DEPA? Join our group of no-greed no-glory volunteers! Apply here

Please note: The blog post is authored by our volunteers, Sarang Galada, Dr. Shyam Sundaram, Kapil Vaswani and Pavan Kumar Adukuri

A Budget that missed the opportunity to be bold on both Strategic Autonomy & Reform action

iSPIRT Foundation, a technology think-and-do tank, believes India’s hard problems can be solved only by leveraging public technology for private innovation. iSPIRT pioneered the concept of Digital Public Infrastructure (DPI).

The Budget starts by acknowledging that India is facing “an external environment in which trade and multilateralism are imperilled and access to resources and supply chains are disrupted”. But the details of the Budget don’t follow through on that idea. The Government also acknowledged AI and cutting-edge technologies as force multipliers for better governance.

AI is mentioned a few times in different places. However, there is no material proposal on AI, except as a tool for “Bharat-VISTAAR” — a multilingual AI tool in agriculture.

The FM announced manufacturing support for seven strategic and frontier sectors, including Bio-Pharma, Chemicals, Semiconductors, and Electronic Components. This will help the ecosystem build up in these sectors and, in a way, support the cause of “Product Nation” from a capacity- and infrastructure-building point of view. However, it does not address “strategic autonomy” and technological sovereignty as a thought process.

The one that most closely links to “Aatmanirbhar Bharat” or strategic autonomy is the announcement of ISM 2.0, to produce equipment and materials, design full-stack Indian IP, and fortify supply chains, including skilling and training. Also, the announcement of dedicated Rare Earth Corridors is a welcome move to fill supply-chain gaps in these areas, given the geopolitical situation.

Any Government announcement takes about two years to roll out in the field. The AI Mission, National Quantum Mission, Anusandhan National Research Fund, and the Research, Development and Innovation (RDI) Fund were mentioned by the FM in the speech; RDI is rolling out now. The government missed the opportunity to announce a “market access” scheme or fund for the products developed after taking all these steps in R&D and frontier technology advancement.

We have maintained that our Economic Policy will need to foreground Strategic Autonomy as a core pillar, which becomes all the more imperative in the current global geopolitical scenario. But Strategic Autonomy is not possible without technological sovereignty. While the government has taken steps to “reduce critical import dependencies,” at a time when “new technologies are transforming production systems”, incremental steps are not enough.

“A market access plan for Indian products designed and developed in India by resident Indian companies is the need of the hour for any fruitful outcome from R&D and product development. The Government must consider this with all seriousness in the future,” said Amit Agrahari, volunteer at iSPIRT Foundation.

Last year, Bharat Trade Net was announced as an integrated trading platform. This year’s announcement of the “Customs Integrated System (CIS)”, a single, integrated and scalable platform for all customs processes, together with the use of non-intrusive scanning with advanced imaging and AI technology for risk assessment, takes the thought to the next level. This is very much in line with our National Regulatory Compliance Grid (NRCG) approach and the use of advanced technology for data-driven governance.

However, our proposal of building an NRCG for all regulatory systems is still waiting. “Unless we use a Grid approach for digital transformation and connect all regulators, it is going to be difficult to reduce the regulatory cholesterol,” said Sudhir Singh, an iSPIRT Volunteer looking after Ease of Doing Business and Policy.

Linking TreDS with the GeM portal is a welcome step towards unlocking true Digital potential in Ease of Doing Business for MSMEs. “This can further create a grid approach by connecting to the Open Credit Enablement Network (OCEN) and trade finances for SME exporters,” said Tanuvi Thakur, volunteer at iSPIRT Foundation. This will further aid EoDB through quicker and cheaper access to credit by MSMEs. 

The other major welcome step in this regard has been the in-principle movement from penalty and prosecution to fees. This has also been our core decriminalisation aim for achieving EoDB.

Overall, despite the challenging geopolitical environment, it’s a subdued Budget rather than a bold one that acts on both “strategic autonomy” and “reforms”.

About iSPIRT Foundation – We are a non-profit think-and-do tank that builds public goods for Indian product startups to thrive and grow. iSPIRT aims to do what DARPA or Stanford University did in Silicon Valley for startups. iSPIRT builds four types of public goods – technology building blocks (aka India Stack), startup-friendly policies, market access programs like M&A Connect, and Playbooks that codify scarce tacit knowledge for product entrepreneurs of India. For more, visit www.ispirt.in.

For further queries, please reach out via email:  [email protected], [email protected] 

Please note: The blog post is authored by our volunteers, Sudhir Singh, Tanuvi Thakur and Amit Agrahari

Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-2

This is a two-part blog series. The following is the second part.

In Part 1, we traced how data collaborations are being reimagined and laid out the conceptual foundations, from redefining consent through the Account Aggregator framework to recognizing the limits of consent. We explored how privacy-preserving frameworks like differential privacy protect individuals even when models are built from their data; how electronic contracts replace slow, manual agreements with enforceable digital rules; and how confidential clean rooms combine secure hardware and privacy guarantees to enable computation without revealing raw data.

In Part 2, we explore how these building blocks come together in practice.

The Connective Tissue: Data Collabs

Technology alone cannot guarantee privacy, fairness, or effective collaboration. Data-sharing ecosystems need institutional scaffolding — entities that can operationalize trust, manage relationships, and abstract away complexity for participants.

This is where Data Collaboratives (or Data Collabs for short) come in.

A Data Collab isn’t a regulator or a government body. Rather, it is a facilitator organization — a neutral yet entrepreneurial entity that enables, orchestrates, and sustains data collaborations using the DEPA Framework behind the scenes, following the standards and processes set by trusted bodies like a Self-Regulatory Organization (SRO) and a Technology Standards Organization (TSO).

You can think of a Data Collab as the connective tissue of a data ecosystem — linking data providers, data consumers, and service providers.

In practice, a Data Collab:

  1. Provides tools and interfaces for participants to register, onboard, sign electronic contracts, and set up secure collaboration environments such as CCRs.
  2. Signs agreements with data providers to clean, prepare, and catalogue datasets so that they can be safely shared with authorized data consumers.
  3. Manages the flow of value — usually collecting payments from data consumers and distributing them fairly to data providers, while covering operational costs.
  4. Assumes accountability for ensuring that all interactions, permissions, and computations are compliant with the DEPA rules and contractual terms.
  5. Adds value beyond infrastructure — offering domain expertise, workflow design, governance and audit support — streamlining data collaborations.

Data Collabs will likely take different forms depending on the domain they serve. For example, some might focus on oncology research, others on financial fraud detection or climate-risk modeling. Each field has its own kinds of data, privacy rules, and ways of working — so it is natural for Data Collabs to specialize.

Because running these collaborations requires significant operational and technical effort, most Data Collabs will probably be for-profit enterprises. At the same time, because they operate on open, interoperable digital public infrastructure like DEPA, they are not monopolistic platforms. Instead, they enable a competitive marketplace where multiple Data Collabs can coexist, offering participants better choices, fairer pricing, and higher-quality services.

In this way, Data Collabs create a persistent institutional layer for responsible data use, enabling long-term, multi-party cooperation that would be impractical to coordinate through ad hoc agreements.

A real-world example: Accelerating Drug Discovery

Imagine three pharmaceutical companies, each developing treatments for the same rare disease. Each has conducted clinical trials with a few hundred patients — but individually, none has enough data in quantity, diversity, or parameter richness to train a robust predictive model of treatment response. 

Much like pieces of a puzzle, valuable insights often emerge only when data from different sources fit together — yet no single party should hold or see the entire picture.

If these companies could combine their datasets, and enrich them with other sources like gene expression profiles, cell imaging results, or public molecular databases, they could uncover deeper patterns and dramatically speed up drug discovery.

But three major barriers stand in their way:

  1. Competitive concerns: Each company treats its clinical data as proprietary and doesn’t want to reveal it to others.
  2. Privacy regulations: Patients gave consent only to the company that ran their trial — not to share data across firms.
  3. Practical limits: Many patients can’t be re-contacted to renew consent, making manual legal processes infeasible.

This is where the DEPA Framework fits in. Here’s how it would work:

A Data Collab is formed for long-term drug discovery collaborations. It signs electronic contracts with each company, defining rights, responsibilities, and permitted use of data. It handles registration, onboarding, and compliance checks through standardized interfaces.

Electronic contracts set out the exact terms of collaboration — specifying each party’s role, the artefacts they contribute, and the rules that govern privacy, usage, and value-sharing.

Each company uploads its encrypted trial data or model into a Confidential Clean Room. Data inside the CCR is decrypted only after checks confirm that all security and compliance conditions are met.

Data is programmatically joined and enriched within the CCR, followed by AI model training using privacy-enhancing techniques like differential privacy, which appropriately bound the chance of re-identifying patients.

Only the final trained model and its accompanying logs — never the underlying data — leave the CCR. The model can be decrypted solely by the authorized data consumer(s) (i.e. the modellers), protecting their trade secrets.

Auditors can review logs and trace the provenance of all artefacts at any time — via the DEPA AI Chain — to verify compliance and resolve disputes.

This framework delivers several benefits for all concerned stakeholders:

  • For society: Promising treatments reach patients faster, while a reusable governance and technology blueprint emerges for future biomedical collaborations. 
  • For the economy: A new data-driven economy is unlocked, enabling novel business interactions and boosting meaningful economic activity.
  • For companies: They can innovate together without exposing trade secrets or breaking regulatory rules, expanding what’s possible in research and development.
  • For regulators and auditors: Every transaction leaves a verifiable trail, simplifying oversight and boosting trust in the ecosystem.

Summing up

India’s journey toward responsible data use has been progressive and layered.

  • It began with the Account Aggregator framework — making consent Open, Revocable, Granular, Auditable, Notifying and Secure (ORGANS principle).
  • For model training and analytics, Privacy-Enhancing Technologies (PETs) — such as Differential Privacy — introduce mechanisms like the privacy budget to safeguard individuals while enabling learning.
  • To make collaboration faster and more reliable, Electronic Contracts replace traditional paper/PDF agreements with machine-readable, enforceable commitments — cutting through the friction of slow legal processes.
  • Confidential Clean Rooms (CCRs) operationalize these safeguards — enabling computation on sensitive data.
  • Finally, Data Collaboratives weave all these elements together — creating institutional and economic frameworks that make responsible, long-term data collaboration practical and sustainable.

This is the next frontier of Digital Public Infrastructure for AI — proving that protection and innovation are not opposites. With the right frameworks, we can have both.

Read Part 1: Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-1

Please note: The blog post is authored by our volunteers, Hari Subramanian and Sarang Galada

For more information, please visit: https://depa.world/

Privacy in the Age of AI: New Frameworks for Data Collaboration-Part-1

This is a two-part blog series. The following is the first part.

Every day, we generate vast amounts of digital data — withdrawing cash, visiting doctors, ordering groceries, using various mobile apps. These data trails have the potential to streamline services, personalize experiences, and drive breakthroughs in fields from medicine to finance. Yet they also carry risks: unfair profiling, intrusive targeting, and exposure of sensitive personal information.

This presents a fundamental challenge: How can we harness the value of data while preserving individual privacy?

Understanding Privacy

In the age of AI, privacy violations no longer just expose personal information. They erode autonomy and tilt power toward those who control data and algorithms. As AI systems harvest behavioral cues, digital footprints, and social networks, people lose control, not just over their information, but also over how they are profiled and influenced. This enables subtle yet pervasive forms of coercion, from tailored manipulation of choices to algorithmic exclusion from opportunities.

At scale, such surveillance dynamics erode trust and weaken democratic agency. In this era, privacy is not merely about secrecy, it is a precondition for freedom, dignity and meaningful participation in society.

Privacy is often mistaken for confidentiality, but it’s not simply about hiding information. Privacy is the property of not being able to identify individuals from the signals they produce. Confidentiality, on the other hand, is about limiting access to those signals in the first place. To protect privacy and confidentiality while respecting individual autonomy, we need strong control mechanisms that let people decide what data is shared, with whom, for what purpose, and for how long.

And privacy isn’t a one-time setting. Data moves through a lifecycle — it is collected, used, stored, reused, and eventually deleted. These protections must hold at every stage, or they are lost.

The Mechanics of Consent

Today, consent remains the most common mechanism for privacy — the basic control primitive intended to let people decide how their data is collected, shared, and used. The concept of consent actually predates the digital era — it began in a paper-based world, where signatures and written permissions served as the primary means of authorizing data use. 

It is important to distinguish between two kinds of consent:

  1. Consent to collect data – allowing an entity to initially gather your data (for example, an app accessing your camera).
  2. Consent to share data – granting permission for that data to be used or passed on for a specific purpose (for example, a bank sharing your salary details with a loan underwriter).

Our focus in this article is on consent to share data, since that is where both the greatest privacy challenges and the most meaningful opportunities for value creation lie.

Here is the problem with how consent is implemented today. Under frameworks like GDPR, consent is defined as a coarse-grained and blunt artifact. The same entity collects your data, gathers your consent, and enforces the rules around its use. For individuals, this typically means an all-or-nothing choice — share everything or nothing at all. And for innovators, it stifles the ability to responsibly explore new uses of data.

India’s Innovation: Unbundling Consent

When India designed its Account Aggregator system for financial data sharing, it chose a different path. Consent to share data was unbundled into two parts:

  • Collect consent: Managed by trusted intermediaries called Account Aggregators.
  • Enforce consent: Managed downstream by Financial Information Users (like banks or wealth advisors), under ecosystem oversight.

https://sahamati.org.in/what-is-account-aggregator/

At the heart of this design lies a set of principles that make consent Open, Revocable, Granular, Auditable, Notifying, and Secure or ORGANS for short.

The Account Aggregator (AA) framework became the first manifestation of DEPA — the Data Empowerment and Protection Architecture. It is now India’s go-to model for user-consented data sharing between institutions, especially for straightforward data transfers and simple inference tasks.

Consent works well for inferences — one-time decisions like a bank checking your last six months of transactions to approve a loan. Yet, in practice, consent has well-known limits. People are asked to grant permission repeatedly, often through long, opaque terms they don’t fully understand, leading to consent fatigue and a loss of meaningful control.

These limitations become clearer when we move from individual decisions to model training and large-scale analytics, where algorithms learn patterns from millions of records. Seeking or managing consent at that scale is neither practical nor effective. 

What’s worse is that models can sometimes memorize sensitive data and inadvertently reveal it later. This highlights the need for new, complementary control primitives that uphold privacy and accountability even when explicit consent isn’t feasible.

Attempts at de-identification (removing or masking identifiers to anonymize data) have significant limitations in practice. Although anonymization is meant to ensure that individuals cannot be re-identified, de-identification techniques are often reversible when datasets are combined with external information. As a result, such approaches offer only weak privacy guarantees, and numerous cases have shown how easily supposedly “anonymous” data can be linked back to individuals.
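A toy illustration of such a linkage attack, in the spirit of well-known re-identification studies: an “anonymized” dataset is joined with a public one on quasi-identifiers. All records here are fabricated for demonstration:

```python
# "Anonymized" medical records: names removed, but quasi-identifiers remain.
medical = [
    {"zip": "560001", "birth_year": 1985, "sex": "F", "diagnosis": "diabetes"},
    {"zip": "560002", "birth_year": 1990, "sex": "M", "diagnosis": "asthma"},
]
# A public list (e.g. a voter roll) containing the same quasi-identifiers.
voters = [
    {"name": "Asha", "zip": "560001", "birth_year": 1985, "sex": "F"},
    {"name": "Ravi", "zip": "560002", "birth_year": 1990, "sex": "M"},
]

def link(medical, voters):
    """Re-identify 'anonymous' records by joining on (zip, birth_year, sex)."""
    key = lambda r: (r["zip"], r["birth_year"], r["sex"])
    by_key = {key(v): v["name"] for v in voters}
    return {by_key[key(m)]: m["diagnosis"] for m in medical if key(m) in by_key}

print(link(medical, voters))  # → {'Asha': 'diabetes', 'Ravi': 'asthma'}
```

Even with names stripped, a handful of innocuous attributes can uniquely identify individuals once datasets are combined, which is why stronger guarantees like differential privacy are needed.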

Privacy-preserving Algorithms: A New Control Primitive for Training and Analytics

To address these limits, a new class of algorithms has emerged under the broad umbrella of Privacy-Enhancing Technologies (PETs). Let us call these privacy-preserving algorithms, to differentiate them from other classes of PETs. They provide a spectrum of technical safeguards that preserve privacy while still enabling useful computation and collaboration on sensitive data.

Among these, Differential Privacy (DP), a mathematical framework for preserving individual privacy in datasets, stands out as a powerful privacy primitive for model training and data analysis.

The key idea: DP adds carefully calibrated noise to queries or model updates so that the results are statistically indistinguishable whether or not any single individual’s data is included. This ensures that nothing specific about an individual can be reliably inferred.

To make this guarantee rigorous, DP introduces the concept of a privacy budget (often represented by the parameters epsilon ε and delta δ):

  • Each query or training step “spends” some of this budget.
  • With more queries or training epochs, the cumulative privacy loss increases.
  • Once the budget is exhausted, no further queries or training is allowed, keeping the risk of re-identification mathematically bounded.

Think of this as a quantitative accounting system for privacy loss. Note, however, that DP comes with a utility tradeoff: adding calibrated noise can reduce model accuracy or data usefulness. Hence, depending on the use-case, the right privacy controls may be achieved through other privacy-preserving algorithms, or a combination thereof.
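A minimal sketch of such an accountant, assuming a count query protected with the Laplace mechanism (this is purely illustrative, not the DEPA training SDK; the noise is sampled as the difference of two exponentials, which is Laplace-distributed):

```python
import random

class PrivateDataset:
    """Toy Laplace-mechanism query interface with a simple epsilon accountant."""

    def __init__(self, data, total_epsilon):
        self.data = data
        self.remaining = total_epsilon

    def noisy_count(self, predicate, epsilon):
        if epsilon > self.remaining:
            raise RuntimeError("privacy budget exhausted")
        self.remaining -= epsilon  # each query spends part of the budget
        true_count = sum(1 for row in self.data if predicate(row))
        # Laplace(scale = 1/epsilon) noise; a count query has sensitivity 1.
        noise = random.expovariate(epsilon) - random.expovariate(epsilon)
        return true_count + noise

ds = PrivateDataset([{"infected": True}] * 40 + [{"infected": False}] * 60,
                    total_epsilon=1.0)
print(ds.noisy_count(lambda r: r["infected"], epsilon=0.5))  # noisy answer near 40
```

Once `remaining` hits zero, further queries are refused, which is exactly the "budget exhausted" behaviour described above.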

Electronic Contracts: Digitizing Trust

While privacy-preserving computation enables data to be used securely, participants still need clear agreements defining who may use it, for what purpose, or under what conditions. For such collaborations to function effectively, there must be a well-defined and enforceable contractual framework that specifies each party’s rights, obligations, and permissions.

The need for such a framework becomes even more pressing as organizations seek to unlock real value from data. No single dataset is enough; the most meaningful insights arise when information from multiple sources — hospitals, banks, labs, startups, or agencies — can be combined and analyzed responsibly. Yet each participant brings its own rules, contracts, and compliance obligations, creating a patchwork of agreements that are difficult to align.

Traditionally, contracts are legal documents — PDFs or paper agreements — written in human language, interpreted by lawyers, and enforced by institutions. They work well when a few parties are involved, but in modern data collaborations, this model quickly breaks down.

Today, every new collaboration means drafting, signing, and managing a maze of separate legal agreements, often in different formats, scattered across systems, and maintained by hand. With every participant added, the web of contracts grows bulkier, making coordination slow, expensive and error-prone. Every change or dispute requires human intervention and can take weeks or months to resolve.

This contractual friction has long been the viscous drag holding back scalable, compliant data collaboration. Not because trust is missing, but because it is buried under paperwork.

Electronic contracts transform this equation. They are machine-readable, digitally signed, and executable agreements that translate legal promises into enforceable code. Instead of being static documents, they are active digital objects that the DEPA orchestration layer can interpret and act upon — automatically initiating workflows, enforcing permissions, and ensuring compliance.

In effect, electronic contracts bridge law and computation.  They enable trust, automation, and accountability at digital speed, replacing manual paperwork with a system that can verify, execute, and audit commitments in real time.
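The "digitally signed" part can be sketched with a shared-secret HMAC over a canonicalized contract. Real deployments would use per-party asymmetric signatures and certificate chains; this sketch only shows the mechanics of detecting tampering:

```python
import hashlib
import hmac
import json

def sign_contract(contract: dict, key: bytes) -> str:
    # Canonicalize so all parties hash exactly the same bytes.
    canonical = json.dumps(contract, sort_keys=True).encode()
    return hmac.new(key, canonical, hashlib.sha256).hexdigest()

def verify_contract(contract: dict, key: bytes, signature: str) -> bool:
    return hmac.compare_digest(sign_contract(contract, key), signature)

contract = {"purpose": "credit-risk-training", "expiry": "2026-01-01"}
sig = sign_contract(contract, b"shared-secret")
print(verify_contract(contract, b"shared-secret", sig))   # signature holds
tampered = {**contract, "purpose": "resale-of-raw-data"}
print(verify_contract(tampered, b"shared-secret", sig))   # tampering detected
```

Any change to the contract's terms after signing invalidates the signature, which is what lets an orchestration layer act on the document automatically instead of relying on manual review.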

Confidential Clean Rooms (CCR)

To operationalize the above elements, we need infrastructure that embeds privacy and compliance mechanisms by design, while also supporting diverse collaboration modalities — from data analytics and model training to various forms of inference.

That’s where Confidential Clean Rooms (CCRs) come in. A CCR is a secure computing environment that allows organizations to collaborate on data without ever sharing it in plain form. You can think of it as a locked, monitored laboratory where data from multiple parties can be brought together for analysis — yet no participant, not even the operator of the lab, can peek inside.

At the heart of every CCR is Confidential Computing — a technology that uses Trusted Execution Environments (TEEs) built into modern processors.  When data enters a TEE, it is encrypted and isolated from the rest of the system, ensuring that even cloud providers or system administrators cannot access it. Computations run inside this protected enclave, and only verified results can leave. Each TEE also produces a cryptographic attestation, a proof that the computation was executed correctly and under the agreed conditions.
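Conceptually, verifying an attestation reduces to checking a measurement of the code against an expected value agreed in advance. A highly simplified sketch (real TEE quotes are signed by hardware keys and verified against vendor certificate chains; `EXPECTED_MEASUREMENT` is an illustrative stand-in):

```python
import hashlib

# Hash of the approved training pipeline; in practice this would be the
# enclave image measurement agreed upon in the electronic contract.
EXPECTED_MEASUREMENT = hashlib.sha256(b"approved_training_pipeline_v1").hexdigest()

def verify_attestation(report: dict) -> bool:
    # A real verifier also checks the hardware signature over the report;
    # here we check only that the attested code matches what was agreed.
    return report.get("measurement") == EXPECTED_MEASUREMENT

good = {"measurement": hashlib.sha256(b"approved_training_pipeline_v1").hexdigest()}
bad = {"measurement": hashlib.sha256(b"modified_pipeline").hexdigest()}
print(verify_attestation(good), verify_attestation(bad))
```

Data providers release their decryption keys only after a check like this succeeds, so data is never exposed to unapproved code.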

https://depa.world/training/architecture

On their own, CCRs provide secure execution. But when combined with two other DEPA primitives:

  1. Electronic Contracts, which specify who can use what data for what purpose, and
  2. Privacy-preserving algorithms, which provide mathematical controls on what information can or cannot leak,

they form a complete privacy-preserving data-sharing stack.

In essence, Confidential Clean Rooms (CCRs) enable confidential, techno-legal, and privacy-preserving computation on data. They make it possible to conduct large-scale data inference, analytics and modelling responsibly, without transferring raw data to any third party, thereby eliminating the need for consent specifically for data sharing.

But technology alone doesn’t build ecosystems. Who brings this framework to life, abstracting away its complexity for everyday organizations? How might it help us confront our most urgent global challenges — in health, climate and finance? And how could it unlock entirely new kinds of enterprises, fueling a vibrant and responsible data economy for the Intelligence Age?

Data Collabs!

Read Part 2: Privacy in the Age of AI: New Frameworks for Data Collaboration

Please note: The blog post is authored by our volunteers, Hari Subramanian and Sarang Galada

For more information, please visit: https://depa.world/

DEPA AI Chain: Empowerment Through Provenance

The DEPA AI Chain is central to operationalising data sharing for AI development and runtime use, while preserving privacy and maintaining verifiable provenance across the entire AI lifecycle — spanning dataset creation and licensing through training, release, inference, and content distribution. Risks and returns are managed through contracts and programmable controls; oversight is delivered via transparency logs and lightweight audits by a self-regulatory organisation (SRO), yielding an efficient and effective supervisory mechanism.

1.0 Unpacking Provenance

Provenance, in digital systems, refers to the systematic tracking of the origin of data and the complete history of the transformations and processes it undergoes throughout its lifecycle. It captures metadata about where the data came from, how it was created, and how it has been modified, combined, or interpreted over time.

Data provenance plays a critical role across a wide range of applications and scenarios. It is essential for ensuring the reproducibility of scientific experiments and computational workflows, enabling others to independently validate results. It supports fault diagnosis and fault tolerance by providing a traceable record that helps isolate and correct errors in complex systems. Provenance is also closely related to explainability (though distinct from it), as it clarifies how specific outcomes or decisions were derived, particularly in contexts such as AI and automated decision-making. In addition, provenance provides vital support for forensic investigations and auditing, where establishing the trustworthiness and integrity of data is crucial for compliance, accountability, and legal defensibility. By making the history of data transparent and verifiable, provenance serves as a foundational element of trustworthy digital systems.

In the context of personal data sharing, consent without provenance is an unauditable promise. There is a need to include a machine-readable trail linking consent or data protection compliance (the promise) to verifiable facts. 

The concept of provenance is increasingly critical in the context of modern AI systems, which are pervasive across numerous domains. In such systems — often characterised by Markovian or black-box behaviours — establishing clear causal relationships between inputs and outputs is inherently challenging. The opacity of many AI models, particularly deep learning models, makes it difficult to trace how specific outcomes arise, raising significant concerns around trust, accountability, and reproducibility.

Although parallel efforts exist under the banners of Explainable AI (XAI) and Trustworthy AI (TAI), provenance offers a complementary and, in many cases, more scalable and cost-effective approach to enhancing transparency. When thoughtfully designed and integrated into AI pipelines, provenance can provide a systematic, audit-friendly mechanism to capture the lineage and transformations of data and models, often with fewer assumptions than model-specific explainability techniques.

At its core, provenance in AI systems addresses concerns such as: (i) authenticity (of data and its origins), (ii) ownership, (iii) traceability, and (iv) (approximate) reproducibility. In contrast, frameworks such as TAI tend to emphasise aspects including (i) accuracy, (ii) fairness, (iii) explainability, and (iv) safety.

Yet, even with these clear distinctions, provenance is sometimes misframed in policy discussions. Treating any and all provenance artefacts as something that inevitably leads to identity disclosure is an error, one that conflates transparency with surveillance or identity tracking. As critics often put it in “Road to Perdition” terms, unfettered access to provenance data may indeed pose risks — but such access is not meant to be unfettered. It must come with safeguards, constrained by law and subject to due oversight. Framing the choice as either no provenance or dystopia ignores both context and the inevitability of provenance as part of the solution. Even references to Puttaswamy’s judgement, frequently invoked in this debate, are incomplete if not situated within its broader framework of proportionality and legitimate state aim. After all, without engaging with principles such as purpose limitation, retention bounds, or penalties for misuse, how else are systems meant to achieve reliability and harm reduction at scale? The answer lies not in abandoning provenance, but in advancing privacy-preserving provenance — mechanisms that preserve accountability and auditability without compromising individual rights.

1.1 Promise and Potential of AI Chain

The AI Chain is fundamentally a mechanism for capturing the lineage and transformations of data and models in a systematic, effective way, offering a complementary approach to XAI. The AI Chain promises to meet the following requirements:

  • Lineage: Lineage captures the complete journey of data and AI outputs—from consent and licensing, through training, to distribution—ensuring traceability, authenticity, and near-precise reproducibility of AI outcomes. It provides a granular record by assigning unique IDs to datasets and linking a Data Principal’s ID to their data and consent artefact, documenting how data is introduced, modified, combined, and interpreted. To preserve privacy, lineage can be applied to metadata rather than raw data. Cryptographic mechanisms such as hash chains and Merkle trees secure the integrity of the entire lineage.

  • Effective Verification and Its Impact on Liability Allocation: Verifiers can check provenance artefacts—including signatures, attestations, and log proofs—at scale. This may assist in liability and accountability allocation, since the responsibilities of Training Data Providers, Training Data Consumers, publishers, and platforms are clearly stated through policies and contracts, and their actions are immutably recorded in provenance artefacts.
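The hash-chain idea behind lineage can be sketched in a few lines. The following is an illustrative Python sketch, not DEPA's actual implementation; the event fields (`dataset_id`, `action`, `consent_ref`) are hypothetical names chosen for the example:

```python
import hashlib
import json

def record_event(chain, event):
    """Append a lineage event, binding it to the previous entry's hash."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    body = {"event": event, "prev": prev_hash}
    digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
    chain.append({"event": event, "prev": prev_hash, "hash": digest})

def verify_chain(chain):
    """Recompute every link; tampering with any entry breaks all later hashes."""
    prev_hash = "0" * 64
    for entry in chain:
        body = {"event": entry["event"], "prev": prev_hash}
        if hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest() != entry["hash"]:
            return False
        prev_hash = entry["hash"]
    return True

chain = []
record_event(chain, {"dataset_id": "ds-001", "action": "ingested", "consent_ref": "ca-42"})
record_event(chain, {"dataset_id": "ds-001", "action": "combined", "with": "ds-002"})
assert verify_chain(chain)          # intact chain verifies
chain[0]["event"]["action"] = "x"   # tamper with recorded history
assert not verify_chain(chain)      # verification now fails
```

Note that only metadata enters the chain, consistent with applying lineage to metadata rather than raw data; a production system would layer Merkle trees and signatures on top of this basic chaining.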

Finally, this approach has second-order effects on data quality: established provenance artefacts increase the value of well-curated datasets.

1.2 What AI Chain Is Not Intended to Do

  • Truthfulness or correctness guarantees: The chain reveals who, what, when, and how a piece of content was created or modified—but it cannot confirm whether the content depicts reality.
  • Bias/fairness or safety adjudication: The chain records facts; value judgements belong to governance, post-facto audits, and external assessments.
  • Enforcement on off-chain actors: Entities falling outside the chain are not snapshotted and can ignore the guardrails.
  • Elimination of the need for legal process: The chain provides strong, indisputable factual evidence, not automatic verdicts.

We welcome feedback and suggestions from all stakeholders at [email protected]

Please note: The blog post is authored by Subodh Sharma, with inputs from Sunu Engineer and Raj Shekhar, all volunteers with iSPIRT.

FAQs and Facts on Techno-Legal Regulation 2.0

This blog continues our discussion on the techno-legal regulation of artificial intelligence (AI), building on our original post from 03.09.25—with a focus on key outstanding issues that required in-depth consideration, alongside the responses and questions we received from stakeholders as of 12.09.25.

Question 1: Since technology is constantly evolving, wouldn’t relying on technology to enable regulation be a flawed approach?

No—what would be flawed is mandating the use of specific technologies for regulation. In fast-evolving domains like AI, rigid technological mandates risk becoming obsolete within a short time—both stifling innovation and undermining public safety. A fundamental insight from systems theory reinforces this: to regulate or control a system that operates at speed x, the regulatory system itself must react and adapt at comparable or greater speed.

AI is evolving at breakneck speed and our understanding of the associated risks and failure pathways remains incomplete. This inherent uncertainty calls for a regulatory framework that is both flexible and adaptive. The most effective way to achieve this is by combining technological agility with failure-related metrics, all governed under lightweight legal constraints and conditions. The techno-legal approach is designed precisely for this: it sets clear outcome-focused obligations for system developers and operators, without prescribing rigid technical solutions, while promoting continuous system monitoring and adaptability to emerging risks.

For example, instead of mandating a particular technique for privacy preservation in AI training, policymakers under the techno-legal approach mandate only the regulatory outcome—i.e., privacy preservation—allowing developers to implement the latest techniques, such as differential privacy or federated learning, to achieve it. As a result, regulation remains effective and adaptive in the face of advancing technology and emerging risks.
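To make the example concrete, here is a minimal sketch of the Laplace mechanism, the classic building block of differential privacy, applied to a counting query. This is one technique a developer might choose under such an outcome mandate; the function names are our own and this is not a DEPA API:

```python
import random

def laplace_noise(scale):
    """Sample Laplace(0, scale) noise as the difference of two exponential variates."""
    lam = 1.0 / scale
    return random.expovariate(lam) - random.expovariate(lam)

def dp_count(records, predicate, epsilon):
    """Differentially private count. Counting queries have sensitivity 1
    (adding or removing one person changes the count by at most 1),
    so Laplace noise with scale 1/epsilon suffices."""
    true_count = sum(1 for r in records if predicate(r))
    return true_count + laplace_noise(1.0 / epsilon)

ages = [23, 31, 45, 52, 38, 29, 61]
noisy = dp_count(ages, lambda a: a > 30, epsilon=0.5)  # true count is 5, plus calibrated noise
```

A smaller epsilon yields stronger privacy (more noise); the regulator need only specify the privacy outcome, leaving the choice of epsilon and mechanism to the developer.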

Question 2: Isn’t a techno-legal approach most suitable when the subject of regulation is clearly defined? If so, doesn’t AI’s rapidly evolving and non-deterministic nature make it a poor candidate for such regulation?

A precise definition of the regulatory subject is essential for traditional command-and-control regulation. This model relies on ex ante identification and enumeration of risks and corresponding mitigation measures, typically framed as detailed, positive obligations that regulatees must follow. Without a clear regulatory subject, risk assessments can be inaccurate, leading to over-regulation in some areas and under-regulation in others. Given AI’s rapidly evolving and non-deterministic nature, it is ill-suited for such rigid regulation.

In contrast, a techno-legal approach focuses on defining the regulatory outcome, rather than the precise subject of regulation. The regulator requires that the outcome—such as privacy preservation in AI training—be embedded into the technical design of any system that could affect it, without prescribing specific methods to achieve compliance. This removes the need for exhaustive risk enumeration upfront and avoids the pitfalls of narrowly defining the regulatory subject. By focusing on outcomes rather than rigid processes, techno-legal regulation enables continuous adaptability, making it uniquely well-suited to govern AI systems that are non-deterministic and continuously evolving in capability and complexity.

For example, Musical AI’s Rights Management Platform is a techno-legal solution that embeds the regulatory objective of copyright protection directly into the AI model development process. The platform achieves this by restricting training of music generation models to licensed content and integrating attribution technology that logs each output, linking it to the original artist or song. This ensures seamless copyright enforcement and fair revenue sharing. Crucially, the focus remains exclusively on the outcome, i.e.—safeguarding creators’ exclusive rights over the use and distribution of their works, as mandated by copyright laws globally. For such a techno-legal solution to function, the regulator need not define specific AI model types for music generation as the regulatory subject, nor prescribe a particular rights management platform as a compliance mandate. Instead, technologists and companies remain free to innovate in AI music generation, applying any method or architecture they choose—as long as the regulatory outcome of effective copyright protection is achieved.

Question 3: How can techno-legal regulation be designed to avoid becoming redundant or leading to unintended or undesirable consequences?

Techno-legal approaches are intended to tackle the very problem of redundancy in AI regulation, setting clear, outcome-focused obligations for system developers and operators while enabling continuous monitoring and adaptability to emerging risks (as explained in response to Question 1 above).

That said, in addition to having clearly defined regulatory outcomes, techno-legal regulation depends on two key conditions to remain effective and adaptive, ensuring it does not ironically render itself redundant. First, the efficacy of any techno-legal solution must be assessed using well-defined metrics to track its progress toward the regulatory objective. Where direct measurement is impractical, appropriate proxy indicators can be used. Importantly, these metrics should be subject to regular review, ensuring they stay relevant and responsive to emerging externalities and shifts in the operating environment. Second, the techno-legal solution should undergo regular audits to verify its effectiveness and continued alignment with the regulatory objective. This ensures that the system continues to function as intended. When designed with clear objectives, measurable metrics, and periodic auditing—techno-legal regulation remains robust, avoiding potential redundancy and the risk of unintended or undesirable consequences.

Question 4: Wouldn’t the AI Chain architecture under DEPA 2.0 restrict the diversity of relationships in the value chain, thereby limiting novel pathways for innovation?

On the contrary, the AI Chain architecture is specifically designed to enable the broadest diversity of relationships in the AI value chain. Its open, modular design and transparent accountability mechanisms allow various actors—including developers, data providers, service operators, and others—to collaborate with trust and innovate without rigid barriers. This flexibility, in turn, fosters the emergence of novel and unexpected pathways for value creation.

Question 5: Can the allocation of liability—an inherently nuanced area of jurisprudence that has evolved over centuries—be effectively codified into a technology framework?

The allocation of liability, grounded in centuries of jurisprudence, becomes particularly complex when applied to AI. While techno-legal approaches may not be suited to directly assign liability and enforce penalties for AI harms on their own, they could certainly provide valuable tools to help navigate this complexity. For example, the AI Chain architecture under DEPA 2.0 leverages distributed ledger technology to provide end-to-end tracking of system activities and participant actions at a fine-grained level—capturing who performed which action, when, and using which model or dataset, with precise timestamps. Cryptographic proofs such as Merkle trees ensure that every step is irrefutably recorded and immutable. These detailed traces create a tamper-proof, transparent record of events, which auditors, courts, and regulators can use to reconstruct the sequence of actions leading to an AI-related harm.

The technological observability and causal traceability enabled by the architecture could incentivise good behaviour among ecosystem actors, reduce ambiguity in legal and adjudicatory processes, and support the development of robust AI liability jurisprudence—making liability allocation for AI harms streamlined, scalable, transparent, and fair.

We welcome feedback and suggestions from all stakeholders at [email protected]

Please note: The blog post is authored by Raj Shekhar, with inputs from Sunu Engineer and review by Subodh Sharma, all volunteers with iSPIRT.

FAQs and Facts on Techno-Legal Regulation

This blog is an invitation to advance public discourse on techno-legal regulation of artificial intelligence (AI). It builds on an article by Rahul Matthan (15 January 2025), in which he raised reservations about applying techno-legal regulation to AI governance and expressed concerns about the practicability of techno-legal artefacts, particularly their ability to establish liability chains among ecosystem actors, as a tool for enforcing good behaviour and ensuring accountability for AI harms. Through a Q&A format, this blog addresses those reservations and concerns directly, while explaining why techno-legal regulation is not only feasible but also the only practicable and scalable way to regulate AI effectively.

Techno-legal regulation isn’t a monolithic concept; it can assume multiple implementations for different problems. DEPA Training embeds privacy and sovereignty requirements directly into AI training pipelines through confidential clean rooms and differential privacy. DEPA Inference creates consent-based data sharing. The proposed AI Chain architecture would establish liability tracking through distributed ledgers. Each solves a different problem using the same core principle: making regulatory compliance systematically enforced rather than legally suggested.

The confusion arises because people conflate these distinct systems. DEPA Training ensures AI models can be trained through data collaboration without exposing raw data; privacy budgets prevent individual contributions from being traced. DEPA Inference ensures PII-based data cannot be accessed without consent, because the cryptographic handshake fails without a valid consent artifact. AI Chain would ensure accountability cannot be avoided, because every inference generates a log trace. Three different problems, three different techno-legal solutions, one underlying philosophy: architecture enforces what law requires.

Moreover, the objection that individual tools don’t meet the bar of techno-legal regulation is precisely why techno-legal frameworks should be crafted around technology substrates: key ideas that are accepted and acknowledged as mechanisable, so that certain key properties and invariants hold in the real world. Tools are just instances that realise these mechanisable properties and invariants. For instance, can policy be expressed as attestable and executable code? Why not? Policy is a set of rules, and so long as those rules are unambiguous and computable, they are automatable. If exceptions to a rule exist, they too must be documented.
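A minimal sketch of the "policy as attestable, executable code" idea: the policy is expressed as unambiguous, computable data, its hash can be attested by all parties, and a function executes it. This is illustrative only; the field names and limits are assumptions, not any actual DEPA policy schema:

```python
import hashlib
import json

# Policy expressed as unambiguous, computable data. Hashing the exact policy
# that will execute makes it attestable: parties can verify which rules ran.
POLICY = {"max_age_days": 180, "allowed_purposes": ["model_training"]}
POLICY_HASH = hashlib.sha256(json.dumps(POLICY, sort_keys=True).encode()).hexdigest()

def check_use(record_age_days, purpose, policy=POLICY):
    """Execute the policy: permit use only for a consented purpose
    and within the retention window."""
    return purpose in policy["allowed_purposes"] and record_age_days <= policy["max_age_days"]

assert check_use(30, "model_training")
assert not check_use(200, "model_training")   # retention bound exceeded
assert not check_use(30, "advertising")       # purpose not permitted
```

Documented exceptions would simply be additional entries in the same machine-readable policy, evaluated before the general rule.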

There is a general worry that introducing identities into AI systems will erode privacy. From a computer-systems standpoint, that conclusion doesn’t follow. What matters is how identifiers are created and managed and what is recorded. With pairwise (service-scoped) identifiers, selective disclosure, and tamper-evident logging of metadata (not payloads), systems can offer accountability and simultaneously uphold Privacy by Design (PbD). These are not speculative ideas: the web and major identity programs already run variants at scale.

OpenID Connect has long supported pairwise subject identifiers, which purposely give each relying party a different, opaque value, curbing cross-service linkability. Aadhaar’s Virtual ID (VID) and UID tokenization make the same design choice in India: a revocable, tokenized identifier is presented instead of the Aadhaar number, and per-agency tokens prevent easy correlation across services while remaining auditable. In both cases, the principle is the same—identity is scoped to a context.
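The pairwise-identifier design can be sketched with a keyed hash: the issuer derives a different opaque identifier per relying party from a secret it alone holds. This is a simplified illustration of the principle behind OIDC pairwise subject identifiers and Aadhaar's VID/tokenization, not the actual derivation algorithm of either system:

```python
import hashlib
import hmac

def pairwise_id(master_secret: bytes, subject: str, relying_party: str) -> str:
    """Derive a service-scoped identifier: each relying party sees a different,
    opaque value for the same subject, so records cannot be joined across services."""
    return hmac.new(master_secret, f"{subject}|{relying_party}".encode(),
                    hashlib.sha256).hexdigest()

secret = b"issuer-held key, never shared"        # illustrative key material
id_for_bank = pairwise_id(secret, "user-123", "bank.example")
id_for_telco = pairwise_id(secret, "user-123", "telco.example")
assert id_for_bank != id_for_telco                                     # no cross-service linkability
assert id_for_bank == pairwise_id(secret, "user-123", "bank.example")  # stable within one service
```

Because only the issuer holds the key, two services cannot correlate their identifiers, yet the issuer can still resolve an identifier back to the subject under lawful process, which is exactly the auditability property described above.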

On the web, the W3C Verifiable Credentials (VC) 2.0 model and cryptographic suites such as BBS+ allow a holder to prove only the claims that are necessary (for example, “over 18”) while withholding the rest; the SD-JWT work in the IETF ecosystem supports similar selective-disclosure for JWTs (JSON Web Tokens). The direction of travel — both in standards and deployments — is to treat “need-to-know” as a first-class property.
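The selective-disclosure pattern used by SD-JWT can be illustrated with salted hash commitments: the issuer signs only digests of claims, and the holder later reveals the salt and value of just the claims needed. A simplified sketch that omits the signature step and the actual SD-JWT encoding:

```python
import hashlib
import json
import secrets

def commit_claims(claims):
    """Issuer side: each claim gets a fresh random salt; only the salted
    digests would be embedded in the signed credential."""
    disclosures = {k: (secrets.token_hex(16), v) for k, v in claims.items()}
    digests = {k: hashlib.sha256(f"{salt}|{json.dumps(v)}".encode()).hexdigest()
               for k, (salt, v) in disclosures.items()}
    return digests, disclosures

def verify_disclosure(digests, key, salt, value):
    """Verifier side: check one revealed claim against its committed digest."""
    return digests[key] == hashlib.sha256(f"{salt}|{json.dumps(value)}".encode()).hexdigest()

digests, disclosures = commit_claims({"name": "Asha", "over_18": True, "city": "Pune"})
salt, value = disclosures["over_18"]          # holder reveals only this one claim
assert verify_disclosure(digests, "over_18", salt, value)
```

The salts prevent a verifier from brute-forcing undisclosed claims from their digests, which is what makes "need-to-know" disclosure cryptographically enforceable rather than merely promised.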

Every time a browser trusts a public TLS certificate, it relies on Certificate Transparency (CT), append-only Merkle-tree logs with efficient inclusion and consistency proofs, to keep Certificate Authorities honest. Chrome and Apple have required CT for certificates issued after 2018. Therein lies a lesson for AI: append-only, publicly auditable logs are one mature way to record event receipts without exposing content.
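The CT-style inclusion proof works as follows: a verifier recomputes the root from a single leaf plus a logarithmic-size path of sibling hashes. A simplified illustration (real CT logs use RFC 6962's slightly different tree hashing, with domain-separation prefixes):

```python
import hashlib

def h(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def merkle_root_and_proof(leaves, index):
    """Hash the leaves into a Merkle tree, returning the root and the
    sibling path that proves leaves[index] is included under that root."""
    level = [h(leaf) for leaf in leaves]
    proof = []
    while len(level) > 1:
        if len(level) % 2:              # duplicate the last node on odd-sized levels
            level.append(level[-1])
        proof.append((index % 2, level[index ^ 1]))   # (am I the right child?, sibling hash)
        level = [h(level[i] + level[i + 1]) for i in range(0, len(level), 2)]
        index //= 2
    return level[0], proof

def verify_inclusion(leaf, proof, root):
    """Recompute the root from the leaf and its sibling path."""
    node = h(leaf)
    for is_right, sibling in proof:
        node = h(sibling + node) if is_right else h(node + sibling)
    return node == root

log = [b"cert-1", b"cert-2", b"cert-3", b"cert-4", b"cert-5"]
root, proof = merkle_root_and_proof(log, 2)
assert verify_inclusion(b"cert-3", proof, root)       # the entry is provably in the log
assert not verify_inclusion(b"cert-x", proof, root)   # a forged entry fails verification
```

The proof is a handful of hashes even for a log of millions of entries, which is what makes public auditability cheap enough to mandate, and the same property would apply to AI event receipts.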

PbD’s “positive-sum” stance is compatible with a metadata-only accountability layer. Instead of retaining prompts, outputs, or personal payloads, systems can emit signed, append-only receipts that capture who/what/which/when: a scoped user identifier, model and dataset versions, operation type (e.g., generate/transform/moderate), timestamp, and the responsible (but not necessarily trusted) operator or process. Auditors later verify that events occurred and in which order via Merkle proofs; when a lawful process requires more detail, selective-disclosure credentials release the minimum necessary information. This is the same architectural separation that keeps web PKI and identity wallets both auditable and privacy-preserving.
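Such a metadata-only receipt might look like the following sketch. We use an HMAC as a stand-in for a real asymmetric signature, and all field names are illustrative, not any actual receipt schema:

```python
import hashlib
import hmac
import json
import time

# Illustrative symmetric key; a real system would use asymmetric signatures
# so verifiers need not hold the operator's signing secret.
OPERATOR_KEY = b"operator signing key (illustrative)"

def emit_receipt(log, who, operation, model_version, dataset_version):
    """Append a signed who/what/which/when receipt: metadata only, no payloads."""
    receipt = {
        "who": who,                 # scoped user identifier, not a global ID
        "op": operation,            # e.g. generate / transform / moderate
        "model": model_version,
        "dataset": dataset_version,
        "ts": int(time.time()),
    }
    body = json.dumps(receipt, sort_keys=True).encode()
    receipt["sig"] = hmac.new(OPERATOR_KEY, body, hashlib.sha256).hexdigest()
    log.append(receipt)
    return receipt

def verify_receipt(receipt):
    """Check the signature over everything except the signature itself."""
    body = {k: v for k, v in receipt.items() if k != "sig"}
    expected = hmac.new(OPERATOR_KEY, json.dumps(body, sort_keys=True).encode(),
                        hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, receipt["sig"])

log = []
r = emit_receipt(log, "pairwise-7f3a", "generate", "model-v2.1", "ds-001@rev4")
assert verify_receipt(r)
r["model"] = "model-v9"        # tampering invalidates the signature
assert not verify_receipt(r)
```

Note that no prompt, output, or personal payload appears anywhere in the receipt; appending these receipts to a Merkle log then gives auditors ordering and inclusion proofs without content exposure.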

When we track things securely, we do not create a surveillance state; we create a modelable, measurable, manageable state. A surveillance state arises only when tracking data is misused by parties in power, or parties with the power to access the data while bypassing access checks. DEPA liability chains are designed to establish the connections between different parts of the data economy ecosystem, while using strong cryptographic techniques to detect and protect against unauthorised access.

Traceability and agency/activity chains are needed to construct the data economy ecosystem robustly.

India needs techno-legal regulation because we can’t afford not to have it. We don’t have thousands of judges to adjudicate AI harm. We don’t have armies of auditors to verify compliance. We face scale challenges the West does not: governing AI for 1.4 billion people requires architectural enforcement. We need to protect our people and enable our innovators.

The question isn’t whether we need techno-legal regulation; it’s whether we’re honest about what happens without it. Without DEPA Training’s cryptographic enforcement, AI systems will train on unauthorized data because detection is impossible at scale. Without immutable audit trails, companies will claim compliance while violating every principle, because verification requires resources we don’t have. Without architectural enforcement, the most vulnerable Indians, those who can’t afford lawyers, don’t understand technology, and can’t navigate bureaucracy, will be harmed first and most.

The AI space is an unknown space. To define legal regulation for a space, we need to be able to enumerate (exhaustively, if possible) all the failure modes in the system, and then frame regulations to prevent them, detect them, curtail their impact, and correct after the event. When we know the details, we can compute the legal implications and consequences and define a regulation that is largely legal (roughly 80 percent) supported by technology (roughly 20 percent). When we are dealing with an unknown space, unknown in the sense that the failure modes are not enumerable, we can instead pursue techno-legal regulation in an evolutionary manner, all the more so when the activity is distributed in space and time and occurs at high frequency. Here we start with a base implementation and evolve it as failure modes are discovered. Such an evolutionary approach to regulation, one that not only protects but also fosters growth, needs to be implemented on a technology substrate (roughly 80 percent technology, 20 percent human); otherwise the evolution will be very slow and the regulation will fall out of sync with market needs.

True, current technologies may not be able to solve use limitation and/or data minimisation in the world of AI ex ante. However, the question should be: can we construct testable technical mechanisms to check violations of these requirements ex post? I believe that is certainly possible: challenging, but doable.

DEPA does solve for this indirectly. Retention restrictions, usage limitation, and data minimization all require a deep understanding of how and where data is being used. DEPA chains track and trace data use and provide this information, which enables the DEPA framework itself to implement and enforce these and other constraints and conditions on data use. Without a technology framework to do this, many violations of such conditions would likely never come to light. The more complex the regulations get, the more technologically advanced and evolutionary the substrate needs to be.

We’re not encoding Platonic ideals of fairness; we’re implementing specific, measurable requirements that regulators and courts have already defined. DEPA Training’s architecture can use techno-legal solutions to enforce fairness principles. It might work like this: when a dataset enters the clean room, the system automatically computes demographic distributions and compares them against regulatory baselines. If biases are detected, appropriate remedial measures are triggered.
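Such a check might look like the following sketch. The attribute names, baseline shares, and tolerance here are purely illustrative, not regulatory values or a DEPA API:

```python
def demographic_check(records, attribute, baseline, tolerance=0.05):
    """Compare the dataset's distribution over a protected attribute against a
    regulatory baseline; flag groups whose share deviates beyond the tolerance."""
    counts = {}
    for r in records:
        counts[r[attribute]] = counts.get(r[attribute], 0) + 1
    total = len(records)
    flags = {}
    for group, expected_share in baseline.items():
        observed = counts.get(group, 0) / total
        if abs(observed - expected_share) > tolerance:
            flags[group] = {"expected": expected_share, "observed": round(observed, 3)}
    return flags

# Hypothetical dataset: 20% "F", 80% "M", checked against a 48/52 baseline.
records = [{"sex": "F"}] * 20 + [{"sex": "M"}] * 80
flags = demographic_check(records, "sex", {"F": 0.48, "M": 0.52}, tolerance=0.05)
assert "F" in flags and "M" in flags   # both groups deviate, so remediation is triggered
```

Because the check runs on data already inside the clean room, no raw records leave the enclave; only the pass/fail flags need to be reported.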

We welcome feedback and suggestions from all stakeholders at [email protected]

Please note: The blog post is authored by our volunteers, Sunu Engineer, Subodh Sharma, Raj Shekhar and Harshit Kacholiya

Lessons from India’s Digital Public Infrastructure Journey

In just a decade, India has redefined how nations can harness technology for the public good. Through Digital Public Infrastructure (DPI) such as Aadhaar, UPI, and Account Aggregator, followed by newer innovations like OCEN and ONDC, India has shown the world how open, interoperable, and inclusive digital systems, when designed as privately provisioned public infrastructure, can spark innovation, scale rapidly, and empower communities at the grassroots.

To capture these lessons and provide a practical guide for policymakers, technologists, and global stakeholders, iSPIRT Foundation has contributed to the development of the DPI Handbook: Foundations of Digital Public Infrastructure. This handbook distills a decade of India’s pioneering experience into actionable insights, frameworks, and design principles that can help other nations build their own inclusive and interoperable DPI. The handbook is now published by the Research and Information System for Developing Countries (RIS).

This Handbook is not the product of a single author, but rather the culmination of years of dedicated volunteerism at iSPIRT, where technologists, policymakers, entrepreneurs, and thinkers came together to exchange ideas, build prototypes, and debate design choices. Each page reflects this collaborative spirit, proof that when diverse minds work in concert, they can create frameworks that transform entire societies.

We extend our deepest gratitude to the iSPIRT volunteer community, past and present, whose passion and commitment have been instrumental in shaping India’s DPI journey. Their contributions embody the ethos of building digital public infrastructure as a shared national mission.

We hope the DPI Handbook becomes both a guide and an inspiration, for nations building their own digital public infrastructure, and for all who believe that technology, when designed for the public good, can change the course of societies.

Please note: The blog post is co-authored by our volunteer, Arun Iyer

iSPIRT would like to extend its gratitude to Shri Rajeev Chawla, IAS, Strategic Advisor and Chief Knowledge Officer, Ministry of Agriculture & Farmers’ Welfare, who co-authored this Handbook, for his insightful perspectives. We would also like to thank Shri Sachin Chaturvedi, Director General, RIS, for graciously writing the Preface to this Handbook.

A Budget with Good Intentions but Lacking the Drive to Realize Viksit Bharat & EoDB

iSPIRT Foundation, a technology think-and-do tank, believes India’s hard problems can be solved only by leveraging public technology for private innovation. iSPIRT pioneered the concept of Digital Public Infrastructure (DPI).

Industry watched and waited to see whether this budget would make bold, strategic announcements with a long-term vision aimed at Viksit Bharat 2047. Instead, the budget reads more as a tactical prescription for short-term economic correction.

Three themes capture our attention in the 2025 budget: investment in innovation, regulatory reforms, and MSME credit.

Given our Product Nation initiative at iSPIRT, we were looking for bold steps in two major areas: Private sector R&D funding and Ease of Doing Business.

The funding for R&D made incremental progress in this budget. Under the theme ‘Investing in Innovation’, the Finance Minister announced that Rs. 20,000 crore would be allocated from the Rs. 1 lakh crore corpus announced in previous budgets. There is no major decision or clarity on how this funding will be routed. In addition, an intent has been expressed to explore a Deep-Tech fund in the future.

Similarly, two welcome announcements towards achieving strategic autonomy are an outlay of Rs. 20,000 crore to develop Small Modular Reactors (SMRs) and operationalise at least five indigenously developed SMRs by 2033, and a National Geospatial Mission to develop foundational geospatial infrastructure and data.

There is a marked improvement in intent to solve the Ease of Doing Business (EoDB) problem in the budget, made evident by yesterday’s Economic Survey.

We have been pursuing the Government of India (GoI) on EoDB to implement a comprehensive three-pronged approach to bring India into the top 5 or 10 EoDB countries: decriminalising 1,200 provisions, rationalising multiple laws, and implementing a National Regulatory Compliance Grid at the center and then extending it to the states. Our approach gels with the “Whole of Nation” thinking given by the Honorable Prime Minister.

The GOI appears to reflect this thinking in the statement, “A light-touch regulatory framework based on principles and trust will unleash productivity and employment. Through this framework, we will update regulations that were made under old laws. To develop this modern, flexible, people-friendly, and trust-based regulatory framework appropriate for the twenty-first century…”.

However, the specific action of forming a High-Level Committee for regulatory reforms that “will be set up for a review of all non-financial sector regulations, certifications, licenses, and permissions” is welcome but appears incremental and slow-moving.

The Janvishwas 2.0 announcement has also been repeated, with 100 items to be taken up instead of the 1,200 reported by us.

Sudhir Singh, iSPIRT’s policy expert volunteer said, “The concept of digital transformation through National Regulatory Compliance Grid (NRCG) has been ignored in the budget. However, another idea of “Digital Port”, pursued by iSPIRT for more than two years now, on digital transformation of cross border trade has received attention and GoI seems to have framed it as ‘BharatTradeNet’ in the budget 2025.”

The budget speech states, “a digital public infrastructure, ‘BharatTradeNet’ (BTN) for international trade will be set-up as a unified platform for trade documentation and financing solutions”. This indeed is an important and welcome announcement.

The Finance Ministry’s move to separate R&D funding and Startup funding is encouraging. The startup-related announcements reflect the government’s continued support to startups.

The government has taken several steps to boost credit enablement for SME and Nano enterprises, a significant part of Priority Sector Lending. Some good announcements were made such as the Kisan Credit Card (KCC) limit increasing from INR 3 lakh to 5 lakh, which in three states – Karnataka, Maharashtra and Uttar Pradesh – is based on Open Credit Enablement Network (OCEN), an initiative of iSPIRT. Other welcome announcements include an increase in credit guarantee cover for MSMEs from INR 5 crores to 10 crores, introduction of customized credit cards for MSMEs, and capital infusion into select public sector banks.

A new Fund-of-Funds with a fresh contribution of another INR 10,000 crore has been announced. It is encouraging to see that the time limit under Section 80-IAC for startups to avail the income tax exemption benefit has been extended by five years, to 01.04.2030. Another announcement of interest for startups is the enhancement of credit availability, with the guarantee cover raised from INR 10 crore to 20 crore and the guarantee fee moderated to 1 per cent for loans in 27 focus sectors important for Atmanirbhar Bharat.

iSPIRT cofounder Sharad Sharma said, “It is heartening to see private sector R&D funding rolling out with a 20,000 crore fund allocation. Formation of Ease of Doing Business (EoDB) committee is also welcome. However, there is a need to take big moves and expedite these steps to meet 2047 deadlines. BharatTradeNet is a welcome announcement and we hope there will be a thorough and transparent industry consultation on it. Overall we are missing bold and specific actions on Strategic Autonomy and Product Nation.”

About iSPIRT Foundation – We are a non-profit think-and-do tank that builds public goods for Indian product startups to thrive and grow. iSPIRT aims to do what DARPA or Stanford University did in Silicon Valley for startups. iSPIRT builds four types of public goods – technology building blocks (aka India Stack), startup-friendly policies, market access programs like M&A Connect, and Playbooks that codify scarce tacit knowledge for product entrepreneurs of India. For more, visit www.ispirt.in.

For further queries, please reach out via email: [email protected], [email protected], or [email protected]

Digital Public Infrastructures

This workshop was organized by the Indian think tank iSPIRT Foundation, French Embassy in India, Consulate General of France in Bangalore, and La French Tech in India, based on the following principles:

  • Gathering high-level contributors from India and France: industry leaders, transdisciplinary academics, diplomats, officials, business founders, think tank members, and technology makers;
  • Pushing a Workshop format (not an event, not a round table, not a scientific conference), organizing 3 different days with 3 different viewpoints:
    • Philosophical/epistemological/ human sciences,
    • Economic/techno-legal/social sciences/adoption,
    • Application domains and use cases (Health, Culture, Creative Cities, Agriculture);
  • Targeting recommendations toward the AI Action Summit (Paris, February 2025).

Results:

  • More than 80 speakers, 100 participants in person (in Bangalore or in Paris), 200 participants online;
  • 14 countries from around the world represented (India, France, Canada, USA, Mexico, Guatemala, Brazil, Germany, the Netherlands, Italy, Spain, Portugal, Belgium, Thailand);
  • An opening session featuring the Ambassador of India to France, H.E. Mr. Jawed Ashraf; the French Digital Affairs Ambassador, H.E. Mr. Henri Verdier; and the Consul General of France in Bangalore, Mr. Marc Lamy;
  • A rich repository of content, including speaker presentations, slides and recordings (https://drive.google.com/drive/folders/1eTDbRgw1g8EOBtXS7uRim9I6gtQRbCZB).

Interim Budget 2024 – DPI’s the new factor of productivity

This being an interim budget, not much was expected in terms of new announcements or taxation changes. However, for iSPIRT and the product ecosystem of the country, it is heartening that some of our initiatives and thinking as a ‘think-tank’ have become central to the Government’s thinking at the leadership level. The following are important to note:

The Finance Minister mentioned that “DPI (digital public infrastructure), a new factor of production in the 21st century, is instrumental in the formalization of the economy”. She also mentioned the G20 successes. iSPIRT pioneered the concept of DPI and played a vital role in rolling out many DPIs, as well as in DPI advocacy as a knowledge partner to the Digital Economy Working Group.

The second announcement that can hugely impact product nation-building is the funding of research. The FM announced, “A corpus of rupees one lakh crore will be established with a fifty-year interest-free loan. The corpus will provide long-term financing or refinancing with long tenors and low or nil interest rates. This will encourage the private sector to scale up research and innovation significantly in sunrise domains.” The thought of generating employment and empowering youth was also central to this announcement. We hope that post-election a robust mechanism can be developed to implement this and capitalize on it for nation-building. This announcement is also important from iSPIRT’s standpoint, where a continuous push is being made under its “Vishwamitra” initiative to fund R&D in multiple ways at scale.

Also notable: a new scheme for deep-tech technologies for defence, aimed at expediting ‘Aatma-nirbharta’, is on the anvil.

Although nothing new has been announced, startups remain central to the Government’s thinking on economic development.

Overall, it is a forward-looking budget speech with an emphasis on deep tech, research funding, capital inflows, and startups, along with capex and infrastructure.

Though ‘Reform, Perform, and Transform’ was mentioned as a guiding principle, the budget did not touch upon any specific reform or intent on ease of doing business. We wish this becomes an important agenda item, along with funding research, for our businesses to succeed in global competition.

Open House on DPI for AI #4: Why India is best suited to be the breeding ground for AI innovation!

This is the 4th blog in a series of blogs describing and signifying the importance of DPI for AI, a privacy-preserving techno-legal framework for AI data collaboration. Readers are encouraged to first go over the earlier blogs for better understanding and continuity. 

We are at the cusp of history in how AI advancements are unfolding, with the potential to build the man-machine society of the future economically, socially, and politically. There is a great opportunity to understand and deliver on potentially breakthrough business and societal use cases while developing foundational capabilities that can adapt to new ideas and challenges in the future. Major Silicon Valley startups and big tech companies are focused first on bringing the advancements of AI to first-world problems, optimized and trained for their contexts. However, we know that first-world solutions may not work in the diverse and unstructured contexts of the rest of the world; they may not even work for all sections of the developed world.

Let’s address the elephant in the room: what are the critical ingredients an AI ecosystem needs to succeed? Data, an enabling regulatory framework, talent, compute, capital, and a large market. In this open house, we make the case that India excels in all these dimensions, making it literally a no-brainer, whether you are an investor, a researcher, an AI startup, or a product company, to come and do it in India for your own success.

India has one of the most vibrant, diverse, and eager markets in the world, making it a treasure chest of diverse data at scale, which is vital for AI models. While much of this data happens to be proprietary, the DPI for AI data collaboration framework makes it available in an easy and privacy-preserving way to innovators in India. Literally no other country has such a scale and game plan for training data. One may ask: diversity and scale are indeed India’s strengths, but where is the data? Isn’t most of our data with the US-based platforms? In this context, there are three types of data:

a. Public Data,
b. Non-Personal Data (NPD), and
c. Proprietary Datasets.

Let’s look at health. India has far more proprietary datasets than the US; they are just frozen in the current setup. Unfreezing this will give us a play in AI, and that is exactly what DPI for AI is doing, in a privacy-preserving manner. In the US, health data platforms like those of Apple and Google are entering into agreements with big hospital chains to supplement the user health data that comes from wearables. How do we better that? Theirs is a US Big Tech-oriented approach, not exactly an ecosystem approach. Democratic unfreezing of health data with hospitals is the key today, and DPI for AI would do that for all: small or big, developers or researchers! We have continental-scale data with more diversity than any other nation. We need a unique way to unlock it to enable the entire ecosystem, not just big corporations. If we can do that, and we think we can via DPI for AI, we will have AI winners from India.

Combine this with India’s forward-looking regulatory thought process, which balances regulation for AI and regulation of AI in a unique way that encourages innovation without compromising on individual privacy or other potential harms of the technology. The diversity and scale of the Indian market act as a forcing function for innovators to think of robustness, safety, and efficiency from the very start, which is critical for innovations in AI to actually result in financial and societal benefits at scale. A large number of engineers and scientists of Indian origin are creating AI models and developing innovative applications around them; given our demographic dividend, this will be one of our strengths for decades to come. Capital and compute are clearly not our strong points, but capital literally follows the opportunity. Given India’s position of strength on data, regulation, market, and talent, capital is finding its way to India!

So, what are you all waiting for? India welcomes you with continental-scale data, a lightweight but safe regulatory regime, and talent like nowhere else: come build, invest, and innovate in India. India has done it in the past in various sectors, and it is strongly positioned to do it again in AI. Let’s do this together. We are just getting started and, as always, are very eager for your feedback, suggestions, and participation in this journey!

Please share your feedback here
For more information, please visit depa.world

Please note: The blog post is authored by our volunteers, Sharad Sharma, Gaurav Aggarwal, Umakant Soni, and Sunu Engineer

iSPIRT’s response to Union Budget 2023

Budget 2023 – Digital Public Infrastructure (DPI) the ‘Mantra’ for New India

iSPIRT Foundation, a technology think-and-do tank, believes that India’s hard problems can be solved only by leveraging public technology for private innovation. As a think tank, iSPIRT pioneered the concept of Digital Public Infrastructure (DPI).

India is at the cusp of what could be the most exciting quarter century of its post-independence existence, referred to as ‘Amrit Kaal’ in yesterday’s Economic Survey and in today’s Budget speech. The Economic Survey also mentioned that Digital Public Infrastructure (DPIs) could boost GDP by 1%, an area where India is surely stealing a march on the world.

The second testimony to the important contribution of DPIs to the economy came in today’s budget speech, when the finance minister stated, “India’s rising global profile is because of several accomplishments: unique world class digital public infrastructure, e.g., Aadhaar, Co-Win and UPI”.

The development of DPIs, the Stay-in-India Checklist (for ease of doing business for startups), and a ‘jugalbandi’ between public technology and private innovation through techno-legal regulations are central to iSPIRT’s work in its attempt to build a Product Nation.

The Union Budget 2023 brings cheer with attempts on the following:

  • Digital Public Infrastructure: The resolve to deepen DPIs and the belief in their role in economic growth. India Stack has become central to the thought process of building DPIs. Taking the cue forward, Budget 2023 announced the development of a DPI for Agriculture, which will be an open-source, open-API digital public good to build inclusive, farmer-centric solutions for credit and insurance, farm inputs, and market intelligence. An Agriculture Accelerator Fund has been announced to promote agritech startups.
  • Vigyan Infrastructure: Efforts to boost R&D, though limited to select sectors for now. Notable among these:
    1. Encouraging private sector R&D teams to pursue collaborative research and innovation in select ICMR labs under the PPP model;
    2. One hundred labs for developing applications using 5G services, to be set up in engineering institutions;
    3. Centres of Excellence for AI, to “Make AI in India and Make AI work for India”.
  • MSME funding and growth is part of the budget thought process, which may lead to the use of another DPI, the Open Credit Enablement Network (OCEN), for enabling MSME funding.
  • The importance of ease of doing business is reflected in announcements such as using PAN as a common digital identifier and an entity DigiLocker for MSMEs.
  • The intent to keep the startup revolution going is reflected in plans to use startups to build technology across multiple sectors and to shape policy for a new India.

However, beneath all the euphoria, some chronic issues remain to be addressed. The disappointment is that the Stay-in-India Checklist (a list of ease-of-doing-business issues for startups), meant to stop startups from slipping away from India, has not been addressed. The checklist is being continuously pursued by iSPIRT and is much needed to give India a competitive edge and keep startups from leaving its jurisdiction.

Overall, it is heartening to see the vision statement in the budget: “Our vision for the Amrit Kaal includes technology-driven and knowledge-based economy”.

About iSPIRT Foundation – We are a non-profit think-and-do tank that builds public goods for Indian product startups to thrive and grow. iSPIRT aims to do for Indian startups what DARPA or Stanford did in Silicon Valley. iSPIRT builds four types of public goods – technology building blocks (aka India stack), startup-friendly policies, market access programs like M&A Connect and Playbooks that codify scarce tacit knowledge for product entrepreneurs of India.

For more, visit www.ispirt.in. For further queries, reach out via email: [email protected] or [email protected].