Generative Computing

Available on-demand

Move beyond experimental chatbots and toward a robust IT stack capable of supporting autonomous agents. Sriram Raghavan, VP of AI Research at IBM, dismantles AI hype to reveal the “Generative Computing” paradigm. Learn why 95% of AI pilots fail and how a principles-based approach—leveraging small, fit-for-purpose models and rigorous governance—can transform enterprise uncertainty into secure, scalable, and efficient agentic applications.

The world is in a transformative era, with AI revolutionizing industries, reshaping innovation, and unlocking opportunities once thought impossible. Its vast potential inspires optimism for a future where technology drives progress across industries, governments, NGOs, and society as a whole. However, with this promise comes significant responsibility. Concerns over bias, inequitable access, safety vulnerabilities, and ethical uncertainties highlight the urgent need for a guiding framework. RISE (Responsible, Inclusive, Safe and Ethical) AI fulfills this role, ensuring that AI technologies are developed and applied responsibly, inclusively, and ethically.

The RISE AI Conference provides a unique platform to explore how artificial intelligence can be harnessed to tackle complex societal and contemporary challenges while upholding the principles of RISE. The inaugural RISE AI Conference took place from October 6-8, 2025 at the University of Notre Dame, and was hosted by the Lucy Family Institute for Data and Society.

For more information, please visit the RISE AI Conference website.

The current state of artificial intelligence is defined by a paradoxical “value gap.” Despite the breathless pace of innovation, widely cited reports from MIT and Forbes suggest that up to 95% of AI pilots fail to reach production. Sriram Raghavan argues that this failure stems from a lack of formal understanding of AI as a new computing element. As enterprises rush from simple assistants to autonomous agents in a matter of months, they are finding that traditional software development practices are insufficient for the stochastic nature of Large Language Models (LLMs).

2.1 The Three-Layered Journey of Governance

Enterprise AI cannot exist without a disciplined, end-to-end governance strategy. Raghavan describes this as a series of “concentric circles” that have evolved over time. In classical machine learning, governance focused on distributional fairness and adversarial robustness. Generative AI added a second layer addressing hallucinations and source attribution. Now, the “Agentic” layer introduces risks regarding tool-calling and autonomous actions.

To navigate this, IBM developed the Risk Atlas, a comprehensive taxonomy of risks, and the Risk Atlas Nexus, an AI-based navigation advisor that guides developers from intent to specific mitigation measures. A critical focus here is tool-calling hallucinations. Raghavan distinguishes between “syntactic” errors (which fail harmlessly) and “semantic” errors—where an agent provides a correctly formatted but factually wrong command to a database.
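The difference between the two error classes can be illustrated with a minimal sketch. Everything named here is hypothetical and not part of the Risk Atlas or any IBM tooling: the `lookup_order` tool, its schema, and the set of known order IDs are assumptions chosen only to show that a call can pass every format check and still be wrong.

```python
import json

# Hypothetical tool schema: one tool, one integer argument.
TOOL_SCHEMA = {"lookup_order": {"order_id": int}}

# Hypothetical ground truth used for the semantic check.
KNOWN_ORDER_IDS = {1001, 1002, 1003}


def check_tool_call(raw: str) -> str:
    """Classify a model-produced tool call as 'syntactic_error', 'semantic_error', or 'ok'."""
    # Syntactic checks: does the call parse and match the declared schema?
    try:
        call = json.loads(raw)
    except json.JSONDecodeError:
        return "syntactic_error"
    if not isinstance(call, dict):
        return "syntactic_error"
    name, args = call.get("tool"), call.get("args")
    if not isinstance(name, str) or name not in TOOL_SCHEMA or not isinstance(args, dict):
        return "syntactic_error"
    expected = TOOL_SCHEMA[name]
    if set(args) != set(expected):
        return "syntactic_error"
    for key, typ in expected.items():
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(args[key], bool) or not isinstance(args[key], typ):
            return "syntactic_error"

    # Semantic check: the call is well formed, but does it refer to something real?
    if args["order_id"] not in KNOWN_ORDER_IDS:
        return "semantic_error"
    return "ok"


print(check_tool_call('{"tool": "lookup_order", "args": {"order_id": 1001}}'))
print(check_tool_call('{"tool": "lookup_order", "args": {"order_id": 9999}}'))
print(check_tool_call('{"tool": "lookup_order", "args": {"order_id": '))
```

The truncated third call fails at the parser and never reaches the database. The second call is the dangerous one: it is valid JSON with the right argument type, so only a check against real data can catch it.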

The “So What?” Layer: The strategic implication is clear: organizations cannot treat governance as a piecemeal checklist. Because these risks are additive, a failure in the foundational layer (fairness) compromises the outer agentic layer (tool-calling). Reliability requires a platform approach where governance is baked into the entire lifecycle, from the Risk Atlas Nexus during development to runtime monitoring in production.

2.2 The Efficiency Frontier—Why Small Models “Hunt”

Raghavan dismisses the idea of a single “magic model.” Instead, he predicts a market polarization where the “middle bucket” of models (100B–200B parameters) disappears. These models are being squeezed: small models (under 10B) are now hitting performance benchmarks that previously required massive hardware, while Frontier APIs handle the most complex reasoning.

IBM’s Granite 4.0 series exemplifies this shift. Built on a hybrid Mamba-2 architecture, which combines Transformer layers with state space models, these models substantially reduce memory footprint. The Granite 4.0 family is also the first open-source model family to be ISO 42001 certified, providing a verified audit of data sanitization and governance.

The “So What?” Layer: For the enterprise, “small models hunt” because they provide 98% of a frontier model’s performance at 1/50th of the cost. By leveraging WebGPU to run these models directly in a browser or on a laptop, organizations can cut infrastructure overhead and keep sensitive data local, a significant advantage in operational efficiency.
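A rough back-of-the-envelope calculation shows why model size decides whether local deployment is feasible. This sketch counts weight storage only and ignores activations, the KV cache, and runtime overhead. The 8B and 150B sizes are illustrative picks from the "small" and "middle bucket" ranges discussed in this section, not figures from the keynote.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory for model weights alone, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Illustrative sizes: a small model versus a "middle bucket" model.
for label, params in [("8B model", 8e9), ("150B model", 150e9)]:
    for bits in (16, 4):
        gb = weight_memory_gb(params, bits)
        print(f"{label} at {bits}-bit weights: {gb:.0f} GB")
```

Under these assumptions, an 8B model quantized to 4 bits needs about 4 GB for its weights, which fits on an ordinary laptop. A 150B model at the same precision needs about 75 GB, which does not.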

2.3 Defining Generative Computing

The most provocative shift in the keynote is the move from “brutal” natural language prompting to “Generative Programming.” Raghavan argues that prompting is a trial-and-error process that is brittle, unportable, and insecure. To solve this, IBM introduced the Mellea toolkit, which treats LLMs as computational agents rather than human-like entities.

Mellea represents a return to CS 101 principles by enforcing the separation of instructions from data. It replaces English-language “praying” with structured Python-based control flow, moving the logic (if/then/retry) outside the stochastic model and into the deterministic code.
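The general pattern can be sketched in plain Python. This is not the toolkit's actual API: `generate_with_retry`, the stand-in model, and the validator are hypothetical names invented for illustration. The sketch shows two ideas from the paragraph above: the instruction and the data are passed as separate arguments, and the retry logic lives in deterministic code rather than in the prompt.

```python
import json
from typing import Callable

# A "model" here is any callable that takes an instruction and data separately.
Model = Callable[[str, str], str]


def generate_with_retry(model: Model, instruction: str, data: str,
                        validate: Callable[[str], bool], max_attempts: int = 3) -> str:
    """Call the model until its output passes validation, or give up."""
    for _ in range(max_attempts):
        output = model(instruction, data)
        if validate(output):  # deterministic check, outside the model
            return output
    raise RuntimeError(f"no valid output after {max_attempts} attempts")


def is_valid_sentiment(output: str) -> bool:
    """Accept only a JSON object with a known sentiment label."""
    try:
        parsed = json.loads(output)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and parsed.get("sentiment") in {"positive", "negative", "neutral"}


def make_flaky_model() -> Model:
    """Stand-in for an LLM: returns chatty, malformed text on the first call only."""
    calls = {"n": 0}

    def model(instruction: str, data: str) -> str:
        calls["n"] += 1
        if calls["n"] == 1:
            return "Sure! The sentiment is positive."
        return json.dumps({"sentiment": "positive"})

    return model


result = generate_with_retry(
    make_flaky_model(),
    instruction="Classify the sentiment of the review as JSON.",
    data="The onboarding was quick and the support team was helpful.",
    validate=is_valid_sentiment,
)
print(result)
```

Because the validator and the retry loop are ordinary functions, they can be unit tested without a model, and the same checks keep applying if the underlying model is swapped for a newer version.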

The “So What?” Layer: Treating an LLM as a software component rather than a conversational partner is the only path to reliability. By moving control flow into Python, developers can create modular, testable applications that don’t break when a model is upgraded from one version to the next.

Conclusion: These elements converge to form the Agentic Development Life Cycle (ADLC). This new framework, jointly authored with Anthropic, replaces the traditional SDLC and requires developers to build new skills in evaluation and stochastic testing in order to deliver the next generation of enterprise software.

The “Middle Bucket” is Dying: The AI market is polarizing into massive frontier models and tiny, highly optimized task-specific models (like Granite 4.0). Impact: Organizations should stop over-investing in medium-sized, general-purpose models and instead focus R&D on fine-tuning 8B-parameter models for specific domain tasks.
Prompting is Not Engineering: The era of the “seven-page prompt” is a temporary workaround. Generative programming via toolkits like Mellea is the future. Impact: Resource allocation must shift; enterprises should reallocate headcount from “Prompt Engineers” to “Generative Software Engineers” capable of managing Python-based control flows.
The AI-Based Governance Advisor: Navigating risk is too complex for manual checklists. The Risk Atlas Nexus provides an AI advisor to map intents to mitigation. Impact: Compliance is no longer a bottleneck but an automated part of the developer workflow, ensuring semantic accuracy in agentic tool-calling.
The ADLC is the New Standard: The transition from SDLC to the Agentic Development Life Cycle (authored with Anthropic) is the biggest challenge for the modern workforce. Impact: Companies must invest in training software engineers in “evals” and data science practices to manage the stochastic nature of AI outputs.
Cryptographic Model Security is Mandatory: As models become bits of code, they require the same security rigor as software. Impact: To guard against tampering or a poorly quantized substitute, IT stacks must mandate cryptographic signing for all deployed models so that the deployed “bits” can be verified against the certified release.
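The last takeaway can be sketched as a check that runs before a model is loaded. This is a simplified stand-in, not a description of IBM's signing process: a production setup would normally use asymmetric signatures so that verifiers never hold the signing key, whereas this sketch uses a shared-key HMAC from the Python standard library. The key and the byte strings standing in for model files are hypothetical.

```python
import hashlib
import hmac


def sign_model(model_bytes: bytes, key: bytes) -> str:
    """Produce an HMAC-SHA256 tag over the model artifact's bytes."""
    return hmac.new(key, model_bytes, hashlib.sha256).hexdigest()


def verify_model(model_bytes: bytes, key: bytes, expected_tag: str) -> bool:
    """Return True only if the artifact matches the tag issued at release time."""
    actual_tag = sign_model(model_bytes, key)
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(actual_tag, expected_tag)


# Hypothetical release: the publisher signs the certified weights.
release_key = b"example-shared-secret"
certified_weights = b"\x00\x01\x02 certified model weights"
release_tag = sign_model(certified_weights, release_key)

# At deployment time, verify before loading.
print(verify_model(certified_weights, release_key, release_tag))
tampered_weights = certified_weights + b"\xff"
print(verify_model(tampered_weights, release_key, release_tag))
```

The first check passes because the bytes are unchanged. The second fails because a single appended byte, whether from tampering or from an unofficial re-quantization, yields a different tag.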

Sriram Raghavan’s assertions carry the weight of a practitioner who has spent 25 years at IBM, transitioning cutting-edge R&D into a $25 billion software business. His perspective is grounded in the “last mile” challenge of making technology work for the world’s most regulated industries.

“There was all this MIT article that said 95% of AI pilots fail… there is a challenge in between the promise of AI and driving that value in enterprises.”
“The answer isn’t ever going to be like this one magic model that rules the world. It never is. The world is too diverse.”
“When the use case is specific enough, small models hunt. If you have a model that is 1/50th the cost… an enterprise will take it every day.”
“You just pray to the model… ‘do not hallucinate, my career depends on this.’ This is not engineering; this is trial and error.”
“Agents are just programs. This world called inference scaling—this is just logic on top of a model. Just because you call it an agent doesn’t change the fact that finally it’s a program.”