Red Hat unveils AI 3 to help UK firms scale enterprise AI
Red Hat has announced Red Hat AI 3, an update to its hybrid cloud-native AI platform aimed at helping enterprises move artificial intelligence workloads from experimentation into production at scale.
This release follows new research indicating that while UK businesses plan to increase AI spending by 32% by 2026, some 89% say they are not yet seeing customer value from their AI investments. The findings reflect broader challenges in operationalising AI, with data privacy, cost control, and model management highlighted as persistent concerns.
Distributed inference and cost control
Red Hat AI 3 brings together components including Red Hat AI Inference Server, Red Hat Enterprise Linux AI and Red Hat OpenShift AI. A key feature of OpenShift AI 3.0 is the general availability of llm-d, which provides distributed serving with intelligent, inference-aware scheduling for large language model workloads. The company states that this is designed to increase performance and reduce infrastructure costs by scheduling AI tasks across hardware accelerators in datacentres, public cloud and edge deployments.
A Model as a Service (MaaS) capability enables IT departments to centrally manage and serve models across teams, addressing issues identified in Red Hat's research, such as controlling costs and managing data privacy. The integrated AI hub offers a curated catalogue and registry for AI models, enabling engineers to deploy and monitor their assets centrally.
Shifting from training to inference
As AI projects mature, organisations are moving from developing and training models toward running them reliably in production, a process known as inference. Red Hat stated that its platform emphasises performant and cost-efficient inference, building on the open source vLLM and llm-d projects and integrating with open source technologies such as the Kubernetes Gateway API Inference Extension and NVIDIA's NIXL low-latency data transfer capabilities.
The platform is designed to enable wide hardware compatibility, supporting both NVIDIA and AMD accelerators. In particular, llm-d moves vLLM from a single-node inference engine to a distributed and scalable system, addressing operational requirements for predictable performance and efficient infrastructure planning in the face of variable AI workloads.
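Because llm-d builds on vLLM, which exposes an OpenAI-compatible HTTP API, applications typically talk to the serving layer through standard chat-completion requests regardless of where the model runs. A minimal sketch of constructing such a request follows; the endpoint URL and model name are illustrative assumptions, not details from the announcement:

```python
import json

# Hypothetical address of an OpenAI-compatible model server (e.g. vLLM);
# the port and path shown are vLLM's defaults, assumed here for illustration.
ENDPOINT = "http://localhost:8000/v1/chat/completions"

def build_chat_request(model: str, prompt: str, max_tokens: int = 128) -> dict:
    """Build an OpenAI-compatible chat-completion payload."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Model name is illustrative; any model served by the endpoint would work.
payload = build_chat_request("openai/gpt-oss-20b",
                             "Summarise llm-d in one sentence.")
body = json.dumps(payload)

# To query a running server, POST `body` to ENDPOINT with
# Content-Type: application/json (e.g. via urllib.request or requests).
```

Because the request shape is the same whether the backend is a single vLLM node or a distributed llm-d deployment, clients need no changes as the serving layer scales out.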
Joe Fernandes, Vice President and General Manager, AI Business Unit, Red Hat, said: "As enterprises scale AI from experimentation to production, they face a new wave of complexity, cost and control challenges. With Red Hat AI 3, we are providing an enterprise-grade, open source platform that minimises these hurdles. By bringing new capabilities like distributed inference with llm-d and a foundation for agentic AI, we are enabling IT teams to more confidently operationalise next-generation AI, on their own terms, across any infrastructure."
Collaboration and platform unification
Red Hat AI 3 provides a unified environment for both platform and AI engineers. Features such as Gen AI studio allow developers to interactively experiment with, prototype and test generative AI applications and models in a hands-on environment. The AI hub offers a management portal for the lifecycle and deployment of AI models, including open source options such as OpenAI's gpt-oss and DeepSeek-R1, and domain-specific models like Whisper for speech-to-text.
Agentic AI and modular frameworks
The platform release also emphasises support for agentic AI, which involves autonomous workflows driven by AI agents. Red Hat has introduced a Unified API layer based on Llama Stack, ensuring interoperability with industry standards such as OpenAI-compatible interfaces. The adoption of Model Context Protocol (MCP) is intended to streamline how models interact with external tools.
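In OpenAI-compatible interfaces, external tools of the kind MCP standardises are advertised to a model as function schemas in the chat request. The sketch below shows that request shape; the tool name, its schema and the model name are illustrative assumptions, not part of Red Hat's announcement:

```python
# Illustrative function schema for a hypothetical external tool, in the
# format used by OpenAI-compatible tool-calling interfaces.
tool_spec = {
    "type": "function",
    "function": {
        "name": "lookup_ticket",  # hypothetical tool, not a real integration
        "description": "Fetch a support ticket by ID.",
        "parameters": {
            "type": "object",
            "properties": {"ticket_id": {"type": "string"}},
            "required": ["ticket_id"],
        },
    },
}

# The tool list rides alongside the normal chat messages; the model can then
# respond with a structured call to `lookup_ticket` instead of plain text.
request = {
    "model": "deepseek-r1",  # illustrative model name
    "messages": [{"role": "user", "content": "Show me ticket 1234."}],
    "tools": [tool_spec],
}
```

Standardising on this schema is what lets an agent framework swap models or tool providers without rewriting integration code.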
A new toolkit for model customisation includes Python libraries and integration with projects like Docling for processing unstructured data, synthetic data generation, and tools for training and evaluation. These are intended to help enterprises tailor their AI models to proprietary data and requirements.
Partner and customer perspectives
Dan McNamara, Senior Vice President and General Manager, Server and Enterprise AI at AMD, said: "As Red Hat brings distributed AI inference into production, AMD is proud to provide the high-performance foundation behind it. Together, we've integrated the efficiency of AMD EPYC™ processors, the scalability of AMD Instinct™ GPUs, and the openness of the AMD ROCm™ software stack to help enterprises move beyond experimentation and operationalise next-generation AI - turning performance and scalability into real business impact across on-prem, cloud, and edge environments."
Mariano Greco, Chief Executive Officer, ARSAT, said: "As a provider of connectivity infrastructure for Argentina, ARSAT handles massive volumes of customer interactions and sensitive data. We needed a solution that would move us beyond simple automation to 'Augmented Intelligence' while delivering absolute data sovereignty for our customers. By building our agentic AI platform on Red Hat OpenShift AI, we went from identifying the need to live production in just 45 days. Red Hat OpenShift AI has not only helped us improve our service and reduce the time engineers spend on support issues, but also freed them up to focus on innovation and new developments."
Rick Villars, Group Vice President, Worldwide Research, IDC, said: "2026 will mark an inflection point as enterprises shift from starting their AI pivot to demanding more measurable and repeatable business outcomes from investments. While initial projects focused on training and testing models, the real value - and the real challenge - is to operationalise model-derived insights with efficient, secure and cost-effective inference. This shift requires more modern infrastructure, data, and app deployment environments with ready to use production-grade inference capabilities that can handle real-world scale and complexity, especially as agentic AI supercharges inference loads. Companies that succeed in becoming AI-fueled businesses will be those who establish a unified platform to orchestrate these ever more sophisticated workloads in hybrid cloud environments, not just in silo domains."
Ujval Kapasi, Vice President, Engineering AI Frameworks, NVIDIA, said: "Scalable, high-performance inference is key to the next wave of generative and agentic AI. With built-in support for accelerated inference with open source NVIDIA Dynamo and NIXL technologies, Red Hat AI 3 provides a unified platform that empowers teams to move swiftly from experimentation to running advanced AI workloads and agents at scale."