Red Hat launches enterprise AI inference server for hybrid cloud

Techday NZ, 21-05-2025
Red Hat has introduced Red Hat AI Inference Server, an enterprise-grade offering aimed at enabling generative artificial intelligence (AI) inference across hybrid cloud environments.
Red Hat AI Inference Server builds on the vLLM community project, which was started at the University of California, Berkeley. By integrating Neural Magic technologies, Red Hat aims to deliver higher speed, better efficiency across a range of AI accelerators, and lower operational costs. The platform is designed to allow organisations to run generative AI models on any AI accelerator within any cloud infrastructure.
The solution can be deployed as a standalone containerised offering or as part of Red Hat Enterprise Linux AI (RHEL AI) and Red Hat OpenShift AI. Red Hat says this approach is intended to empower enterprises to deploy and scale generative AI in production with increased confidence.
Joe Fernandes, Vice President and General Manager for Red Hat's AI Business Unit, commented on the launch: "Inference is where the real promise of gen AI is delivered, where user interactions are met with fast, accurate responses delivered by a given model, but it must be delivered in an effective and cost-efficient way. Red Hat AI Inference Server is intended to meet the demand for high-performing, responsive inference at scale while keeping resource demands low, providing a common inference layer that supports any model, running on any accelerator in any environment."
Inference is the phase in which a pre-trained model is used to generate outputs. If it is not managed well, it can become a significant bottleneck for performance and cost efficiency. The increasing complexity and scale of generative AI models have highlighted the need for robust inference solutions capable of handling production deployments across diverse infrastructures.
The Red Hat AI Inference Server builds on the technology foundation established by the vLLM project. vLLM is known for high-throughput AI inference, support for large input contexts, model acceleration across multiple GPUs, and continuous batching. It also supports a broad range of publicly available models, including DeepSeek, Google's Gemma, Llama, Llama Nemotron, Mistral, and Phi, among others. Its integration with leading models and enterprise-grade reasoning capabilities positions it as a candidate standard for AI inference.
The packaged enterprise offering delivers a supported and hardened distribution of vLLM, with several additional tools. These include intelligent large language model (LLM) compression utilities to reduce AI model sizes while preserving or enhancing accuracy, and an optimised model repository hosted under Red Hat AI on Hugging Face. The repository provides instant access to validated and optimised AI models tailored for inference, which Red Hat says are designed to improve efficiency by two to four times without compromising the accuracy of results.
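As a rough illustration of why reducing model size matters for inference cost, the sketch below estimates how much memory a model's weights occupy at different numeric precisions. The 8-billion-parameter count and the bit widths are illustrative assumptions rather than figures from Red Hat, and the estimate covers weights only, ignoring the key-value cache and activations. It is not a description of how Red Hat's compression utilities work.

```python
def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Approximate memory needed to hold model weights, in gigabytes (10^9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


# Hypothetical 8-billion-parameter model (an assumption for illustration only).
params = 8e9
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: {weight_memory_gb(params, bits):.1f} GB")
```

Under these assumptions, moving from 16-bit to 8-bit or 4-bit weights halves or quarters the weight footprint, which is one reason compressed models can fit on fewer or smaller accelerators.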
Red Hat also provides enterprise support, drawing upon expertise in bringing community-developed technologies into production. For expanded deployment options, the Red Hat AI Inference Server can be run on non-Red Hat Linux and Kubernetes platforms in line with the company's third-party support policy.
The company's stated vision is to enable a universal inference platform that can accommodate any model, run on any accelerator, and be deployed in any cloud environment. Red Hat sees the success of generative AI as dependent on the adoption of such standardised inference solutions to ensure consistent user experiences without increasing costs.
Ramine Roane, Corporate Vice President of AI Product Management at AMD, said: "In collaboration with Red Hat, AMD delivers out-of-the-box solutions to drive efficient generative AI in the enterprise. Red Hat AI Inference Server enabled on AMD Instinct™ GPUs equips organizations with enterprise-grade, community-driven AI inference capabilities backed by fully validated hardware accelerators."
Jeremy Foster, Senior Vice President and General Manager at Cisco, commented on the joint opportunities provided by the offering: "AI workloads need speed, consistency, and flexibility, which is exactly what the Red Hat AI Inference Server is designed to deliver. This innovation offers Cisco and Red Hat opportunities to continue to collaborate on new ways to make AI deployments more accessible, efficient and scalable—helping organizations prepare for what's next."
Intel's Bill Pearson, Vice President of Data Center & AI Software Solutions and Ecosystem, said: "Intel is excited to collaborate with Red Hat to enable Red Hat AI Inference Server on Intel Gaudi accelerators. This integration will provide our customers with an optimized solution to streamline and scale AI inference, delivering advanced performance and efficiency for a wide range of enterprise AI applications."
John Fanelli, Vice President of Enterprise Software at NVIDIA, added: "High-performance inference enables models and AI agents not just to answer, but to reason and adapt in real time. With open, full-stack NVIDIA accelerated computing and Red Hat AI Inference Server, developers can run efficient reasoning at scale across hybrid clouds, and deploy with confidence using Red Hat Inference Server with the new NVIDIA Enterprise AI validated design."
Red Hat has stated its intent to further build upon the vLLM community as well as drive development of distributed inference technologies such as llm-d, aiming to establish vLLM as an open standard for inference in hybrid cloud environments.