Atlas Cloud Launches High-Efficiency AI Inference Platform, Outperforming DeepSeek
Developed with SGLang, Atlas Inference surpasses leading AI companies in throughput and cost, running DeepSeek V3 and R1 faster than DeepSeek itself.
NEW YORK CITY, NEW YORK / ACCESS Newswire / May 28, 2025 / Atlas Cloud, the all-in-one AI competency center for training and deploying AI models, today announced the launch of Atlas Inference, an AI inference platform that dramatically reduces GPU and server requirements, enabling faster, more cost-effective deployment of large language models (LLMs).
Atlas Inference, co-developed with SGLang, an open-source AI inference engine, maximizes GPU efficiency by processing more tokens faster on less hardware. Measured against DeepSeek's published performance results, Atlas Inference's 12-node H100 cluster outperformed DeepSeek's reference implementation of the DeepSeek-V3 model while using two-thirds as many servers. By cutting infrastructure requirements, the platform directly addresses hardware costs, which can represent up to 80% of AI operational expenses.
"We built Atlas Inference to fundamentally break down the economics of AI deployment," said Jerry Tang, Atlas CEO. "Our platform's ability to process 54,500 input tokens and 22,500 output tokens per second per node means businesses can finally make high-volume LLM services profitable instead of merely break-even. I believe this will have a significant ripple effect throughout the industry. Simply put, we're surpassing industry standards set by hyperscalers by delivering superior throughput with fewer resources."
Atlas Inference's performance also exceeds that of major players such as Amazon, NVIDIA, and Microsoft, delivering up to 2.1 times greater throughput from 12 nodes than competitors achieve with larger setups. It maintains sub-5-second first-token latency and 100-millisecond inter-token latency with more than 10,000 concurrent sessions, ensuring a consistent experience at scale. The platform's performance is driven by four key innovations:
- Prefill/Decode Disaggregation: separates the compute-intensive prefill phase from the memory-bound decode phase to optimize efficiency
- DeepExpert (DeepEP) Parallelism with Load Balancers: ensures over 90% GPU utilization
- Two-Batch Overlap Technology: increases throughput by enabling larger batches and overlapping the compute and communication phases
- DisposableTensor Memory Models: prevents crashes during long sequences for reliable operation
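The first of these ideas, prefill/decode disaggregation, can be illustrated with a toy scheduler. This is a conceptual sketch only, not Atlas's or SGLang's implementation; all class and method names here are invented for illustration. The point is that new requests pass through a compute-bound prefill stage and are then handed off to a separate memory-bound decode pool, so neither phase stalls the other:

```python
from collections import deque

class DisaggregatedScheduler:
    """Toy model of prefill/decode disaggregation (illustrative only)."""

    def __init__(self):
        # New requests: full-prompt attention, compute-bound.
        self.prefill_queue = deque()
        # In-flight requests: one token per step, memory-bound (KV-cache reads).
        self.decode_pool = deque()

    def submit(self, request_id, prompt_tokens):
        self.prefill_queue.append((request_id, prompt_tokens))

    def step(self):
        # Prefill workers process whole prompts, then hand the request
        # off to the decode pool instead of occupying a decode slot.
        prefilled = []
        while self.prefill_queue:
            req = self.prefill_queue.popleft()
            prefilled.append(req)
            self.decode_pool.append(req)
        # Decode workers batch one token step across all active requests.
        decoded = [req_id for req_id, _ in self.decode_pool]
        return prefilled, decoded

sched = DisaggregatedScheduler()
sched.submit("r1", ["Hello", "world"])
sched.submit("r2", ["Hi"])
prefilled, decoded = sched.step()
```

In a real deployment the two queues would live on physically separate GPU pools sized to their different bottlenecks (compute for prefill, memory bandwidth for decode), which is what allows the utilization gains the release describes.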
"This platform represents a significant leap forward for AI inference," said Yineng Zhang, Core Developer at SGLang. "What we built here may become the new standard for GPU utilization and latency management. We believe this will unlock capabilities previously out of reach for the majority of the industry regarding throughput and efficiency."
Combined with a lower cost per token, linear scaling behavior, and reduced emissions compared with leading vendors, Atlas Inference offers cost-efficient, scalable AI deployment.
Atlas Inference works with standard hardware and supports custom models, giving customers complete flexibility. Teams can upload fine-tuned models and keep them isolated on dedicated GPUs, making the platform ideal for organizations requiring brand-specific voice or domain expertise.
The platform is available immediately for enterprise customers and early-stage startups.
About Atlas Cloud
Atlas Cloud is your all-in-one AI competency center, powering leading AI teams with safe, simple, and scalable infrastructure for training and deploying models. Atlas Cloud also offers an on-demand GPU platform that delivers fast, serverless compute. Backed by Dell, HPE, and Supermicro, Atlas delivers near-instant access to up to 5,000 GPUs across a global SuperCloud fabric with 99% uptime and baked-in compliance. Learn more at atlascloud.ai.
SOURCE: Atlas Cloud
Related Articles


Business Insider
3 hours ago
AI Daily: OpenAI argues to keep countersuit against Musk
Catch up on the top artificial intelligence news and commentary by Wall Street analysts on publicly traded companies in the space with this daily recap compiled by The Fly. COUNTERSUIT: Microsoft (MSFT)-backed OpenAI is arguing to keep its countersuit against Tesla (TSLA) CEO Elon Musk in the trial over its for-profit shift, Reuters reports. OpenAI says the Tesla CEO's motion to dismiss the ChatGPT maker's claims has 'no grounding in facts,' and that its countersuit should be included in the expedited trial rather than put on hold. AI PLATFORM: Aurora Mobile (JG) announced the integration of the newly updated DeepSeek-R1-0528, a groundbreaking open-source reasoning AI model that rivals proprietary giants like OpenAI's o3 and Google's (GOOG, GOOGL) Gemini 2.5 Pro, into its leading enterprise-grade AI platform. This significant update, released by DeepSeek, brings enhanced reasoning capabilities and developer-friendly features, further empowering the platform to deliver cutting-edge AI solutions to enterprises worldwide. The DeepSeek-R1-0528 model brings substantial advancements in reasoning, achieving notable benchmark improvements such as AIME 2025 accuracy rising from 70% to 87.5% and LiveCodeBench coding performance increasing from 63.5% to 73.3%. These enhancements empower users to tackle complex tasks in domains like math, science, business, and programming with greater precision and efficiency. Additionally, the model's reduced hallucination rate, along with support for JSON output and function calling, ensures seamless integration into business workflows, delivering reliable and consistent results. These improvements align with the company's mission to provide secure, scalable, and enterprise-ready AI solutions.
AI INITIATIVE: Tevogen (TVGN) provided stockholders a detailed overview of its artificial intelligence initiative, aimed at integrating advanced machine learning into its ExacTcell technology to enhance target identification and preclinical processes. The company currently has two proprietary technologies, PredicTcell and AdapTcell, both with patents pending. It also highlighted strategic partnerships with Microsoft, providing AI expertise and cloud computing infrastructure, and Databricks, supplying data engineering and analytics capabilities. Tevogen plans to expand its headquarters for the team.
Yahoo
7 hours ago
DeepSeek's R1 Upgrade Nears Top-Tier LLMs
DeepSeek today rolled out DeepSeek-R1-0528, an upgraded version of its R1 large language model that it says now rivals OpenAI's o3 and Google's (NASDAQ:GOOG) Gemini 2.5 Pro. The China-based AI firm credited enhanced post-training algorithmic optimizations and a beefed-up compute pipeline for boosting reasoning accuracy from 70% to 87.5% on complex logic tasks, while cutting hallucination rates and improving vibe coding performance. DeepSeek highlighted benchmark wins in mathematics, programming, and general inference, positioning R1-0528 as a peer to leading Western models. This release follows DeepSeek's recent open-source launch of Prover-V2, a specialist reasoning engine, and comes amid a flurry of Chinese AI advancements, including Alibaba's (NYSE:BABA) Qwen 3 and Baidu's (NASDAQ:BIDU) Ernie 4.5/X1, both touting hybrid reasoning firepower. DeepSeek argues that its combination of open-development ethos and performance parity gives it a unique edge in global AI research. Investors and partners should care because DeepSeek-R1-0528's near-parity with top-tier LLMs could accelerate enterprise deployments in Asia and beyond, drive cloud-compute demand, and intensify competition in the rapidly evolving AI landscape. As Western and Chinese models vie for supremacy, benchmarks like these will shape strategic bets on talent, infrastructure, and cross-border AI collaborations. With R1-0528 available now on Hugging Face, markets will watch for adoption by startups and research labs, potential licensing deals, and further advances in DeepSeek's open-source roadmap. This article first appeared on GuruFocus.


Forbes
7 hours ago
Should You Be Worried If Your Doctor Uses ChatGPT?
Six years ago, I wrote a piece, 'Doctors Use YouTube And Google All The Time. Should You Be Worried?' In 2025, it's time to ask, 'Your doctor may be using ChatGPT. Should you be worried?' In a recent unscientific survey, technology entrepreneur Jonas Vollmer asked physicians how many used ChatGPT; 76% of the respondents answered 'yes.' According to Vollmer, a physician friend also told him, 'most doctors use ChatGPT daily. They routinely paste the full anonymized patient history (along with x-rays, etc.) into their personal ChatGPT account.' My own unofficial conversations with colleagues bear this out, with younger physicians more likely to regularly use AI than older ones. I think AI tools such as ChatGPT, Grok, Claude, and other LLMs can be very helpful for physicians after they take a good patient history and perform a properly thorough physical exam. The physician can describe patient signs and symptoms with appropriate medical precision for the AI to analyze. In particular, the AI can frequently suggest diagnoses that would not otherwise occur to the physician. For example, Vollmer noted that in a busy urgent care clinic, a patient might be taking 'alternative medicines' with unusual side effects that are not widely known in the traditional medical literature but have been discussed in recent online articles and discussion forums. Thus, ChatGPT acts as an extension of a good physician, not a replacement. As always, the physician has the final responsibility of confirming any novel hypothesis offered by the AI with their own human judgment, which might include running additional tests to confirm the diagnosis. We've already seen non-physician patients report how ChatGPT made a diagnosis for themselves or loved ones after doctors had been stumped for years. And there are multiple studies showing that AI tools like ChatGPT can be surprisingly good at diagnosis when offered patient case reports.
Of course, physicians need to be careful to adhere to all relevant medical privacy laws in their states and countries. They may even consider getting explicit consent from their patients ahead of time to run their (anonymized) data through AI. Physicians already seek second opinions from fellow doctors all the time, as long as privacy rules are met; the same guidelines should apply to consultations with AI. In many ways, this is comparable to the current state of AI in driverless cars. Driverless Waymo taxicabs in selected cities like Los Angeles perform as safely as (or more safely than) human drivers in appropriately restricted settings. Tesla owners who use the self-driving mode can rely on the AI to drive safely most of the time, although they still have to be prepared to take control of the wheel in an emergency. Robot cars are not yet ready to replace human drivers in all settings (such as icy Colorado mountain highways in wintertime), but they continue to improve rapidly. Similarly, we may soon reach the point where a physician who does not use an AI consultant to double-check their diagnoses will be considered to be practicing below the standard of care. We are not there yet, but I can see that coming in the next few years. Summary: tools like ChatGPT can be enormously helpful for physicians, provided that the doctor retains ultimate responsibility for the final diagnosis and treatment, and respects the appropriate privacy rules.