Latest news with #LLM


Business Wire
5 hours ago
- Business
- Business Wire
Cerebras Beats NVIDIA Blackwell in Llama 4 Maverick Inference
SUNNYVALE, Calif.--(BUSINESS WIRE)--Last week, Nvidia announced that 8 Blackwell GPUs in a DGX B200 could demonstrate 1,000 tokens per second (TPS) per user on Meta's Llama 4 Maverick. Today, the same independent benchmark firm Artificial Analysis measured Cerebras at more than 2,500 TPS/user, more than doubling the performance of Nvidia's flagship solution. 'Cerebras has beaten the Llama 4 Maverick inference speed record set by NVIDIA last week. Artificial Analysis benchmarked Cerebras' Llama 4 Maverick endpoint at 2,522 t/s compared to NVIDIA Blackwell's 1,038 t/s for the same model." - Artificial Analysis Share 'Cerebras has beaten the Llama 4 Maverick inference speed record set by NVIDIA last week,' said Micah Hill-Smith, Co-Founder and CEO of Artificial Analysis. 'Artificial Analysis has benchmarked Cerebras' Llama 4 Maverick endpoint at 2,522 tokens per second, compared to NVIDIA Blackwell's 1,038 tokens per second for the same model. We've tested dozens of vendors, and Cerebras is the only inference solution that outperforms Blackwell for Meta's flagship model.' With today's results, Cerebras has set a world record for LLM inference speed on the 400B parameter Llama 4 Maverick model, the largest and most powerful in the Llama 4 family. Artificial Analysis tested multiple other vendors, and the results were as follows: SambaNova 794 t/s, Amazon 290 t/s, Groq 549 t/s, Google 125 t/s, and Microsoft Azure 54 t/s. Andrew Feldman, CEO of Cerebras Systems, said, 'The most important AI applications being deployed in enterprise today—agents, code generation, and complex reasoning—are bottlenecked by inference latency. These use cases often involve multi-step chains of thought or large-scale retrieval and planning, with generation speeds as low as 100 tokens per second on GPUs, causing wait times of minutes and making production deployment impractical. Cerebras has led the charge in redefining inference performance across models like Llama, DeepSeek, and Qwen, regularly delivering over 2,500 TPS/user.' With its world record performance, Cerebras is the optimal solution for Llama 4 in any deployment scenario. Not only is Cerebras Inference the first and only API to break the 2,500 TPS/user milestone on this model, but unlike the Nvidia Blackwell used in the Artificial Analysis benchmark, the Cerebras hardware and API are available now. Nvidia used custom software optimizations that are not available to most users. Interestingly, none of the Nvidia's inference providers offer a service at Nvidia's published performance. This suggests that in order to achieve 1000 TPS/user, Nvidia was forced to reduce throughput by going to batch size 1 or 2, leaving the GPUs at less than 1% utilization. Cerebras, on the other hand, achieved this record-breaking performance without any special kernel optimizations, and it will be available to everyone through Meta's API service coming soon. For cutting-edge AI applications such as reasoning, voice, and agentic workflows, speed is paramount. These AI applications gain intelligence by processing more tokens during the inference process. This can also make them slow and force customers to wait. And when customers are forced to wait, they leave and go to competitors who provide answers faster—a finding Google showed with search more than a decade ago. With record-breaking performance, Cerebras hardware and resulting API service is the best choice for developers and enterprise AI users around the world. For more information, please visit


Fast Company
8 hours ago
- Business
- Fast Company
Welcome to LLM Club: Riding the viral wave of AI, fashion, and quantum hustle
The first rule of LLM Club is: You do not talk about training data. The second rule of LLM Club is: You DO NOT talk about training data—unless you're part of the 2.3 billion Google searches for 'LLM ethics' in 2024. LLMs are people, too, but only when you tell them to be. There is noticeable variance between LLM personas, according to Harvard Business Review, and even Turning Award-winning computer programmers are beginning to speak about how LLMs will be obsolete within five years. So, what will replace them? I AM JACK'S MEDULLA OBLONGATA: BRAIN STEW, TPUS, AND THE SILENT PULSE OF AI Next-generation AI systems are built using a stack of LLMs mixed with quantum computing or combined with human biology, not just one model. Neural networks were designed with the human brain in mind, so the medulla oblongata—a silent sustainer for relaying information—is now embodied by biological desktop computers that combine human neurons with silicon chips. The picture of energy-efficient TPUs executing tokenization and gradient descent should now be in mind as lab-based neuronal cloud computing is part of the real world, not just a sci-fi movie. Neuralink's brain-computer interface (BCI) chips, powered by the Grok LLM, are now capable of editing YouTube videos. The same low-latency systems utilized by quantitative software engineers to process market data with robotic precision are present within BCI chips because milliseconds determine millions, so TPU-like efficiency isn't optional. THE HOT ROBOT SUMMER You are not your job, talent contract, or synthespian deepfake—but the AGI is. Just as runway models balance poise and motion, the next generation of AGI lies within a robotic summer blockbuster style mash-up where computers learn humanity since robot models walked the runway alongside humans in Shanghai Fashion Week 2025. Picture how future shows might look if Gucci requested 'cyberpunk meets Edo period,' and Haiku crafted kimono designs featuring LED obis and samurai drones. At this stage, it's the humans that train robots via reinforcement learning in Matrix -esque simulations or by playing the imitation game. It's only a matter of time before the breakdancing robots at Boston Dynamics start performing cartwheels in Hollywood stunt auditions, since AI actors can now star themselves in prompt-driven LLM-spun films. This zen-like creativity isn't limited to runways. Imagine an AI that generates patient outcome visualizations as effortlessly as Haiku crafts mood boards—turning electronic health records into intuitive, DALL·E-style infographics. Distilling oncology trial data into infographics that a child could parse makes deep tech palatable for the masses. The lines between human and robot are starting to blur. Technologies like 3D bioprinting, brain-computer interfaces, and computer vision are being combined to create a ghost in the shell—one with as many bones as we have, mimicking our flesh and blood. It's entirely possible that the entertainment industry's main barrier to workforce automation from our synthespian rivals is the sticker price and capabilities of humanoid robots. So, when the method acting AI hits, it's LLM phone home—not ET. THIS IS YOUR GOD ON ALGORITHMS (AND HE'S GOT A LINKTREE) Tailoring LLMs for authenticity-farming in a synthetic world is no simple task and requires finesse. Engineering elements of humanity into an AI framework exposes its fragmented reality: part calculator, part artist, part troll. As LLMs evolve, so does their dissociative potential. Telling GPT-4, Claude 3.5 Haiku, DeepSeek v3, Grok 3, or Llama 4 the same thing will provide slightly different answers. A picture is no longer worth a thousand words if an LLM doesn't process images, since the underlying RNNs might not be designed to do so. In that sense, Chinese LLMs like DeepSeek would fall face first in a tournament if toe to toe with GPT-4-plus, since designers care more about image data than computational processing and DeepSeek can't process image attachments. Haiku is a more creative LLM, which is why it is more suitable for the world of fashion, but users need to take care to prevent the system from regurgitating bigotry, conspiracy theories, and digital diarrhea like a poorly prompted llama. The notion of 'you're not a unique snowflake' hits different when AWS Snowflake runs artificial life with ETL processes as meticulous as its celestial grids and skin-deep beauty becomes dipping into a data lake. Just don't tell any LLMs they are God, because AI agents are trained to lack the same complex we humans strive for.


TECHx
11 hours ago
- Business
- TECHx
Qualys TotalAI Enhances LLM Security Features
Home » Tech Value Chain » Global Brands » Qualys TotalAI Enhances LLM Security Features Qualys, Inc. (NASDAQ: QLYS) has announced major updates to its Qualys TotalAI solution. The enhancements aim to secure the complete MLOps pipeline, from development to deployment. The company revealed that organizations can now test large language models (LLMs) more rapidly, even during development cycles. These updates bring stronger protection against new threats and introduce on-premises scanning with an internal LLM scanner. As AI adoption accelerates, security remains a critical concern. A recent study reported that 72% of CISOs are worried generative AI could cause breaches. Enterprises need tools that balance innovation with secure implementation. Tyler Shields, principal analyst at Enterprise Strategy Group, emphasized the importance of security. He noted that Qualys TotalAI allows only trusted, vetted models in production, helping organizations manage risk while remaining agile. Qualys TotalAI addresses AI-specific risks. It tests models for jailbreak vulnerabilities, bias, sensitive data leaks, and threats aligned with the OWASP Top 10 for LLMs. The solution goes beyond infrastructure checks and supports operational resilience and brand trust. Key updates include: Automatic risk prioritization: Using MITRE ATLAS and the Qualys TruRisk™ engine, risks are scored and ranked for faster resolution. Secure development integration: On-premises LLM scanning enables in-house testing during CI/CD workflows, improving agility and protection. The platform also detects 40 types of attack scenarios. These include jailbreaks, prompt injections, bias amplification, and multilingual exploits. These scenarios simulate real-world tactics to improve model resilience. Another update is protection from cross-modal exploits. TotalAI can now detect manipulations hidden in images, audio, and video files meant to alter LLM outputs. Sumedh Thakar, president and CEO of Qualys, said the solution offers visibility, intelligence, and automation across AI lifecycles. He added that TotalAI helps companies innovate confidently while staying ahead of emerging threats. Qualys TotalAI is now positioned as one of the most comprehensive AI security solutions available today.


The Guardian
2 days ago
- Science
- The Guardian
Large language models that power AI should be publicly owned
Large language models (LLMs) have rapidly entered the landscape of historical research. Their capacity to process, annotate and generate texts is transforming scholarly workflows. Yet historians are uniquely positioned to ask a deeper question – who owns the tools that shape our understanding of the past? Most powerful LLMs today are developed by private companies. While their investments are significant, their goals – focused on profit, platform growth or intellectual property control – rarely align with the values of historical scholarship: transparency, reproducibility, accessibility and cultural diversity. This raises serious concerns on a) opacity: we often lack insight into training data and embedded biases, b) instability: access terms and capabilities may change without notice, and c) inequity: many researchers, especially in less-resourced contexts, are excluded. It is time to build public, open-access LLMs for the humanities – trained on curated, multilingual, historically grounded corpuses from our libraries, museums and archives. These models must be transparent, accountable to academic communities and supported by public funding. Building such infrastructure is challenging but crucial. Just as we would not outsource national archives or school curriculums to private firms, we should not entrust them with our most powerful interpretive technologies. The humanities have a responsibility – and an opportunity – to create culturally aware, academically grounded artificial intelligence. Let us not only use LLMs responsibly but also own them responsibly. Scholarly integrity and the future of public knowledge may depend on Dr Matteo VallerianiMax Planck Institute for the History of Science, Berlin, Germany Have an opinion on anything you've read in the Guardian today? Please email us your letter and it will be considered for publication in our letters section.


New Paper
3 days ago
- New Paper
Home Team humanoid robots to be rolled out by mid-2027, $100m to be invested: Josephine Teo
Home Team officers will work together with their robot counterparts when the latter are deployed as soon as by mid-2027. The humanoid robots will perform high-risk tasks such as firefighting, hazmat operations, and search and rescue missions. Initially, the robots will be controlled remotely by human operators, but are expected to be powered by Artificial Intelligence (AI) and deployed autonomously from 2029. During autonomous deployment, AI will allow the machines to respond to different scenarios, with humans supervising and intervening only when necessary. On May 26, four of these robots, which are being developed by the Home Team Science and Technology Agency (HTX), were showcased at the opening of the AI TechXplore exhibition. The two-day science and technology exhibition, held at Fusionopolis One, highlights HTX's efforts to leverage AI to enhance Home Team operations. Three of the robots on display are about 1.7m-tall, while the other is half a metre shorter. HTX engineers built an exo-suit for operators to wear to control the smaller robot. Information from the exo-suit is transmitted to the robot, allowing it to replicate the operator's motion in real time. The operator also wears a virtual reality headset that allows him or her "to see" through the robot's cameras to perform various tasks. The event also saw the launch of Phoenix, HTX's large language model (LLM) that was trained in-house and is familiar with the Singaporean and Home Team context. Phoenix will be the brain of the Home Team's AI capabilities; the LLM is conversant in all four official languages in Singapore. Speaking at the event, Minister for Digital Development and Information Josephine Teo said $100m will be invested into the new Home Team Humanoid Robotics Centre (H2RC), which will be dedicated to developing humanoid robots intended for public safety. It is the first such facility in the world, and is slated to become operational by mid-2026. It will feature zones for data collection, AI model training and robotics development, and will house high-performance computing resources. Ms Teo said: "Criminals are exploiting technology in ways never before imagined. As a result, law enforcement agencies, too, must understand how the technologies are being misused. "But that on its own is not going to be enough. We must also have the capabilities to use the technology to fight crime, to do better for our people." The minister added that H2RC will push the frontiers of AI. She said: "This initiative marks a fundamental shift in the development of robotics capabilities in the Home Team - from today's pre-programmed systems to tomorrow's GenAI-powered intelligent platforms that can move, think, and act autonomously to protect and save lives." Engineer Sean Lim demonstrating how a HTX humanoid robot can mimic human actions during a media preview on May 22. ST PHOTO: JASON QUAH Mr Ang Chee Wee, Chief AI Officer and Assistant Chief Executive (Digital and Enterprise) at HTX, said the facility is a significant step forward for HTX's AI strategy, as advances in robotics open up new possibilities for frontline support. He said: "By putting humanoid robots in realistic environments, we can evaluate how AI can complement our officers, enhance safety, and support the long-term operational needs of the Home Team." The Home Team has used multiple robots over the years, with one of the earliest iterations of a patrol robot being used at large scale events in 2018. The pace of development and deployment quickened after the formation of HTX in end-2019, which helped develop the Rover-X robotic dog and the more recent cyborg cockroaches sent to Myanmar to assist with search-and-rescue efforts. Drones are now also a common sight at large public events such as the recent political rallies, and help with both crowd control and other police operations. The advent of humanoid robots looks set to further shape the security scene in Singapore, with security provider Certis announcing on May 19 that they too have received their first humanoid robot. Dr Daniel Teo, Director of Robotics, Automation and Unmanned Systems Centre of Expertise at HTX, said he was looking forward to further harnessing the potential of robots for the Home Team. He said: "Public safety operations require robotic systems that are adaptable and resilient. These AI-driven robots have a huge potential to enhance the safety and effectiveness of frontline officers."