
Latest news with #KevinWeil

OpenAI brings GPT-4.1 and 4.1 mini to ChatGPT — what enterprises should know

Business Mayor

15-05-2025



OpenAI is rolling out GPT-4.1, its new non-reasoning large language model (LLM) that balances high performance with lower cost, to users of ChatGPT. The company is beginning with its paying subscribers on ChatGPT Plus, Pro, and Team; Enterprise and Education access is expected in the coming weeks. It is also adding GPT-4.1 mini, which replaces GPT-4o mini as the default for all ChatGPT users, including those on the free tier. The 'mini' version is a smaller, less powerful model that meets similar safety standards.

Both models are available via the 'more models' dropdown in the top corner of the ChatGPT chat window, giving users the flexibility to choose between GPT-4.1, GPT-4.1 mini, and reasoning models such as o3, o4-mini, and o4-mini-high.

Initially intended only for third-party software and AI developers through OpenAI's application programming interface (API), GPT-4.1 was added to ChatGPT following strong user feedback. OpenAI post-training research lead Michelle Pokrass confirmed on X that the shift was driven by demand, writing: 'we were initially planning on keeping this model api only but you all wanted it in chatgpt 🙂 happy coding!' OpenAI Chief Product Officer Kevin Weil posted on X: 'We built it for developers, so it's very good at coding and instruction following—give it a try!'

GPT-4.1 was designed from the ground up for enterprise-grade practicality. Launched in April 2025 alongside GPT-4.1 mini and nano, this model family prioritized developer needs and production use cases. GPT-4.1 delivers a 21.4-point improvement over GPT-4o on the SWE-bench Verified software engineering benchmark, and a 10.5-point gain on instruction-following tasks in Scale's MultiChallenge benchmark.
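Since GPT-4.1 began as an API-only model, the most direct way to try it outside ChatGPT is through OpenAI's official Python SDK. The sketch below assembles a chat-completion request for a coding task; the model id `gpt-4.1` matches OpenAI's API naming, but the prompts and the `build_request` helper are illustrative assumptions, and the network call only runs when an `OPENAI_API_KEY` is configured.

```python
import os

def build_request(task: str) -> dict:
    """Assemble a chat-completion payload for a GPT-4.1 coding query.

    The system prompt and temperature here are illustrative choices,
    not values recommended by OpenAI.
    """
    return {
        "model": "gpt-4.1",
        "messages": [
            {"role": "system", "content": "You are a concise coding assistant."},
            {"role": "user", "content": task},
        ],
        "temperature": 0.2,  # low temperature for more deterministic code output
    }

payload = build_request("Write a Python function that reverses a linked list.")

# Only hit the API when a key is configured (requires `pip install openai`).
if os.environ.get("OPENAI_API_KEY"):
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    response = client.chat.completions.create(**payload)
    print(response.choices[0].message.content)
```

The payload-building step is separated from the network call so the same request can be pointed at GPT-4.1 mini simply by swapping the model id.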
GPT-4.1 also reduces verbosity by 50% compared to other models, a trait enterprise users praised during early testing.

Context, speed, and model access

GPT-4.1 supports the standard context windows for ChatGPT: 8,000 tokens for free users, 32,000 tokens for Plus users, and 128,000 tokens for Pro users. According to developer Angel Bogado, posting on X, these limits match those used by earlier ChatGPT models, though plans are underway to increase context size further. While the API versions of GPT-4.1 can process up to one million tokens, this expanded capacity is not yet available in ChatGPT, though future support has been hinted at. The extended context allows API users to feed entire codebases or large legal and financial documents into the model, useful for reviewing multi-document contracts or analyzing large log files. OpenAI has acknowledged some performance degradation with extremely large inputs, but enterprise test cases suggest solid performance up to several hundred thousand tokens.

OpenAI has also launched a Safety Evaluations Hub website to give users access to key performance metrics across models. GPT-4.1 shows solid results across these evaluations. In factual accuracy tests, it scored 0.40 on the SimpleQA benchmark and 0.63 on PersonQA, outperforming several predecessors. It also scored 0.99 on OpenAI's 'not unsafe' measure in standard refusal tests, and 0.86 on more challenging prompts. However, in the StrongReject jailbreak test, an academic benchmark for safety under adversarial conditions, GPT-4.1 scored 0.23, behind models like GPT-4o-mini and o3. That said, it scored a strong 0.96 on human-sourced jailbreak prompts, indicating more robust real-world safety under typical use. In instruction adherence, GPT-4.1 follows OpenAI's defined hierarchy (system over developer, developer over user messages) with a score of 0.71 for resolving system vs. user message conflicts.
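The per-tier context limits reported above lend themselves to a simple pre-flight check before submitting a prompt. This sketch hard-codes the limits as the article states them (8K free, 32K Plus, 128K Pro, 1M via the API) and uses a rough characters-per-token heuristic rather than a real tokenizer; both the figures and the estimate are assumptions for illustration, not an official table.

```python
# Context-window limits as reported for ChatGPT tiers and the API.
CONTEXT_LIMITS = {
    "free": 8_000,
    "plus": 32_000,
    "pro": 128_000,
    "api": 1_000_000,
}

def estimate_tokens(text: str) -> int:
    """Rough estimate: ~4 characters per token for English text."""
    return max(1, len(text) // 4)

def fits_context(text: str, tier: str, reserve_for_output: int = 1_000) -> bool:
    """Check whether a prompt plus an output budget fits the tier's window."""
    return estimate_tokens(text) + reserve_for_output <= CONTEXT_LIMITS[tier]

# A stand-in for a large paste, e.g. a small codebase dumped into the prompt.
prompt = "def reverse(xs):\n    ..." * 2000

print(fits_context(prompt, "free"))  # False: ~13K tokens exceeds the 8K window
print(fits_context(prompt, "api"))   # True: well under the one-million-token limit
```

A production version would swap `estimate_tokens` for a real tokenizer, since the 4-characters-per-token rule degrades badly on code and non-English text.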
GPT-4.1 also performs well in safeguarding protected phrases and avoiding solution giveaways in tutoring scenarios.

Contextualizing GPT-4.1 against predecessors

The release of GPT-4.1 comes after scrutiny of GPT-4.5, which debuted in February 2025 as a research preview. That model emphasized better unsupervised learning, a richer knowledge base, and reduced hallucinations (falling from 61.8% in GPT-4o to 37.1%). It also showcased improvements in emotional nuance and long-form writing, but many users found the enhancements subtle. Despite these gains, GPT-4.5 drew criticism for its high price (up to $180 per million output tokens via the API) and for underwhelming performance in math and coding benchmarks relative to OpenAI's o-series models. Industry figures noted that while GPT-4.5 was stronger in general conversation and content generation, it underperformed in developer-specific applications.

By contrast, GPT-4.1 is intended as a faster, more focused alternative. While it lacks GPT-4.5's breadth of knowledge and extensive emotional modeling, it is better tuned for practical coding assistance and adheres more reliably to user instructions.

On OpenAI's API, GPT-4.1 is currently priced at $2.00 per million input tokens, $0.50 per million cached input tokens, and $8.00 per million output tokens. For those seeking a balance between speed and intelligence at a lower cost, GPT-4.1 mini is available at $0.40 per million input tokens, $0.10 per million cached input tokens, and $1.60 per million output tokens. Google's Gemini Flash-Lite and Flash models start at $0.075–$0.10 per million input tokens and $0.30–$0.40 per million output tokens, less than a tenth of GPT-4.1's base rates. But while GPT-4.1 is priced higher, it offers stronger software engineering benchmarks and more precise instruction following, which may be critical for enterprise deployment scenarios that require reliability over cost.
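The per-token rates quoted above translate directly into monthly bills. The sketch below compares GPT-4.1, GPT-4.1 mini, and the low end of Gemini Flash-Lite's range for a hypothetical workload; the rates are those reported in the article, while the workload volumes and the Flash-Lite cached rate are made-up assumptions for illustration.

```python
# Per-million-token rates (USD) as quoted in the article. Gemini's cached
# rate is assumed equal to its input rate, since no cached price is given.
RATES = {
    "gpt-4.1":           {"input": 2.00,  "cached": 0.50,  "output": 8.00},
    "gpt-4.1-mini":      {"input": 0.40,  "cached": 0.10,  "output": 1.60},
    "gemini-flash-lite": {"input": 0.075, "cached": 0.075, "output": 0.30},
}

def monthly_cost(model: str, input_m: float, output_m: float,
                 cached_m: float = 0.0) -> float:
    """Cost in USD for token volumes given in millions of tokens.

    Cached tokens are billed at the cached rate instead of the input rate.
    """
    r = RATES[model]
    return ((input_m - cached_m) * r["input"]
            + cached_m * r["cached"]
            + output_m * r["output"])

# Hypothetical workload: 50M input tokens (20M of them cache hits), 10M output.
for model in RATES:
    print(f"{model}: ${monthly_cost(model, 50, 10, cached_m=20):,.2f}")
```

For this workload the arithmetic gives $150.00 for GPT-4.1 versus $30.00 for GPT-4.1 mini, so even with heavy caching the full model runs about five times the mini's cost at identical volume.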
Ultimately, OpenAI's GPT-4.1 delivers a premium experience for precision and development performance, while Google's Gemini models appeal to cost-conscious enterprises needing flexible model tiers and multimodal capabilities.

The introduction of GPT-4.1 brings specific benefits to enterprise teams managing LLM deployment, orchestration, and data operations:

  • AI engineers overseeing LLM deployment can expect improved speed and instruction adherence. For teams managing the full LLM lifecycle—from model fine-tuning to troubleshooting—GPT-4.1 offers a more responsive and efficient toolset. It is particularly suitable for lean teams under pressure to ship high-performing models quickly without compromising safety or compliance.
  • AI orchestration leads focused on scalable pipeline design will appreciate GPT-4.1's robustness against most user-induced failures and its strong performance in message hierarchy tests, which makes it easier to integrate into orchestration systems that prioritize consistency, model validation, and operational reliability.
  • Data engineers responsible for maintaining high data quality and integrating new tools will benefit from GPT-4.1's lower hallucination rate and higher factual accuracy. Its more predictable output behavior aids in building dependable data workflows, even when team resources are constrained.
  • IT security professionals tasked with embedding security across DevOps pipelines may find value in GPT-4.1's resistance to common jailbreaks and its controlled output behavior. While its academic jailbreak resistance score leaves room for improvement, the model's high performance against human-sourced exploits helps support safe integration into internal tools.

Across these roles, GPT-4.1's positioning as a model optimized for clarity, compliance, and deployment efficiency makes it a compelling option for mid-sized enterprises looking to balance performance with operational demands.

While GPT-4.5 represented a scaling milestone in model development, GPT-4.1 centers on utility. It is not the most expensive or the most multimodal model, but it delivers meaningful gains in areas that matter to enterprises: accuracy, deployment efficiency, and cost. This repositioning reflects a broader industry trend, away from building the biggest models at any cost and toward making capable models more accessible and adaptable. GPT-4.1 meets that need, offering a flexible, production-ready tool for teams trying to embed AI deeper into their business operations. As OpenAI continues to evolve its model offerings, GPT-4.1 represents a step forward in democratizing advanced AI for enterprise environments. For decision-makers balancing capability with ROI, it offers a clearer path to deployment without sacrificing performance or safety.

Cisco Appoints Kevin Weil to its Board of Directors

Yahoo

14-05-2025



News Summary:
  • Cisco appoints Kevin Weil to its board of directors, effective May 12, 2025
  • Weil's significant expertise in AI, technology innovation, and product leadership will bring valuable perspective to Cisco's board
  • Weil currently serves as Chief Product Officer at OpenAI

SAN JOSE, Calif., May 13, 2025 /PRNewswire/ -- Cisco (NASDAQ: CSCO) today announced the appointment of Kevin Weil to its board of directors, effective May 12, 2025.

"Kevin has an exceptional track record of scaling products to deliver meaningful business value for customers," said Chuck Robbins, chair and CEO of Cisco. "We are excited to leverage his deep expertise in AI and product innovation to guide our strategic initiatives and accelerate Cisco's growth."

Weil brings to Cisco extensive expertise in product leadership and technology innovation from notable roles at industry-leading organizations. In his current role as Chief Product Officer at OpenAI, Weil turns groundbreaking artificial intelligence research into practical, impactful solutions for more than 500 million users, businesses, and developers globally. Prior to joining OpenAI, Weil served as president of product and business at Planet Labs, cofounder of the Libra cryptocurrency, vice president of product at Instagram, and senior vice president of product at Twitter.

"I'm thrilled to join the board of directors at Cisco, a company building the infrastructure necessary to fulfill the promise of AI," Weil said. "I look forward to working with Chuck and the Cisco team to drive impactful solutions and navigate the exciting opportunities ahead."

In addition to Cisco, Weil serves on the board of The Nature Conservancy. He earned his Master of Science in Physics from Stanford University and graduated summa cum laude with a Bachelor of Arts in Physics and Mathematics from Harvard University.
About Cisco

Cisco (NASDAQ: CSCO) is the worldwide technology leader that is revolutionizing the way organizations connect and protect in the AI era. For more than 40 years, Cisco has securely connected the world. With its industry-leading AI-powered solutions and services, Cisco enables its customers, partners, and communities to unlock innovation, enhance productivity, and strengthen digital resilience. With purpose at its core, Cisco remains committed to creating a more connected and inclusive future for all. Discover more on The Newsroom and follow us on X at @Cisco.

Cisco and the Cisco logo are trademarks or registered trademarks of Cisco and/or its affiliates in the U.S. and other countries. Third-party trademarks mentioned are the property of their respective owners. The use of the word 'partner' does not imply a partnership relationship between Cisco and any other company.

SOURCE Cisco Systems, Inc.

OpenAI is reportedly in talks to buy Windsurf for $3B, with news expected later this week

Yahoo

16-04-2025



Windsurf, the maker of a popular AI coding assistant and editor, is in talks to be acquired by OpenAI for about $3 billion, Bloomberg reported. If the deal happens, it would put OpenAI in direct competition with a number of other AI coding assistant providers, including Anysphere, the maker of Cursor, which OpenAI backed through its OpenAI Startup Fund.

The acquisition could jeopardize the credibility of the OpenAI Startup Fund, given that it is one of Cursor's biggest investors, said a person familiar with Cursor's cap table. It's not clear whether OpenAI approached Cursor about an acquisition.

In addition to what sources told Bloomberg, there are a few more clues that something is going on between the two companies. A couple of days ago, Windsurf users received an email saying that, because of an announcement later this week, they have the option to lock in access to the coding editor at $10 a month. And OpenAI chief product officer Kevin Weil released a video yesterday praising Windsurf's capabilities.

Windsurf, the company formerly known as Codeium, has been in talks to raise fresh funds at a $2.85 billion valuation led by Kleiner Perkins, TechCrunch reported in February. The company has reached about $40 million in annualized recurring revenue (ARR), according to our reporting. That revenue run rate is much lower than Cursor's, which reportedly makes $200 million on an ARR basis. Cursor has been in talks to raise capital at about a $10 billion valuation, Bloomberg reported last month.

Since its founding in 2021 by Varun Mohan and his childhood friend and fellow MIT grad Douglas Chen, Windsurf has raised $243 million from investors including Greenoaks Capital and General Catalyst, according to PitchBook data.

Additional reporting by Sarah Perez.

OpenAI's New GPT 4.1 Models Excel at Coding

WIRED

14-04-2025



Apr 14, 2025 1:40 PM

GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano are all available now, and will help OpenAI compete with Google and Anthropic.

OpenAI announced today that it is releasing a new family of artificial intelligence models optimized to excel at coding, as it ramps up efforts to fend off increasingly stiff competition from companies like Google and Anthropic. The models are available to developers through OpenAI's application programming interface (API). OpenAI is releasing three sizes of models: GPT-4.1, GPT-4.1 Mini, and GPT-4.1 Nano.

Kevin Weil, chief product officer at OpenAI, said on a livestream that the new models are better than OpenAI's most widely used model, GPT-4o, and better in some ways than its largest and most powerful model, GPT-4.5. GPT-4.1 scored 55 percent on SWE-bench, a widely used benchmark for gauging the prowess of coding models, several percentage points above that of other OpenAI models. The new models are 'great at coding, they're great at complex instruction following, they're fantastic for building agents,' Weil said.

The capacity of AI models to write and edit code has improved significantly in recent months, enabling more automated ways of prototyping software and improving the abilities of so-called AI agents. In the past few months, rivals like Anthropic and Google have both introduced models that are especially good at writing code.

The arrival of GPT-4.1 had been widely rumored in recent weeks. Sources say OpenAI tested the model on some popular leaderboards under the pseudonym Alpha Quasar, and some users of the 'stealth' model reported impressive coding abilities. 'Quasar fixed all the open issues I had with other code genarated [sic] via llms's which was incomplete,' one person wrote on Reddit.
'Developers care a lot about coding and we've been improving our model's ability to write functional code,' Michelle Pokrass, who works on post-training at OpenAI, said during the Monday livestream. 'We've been working on making it follow different formats and better explore repos, run unit tests and write code that compiles.'

Over the past couple of years, OpenAI has parlayed feverish interest in ChatGPT, a remarkable chatbot first unveiled in late 2022, into a growing business selling access to more advanced chatbots and AI models. In a TED interview last week, OpenAI CEO Sam Altman said that the company had 500 million weekly active users and that usage was 'growing very rapidly.'

OpenAI now offers a smorgasbord of models with different capabilities and different pricing. The company's largest and most powerful model, GPT-4.5, launched in February, though OpenAI called the launch a 'research preview' because the product is still experimental. The company also offers models called o1 and o3 that are capable of performing a simulated kind of reasoning, breaking a problem down into parts in order to solve it. These models take longer to respond to queries and are more expensive for users.

ChatGPT's success has inspired an army of imitators, and rival AI players have ramped up their research investments in an effort to catch up to OpenAI. A report on the state of AI published by Stanford University this month found that models from Google and DeepSeek now have capabilities similar to those of OpenAI's models. It also showed a gaggle of other firms, including Anthropic, Meta, and the French firm Mistral, in close pursuit. Oren Etzioni, a professor emeritus at the University of Washington who previously led the Allen Institute for AI (AI2), says it is unlikely that any single model or company will be dominant in the future.
'We will see even more models over time as cost drops, open source increases, and specialized models win out in different arenas including biology, chip design, and more,' he says. Etzioni adds that he would like to see companies focus on reducing the cost and environmental impact of training powerful models in the years ahead.

OpenAI faces pressure to show that it can build a sustained and profitable business by selling access to its AI models to other companies. The company's chief operating officer, Brad Lightcap, told CNBC in February that the company had more than 400 million weekly active users, a 30 percent increase from December 2024. But the company is still losing billions as it invests heavily in research and infrastructure. In January, OpenAI announced that it would create a new company called Stargate in collaboration with SoftBank, Oracle, and MGX; the group collectively promised to invest $500 billion in new AI datacenter infrastructure.

In recent weeks, OpenAI has teased a flurry of new models and features. Last week, Altman announced that ChatGPT would receive a memory upgrade allowing the chatbot to better remember and refer back to previous conversations. In late March, Altman announced that OpenAI plans to release an open-weight model in the summer, which developers will be able to download and modify for free; the company said it would begin testing the model in the coming weeks. Open-weight models are already popular with researchers, developers, and startups because they can be tailored for different uses and are often cheaper to run.

This is a developing story. Please check back for updates.
