
Latest news with #GameArena

OpenAI beats Grok in chess

Tahawul Tech

3 days ago



OpenAI's o3 model defeated xAI's Grok 4 in a chatbot chess tournament, a contest that helped demonstrate the advanced capabilities of everyday interactive agents and took place nearly 30 years after a machine first beat a grandmaster. The o3 model also beat OpenAI's o4-mini on its way to the final against Grok 4 on Google's Kaggle Game Arena, which it won by four games to nought.

A face-off between Google's Gemini 2.5 Pro and o4-mini for third place went the way of the search giant's AI in a match where one game was drawn. Kaggle explained that a seeding system was used to ensure top-tier chatbots did not meet before the final. Matches were streamed, with details of each model's reasoning displayed and the rate of play optimised for viewing.

The results are significant because Kaggle's Game Arena is a benchmarking platform that uses games to measure the performance of leading AI models, an approach it explained offers 'a clear, unambiguous signal of success'. Kaggle added that games push AI models to demonstrate skills spanning 'strategic reasoning, long-term planning and dynamic adaptation'.

Bloomberg noted that the chess abilities of the AI models in the contest fall well short of the capabilities of IBM supercomputers, which are now at a stage of teaching themselves complex games. BBC News qualified this, however, by highlighting that the chatbots are everyday tools not built from the ground up for a complex game.

Grandmaster battle

In 1996, IBM's Deep Blue became the first computer to win a chess game against a grandmaster. The machine ultimately lost that contest against Garry Kasparov, but took the overall win in a rematch the following year. Deep Blue used 32 processors and was capable of evaluating 200 million chess positions per second. IBM states the machine was central to advancing the ability of supercomputers to take on the complex calculations needed in the pharmaceutical and financial sectors, along with analysing huge datasets and performing human gene research.
Source: Mobile World Live
Image credit: OpenAI

Magnus Carlsen grills Elon Musk's Grok 4 as OpenAI's o3 crushes it 4-0: 'Learnt theory, knows nothing else'

First Post

7 days ago



Grok 4 was the strongest among eight AI models competing in the exhibition tournament hosted on Google's Kaggle Game Arena, storming into the final only to suffer a 0-4 thrashing at the hands of o3.

Magnus Carlsen couldn't help but slam Grok 4, developed by the Elon Musk-owned xAI, for its performance in a chess game against OpenAI's o3. (Image: Grand Chess Tour/Reuters)

Chess world No 1 Magnus Carlsen did not hold back while roasting Elon Musk after his AI model Grok 4 was thrashed 4-0 by Sam Altman-founded OpenAI's o3 during an AI chess exhibition tournament.

Grok 4 was among eight general-purpose large language models competing on Google's Kaggle Game Arena, alongside Google's Gemini 2.5 Pro and Gemini 2.5 Flash, OpenAI's o3 and o4-mini, Claude 4 Opus (Anthropic), DeepSeek R1 and Kimi K2 (Moonshot AI).

And while Grok 4 did go the distance by reaching the final, it made a series of questionable decisions against o3 that did not escape the attention of Carlsen, who was commentating on the game along with British Grandmaster David Howell on Take Take Take.

'This is like watching kids' games. In those tournaments you always play them out,' Carlsen told Howell when Grok 4 was trailing 0-3 and was about to play the fourth and final game even though o3's lead was already unassailable. 'Hope everyone feels better about their games after watching this,' he added.

The Norwegian GM then went on to compare Grok to a player who was all theory but had few other skills.

'Oh my God. Nooo. Nooo. The combination of knight g4 and queen a2… Yeah now you clearly have somebody… there's always that one guy in the club as well who's learned theory and literally nothing else. Just makes the worst blunders after that. Come on!' Carlsen continued during the commentary.

"Oh god, nooo, nooo." @grok shows off with a classic Botez Gambit in game 2.
Take Take Take (@TakeTakeTakeApp), August 7, 2025

Carlsen compares Grok and o3's style of play

Carlsen also couldn't help but compare the two AI models and their way of calculating moves.

'I'm not loving the fact that o3 didn't prepare for the game. It's literally going to the board, "What am I going to play? What am I going to play? I'm going to play d6, e6, knight c6, maybe g6, oh knight f6, maybe I'm going to do that. Okay I'll play d6." And then Grok is just like, "I don't know, d4",' Carlsen said during commentary.

'You basically summed up the two types of chess player out there in the world, right?' said Howell in reply.

Grok 4 was the most dominant of the eight AI models participating in the AI chess event right up until the final, where it committed a series of blunders starting with the opening game itself.

As for the third spot, Gemini 2.5 Pro defeated o4-mini 3.5-0.5 to win the bronze medal.
