Latest news with #ChatGPTAgent

Can Comet replace Google Chrome? An in-depth look at Perplexity's new agentic AI browser

Indian Express

2 hours ago

Business

Most people experience generative AI today through chatbots like ChatGPT. However, web browsers are emerging as a potential gateway for everyday user interactions with large language models (LLMs). This shift from standalone chatbots to AI agent-driven web browsers has been underscored by recent product releases such as OpenAI's ChatGPT Agent, which lets AI agents use a browser to surf the web on behalf of the user.

Last month, The Browser Company unveiled Dia, which is essentially a web browser integrated with an AI chatbot. Then there is Comet, Perplexity's desktop browser, which goes a step further by offering users access to an in-built AI agent. Google is reportedly testing a similar Gemini integration in Chrome, while OpenAI is also rumoured to be developing its own AI-powered web browser.

But why are tech companies increasingly focusing on AI-native browsers instead of AI chatbots? It might have something to do with user context. Web browsers offer a better understanding of a user's online activity, such as reading articles, writing emails and shopping online. This information might be useful for developers building AI tools that can automate these tasks.

Perplexity is looking to challenge Google's dominance in web browsing as well as online search. The limited roll-out of Comet comes at a time when both these markets could potentially open up to upstarts like Perplexity, but only if Google is forced to spin off Chrome in the US search antitrust remedies case.

In this context, let's take a look at how Comet is different from Google Chrome, its common use cases, and whether the latest offering could give Perplexity an edge in the rapidly intensifying browser wars.

Comet is an LLM-based web browser that comes with an in-built AI Assistant. Users have the option to link their Google account to the browser in order to transfer all the context and browser extensions from Chrome to Comet.
The browser is built on the Chromium framework, an open-source architecture that is maintained by Google and underpins several other web browsers including Microsoft Edge, Brave, and DuckDuckGo. However, Comet runs on Perplexity's 'answer engine', which is wrapped around foundational LLMs like OpenAI's GPT-4o and Anthropic's Claude 4.0 Sonnet, though it also has its own LLM called Sonar.

Comet can be used to generate summaries of articles and YouTube videos. Users can also ask it to describe an image on their screen or perform deeper research about a particular topic. It is also able to provide AI-generated summaries of all the web pages open as tabs in the browser, as well as compare products on those pages.

'Comet is not just another chatbot. It's an AI-native browser that performs operational tasks, like a silent worker running continuously in the background,' Perplexity CEO Aravind Srinivas was quoted as saying by The Verge.

Comet is available for both Mac and Windows users. However, accessing the browser is currently possible only if you are a Perplexity Max subscriber or on the company's early access waitlist. Access to Comet could be made available to users for free, but the more advanced AI features may continue to be gated behind a subscription tier, Srinivas said in a Reddit AMA held earlier this month.

Firstly, Comet replaces Google Search results with Perplexity's AI 'answer engine'. This means users will receive AI-generated information about what they are looking for instead of seeing a web page with a list of blue links to various websites. Unlike traditional search engines, Perplexity reportedly surfaces links to relevant websites before providing AI-generated responses to users' search queries.

Another key difference between Comet and traditional web browsers is the tabbed interface, which lets users access information quickly. Comet also suggests related content based on what a user is looking at and what they have already read.
To be sure, Google has also been integrating several new features such as AI Overviews and AI Mode in Chrome. Its Gemini chatbot is also accessible through the browser for users in the US. However, what truly sets Comet apart from Chrome and other AI-native web browsers on the market is its agentic capabilities.

Comet has an Assistant button at the top-right corner of the browser that opens up a sidebar with a chat interface. Besides search queries, users can ask the Assistant to carry out certain tasks on their behalf. For instance, you can ask it to write and send an email, unsubscribe from promotional emails, write and publish posts on social media platforms such as LinkedIn, and close all the tabs on the browser, among other activities.

The Assistant within Comet is better at completing tasks on its own when users start their prompt with 'take control of my browser and…', according to a report by The Verge. Comet's Assistant pulls user context from third-party apps. This means that the browser's agentic capabilities work only when users are logged in to these apps. 'You can still take over from the agent and complete [the task] when you feel like it is not able to do it,' Srinivas said.

However, Comet's agentic features may not work reliably for all tasks. 'Some of the more complicated agentic actions like shopping do have a higher failure rate than simpler tasks, but this is actually a limitation of current AI models. So this will only get easier and better in Comet,' a Perplexity spokesperson was quoted as saying.

Perplexity recorded over 780 million user queries in May alone this year, according to Srinivas. While the company's search products have witnessed more than 20 per cent growth month-over-month, AI agents are more compute-intensive than chatbots, which means that they are more expensive to run. These costs can be offset by reaching more paying users.
Perplexity is currently in talks with mobile device makers to pre-install its new Comet browser on smartphones, according to a report by Reuters.

Tahawul Tech

3 hours ago

Business

OpenAI upgrades the ChatGPT platform

OpenAI has introduced a new AI agent for the ChatGPT platform, expanding the chatbot's capabilities to manage more complex and multi-step tasks. The new agentic system builds on the company's earlier features including Operator, which enabled ChatGPT to interact with websites, and its in-house Deep Research offering, designed for carrying out in-depth, multi-step tasks. With the update, these capabilities have been integrated into a single, more powerful agentic system dubbed ChatGPT agent, enabling the model to independently complete complex tasks by switching between reasoning and action.

The update allows ChatGPT to autonomously handle tasks such as preparing slide decks, summarising emails, analysing competitors, or booking travel. The agent can browse websites, run code, interact with APIs and generate editable outputs like spreadsheets or presentations. To deliver these services, the AI agent uses a virtual computer with web access and a suite of tools. It can also connect with third-party apps including Gmail and GitHub, enabling it to retrieve relevant information based on user prompts.

In a statement, OpenAI emphasised the safeguards that accompany the update, including a 'watch mode' for active user supervision of critical tasks like sending emails, and a requirement that ChatGPT agent seek explicit user permission before executing consequential actions.

The upgrade is now available for Pro, Plus and Team subscribers, with enterprise and education users set to receive access in the coming weeks. The chatbot's Operator research tool will also be sunset in the coming weeks as its functionality has been integrated into ChatGPT agent.

The latest update comes as AI agents rapidly gain traction across the tech industry. Major players including Microsoft, Salesforce and Oracle are investing heavily in the technology to enhance productivity and streamline operations.

Source: Mobile World Live

Mint

a day ago

Business

Mint Explainer: Is OpenAI exaggerating the powers of its new ChatGPT Agent?

Leslie D'Monte

OpenAI has flagged the agent as high-risk under its safety framework. Is this just marketing hype or a sign that AI is genuinely becoming more powerful and autonomous?

On Thursday, OpenAI launched its autonomous ChatGPT Agent, a tool that's capable of finding and buying things online, managing your calendar, and booking you an appointment with a doctor. It's essentially a digital assistant that doesn't just provide information but completes actual tasks.

That being said, OpenAI has flagged the agent as high-risk under its safety framework, warning it could potentially be used to create dangerous biological or chemical substances. Is this just marketing hype, timed to build momentum for the launch of GPT-5, or a sign that AI agents are genuinely becoming more powerful and autonomous, akin to the agents who protect the computer-generated world of The Matrix?

What is ChatGPT Agent?

Say you want to rearrange your calendar, find a doctor and schedule an appointment, or research competitors and deliver a report. ChatGPT Agent can now do it for you.

The agent can browse websites, run code, analyse data, and even create slide decks or spreadsheets, all based on your instructions. It combines the strengths of OpenAI's earlier tools, Operator (which could navigate the web) and Deep Research (which could analyse and summarise information), into a single system. You stay in control throughout: ChatGPT asks for permission before doing anything important, and you can stop or take over at any time.
This new capability is available to Pro, Plus, and Team users through the tools dropdown.

How does it work?

ChatGPT Agent uses a powerful set of tools to complete tasks, including a visual browser to interact with websites like a human, a text-based browser for reasoning-heavy searches, a terminal for code execution, and direct application programming interface (API) access. It can also connect to apps such as Gmail or GitHub to fetch relevant information. You can log in to websites within the agent's browser, allowing it to dig deeper into personalised content.

All of this runs on its own virtual computer, which keeps track of context even across multiple tools. The agent can switch between browsers, download and edit files, and adapt its methods to complete tasks quickly and accurately. It's built for back-and-forth collaboration: you can step in anytime to guide or change the task, and ChatGPT can ask for more input when needed. If a task takes time, you'll get updates and a notification on your phone once it's done.

Has OpenAI tested its performance?

OpenAI said that on Humanity's Last Exam (HLE), which tests expert-level reasoning across subjects, ChatGPT Agent achieved a new high score of 41.6, rising to 44.4 when multiple attempts were run in parallel and the most confident response was selected. On FrontierMath, the toughest known math benchmark, the agent scored 27.4% using tools such as a code-executing terminal, far ahead of previous models.

In real-world tasks, ChatGPT Agent performs at or above human levels in about half of the cases, based on OpenAI's internal evaluations. These tasks include building financial models, analysing competitors, and identifying suitable sites for green hydrogen projects. ChatGPT Agent also outperforms others on specialised tests such as DSBench for data science and SpreadsheetBench for spreadsheet editing (45.5% vs Copilot Excel's 20.0%).
On BrowseComp and WebArena, which test browsing skills, the agent achieves the highest scores to date, according to OpenAI.

What are some of the things it can do?

Consider the case of travel planning. The agent won't just suggest ideas but will navigate booking websites, fill out forms, and even make reservations once you give it permission. You can also ask it to read your emails, find meeting invitations, and automatically schedule appointments in your calendar, or even draft and send follow-up emails. This level of coordination typically required juggling between apps, but the agent manages it in a single conversational flow.

Another example involves shopping and price comparison. You can tell the agent to 'order the best-reviewed smartphone under ₹15,000', and it can search online stores, compare prices and reviews, and proceed to checkout on a preferred platform. Customer support and task automation are other examples, where the agent is used to troubleshoot an issue, log into support portals, and even file return or refund requests.

How are AI agents typically built?

Unlike basic chatbots, AI agents are autonomous systems that can plan, reason, and complete complex, multi-step tasks with minimal input, such as coding, data analysis, or generating reports. They are built by combining ways to take in information, think, and take action. Developers begin by deciding what the agent should do, following which the agent collects data, such as text or images, from its environment.

AI agents use large language models (LLMs) like GPT-4 as their core 'brain', which allows them to understand and respond to natural language instructions. To allow AI agents to take action, developers connect the LLM to things like a web browser, code editor, calculator, and APIs for services such as Gmail or Slack. Frameworks like LangChain help integrate these parts and keep track of information. Some AI agents learn from experience and get better over time.
Testing and careful setup make sure they work well and follow rules.

Does ChatGPT Agent have credible competition?

Google's Project Astra, part of its Gemini AI line, is developing a multimodal assistant that can see, hear, and respond in real time. Gemini CLI is an open-source AI agent that brings Google's Gemini model directly to the terminal for fast, lightweight access. It integrates with Gemini Code Assist, offering developers on all plans AI-powered coding in both VS Code and the command line.

Microsoft is embedding Copilot into Windows, Office, and Teams, giving its agent access to workflows, system controls, and productivity tools, soon to be enhanced by a dedicated Copilot Runtime. Meta is building more socially focused agents within messaging and the metaverse, which could evolve into utility tools. Apple is revamping Siri through Apple Intelligence, combining GPT-level reasoning with strict privacy features and deep on-device integration.

Other smart agents include Oracle's Miracle Agent, IBM's Watson tools, Agentforce from Salesforce, Anthropic's Claude 3.5, and Perplexity AI's action-oriented agents through its Comet project, blending search with agentic behaviour. The competitive advantage, though, may go to companies that can integrate these AI agents into everyday applications and trigger actions through a single, unified tool, something that ChatGPT Agent has demonstrated.

Why did OpenAI warn that ChatGPT Agent could be used to trigger biological warfare?

OpenAI claimed ChatGPT Agent's superior capabilities could, in theory, be misused to help someone create dangerous biological or chemical substances. However, it clarified that there was no solid evidence it could actually do so. Regardless, OpenAI is activating the highest level of safety measures under its internal 'preparedness framework'.
These include thorough threat modeling to anticipate potential misuse, special training to ensure the model refuses harmful requests, and constant monitoring using automated systems that watch for risky behaviour. There are also clear procedures in place for suspicious activity.

Should we take this risk seriously?

Ja-Nae Duane, an AI expert, MIT Research Fellow and co-author of SuperShifts, said the more autonomous the agent, the more permissions and access rights it would require. For example, buying a dress requires wallet access; scheduling an event requires calendar and contact list access.

'While standard ChatGPT already presents privacy risks, the risks from ChatGPT Agent are exponentially higher because people will be granting it access rights to external tools containing personal information (like calendar, email, wallet, and more). There's a significant gap between the pace of AI development and AI literacy; many people haven't even fully understood ChatGPT's existing privacy risks, and now they're being introduced to a feature with exponentially more risks,' she said.

Duane added that the key risks included data leaks, mistaken actions, prompt injection, and account compromise, especially when handling sensitive information. Malicious actors, she warned, could exploit these agents by manipulating inputs, abusing tool access, stealing credentials, or poisoning data to bias outputs. Poor third-party integration and an over-reliance on such integrations could worsen the impact, while the agent's 'black box' nature would make it hard to trace errors, she added. In the wrong hands, these agents could be weaponised for fraud, phishing, or even to generate malware.

What are the other concern areas for enterprises?

Developers are increasingly deploying AI agents across IT, customer service, and enterprise workflows.
According to Nasscom, 46% of Indian firms are experimenting with these agents, particularly in IT, HR, and finance, while manufacturing leads in robotics, quality control, and automation. Beyond concerns around hallucinations, security, privacy, and copyright or intellectual property (IP) violations, a key challenge for businesses is ensuring a return on investment.

Gartner noted that many so-called agentic use cases could be handled by simpler tools and predicted that more than 40% of such projects would be scrapped by 2027 over high costs, unclear value, or inadequate risk controls. Of the thousands of vendors in this space, only around 130 are seen as credible; many engage in 'agent washing' by repackaging chatbots, robotic process automation (RPA), or basic assistants as autonomous agents. Nasscom corroborated these concerns, highlighting that 62% of enterprises were still only testing agents in-house.

Why is 'humans-in-the-loop' a must?

OpenAI CEO Sam Altman advised granting agents only the minimum access needed for each task, not blanket permissions. Nasscom believes that to scale responsibly, enterprises must prioritise human-AI collaboration, trust, and data readiness. It has recommended firms adopt AI agents with a 'human-in-the-loop' approach, reflecting the need for oversight and contextual judgment.

According to Duane, users must understand both the tool's strengths and its limits, especially when handling sensitive data. Caution is key, as misuse could have serious consequences. She also emphasised the importance of AI literacy, noting that AI was evolving far faster than most people's understanding of how to use it responsibly.
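
To make the explainer's description of how agents are built more concrete, here is a minimal Python sketch of the pattern it outlines: a planning 'brain' connected to tools, with a human approving consequential actions. It is an illustration under stated assumptions, not OpenAI's implementation. The names `Tool`, `TOOLS`, `scripted_planner`, `run_agent` and `approve` are hypothetical, and a fixed script stands in for the LLM so that the example runs without any external service.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

Step = Tuple[str, str]  # (tool name or "finish", argument)


@dataclass
class Tool:
    run: Callable[[str], str]
    consequential: bool = False  # True means: ask the user before running


def add_numbers(arg: str) -> str:
    """Toy calculator tool: sums whitespace-separated integers."""
    return str(sum(int(x) for x in arg.split()))


def send_email(arg: str) -> str:
    """Toy stand-in for an email API; nothing is actually sent."""
    return f"sent: {arg}"


# Hypothetical tool registry; only the email tool needs approval.
TOOLS: Dict[str, Tool] = {
    "add": Tool(add_numbers),
    "send_email": Tool(send_email, consequential=True),
}


def scripted_planner(task: str, history: List[str]) -> Step:
    """Assumed stand-in for the LLM: returns the next step from a fixed script."""
    script: List[Step] = [
        ("add", "2 3"),
        ("send_email", "total is 5"),
        ("finish", "done"),
    ]
    return script[len(history)]


def run_agent(
    task: str,
    planner: Callable[[str, List[str]], Step],
    tools: Dict[str, Tool],
    approve: Callable[[str, str], bool],
    max_steps: int = 10,
) -> List[str]:
    """Plan, act, record; consequential tools run only if the user approves."""
    history: List[str] = []
    for _ in range(max_steps):
        name, arg = planner(task, history)
        if name == "finish":
            break
        tool = tools[name]
        if tool.consequential and not approve(name, arg):
            history.append(f"{name}: blocked by user")
        else:
            history.append(f"{name}: {tool.run(arg)}")
    return history


if __name__ == "__main__":
    # The user declines every consequential action in this run.
    print(run_agent("demo", scripted_planner, TOOLS, approve=lambda n, a: False))
    # ['add: 5', 'send_email: blocked by user']
```

In this sketch the approval check sits in the loop rather than inside each tool, so a newly added tool is gated simply by marking it `consequential`, which mirrors the minimum-access and human-in-the-loop advice quoted above.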

Meet Yash Kumar, The Lead Behind OpenAI's ChatGPT Agent That Works By Self

NDTV

2 days ago

Business

OpenAI recently introduced a new capability in its chatbot ChatGPT: the ChatGPT Agent, which can semi-autonomously perform digital tasks without requiring constant user intervention. The project is currently being led by Yash Kumar, a graduate of the Indian Institute of Technology (IIT), Hyderabad.

The new ChatGPT Agent is designed to handle tasks from start to finish, whether it's preparing meal plans, booking tickets, or summarizing meetings on your calendar. The agent can think, plan, and act on the user's behalf using its own virtual computer. It switches between browsers, applications, and terminals to complete tasks end-to-end. The tool is now available to users, but it remains a work in progress and continues to be improved.

All About Yash Kumar: Educational and Corporate Background

Yash Kumar, an Indian-origin technologist, is a Member of Technical Staff at OpenAI's San Francisco headquarters and currently serves as the project lead for the ChatGPT Agent. He joined OpenAI in 2023 and previously worked at several well-known companies, including Google, where he spent eight years, and DoorDash. He earned his Bachelor's degree in Computer Science from IIT Hyderabad in 2011.

In his previous roles, Yash primarily led engineering teams, including serving as head of the merchant engineering division at DoorDash and overseeing engineering, product, and design teams at Scratch.

While the ChatGPT Agent is nearly autonomous, it still requires user approval for high-stakes actions such as purchases or logging into certain websites.

Weekly Tech Recap: Perplexity Pro goes free for Airtel users, ChatGPT agent launched and more

Mint

2 days ago

Business

With news coming in throughout the week, it can be difficult to sift the important updates from the noise. To keep readers up to date, we've compiled the Weekly Tech Recap, where we take a look at the top news that shook up the world of technology. This week, OpenAI unveiled ChatGPT Agent, Perplexity announced free Pro subscriptions for Airtel users, OnePlus revealed sale details for the Pad 3, and more.

Earlier in the week, Airtel announced that its prepaid and postpaid customers will be able to enjoy a Perplexity Pro subscription for free for a year, a service that costs upwards of ₹17,000 a year. Notably, a Perplexity Pro subscription gives users access to the latest AI models from OpenAI, Google, Anthropic and xAI. Soon after the offer was announced, Perplexity gained popularity in India and the app was ranked number 1 in the top free app category on the Apple App Store, dethroning ChatGPT.

OnePlus has announced that its latest tablet, the OnePlus Pad 3, will go on sale in India in September. The tablet was first teased during the OnePlus 13s launch earlier in the year, but while it was unveiled in global markets, an India launch date remained elusive. OnePlus has confirmed the specs for the Pad 3, but the pricing and exact sale date have not been announced for now.

OpenAI has introduced a new AI agent called the 'ChatGPT agent', which integrates more deeply with the company's chatbot of the same name. The ChatGPT agent is a general-purpose agentic system that can take actions on a user's behalf. With the new ChatGPT agent, the chatbot can handle tasks such as accessing the user's calendar, shopping on their behalf, creating spreadsheets, and browsing the web to gather information. It can also automate repetitive tasks, such as converting screenshots or dashboards into presentations, planning and booking offsites, updating spreadsheets with new data, and more.
OpenAI says the ChatGPT agent brings together the best of its Operator and Deep Research agents, allowing ChatGPT agent to transition naturally from a simple conversation to executing actions within the same chat.

Privacy-focused search engine DuckDuckGo has introduced a new feature that allows users to block AI-generated images in their search results. However, the feature isn't enabled by default. On the DuckDuckGo website, a new 'AI Images' option now appears in the search menu. By default, it is set to 'Show', but users can manually click on the menu and set it to 'Hide' to reduce the number of AI-generated images they see. DuckDuckGo admits the feature won't catch every AI-generated image, but it says their frequency should drop significantly. To power the feature, DuckDuckGo relies on open-source blocklists like the 'nuclear' list from uBlock Origin and the uBlacklist Huge AI Blacklist.

Netflix has confirmed that it used generative AI for the first time to create a scene in one of its original shows. Netflix co-CEO Ted Sarandos described the technology as an 'incredible opportunity to help creators make films and series better, not just cheaper.' Sarandos said everyone involved with the project was 'thrilled' with the result generated by AI.

Since generative AI gained prominence in late 2022, there has been significant anxiety across industries about its impact on jobs, with creative fields expected to be among the most affected. In 2023, a Hollywood actors and writers strike led to new guidelines around the use of AI in productions.
