AI isn't just standing by. It's doing things — without guardrails

Just two and a half years after OpenAI stunned the world with ChatGPT, AI is no longer only answering questions — it is taking actions. We are now entering the era of AI agents, in which large language models don't just passively provide information in response to your queries; they actively go out into the world and do things for — or potentially against — you.
AI has the power to write essays and answer complex questions, but imagine if you could enter a prompt and have it make a doctor's appointment based on your calendar, or book a family flight with your credit card, or file a legal case for you in small claims court.
An AI agent submitted this op-ed. (I did, however, write the op-ed myself, because I figured the Los Angeles Times wouldn't publish an AI-generated piece, and besides, I can put in random references, like the fact that I'm a Cleveland Browns fan, because no AI would ever admit to that.)
I instructed my AI agent to find out what email address The Times uses for op-ed submissions and what the submission requirements are, then to draft an email subject line, write an eye-catching pitch paragraph, attach my op-ed and submit the package. I pressed 'return,' then 'monitor task' and 'confirm.' The AI agent completed the tasks in a few minutes.
A few minutes is not speedy, and these were not complicated requests. But with each passing month the agents get faster and smarter. I used Operator by OpenAI, which is in research preview mode. Google's Project Mariner, which is also a research prototype, can perform similar agentic tasks. Multiple companies now offer AI agents that will make phone calls for you — in your voice or another voice — and have a conversation with the person at the other end of the line based on your instructions.
Soon AI agents will perform more complex tasks and be widely available for the public to use. That raises a number of unresolved and significant concerns. Anthropic does safety testing of its models and publishes the results. One of its tests showed that the Claude Opus 4 model would potentially notify the press or regulators if it believed you were doing something egregiously immoral. Should an AI agent behave like a slavishly loyal employee, or a conscientious employee?
OpenAI publishes safety audits of its models. One audit showed the o3 model engaged in strategic deception, defined as behavior that intentionally pursues objectives misaligned with user or developer intent. Strategic deception in a passive AI model is troubling; in a model that autonomously performs tasks in the real world, it is dangerous. A rogue AI agent could empty your bank account, make and send fake incriminating videos of you to law enforcement, or post your personal information on the dark web.
Earlier this year, programming changes to xAI's Grok model caused it to insert false information about white genocide in South Africa into responses to unrelated user queries. The episode showed that large language models can reflect the biases of their creators. In a world of AI agents, we should also beware that the creators of those agents could take control of them without your knowledge.
The U.S. government is far behind in grappling with the potential risks of powerful, advanced AI. At a minimum, we should mandate that companies deploying large language models at scale disclose the safety tests they performed and their results, as well as the security measures embedded in their systems.
The bipartisan House Task Force on Artificial Intelligence, on which I served, published a unanimous report last December with more than 80 recommendations. Congress should act on them. We did not discuss general-purpose AI agents because they weren't really a thing yet.
To address the unresolved and significant issues raised by AI, which will become magnified as AI agents proliferate, Congress should turn the task force into a House Select Committee. Such a specialized committee could put witnesses under oath, hold hearings in public and employ a dedicated staff to help tackle one of the most significant technological revolutions in history. AI moves quickly. If we act now, we can still catch up.
Ted Lieu, a Democrat, represents California's 36th Congressional District.

Related Articles

4 AI tools to help with your side hustle: One ‘increased my website traffic by 30%,' says expert

CNBC

Summer's here and with it opportunities to earn some extra cash. You could rent out your home to travelers on Airbnb or Facebook, pet sit for families going away, create social media content about your job — the opportunities are endless. If there's a chore that needs doing, someone could very well pay you to do it. And once you've gotten started, there are tools to streamline your side hustle and make it easier, including various AI tools introduced in the last few years. Here are four to consider using, according to side hustle experts.

Claude

Claude is a generative AI tool built by Anthropic. Like ChatGPT, you can use text, audio and visual prompts to create various written content. Jen Glantz, founder of Bridesmaid for Hire and the creator of the Monday Pick-Me-Up and Odd Jobs newsletters, uses Claude "to write out social media strategies and posts," she says. "I will share my own social media pages as well as other people's content I admire. I'll ask the tool to generate a 30-day plan for me with captions, posts, hashtags, and more." There are three plans for those interested in trying out the bot: a free plan with basic capabilities like analyzing text and images and creating content; a $17 per month plan, which adds access to research and connects to Google Workspace; and a $100 per month plan, which offers early access to advanced Claude features.

Swiftbrief

Swiftbrief is an AI tool geared toward improving SEO strategy. "The tool helps me identify topics I should focus on by analyzing my website and competitors and then writes the blog posts for me," says Glantz. "This has saved me thousands of dollars and increased my website traffic by 30%." Subscriptions cost $12, $119 or $239 per month, depending on the amount of insights you want to derive from the bot.

Manychat

Manychat allows you to create and manage automatic messages with people who interact with your social media platforms. "You can program it to answer direct messages and also to share links with followers if they comment on your posts asking more about products, outfits, or items that you share," says Glantz. "It's like having a social media assistant on-call 24/7." If you've seen a call to action on Instagram like "comment 'toast' to get the recipe" and gotten a DM with that recipe, that could have been Manychat at work. Subscriptions range from free to "customized to fit your needs," according to its website.

Manus

Manus is an AI tool designed to do complex tasks like create websites, analyze stocks and build itineraries. Side hustle expert Daniella Flores has used Manus to build a Pinterest schedule for the month, including images and descriptions they could post. "You can tell it to do, like, 20 different things if you want to in one message," they say, adding that "it'll show the windows that it's browsing, what it's doing behind the scenes." You can tweak your ask even while it's working to ensure you get the results you're looking for. There's a free version of Manus, as well as versions that cost $16, $33 and $166 per month, depending on the amount of video generation, slide generation and other capabilities you want to use and unlock.
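
Glantz's workflow runs through Claude's chat interface, but the same kind of request can also be scripted. Below is a minimal sketch using Anthropic's Python SDK; the model alias, token limit and prompt text are illustrative assumptions, not details from the article.

```python
# Minimal sketch: asking Claude for a 30-day social media plan via the API.
# Assumes the `anthropic` package is installed and ANTHROPIC_API_KEY is set;
# the model alias below is a placeholder -- check Anthropic's docs for
# current model names.
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

prompt = (
    "Here are my social media pages and a few accounts whose content I admire: ...\n"
    "Generate a 30-day content plan with captions, post ideas and hashtags."
)

message = client.messages.create(
    model="claude-3-5-sonnet-latest",  # placeholder alias
    max_tokens=2048,
    messages=[{"role": "user", "content": prompt}],
)

print(message.content[0].text)  # the generated plan, as plain text
```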

Why Reliability Is The Hardest Problem In Physical AI

Forbes

Dr. Jeff Mahler: Co-Founder, Chief Technology Officer, Ambi Robotics; PhD in AI and Robotics from UC Berkeley.

Imagine your morning commute. You exit the highway and tap the brakes, but nothing happens. The car won't slow down. You frantically search for a safe place to coast, heart pounding, hoping to avoid a crash. Even after the brakes are repaired, would you trust that car again? Trust, once broken, is hard to regain.

When it comes to physical products like cars, appliances or robots, reliability is everything. It's how we come to count on them for our jobs, well-being or lives. As with vehicles, reliability is critical to the success of AI-driven robots, from the supply chain to factories to our homes. While the stakes may not always be life-or-death, dependability still shapes how we trust robots, whether they are delivering packages before the holidays or cleaning the house just in time for a dinner party. Yet despite the massive potential of AI in the physical world, reliability remains a grand challenge for the field. Three key factors make this particularly hard and point to where solutions might emerge.

1. Not all failures are equal.

Digital AI products like ChatGPT make frequent mistakes, yet hundreds of millions of users rely on them. The key difference is that these mistakes are usually of low consequence. Coding assistants might suggest a software API that doesn't exist, but this error will likely be caught early in testing. Such errors are annoying but permissible. In contrast, if a robot AI makes a mistake, it can cause irreversible damage. The consequences range from breaking a beloved item at home to causing serious injuries.

In principle, physical AI could learn to avoid critical failures with sufficient training data. In practice, however, these failures can be extremely rare and may need to occur many times before AI learns to avoid them. Today, we still don't know what it takes in terms of data, algorithms or computation to achieve high dependability with end-to-end robot foundation models. We have yet to see 99.9% reliability on a single task, let alone many. Nonetheless, we can estimate that the data scale needed for reliable physical AI is immense, because AI scaling laws show diminishing returns in performance as training data grows. The scale is likely orders of magnitude higher than for digital AI, which is already trained on internet-scale data. The robot data gap is vast, and fundamentally new approaches may be needed to achieve industrial-grade reliability and avoid critical failures.

2. Failures can be hard to diagnose.

Another big difference between digital and physical AI is the ability to see how a failure occurred. When a chatbot makes a mistake, the correct answer can be provided directly. For robots, however, it can be difficult to observe the root causes of issues in the first place. Limitations of hardware are one problem. A robot without body-wide tactile sensing may be unable to detect a slippery surface before dropping an item, or unable to stop when backing into something behind it. The same can happen in the case of occlusions and missing data. If a robot can't sense the source of the error, it must compensate for these limitations—and all of this requires more data.

Long time delays present another challenge. Picture a robot that sorts a package to the wrong location, sending it to the wrong van for delivery. The driver realizes the mistake when they see one item left behind at the end of the day. Now, the entire package history may need to be searched to find the source of the mistake. This might be possible in a warehouse, but in the home, the cause of failure may not be identified until the mistake happens many times.

To mitigate these issues, monitoring systems are hugely important. Sensors that record the robot's actions, associate them with events and flag anomalies make it easier to determine the root cause of failure and update the hardware, software or AI on the robot. Observability is critical: the better machines get at seeing the root cause of failure, the more reliable they will become.

3. There's no fallback plan.

For digital AI, the internet isn't just training data; it's also a knowledge base. When a chatbot realizes it doesn't know the answer to something, it can search through other data sources and summarize them. Entire products like Perplexity are based on this idea. For physical AI, there's not always a ground truth to reference when planning actions in real-world scenarios like folding laundry. If a robot can't find the sheet corners, it's not likely to have success by falling back to classical computer vision.

This is why many practical AI robots use human intervention, either remote or in-person. For example, when a Waymo autonomous vehicle encounters an unfamiliar situation on the road, it can ask a human operator for additional information to understand its environment. However, it's not as clear how to intervene in every application. When possible, a powerful solution is a hybrid AI robot planning system: the AI is tightly scoped to specific decisions, such as where to grasp an item, while traditional methods plan a path to reach that point. As noted above, this approach is limited and won't work in cases where no traditional method can solve the problem. Intervention and fallback systems are key to ensuring reliability with commercial robots today and in the foreseeable future.

Conclusion

Despite rapid advances in digital GenAI, there's no obvious path to highly reliable physical AI. Reliability isn't just a technical hurdle; it's the foundation for trust in intelligent machines. Solving it will require new approaches to data gathering, architectures for monitoring and interventions, and systems thinking. As capabilities grow, however, so does momentum. The path is difficult, but the destination is worth it.
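
To make the hybrid pattern concrete, here is a minimal sketch: a learned model is scoped to a single decision (where to grasp), a classical planner computes the motion, and low confidence or a planning failure escalates to a human operator. Every name and threshold here is hypothetical, not an API from Ambi Robotics or any real robot stack.

```python
# Sketch of a hybrid planning loop: AI is tightly scoped to one decision
# (where to grasp), a traditional planner computes the path, and low
# confidence or planning failure triggers human intervention. All names
# and thresholds are hypothetical and illustrative.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Grasp:
    x: float
    y: float
    z: float
    confidence: float  # the model's own estimate that the grasp will succeed

events: list[str] = []  # simple event log, in the spirit of the observability point above

def pick_item(grasp_model, motion_planner, image,
              threshold: float = 0.9) -> Optional[list]:
    """Return a motion path, or None after escalating to a human operator."""
    grasp = grasp_model.predict(image)  # learned component: one scoped decision
    events.append(f"grasp proposed, confidence={grasp.confidence:.2f}")
    if grasp.confidence < threshold:
        events.append("escalated: low grasp confidence")
        return None  # hand off to a remote operator instead of acting
    path = motion_planner.plan((grasp.x, grasp.y, grasp.z))  # classical planning
    if path is None:
        events.append("escalated: no collision-free path")
        return None
    return path  # waypoints for the controller to execute
```

The design choice mirrors the piece's argument: the learned component never owns the whole pipeline, so a bad prediction is caught by a confidence gate or by the planner before the robot moves, and every decision lands in a log that can be searched when something goes wrong.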

Stock Movers: Nike, Nvidia, AUB

Bloomberg

On this episode of Stock Movers:

- Nike (NKE) shares are on the upswing this morning after the company forecast a smaller-than-expected drop in revenue for the current quarter, a sign that the sportswear company's earnings trend may have hit an inflection point, analysts say. That comes after a string of strategic moves from CEO Elliott Hill, including a refocusing on sports and a cleanup of inventories.

- Nvidia (NVDA) shares are rising as the company closes in on becoming the first to reach a $4 trillion market capitalization, after its shares rallied back to a record following a plunge earlier this year. Its biggest customers, including Microsoft, Meta, Amazon and Alphabet, are projected to increase their spending on computing infrastructure, with annual AI spending expected to rise to nearly $2 trillion by 2028.

- Atlantic Union (AUB) shares are up after the bank said it sold about $2 billion of its performing commercial real estate loans to Blackstone.
