Microsoft pushes staff to use internal AI tools more, and may consider this in reviews. 'Using AI is no longer optional.'

Microsoft is asking some managers to evaluate employees based on how much they use AI internally, and the software giant is considering adding a metric related to this in its review process, Business Insider has learned.
Julia Liuson, president of the Microsoft division responsible for developer tools such as the AI coding service GitHub Copilot, recently sent an email instructing managers to evaluate employee performance based in part on their use of internal AI tools like Copilot.
"AI is now a fundamental part of how we work," Liuson wrote. "Just like collaboration, data-driven thinking, and effective communication, using AI is no longer optional — it's core to every role and every level."
Liuson told managers that AI "should be part of your holistic reflections on an individual's performance and impact."
Microsoft's performance requirements vary from team to team, and some teams are considering adding a more formal metric on internal AI tool use to performance reviews for the company's next fiscal year, according to a person familiar with the situation. This person asked not to be identified discussing private matters.
These changes are meant to address what Microsoft sees as lagging internal adoption of its Copilot AI services, according to two other people with knowledge of the plans. The company wants to increase usage broadly, but it also wants the employees building these products to have a better understanding of the tools.
In Liuson's organization, GitHub Copilot is facing increasing competition from AI coding services including Cursor. Microsoft lets employees use some external AI tools that meet certain security requirements. Staff are currently allowed to use coding assistant Replit, for example, one of the people said.
A recent note from Barclays cited data suggesting that Cursor recently surpassed GitHub Copilot in a key part of the developer market.
Competition among coding tools has even become a sticking point in Microsoft's renegotiation of its most important partnership, with OpenAI. OpenAI is considering acquiring Cursor competitor Windsurf, but Microsoft's current deal with OpenAI would give Microsoft access to Windsurf's intellectual property, and neither Windsurf nor OpenAI wants that, a person with knowledge of the talks said.


Related Articles

I Let AI Agents Plan My Vacation—and It Wasn't Terrible

WIRED · an hour ago

The latest wave of AI tools claims to take the pain out of booking your next trip. From transport and accommodation to restaurants and attractions, we let AI take the reins to put this to the test.

Photo-Illustration: Wired Staff/Victoria Turk

The worst part of travel is the planning: the faff of finding and booking transport, accommodation, restaurant reservations. The list can feel endless. To help, the latest wave of AI agents, such as OpenAI's Operator and Anthropic's Computer Use, claim they can take these dreary, cumbersome tasks off befuddled travelers' hands and do it all for you. But exactly how good are they at digging out the good stuff? What better way to find out than by deciding on a last-minute weekend away?

I tasked Operator, which is available to ChatGPT Pro subscribers, with booking me something budget-friendly, with good food and art, and told it that I'd prefer to travel by train. What's fascinating is that you can actually watch its process in real time: the tool opens a browser window and starts, much as I would, searching for destinations accessible by rail. It scrolls through a couple of articles, then offers two suggestions: Paris or Bruges. 'I recently went to Paris,' I type in the chat. 'Let's do Bruges!'

Armed with my decision, Operator goes on to look up train times on the Eurostar website and finds a return ticket that will take me to Brussels and includes onward travel within Belgium. I intervene, however, when I see the timings: it selected an early-morning train out on Saturday, and an equally early train back on Sunday. Not exactly making the most of the weekend, I point out. It finds a later return option.

Impressed so far, I wait to double-check my calendar before committing. When I return, however, the session has timed out. Unlike ChatGPT, Operator closes conversations between tasks, and I have to start again from scratch. I feel irrationally slighted, as if my trusty travel assistant has palmed me off to a colleague. Alas, the fares have already changed, and I find myself haggling with the AI: can't it find something cheaper? Tickets eventually selected, I take over to enter my personal and payment details. (I may be trusting AI to blindly send me across country borders, but I'm not giving it my passport information.)

Using ChatGPT's Operator to book a train ticket to Bruges. Courtesy of Victoria Turk

Trains booked, Operator thinks its job is done. But I'll need somewhere to stay, I remind it: can it book a hotel? It asks for more details, and I'm purposefully vague, specifying that it should be comfy and conveniently located. Comparing hotels is perhaps my least favorite aspect of travel planning, so I'm happy to leave it scrolling through the listings. I restrain myself from jumping in when I see it's set the wrong dates, but it corrects this itself. It spends a while surveying an Ibis listing, but ends up choosing a three-star hotel called Martin's Brugge, which I note users have rated as having an excellent location.

Now all that's left is an itinerary. Here, Operator seems to lose steam. It offers a perfunctory one-day schedule that appears to have mainly been cribbed from a vegetarian travel blog. On day two, it suggests I 'visit any remaining attractions or museums.' Wow, thanks for the tip.

The day of the trip arrives, and, as I drag myself out of bed at 4:30 am, I remember why I usually avoid early departures. Still, I get to Brussels without issue. My ticket allows for onward travel, but I realize I don't know where I'm going. I fire up Operator on my phone and ask which platform the next Bruges-bound train departs from. It searches the Belgian railway timetables. Minutes later, it's still searching. I look up and see the details on a station display. I get to the platform before Operator has figured it out.

Bruges is delightful. Given Operator's lackluster itinerary, I branch out. This kind of research task is perfect for a large language model, I realize; it doesn't require agentic capabilities. ChatGPT, Operator's OpenAI sibling, gives me a much more thorough plan, plotting activities by the hour with suggestions of not just where to eat but what to order (Flemish stew at the De Halve Maan brewery). I also try Google's Gemini and Anthropic's Claude, and their plans are similar: walk to the market square; see the belfry tower; visit the Basilica of the Holy Blood. Bruges is a small city, and I can't help but wonder if this is simply the standard tourist route, or if the AI models are all getting their information from the same sources.

Various travel-specific AI tools are trying to break through this genericness. I briefly try MindTrip, which provides a map alongside a written itinerary, offers to personalize recommendations based on a quiz, and includes collaborative features for shared trips. CEO Andy Moss says it expands on broad LLM capabilities by leveraging a travel-specific 'knowledge base' containing things like weather data and real-time availability.

After lunch, I admit defeat. According to ChatGPT's itinerary, I should spend the afternoon on a boat tour, taking photos in another square, and visiting a museum. It has vastly overestimated the stamina of a human who's been up since 4:30 am. I go to rest at my hotel, which is basic but indeed ideally located. I'm coming around to Operator's lazier plans: I'll do the remaining attractions tomorrow.

As a final task, I ask the agent to make a dinner reservation, somewhere authentic but not too expensive. It gets bamboozled by a dropdown menu during the booking process but manages a workaround after a little encouragement. I'm impressed as I walk past the obvious tourist traps to a more out-of-the-way dining room that serves classic local cuisine and is themed around pigeons. It's a good find, and one that doesn't seem to appear on the top 10 lists of obvious guides like TripAdvisor or The Fork.

On the train home, I muse on my experience. The AI agent certainly required supervision. It struggled to string tasks together and lacked an element of common sense, such as when it tried to book the earliest train home. But it was refreshing to outsource decision-making to an assistant that could present a few select options, rather than having to scroll through endless listings.

For now, people mainly use AI for inspiration, says Emma Brennan at travel agent trade association ABTA; it doesn't beat the human touch. 'An increasing number of people are booking with the travel agents for the reason that they want someone there if something goes wrong,' she says. Still, it's easy to imagine AI tools taking over the information-gateway role from search and socials, with businesses clamoring to appear in AI-generated suggestions. 'Google isn't going to be the front door for everything in the future,' says Moss. Are we ready to give this power to a machine? But then, perhaps that ship has sailed.

When planning travel myself, I'll reflexively check a restaurant's Google rating, look up a hotel on Instagram, or read TripAdvisor reviews of an attraction, despite my desire to stay away from the default tourist beat. Embarking on my AI trip, I worried I'd spend more time staring at my screen. By the end, I realize I've probably spent less.

OpenAI and Microsoft are dueling over AGI. These real-world tests will prove when AI is really better than humans.

Business Insider · 2 hours ago

AGI is a pretty silly debate. It's only really important in one way: it governs how the world's most important AI partnership, the deal between OpenAI and Microsoft, will change in the coming months.

This is the situation right now: until OpenAI achieves artificial general intelligence, the point where AI capabilities surpass those of humans, Microsoft gets a lot of valuable technological and financial benefits from the startup. For instance, OpenAI must share a significant portion of its revenue with Microsoft. That's billions of dollars. One could reasonably argue that this might be why Sam Altman bangs on about OpenAI getting close to AGI soon. Many other experts in the AI field don't talk about this much, or they think the AGI debate is off base in various ways, or just not that important. Even Anthropic CEO Dario Amodei, one of the biggest AI boosters on the planet, doesn't like to talk about AGI.

Microsoft CEO Satya Nadella sees things very differently. Wouldn't you? If another company is contractually required to give you oodles of money until it reaches AGI, then you're probably not going to think we're close to AGI! Nadella has called the push toward AGI "benchmark hacking," which is so delicious. This refers to AI researchers and labs designing AI models to perform well on wonky industry benchmarks, rather than in real life.

Here's OpenAI's official definition of AGI: "highly autonomous systems that outperform humans at most economically valuable work." Other experts have defined it slightly differently. But the main point is that AI machines and software must be better than humans at a wide variety of useful tasks. You can already train an AI model to be better at one or two specific things, but to get to artificial general intelligence, machines must be able to do many different things better than humans.

My real-world AGI tests

Over the past few months, I've devised several real-world tests to see if we've reached AGI. These are fun or annoying everyday things that should just work in a world of AGI, but they don't right now for me. I also canvassed input from readers of my Tech Memo newsletter and tapped my source network for fun suggestions. Here are my real-world tests that will prove we've reached AGI:

The PR departments of OpenAI and Anthropic use their own AI technology to answer every journalist's question. Right now, these companies are hiring a ton of human journalists and other communications experts to handle a barrage of reporter questions about AI and the future. When I reach out to these companies, humans answer every time. Unacceptable! Unless this changes, we're not at AGI.

This suggestion is from a hedge fund contact, and I love it: please, please can my Microsoft Outlook email system stop burying important emails while still letting spam through? This one seems like something Microsoft and OpenAI could solve with their AI technology. I haven't seen a fix yet. In a similar vein, can someone please stop Cactus Warehouse from texting me every two days with offers for 20% off succulents? I only bought one cactus from you guys once! Come on, AI, this can surely be solved!

My 2024 Tesla Model 3 Performance hits potholes in FSD mode. No wonder tires have to be replaced so often on these EVs. As a human, I can avoid potholes much better. Elon, the AGI gauntlet has been thrown down. Get on this now.

Can AI models and chatbots make valuable predictions about the future, or do they mostly just regurgitate what's already known on the internet? I tested this recently, right after the US bombed Iran, pitting ChatGPT's stock-picking ability against a single human analyst. Check out the results here. TL;DR: we are nowhere near AGI on this one.

There's a great Google Gemini TV ad where a kid is helping his dad assemble a basketball net. The son is using an Android phone to ask Gemini for the instructions and pointing the camera at his poor father struggling with parts and tools. It's really impressive to watch as Gemini finds the instruction manual online just by "seeing" what's going on live with the product assembly. For AGI to be here, though, the AI needs to just build the damn net itself. I can sit there and read out instructions in an annoying way while someone else toils with fiddly assembly tasks; we can all do that.

Yes, I know these tests seem a bit silly, but AI benchmarks are not the real world, and they can be pretty easily gamed. That last basketball net test is particularly telling for me. Getting an AI system and software to actually assemble a basketball net might happen sometime soon. But getting the same system to do a lot of other physical-world manipulation stuff better than humans, too? Very hard and probably not possible for a very long time.

As OpenAI and Microsoft try to resolve their differences, the companies can tap experts to weigh in on whether the startup has reached AGI or not, per the terms of their existing contract, according to The Information. I'm happy to be an expert advisor here. Sam and Satya, let me know if you want help!

For now, I'll leave the final words to a real AI expert. Konstantin Mishchenko, an AI research scientist at Meta, recently tweeted this, citing a blog post by another respected expert in the field, Sergey Levine: "While LLMs learned to mimic intelligence from internet data, they never had to actually live and acquire that intelligence directly. They lack the core algorithm for learning from experience. They need a human to do that work for them," Mishchenko wrote, referring to AI models known as large language models. "This suggests, at least to me, that the gap between LLMs and genuine intelligence might be wider than we think. Despite all the talk about AGI either being already here or coming next year, I can't shake off the feeling it's not possible until we come up with something better than a language model mimicking our own idea of how an AI should look," he concluded.

OpenAI boss accuses Meta of trying to poach staff with $100m sign-on bonuses

Yahoo · 3 hours ago

The boss of OpenAI has claimed that Mark Zuckerberg's Meta has tried to poach his top artificial intelligence experts with 'crazy' signing bonuses of $100m (£74m), as the scramble for talent in the booming sector intensifies.

Sam Altman spoke about the offers in a podcast on Tuesday. They have not been confirmed by Meta. OpenAI, the company that developed ChatGPT, said it had nothing to add beyond its chief executive's comments.

'They started making these giant offers to a lot of people on our team – $100m signing bonuses, more than that comp [compensation] per year,' Altman told the Uncapped podcast, which is presented by his brother, Jack. 'It is crazy. I'm really happy that, at least so far, none of our best people have decided to take them up on that.'

He said: 'I think the strategy of a tonne of upfront, guaranteed comp, and that being the reason you tell someone to join … the degree to which they're focusing on that, and not the work and not the mission – I don't think that's going to set up a great culture.'

Meta last week launched a $15bn drive towards computerised 'super-intelligence', a type of AI that can perform better than humans at all tasks. The company bought a large stake in the $29bn startup Scale AI, set up by the programmer Alexandr Wang, 28, who joined Meta as part of the deal.

Last week, the Silicon Valley venture capitalist Deedy Das tweeted: 'The AI talent wars are absolutely ridiculous'. Das, a principal at Menlo Ventures, said Meta had been losing AI candidates to rivals despite offering $2m-a-year salaries.

Another report last month found that Anthropic, an AI company backed by Amazon and Google and set up by engineers who left Altman's company, was 'siphoning top talent from two of its biggest rivals: OpenAI and DeepMind'.

The scramble to recruit the best developers comes amid rapid advances in AI technology and a race to achieve human-level AI capacity, known as artificial general intelligence. The spending on hardware is greater still, with recent estimates from the Carlyle Group, reported by Bloomberg, suggesting that $1.8tn could be spent on computing power by 2030. That is more than the annual gross domestic product of Australia.

Some tech firms are buying whole companies to lock in top talent, as seen in part with Meta's Scale AI deal and with Google spending $2.7bn last year on Character.AI, which was founded by the leading AI researcher Noam Shazeer. He co-wrote the 2017 research paper Attention Is All You Need, which is considered a seminal contribution to the current wave of large language model AI systems.

While Meta was founded as a social media company and OpenAI as a non-profit, becoming a for-profit business last year, the two are now rivals. Altman told his brother's podcast that he did not feel Meta would succeed in its AI push, adding: 'I don't think they're a company that's great at innovation.' He said he had once heard Zuckerberg say that it had seemed rational for Google to try to develop a social media function in the early days of Facebook, but 'it was clear to people at Facebook that that was not going to work'. 'I feel a little bit similar here,' Altman added.

Despite the huge investments in the sector, Altman suggested the result could be 'we build legitimate super intelligence, and it doesn't make the world much better [and] doesn't change things as much as it sounds like it should'. 'The fact that you can have this thing do this amazing stuff for you, and you kind of live your life the same way you did two years ago,' he said.

'The thing that I think will be the most impactful in that five to 10-year timeframe is AI will actually discover new science. This is a crazy claim to make, but I think it is true, and if it is correct, then over time I think that will dwarf everything else [AI has achieved].'
