
Latest news with #LLM

GPT-5's System Prompt Just Leaked. Here's What We Learned

Forbes

6 hours ago

GPT-5's system prompt just leaked to GitHub, showing what OpenAI wants ChatGPT to say, do, remember … and not do. Unsurprisingly, GPT-5 isn't allowed to reproduce song lyrics or any other copyrighted material, even if asked. And GPT-5 is told not to remember personal facts that 'could feel creepy,' or directly assert a user's race, ethnicity, religion, or criminal records. I've asked OpenAI for comment and will update this post if the company responds.

A system prompt is a hidden set of instructions that tells an AI engine how to behave: what to do, and what not to do. Users will ordinarily never see this prompt, but it influences all of their interactions with an LLM-based AI engine.

What we can see from GPT-5's hidden system prompt is that OpenAI is getting much more aggressive about ensuring it delivers up-to-date information. The system prompt mandates that GPT-5 use the web whenever relevant information could be fresh, niche, or high-stakes, and it scores a query's 'recency need' from zero to five. That's clearly an attempt to be more accurate. My daughter recently complained that ChatGPT got basic details about F1's summer break and next races wrong. She was using GPT-4o at the time; GPT-5 should make fewer mistakes that are easy to fix with a simple web search. Accuracy should be higher too, thanks to another instruction: for sensitive or high-stakes topics, like financial advice, health information, or legal matters, OpenAI has instructed GPT-5 to 'always carefully check multiple reputable sources.'

There are also new built-in tools for GPT-5 to be a better personal assistant. That includes long-term memory about a user, which ChatGPT calls 'bio,' and scheduled reminders and searches that could be very useful when using AI to help you stay organized and prepared. There's also a canvas for documents or computer code, file search capability, image generation and editing, and more. The canvas appears to be something that, perhaps in the future, will let users co-create documents and computer code hand-in-hand with the AI system. All of these should help GPT-5 not only be more helpful in the moment, but also remember more context and state.

About that 'bio' tool: OpenAI doesn't want GPT-5 to remember too much potentially sensitive information about you. In addition to race, religion, and sexual identity, the system prompt lists other categories of personal data that GPT-5 should not store or remember. However, there is an exception to all of these rules: if you decide you want GPT-5 to remember something specific. 'The exception to all of the above instructions … is if the user explicitly requests that you save or forget information,' the system prompt states. 'In this case, you should always call the bio tool to respect their request.' In other words, GPT-5 will be as personal with you as you wish to be with it, which seems fair.
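To make that memory policy concrete, here is a minimal sketch of the decision rule the prompt describes: sensitive categories aren't stored, unless the user explicitly asks the assistant to save or forget something. The function and category names below are illustrative assumptions, not OpenAI's actual tool-calling interface.

```python
# Hypothetical sketch of the memory ("bio") policy described above.
# Names and structure are illustrative; this is not OpenAI's implementation.

SENSITIVE_CATEGORIES = {"race", "ethnicity", "religion", "sexual_identity", "criminal_record"}

def should_store_memory(fact_category: str, user_explicitly_requested: bool) -> bool:
    """Return True if the assistant should call the 'bio' (memory) tool."""
    # The leaked prompt's exception: an explicit "remember this" / "forget this"
    # request always wins, even for otherwise-sensitive categories.
    if user_explicitly_requested:
        return True
    # Otherwise, sensitive or "creepy" personal facts are never stored.
    return fact_category not in SENSITIVE_CATEGORIES

# A volunteered detail about religion is not stored...
assert should_store_memory("religion", user_explicitly_requested=False) is False
# ...but it is stored if the user explicitly asks the assistant to remember it.
assert should_store_memory("religion", user_explicitly_requested=True) is True
```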

Is Your Computer Powerful Enough to Run AI? The LLM Speed Check Site Will Check for You

Yahoo

14 hours ago

Earlier, OpenAI released the open-weight model gpt-oss, whose performance is claimed to rival the high-end reasoning model o4-mini. More importantly, it can run on a local computer, meaning you can use an OpenAI model even without a VPN. Many people will no doubt be interested. Before trying it, though, it's worth checking whether your computer's specs are up to deploying a model. If you're not sure, visit the LLM Speed Check website to see whether your hardware is good enough.

LLM Speed Check helps anyone who wants to deploy a model on a local machine work out which open-source AI models their hardware can run, including gpt-oss, DeepSeek-R1 7B, Gemma3 1B and others, and estimates how fast they will perform. The site detects your CPU core count, RAM capacity and GPU, compares that configuration against a database of benchmark results from similar setups, and estimates how many tokens per second each model would generate on your system. Although LLM Speed Check can detect your hardware automatically, you can also enter your CPU core count and memory capacity manually for a more accurate assessment. Not sure whether your computer can run AI locally? Give the site a try.
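LLM Speed Check doesn't publish its exact formula, but a common back-of-envelope estimate for local inference is that token generation is memory-bandwidth bound: tokens per second is roughly memory bandwidth divided by the size of the model weights. The sketch below is only that rule of thumb with illustrative numbers; it is an assumption about how such a tool might estimate speed, not the site's actual method.

```python
# Rough, illustrative estimate of local LLM decoding speed.
# Assumption: each generated token requires streaming the full set of model
# weights from memory once, so decoding is memory-bandwidth bound.
# This is NOT LLM Speed Check's actual method, just a common rule of thumb.

def estimate_tokens_per_sec(params_billion: float, bits_per_weight: int,
                            mem_bandwidth_gb_s: float) -> float:
    model_size_gb = params_billion * bits_per_weight / 8  # GB occupied by weights
    return mem_bandwidth_gb_s / model_size_gb

# Example: a 7B model quantized to 4 bits (~3.5 GB of weights) on a machine
# with ~100 GB/s of memory bandwidth decodes at roughly 28.6 tokens/sec.
print(round(estimate_tokens_per_sec(7, 4, 100), 1))
```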

Alexa Got an A.I. Brain Transplant. How Smart Is It Now?

New York Times

16 hours ago

For the last few years, I've been waiting for Alexa's A.I. glow-up.

I've been a loyal user of Alexa, the voice assistant that powers Amazon's home devices and smart speakers, for more than a decade. I have five Alexa-enabled speakers scattered throughout my house, and while I don't use them for anything complicated — playing music, setting timers and getting the weather forecast are basically it — they're good at what they do.

But since 2023, when ChatGPT added an A.I. voice mode that could answer questions in a fluid, conversational way, it has been obvious that Alexa would need a brain transplant — a new A.I. system built around the same large language models, or L.L.M.s, that power ChatGPT and other products. L.L.M.-based systems are smarter and more versatile than older systems. They can handle more complex requests, making them an obvious pick for a next-generation voice assistant.

Amazon agrees. For the last few years, the company has been working feverishly to upgrade the A.I. inside Alexa. It has been a slog. Replacing the A.I. technology inside a voice assistant isn't as easy as swapping in a new model, and the Alexa remodel was reportedly delayed by internal struggles and technical challenges along the way. L.L.M.s also aren't a perfect match for this kind of product, which not only needs to work with tons of pre-existing services and millions of Alexa-enabled devices, but also needs to reliably perform basic tasks.

But finally, the new Alexa — known as Alexa+ — is here. It's a big, ambitious remodel that is trying to marry the conversational skills of generative A.I. chatbots with the daily tasks that the old Alexa did well. Alexa+, which has been available to testers through an early-access program for a few months, is now being rolled out more widely. I got it last week after I bought a compatible device (the Echo Show 8, which has an eight-inch screen) and enrolled in the upgraded version. (Prime members will get Alexa+ at no cost, while non-Prime members will have to pay $19.99 per month.)

Ask AI Why It Sucks at Sudoku. You'll Find Out Something Troubling About Chatbots

CNET

2 days ago

Chatbots are genuinely impressive when you watch them do things they're good at, like writing a basic email or creating weird futuristic-looking images. But ask generative AI to solve one of those puzzles in the back of a newspaper, and things can quickly go off the rails.

That's what researchers at the University of Colorado Boulder found when they challenged large language models to solve Sudoku. And not even the standard 9x9 puzzles. An easier 6x6 puzzle was often beyond the capabilities of an LLM without outside help (in this case, specific puzzle-solving tools).

A more important finding came when the models were asked to show their work. For the most part, they couldn't. Sometimes they lied. Sometimes they explained things in ways that made no sense. Sometimes they hallucinated and started talking about the weather.

If gen AI tools can't explain their decisions accurately or transparently, that should cause us to be cautious as we give these things more control over our lives and decisions, said Ashutosh Trivedi, a computer science professor at the University of Colorado Boulder and one of the authors of the paper published in July in the Findings of the Association for Computational Linguistics.

"We would really like those explanations to be transparent and be reflective of why AI made that decision, and not AI trying to manipulate the human by providing an explanation that a human might like," Trivedi said.

When you make a decision, you can try to justify it, or at least explain how you arrived at it. An AI model may not be able to accurately or transparently do the same. Would you trust it?

Why LLMs struggle with Sudoku

We've seen AI models fail at basic games and puzzles before. OpenAI's ChatGPT (among others) has been totally crushed at chess by the computer opponent in a 1979 Atari game. A recent research paper from Apple found that models can struggle with other puzzles, like the Tower of Hanoi.

It has to do with the way LLMs work and fill in gaps in information. These models try to complete those gaps based on what happens in similar cases in their training data or other things they've seen in the past. With a Sudoku, the question is one of logic. The AI might try to fill each gap in order, based on what seems like a reasonable answer, but to solve it properly, it instead has to look at the entire picture and find a logical order that changes from puzzle to puzzle.

Chatbots are bad at chess for a similar reason. They find logical next moves but don't necessarily think three, four, or five moves ahead -- the fundamental skill needed to play chess well. Chatbots also sometimes tend to move chess pieces in ways that don't really follow the rules or put pieces in meaningless jeopardy.

You might expect LLMs to be able to solve Sudoku because they're computers and the puzzle consists of numbers, but the puzzles themselves are not really mathematical; they're symbolic. "Sudoku is famous for being a puzzle with numbers that could be done with anything that is not numbers," said Fabio Somenzi, a professor at CU and one of the research paper's authors.
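A conventional solver makes the contrast concrete: it treats the grid as a constraint problem and backtracks whenever a placement breaks a row, column, or box rule, rather than committing to locally plausible guesses the way an LLM fills in text. The short Python sketch below is a generic 6x6 backtracking solver for illustration only; it is not the puzzle-solving tooling the researchers used, and the sample puzzle is made up.

```python
# Minimal 6x6 Sudoku backtracking solver (boxes are 2 rows x 3 columns).
# Illustrative only -- not the solver used in the CU Boulder study.

def valid(grid, r, c, v):
    if v in grid[r] or v in (grid[i][c] for i in range(6)):
        return False
    br, bc = (r // 2) * 2, (c // 3) * 3          # top-left corner of the 2x3 box
    return all(grid[br + i][bc + j] != v for i in range(2) for j in range(3))

def solve(grid):
    for r in range(6):
        for c in range(6):
            if grid[r][c] == 0:                  # 0 marks an empty cell
                for v in range(1, 7):
                    if valid(grid, r, c, v):
                        grid[r][c] = v
                        if solve(grid):
                            return True
                        grid[r][c] = 0           # undo and backtrack
                return False                     # no value fits here: backtrack
    return True                                  # no empty cells left: solved

puzzle = [
    [1, 0, 3, 0, 5, 0],
    [0, 5, 0, 1, 0, 3],
    [2, 0, 1, 0, 6, 0],
    [0, 6, 0, 2, 0, 1],
    [3, 0, 2, 0, 4, 0],
    [0, 4, 0, 3, 0, 2],
]
if solve(puzzle):
    print("\n".join(" ".join(map(str, row)) for row in puzzle))
```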
I used a sample prompt from the researchers' paper and gave it to ChatGPT. The tool showed its work, and repeatedly told me it had the answer before showing a puzzle that didn't work, then going back and correcting it. It was like the bot was turning in a presentation that kept getting last-second edits: This is the final answer. No, actually, never mind, this is the final answer. It got the answer eventually, through trial and error. But trial and error isn't a practical way for a person to solve a Sudoku in the newspaper. That's way too much erasing and ruins the fun.

[Image caption: AI and robots can be good at games if they're built to play them, but general-purpose tools like large language models can struggle with logic puzzles. Credit: Ore Huiying/Bloomberg via Getty Images]

AI struggles to show its work

The Colorado researchers didn't just want to see if the bots could solve puzzles. They asked for explanations of how the bots worked through them. Things did not go well.

Testing OpenAI's o1-preview reasoning model, the researchers saw that the explanations -- even for correctly solved puzzles -- didn't accurately explain or justify their moves, and got basic terms wrong.

"One thing they're good at is providing explanations that seem reasonable," said Maria Pacheco, an assistant professor of computer science at CU. "They align to humans, so they learn to speak like we like it, but whether they're faithful to what the actual steps need to be to solve the thing is where we're struggling a little bit."

Sometimes, the explanations were completely irrelevant. Since the paper's work was finished, the researchers have continued to test newly released models. Somenzi said that when he and Trivedi were running OpenAI's o4 reasoning model through the same tests, at one point it seemed to give up entirely. "The next question that we asked, the answer was the weather forecast for Denver," he said.

(Disclosure: Ziff Davis, CNET's parent company, in April filed a lawsuit against OpenAI, alleging it infringed Ziff Davis copyrights in training and operating its AI systems.)

Explaining yourself is an important skill

When you solve a puzzle, you're almost certainly able to walk someone else through your thinking. The fact that these LLMs failed so spectacularly at that basic job isn't a trivial problem. With AI companies constantly talking about "AI agents" that can take actions on your behalf, being able to explain yourself is essential.

Consider the types of jobs being given to AI now, or planned for in the near future: driving, doing taxes, deciding business strategies and translating important documents. Imagine what would happen if you, a person, did one of those things and something went wrong. "When humans have to put their face in front of their decisions, they better be able to explain what led to that decision," Somenzi said.

It isn't just a matter of getting a reasonable-sounding answer. It needs to be accurate. One day, an AI's explanation of itself might have to hold up in court, but how can its testimony be taken seriously if it's known to lie? You wouldn't trust a person who failed to explain themselves, and you also wouldn't trust someone you found was saying what you wanted to hear instead of the truth.

"Having an explanation is very close to manipulation if it is done for the wrong reason," Trivedi said. "We have to be very careful with respect to the transparency of these explanations."

New Study Shows AI Is Biased Toward AI. 10 Steps To Protect Yourself

Forbes

2 days ago

Large language models show dangerous favoritism toward AI-generated content. What does this mean for human agency?

In the sprawling digital landscape of 2025, where artificial intelligence generates everything from news articles to marketing copy, a troubling pattern has emerged: AI systems consistently favor content created by other AI systems over human-written text. This "self-preference bias" isn't just a technical curiosity; it's reshaping how information flows through our digital ecosystem, often in ways we don't even realize.

Navigating Digital Echo Chambers

Recent research reveals that large language models exhibit a systematic preference for AI-generated content, even when human evaluators consider the quality equivalent. When an LLM evaluator scores its own outputs higher than others' while human annotators consider them of equal quality, we're witnessing something unprecedented: machines developing a form of algorithmic narcissism.

This bias manifests across multiple domains. Self-preference is the phenomenon in which an LLM favors its own outputs over texts from other LLMs and humans, and studies show this preference is remarkably consistent. Whether evaluating product descriptions, news articles, or creative content, AI systems demonstrate a clear favoritism toward machine-generated text.

The implications are worrisome. In hiring processes, AI-powered screening tools might unconsciously favor résumés that have been "optimized" by other AI systems, potentially discriminating against candidates who write their own applications. In academic settings, AI grading systems could inadvertently reward AI-assisted assignments while penalizing less polished but authentic human work.

The Human Side Of The Bias Equation

And here's where the story becomes even more complicated: humans show their own contradictory patterns. Participants tend to prefer AI-generated responses. However, when the AI origin is revealed, this preference diminishes significantly, suggesting that evaluative judgments are influenced by the disclosure of the response's provenance rather than solely by its quality.

This reveals a fascinating psychological complexity. When people don't know content is AI-generated, they often prefer it, perhaps because AI systems have been trained to produce text that hits our cognitive sweet spots. However, the picture becomes murkier when AI origin is revealed. Some studies find minimal impact of disclosure on preferences, while others document measurable penalties for transparency, with research showing that revealing AI use consistently led to drops in trust.

Consider the real-world implications: this inconsistent response to AI disclosure creates a complex landscape where the same content might be received differently depending on how its origins are presented. During health crises or other critical information moments, these disclosure effects could literally be matters of life and death.

The Algorithmic Feedback Loop

The most concerning aspect isn't either bias in isolation. It's how they interact. As AI systems increasingly train on internet data that includes AI-generated content, they're essentially learning to prefer their own "dialects." Meanwhile, humans who unconsciously consume and prefer AI-optimized content are gradually shifting their own writing and thinking patterns.

GPT-4 exhibits a significant degree of self-preference bias, and researchers hypothesize this occurs because LLMs may favor outputs that are more familiar to them, as indicated by lower perplexity. In simpler terms, AI systems prefer content that feels "normal" to them, which increasingly means content that sounds like AI.
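Perplexity is measurable, so the "familiarity" hypothesis can be probed directly: score different passages with the same model and compare. The sketch below uses the Hugging Face transformers library and GPT-2 purely as a small, freely available stand-in model; it is not the setup used in the studies the article cites, and the sample sentences are invented for illustration.

```python
# Compute a causal language model's perplexity for a passage.
# Lower perplexity means the text looks more "familiar" to the model.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        # Passing labels makes the model return the mean cross-entropy loss.
        loss = model(**enc, labels=enc["input_ids"]).loss
    return math.exp(loss.item())

# Compare two arbitrary passages; the one the model finds more "familiar"
# gets the lower score.
text_a = "The match ended in a scoreless draw after a long rain delay."
text_b = "In today's fast-paced digital landscape, collaboration drives innovation at scale."
print(perplexity(text_a), perplexity(text_b))
```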
This creates a dangerous feedback loop. As AI-generated content proliferates across the internet, future AI systems will train on this data, reinforcing existing biases and preferences. Meanwhile, humans exposed to increasing amounts of AI-optimized content might unconsciously adopt its patterns, creating a convergence toward machine-preferred communication styles.

The Stakes Are Already High

These biases aren't hypothetical future problems; they're shaping decisions today. In recruitment, AI-powered tools are already screening millions of job applications. If these systems prefer AI-optimized résumés, candidates who don't use AI assistance face an invisible disadvantage. In content marketing, brands using AI-generated copy might receive algorithmic boosts from AI-powered recommendation systems, while human creators see their reach diminished.

The academic world provides another stark example. As AI detection tools become commonplace, students face a perverse incentive: write too well, and you might be falsely flagged as using AI. Write in a more AI-compatible style, and you might avoid detection but contribute to the homogenization of human expression.

In journalism and social media, the implications are even more profound. If AI-powered content recommendation algorithms favor AI-generated news articles and posts, we could see a systematic amplification of machine-created information over human reporting and authentic social expression.

Building Double Literacy For The AI Age

Navigating this landscape requires double literacy: a holistic understanding of ourselves and society, and of the tools we interact with. This type of 360° comprehension encompasses both our own cognitive biases and the algorithmic biases of the AI systems we interact with daily. Here are 10 practical steps to invest in your double bias shield today:

The Hybrid Path Forward

A pragmatic solution in this hybrid era isn't to reject AI or pretend we can eliminate bias entirely. Instead, we need to invest in hybrid intelligence, the complementarity of AI and NI (natural intelligence), to develop more refined relationships with both human and artificial intelligence. This means creating AI systems that are transparent about their limitations and training humans to be more discerning consumers and creators of information.

Organizations deploying AI should implement bias audits that specifically look for self-preference tendencies. Developers need to build AI systems that can recognize and compensate for their own biases. Most importantly, we need educational frameworks that help people understand how AI systems think differently from humans. Beyond good and bad judgment, this is the time to acknowledge and harness differences deliberately.

The AI mirror trap puts a spotlight on the moment we're living through. We're creating assets that reflect our own patterns back at us, often in amplified form. Our agency in this AI-saturated world depends not on choosing between human and artificial intelligence, but on developing the wisdom to understand and navigate both. The future belongs not to those who can best mimic AI or completely avoid it, but to those who can dance skillfully with both human and artificial forms of intelligence. The music has just begun. Let's start practicing.
