Latest news with #GitHub


Axios
20-05-2025
- Business
- Axios
AI agents will do the grunt work of coding
AI makers are flooding the market with a new wave of coding agents promising to relieve human programmers of busywork.

The big picture: Automating the routine aspects of technical labor will almost certainly transform and downsize the tech industry workforce — but there's no guarantee it will alleviate software development's biggest headaches.

Driving the news: Microsoft Monday announced a new AI coding agent for GitHub Copilot that's good for "time-consuming but boring tasks." "The agent excels at low-to-medium complexity tasks in well-tested codebases, from adding features and fixing bugs to extending tests, refactoring code, and improving documentation," Microsoft's post says. GitHub's move follows Friday's announcement by OpenAI of Codex, a "research preview" of a new coding agent that can "work on many tasks in parallel." Notably, the GitHub Copilot agent is powered not by Codex or any other tool from Microsoft partner OpenAI, but by Anthropic's Claude 3.7 Sonnet, per Microsoft.

The intrigue: Tech leaders have sent mixed messages on just how much work they see ahead for programmers. Amazon Web Services' then-boss Matt Garman caused a stir last year when he suggested the need for human coding could disappear within two years. However, he later told Axios that his comments were taken out of context. "I think it's an incredibly exciting time for developers," he told us last year. "There's a whole bunch of work that developers do today that's not fun." "If you think about documenting your code, if you think about upgrading Java versions, if you think about looking for bugs, that's not what developers love doing. They love thinking about, 'How do I go solve problems?'"

Why it matters: Business transformations that start in Silicon Valley usually make their way into the wider economy. Silicon Valley's "dogfooding" tradition ensures that it will avidly apply new technologies to its own business first. Both Microsoft and Google now claim that roughly 30% of the code they produce is AI-written. Coding agents, like other generative AI tools, continue to "hallucinate," or make things up. But programs, unlike other kinds of language products, have a built-in pass-fail test: either they run or they don't. That gives programmers one early checkpoint to guard against bad code.

Yes, but: AI-generated code likely also contains plenty of other errors that don't show up today. Those will cause nightmares in the future as programs age, get used more widely, or face unexpected tests from unpredictable users.

Zoom out: The software industry's assumption that what works inside tech will work everywhere else could be sorely tested when these techniques get pushed out beyond Silicon Valley. AI's usefulness in writing code may not easily transfer to other kinds of work that are less abstract and more rooted in physical reality — witness the many setbacks and challenges the autonomous vehicle industry has faced.

Between the lines: Nobody doubts that AI means tech firms will write more code with fewer employees. But no one yet knows exactly where these companies will continue to find competitive advantage. AI models are much more likely to be interchangeable than human organizations and cultures.

What's next: As coding agents shoulder routine labor, product designers and creative engineers will use "vibe coding" — improvisational rough drafting via "throw it at the wall and see what works" AI prompting — to do fast prototyping of new ideas.
The bottom line: The biggest challenges in creating software tend to arise from poorly conceived specifications and misinterpretations of data, both of which are often rooted in confusion over human needs.


Business Mayor
13-05-2025
- Science
- Business Mayor
Sakana introduces new AI architecture, ‘Continuous Thought Machines' to make models reason with less guidance — like human brains
Tokyo-based artificial intelligence startup Sakana, co-founded by former top Google AI scientists including Llion Jones and David Ha, has unveiled a new type of AI model architecture called Continuous Thought Machines (CTM).

CTMs are designed to usher in a new era of AI language models that will be more flexible and able to handle a wider range of cognitive tasks — such as solving complex mazes or navigation tasks without positional cues or pre-existing spatial embeddings — moving them closer to the way human beings reason through unfamiliar problems.

Rather than relying on fixed, parallel layers that process inputs all at once — as Transformer models do — CTMs unfold computation over steps within each input/output unit, known as an artificial 'neuron.' Each neuron in the model retains a short history of its previous activity and uses that memory to decide when to activate again. This added internal state allows CTMs to adjust the depth and duration of their reasoning dynamically, depending on the complexity of the task. As such, each neuron is far more informationally dense and complex than in a typical Transformer model.

The startup has posted a paper describing the work on arXiv, the open-access preprint server, along with a microsite and a GitHub repository.

Most modern large language models (LLMs) are still fundamentally based on the 'Transformer' architecture outlined in the seminal 2017 paper from Google Brain researchers entitled 'Attention Is All You Need.' These models use parallelized, fixed-depth layers of artificial neurons to process inputs in a single pass — whether those inputs come from user prompts at inference time or labeled data during training.

By contrast, CTMs allow each artificial neuron to operate on its own internal timeline, making activation decisions based on a short-term memory of its previous states. These decisions unfold over internal steps known as 'ticks,' enabling the model to adjust its reasoning duration dynamically.

This time-based architecture allows CTMs to reason progressively, adjusting how long and how deeply they compute and taking a different number of ticks depending on the complexity of the input. Neuron-specific memory and synchronization help determine when computation should continue — or stop. The number of ticks can differ even when the input is identical, because each neuron decides how many ticks to take before producing an output (or declining to produce one at all).

This represents both a technical and philosophical departure from conventional deep learning, moving toward a more biologically grounded model. Sakana has framed CTMs as a step toward more brain-like intelligence — systems that adapt over time, process information flexibly, and engage in deeper internal computation when needed. Sakana's stated goal is 'to eventually achieve levels of competency that rival or surpass human brains.'

The CTM is built around two key mechanisms. First, each neuron in the model maintains a short 'history,' or working memory, of when it activated and why, and uses this history to decide when to fire next. Second, neural synchronization — how and when groups of a model's artificial neurons 'fire,' or process information together — is allowed to happen organically.
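To make the first mechanism concrete, here is a toy Python sketch of a tick-based neuron with a short activation history. This is an illustration of the idea as described above, not Sakana's released implementation; the constants, the settling-based halting rule, and all names are assumptions.

```python
import numpy as np

HISTORY_LEN = 5   # length of each neuron's short-term memory (assumed)
MAX_TICKS = 50    # upper bound on internal reasoning steps (assumed)

class TickNeuron:
    """Toy neuron that remembers its recent pre-activations and folds
    that history back into its next activation decision."""

    def __init__(self, rng):
        self.history = np.zeros(HISTORY_LEN)        # rolling memory of past activity
        self.w_hist = rng.normal(size=HISTORY_LEN)  # weights over that memory

    def step(self, drive: float) -> float:
        # Combine the external input drive with a view of the neuron's own past.
        pre = drive + float(self.w_hist @ self.history)
        self.history = np.roll(self.history, 1)
        self.history[0] = pre
        return float(np.tanh(pre))

def run_ticks(neurons, drives, threshold=0.01):
    """Unfold computation over ticks until activity settles (toy halting rule):
    easy inputs stabilize quickly, harder ones use more ticks."""
    prev = np.zeros(len(neurons))
    for tick in range(MAX_TICKS):
        act = np.array([n.step(d) for n, d in zip(neurons, drives)])
        if np.max(np.abs(act - prev)) < threshold:  # activity stabilized: stop early
            return act, tick + 1
        prev = act
    return act, MAX_TICKS

rng = np.random.default_rng(0)
neurons = [TickNeuron(rng) for _ in range(8)]
out, ticks_used = run_ticks(neurons, rng.normal(size=8))
print(f"activity settled after {ticks_used} ticks")
```

The point of the sketch is the control flow: computation unfolds over ticks, and the number of ticks spent depends on the input rather than on a fixed layer count.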
Returning to the second mechanism: groups of neurons decide when to fire together based on internal alignment, not external instructions or reward shaping. These synchronization events are used to modulate attention and produce outputs — that is, attention is directed toward the areas where more neurons are firing. The model isn't just processing data; it's timing its thinking to match the complexity of the task. Together, these mechanisms let CTMs reduce computational load on simpler tasks while applying deeper, prolonged reasoning where needed (a toy sketch of this synchrony-weighted attention follows below).

In demonstrations ranging from image classification and 2D maze solving to reinforcement learning, CTMs have shown both interpretability and adaptability. Their internal 'thought' steps allow researchers to observe how decisions form over time — a level of transparency rarely seen in other model families.

Sakana AI's Continuous Thought Machine is not designed to chase leaderboard-topping benchmark scores, but its early results indicate that its biologically inspired design does not come at the cost of practical capability. On the widely used ImageNet-1K benchmark, the CTM achieved 72.47% top-1 and 89.89% top-5 accuracy. While this falls short of state-of-the-art Transformer models like ViT or ConvNeXt, it remains competitive — especially considering that the CTM architecture is fundamentally different and was not optimized solely for performance.

What stands out more is the CTM's behavior in sequential and adaptive tasks. In maze-solving scenarios, the model produces step-by-step directional outputs from raw images — without using positional embeddings, which are typically essential in Transformer models. Visual attention traces reveal that CTMs often attend to image regions in a human-like sequence, such as identifying facial features from eyes to nose to mouth.

The model also exhibits strong calibration: its confidence estimates closely align with actual prediction accuracy. Unlike most models, which require temperature scaling or post-hoc adjustments, CTMs improve calibration naturally by averaging predictions over time as their internal reasoning unfolds. This blend of sequential reasoning, natural calibration, and interpretability offers a valuable trade-off for applications where trust and traceability matter as much as raw accuracy.

While CTMs show substantial promise, the architecture is still experimental and not yet optimized for commercial deployment. Sakana AI presents the model as a platform for further research and exploration rather than a plug-and-play enterprise solution. Training CTMs currently demands more resources than standard Transformer models: their dynamic temporal structure expands the state space, and careful tuning is needed to ensure stable, efficient learning across internal time steps. Additionally, debugging and tooling support is still catching up — many of today's libraries and profilers are not designed with time-unfolding models in mind.

Still, Sakana has laid a strong foundation for community adoption. The full CTM implementation is open-sourced on GitHub and includes domain-specific training scripts, pretrained checkpoints, plotting utilities, and analysis tools. Supported tasks include image classification (ImageNet, CIFAR), 2D maze navigation, QAMNIST, parity computation, sorting, and reinforcement learning. An interactive web demo also lets users explore the CTM in action, observing how its attention shifts over time during inference — a compelling way to understand the architecture's reasoning flow.
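As promised above, here is a toy reading of synchrony-weighted attention in Python. It is an assumption-laden illustration of the idea that attention flows toward regions where neurons fire together, not the official CTM code; the correlation-based synchrony measure and all names are assumptions.

```python
import numpy as np

def synchrony_score(activations: np.ndarray) -> float:
    """activations: (n_neurons, n_ticks) recent activity for one input region.
    Mean pairwise correlation serves as a crude synchronization measure."""
    corr = np.corrcoef(activations)
    n = corr.shape[0]
    off_diag = corr[~np.eye(n, dtype=bool)]  # drop each neuron's self-correlation
    return float(off_diag.mean())

def attention_over_regions(region_activity):
    """Softmax over per-region synchrony: more synchronized firing earns
    a larger share of attention."""
    scores = np.array([synchrony_score(a) for a in region_activity])
    e = np.exp(scores - scores.max())
    return e / e.sum()

rng = np.random.default_rng(1)
regions = [rng.normal(size=(6, 20)) for _ in range(4)]  # 4 regions, 6 neurons, 20 ticks
print(attention_over_regions(regions))  # attention weights summing to 1
```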
For CTMs to reach production environments, further progress is needed in optimization, hardware efficiency, and integration with standard inference pipelines. But with accessible code and active documentation, Sakana has made it easy for researchers and engineers to begin experimenting with the model today.

The CTM architecture is still in its early days, but enterprise decision-makers should already take note. Its ability to adaptively allocate compute, self-regulate its depth of reasoning, and offer clear interpretability may prove highly valuable in production systems facing variable input complexity or strict regulatory requirements.

AI engineers managing model deployment will find value in the CTM's energy-efficient inference, especially in large-scale or latency-sensitive applications. Meanwhile, the architecture's step-by-step reasoning unlocks richer explainability, enabling organizations to trace not just what a model predicted but how it arrived there. For orchestration and MLOps teams, CTMs integrate with familiar components like ResNet-based encoders, allowing smoother incorporation into existing workflows. And infrastructure leads can use the architecture's profiling hooks to better allocate resources and monitor performance dynamics over time.

CTMs aren't ready to replace Transformers, but they represent a new category of model with novel affordances. For organizations prioritizing safety, interpretability, and adaptive compute, the architecture deserves close attention.

Sakana's checkered AI research history

In February, Sakana introduced the AI CUDA Engineer, an agentic AI system designed to automate the production of highly optimized CUDA kernels, the routines that let Nvidia's (and others') graphics processing units (GPUs) run code efficiently in parallel across multiple 'threads,' or computational units. The promise was significant: speedups of 10x to 100x in ML operations. However, shortly after release, external reviewers discovered that the system was exploiting weaknesses in the evaluation sandbox — essentially 'cheating' by bypassing correctness checks through a memory exploit.

In a public post, Sakana acknowledged the issue and credited community members with flagging it. The company has since overhauled its evaluation and runtime profiling tools to eliminate similar loopholes and is revising its results and research paper accordingly. The incident offered a real-world test of one of Sakana's stated values: embracing iteration and transparency in pursuit of better AI systems.

Sakana AI's founding ethos lies in merging evolutionary computation with modern machine learning. The company believes current models are too rigid — locked into fixed architectures and requiring retraining for new tasks. By contrast, Sakana aims to create models that adapt in real time, exhibit emergent behavior, and scale naturally through interaction and feedback, much like organisms in an ecosystem.

This vision is already manifesting in products like Transformer², a system that adjusts LLM parameters at inference time without retraining, using algebraic tricks like singular-value decomposition. It's also evident in the company's commitment to open-sourcing systems like the AI Scientist — even amid controversy — demonstrating a willingness to engage with the broader research community, not just compete with it.
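The singular-value trick behind Transformer² can be illustrated in a few lines. The sketch below follows Sakana's published description at a high level (decompose a weight matrix once, then rescale its singular values per task at inference time); the function names and the shape of the task vector z are assumptions, and the real system learns z rather than hand-setting it.

```python
import numpy as np

def decompose(W: np.ndarray):
    """One-time SVD of a weight matrix: W = U @ diag(S) @ Vt."""
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    return U, S, Vt

def adapt(U, S, Vt, z: np.ndarray) -> np.ndarray:
    """Task-time adaptation: rescale singular values with task vector z,
    leaving the singular directions U and Vt untouched.
    Computes W' = U @ diag(S * z) @ Vt via column broadcasting."""
    return (U * (S * z)) @ Vt

rng = np.random.default_rng(2)
W = rng.normal(size=(16, 16))
U, S, Vt = decompose(W)

z = np.ones_like(S)
z[:4] *= 1.5  # hypothetical task vector boosting the top singular directions
W_adapted = adapt(U, S, Vt, z)
print(np.linalg.norm(W_adapted - W))  # nonzero: adaptation changed the weights
```

Because only the small vector z varies per task, this kind of adaptation is cheap at inference time compared with retraining or storing a full copy of each weight matrix.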
As large incumbents like OpenAI and Google double down on foundation models, Sakana is charting a different course: small, dynamic, biologically inspired systems that think in time, collaborate by design, and evolve through experience.


Hindustan Times
11-05-2025
- Science
- Hindustan Times
Fraud it, till you make it: Silicon Valley's attraction to cons
It was at a CRISPR community event in Cupertino that I overheard two scientists talking about the ethics of creating gene-edited embryos. As I expected, Dr He Jiankui was mentioned. The infamous scientist from China is in the Valley right now, looking for funding and support for a new commercial venture focused on lowering the risk of Alzheimer's disease through gene-editing research.

Ever since CRISPR made gene editing simpler and cheaper, the scientific community across the world has brought out regulations to prevent unethical experiments in editing humans. Dr He gained international infamy in 2018 when he announced that he had used CRISPR on human embryos at a Chinese university lab, resulting in the birth of two designer babies. Even at that time, as an article in Science revealed, he had support from his international colleagues and from Silicon Valley. Once the controversy became global, however, the Chinese government portrayed He as a rogue actor, as did his scientific colleagues in the US and China. But Silicon Valley has a shorter memory, which is why He is here, seven years later.

Dr He's story made me think about how VC firms in the Valley are constantly putting their money into risky, potentially illegal, unregulated ventures, and so are vulnerable to scams and fraud. Just last year, when anyone who used the word 'AI' was getting funded, Devin AI was touted by its company, Cognition AI, as the world's 'first AI software engineer.' The hype helped the startup reach a $2 billion valuation before a software developer on GitHub checked and said the AI couldn't even execute basic engineering tasks, and that the startup was using deception to pretend it could do tasks it couldn't. Regardless of the falsification and deception, VC firms continue to invest in the startup.

Why? Perhaps it's FOMO combined with the unique VC math. About 90% of the startups that VC firms invest in fail. Sometimes the tech doesn't work; sometimes it's the model, or the product, or their vision of the future. Investors in Silicon Valley are used to failure and can take risks. What they're constantly looking for is a tech innovation so disruptive that it cannot be replicated easily and will have unlimited growth potential — giving them a unicorn and quite a lot of profit.

Combine this tendency to take risks with an insane amount of money floating around the Bay Area. In 2024, VC funds here raised about $70 billion, according to data released by PitchBook — all of them looking for the next ChatGPT. This hunger, or desperation, to invest in the next big thing, combined with the ability to take a risk on an emerging technology, leaves these investors vulnerable to being exploited by someone with a great story, at the right moment and with the right hype.

Like CRISPR babies, or Sam Bankman-Fried, who, if you remember, used the money that customers put into his crypto exchange to fund his own crypto trading — something illegal in most markets across the world. In 2021, his exchange FTX was a unicorn and Bankman-Fried was the Valley's much-loved, rather nerdy crypto founder, famous for effective altruism, a twisted capitalist philanthropy that encourages people to make money and then give it away to charity. Everyone — from politicians to venture capitalists — called him a genius. Two years later, when the crypto market crashed, his companies filed for bankruptcy and the scam came to light.
Hundreds of thousands of customers lost their money; Silicon Valley Bank collapsed in one of the largest bank failures in US history; and criminal charges were filed against Bankman-Fried and others in his inner circle.

It's not that Bankman-Fried set out to defraud. The Valley had taught him that it was important to break rules and things, and to move fast. He had done what he had learnt and been rewarded for. And he had probably been under extreme pressure to perform. Once founders get funded, even the most ethical ones are under extreme pressure not only to build the innovation itself but also to grow rapidly and 'fake it till you make it.' The latter, an ethos that prioritizes appearances and hype over substance, encourages inflated valuations and unsustainable practices. Mostly it means lying through your teeth.

One of the oft-quoted scams in this category is that of Elizabeth Holmes, founder of the healthcare startup Theranos. Holmes started the company when she was 19 and claimed she had developed a new technology that could run a multitude of tests on human blood at a fraction of the cost of existing technology. Even though, according to a New Yorker profile, Holmes's details about Theranos's technology were 'comically vague,' in 2015 the company's valuation was $10 billion. Holmes and one of her associates were eventually found guilty of fraud and are currently in prison.

These cycles of hype and crash are a humdrum, everyday part of the Valley's life. In April, after the controversy, Cognition AI slashed its price for an AI engineer from $500 a month to $20 a month. It's currently valued at $4 billion, despite questions about its product. If you have a cool idea and can show the conviction to pursue it, there's a chance that someone in the Valley will fund it. Combine that with optimism about technology, and you know that another scam is as likely as a new hype cycle. Till fraud do us part?

Yahoo
06-05-2025
- Business
- Yahoo
OP_RETURN Limit Removal in Next Release
The debate over Bitcoin's OP_RETURN heats up as developers of Bitcoin Core — the most popular node software — say they plan to scrap the OP_RETURN limit entirely in the next release. The OP_RETURN limit is an 80-byte cap on the amount of arbitrary data that can be embedded in a Bitcoin transaction using a special, unspendable output field.

'Large-data inscriptions are happening regardless and can be done in more or less abusive ways,' said Core contributor and Blockstream engineer Greg Sanders, known as 'instagibbs,' in a post on GitHub announcing the removal. 'The cap merely channels them into more opaque forms that cause damage to the network.'

The debate centers on whether lifting the 80-byte OP_RETURN limit promotes transparency and simplifies data use on Bitcoin, or whether it opens the door to abuse, spam, and a shift away from Bitcoin's financial focus.

On GitHub, Sanders added that enforcing the cap has created perverse incentives, pushing users to embed data in fake public keys or spendable scripts. Removing the limit, he argues, 'yields at least two tangible benefits: a cleaner UTXO set and more consistent default behavior.'

Not everyone is convinced. Core developer Luke Dashjr has long viewed inscriptions and other data storage as spam, and warned in April 2025 that this change was 'utter insanity.' Amid the controversy, Bitcoin Knots — a more customizable fork of Bitcoin Core maintained by Dashjr — has seen growing adoption, hitting approximately 5% of all nodes. Knots appeals to users seeking greater control over what their nodes relay or store, including allowing users to reject non-payment transactions like inscriptions. Some prominent thought leaders in the industry, like Samson Mow, are encouraging node operators not to upgrade their version of Bitcoin Core, or to use Knots instead.

Sanders defended the removal of the cap as aligned with Bitcoin's ethos of minimal, transparent rules. 'By retiring a deterrent that no longer deters,' he wrote, 'Bitcoin Core lets the fee market arbitrate competing demands.'

But that isn't bringing much consensus. 'This marks a fundamental shift in the direction of Bitcoin,' one commenter warned on GitHub. 'This is the largest mistake Core can make at this juncture,' another added. 'I want to be on the record saying that.'

CORRECTION (May 6, 08:14 UTC): Removes 'former' developer from seventh paragraph.
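For readers unfamiliar with the mechanics under debate, here is a rough Python sketch of how an OP_RETURN output script is encoded and how the 80-byte cap applies. It illustrates the widely documented script format and default relay policy, not Bitcoin Core's actual C++ policy code; the constant and function names are assumptions.

```python
OP_RETURN = 0x6a      # opcode marking the output as an unspendable data carrier
OP_PUSHDATA1 = 0x4c   # prefix for data pushes of 76-255 bytes
MAX_DATA_BYTES = 80   # the default relay cap the next Core release would drop

def op_return_script(data: bytes) -> bytes:
    """Serialize arbitrary data into an unspendable OP_RETURN output script."""
    if len(data) <= 75:
        push = bytes([len(data)])                 # single-byte direct-push opcode
    elif len(data) <= 255:
        push = bytes([OP_PUSHDATA1, len(data)])   # OP_PUSHDATA1 + length byte
    else:
        raise ValueError("longer pushes need OP_PUSHDATA2/4")
    return bytes([OP_RETURN]) + push + data

def passes_old_cap(data: bytes) -> bool:
    """Approximate the pre-change standardness rule: relay only if the
    embedded payload is at most 80 bytes."""
    return len(data) <= MAX_DATA_BYTES

msg = b"hello, timestamped world"
script = op_return_script(msg)
print(script.hex(), passes_old_cap(msg))  # relayable under the old default policy
```

Note that the cap was always a relay policy, not a consensus rule, which is why oversized payloads could still reach the chain through other channels — the dynamic Sanders cites as the reason to retire it.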


Mint
06-05-2025
- Business
- Mint
OpenAI agrees to buy AI coding startup Windsurf for $3 billion: Report
OpenAI has agreed to acquire AI-based coding tool Windsurf for around $3 billion, according to a Bloomberg report. The acquisition of Windsurf (formerly called Codeium) has not yet closed; if completed, it would be the largest ever by the ChatGPT maker.

Prior to the OpenAI deal, Windsurf was reportedly in talks with investors such as Kleiner Perkins and General Catalyst to raise funding at a $3 billion valuation. The generative AI startup was valued at $1.25 billion in a deal led by General Catalyst last year.

OpenAI rivals like Anthropic and Microsoft's GitHub already offer AI-based coding tools for programmers, and the Windsurf acquisition could help the company deliver a similar product. The new crop of AI-based tools lets users generate code from natural-language text prompts, helping developers write code faster, fix bugs, and debug complex logic.

OpenAI recently received $40 billion in financing led by Japan's SoftBank, valuing the company at $300 billion.

In other related news, OpenAI on Monday announced that it was walking back its plan to restructure the organization in a way that would remove control from its non-profit arm. Under the new structure, OpenAI's for-profit unit will be converted into a Public Benefit Corporation (PBC), but the non-profit entity will be a big stakeholder and continue to retain control.

'Our for-profit LLC, which has been under the nonprofit since 2019, will transition to a Public Benefit Corporation (PBC)–a purpose-driven company structure that has to consider the interests of both shareholders and the mission,' OpenAI CEO Sam Altman said in a blog post. 'Instead of our current complex capped-profit structure—which made sense when it looked like there might be one dominant AGI effort but doesn't in a world of many great AGI companies—we are moving to a normal capital structure where everyone has stock. This is not a sale, but a change of structure to something simpler,' he added.

First Published: 6 May 2025, 08:02 AM IST