
Nvidia launches fully open source transcription AI model Parakeet-TDT-0.6B-V2 on Hugging Face
Nvidia has become one of the world's most valuable companies in recent years as investors have recognized the enormous demand for its graphics processing units (GPUs), the powerful chips originally used to render video game graphics and now, increasingly, to train large language and diffusion models.
But Nvidia does far more than make hardware and the software to run it, of course. As the generative AI era wears on, the Santa Clara-based company has steadily released more of its own AI models, most of them open source and free for researchers and developers to download, modify and use commercially. The latest among them is Parakeet-TDT-0.6B-v2, an automatic speech recognition (ASR) model that can, in the words of Hugging Face's Vaibhav 'VB' Srivastav, 'transcribe 60 minutes of audio in 1 second [mind blown emoji].'
This is the new generation of the Parakeet model that Nvidia first unveiled in January 2024 and updated that April. Version two currently tops the Hugging Face Open ASR Leaderboard with an average word error rate (WER), the percentage of spoken words the model transcribes incorrectly, of just 6.05%.
To put that in perspective, it approaches the accuracy of proprietary transcription models such as OpenAI's GPT-4o-transcribe (a reported WER of 2.46% in English) and ElevenLabs Scribe (3.3%).
And it offers all of this while remaining freely available under the commercially permissive Creative Commons CC-BY-4.0 license, making it an attractive proposition for enterprises and indie developers looking to build speech recognition and transcription into their paid applications.
The model packs 600 million parameters and pairs a FastConformer encoder with a Token-and-Duration Transducer (TDT) decoder.
It is capable of transcribing an hour of audio in just one second, provided it's running on Nvidia's GPU-accelerated hardware.
Its throughput is measured at an RTFx (inverse real-time factor, roughly the seconds of audio transcribed per second of compute) of 3386.02 with a batch size of 128, placing it at the top of the current ASR benchmarks maintained by Hugging Face.
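As a rough, illustrative sanity check (not an official benchmark script), that RTFx figure lines up with the one-hour-in-about-a-second claim:

```python
# Illustrative arithmetic only: if RTFx is audio duration divided by processing time,
# then the time to transcribe one hour of audio is roughly 3600 s / RTFx.
rtfx = 3386.02            # reported inverse real-time factor at batch size 128
audio_seconds = 60 * 60   # one hour of audio
processing_seconds = audio_seconds / rtfx
print(f"~{processing_seconds:.2f} s to transcribe 1 hour of audio")  # ~1.06 s
```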
Released globally on May 1, 2025, Parakeet-TDT-0.6B-v2 is aimed at developers, researchers, and industry teams building applications such as transcription services, voice assistants, subtitle generators, and conversational AI platforms.
The model supports punctuation, capitalization, and detailed word-level timestamping, offering a full transcription package for a wide range of speech-to-text needs.
Developers can deploy the model using Nvidia's NeMo toolkit. The setup process is compatible with Python and PyTorch, and the model can be used directly or fine-tuned for domain-specific tasks.
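As a minimal sketch of that flow, the snippet below loads the checkpoint through NeMo and transcribes a file. The Hugging Face model ID comes from the release; the audio file name is a placeholder, and the timestamps flag and the exact shape of the returned hypothesis objects are assumptions that may vary between NeMo versions.

```python
# Minimal sketch: transcribing audio with Parakeet-TDT-0.6B-v2 via Nvidia's NeMo toolkit.
# Assumes the NeMo ASR extras are installed, e.g.: pip install -U "nemo_toolkit[asr]"
import nemo.collections.asr as nemo_asr

# Download the checkpoint from Hugging Face and load it (uses a GPU if one is available).
asr_model = nemo_asr.models.ASRModel.from_pretrained(
    model_name="nvidia/parakeet-tdt-0.6b-v2"
)

# "sample.wav" is a placeholder for a 16 kHz mono audio file.
# timestamps=True asks for word-level timing alongside the text; the attribute names
# on the returned hypotheses may differ between NeMo releases.
output = asr_model.transcribe(["sample.wav"], timestamps=True)
print(output[0].text)                   # punctuated, capitalized transcript
print(output[0].timestamp["word"][:5])  # first few word-level timestamps
```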
The open-source license (CC-BY-4.0) also allows for commercial use, making it appealing to startups and enterprises alike.
Parakeet-TDT-0.6B-v2 was trained on a diverse and large-scale corpus called the Granary dataset. This includes around 120,000 hours of English audio, composed of 10,000 hours of high-quality human-transcribed data and 110,000 hours of pseudo-labeled speech.
Sources range from well-known datasets like LibriSpeech and Mozilla Common Voice to YouTube-Commons and Librilight.
Nvidia plans to make the Granary dataset publicly available following its presentation at Interspeech 2025.
The model was evaluated across multiple English-language ASR benchmarks, including AMI, Earnings22, GigaSpeech, and SPGISpeech, and showed strong generalization performance. It remains robust under varied noise conditions and performs well even with telephony-style audio formats, with only modest degradation at lower signal-to-noise ratios.
Parakeet-TDT-0.6B-v2 is optimized for Nvidia GPU environments, supporting hardware such as the A100, H100, T4, and V100 boards.
While high-end GPUs maximize performance, the model can still be loaded on systems with as little as 2GB of RAM, allowing for broader deployment scenarios.
Nvidia notes that the model was developed without the use of personal data and adheres to its responsible AI framework.
Although no specific measures were taken to mitigate demographic bias, the model passed internal quality standards and includes detailed documentation on its training process, dataset provenance, and privacy compliance.
The release drew attention from the machine learning and open-source communities, especially after being publicly highlighted on social media. Commentators noted the model's ability to outperform commercial ASR alternatives while remaining fully open source and commercially usable.
Developers interested in trying the model can access it via Hugging Face or through Nvidia's NeMo toolkit. Installation instructions, demo scripts, and integration guidance are readily available to facilitate experimentation and deployment.