
ByteDance's Dolphin OCR Sets New Benchmark in Document AI
ByteDance has unveiled 'Dolphin', an OCR model released under an MIT licence that combines layout analysis and parsing in a unified workflow. The tool is poised to enhance accuracy and adaptability across complex document types, marking a notable advance in optical character recognition.
Dolphin operates by first analysing the document layout—identifying paragraphs, tables, figures and formulas—and then parsing each section in parallel, a method experts describe as 'analyze‑then‑parse'. The model architecture aligns with Donut, a document-oriented vision‑language model, but excels by integrating a two‑step pipeline that improves both structural understanding and text recognition efficiency.
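The two-stage flow described above can be sketched in a few lines of Python. This is a minimal illustration of the 'analyze-then-parse' idea, not Dolphin's actual code: `Element`, `analyze_layout` and `parse_element` are hypothetical stand-ins for the model's layout and recognition stages, and a thread pool stands in for whatever batching the real model uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Element:
    kind: str      # "paragraph", "table", "figure" or "formula"
    order: int     # reading-order position assigned in stage 1
    content: str   # stand-in for the cropped image region


def analyze_layout(page: list[tuple[str, str]]) -> list[Element]:
    """Stage 1 (hypothetical stub): list the page's elements in reading order."""
    return [Element(kind, i, raw) for i, (kind, raw) in enumerate(page)]


def parse_element(element: Element) -> str:
    """Stage 2 (hypothetical stub): recognise the content of one element."""
    return f"[{element.kind}] {element.content}"


def analyze_then_parse(page: list[tuple[str, str]], max_workers: int = 4) -> list[str]:
    elements = analyze_layout(page)
    # Once the layout is known, elements are independent and can be parsed concurrently.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Executor.map yields results in input order, so reading order is preserved.
        return list(pool.map(parse_element, elements))


page = [("paragraph", "Intro text"), ("table", "a|b"), ("formula", "E=mc^2")]
result = analyze_then_parse(page)
```

The point of the sketch is the shape of the pipeline: layout first, then independent per-element parsing whose results are reassembled in reading order.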
Since its LinkedIn announcement, Dolphin and its source code have been published on GitHub and the Hugging Face Hub. Industry users, including practitioners from the Transformers community, have benchmarked it, noting its strong performance on structured documents containing scientific equations and dense layouts. Initial commentary suggests Dolphin matches or outperforms contemporaries like Donut and DocFormer in speed and layout robustness.
This release underscores ByteDance's expanding role in document‑AI, under its BytePlus technology brand. BytePlus has been promoting OCR and translation capabilities via ModelArk, targeting finance, small business, logistics and automatable workflows. With OCR projected to become a US$43 billion market by 2032, driven by demand in the banking, healthcare and supply chain sectors, Dolphin arrives at a critical juncture for industry needs.
Key to Dolphin's innovation is layout‑first processing. By segmenting a document before interpreting its textual content, it reduces errors, particularly on documents with heterogeneous formats. As noted by Merve Noyan and others, this approach facilitates precise parsing of tables, mathematical notation, captions and images. Early adopters are testing its effectiveness on complex scientific papers and structured forms, areas where traditional OCR solutions frequently falter.
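One way to see why segmenting first helps with heterogeneous formats is that each segment type can be routed to its own handler. The sketch below is an assumption-laden illustration, not Dolphin's implementation: the segment types, the `RENDERERS` table and the Markdown and LaTeX output formats are all hypothetical choices made for the example.

```python
def render_paragraph(text: str) -> str:
    return text.strip()


def render_table(rows: list[list[str]]) -> str:
    # Emit a Markdown table: header row, separator row, then body rows.
    header, *body = rows
    lines = ["| " + " | ".join(header) + " |",
             "| " + " | ".join("---" for _ in header) + " |"]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def render_formula(latex: str) -> str:
    return f"$${latex}$$"


def render_caption(text: str) -> str:
    return f"*{text.strip()}*"


# Hypothetical dispatch table: one renderer per segment type found by layout analysis.
RENDERERS = {
    "paragraph": render_paragraph,
    "table": render_table,
    "formula": render_formula,
    "caption": render_caption,
}


def render_document(segments: list[tuple[str, object]]) -> str:
    parts = []
    for kind, payload in segments:
        renderer = RENDERERS.get(kind)
        if renderer is None:
            raise ValueError(f"no renderer for segment type {kind!r}")
        parts.append(renderer(payload))
    return "\n\n".join(parts)


segments = [
    ("paragraph", "  Results are in Table 1. "),
    ("table", [["Model", "Score"], ["A", "0.9"]]),
    ("formula", "E = mc^2"),
]
markdown = render_document(segments)
```

Because a table is never handed to the paragraph handler, and vice versa, a mistake in one segment type does not bleed into the others.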
ByteDance enters a crowded landscape of emerging OCR tools. Nanonets' small model supports markdown and LaTeX; MonkeyOCR from Huazhong University follows a structure‑recognition‑relation paradigm; and giants like Google, Microsoft and IBM continue to offer strong enterprise OCR services. Yet Dolphin distinguishes itself through open‑source licensing and its advanced pipeline, potentially accelerating adoption and collaborative development.
Despite its promise, Dolphin's real-world strengths remain to be quantified. Benchmarks comparing its accuracy, latency and resource usage against established commercial solutions are limited. Additionally, its performance under varying document quality, such as low‑resolution scans, handwriting or languages beyond English, has not been fully validated. Experts had expected comparative benchmarks; however, ByteDance has not yet released detailed evaluations.
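For readers who want to run such comparisons themselves, accuracy and latency can be measured with a small harness. The sketch below uses character error rate (edit distance divided by reference length), a common OCR accuracy metric; the article does not say which metrics evaluators used, and `ocr_fn` is a hypothetical callable wrapping whichever engine is under test.

```python
import time


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def character_error_rate(predicted: str, reference: str) -> float:
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(predicted, reference) / len(reference)


def benchmark(ocr_fn, samples):
    """Return mean CER and mean latency over (input, reference_text) pairs."""
    if not samples:
        raise ValueError("at least one sample is required")
    total_cer = 0.0
    total_seconds = 0.0
    for image, reference in samples:
        start = time.perf_counter()
        predicted = ocr_fn(image)
        total_seconds += time.perf_counter() - start
        total_cer += character_error_rate(predicted, reference)
    n = len(samples)
    return {"mean_cer": total_cer / n, "mean_latency_s": total_seconds / n}


# Toy run: the "engine" simply echoes its input, so only the second sample has an error.
report = benchmark(lambda image: image, [("abc", "abc"), ("abd", "abc")])
```

Running the same harness over low-resolution scans, handwriting or non-English pages would be one way to probe the gaps noted above; resource usage would need separate instrumentation.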
ByteDance's broader AI portfolio supports the strategic placement of Dolphin within an integrated multimodal stack. The firm's other recent innovations include Seed 1.5‑VL, a state‑of‑the‑art vision‑language model acclaimed for visual reasoning, GUI interaction and OCR applications, and the Doubao chatbot, enhanced with visual‑language capabilities for real‑time analysis in video calls. Together, these systems showcase ByteDance's ambition to lead in both document‑centric and broad visual‑language AI.
By open‑sourcing Dolphin, ByteDance enables community collaboration and integration into platforms like Hugging Face, where machine learning engineers are already adapting the model into tools such as Transformers, vLLM and Docext. This contrasts with more proprietary offerings, opening pathways for wider testing, research, and adaptation in niche domains such as regulatory compliance, legal document processing or academic publishing.
Adoption of Dolphin benefits organisations aiming to automate complex documentation tasks—ranging from invoice reconciliation and regulatory filings to academic publishing and insurance claims. The layout‑aware model structure enhances recognition and data extraction accuracy, while the permissive licence removes traditional barriers to deployment. Its integration into BytePlus also enables developers to tap into scalable API and cloud‑based services, suited for finance, logistics and SME segments.
However, adoption of Dolphin in enterprise systems will depend on rigorous validation. Leading market players, such as ABBYY, Adobe Acrobat and Microsoft Azure, continue to set high standards in OCR performance, ecosystem support and regulatory compliance. To compete effectively, ByteDance must supply detailed performance tests, language support and enterprise‑grade features. Furthermore, addressing security, data privacy and accuracy in edge‑case layouts remains vital.
The emergence of Dolphin reflects an accelerating trend: OCR is evolving beyond simple character reading into intelligent document understanding powered by AI and visual‑language paradigms. As the global OCR market approaches an estimated US$43 billion, technologies like Dolphin are expanding the frontier of what automated document systems can achieve.
Related Articles


Zawya
5 hours ago
- Zawya
MIT Jameel Clinic and CSAIL launch new AI model accelerating the future of drug discovery ‘Boltz-2'
Cambridge, Massachusetts – The Jameel Clinic, the epicentre of artificial intelligence (AI) and health at the Massachusetts Institute of Technology (MIT), announced today the release of Boltz-2, a groundbreaking artificial intelligence model which will transform the speed and accuracy of drug discovery. The announcement was made together with the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL) and the biotechnology company Recursion. Boltz-2 breaks new ground by jointly modelling both structure and binding affinity, a critical parameter in small molecule drug discovery.
A big leap for small molecule drug discovery
Boltz-2 builds on the success of Boltz-1, a pioneering model first released in 2024 that can determine protein structures, by adding a powerful new ability: accurately predicting how strongly a drug molecule will bind to a target protein, a crucial factor in determining its effectiveness. In doing so, Boltz-2 addresses one of the most complex challenges in early-stage drug development. Boltz-2's affinity module was trained on millions of real lab measurements, showing how strongly different molecules bind to proteins. Thanks to this, Boltz-2 can now predict binding strength with unprecedented accuracy across several benchmarks reflecting different stages of real-world drug discovery. Boltz-2's predictions come very close to those produced by full-physics free energy perturbation (a precise computer simulation that predicts how strongly a drug sticks to its target, but that can take up to a day to run one test even on a GPU), at over 1,000 times the speed. It is the first deep learning model to deliver that level of precision.
Saro Passaro, researcher at the MIT Jameel Clinic and co-lead of the Boltz-2 project, said: 'This release is especially significant for small molecule drug discovery, where progress has lagged behind the rapid gains seen in biologics and protein engineering.
'While models like AlphaFold and Boltz-1 allowed a significant leap in the computational design of antibodies and protein-based therapeutics, we have not seen a similar improvement in our ability to screen small molecules, which make up the majority of drugs in the global pipeline.
'Boltz-2 directly addresses this gap by providing accurate binding affinity predictions that can dramatically reduce the cost and time of early-stage screening.'
Gabriele Corso, PhD student at MIT CSAIL and one of the lead researchers behind Boltz-1 and Boltz-2, said: 'This performance increase makes Boltz-2 not just a research tool, but a practical engine for real-world drug development.
'Instead of spending hours simulating the interaction between a single molecule and its target, scientists can now screen vast chemical libraries within the same time frame, enabling early-stage teams to prioritise only the most promising compounds for lab testing.'
Open-source and optimised for medical research
The Boltz-2 model also introduces a new feature, Boltz-Steering, which refines molecular structure predictions and makes them more realistic. This allows researchers to guide the model using experimental data, example structures, or design goals, giving them greater control and customisability in their search for new treatments. Boltz-2 will be released as a fully open-source model under the MIT licence, including the code, weights and training data, enabling researchers around the world to freely access and build upon its capabilities.
A breakthrough for the MIT Jameel Clinic and CSAIL
The model represents a major milestone in an ambitious research programme launched in early 2023 by the MIT Jameel Clinic and CSAIL. The team set out to develop a machine learning system that could not only predict the 3D shape of proteins, as AlphaFold does, but also understand how and why molecules interact, as well as how likely they are to bind to each other.
This deeper understanding is essential for designing effective new therapies, particularly for diseases caused by molecular dysfunction. Boltz-1, released in 2024, was the first result of that effort. Created as a fast, accessible alternative to AlphaFold3, Boltz-1 quickly became the most widely adopted open-source tool of its kind, used by thousands of scientists across academia, biotech startups, and pharmaceutical companies. It demonstrated that open and interpretable models could rival the best in the field. Now, with Boltz-2, the MIT team is taking the next step — targeting small molecule drug discovery, an area that has historically lagged behind biologics and protein engineering in terms of computational tools. Boltz-2 is the latest milestone in MIT Jameel Clinic's growing portfolio of open-source tools for health, developed at the intersection of AI and medicine — and part of a broader mission by the Jameel Clinic to make cutting-edge technology accessible for solving the world's most pressing health challenges. The team includes MIT Jameel Clinic AI faculty lead Professor Regina Barzilay; MIT CSAIL principal investigator Professor Tommi Jaakkola; PhD students Gabriele Corso and Jeremy Wohlwend; MIT Jameel Clinic researcher Saro Passaro; as well as additional collaborators from Recursion. About Jameel Clinic: The Jameel Clinic is the epicentre of artificial intelligence (AI) and healthcare at MIT. It works to develop AI technologies that will change the landscape of healthcare. This includes early diagnostics, drug discovery, care personalisation and management. Building on MIT's pioneering history in artificial intelligence and life sciences, the Jameel Clinic works on novel algorithms suitable for modelling biological and clinical data across a range of modalities including imaging, text and genomics. While achieving this goal, the team strives to make new discoveries in machine learning, biology, chemistry and clinical sciences. 
The Jameel Clinic was co-founded in 2018 by MIT and Community Jameel, the independent, global organisation advancing science to help communities thrive in a rapidly changing world. About Community Jameel: Community Jameel advances science and learning for communities to thrive. An independent, global organisation, Community Jameel was launched in 2003 to continue the tradition of philanthropy and community service established by the Jameel family of Saudi Arabia in 1945. Community Jameel supports scientists, humanitarians, technologists and creatives to understand and address pressing human challenges in areas such as climate change, health and education. The work enabled and supported by Community Jameel has led to significant breakthroughs and achievements, including the MIT Jameel Clinic's discovery of the new antibiotics Halicin and Abaucin, critical modelling of the spread of COVID-19 conducted by the Jameel Institute at Imperial College London, and a Nobel Prize-winning experimental approach to alleviating global poverty developed by the co-founders of the Abdul Latif Jameel Poverty Action Lab at MIT.


Hi Dubai
5 hours ago
- Hi Dubai
Grow Big on a Small Budget: Smart Digital Marketing for Dubai Businesses
What if you could grow your business in Dubai without breaking the bank? In a city where skyscrapers compete for attention and innovation drives progress, standing out in the business world can feel like a daunting challenge. Dubai's market is a vibrant mix of opportunity and competition, with businesses vying for the attention of a diverse, affluent and tech-savvy audience. For small and medium enterprises (SMEs) and startups, the pressure to make every dirham count is real. Lavish marketing campaigns might work for global giants, but for most local businesses, cost-effective strategies are the key to sustainable growth. Dubai's business landscape is unique. It blends local traditions with a global outlook, attracting entrepreneurs from every corner of the world. The city's residents and visitors, from Emiratis to expatriates, form a multicultural consumer base with varied preferences. This diversity demands marketing that is smart, targeted and budget-conscious. Cost-effective digital marketing isn't about cutting corners; it's about maximizing impact with minimal spend. By leveraging free tools, local insights and strategic approaches, businesses can thrive without draining their resources. Explore the actionable, budget-friendly digital marketing strategies tailored for Dubai businesses. From understanding your audience to mastering local SEO and harnessing the power of social media, these tips will help SMEs and startups compete in Dubai's fast-paced market. Know Your Audience: Local Market Insight Success in Dubai's market starts with knowing who you're talking to. The city's population is a melting pot of cultures, with over 200 nationalities calling it home. This diversity shapes consumer behavior, making it critical to understand your target audience. Are you catering to affluent Emiratis, young expat professionals or budget-conscious families? Each group has distinct preferences and tailoring your marketing to their needs is essential. 
Data and analytics are your best friends here. Tools like Google Analytics and social media insights can reveal who's engaging with your brand. For example, you might find that your audience is mostly young professionals in Dubai Marina who prefer Instagram over LinkedIn. Use this data to refine your messaging and focus your efforts. Surveys or feedback forms can also provide direct insights into customer preferences without costing a fortune. Cultural sensitivity is non-negotiable. Dubai's consumers value respect for local traditions, even in a cosmopolitan city. Avoid content that could be seen as culturally insensitive and consider language preferences. While English is widely used, Arabic content can resonate deeply with Emirati and Arab expat audiences. Bilingual posts or ads can broaden your reach while showing respect for the local culture. Digital marketing doesn't require a hefty budget when you tap into free or affordable tools. Social media platforms like Instagram, LinkedIn and TikTok are wildly popular in the UAE, with Instagram leading the pack for consumer engagement. These platforms offer free accounts and tools to create professional content. For instance, Instagram Stories and Reels let you showcase your brand creatively without spending a dime. Google My Business is a must for any Dubai business. It's free, easy to set up and boosts your visibility on Google Search and Maps. A complete profile with photos, hours and contact details can drive foot traffic to your store or office. Canva is another gem, offering free templates for eye-catching graphics, from social media posts to flyers. Meta Business Suite lets you manage your Facebook and Instagram accounts, schedule posts and track performance; all at no cost. For SEO, tools like Ubersuggest and AnswerThePublic offer free or low-cost plans to identify keywords and trending topics. 
These tools help you understand what your audience is searching for, allowing you to optimize your website or blog content. By combining these tools, you can build a robust digital presence without straining your budget. Content Marketing on a Budget Content is king, even on a shoestring budget. High-value content like blogs, videos and infographics can attract and engage customers without costing a fortune. Start a blog on your website to share tips, industry insights or local stories relevant to your audience. For example, a Dubai fitness studio could write about 'Top 5 Outdoor Workout Spots in Dubai' to draw in health-conscious residents. Repurposing content is a game-changer. Turn a blog post into a series of Instagram posts, a short TikTok video or an email newsletter. This stretches your content's reach without extra effort. Tools like Canva or Adobe Express (free versions) make it easy to create visuals that pop. Video content doesn't need to be expensive either; smartphone cameras and free editing apps like CapCut can produce professional-looking clips. User-generated content is a goldmine. Encourage customers to share photos or reviews of your product or service, then feature them on your platforms. A café in Al Barsha could ask patrons to post their favorite coffee moment with a branded hashtag. Testimonials and reviews add authenticity and build trust, especially when shared on social media or your website. Choose the Right Platforms Not all platforms suit every business. B2B companies, like consultancies in DIFC, thrive on LinkedIn for professional networking. B2C businesses, such as restaurants or retail, do better on Instagram or TikTok, where visuals drive engagement. Research where your audience spends time and focus your efforts there. Organic Growth Tactics Hashtags are your friend in Dubai's social media scene. Use location-specific tags like #DubaiEats or #DubaiFitness to reach local users. Reels and live sessions are powerful for engagement. 
A boutique could host a live styling session on Instagram, showcasing new arrivals. Consistency is key; post regularly to stay top-of-mind. Micro-Influencer Collaborations Dubai's influencer market is booming, but you don't need a celebrity budget. Micro-influencers (1,000–10,000 followers) often have engaged, niche audiences. A small spa in JLT could partner with a local wellness blogger for a review or giveaway. These partnerships are affordable and deliver targeted exposure. Local SEO & Google Maps Optimization Local SEO is a lifeline for Dubai businesses, especially those relying on foot traffic. When someone searches 'coffee shop near me' or 'best salon in Dubai,' you want your business to appear. Optimizing your Google Business Profile is the first step. Ensure your profile is complete with accurate details, high-quality photos and regular updates like posts about promotions or events. Encourage customers to leave reviews. Positive reviews boost your ranking and build trust. Respond to reviews, good or bad, to show you value feedback. Use location-based keywords on your website and content, like 'Dubai Marina gym' or 'Deira tailor.' These terms help you rank higher in local searches, driving more clicks and visits. Google Maps is a powerful tool in Dubai, where navigation is key for residents and tourists. A well-optimized profile ensures your business pops up when someone searches for services in your area. Regularly update your profile with new photos or offers to stay relevant. Email Marketing and WhatsApp Campaigns Email marketing remains a cost-effective way to nurture customer relationships. Start by building an email list organically: offer a discount or free resource (like an eBook) in exchange for sign-ups. Tools like Mailchimp or Sender offer free plans for small lists, with templates for professional emails. Personalize your emails with the recipient's name or tailored offers to boost engagement. 
WhatsApp Business is a Dubai favorite due to its widespread use. It's free and allows direct, personal communication. Share updates, promotions or appointment reminders with customers. For example, a salon could send a quick message about last-minute slots. Keep messages concise and professional to avoid spamming your audience. Automation saves time. Use tools to schedule emails or WhatsApp broadcasts, ensuring consistent communication without daily effort. Track open rates and clicks to refine your campaigns over time. Budget-Friendly Paid Ads Paid ads don't have to be expensive. Platforms like Meta (Facebook and Instagram) and Google Ads offer flexible budgets, letting you start small. Geo-targeting is a must in Dubai; focus your ads on specific areas like Bur Dubai or by language to reach your ideal audience. A restaurant in Downtown Dubai could target ads to nearby residents or office workers. A/B testing is critical to maximize ROI. Create two versions of an ad with different headlines or images, then see which performs better. Start with a small budget, analyze results and scale up what works. Google Ads' keyword planner helps you find affordable, high-traffic keywords to bid on, ensuring your ad spend delivers results. Collaboration is a low-cost way to expand your reach. Partner with complementary businesses, like a gym teaming up with a healthy café for a joint promotion. Cross-promote through social media shoutouts, shared content or bundled offers. A pet store and a vet clinic could co-host a pet care workshop, splitting costs and doubling exposure. Community engagement is powerful in Dubai. Join local business groups or participate in events like markets or festivals. Host a giveaway with another business to attract new customers. These strategies build relationships and expose your brand to new audiences without hefty marketing costs. Tracking ROI and Adjusting Tactics You can't improve what you don't measure. 
Free tools like Google Analytics and Meta Insights track website visits, ad performance and social media engagement. Set clear KPIs, like website traffic or lead conversions, to gauge success. For example, a retail store might aim for 100 new Instagram followers monthly. Regularly review your data. If a strategy isn't working, like low engagement on TikTok, pivot to what is, like Instagram Reels. Flexibility is key in Dubai's dynamic market. Use insights to refine your campaigns, ensuring your budget delivers maximum impact. Final Tips for Staying Competitive on a Budget Stay on Top of Trends Dubai's digital landscape evolves fast. Follow local influencers or industry pages on X to spot emerging platforms or tactics. For instance, short-form video content is booming; adapt by creating quick, engaging clips. Invest Time, Not Just Money DIY marketing can save thousands. Learn basic graphic design or SEO through free online courses. Outsource only when necessary, like for complex ad campaigns, to keep costs low. Consistency Over Flash A consistent online presence trumps sporadic, expensive campaigns. Post regularly, engage with followers and update your Google profile to stay visible, even with limited resources. For SMEs and startups, the key is to act smart, not just spend big. A consistent, targeted digital presence builds trust and loyalty in Dubai's fast-moving market. Start small, experiment, and scale what works. With these cost-effective strategies, your business can shine as brightly as Dubai's skyline. Also Read: Influencer Marketing in Dubai: A Guide for Businesses to Drive ROI Unlock ROI with our practical guide to Influencer Marketing in Dubai. Learn strategies, navigate UAE regulations, identify key players & avoid pitfalls for business growth. 
Insights into the Growing Influencer Marketing Scene in MENA Influencers leverage multiple platforms, channels, and media to creatively connect with their followers, while brands seek successful collaborations to tap into these audiences. The Benefits of Investing in Online Advertising and Google Ads This article explores the many benefits of online advertising, especially how Google Ads can revolutionize your marketing strategy and help your business flourish in the digital age Dubai's Thriving E-commerce Market: How to Start Your Online Business Are you fascinated by Dubai and its remarkable advancements so far? Check out this comprehensive guide on achieving E-commerce success in the UAE. Top Advertising Agencies in Dubai Advertising firms in Dubai play a multifaceted role in the business landscape by creating impactful campaigns that enhance brand recognition, drive sales, and cultivate customer loyalty

