
Latest news with #BytePlus

ByteDance's Dolphin OCR Sets New Benchmark in Document AI

Arabian Post

23-06-2025


ByteDance has unveiled 'Dolphin', an OCR model released under an MIT licence and designed to revolutionise document processing by combining layout analysis and parsing in a unified workflow. This new tool is poised to enhance accuracy and adaptability across complex document types, marking a major advancement in optical character recognition.

Dolphin operates by first analysing the document layout, identifying paragraphs, tables, figures and formulas, and then parsing each section in parallel, a method experts describe as 'analyze‑then‑parse'. The model architecture aligns with Donut, a document-oriented vision‑language model, but excels by integrating a two‑step pipeline that improves both structural understanding and text recognition efficiency.

Since its LinkedIn announcement, Dolphin and its source code have been published on GitHub and the Hugging Face Hub. Industry users, including practitioners from the Transformers community, have actively benchmarked it, noting its strong performance for structured documents containing scientific equations and dense layouts. Initial commentary suggests Dolphin matches or outperforms contemporaries like Donut and DocFormer in speed and layout robustness.

This release underscores ByteDance's expanding role in document‑AI, under its BytePlus technology brand. BytePlus has been promoting OCR and translation capabilities via ModelArk, targeting finance, small business, logistics and automatable workflows. With OCR projected to become a US$43 billion market by 2032, driven by demand in banking, healthcare and supply chain sectors, Dolphin arrives at a critical juncture for industry needs.

Key to Dolphin's innovation is layout‑first processing. By segmenting a document before interpreting textual content, it reduces errors, particularly on documents with heterogeneous formats. As noted by Merve Noyan and others, this approach facilitates precise parsing of tables, mathematical notation, captions and images.
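
The layout‑first flow described above can be illustrated with a minimal sketch. This is not Dolphin's actual code or API: `Region`, `analyse_layout`, `parse_region` and `analyse_then_parse` are hypothetical names, the "page" is a plain list of labelled strings standing in for an image, and a thread pool stands in for whatever parallelism the real model uses.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Region:
    kind: str     # e.g. "paragraph", "table", "figure", "formula"
    order: int    # position in reading order
    content: str  # stand-in for the cropped image of this region


def analyse_layout(page):
    """Stage 1 (hypothetical): split a page into typed regions in reading order.

    Here the page is already a list of (kind, raw_text) pairs; a real system
    would run a layout model on the page image instead.
    """
    return [Region(kind, i, raw) for i, (kind, raw) in enumerate(page)]


def parse_region(region):
    """Stage 2 (hypothetical): apply an element-specific parser to one region."""
    parsers = {
        "table": lambda s: f"[table] {s}",
        "formula": lambda s: f"$ {s} $",
    }
    parser = parsers.get(region.kind, str.strip)  # default: plain text clean-up
    return region.order, parser(region.content)


def analyse_then_parse(page):
    """Analyse the layout first, then parse every region concurrently."""
    regions = analyse_layout(page)
    with ThreadPoolExecutor() as pool:
        parsed = list(pool.map(parse_region, regions))
    # Reassemble the results in reading order.
    return [text for _, text in sorted(parsed)]


if __name__ == "__main__":
    page = [("paragraph", " Hello "), ("formula", "E = mc^2"), ("table", "a | b")]
    print(analyse_then_parse(page))
```

The point of the split is that each region is handled by a parser suited to its type, and the regions do not depend on each other, so they can be processed side by side and stitched back together by reading order.
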
Early adopters are testing its effectiveness on complex scientific papers and structured forms, areas where traditional OCR solutions frequently falter.

ByteDance enters a crowded landscape of emerging OCR tools. Nanonets' small model supports markdown and LaTeX; MonkeyOCR from Huazhong University follows a structure‑recognition‑relation paradigm; and giants like Google, Microsoft and IBM continue to offer strong enterprise OCR services. Yet Dolphin distinguishes itself through open‑source licensing and its advanced pipeline, potentially accelerating adoption and collaborative development.

Despite its promise, Dolphin's real-world strengths remain to be quantified. Benchmarks comparing its accuracy, latency and resource usage against established commercial solutions are limited. Additionally, performance under varying document quality, such as low‑resolution scans, handwriting or languages beyond English, has not been fully validated. Experts expected comparative benchmarks; however, ByteDance has not yet released detailed evaluations.

ByteDance's broader AI portfolio supports the strategic placement of Dolphin within an integrated multimodal stack. The firm's other recent innovations include Seed 1.5‑VL, a state‑of‑the‑art vision‑language model acclaimed for visual reasoning, GUI interaction and OCR applications, and the Doubao chatbot, enhanced with visual‑language capabilities for real‑time analysis in video calls. Together, these systems showcase ByteDance's ambition to lead in both document‑centric and broad visual‑language AI.

By open‑sourcing Dolphin, ByteDance enables community collaboration and integration into platforms like Hugging Face, where machine learning engineers are already adapting the model into tools such as Transformers, vLLM and Docext.
This contrasts with more proprietary offerings, opening pathways for wider testing, research, and adaptation in niche domains such as regulatory compliance, legal document processing or academic publishing.

Adoption of Dolphin benefits organisations aiming to automate complex documentation tasks, ranging from invoice reconciliation and regulatory filings to academic publishing and insurance claims. The layout‑aware model structure enhances recognition and data extraction accuracy, while the permissive licence removes traditional barriers to deployment. Its integration into BytePlus also enables developers to tap into scalable API and cloud‑based services, suited for finance, logistics and SME segments.

However, integration of Dolphin into enterprise systems will depend on rigorous validation. Leading market players such as ABBYY, Adobe Acrobat and Microsoft Azure continue to set high standards in OCR performance, ecosystem support and regulatory compliance. ByteDance must supply detailed performance tests, language support, and enterprise‑grade features to compete effectively. Furthermore, addressing security, data privacy and accuracy in edge‑case layouts remains vital.

The emergence of Dolphin reflects an accelerating trend: OCR is evolving beyond simple character reading into intelligent document understanding powered by AI and visual‑language paradigms. As the global OCR market approaches an estimated US$43 billion, technologies like Dolphin are expanding the frontier of what automated document systems can achieve.

Playing to win

Broadcast Pro

22-05-2025


At the BroadcastPro Summit KSA, Dr Mohammed Alshahrani, Country Manager KSA at BytePlus, and Arif Aabed, Senior Manager of Technology Partnerships at MBC Group, explored what's powering the region's gaming boom and how MBC's Wizzo app is shaping the scene.

The Middle East is witnessing an unprecedented boom in gaming and esports, with multiple reports confirming its rapid growth. According to the MENA Gaming & eSports Summit 2023, there are more than 377m gamers across the Middle East. Statista says Egypt has the region's largest gaming population, Saudi Arabia the most gaming revenue and the UAE the highest revenue per user. An Analysys Mason report names Saudi Arabia as the Middle East leader, accounting for over 40% of MENA's overall market share, followed by the UAE. It's clear that gaming is thriving in the region.

Amid this explosive growth, Dr Mohammed Alshahrani, Country Manager KSA at BytePlus, sat down for a fireside chat with Arif Aabed, Senior Manager of Technology Partnerships at MBC Group, at the BroadcastPro Summit KSA. The duo discussed the factors driving gaming in the region, and the evolving role and impact of MBC's Wizzo app.

Opening the 'Playing to Win: How MBC Wizzo is engaging the swipe generation' session, Dr Alshahrani noted how MBC had been an early mover in gaming, launching the Wizzo app in 2015 before the industry had gathered momentum.

Shedding light on the insights that shaped MBC's foray into gaming, Aabed said: 'We changed how we looked at Gen Z and the Alpha generation. They don't have shorter attention spans, they simply make quicker decisions about the type of content they want to consume. They also multi-screen and can consume content from up to five screens simultaneously. And finally, they prefer content they can relate to. Videos filmed on an iPhone are more appealing to them than high-quality content produced with cutting-edge technology.'
'They also want content that is interactive, relevant and can help them build a community. This is where gaming comes in. With gaming, they are no longer passive consumers. They become actively involved in its progress and direction. They must put effort into the game, which earns them bragging rights and allows them to build a community. We were able to give them all of this through Wizzo.'

With the gaming industry experiencing a dramatic spike in metaverse integrations and cross-platform play that allows seamless experiences across multiple devices, Dr Alshahrani asked how MBC keeps pace with these ever-changing trends.

'The Arab world spans a huge geography, from the Middle East to North Africa. Each country has its own cultural nuances, which we incorporated into our games. We didn't just build games, we created experiences that were tailored to every individual region. For instance, Egypt is driven by humour-based content, while games for Saudi Arabia focus on the traditions of Arab warriors and the stories around them,' Aabed explained.

This deep cultural integration is reflected in Wizzo's inclusion in the region's pop culture. Aabed recalled a recent viral moment: 'When Egypt's airport systems went down and the authorities started stamping travel documents manually, people started referencing our Madame Afaf game, which has a similar concept.'

Another feature that makes Wizzo immensely popular with gamers is its live streaming and video uploading capability. For this, Wizzo partners with BytePlus to offer a seamless and enhanced in-app experience.

'Live streaming has become an important pillar of Wizzo's community-building offering. It's helped us capitalise on gamers who are not just bounty hunters but covet bragging rights. These people are the most invested in the game in terms of time and effort; they play with a certain showmanship and are invested in the game itself rather than just the monetary rewards it brings,' said Aabed.
The popularity of video hosting and distribution was reconfirmed by Dr Alshahrani, who highlighted how CapCut, a ByteDance video editing app, is currently number three in Saudi Arabia. 'We leverage AI-driven solutions to continuously enhance product experience for our users and enterprise customers. In fact, we have a new AI-enhanced video editing tool that can transform archived videos to HD at very pocket-friendly prices.'

Here, Wizzo's live chat option plays an important role too. 'There are different ways, especially in open-world games, of completing missions,' explained Aabed. 'Multiplayer games boast artistic ways of scoring points or getting the kill. All these elements serve as the starting point of many conversations, and to be able to live chat with other players or audiences at these moments makes the game even more rewarding.'

Another critical factor driving Wizzo's popularity is user security. 'With a significant number of users being children or young adults, Wizzo's security measures, such as the ability to delete user data, have succeeded in earning the confidence of families,' commented Dr Alshahrani.

Beyond gaming enthusiasts, Wizzo has been attracting passionate and talented stakeholders including game designers, publishers and distributors. Dr Alshahrani highlighted MBC's initiatives to support SMEs and startups in the gaming sector. By collaborating with local and regional game developers, the group creates opportunities for them to showcase their work in Saudi Arabia.

Elaborating on the process, Aabed said, 'We're always working at finding the right talent to bring into Saudi Arabia, or connecting them with established companies to make relocating to Saudi Arabia lucrative. We can do this on a global scale thanks to our leading position in the media industry. As the biggest media group in the region, we give them a taste of what it's like to launch in this part of the world. We are also training people at the grassroots level.
We are reaching out to the Kingdom's graduates and are trying to convince them to adopt gaming as an actual business.'

Such efforts have not been without challenges. 'There are a lot of cultural issues with parents resisting their children working in gaming, so we conduct many awareness and educational campaigns, regionally and globally.'

Looking ahead, Wizzo is gearing up to launch an updated version soon. While the release date hasn't been confirmed, gamers can look forward to several new features and an enhanced immersive experience.

Starzplay and BytePlus to bring AI-powered features to streaming

Campaign ME

03-03-2025


Subscription video on demand (SVOD) platform Starzplay has partnered with BytePlus to introduce AI-powered features like short dramas, recommendations, and in-app shopping, and to deliver a smoother, more intuitive streaming experience.

BytePlus' proprietary technologies will facilitate intuitive navigation across Starzplay's app, making browsing seamless. The platform will also be introducing new content genres, including short-form dramas, to further diversify its offerings. Users will also enjoy a vertical scrolling experience specifically designed to enhance the viewing experience and cater to today's on-the-go consumption habits.

'At Starzplay, our goal is to redefine the streaming experience by continuously innovating for our users. Our partnership with BytePlus reflects our commitment to leveraging cutting-edge AI technologies to make content discovery seamless and engaging,' said Maaz Sheikh, CEO of Starzplay.

Through BytePlus' AI-powered content recognition and machine learning technology, Starzplay will introduce in-app item detection and generate shoppable content. Through this integration, users will be able to effortlessly browse and purchase items and clothing featured in their favourite movies and shows directly within the app for an interactive viewing experience.

'By combining our advanced machine learning and content recognition technologies with Starzplay's commitment to innovation, we're enabling a more personalised, interactive, and seamless viewing experience. Together, we're setting a new benchmark for streaming in the region, making every user interaction intuitive and meaningful,' said Li Long, MEA General Manager of BytePlus.

Further enhancing personalisation, Starzplay will combine its existing AI personalisation engine with BytePlus' machine learning capabilities to offer enhanced recommended carousels and advanced algorithms.
This hybrid approach aims to enhance the Starzplay viewer experience by delivering even more tailored content recommendations.

'With these new features, we are introducing transformative ways for users to connect with their favourite shows and movies, delivering an entertainment experience that is more intuitive, personalised, and interactive than ever before,' said Sheikh.

Starzplay will roll out these features from Q2 this year.
