Top 5 Decentralized Data Collection Providers In 2025 For AI Business


Forbes | 02-05-2025

Photo: Adam Selipsky, CEO of Amazon Web Services (AWS), speaking at the keynote "Delivering a new World" in Barcelona, Spain, on March 1, 2022. (Photo by Joan Cros/NurPhoto via Getty Images)
The world runs on data, and businesses increasingly rely on it. However, traditional data sourcing methods often present challenges related to diversity, transparency, privacy, and cost. This article reviews the current state of decentralized data collection and outlines key steps for wisely selecting a decentralized data provider—along with a shortlist of top options to consider.
Traditionally, centralized data collection involves gathering data from various sources, such as apps, devices, or websites, and sending it to a single central server or database controlled by one organization. This data is collected via APIs, sensors, tracking tools, or manual input. The biggest bottleneck of this model, both for AI's future and for businesses, is the inability to collect truly global and diverse data from different regions and cultures. Decentralized data collection addresses this by leveraging blockchain technology: it enables small-scale, cross-border payments that encourage users around the world to contribute data voluntarily in exchange for incentives, something centralized Web2 platforms cannot match.
Another key aspect is transparency. Centralized AI and data collection are often criticized for operating as "black boxes," lacking transparency and accountability: outsiders have little visibility into how or where the data behind these systems was gathered, and it is difficult to verify whether it was collected lawfully and ethically. In contrast, decentralized data collection enhances transparency by recording the collection process on a blockchain and storing data across multiple independent nodes rather than under a single authority. This structure lets users efficiently trace how and where their data is used, reduces the risk of hidden manipulation, and ensures that no single party can alter or monopolize the data without broad consensus.
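To make the idea concrete, here is a minimal, hypothetical Python sketch of the provenance pattern described above: each contribution is logged as a hash-linked entry recording who contributed, a fingerprint of the data, and the incentive paid, so the history can be audited and tampering detected. It illustrates the general concept only, not any specific provider's protocol.

```python
# Minimal, hypothetical sketch of on-chain data provenance: each contribution is
# recorded as a hash-linked entry, so anyone can verify what was collected, by whom,
# and that the history has not been altered. Illustration only, not a real protocol.
import hashlib
import json
import time


def entry_hash(entry: dict) -> str:
    """Deterministic hash of a ledger entry (sorted keys for stability)."""
    return hashlib.sha256(json.dumps(entry, sort_keys=True).encode()).hexdigest()


class ProvenanceLedger:
    """Append-only, hash-linked log of data contributions."""

    def __init__(self):
        self.entries = []

    def record_contribution(self, contributor_id: str, data_bytes: bytes,
                            region: str, reward_tokens: float) -> dict:
        entry = {
            "contributor": contributor_id,
            "data_hash": hashlib.sha256(data_bytes).hexdigest(),  # the data itself stays off-chain
            "region": region,
            "reward": reward_tokens,  # micro-incentive paid to the contributor
            "timestamp": time.time(),
            "prev_hash": entry_hash(self.entries[-1]) if self.entries else None,
        }
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Any party can re-check that the chain of hashes is intact."""
        for i in range(1, len(self.entries)):
            if self.entries[i]["prev_hash"] != entry_hash(self.entries[i - 1]):
                return False
        return True


ledger = ProvenanceLedger()
ledger.record_contribution("user-ke-001", b"swahili voice sample", region="KE", reward_tokens=0.5)
ledger.record_contribution("user-br-042", b"portuguese product review", region="BR", reward_tokens=0.3)
print(ledger.verify())  # True unless an earlier entry has been tampered with
```

A production system would add the pieces the sketch omits, such as consensus across independent nodes and smart-contract payouts, but the audit trail idea is the same.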
As a result, decentralized solutions are emerging as a strong alternative for businesses seeking more robust data strategies. By leveraging blockchain technology, decentralized data collection enhances both data diversity and verifiability, opening access to new, previously untapped data sources.
Businesses interested in exploring decentralized data collection should start by weighing the available platforms against their specific data needs, budget, and compliance requirements.
Below are five noteworthy platforms operating in the decentralized data collection space, along with their core functionalities and potential business applications.
Ocean Protocol
Core offering: Decentralized data marketplace for AI and ML datasets.
Strengths:
Best for: Anyone looking to buy/sell datasets or run compute-to-data workloads.
Example: Access a specific medical imaging dataset to train a diagnostic AI while the data provider maintains control over the data itself (see the compute-to-data sketch below).
Website: https://oceanprotocol.com/
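Ocean Protocol (per the site above) popularized the compute-to-data pattern referenced in the example. The sketch below is a deliberately simplified, hypothetical Python illustration of that pattern, not Ocean's actual SDK: the provider keeps custody of the records and only an aggregate result is returned to the consumer.

```python
# Conceptual sketch of compute-to-data (not Ocean Protocol's actual SDK): the consumer
# never receives the raw records; they submit a computation and get back only the
# result, while the provider keeps custody of the data.
from statistics import mean
from typing import Callable, Iterable


class DataProvider:
    """Holds sensitive records and runs approved jobs locally."""

    def __init__(self, records: list[dict]):
        self._records = records  # never exposed directly

    def run_job(self, job: Callable[[Iterable[dict]], float]) -> float:
        # In a real marketplace the job would be vetted, metered, and paid for on-chain.
        return job(self._records)


def average_lesion_size(records: Iterable[dict]) -> float:
    """Consumer-supplied computation: an aggregate statistic only."""
    return mean(r["lesion_mm"] for r in records)


provider = DataProvider([{"lesion_mm": 4.2}, {"lesion_mm": 7.8}, {"lesion_mm": 5.1}])
print(round(provider.run_job(average_lesion_size), 2))  # 5.7 -- the result leaves, the data does not
```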
Core offering: Decentralized knowledge agent platform and AI data marketplace.
Strengths:
Best for: AI developers looking to build autonomous agents trained on community-owned or enterprise-specific knowledge bases.
Example: Collect a large and diverse dataset of user reviews to train a sentiment analysis AI agent (a minimal training sketch follows below).
Website: https://oceanprotocol.com/
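As a hedged illustration of what a buyer might do once such a review dataset is in hand, the sketch below trains a baseline sentiment classifier with scikit-learn on a few stand-in samples; the data, labels, and model choice are placeholders, not anything prescribed by the platform.

```python
# Hypothetical downstream step: once a marketplace has supplied community-contributed
# reviews with labels, a baseline sentiment model can be trained locally.
# Assumes scikit-learn is installed; the review text here is stand-in sample data.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

reviews = [
    "battery life is fantastic",
    "shipping was slow and support ignored me",
    "works exactly as described, very happy",
    "stopped working after two days",
    "great value for the price",
    "screen cracked on arrival, terrible",
]
labels = ["pos", "neg", "pos", "neg", "pos", "neg"]  # contributor-supplied labels

model = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),
    ("clf", LogisticRegression(max_iter=1000)),
])
model.fit(reviews, labels)
print(model.predict(["support was unhelpful and the item broke"]))  # likely ['neg']
```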
OORT DataHub
Core offering: Decentralized data collection and labeling solution for AI.
Strengths:
Best for: Enterprises needing diverse, real-world, structured datasets to train or fine-tune AI models.
Example: Collect a high-quality dataset spanning 50 languages for a specialized natural language processing model (a simple quality-gating sketch follows below).
Website: https://www.oortech.com/oort-datahub-b2b
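OORT DataHub's pitch centers on structured, high-quality crowd contributions. The following hypothetical Python sketch shows the kind of basic quality gate (target language, minimum length, deduplication) a buyer might apply to multilingual submissions; the thresholds and language codes are stand-ins, not OORT's actual pipeline.

```python
# Illustrative (hypothetical) quality gate for crowd-collected multilingual text:
# keep only contributions in the target languages, long enough to be useful,
# and not exact duplicates. Real platforms add far richer validation and payment logic.
import hashlib

TARGET_LANGS = {"en", "sw", "hi", "pt", "ar"}  # stand-in for the 50-language target
MIN_CHARS = 20


def accept(contribution: dict, seen_hashes: set) -> bool:
    text = contribution["text"].strip()
    digest = hashlib.sha256(text.lower().encode()).hexdigest()
    ok = (
        contribution["lang"] in TARGET_LANGS
        and len(text) >= MIN_CHARS
        and digest not in seen_hashes
    )
    if ok:
        seen_hashes.add(digest)
    return ok


seen = set()
samples = [
    {"lang": "sw", "text": "Huduma kwa wateja ilikuwa nzuri sana na ya haraka."},
    {"lang": "sw", "text": "Huduma kwa wateja ilikuwa nzuri sana na ya haraka."},  # duplicate
    {"lang": "xx", "text": "Unsupported language code, should be rejected."},
]
print([accept(s, seen) for s in samples])  # [True, False, False]
```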
Vana
Core offering: Decentralized platform for users to control, monetize, and pool personal data for AI.
Strengths:
Best for: Building AI models with ethically sourced, user-consented personal data, especially in social, health, and lifestyle domains.
Example: Users can leverage Vana to own, control, and monetize their personal data by contributing it to community-led AI projects (see the consent-gating sketch below).
Website: https://www.vana.com
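The sketch below is a minimal, hypothetical illustration of consent-gated pooling in the spirit of the Vana example: a personal record only enters a community dataset if its owner has granted consent for that purpose, and revoking consent removes it from future pools. It is not Vana's actual data model or API.

```python
# Minimal, hypothetical illustration of consent-gated data pooling. A record only
# enters a community dataset if its owner has approved that purpose; revoking
# consent excludes it from future pools. Not Vana's actual data model.
from dataclasses import dataclass, field


@dataclass
class PersonalRecord:
    owner: str
    payload: dict
    consents: set = field(default_factory=set)  # purposes the owner has approved

    def grant(self, purpose: str):
        self.consents.add(purpose)

    def revoke(self, purpose: str):
        self.consents.discard(purpose)


def build_pool(records: list, purpose: str) -> list:
    """Only consented records are released to the community model for this purpose."""
    return [r.payload for r in records if purpose in r.consents]


alice = PersonalRecord("alice", {"steps_per_day": 9200})
bob = PersonalRecord("bob", {"steps_per_day": 4100})
alice.grant("fitness-research")

print(build_pool([alice, bob], "fitness-research"))  # [{'steps_per_day': 9200}]
alice.revoke("fitness-research")
print(build_pool([alice, bob], "fitness-research"))  # []
```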
Streamr
Core offering: Real-time data network for decentralized data streams.
Strengths:
Best for: AI systems that rely on live data feeds like autonomous vehicles, smart cities, or trading bots.
Example: If your AI business focuses on predicting traffic patterns, you could use Streamr to access real-time data feeds from connected vehicles and sensors (a stream-consumer sketch follows below).
Website: https://streamr.network/
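Streamr's official client libraries are JavaScript/TypeScript; to keep the examples in one language, the hypothetical sketch below simulates a live vehicle-speed feed with a Python async generator and shows a consumer maintaining a rolling average, the kind of signal a traffic-prediction model would ingest. The stream itself is simulated, not an actual Streamr subscription.

```python
# Stand-in sketch for consuming a real-time data stream: an async generator simulates
# the live feed, and the consumer keeps a rolling average of vehicle speeds.
import asyncio
import random
from collections import deque


async def vehicle_speed_stream():
    """Simulated live feed; in production this would be a stream subscription."""
    while True:
        yield {"sensor": "cam-17", "speed_kmh": random.uniform(20, 110)}
        await asyncio.sleep(0.1)


async def consume(window_size: int = 20, max_events: int = 50):
    window = deque(maxlen=window_size)
    count = 0
    async for event in vehicle_speed_stream():
        window.append(event["speed_kmh"])
        rolling_avg = sum(window) / len(window)
        count += 1
        if count % 10 == 0:
            print(f"events={count} rolling_avg={rolling_avg:.1f} km/h")
        if count >= max_events:
            break


asyncio.run(consume())
```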
As AI continues to scale, the true bottleneck won't be algorithms; it will be data. Success in the coming wave of AI innovation hinges on timely access to high-quality, well-labeled, and diverse datasets, yet efficient data collection infrastructure remains in its infancy. Forward-thinking organizations that invest now in scalable, ethical, and AI-ready decentralized data collection will be the ones leading the industry tomorrow. Intelligent data sourcing isn't a passing trend; it is becoming the mainstream.


Related Articles

Automakers face challenges in managing software-defined vehicles at scale

Yahoo · 11 hours ago

This story was originally published on Automotive Dive. To receive daily news and insights, subscribe to our free daily Automotive Dive newsletter.

NOVI, MICHIGAN — With the auto industry's shift toward building more connected vehicles powered by software continuously updated over-the-air, OEMs are rapidly moving from hardware-centric vehicle development processes to a software-first approach. This pivot also includes the integration of AI and adoption of a cloud-based development environment for software-based vehicles. However, to support this transition, legacy automakers still face challenges in data management and technology integration, according to a recent panel discussion on the topic at the AutoTech 2025 conference in Michigan.

The panel, which was moderated by Maite Bezerra, principal analyst for software-defined vehicles at Wards Intelligence, included industry experts from Bosch, Stellantis, Toptal, and the Scalable Open Architecture for Embedded Edge (SOAFEE) industry group, which is working with automakers to expedite development of software-defined vehicles. SOAFEE aims to create an open source vehicle platform using cloud-native architecture that supports multiple hardware configurations.

'SOAFEE is really kind of more about bringing some of the modern software techniques to automotive software development,' said panelist Robert Day, the group's governing body representative. 'Over the last couple or three years, people are actually starting to do their development in the cloud using the tools, technologies and methodologies that are well developed and well used in cloud development.'

Although adopting a cloud-based software development approach is a common practice for developers working in the tech space, it's an entirely new field for some legacy automakers. 'The problem is the car is not the cloud,' said Day. 'It has things like safety and things like mutual physicality, heterogeneous computing.'

The software development challenges for automakers also create the need for OEMs to recruit top talent to integrate the technology into next-generation vehicles, often from outside of the industry. Some companies are providing services to expedite such recruitment. For example, Toptal operates a freelancing platform that connects companies with in-demand software engineers and other technology specialists. 'We have a lot of partners in the automotive space,' said panelist Paul Timmermann, VP of product at Toptal.

Stellantis is one of the automakers encountering the challenges of shifting towards SDVs for its future vehicles. 'We [automakers] are always hardware first, and now the switch is happening to, you know, software, and then comes the hardware,' said panelist Sangeeta Theru, director of virtual validation platforms at Stellantis. 'Tools, processes…everything is changing,' she said. Theru also highlighted the importance of training internal teams at Stellantis, adding that the automaker recently launched 'big training on AWS cloud and architecture' for employees. 'There was a lot of effort in upskilling and training internal people,' she said.

A major driver of increasing vehicle complexity is automakers launching more advanced driver assist systems and autonomous driving functionality using AI-powered software, according to the panelists.
Vehicles with automated driving capabilities, for example, are equipped with dozens of cameras and sensors, generating 'many, many terabytes of data' for a single car, scaling to 'well beyond petabytes' across large fleets, explained panelist Steven Miller, product manager for ADAS and technical expert at Bosch. 'Clearly you're not going to upload all of that data,' he said. 'The other even harder data problem is okay, what's the right data to upload to the cloud?'

With the rollout of more advanced autonomous driving features, automakers need to be adept at processing and merging extremely large data sets. One of these challenges is processing high volumes of vehicle data in real time, as well as making it more manageable to transfer to and from the cloud. Automakers must also decide which vehicle data to upload to the cloud to train AI models. Therefore, the panelists emphasized the need for OEMs to create efficient data pipelines to manage this complexity.

The panelists also foresee AI being integrated into other vehicle systems, such as remote diagnostics and infotainment. The use of AI will also likely extend to corporate organizational processes. 'This is one of the most transformational shifts that we are seeing in the automotive industry,' said Bezerra.

The panel discussion also delved into automakers' adoption of open source software with a higher level of standardization to reduce development times and costs. In November 2024, Panasonic Automotive Systems and Arm announced a collaboration to standardize automotive architecture. The two companies said they recognized the need for the industry to shift from a hardware-centric to a software-first development model to address challenges created by high-cost, vendor-specific proprietary interfaces for vehicles.

While the use of open source automotive software has traditionally been met with caution due to safety and liability concerns, an April 2025 report from the Eclipse Foundation found a significant jump in industry appetite to use it for safety-critical vehicle systems. According to the report, 79% of automotive software professionals currently use open source tools and/or in-vehicle software for development, and the number of users actively contributing to open source projects increased by 4% from last year.

The big advantage of open source is that it provides a standard between companies, explained Day. 'If you're starting to use open standard or open source, it makes that collaboration easier,' he said. Day also highlighted another long-term strategy decision facing OEMs. 'What would you choose to open source first? What would you actually keep in-house?' he said.

Despite the prospects of adopting open source software for vehicles, the panelists acknowledged that some key areas needed more attention, including cybersecurity. This area is even more critical for automated driving and connected infotainment systems that can be used to pay for goods and services, such as EV charging sessions. Day raised a critical point about security. 'I don't think it's placed enough attention to, and certainly don't think [automakers] spend enough money on it,' he said.

According to chipmaker Arm, a modern vehicle can have up to 650 million lines of code, and this number will only increase in the future. But software will revolutionize how drivers interact with their vehicles and redefine the relationship between OEMs and vehicle owners, according to the company.
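As a rough illustration of the edge-side triage the panelists describe, the hypothetical Python sketch below flags only drive segments containing noteworthy events (hard braking, an ADAS disengagement, low perception confidence) for upload to the cloud; the thresholds and fields are invented for the example, not any OEM's actual pipeline.

```python
# Illustrative sketch of edge-side data triage: a vehicle cannot upload terabytes of
# raw sensor data, so only segments around noteworthy events are flagged for the cloud.
# Thresholds and event types here are hypothetical.
from dataclasses import dataclass


@dataclass
class DriveSegment:
    segment_id: str
    max_decel_ms2: float     # strongest braking in the segment
    adas_disengaged: bool    # driver took over from the assist system
    min_confidence: float    # lowest perception-model confidence observed


def should_upload(seg: DriveSegment,
                  decel_threshold: float = 6.0,
                  confidence_floor: float = 0.5) -> bool:
    return (
        seg.max_decel_ms2 >= decel_threshold
        or seg.adas_disengaged
        or seg.min_confidence < confidence_floor
    )


segments = [
    DriveSegment("a1", max_decel_ms2=2.1, adas_disengaged=False, min_confidence=0.92),
    DriveSegment("a2", max_decel_ms2=7.4, adas_disengaged=False, min_confidence=0.88),
    DriveSegment("a3", max_decel_ms2=1.8, adas_disengaged=True, min_confidence=0.40),
]
to_cloud = [s.segment_id for s in segments if should_upload(s)]
print(to_cloud)  # ['a2', 'a3'] -- the uneventful segment stays on the vehicle
```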
Disclosure: AutoTech 2025 is run by Informa, which owns a controlling stake in Informa TechTarget, the publisher behind Automotive Dive. Informa has no influence over Automotive Dive's coverage.

Recommended reading: Panasonic Automotive Systems, Arm team up on SDV standardization

If I Could Buy Only 1 Artificial Intelligence Stock Over the Next Year, Amazon Would Be It, but Here's the Key Reason

Yahoo · 12 hours ago

There are several reasons to like Amazon as a long-term investment right now. AWS could be a particularly massive growth driver for the rest of the decade and beyond. The cloud computing business could drive market-beating performance all by itself.

There are some excellent artificial intelligence (AI) stocks you can buy right now. However, my favorite -- and largest AI play in my own portfolio -- is Amazon (NASDAQ: AMZN). To be sure, there are a lot of reasons why I like Amazon as a long-term investment. E-commerce still represents less than one-fifth of all U.S. retail, and there's massive international expansion potential for the business, just to name a few pluses. But the No. 1 reason I love the stock is Amazon Web Services (AWS) and its potential to drive profits higher over the next decade.

AWS makes up less than 20% of Amazon's revenue, but it's the fastest-growing, most profitable part of the company. Despite accounting for less than one-fifth of sales, as noted, AWS was responsible for 63% of the company's operating income in the first quarter. However, this could be just the beginning. The global cloud computing market is expected to roughly triple in size by 2030, compared with 2024 levels. Assuming AWS simply maintains its current market share, this means that AWS revenue could rise from $107.6 billion in 2024 to about $342 billion in 2030. If Amazon can maintain its current operating margin for AWS (it's likely the margin will improve as the business scales), this would result in about $87 billion in additional annual operating income just from AWS. This alone would likely drive excellent stock returns -- and that's on top of any value added through profit increases from the retail side.

Before you buy stock in Amazon, consider this: The Motley Fool Stock Advisor analyst team just identified what they believe are the 10 best stocks for investors to buy now… and Amazon wasn't one of them. The 10 stocks that made the cut could produce monster returns in the coming years. Consider when Netflix made this list on December 17, 2004... if you invested $1,000 at the time of our recommendation, you'd have $657,871!* Or when Nvidia made this list on April 15, 2005... if you invested $1,000 at the time of our recommendation, you'd have $875,479!* Now, it's worth noting Stock Advisor's total average return is 998% — a market-crushing outperformance compared to 174% for the S&P 500. Don't miss out on the latest top 10 list, available when you join Stock Advisor. See the 10 stocks »

*Stock Advisor returns as of June 9, 2025

John Mackey, former CEO of Whole Foods Market, an Amazon subsidiary, is a member of The Motley Fool's board of directors. Matt Frankel has positions in Amazon. The Motley Fool has positions in and recommends Amazon. The Motley Fool has a disclosure policy. If I Could Buy Only 1 Artificial Intelligence Stock Over the Next Year, Amazon Would Be It, but Here's the Key Reason was originally published by The Motley Fool.
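As a quick sanity check on the article's figures, the short Python sketch below reproduces the arithmetic: the 2024 revenue and 2030 projection come from the text, while the ~37% operating margin is an assumption roughly in line with AWS's recently reported margins, used only to show how the ~$87 billion figure falls out.

```python
# Quick check of the article's arithmetic. Revenue figures come from the text; the
# ~37% operating margin is an assumption used only to reproduce the ~$87B estimate.
aws_revenue_2024 = 107.6           # $ billions (from the article)
projected_revenue_2030 = 342.0     # $ billions, if AWS holds share in a ~3x larger market
assumed_operating_margin = 0.37    # assumption, roughly in line with recent AWS margins

incremental_revenue = projected_revenue_2030 - aws_revenue_2024
incremental_operating_income = incremental_revenue * assumed_operating_margin

print(f"Incremental revenue: ${incremental_revenue:.1f}B")                    # ~$234B
print(f"Incremental operating income: ${incremental_operating_income:.1f}B")  # ~$87B
```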

Amazon joins the big nuclear party, buying 1.92 GW for AWS

Yahoo · a day ago

Amazon tapped into an emerging trend this week, one that's seeing big tech firms buy power from existing nuclear power plants. The tech company will power a chunk of its AWS cloud and AI servers using 1.92 gigawatts of electricity from Talen Energy's Susquehanna nuclear power plant in Pennsylvania. Amazon is the latest hyperscaler to go direct to big nuclear operators, following on the heels of Microsoft and Meta.

Amazon's deal was announced Wednesday, but it's not entirely new, instead modifying an existing arrangement with Talen. The old version had Amazon building a data center next to the Susquehanna power plant, siphoning electricity directly from the facility without first sending it to the grid. That deal was killed by regulators over concerns that customers would unfairly shoulder the burden of running the grid.

Today, Susquehanna provides power to the grid, meaning every kilowatt-hour includes transmission fees that support the grid's maintenance and development. Amazon's behind-the-meter arrangement would have sidestepped those fees. This week's revisions shift Amazon's power purchase agreement in front of the meter, meaning the AWS data center will be billed like other similar customers who are grid-connected. The transmission lines will be reconfigured in spring of 2026, Talen said, and the deal covers energy purchased through 2042.

But wait, there's more: The two companies also said they will look to build small modular reactors 'within Talen's Pennsylvania footprint' and expand generation at existing nuclear power plants. Expanding existing power plants is typically an easier way to add new nuclear capacity. Upgrades might include switching to more highly enriched fuel that produces more heat, tweaking the settings to squeeze out more power, or renovating the turbines for a bigger bump.

Microsoft kicked off the trend last year when it announced that it would work with Constellation Energy to restart a reactor at Three Mile Island, a $1.6 billion project that will generate 835 megawatts. Meta hopped aboard earlier this month, also with Constellation, to buy the 'clean energy attributes' of a 1.1 gigawatt nuclear power plant in Illinois.

Amazon and Talen's pledge to build new small modular reactors is a longer shot, though there, too, Amazon is in good company with its peers. Several startups are pursuing the concept in the hope of cutting construction costs by mass-producing parts. Amazon has invested in an SMR startup, X-energy, which is planning to add 300 megawatts of nuclear generating capacity in the Pacific Northwest and Virginia.

New generation at existing reactors and new SMRs are intended 'to add net-new energy to the PJM grid,' Talen said, referring to the region's grid operator. That last bit is likely a bid to head off any criticism from regulators about leaving ratepayers holding the bag.
