
Why Data Curation Is The Key To Enterprise AI
Nick Burling, Senior Vice President of Product at Nasuni.
All the enterprise customers and end users I'm talking to these days are dealing with the same challenge. The number of enterprise AI tools is growing rapidly as ChatGPT, Claude and other leading models are challenged by upstarts like DeepSeek. There's no single tool that fits all, and it's dizzying to try to analyze all the solutions and determine which ones are best suited to the particular needs of your company, department or team.
What's been lost in the focus on the latest and greatest models is the paramount importance of getting your data ready for these tools in the first place. To get the most out of the AI tools of today and tomorrow, it's important to have a complete view of your file data across your entire organization: the current and historical digital output of every office, studio, factory, warehouse and remote site, involving every one of your employees. Curating and understanding this data will help you deploy AI successfully.
The potential of effective data curation is clear in the development of self-driving cars. Robotic vehicles can rapidly identify and distinguish between trees and cars in large part because of a dataset called ImageNet. This collection contains more than 14 million images of common everyday objects that have been labeled by humans. Scientists were able to train object recognition algorithms on this data because it was curated. They knew exactly what they had.
Another example is the use of machine learning to identify early signs of cancer in radiological scans. Scientists were able to develop these tools in part because they had high-quality data (radiological images) and a deep understanding of the particulars of each image file. They didn't attempt to develop a tool that analyzed all patient data or all hospital files. They worked with a curated segment of medical data that they understood deeply.
Now, imagine you're managing AI adoption and strategy at a civil engineering firm. Your goal is to utilize generative AI (GenAI) to streamline the process of creating proposals. And you've heard everyone in the AI world boasting about how this is a perfect use case.
A typical civil engineering firm is going to have an incredibly broad range of files and complex models. Project data is going to be multimodal—a mix of text, video, images and industry-specific files. If you were to ask a standard GenAI tool to scan this data and produce a proposal, the result would be garbage.
But let's say all this data was consolidated, curated and understood at a deeper level. Across tens of millions of files, you'd have a sense of which groups own which files, who accesses them often, what file types are involved and more. Assuming you had the appropriate security guardrails in place to protect the data, you could choose a tool specifically tuned for proposals and securely give that tool access to only the relevant files within your organization. Then, you'd have something truly useful that helps your teams generate better, more relevant proposals faster.
Even with curation, there can be challenges. Let's say a project manager (PM) overseeing multiple construction sites wants to use a large language model (LLM) to automatically analyze daily inspection reports. At first glance, this would seem to be a perfect use case, as the PM would be working with a very specific set of files. In reality, though, the reports would probably come in different formats, ranging from spreadsheets to PDFs and handwritten notes. The dataset might include checklists or different phrasings representing the same idea.
A human would easily recognize this collected data as variations of a site inspection report, but a general-purpose LLM wouldn't have that kind of world or industry knowledge. A tool like this would likely generate inaccurate and confusing results. Yet, having curated and understood this data, the PM would still be in a much better position. They'd recognize early that the complexity and variation in the inspection reports would lead to challenges and save the organization the expense and trouble of investing in an AI tool for this application.
The opportunities that could grow out of organization-wide data curation stretch far beyond specific departmental use cases. Because most of your organization's data resides within your security perimeter, no AI model has been trained on those files. You have a completely unique dataset that hasn't yet been mined for insights. You could take general AI models, whose capabilities were developed through training on massive, general datasets, and (with the right security framework in place) fine-tune them on your organization's unique gold mine of enterprise data.
This is already happening at an industry scale. The virtual paralegal Harvey has been fine-tuned on curated legal data, including case law, statutes, contracts and legal briefs. BioBERT, a model optimized for medical research, was trained on a curated dataset of biomedical texts. The researchers who developed this tool did so because biomedical texts use such specialized language.
Whether you want to embark on an ambitious project to create a fine-tuned model or select the right existing tool for a department or project team's needs, it all starts with data curation. In this period of rapid change and model evolution, the one constant is that if you don't know what sort of data you have, you're not going to know how to use it.
Forbes Technology Council is an invitation-only community for world-class CIOs, CTOs and technology executives.

Related Articles
Yahoo
AI-driven search ad spending set to surge to $26 billion by 2029, data shows
By Jaspreet Singh
(Reuters) - Spending on AI-powered search advertising is poised to surge to nearly $26 billion by 2029 from just over $1 billion this year in the U.S., driven by rapid adoption of the technology and more sophisticated user targeting, data from Emarketer showed on Wednesday.
Companies that rely on traditional keyword-based search ads could experience revenue declines due to the growing popularity of AI search ads, which offer greater convenience and engagement for users, according to the research firm.
WHY IT'S IMPORTANT
Search giants such as Alphabet-owned Google and Microsoft's Bing have added AI capabilities to better compete with chatbots such as OpenAI's ChatGPT and Perplexity AI, which provide users with direct information without requiring them to click through multiple results.
Apple is exploring the integration of AI-driven search capabilities into its Safari browser, potentially moving away from its longstanding partnership with Google.
The report comes amid growing concerns that users are increasingly turning to chatbots for conversational search and that AI-powered search results could upend the business models of some companies.
Online education firm Chegg said in May that it would lay off about 248 employees as it looks to cut costs and streamline operations because students are using AI-powered tools including ChatGPT over traditional edtech platforms.
QUOTE
"Publishers and other sites are feeling the pain from AI search. As they lose out on traffic, we're seeing publishers lean into subscriptions and paid AI licensing deals to bolster revenue," Emarketer analyst Minda Smiley said.
GRAPHIC CONTEXT
AI search ad spending is expected to constitute nearly 1% of total search ad spending this year and 13.6% by 2029 in the U.S., according to Emarketer.
Sectors such as financial services, technology, telecom, and healthcare are embracing AI as they are seeing clear advantages in using the technology to enhance their ad strategies, while the retail industry's adoption is slow, the report said. Google recently announced the expansion of its AI-powered search capabilities into the consumer packaged goods sector through enhancements in Google Shopping.


Forbes
Will AI Really Take Your Job? Experts Reveal The True Outlook Today
Experts break down what AI job loss really means, and why reinvention, not just replacement, is the future of work.
Last month in a CNN interview with Anderson Cooper, Dario Amodei, CEO of Anthropic (the AI company behind Claude), issued a warning that landed like a thunderclap in Silicon Valley and beyond. In what sounded almost like an apocalyptic future for workers around the globe, the 42-year-old billionaire predicted that within five years, AI could automate away up to 50% of all entry-level white-collar jobs. It was a jarring prediction, even for an industry accustomed to provocative soundbites.
The quote quickly ricocheted across news outlets, igniting headlines and debates about the economic future of billions. CNN, notably, cast the comments in a more skeptical light, asking whether dire forecasts about AI are becoming self-fulfilling. Others, like Axios, highlighted the fear among young professionals who are just beginning to understand how automation might shadow their careers.
'This is coming faster than people think,' Amodei noted in his interview with Cooper, echoing concerns that have been quietly escalating across the AI industry in recent times. For recent graduates, entry-level workers and companies just beginning to embrace automation, the warning felt less like a forecast and more like a countdown.
But is it true? Experts across telecom, software and enterprise architecture suggest a more nuanced reality. Yes, AI is changing work, and faster than ever before. But this isn't just a story of job loss. It's also about reinvention, overcorrection and the uniquely human skills machines still struggle to replicate.
'Any industrial or technology revolution results in job loss. This has happened many times over,' said Andy Thurai, Field CTO at Cisco, in an interview with me. 'What's different this time is the speed. The AI hype cycle is moving much faster than anything we've seen before.'
Dima Gutzeit, founder and CEO of LeapXpert, echoed this sentiment in an interview. 'We're entering a high-speed workforce transformation,' he said. 'What's different this time? The pace. Automation used to take decades; now it's happening in quarters.'
In other words, AI isn't fundamentally new. But the compressed timeline between research breakthroughs and enterprise deployment has never been this short. Cloud-native architectures and API-first models have made it easier to scale new tools organization-wide in months, not years. The fear of missing out, or FOMO, is also pushing organizations, many of which aren't even ready for such a pivot, to integrate AI into their workflows. Essentially, so much is happening so fast in the AI space that disruptions larger than those seen when mobile popped onto the scene in the early 2000s feel almost inevitable.
However, there's more to this revolution than FOMO-laden messages of doom. And as Allison Morrow noted in her poignant analysis on CNN Business, the narrative about a 'white-collar bloodbath' is likely another part of the AI hype machine.
Klarna made headlines in 2024 when it replaced 700 customer support agents with an AI chatbot. But it quietly brought back some of those roles in early 2025, realizing customers preferred human support to AI. Why? Because the bots weren't flawless, as industry experts continue to warn.
Many companies are trimming senior teams and hoping AI-enhanced mid-level hires can close the gap. But just like in Klarna's case, it's not always working. 'The results have been mixed so far,' said Thurai. 'The pendulum always swings wide. Companies get seduced by cost savings and forget about institutional memory and strategic insight.'
The math doesn't always check out. Generative AI tools still struggle with hallucinations, context retention and compliance guardrails.
And in industries like finance and healthcare, these flaws aren't just bugs — they're liabilities. Another big concern across sectors isn't whether jobs will be lost, but about who gets to keep them. 'A skilled digital worker can be replaced by someone with less expertise but greater AI proficiency,' Thurai said. In other words, AI adoption across organizations is creating a new kind of talent gap; one not defined by degrees, but by fluency in these AI tools, which are evolving faster than education systems can keep up. But Thurai also noted that augmenting certain human roles with AI is 'a perceived cost savings that could backfire.' It's, therefore, necessary for business leaders to keep in mind that diving into the AI ocean two feet first could be catastrophic in the end rather than beneficial. Organizations need the right dose of innovation and caution. Yes, there are likely routines that could be automated right away, but businesses must also stop to count the potential costs of such automation. As Gutzeit noted, 'automation without strategy is dangerous.' He added that this is especially true for regulated industries where 'AI needs a human firewall.' Nowhere is this more evident than in telecom. Arnd Baranowski, founder and CEO of Oculeus, explains that while AI has become essential to fraud detection, it still needs human judgment. 'AI allows telecom providers to analyze massive volumes of traffic well beyond human capacity. But when fraudsters adopt unpredictable new methods, only humans can anticipate the shift. That requires imagination — and that's something AI lacks.' The risk of over-reliance is real. 'Telcos that downsize their fraud prevention teams too aggressively risk becoming less capable of stopping fraud altogether,' Baranowski warned. This hybrid approach — AI as analyst, human as strategist — is becoming the new normal across industries. 
According to Gutzeit, while AI is indeed replacing and redefining routine, entry-level roles, with two-thirds of companies expecting to add AI-related roles, it opens the door to higher-value, human-centric work. 'Smart companies are building AI-augmented teams that are more productive, more consistent, and more client-focused. And they're not stopping at tools — they're investing in people who know how to orchestrate AI to elevate results,' he said. For Artin Avanes, head of core data platform at Snowflake, AI is not a net destroyer of jobs. He likens today's moment to Snowflake's own rise. 'We disrupted traditional business intelligence teams. Suddenly, business users could do analytics without IT. Some roles disappeared. But most evolved,' Avanes said. 'The same thing is happening now. AI won't erase people. It will change what they do.' His concern is less about job loss and more about organizational readiness. 'The biggest bottleneck to AI adoption isn't talent. It's infrastructure. You need secure, compliant access to the right data. Without that, no AI agent — no matter how smart — can work.' Thurai believes many of the more dramatic claims from AI vendors serve a strategic purpose. 'Obviously, the AI providers — Anthropic, OpenAI, consultants — have to say extreme things to gain attention and instill FOMO,' he said. 'But there are people like IBM's CEO with a more realistic picture of the future.' Yes, AI will cause job losses. But it will also create roles — including data scientists, prompt engineers, AI governance experts — that didn't exist five years ago. In a clap back at Amodei, American billionaire and investor Mark Cuban wrote on social platform Bluesky that 'someone needs to remind the CEO that at one point there were more than 2 million secretaries. There were also separate employees to do in-office dictation. They were the original white-collar displacements.' 
Cuban further noted that 'new companies with new jobs will emerge,' adding that 'people have to stop whining and start preparing.' Still, the math is sobering: those new roles won't fully replace the sheer volume of jobs displaced. There is no perfect one-to-one exchange. Which is why, even amid the hype, preparation matters more than ever.
So, will AI take your job? Maybe not. But the person who knows how to use it might. The big message from the experts is for global workers to move beyond the realm of FOMO into really understanding how to leverage AI tools for improved efficiency. As Avanes put it: 'AI isn't here to optimize systems. It's here to free people to focus on what matters. The question is whether we'll let it.'
For Gutzeit, this is an urgent call to reskill the global workforce. 'The traditional career ladder is being cut off at the bottom. If we don't reskill aggressively, we risk locking out an entire generation from meaningful career starts,' he said.


Business Wire
HubSpot Launches First CRM Deep Research Connector With ChatGPT
BOSTON--(BUSINESS WIRE)--More than 250,000 businesses rely on HubSpot as their single source of truth for customer data across marketing, sales, and service. This complete view of the customer journey gives our customers an edge, especially in an era where AI is only as powerful as the data behind it.
Today, we're excited to announce that HubSpot is the first CRM to launch a deep research connector with ChatGPT. With over 75% of HubSpot customers already using ChatGPT*, we're making it easy for them to apply powerful, doctorate-level research and analysis to their own customer data and context, and to put those business insights to work.
This is game-changing for go-to-market teams. Within ChatGPT, for example:
Marketers can ask 'find my highest-converting cohorts from recent contacts and create a tailored nurture sequence to boost engagement,' then use the insights to launch an automated workflow in HubSpot.
Sales teams can find new opportunities by asking, 'segment my target companies by annual revenue, industry, and technology stack. Based on that, identify the top opportunities for enterprise expansion,' then bring them back to HubSpot for prospecting.
Customer success teams can say, 'identify inactive companies with growth potential and generate targeted plays to re-engage and revive pipeline,' then take those actions in HubSpot to drive retention.
Support teams can say, 'analyze seasonal patterns in ticket volume by category to forecast support team staffing needs for the upcoming quarter,' and activate Breeze Customer Agent in HubSpot to handle spikes in support tickets.
'Launching the HubSpot deep research connector means businesses and their employees get faster, better insights because ChatGPT has more context. We're thrilled to work together to bring powerful AI to many of today's most important workflows.' - Nate Gonzalez, Head of Business Products at OpenAI.
'The HubSpot connector is like having an extra analyst on the team, empowering sales reps to identify risks, opportunities, and next best actions,' said Colin Johnson, Senior Manager, CRM at Youth Enrichment Brands. 'For a non-technical user, the fact that it's easy to use and talks directly to my data is huge.'
'We're building tools that help businesses lead through the AI shift, not just adapt to it,' said Karen Ng, SVP of Product and Partnerships at HubSpot. 'By connecting HubSpot CRM data directly to ChatGPT, even small teams without time or data resources can run deep analysis and take action on those insights, fueling better outcomes across marketing, sales, and service.'
Easy to use and easy to trust
HubSpot customers who have admin controls can enable the connector for their organization by going to ChatGPT and turning on the HubSpot deep research connector function, selecting HubSpot as a data source, and authenticating their account. From there, any user in the organization can toggle it on, sign in, and start asking questions.
In addition to being easy to use, the HubSpot deep research connector is also easy to trust. We built it to ensure users only see the CRM data they're allowed to access in HubSpot. For example, individual sales reps will only see pipeline data for deals they own or manage. With the HubSpot deep research connector, customer data is not used for AI training in ChatGPT.
Availability
The HubSpot deep research connector will automatically be available to all HubSpot customers across all tiers with a paid ChatGPT plan. (EU: Team, Enterprise, and Edu; all other regions: Team, Enterprise, Pro, Plus, and Edu). All available languages can be found here.
*HubSpot 2025 Q1 AI customer sentiment survey
About HubSpot
HubSpot (NYSE: HUBS) is the leading AI-powered customer platform for growing businesses.
The platform includes engagement hubs for marketing, sales, and customer service, a connected Smart CRM, and an ecosystem of over 1,800 integrations—all built on a unified data foundation that powers HubSpot's AI and enables smarter, faster, more personalized customer experiences. More than 250,000 customers across 135+ countries use HubSpot to unify their data, align their teams, and grow better. HubSpot is headquartered in Cambridge, Massachusetts. Learn more at