
Sourcing video data for AI training: overcoming the challenges of scale, safety and representation
To unlock these benefits, however, organisations must err on the side of responsible use. Crucially, that responsibility extends to the data used to power and train AI models.
Understanding data transparency and quality
As a recent Amnesty International report highlighted, there needs to be transparency around the use of AI-powered video surveillance.
However, to do so, it's vital that we go back a stage and closely consider the quality and origin of the data on which an AI model has been trained and fine-tuned. As AI-enabled video is rolled out, developers mustn't fall into the same trap as their colleagues working on the large language models (LLMs) that underpin generative AI: the challenge of sourcing enough data for model training. LLM developers increasingly find themselves paying ever-higher prices for large datasets because of their scarcity, relying on data that lacks adequate privacy safeguards, or facing litigation from rights holders. Bias from unrepresentative data has also tainted some AI implementations, with knock-on effects on people's trust in AI and in the insights it delivers.
Building accurate video AI models
To mitigate these challenges before they spread, it is crucial to consider how to gain access to high-quality, responsibly sourced visual data for training AI video models. The datasets used need to be representative and diverse, to ensure accuracy and fairness, and legally sourced, to respect data owners' IP rights. This is not simple to achieve, especially with sensors such as cameras that can collect a great deal of personal or confidential information.
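As a rough illustration of what "representative and legally sourced" can mean in practice, the sketch below scans a hypothetical catalogue of annotated clips, keeps only those with a documented licence, and flags under-represented event classes before training. The file name, field names and thresholds are assumptions made for illustration, not part of any specific platform.

from collections import Counter
import json

# Hypothetical catalogue: one JSON record per annotated clip, for example
# {"clip_id": "c-102", "label": "vehicle_accident", "licence": "contributor-consented"}
with open("catalogue.jsonl") as f:
    records = [json.loads(line) for line in f]

# Keep only clips whose licence status is documented, so consent and IP stay traceable.
usable = [r for r in records if r.get("licence") == "contributor-consented"]

# Flag event classes that make up less than 5% of the usable data.
counts = Counter(r["label"] for r in usable)
total = sum(counts.values())
for label, n in counts.most_common():
    share = n / total
    marker = "  <- under-represented" if share < 0.05 else ""
    print(f"{label:20s} {n:6d} ({share:.1%}){marker}")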
One solution to this challenge is Project Hafnia, a platform developed by Milestone Systems in partnership with NVIDIA that leverages NVIDIA NeMo Curator and AI models. Project Hafnia enables data generators to share and utilise their data, and allows developers to access traceable, regulatory-compliant annotated video data that can then be used to train AI models.
One of the first data generators to join the platform is the American city of Dubuque, Iowa. Together with AI analytics company Vaidio, Milestone built a collaborative visual language model that transformed Dubuque's raw video, via anonymization and curation, into powerful training material, improving AI accuracy from 80% to over 95%. This leap forward has enabled smarter traffic management, quicker emergency responses and stronger public safety, all done responsibly and without massive infrastructure overhauls.
With Milestone's recent acquisition of brighter AI, a company specializing in anonymization solutions, a further layer of data privacy has been added to Project Hafnia: brighter AI's technology automatically detects personal identifiers such as faces or license plates and generates synthetic replacements.
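Brighter AI's approach generates realistic synthetic faces and plates; as a much simpler stand-in, the sketch below shows the general detect-then-replace idea using OpenCV's bundled face detector and pixelation. The video path is a placeholder, and this is an illustrative outline rather than brighter AI's actual method.

import cv2

# Load the pre-trained Haar cascade face detector that ships with OpenCV.
face_detector = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml"
)

def pixelate_faces(frame, blocks=12):
    """Return a copy of the frame with detected faces pixelated."""
    out = frame.copy()
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    faces = face_detector.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
    for (x, y, w, h) in faces:
        roi = out[y:y + h, x:x + w]
        # Downscale then upscale to produce a coarse, unrecognisable block pattern.
        small = cv2.resize(roi, (blocks, blocks), interpolation=cv2.INTER_LINEAR)
        out[y:y + h, x:x + w] = cv2.resize(small, (w, h), interpolation=cv2.INTER_NEAREST)
    return out

# Example: anonymise every frame of a clip before it enters a training set.
cap = cv2.VideoCapture("clip.mp4")  # placeholder path
while True:
    ok, frame = cap.read()
    if not ok:
        break
    safe_frame = pixelate_faces(frame)
    # ... write safe_frame to the curated dataset ...
cap.release()

A production pipeline would extend the same detect-then-replace loop to license plates and other identifiers, and would favour realistic synthetic replacements over crude pixelation so that the footage remains useful for training.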
Consolidating and curating data from multiple data generators is one way for developers to obtain enough visual data to develop accurate AI models for detecting events such as vandalism, vehicle accidents and traffic flow.
Synthetic data for hard-to-gather data sets
Another solution comes in the form of synthetic data: artificially generated or augmented datasets that simulate or generalise real-world conditions. Using synthetic data, AI developers can train models on vast amounts of diverse, representative information while mitigating the ethical and legal concerns surrounding privacy and consent.
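The simplest form of this is augmentation of real footage; fully synthetic scenes go further, rendering events that were never filmed at all. As a minimal sketch of the augmentation end of that spectrum, the snippet below turns one real frame into many varied training samples by mirroring it, shifting its brightness and adding sensor-style noise. The file name is a placeholder and the transforms are illustrative choices, not a description of any particular project's pipeline.

import cv2
import numpy as np

rng = np.random.default_rng(0)

def augment(frame):
    """Create a synthetic variant of a frame: mirror, brightness shift and noise."""
    out = frame.copy()
    if rng.random() < 0.5:
        out = cv2.flip(out, 1)                      # horizontal mirror
    gain = rng.uniform(0.7, 1.3)                    # simulate changing light conditions
    out = np.clip(out.astype(np.float32) * gain, 0, 255)
    noise = rng.normal(0, 8, out.shape)             # simulate sensor noise or poor weather
    return np.clip(out + noise, 0, 255).astype(np.uint8)

# Expand a small set of real frames into a larger, more varied training set.
real = cv2.imread("harbour_frame.jpg")  # placeholder path
synthetic_variants = [augment(real) for _ in range(20)]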
For example, at Aalborg Harbour in Denmark, training an AI model to detect individuals falling into the harbour was not possible with real footage because of the danger this would pose to human volunteers. The dataset also needed to include a diversity of human actors, such as wheelchair users, and dummies could not capture that full complexity either. The best solution, therefore, was synthetic data that could expand the training dataset with diverse falling scenarios while avoiding safety and ethics concerns. The AI model developed through this process shows promising results in alerting rescue teams if and when a person falls into the harbour, increasing the chances of survival by minimising response times and reducing cold-water exposure.
Unlocking the potential of AI in video
AI holds great promise for our cities, buildings and individual safety. Yet this can only be realised with AI models that fully capture the complexities of our built environments and human behaviour. Video analytics developers should explore their options when building a comprehensive dataset for AI model training. New, responsible options are emerging - from data consolidated across many data generators to synthetically generated data. It's just a matter of knowing where to look.