AI tools collect and store data about you from all your devices - here's how to be aware of what you're revealing

Like it or not, artificial intelligence has become part of daily life. Many devices, including electric razors and toothbrushes, have become "AI-powered," using machine learning algorithms to track how a person uses the device and how it is working in real time, and to provide feedback. From asking questions to an AI assistant like ChatGPT or Microsoft Copilot to monitoring a daily fitness routine with a smartwatch, many people use an AI system or tool every day.

While AI tools and technologies can make life easier, they also raise important questions about data privacy. These systems often collect large amounts of data, sometimes without people even realizing their data is being collected. The information can then be used to identify personal habits and preferences, and even predict future behaviours by drawing inferences from the aggregated data.
As an assistant professor of cybersecurity at West Virginia University, I study how emerging technologies and various types of AI systems manage personal data and how we can build more secure, privacy-preserving systems for the future. Generative AI software uses large amounts of training data to create new content such as text or images. Predictive AI uses data to forecast outcomes based on past behaviour, such as how likely you are to hit your daily step goal, or what movies you may want to watch. Both types can be used to gather information about you.
How AI tools collect data
Generative AI assistants such as ChatGPT and Google Gemini collect all the information users type into a chat box. Every question, response and prompt that users enter is recorded, stored and analysed to improve the AI model. OpenAI's privacy policy informs users that "we may use content you provide us to improve our Services, for example to train the models that power ChatGPT." Even though OpenAI allows you to opt out of content use for model training, it still collects and retains your personal data. Although some companies promise that they anonymise this data, meaning they store it without naming the person who provided it, there is always a risk of data being reidentified.
Predictive AI
Beyond generative AI assistants, social media platforms like Facebook, Instagram and TikTok continuously gather data on their users to train predictive AI models. Every post, photo, video, like, share and comment, including the amount of time people spend looking at each of these, is collected as data points that are used to build digital data profiles for each person who uses the service. The profiles can be used to refine the social media platform's AI recommender systems. They can also be sold to data brokers, who sell a person's data to other companies to, for instance, help develop targeted advertisements that align with that person's interests.

Many social media companies also track users across websites and applications by putting cookies and embedded tracking pixels on their computers. Cookies are small files that store information about who you are and what you clicked on while browsing a website. One of the most common uses of cookies is in digital shopping carts: When you place an item in your cart, leave the website and return later, the item will still be in your cart because the cookie stored that information.

Tracking pixels are invisible images or snippets of code embedded in websites that notify companies of your activity when you visit their page. This helps them track your behaviour across the internet. This is why users often see or hear advertisements that are related to their browsing and shopping habits on many of the unrelated websites they browse, and even when they are using different devices, including computers, phones and smart speakers. One study found that some websites can store over 300 tracking cookies on your computer or mobile phone.
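To make the mechanism a little more concrete, here is a minimal sketch, in TypeScript, of how a third-party tracker might pair a cookie with a tracking pixel. Everything specific here is invented for illustration - the tracker domain, the cookie name and the query parameters - and real trackers are considerably more sophisticated.

```typescript
// Illustrative only: a browser script that (1) stores a random visitor ID in a
// cookie and (2) loads an invisible 1x1 image whose URL reports that ID and the
// current page to a tracking server. The domain and names below are made up.

function getVisitorId(): string {
  const match = document.cookie.match(/visitor_id=([^;]+)/);
  if (match) return match[1];
  const id = crypto.randomUUID();
  // The cookie persists for a year, so the same ID is seen on every return visit.
  document.cookie = `visitor_id=${id}; max-age=31536000; path=/`;
  return id;
}

function firePixel(): void {
  const pixel = new Image(1, 1); // a 1x1 image the visitor never notices
  pixel.src =
    "https://tracker.example.com/pixel.gif" +
    `?vid=${encodeURIComponent(getVisitorId())}` +
    `&page=${encodeURIComponent(location.href)}`;
  pixel.style.display = "none";
  document.body.appendChild(pixel);
}

firePixel();
```

Because the same snippet can be embedded on many unrelated sites, the ID stored in the cookie lets the tracking server stitch visits to all of those sites into a single browsing profile.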
Data privacy controls - and limitations
Like generative AI platforms, social media platforms offer privacy settings and opt-outs, but these give people limited control over how their personal data is aggregated and monetised. As media theorist Douglas Rushkoff argued in 2011, if the service is free, you are the product.

Many tools that include AI don't require a person to take any direct action for the tool to collect data about that person. Smart devices such as home speakers, fitness trackers and watches continually gather information through biometric sensors, voice recognition and location tracking.

Smart home speakers continually listen for the command to activate or "wake up" the device. As the device is listening for this word, it picks up all the conversations happening around it, even though it does not seem to be active. Some companies claim that voice data is only stored when the wake word - what you say to wake up the device - is detected. However, people have raised concerns about accidental recordings, especially because these devices are often connected to cloud services, which allow voice data to be stored, synced and shared across multiple devices such as your phone, smart speaker and tablet. If the company allows, it's also possible for this data to be accessed by third parties, such as advertisers, data analytics firms or a law enforcement agency with a warrant.
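A rough sketch can show why an idle-looking smart speaker is still listening. In the TypeScript below, readAudioChunk, detectWakeWord and sendToCloud are hypothetical placeholders rather than any real device's API; the point is simply that the microphone is sampled on every pass of the loop, and the rolling buffer may already hold speech from before the wake word was spoken.

```typescript
// Conceptual sketch of an always-on wake-word loop. All three helper functions
// are hypothetical stand-ins, not a real smart-speaker API.

function readAudioChunk(): Float32Array {
  return new Float32Array(320); // placeholder: ~20 ms of audio from the mic
}

function detectWakeWord(_chunk: Float32Array): boolean {
  return false; // placeholder: a real device runs a small on-device model here
}

function sendToCloud(audio: Float32Array[]): void {
  console.log(`uploading ${audio.length} buffered audio chunks`);
}

const rollingBuffer: Float32Array[] = [];
const MAX_CHUNKS = 150; // keep roughly the last few seconds of audio

function listenLoop(): void {
  const chunk = readAudioChunk();    // the microphone is sampled on every pass,
  rollingBuffer.push(chunk);         // whether or not anyone said the wake word
  if (rollingBuffer.length > MAX_CHUNKS) rollingBuffer.shift();

  if (detectWakeWord(chunk)) {
    // Only now does the device appear to "wake up" - but the buffer it sends
    // may include speech captured just before the wake word.
    sendToCloud(rollingBuffer);
  }
  setTimeout(listenLoop, 20); // run again ~50 times per second
}

listenLoop();
```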
Privacy rollbacks
This potential for third-party access also applies to smartwatches and fitness trackers, which monitor health metrics and user activity patterns. Companies that produce wearable fitness devices are not considered "covered entities" under the Health Insurance Portability and Accountability Act, or HIPAA, and so are not bound by it. This means that they are legally allowed to sell health- and location-related data collected from their users. Concerns about this kind of data arose in 2018, when Strava, a fitness company, released a global heat map of users' exercise routes. In doing so, it accidentally revealed sensitive military locations across the globe by highlighting the exercise routes of military personnel.

The Trump administration has tapped Palantir, a company that specializes in using AI for data analytics, to collate and analyse data about Americans. Meanwhile, Palantir has announced a partnership with a company that runs self-checkout systems. Such partnerships can expand corporate and government reach into everyday consumer behaviour. This one could be used to create detailed personal profiles on Americans by linking their consumer habits with other personal data. It raises concerns about increased surveillance and loss of anonymity, and could allow citizens to be tracked and analysed across multiple aspects of their lives without their knowledge or consent.
Some smart device companies are also rolling back privacy protections instead of strengthening them. Amazon recently announced that starting on March 28, 2025, all voice recordings from Amazon Echo devices would be sent to Amazon's cloud by default, and that users would no longer have the option to turn this function off. This is different from previous settings, which allowed users to limit private data collection. Changes like these raise concerns about how much control consumers have over their own data when using smart devices. Many privacy experts consider cloud storage of voice recordings a form of data collection, which has implications for data privacy laws designed to protect online privacy - especially when the recordings are used to improve algorithms or build user profiles.
Implications for data privacy
All of this brings up serious privacy concerns for people and governments about how AI tools collect, store, use and transmit data. The biggest concern is transparency: people don't know what data is being collected, how it is being used, and who has access to it. Companies tend to use complicated privacy policies filled with technical jargon to make it difficult for people to understand the terms of a service that they agree to. People also tend not to read terms of service documents. One study found that people spent an average of 73 seconds reading terms of service documents that would take an estimated 29 to 32 minutes to read in full.

Data collected by AI tools may initially reside with a company that you trust, but it can easily be sold or given to a company that you don't. AI tools, the companies in charge of them and the companies that have access to the data they collect can also be subject to cyberattacks and data breaches that can reveal sensitive personal information. These attacks can be carried out by cybercriminals who are in it for the money, or by so-called advanced persistent threats, which are typically nation-state-sponsored attackers who gain access to networks and systems and remain there undetected, collecting information and personal data to eventually cause disruption or harm.
While laws and regulations such as the General Data Protection Regulation in the European Union and the California Consumer Privacy Act aim to safeguard user data, AI development and use have often outpaced the legislative process. The laws are still catching up on AI and data privacy. For now, you should assume any AI-powered device or platform is collecting data on your inputs, behaviours and patterns.
Using AI tools
AI tools collect people's data, and the way this accumulated data can affect their privacy is concerning. At the same time, the tools can be genuinely useful: AI-powered applications can streamline workflows, automate repetitive tasks and provide valuable insights. It's crucial, however, to approach these tools with awareness and caution.
When using a generative AI platform that gives you answers to questions you type in a prompt, don't include any personally identifiable information, including names, birth dates, Social Security numbers or home addresses. At the workplace, don't include trade secrets or classified information. In general, don't put anything into a prompt that you wouldn't feel comfortable revealing to the public or seeing on a billboard. Remember, once you hit enter on the prompt, you've lost control of that information.

Remember that devices which are turned on are always listening - even if they're asleep. If you use smart home or embedded devices, turn them off when you need to have a private conversation. A device that's asleep looks inactive, but it is still powered on and listening for a wake word or signal. Unplugging a device or removing its batteries is a good way of making sure the device is truly off.

Finally, be aware of the terms of service and data collection policies of the devices and platforms that you are using. You might be surprised by what you've already agreed to.
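As a technical backstop for the first piece of advice, you can run a simple pre-filter over a prompt before it leaves your machine. The TypeScript sketch below is illustrative and assumes a purely regex-based scrub: it catches common formats such as US Social Security numbers, email addresses and phone numbers, but it will miss names, home addresses and much else, so it is no substitute for thinking about what you type.

```typescript
// Illustrative pre-filter that replaces a few easily recognised identifiers
// before a prompt is sent to an AI assistant. The patterns are examples only
// and are nowhere near exhaustive.

const PII_PATTERNS: Array<[RegExp, string]> = [
  [/\b\d{3}-\d{2}-\d{4}\b/g, "[SSN]"],                    // e.g. 123-45-6789
  [/\b[\w.+-]+@[\w-]+\.[\w.-]+\b/g, "[EMAIL]"],           // e.g. jane@example.com
  [/\(?\d{3}\)?[-.\s]?\d{3}[-.\s]?\d{4}\b/g, "[PHONE]"],  // e.g. (304) 555-0123
];

function scrubPrompt(prompt: string): string {
  // Apply each pattern in turn, swapping matches for a neutral label.
  return PII_PATTERNS.reduce(
    (text, [pattern, label]) => text.replace(pattern, label),
    prompt,
  );
}

// Example:
console.log(
  scrubPrompt("My SSN is 123-45-6789; reach me at jane.doe@example.com or (304) 555-0123."),
);
// -> "My SSN is [SSN]; reach me at [EMAIL] or [PHONE]."
```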

Related Articles

Bhavish Aggarwal's Krutrim bets on India-first AI to rival global peers

Business Standard
Krutrim, the artificial intelligence startup founded by Ola's Bhavish Aggarwal, is positioning its recently launched flagship assistant, Kruti, to stand apart from global peers like OpenAI's ChatGPT and Google's Gemini by leveraging deep local integration, multilingual capabilities, and agentic intelligence tailored to India's unique digital ecosystem. The company calls Kruti India's first agentic AI, capable of booking cabs, paying bills, and generating images while supporting 13 Indian languages using a localised large language model. In the Indian context, the firm competes with global AI giants such as OpenAI, Anthropic and Google, as well as local players such as Sarvam AI.

'Our key differentiator will come with integrating local services,' said Sunit Singh, Senior Vice-President for Product at Krutrim. 'That's not something that will be very easy for global players to do.' Krutrim has already integrated India-specific services, with plans to scale this integration further. The strategy aims to embed Kruti deeply into Indian digital life, allowing it to perform functional tasks through local service connections. This is an area where international competitors may struggle due to regulatory and infrastructural complexities in the Indian market.

Voice-first

As Krutrim positions Kruti to serve India's linguistically diverse population, the company is doubling down on voice-first, multilingual AI as a core enabler of scale and accessibility. Navendu Agarwal, Group CIO of Ola, emphasised that India's unique language landscape demands a fundamentally different approach from Western AI products. 'India is a voice-first world. So we are building voice-first models,' Agarwal said, outlining Krutrim's strategy to prioritise natural, speech-driven interactions. Currently, Kruti supports voice commands in multiple Indian languages, with plans underway to expand that footprint. Agarwal said the long-term vision is to enable seamless, speech-based interactions that go deeper into local dialects. The company's multilingual, voice-first design is central to its go-to-market strategy, especially in reaching non-English speakers in semi-urban and rural India. The plan also includes integrating with widely used Indian services and government platforms.

Krutrim's long-term vision for Kruti centres on true agentic intelligence, where the assistant can act autonomously on behalf of users. Whether it's 'book me a cab to the airport' or 'order my usual lunch', Kruti understands intent and executes tasks without micromanagement. 'Think about it - a super agent which can do food, do apps, provide you help and education information and which can also manage your budget and finance,' said Agarwal. 'So that's what is a mega-agent, or the assistant which is communicating with all of them seamlessly wherever it is needed.'

Hybrid technology

Rather than relying solely on a single in-house model, Krutrim has opted for a composite approach aimed at optimising accuracy, scalability and user experience, according to Chandra Khatri, the company's Vice-President and Head of AI. 'The goal is to build the best and most accurate experience,' Khatri said. 'If that means we need to leverage, say Claude for coding, which is the best coding model in the world, we'll do that.' Kruti is powered by Krutrim's latest large language model, Krutrim V2, alongside open-source systems. The AI agents evaluate context-specific needs and choose from this suite of models to deliver tailored responses.
Investments

Krutrim reached unicorn status last year after raising $50 million in equity during its inaugural funding round. The round, which valued the company at $1 billion, included participation from investors such as Matrix Partners India. Earlier this year, company founder Bhavish Aggarwal announced an investment of ₹2,000 crore in Krutrim, with a commitment to invest an additional ₹10,000 crore by next year. The company also launched the Krutrim AI Lab and released some of its work to the open-source community.

As Krutrim's AI assistant begins to interface with highly contextual and personal user data, the company emphasises a stringent, India-first approach to data privacy and regulatory compliance. The company employs internal algorithms to manage and isolate user data, ensuring it remains secure and compartmentalised. While Krutrim is open to competing globally, it remains committed to addressing India's market complexities first. 'We don't shy away from going global. But our primary focus is India first,' Agarwal said. Krutrim's emphasis on embedded, action-oriented intelligence - capable of not just understanding queries but also fulfilling them through integrations - could define its edge in the increasingly competitive AI landscape. Here, localisation and service depth may become as critical as raw model power.

Nearly 7,000 UK University Students Caught Cheating Using AI: Report

NDTV
Nearly 7,000 university students in the UK were caught cheating using ChatGPT and other artificial intelligence tools during the 2023-24 academic year, according to data obtained by The Guardian. As part of the investigation, the British newspaper contacted 155 universities under the Freedom of Information Act. Of those, 131 institutions responded.

The latest figures show 5.1 confirmed cases of AI-related cheating for every 1,000 students, a rise from 1.6 per 1,000 the previous year. Early projections for the current academic cycle suggest the number could climb even higher, to 7.5 per 1,000 students. The growing reliance on AI tools like ChatGPT is proving to be a major challenge for higher education institutions. At the same time, cases of traditional plagiarism have dropped, from 19 per 1,000 students in 2019-20 to 15.2 last year, and the figure is expected to fall further to 8.5 per 1,000.

Experts warn that the recorded cases may be only scratching the surface. "I would imagine those caught represent the tip of the iceberg," said Dr Peter Scarfe, associate professor of psychology at the University of Reading. "AI detection is very unlike plagiarism, where you can confirm the copied text. As a result, in a situation where you suspect the use of AI, it is near impossible to prove, regardless of the percentage AI that your AI detector says (if you use one). This is coupled with not wanting to falsely accuse students."

Evidence suggests AI misuse is far more widespread than reported. A February survey by the Higher Education Policy Institute found that 88 per cent of students admitted to using AI for assessments. Researchers at the University of Reading tested their own systems last year and found AI-generated submissions went undetected 94 per cent of the time.

Online platforms are making it easier. The report found dozens of videos on TikTok promoting AI paraphrasing and essay-writing tools that help students bypass standard university detectors by "humanising" ChatGPT-generated content. Dr Thomas Lancaster, an academic integrity researcher at Imperial College London, said, "When used well and by a student who knows how to edit the output, AI misuse is very hard to prove. My hope is that students are still learning through this process."

Science and technology secretary Peter Kyle told The Guardian that AI should be used to "level up" opportunities for dyslexic children. Tech giants are already targeting students as key users. Google offers university students a free 15-month upgrade to its Gemini AI tool, while OpenAI provides discounted access to students in the US and Canada.
