
From generative AI and multimodal AI to digital twins, AI explains itself
2. Multimodal AI
Above: Gemini, developed by Google DeepMind, is a multimodal AI that can interpret various inputs such as images, text, code and video
Multimodal AI refers to AI systems that can understand and process multiple types of information or 'modes' at once, such as text, images and audio. This allows them to combine different information sources to understand situations more comprehensively.
Think of it like how humans use multiple senses: we hear words, see facial expressions and notice body language simultaneously to understand a conversation. Similarly, multimodal AI might analyse a video by recognising spoken dialogue, understanding visual actions and processing any text that appears on screen. This creates a richer understanding than any single mode could provide alone.
You might experience multimodal AI in tools like ChatGPT Vision or Google Gemini, where you can upload an image and get detailed text-based analysis or ask questions about it, seamlessly blending visuals and language.
Read more: AI leaders are unlocking the future with technology
3. AI companions
Above: A user customises an avatar for their personal AI chatbot Replika through a smartphone app
AI companions are virtual assistants or digital friends powered by AI. These applications are designed to interact with you in a human-like manner; they can chat, help with tasks, answer questions or even provide entertainment.
Unlike simple tools, they aim to offer a sense of relationship and personalisation. They learn your preferences over time, adapting to your needs and habits.
A virtual assistant on your phone might notice you typically set morning alarms and eventually begin suggesting them at your usual times. Some more advanced companions can even engage in casual conversation or remember details from past interactions to create a more continuous relationship.
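The alarm example can be sketched in a few lines of Python. This is only an illustration of the idea, not how any real assistant works: the function name `suggest_alarm` and the `min_repeats` threshold are assumptions made up for this sketch.

```python
from collections import Counter


def suggest_alarm(past_alarms, min_repeats=3):
    """Return the most frequently set alarm time, or None if no habit is clear.

    past_alarms: list of "HH:MM" strings the user has set before (hypothetical data).
    min_repeats: how often a time must recur before it counts as a habit (assumed value).
    """
    if not past_alarms:
        return None
    time, count = Counter(past_alarms).most_common(1)[0]
    return time if count >= min_repeats else None


history = ["06:30", "06:30", "07:00", "06:30", "06:30"]
suggestion = suggest_alarm(history)
print(suggestion)  # 06:30
```

With only one or two alarms on record, the function returns nothing, which mirrors the point that a companion needs time to learn a habit before acting on it.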
However, the impact of AI companion bots on their users is controversial. In 2024, a mother in the US filed a lawsuit against Character Technologies, Inc, claiming that its bot contributed to her son's suicide. She alleged that the teenager openly discussed suicidal thoughts with the bot and engaged in highly sexual conversations with it.
Read more: Meet Moflin, Casio's emotional support robot, and other AI companions
4. Computer vision
Above: A display showing the image processed by a Quantum AI security camera at a booth during the Mobile World Congress 2024 in Spain
Computer vision enables machines to 'see' and make sense of visual information from the world around them. Just as our eyes capture images and our brains interpret them, computer vision systems use cameras to capture visual data and algorithms to understand what they are looking at.
This field encompasses a range of capabilities: image classification, object detection, image segmentation and facial recognition. It has transformed numerous industries, from autonomous vehicles that rely on computer vision to navigate safely to medical imaging systems that help doctors spot diseases in X-rays and MRIs with greater accuracy.
Read more: How AI Guided's Florence Chan helps the visually impaired navigate the world better with a smart belt
5. Digital twins
Above: A tablet displays the buildings at The Bund in Shanghai using a digital twin app
Digital twins are virtual replicas of real-world objects, systems or processes. Imagine having a digital model of your car, your house or even an entire factory that mirrors how the real thing operates and behaves.
They have become increasingly valuable across industries for improving efficiency and predicting potential issues. For example, in luxury real estate, digital twins can preview smart home configurations or model interior lighting throughout the day before the property is ever built.
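To make the idea concrete, here is a minimal sketch of a digital twin in Python. Everything in it is hypothetical: the `PumpTwin` class, the 80°C safety limit and the temperature readings are invented for illustration, and real digital twins rely on far richer models than a straight-line forecast.

```python
from dataclasses import dataclass, field


@dataclass
class PumpTwin:
    """Hypothetical digital twin of a physical pump (illustrative only)."""

    max_safe_temp_c: float = 80.0  # assumed safety limit
    readings: list = field(default_factory=list)

    def sync(self, temp_c):
        """Mirror the latest sensor reading from the real pump."""
        self.readings.append(temp_c)

    def predicted_next_temp(self):
        """Extrapolate the next reading linearly from the last two."""
        if not self.readings:
            return None
        if len(self.readings) < 2:
            return self.readings[-1]
        return self.readings[-1] + (self.readings[-1] - self.readings[-2])

    def needs_attention(self):
        """Flag a potential issue before the real pump reaches the limit."""
        predicted = self.predicted_next_temp()
        return predicted is not None and predicted > self.max_safe_temp_c


twin = PumpTwin()
for temp in [70.0, 74.0, 78.0]:  # made-up sensor readings
    twin.sync(temp)

print(twin.predicted_next_temp())  # 82.0
print(twin.needs_attention())      # True
```

The real pump has not yet exceeded its limit, but the twin's forecast already has, which is the kind of early warning that makes such models useful.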
Read more: The AI book scraping issue explained
6. Synthetic media
Synthetic media refers to content generated or altered using AI and other digital technologies. This includes images, videos, audio and text that are either entirely created by AI or manipulated to create new, realistic experiences.
You might encounter synthetic media in social media filters that transform your appearance in real time, or in music, where AI can compose songs that sound remarkably as if popular artists had performed them.
Above: A CBC News video explores the potential misuse of AI-generated videos in elections (Video: CBC News)
While this technology presents creative opportunities, it also raises significant concerns regarding authenticity and trust. In some instances, individuals have been deceived by synthetic media featuring the voices of their loved ones in distress, leading them to transfer thousands of dollars before realising they had been scammed.
Read more: 7 most expensive AI art pieces ever sold
7. Prompt engineering
Above: A user types a prompt to inquire about OpenAI's ChatGPT
Prompt engineering is the skill of crafting questions or instructions for AI systems in ways that produce the most valuable and accurate responses.
Think of it as learning how to communicate effectively with AI. Just as you might phrase a question differently depending on whom you're asking, prompt engineering involves understanding how to frame your requests to AI systems for optimal results.
For instance, asking a vague question like 'What do you think?' might yield a generic response. However, asking something specific, such as 'What are three critically acclaimed British films released in the past two years?' is more likely to generate a precise and helpful answer.
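One way to picture the difference is to treat a specific prompt as a template with slots to fill in. The short Python sketch below does exactly that; the function `specific_prompt` and its parameter names are assumptions made for this illustration and are not part of any AI tool.

```python
def specific_prompt(count, quality, subject, timeframe):
    """Assemble a specific question from its parts (hypothetical helper)."""
    return f"What are {count} {quality} {subject} released in {timeframe}?"


vague = "What do you think?"
specific = specific_prompt(
    count="three",
    quality="critically acclaimed",
    subject="British films",
    timeframe="the past two years",
)

print(vague)
print(specific)
```

Each slot (how many, what kind, about what, from when) narrows the request, which is why the second prompt is more likely to produce a precise answer.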
As AI becomes more integrated into our daily lives, understanding how to communicate our needs to these systems effectively becomes an increasingly valuable skill.
Credits
This article was created with the assistance of AI tools


Tatler Asia
15-05-2025