15-05-2025
From generative AI and multimodal AI to digital twins, AI explains itself
2. Multimodal AI
Above: Gemini, developed by Google DeepMind, is a multimodal AI that can interpret various inputs such as images, text, code and video
Multimodal AI refers to AI systems that can understand and process multiple types of information or 'modes' at once, such as text, images and audio. This allows them to combine different information sources to understand situations more comprehensively.
Think of it like how humans use multiple senses: we hear words, see facial expressions and notice body language simultaneously to understand a conversation. Similarly, multimodal AI might analyse a video by recognising spoken dialogue, understanding visual actions and processing any text that appears on screen. This creates a richer understanding than any single mode could provide alone.
You might experience multimodal AI in tools like ChatGPT Vision or Google Gemini, where you can upload an image and get detailed text-based analysis or ask questions about it, seamlessly blending visuals and language.
Read more: AI leaders are unlocking the future with technology
3. AI companions
Above: A user customises an avatar for their personal AI chatbot Replika through a smartphone app
AI companions are virtual assistants or digital friends powered by AI. These applications are designed to interact with you in a human-like manner; they can chat, help with tasks, answer questions or even provide entertainment.
Unlike simple tools, they aim to offer a sense of relationship and personalisation. They learn your preferences over time, adapting to your needs and habits.
A virtual assistant on your phone might notice you typically set morning alarms and eventually begin suggesting them at your usual times. Some more advanced companions can even engage in casual conversation or remember details from past interactions to create a more continuous relationship.
Yet there is controversy over the impact of AI companion bots on their users. In 2024, a mother in the US filed a lawsuit against Character Technologies, Inc, claiming that its bot contributed to her son's suicide. She alleged that the teenager openly discussed suicidal thoughts and engaged in highly sexual conversations with it.
Read more: Meet Moflin, Casio's emotional support robot, and other AI companions
4. Computer vision
Above: A display showing the image processed by a Quantum AI security camera at a booth during the Mobile World Congress 2024 in Spain
Computer vision enables machines to 'see' and make sense of visual information from the world around them. Just as our eyes capture images and our brains interpret them, computer vision systems use cameras to capture visual data and algorithms to understand what they are looking at.
This field encompasses a range of capabilities: image classification, object detection, image segmentation and facial recognition. It has transformed numerous industries, from autonomous vehicles that rely on computer vision to navigate safely to medical imaging systems that help doctors spot diseases in X-rays and MRIs with greater accuracy.
Read more: How AI Guided's Florence Chan helps the visually impaired navigate the world better with a smart belt
5. Digital twins
Above: A tablet displays the buildings at The Bund in Shanghai using a digital twin app
Digital twins are virtual replicas of real-world objects, systems or processes. Imagine having a digital model of your car, your house or even an entire factory that mirrors how the real thing operates and behaves.
They have become increasingly valuable across industries to improve efficiency and predict potential issues. For example, in luxury real estate, digital twins can preview smart home configurations or model interior lighting throughout the day before the property is ever built.
Read more: The AI book scraping issue explained
6. Synthetic media
Synthetic media refers to content generated or altered using AI and other digital technologies. This includes images, videos, audio and text that are either entirely created by AI or manipulated to create new, realistic experiences.
You might encounter synthetic media in social media filters that transform your appearance in real time, or in music, where AI can compose songs that sound remarkably as though popular artists had performed them.
Above: A CBC News video explores the potential misuse of AI-generated videos in elections (Video: CBC News)
While this technology presents creative opportunities, it also raises significant concerns regarding authenticity and trust. In some instances, individuals have been deceived by synthetic media featuring the voices of their loved ones in distress, leading them to transfer thousands of dollars before realising they had been scammed.
Read more: 7 most expensive AI art pieces ever sold
7. Prompt engineering
Above: A user types a prompt to inquire about OpenAI's ChatGPT
Prompt engineering is the skill of crafting questions or instructions for AI systems in ways that produce the most valuable and accurate responses.
Think of it as learning how to communicate effectively with AI. Just as you might phrase a question differently depending on whom you're asking, prompt engineering involves understanding how to frame your requests to AI systems for optimal results.
For instance, asking a vague question like 'What do you think?' might yield a generic response. However, asking something specific, such as 'What are three critically acclaimed British films released in the past two years?' is more likely to generate a precise and helpful answer.
As AI becomes more integrated into our daily lives, understanding how to communicate our needs to these systems effectively becomes an increasingly valuable skill.
Credits
This article was created with the assistance of AI tools