PolyU-led research reveals that sensory and motor inputs help large language models represent complex concepts

Malay Mail, 09-06-2025
A research team led by Prof. Li Ping, Sin Wai Kin Foundation Professor in Humanities and Technology, Dean of the PolyU Faculty of Humanities and Associate Director of the PolyU-Hangzhou Technology and Innovation Research Institute, explored the similarities between large language models and human representations, shedding new light on the extent to which language alone can shape the formation and learning of complex conceptual knowledge.
HONG KONG SAR - Media OutReach Newswire - 9 June 2025 - Can one truly understand what "flower" means without smelling a rose, touching a daisy or walking through a field of wildflowers? This question is at the core of a rich debate in philosophy and cognitive science. While embodied cognition theorists argue that physical, sensory experience is essential to concept formation, studies of the rapidly evolving large language models (LLMs) suggest that language alone can build deep, meaningful representations of the world.

By exploring the similarities between LLMs and human representations, researchers at The Hong Kong Polytechnic University (PolyU) and their collaborators have shed new light on the extent to which language alone can shape the formation and learning of complex conceptual knowledge. Their findings also revealed how the use of sensory input for grounding or embodiment – connecting abstract with concrete concepts during learning – affects the ability of LLMs to understand complex concepts and form human-like representations. The study, in collaboration with scholars from Ohio State University, Princeton University and City University of New York, was recently published in Nature Human Behaviour.

Led by Prof. Li Ping, Sin Wai Kin Foundation Professor in Humanities and Technology, Dean of the PolyU Faculty of Humanities and Associate Director of the PolyU-Hangzhou Technology and Innovation Research Institute, the research team selected conceptual word ratings produced by state-of-the-art LLMs, namely ChatGPT (GPT-3.5, GPT-4) and Google LLMs (PaLM and Gemini).
They compared them with human-generated word ratings of around 4,500 words across non-sensorimotor (e.g., valence, concreteness, imageability), sensory (e.g., visual, olfactory, auditory) and motor domains (e.g., foot/leg, mouth/throat) from the highly reliable and validated Glasgow Norms and Lancaster Norms datasets.

The research team first compared pairs of data from individual humans and individual LLM runs to discover the similarity between word ratings across each dimension in the three domains, using results from human-human pairs as the benchmark. This approach could, for instance, highlight to what extent humans and LLMs agree that certain concepts are more concrete than others. However, such analyses might overlook how multiple dimensions jointly contribute to the overall representation of a word. For example, the word pair "pasta" and "roses" might receive equally high olfactory ratings, but "pasta" is in fact more similar to "noodles" than to "roses" when considering appearance and taste. The team therefore conducted representational similarity analysis, treating each word as a vector along multiple attributes of non-sensorimotor, sensory and motor dimensions, for a more complete comparison between humans and LLMs.

The representational similarity analyses revealed that word representations produced by the LLMs were most similar to human representations in the non-sensorimotor domain, less similar for words in the sensory domain and most dissimilar for words in the motor domain. This highlights LLM limitations in fully capturing humans' conceptual understanding. Non-sensorimotor concepts are understood well, but LLMs fall short when representing concepts involving body movement or sensory information such as visual appearance and taste.
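The representational similarity analysis described above can be illustrated with a minimal sketch. It is not the study's actual pipeline: the ratings below are invented toy values, the three dimensions (olfactory, visual, gustatory) are chosen only for illustration, and the choice of Euclidean distance and Spearman correlation is an assumption, since the announcement does not name the metrics used. The idea is to treat each word as a vector of ratings, compute the distance between every pair of words separately for humans and for a model, and then check how well the two sets of distances agree.

```python
from itertools import combinations
from math import sqrt

# Hypothetical toy ratings (NOT from the Glasgow or Lancaster Norms).
# Each vector is [olfactory, visual, gustatory] on an assumed 0-5 scale.
human_ratings = {
    "pasta":   [4.0, 3.5, 4.8],
    "noodles": [3.8, 3.4, 4.7],
    "roses":   [4.1, 4.6, 0.5],
    "daisy":   [2.5, 4.4, 0.3],
}
model_ratings = {
    "pasta":   [3.9, 3.0, 4.5],
    "noodles": [3.7, 3.1, 4.4],
    "roses":   [4.3, 4.8, 0.8],
    "daisy":   [2.9, 4.5, 0.4],
}


def euclidean(u, v):
    """Euclidean distance between two rating vectors."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def pairwise_distances(ratings, words):
    """Distances for every word pair, in a fixed pair order."""
    return [euclidean(ratings[a], ratings[b]) for a, b in combinations(words, 2)]


def rsa_score(human, model):
    """Agreement between human and model word-pair distance structures."""
    words = sorted(set(human) & set(model))
    return spearman(pairwise_distances(human, words),
                    pairwise_distances(model, words))


if __name__ == "__main__":
    print("RSA score (toy data):", rsa_score(human_ratings, model_ratings))
```

In this toy data, "pasta" and "roses" have nearly equal olfactory ratings, yet the full vectors place "pasta" much closer to "noodles", which is the kind of joint structure a single-dimension comparison would miss.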
Motor concepts, which are described less often in language and rely heavily on embodied experiences, are even more challenging for LLMs than sensory concepts like colour, which can be learned from textual data.

In light of the findings, the researchers examined whether grounding would improve the LLMs' performance. They compared the performance of more grounded LLMs trained on both language and visual input (GPT-4, Gemini) with that of LLMs trained on language alone (GPT-3.5, PaLM). They discovered that the more grounded models incorporating visual input exhibited a much higher similarity with human representations.

Prof. Li Ping said, "The availability of both LLMs trained on language alone and those trained on language and visual input, such as images and videos, provides a unique setting for research on how sensory input affects human conceptualisation. Our study exemplifies the potential benefits of multimodal learning, a human ability to simultaneously integrate information from multiple dimensions in the learning and formation of concepts and knowledge in general. Incorporating multimodal information processing in LLMs can potentially lead to a more human-like representation and more efficient human-like performance in LLMs in the future."

Interestingly, this finding is also consistent with previous human studies indicating representational transfer. Humans acquire object-shape knowledge through both visual and tactile experiences, with seeing and touching objects activating the same regions in human brains. The researchers pointed out that – as in humans – multimodal LLMs may use multiple types of input to merge or transfer representations embedded in a continuous, high-dimensional space. Prof. Li added, "The smooth, continuous structure of embedding space in LLMs may underlie our observation that knowledge derived from one modality could transfer to other related modalities.
This could explain why congenitally blind and normally sighted people can have similar representations in some areas. Current limits in LLMs are clear in this respect."

Ultimately, the researchers envision a future in which LLMs are equipped with grounded sensory input, for example, through humanoid robotics, allowing them to actively interpret the physical world and act accordingly. Prof. Li said, "These advances may enable LLMs to fully capture embodied representations that mirror the complexity and richness of human cognition, and a rose in LLM's representation will then be indistinguishable from that of humans."

Hashtag: #PolyU #HumanCognition #LargeLanguageModels #LLMs #GenerativeAI
The issuer is solely responsible for the content of this announcement.