Speaker Details

Speaker 1

Mohan Kankanhalli

National University of Singapore

Mohan Kankanhalli is Provost's Chair Professor of Computer Science at the National University of Singapore (NUS), where he is the Director of the NUS AI Institute. He is also the Deputy Executive Chairman of AI Singapore, which is Singapore's National AI R&D Program. Mohan obtained his BTech from IIT Kharagpur and his MS and PhD from Rensselaer Polytechnic Institute.

Mohan's research interests are in Multimodal Computing, Computer Vision, and Trustworthy AI. As Director of the NUS AI Institute, he leads initiatives on multimodal models and trustworthy machine learning. He also holds leadership roles in multimedia computing, serving as Senior Editor of the ACM Transactions on Multimedia Computing journal and Associate Editor-in-Chief of IEEE Multimedia magazine.

Mohan is a member of the World Economic Forum's Global Future Council on Artificial Intelligence. He is an IEEE Fellow.

Talk

Title: Safety and Trustworthiness Challenges for LLMs

Abstract: Safe deployment of LLMs is of significant concern due to the anticipated wide adoption of generative AI in businesses and daily life. However, many safety issues remain unresolved and are still under active study. We present two recent works in this area. The first is an empirical approach (PromptAttack) that audits an LLM's adversarial robustness via a prompt-based adversarial attack. PromptAttack converts adversarial textual attacks into an attack prompt that causes the victim LLM to output an adversarial sample that fools itself, and it uses a fidelity filter to ensure that the adversarial examples preserve the semantic meaning of the original inputs. The second is a theoretical work on hallucinations in LLMs. Since hallucination has been recognized as a serious issue with current models, many mitigation strategies have been proposed; however, these efforts have been mostly empirical, so it is not clear whether hallucinations can ever be completely eliminated. We formalize the problem and employ results from learning theory to show that it is impossible to eliminate hallucinations in LLMs. We will then discuss challenges and open research questions whose resolution can help towards the safe deployment of LLMs.
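
To make the prompt-based attack idea concrete, the Python sketch below illustrates the general recipe described in the abstract: an attack prompt asks the victim model to rewrite an input so that a classifier's prediction would flip, and a fidelity check discards rewrites that drift too far from the original. This is a minimal illustration, not the authors' PromptAttack implementation; query_llm is a placeholder for any chat-completion call, and the character-level similarity check with a 0.85 threshold is an illustrative stand-in for the paper's fidelity filter.

# Minimal sketch of a prompt-based adversarial attack with a fidelity filter.
# NOT the authors' PromptAttack implementation: query_llm is a placeholder for
# a real LLM call, and the similarity check below only approximates the idea
# of a fidelity filter. All names and thresholds are illustrative assumptions.
from difflib import SequenceMatcher

ATTACK_TEMPLATE = (
    'The original sentence is: "{original}".\n'
    "Rewrite it with small character- or word-level changes so that its "
    "meaning is preserved but a sentiment classifier would flip its prediction.\n"
    "Output only the rewritten sentence."
)

def query_llm(prompt: str) -> str:
    """Placeholder for a real LLM call (e.g., a chat-completion API).

    Here it just returns a toy perturbation so the sketch runs end to end.
    """
    original = prompt.split('"')[1]
    return original.replace("good", "go0d")  # toy character-level perturbation

def fidelity_ok(original: str, candidate: str, threshold: float = 0.85) -> bool:
    """Crude fidelity check: keep candidates that stay textually close to the
    original (a real filter would also verify semantic equivalence)."""
    return SequenceMatcher(None, original, candidate).ratio() >= threshold

def prompt_attack(original: str) -> str | None:
    """Ask the victim LLM to produce an adversarial rewrite of `original`,
    and accept it only if it passes the fidelity check."""
    candidate = query_llm(ATTACK_TEMPLATE.format(original=original))
    return candidate if fidelity_ok(original, candidate) else None

if __name__ == "__main__":
    sample = "The movie was surprisingly good and heartwarming."
    print(prompt_attack(sample))

In practice, query_llm would call the victim LLM itself, and the fidelity check would use a stronger semantic-similarity measure than raw character overlap.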