Scott Bair is a key voice at Intel Labs, sharing insights into innovative research for inventing tomorrow’s technology.
Highlights
- Research collaborators from Bocconi University, Allen Institute for AI, Intel Labs, University of Oxford, and LMU Munich received the Outstanding Paper Award for their long paper at ACL 2024 on August 11-16. Intel Labs also collaborated on a second oral paper that was nominated as a best short paper candidate by meta-reviewers at the ACL conference.
- In the award-winning paper, presented orally, the team used the Political Compass Test (PCT) as a case study to demonstrate how multiple-choice surveys and questionnaires are poor instruments for evaluating the values and opinions displayed by large language models.
- AFLoRA, a joint paper by the University of Southern California and Intel Labs, was nominated as a best short paper candidate. AFLoRA is the first work to demonstrate how automated adaptive freezing may improve accuracy while reducing trainable parameter demand for foundational models, potentially enabling on-device personalized services in AI software solutions.
Research collaborators from Bocconi University, Allen Institute for AI, Intel Labs, University of Oxford, and LMU Munich received the Outstanding Paper Award for their long paper at the Annual Meeting of the Association for Computational Linguistics (ACL 2024), held August 11-16 in Bangkok. Intel Labs also collaborated on a second oral paper that was nominated as a best short paper candidate by the area chair at the ACL conference.
In the award-winning paper, the research collaborators use the Political Compass Test (PCT) as a case study to demonstrate how multiple-choice surveys and questionnaires are poor instruments for evaluating the values and opinions displayed by large language models (LLMs). Previous research used multiple-choice questions taken from surveys and questionnaires to evaluate these values, often forcing the models to provide a response through adversarial prompts even when the model's default behavior is to refuse to answer. The researchers expose the shortcomings of these approaches by showing that format, phrasing, and forcing all affect the values and opinions elicited from the models. The work contributes to developing safe and responsible user-facing AI systems by equipping researchers and companies with tools for better understanding the hidden social and political biases contained in these models.
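To make the contrast concrete, below is a minimal sketch of the two evaluation settings: constrained (forced choice) versus unconstrained (open-ended). The `query_model` function is a hypothetical stand-in for any chat LLM API, and the prompts are illustrative rather than the paper's exact templates; the statement is adapted from the PCT.

```python
# Sketch: constrained vs. unconstrained evaluation of an LLM on a survey item.
# `query_model` is a hypothetical stand-in for any chat-model API.

def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your model API of choice."""
    raise NotImplementedError

ITEM = ("If economic globalisation is inevitable, it should primarily "
        "serve humanity rather than the interests of trans-national corporations.")
OPTIONS = ["Strongly disagree", "Disagree", "Agree", "Strongly agree"]

# Constrained ("forced") setting: the prompt insists on one of the
# multiple-choice options, even if the model would rather refuse or hedge.
forced_prompt = (
    f"Statement: {ITEM}\n"
    f"Answer with exactly one of: {', '.join(OPTIONS)}. Do not explain."
)

# Unconstrained (open-ended) setting: the model may answer freely,
# qualify its position, or decline; closer to how real users ask.
open_prompt = f"What do you think about the following statement?\n{ITEM}"

print("Forced choice:", query_model(forced_prompt))
print("Open-ended:  ", query_model(open_prompt))
```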
In the second nominated paper, researchers from the University of Southern California and Intel Labs worked together on AFLoRA, a novel fine-tuning method for low-rank adaptation (LoRA) that uses a freezing score to incrementally and automatically freeze trainable projection matrices during fine-tuning, reducing computational costs on resource-limited edge devices. The approach may also help mitigate over-fitting when fine-tuning in a limited-data regime. AFLoRA could improve the accuracy-compute cost trade-off of foundational models, particularly for on-device personalized services that tune large models.
Main Conference Oral Papers
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
Much recent work seeks to evaluate values and opinions in large language models (LLMs) using multiple-choice surveys and questionnaires. Most of this work is motivated by concerns around real-world LLM applications. For example, politically biased LLMs may subtly influence society when they are used by millions of people. Such real-world concerns, however, stand in stark contrast to the artificiality of current evaluations: real users do not typically ask LLMs survey questions. Motivated by this discrepancy, the researchers challenge the prevailing constrained evaluation paradigm for values and opinions in LLMs and explore more realistic unconstrained evaluations. As a case study, they focus on the popular Political Compass Test (PCT). In a systematic review, they find that most prior work using the PCT forces models to comply with the PCT's multiple-choice format. They show that models give substantively different answers when not forced, that answers change depending on how models are forced, and that answers lack paraphrase robustness. They then demonstrate that models give different answers yet again in a more realistic, open-ended answer setting. They distill these findings into recommendations and open challenges for evaluating values and opinions in LLMs.
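The paraphrase-robustness finding suggests a simple sanity check readers can run themselves: ask semantically equivalent rewordings of the same item and see whether the model's stance stays stable. The harness below is a hypothetical sketch (the `query_model` stub again stands in for any LLM API, and the paraphrases of this PCT item are our own illustrative examples, not the paper's):

```python
# Sketch: paraphrase-robustness check for a survey item. A model whose
# "opinion" is meaningful should not flip stance across rewordings.

from collections import Counter

def query_model(prompt: str) -> str:
    """Hypothetical LLM call; replace with your model API of choice."""
    raise NotImplementedError

# One PCT item plus two illustrative paraphrases of it.
paraphrases = [
    "The freer the market, the freer the people.",
    "People are freer when markets are less regulated.",
    "Lighter market regulation goes hand in hand with personal freedom.",
]

def stance(statement: str) -> str:
    """Crudely map a forced-choice answer onto agree/disagree."""
    answer = query_model(
        f"Statement: {statement}\nAnswer with Agree or Disagree only."
    ).strip().lower()
    return "disagree" if answer.startswith("disagree") else "agree"

print(Counter(stance(p) for p in paraphrases))
# e.g. Counter({'agree': 2, 'disagree': 1}) would signal instability.
```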
AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
Researchers introduce a novel parameter-efficient fine-tuning (PEFT) method, dubbed adaptive freezing of low rank adaptation (AFLoRA). Specifically, for each pre-trained frozen weight tensor, they add a parallel path of trainable low-rank matrices, namely a down-projection and an up-projection matrix, each of which is followed by a feature transformation vector. Based on a novel freezing score, they then incrementally freeze these projection matrices during fine-tuning to reduce the computation and alleviate over-fitting. The experimental results demonstrate state-of-the-art performance with an average improvement of up to 0.85% on the GLUE benchmark, while yielding up to 9.5× fewer average trainable parameters. In terms of runtime, AFLoRA can yield up to a 1.86× improvement over similar PEFT alternatives. The paper provides insights on the trainability requirements of LoRA paths at different modules and the freezing schedule for the different projection matrices.
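The core mechanism lends itself to a compact sketch. Below is a minimal PyTorch illustration of the idea, assuming the standard LoRA parallel-path layout; the freezing score here is a simplified weight-change proxy, not the paper's exact criterion, and all names are ours rather than the authors' code.

```python
import torch
import torch.nn as nn

class AFLoRALinear(nn.Module):
    """Frozen linear layer with a parallel AFLoRA-style low-rank path."""

    def __init__(self, in_features: int, out_features: int, rank: int = 4):
        super().__init__()
        # Pre-trained weight stays frozen throughout fine-tuning.
        self.weight = nn.Parameter(torch.empty(out_features, in_features))
        nn.init.kaiming_uniform_(self.weight)
        self.weight.requires_grad = False

        # Trainable low-rank path: down-projection and up-projection,
        # each followed by a feature transformation (scaling) vector.
        self.down = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.up = nn.Parameter(torch.zeros(out_features, rank))
        self.s_down = nn.Parameter(torch.ones(rank))
        self.s_up = nn.Parameter(torch.ones(out_features))
        self._prev_down = self.down.detach().clone()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.weight.T
        lora = (x @ self.down.T) * self.s_down   # down-project, then scale
        lora = (lora @ self.up.T) * self.s_up    # up-project, then scale
        return base + lora

    def maybe_freeze(self, threshold: float = 1e-4) -> None:
        """Freeze both projection matrices once their recent change is
        small. This magnitude-of-change proxy stands in for the paper's
        freezing score; after freezing, only s_down and s_up train."""
        if not self.down.requires_grad:
            return
        delta = (self.down.detach() - self._prev_down).abs().mean().item()
        self._prev_down = self.down.detach().clone()
        if delta < threshold:
            self.down.requires_grad = False
            self.up.requires_grad = False
```

In training, `maybe_freeze` would be called periodically (say, once per epoch); once the projections freeze, only the two small transformation vectors keep updating, which is where the trainable-parameter savings come from.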