Accepted Main Conference Papers

Long Papers

  • Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models
    Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Qing Li, Yong Jiang, Zhihao Jia
  • Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances
    Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao
  • MAGE: Machine-generated Text Detection in the Wild
    Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Zhilin Wang, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang
  • PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models
    Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi YAO, Yuan Yao, Yangqiu Song
  • GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators
    Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, EngSiong Chng
  • Exploring Chain-of-Thought for Multi-modal Metaphor Detection
    Yanzhi Xu, Yueying Hua, Shichen Li, Zhongqing Wang
  • BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation
    DaYou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu
  • A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation
    Kai Chen, Ye Wang, Yitong Li, Aiping Li, Han Yu, Xin Song
  • Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation
    Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou
  • CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers
    Yong Hu, Fandong Meng, Jie Zhou
  • Evaluating Dynamic Topic Models
    Charu Karakkaparambil James, Mayank Nagda, Nooshin Haji Ghassemi, Marius Kloft, Sophie Fellenz
  • How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition
    Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou
  • Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification
    Shanshan Xu, Santosh T.Y.S.S, Oana Ichim, Barbara Plank, Matthias Grabmair
  • Inference to the Best Explanation in Large Language Models
    Dhairya Dalal, Marco Valentino, Andre Freitas, Paul Buitelaar
  • A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus
    Eduard Poesina, Cornelia Caragea, Radu Tudor Ionescu
  • DeVAn: Dense Video Annotation for Video-Language Models
    Tingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fang, Ding Zhou, Huaibo Huang, Ran He, Hongxia Yang
  • MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering
    Xiusi Chen, Jyun-Yu Jiang, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Wei Wang
  • SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs
    Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu
  • SciMON: Scientific Inspiration Machines Optimized for Novelty
    Qingyun Wang, Doug Downey, Heng Ji, Tom Hope
  • Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction
    Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang
  • Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models
    Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami
  • Retrieval-Augmented Multilingual Knowledge Editing
    Weixuan Wang, Barry Haddow, Alexandra Birch
  • Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge
    Brendan Park, Madeline Janecek, Naser Ezzati-Jivan, Yifeng Li, Ali Emami
  • Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models
    Abhishek Kumar, Sarfaroz Yunusov, Ali Emami
  • Framing in the Presence of Supporting Data: A Case Study in U.S. Economic News
    Alexandria Leto, Elliot E. Pickens, Coen D. Needell, David Rothschild, Maria Leonor Pacheco
  • Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences
    Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Fuxiao Liu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang
  • TTM-RE: Memory-Augmented Document-Level Relation Extraction
    Chufan Gao, Xuan Wang, Jimeng Sun
  • Answer is All You Need: Instruction-following Text Embedding via Answering the Question
    Letian Peng, Yuwei Zhang, Zilong Wang, Jayanth Srinivasa, Gaowen Liu, Zihan Wang, Jingbo Shang
  • Explore Spurious Correlations at the Concept Level in Language Models for Text Classification
    Yuhang Zhou, Paiheng Xu, Xiaoyu Liu, Bang An, Wei Ai, Furong Huang
  • Every Answer Matters: Evaluating Commonsense with Probabilistic Measures
    Qi Cheng, Michael Boratko, Pranay Kumar Yelugam, Tim O’Gorman, Nalini Singh, Andrew McCallum, Xiang Lorraine Li
  • GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis
    Yueqi XIE, Minghong Fang, Renjie Pi, Neil Zhenqiang Gong
  • How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs
    Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi
  • Pouring Your Heart Out: Investigating the Role of Figurative Language in Online Expressions of Empathy
    Gyeongeun Lee, Christina Wong, Meghan Guo, Natalie Parde
  • An Information-Theoretic Approach to Analyze NLP Classification Tasks
    Luran Wang, Mark Gales, Vatsal Raina
  • Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders
    Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su, Hwanjun Song, Saab Mansour
  • Wav2Gloss: Generating Interlinear Glossed Text from Speech
    Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel Romney Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R Mortensen, Lori Levin
  • Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political Relation Classification
    Yibo Hu, Erick Skorupa Parolin, Latifur Khan, Patrick Brandt, Javier Osorio, Vito D’Orazio
  • SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation
    Ziyao Xu, Houfeng Wang
  • OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following
    Haochen Shi, Zhiyuan Sun, Xingdi Yuan, Marc-Alexandre Côté, Bang Liu
  • Multimodal Instruction Tuning with Conditional Mixture of LoRA
    Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang
  • DocLens: Multi-aspect Fine-grained Medical Text Evaluation
    Yiqing Xie, Sheng Zhang, Hao Cheng, Pengfei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn Rose
  • FOFO: A Benchmark to Evaluate LLMs’ Format-Following Capability
    Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong
  • Hyper-CL: Conditioning Sentence Representations with Hypernetworks
    Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim
  • Analysis of Multi-Source Language Training in Cross-Lingual Transfer
    Seonghoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim
  • ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions
    Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramaneswaran S, S Sakshi, Dinesh Manocha
  • The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants
    Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa
  • Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving
    Chenyang An, Zhibo Chen, Qihao Ye, Emily First, Letian Peng, Jiayun Zhang, Zihan Wang, Sorin Lerner, Jingbo Shang
  • Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach
    Saehyung Lee, Sangwon Yu, Junsung Park, Jihun Yi, Sungroh Yoon
  • IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction
    Inna Wanyin Lin, Ashish Sharma, Christopher Michael Rytting, Adam S Miner, Jina Suh, Tim Althoff
  • Token-wise Influential Training Data Retrieval for Large Language Models
    Huawei Lin, Jikai Long, Zhaozhuo Xu, Weijie Zhao
  • Tree-of-Counterfactual Prompting for Zero-Shot Stance Detection
    Maxwell Weinzierl, Sanda Harabagiu
  • VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks
    Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Ruslan Salakhutdinov, Daniel Fried
  • FineSurE: Fine-grained Summarization Evaluation using LLMs
    Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour
  • Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback
    Daechul Ahn, Yura Choi, Youngjae Yu, Dongyeop Kang, Jonghyun Choi
  • Prompt Refinement with Image Pivot for Text-to-Image Generation
    Jingtao Zhan, Qingyao Ai, Yiqun LIU, Yingwei Pan, Ting Yao, Jiaxin Mao, Shaoping Ma, Tao Mei
  • The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models
    Adithya Bhaskar, Dan Friedman, Danqi Chen
  • Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation
    Masato Mita, Soichiro Murakami, Akihiko Kato, Peinan Zhang
  • AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation
    Zhaowei Wang, Wei Fan, Qing Zong, Hongming Zhang, Sehyun Choi, Tianqing Fang, Xin Liu, Yangqiu Song, Ginny Wong, Simon See
  • Reflect-RL: Two-Player Online RL Fine-Tuning for LMs
    Runlong Zhou, Simon Shaolei Du, Beibin Li
  • Can ChatGPT’s Performance be Improved on Verb Metaphor Detection Tasks? Bootstrapping and Combining Tacit Knowledge
    Cheng Yang, Puli Chen, Qingbao Huang
  • Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning
    Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu
  • An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation
    Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin
  • RORA: Robust Free-Text Rationale Evaluation
    Zhengping Jiang, Yining Lu, Hanjie Chen, Daniel Khashabi, Benjamin Van Durme, Anqi Liu
  • Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents
    Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun
  • Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models
    Lei Li, Yuqi Wang, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu
  • L-Eval: Instituting Standardized Evaluation for Long Context Language Models
    Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu
  • DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages
    Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, Antonios Anastasopoulos
  • InstructProtein: Aligning Human and Protein Language via Knowledge Instruction
    Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen
  • Causal-Guided Active Learning for Debiasing Large Language Models
    Zhouhao Sun, Li Du, Xiao Ding, Yixuan Ma, Yang Zhao, Kaitao Qiu, Ting Liu, Bing Qin
  • ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models
    Aparna Elangovan, Ling Liu, Lei Xu, Sravan Babu Bodapati, Dan Roth
  • Linguistically Conditioned Semantic Textual Similarity
    Jingxuan Tu, Keer Xu, Liulu Yue, Bingyang Ye, Kyeongmin Rim, James Pustejovsky
  • Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future
    Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Tao He, Haotian Wang, Weihua Peng, Ming Liu, Bing Qin, Ting Liu
  • TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models
    Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Haotian Wang, Ming Liu, Bing Qin
  • BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering
    Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang, Kun Zhu, Xiyuan Du, Weijiang Yu, Ming Liu, Bing Qin
  • ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base
    Siyu Yuan, Jiangjie Chen, Changzhi Sun, Jiaqing Liang, Yanghua Xiao, Deqing Yang
  • TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation
    Yujie Feng, Xu Chu, Yongxin Xu, Guangyuan SHI, Bo LIU, Xiao-Ming Wu
  • DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
    Damai Dai, Chengqi Deng, Chenggang Zhao, R.X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y.K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang
  • Grounding Language Model with Chunking-Free In-Context Retrieval
    Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou
  • Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation
    Jiaxin Bai, Yicheng Wang, Tianshi Zheng, Yue Guo, Xin Liu, Yangqiu Song
  • Active Prompting with Chain-of-Thought for Large Language Models
    Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang
  • EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs
    Xiangyu Zhao, Bo LIU, Qijiong Liu, Guangyuan SHI, Xiao-Ming Wu
  • Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search
    Haochen Li, Xin Zhou, Zhiqi Shen
  • A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications
    Naomi Baes, Nick Haslam, Ekaterina Vylomova
  • Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal
    Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su
  • Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency
    Baizhou Huang, Shuai Lu, Xiaojun Wan, Nan Duan
  • Citation-Enhanced Generation for LLM-based Chatbots
    Weitao Li, Junkai Li, Weizhi Ma, Yang Liu
  • Transitive Consistency Constrained Learning for Entity-to-Entity Stance Detection
    Haoyang Wen, Eduard Hovy, Alexander G Hauptmann
  • Feature-Adaptive and Data-Scalable In-Context Learning
    Jiahao Li, Quan Wang, Licheng Zhang, Guoqing Jin, Zhendong Mao
  • Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games
    Yizhe Zhang, Jiarui Lu, Navdeep Jaitly
  • WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models
    Shangqing Tu, Yuliang Sun, Yushi Bai, Jifan Yu, Lei Hou, Juanzi Li
  • Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models
    Yida Zhao, Chao Lou, Kewei Tu
  • A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation
    Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng, Min Zhang
  • PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents
    Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang
  • Probing Language Models for Pre-training Data Detection
    Zhenhua Liu, Tong Zhu, Chuanyuan Tan, Bing Liu, Haonan Lu, Wenliang Chen
  • Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding
    Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua
  • IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation
    Senyu Han, Lu Chen, Li-Min Lin, Zhengshan Xu, Kai Yu
  • Language Model Adaption for Reinforcement Learning with Natural Language Action Space
    Jiangxing Wang, Jiachen Li, Xiao Han, Deheng Ye, Zongqing Lu
  • Evaluating Intention Detection Capability of Large Language Models in Persuasive Dialogues
    Hiromasa Sakurai, Yusuke Miyao
  • LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression
    Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu
  • Persuading across Diverse Domains: a Dataset and Persuasion Large Language Model
    Chuhao Jin, Kening Ren, Lingzhen Kong, Xiting Wang, Ruihua Song, Huan Chen
  • HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy
    Mengxi Xiao, Qianqian Xie, Ziyan Kuang, Zhicheng Liu, Kailai Yang, Min Peng, Weiguang Han, Jimin Huang
  • Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition
    Zirun Guo, Tao Jin, Zhou Zhao
  • An Effective Pronunciation Assessment Approach Leveraging Hierarchical Transformers and Pre-training Strategies
    Bi-Cheng Yan, Jiun-Ting Li, Yi-Cheng Wang, Hsin Wei Wang, Tien-Hong Lo, Yung-Chang Hsu, Wei-Cheng Chao, Berlin Chen
  • Detection-Correction Structure via General Language Model for Grammatical Error Correction
    Wei Li, Houfeng Wang
  • Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer
    Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu
  • Selene: Pioneering Automated Proof in Software Verification
    Lichen Zhang, Shuai Lu, Nan Duan
  • Dissecting Human and LLM Preferences
    Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, Pengfei Liu
  • UniCoder: Scaling Code Large Language Model via Universal Code
    Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li
  • AoE: Angle-optimized Embeddings for Semantic Textual Similarity
    Xianming LI, Jing Li
  • InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews
    Xintao Wang, Yunze Xiao, Jen-tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao
  • Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better
    Shengchao Liu, Xiaoming Liu, Yichen Wang, Zehua Cheng, Chengzhengxu Li, Zhaohan Zhang, Yu Lan, Chao Shen
  • AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators
    Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold
  • Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering
    Tobias Schimanski, Jingwei Ni, Mathias Kraus, Elliott Ash, Markus Leippold
  • LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin
    Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Wei Shen, Limao Xiong, Yuhao Zhou, Xiao Wang, Zhiheng Xi, Xiaoran Fan, Shiliang Pu, Zhu Jiang, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
  • Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation
    Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen M. Meng
  • M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions
    Zheng Wang, Shu Xian Teo, Jieer Ouyang, Yongjun Xu, Wei Shi
  • AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension
    Qian Yang, Jin Xu, Wenrui Liu, Yunfei Chu, Ziyue Jiang, Xiaohuan Zhou, Yichong Leng, Yuanjun Lv, Zhou Zhao, Chang Zhou, Jingren Zhou
  • Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies
    Tom Kocmi, Vilém Zouhar, Christian Federmann, Matt Post
  • ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models
    Yuanyi Ren, Haoran Ye, Hanjun Fang, Xin Zhang, Guojie Song
  • DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction
    Ling Hu, Yuemei Xu
  • SparseFit: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations
    Jesus Solano, Mardhiyah Sanni, Oana-Maria Camburu, Pasquale Minervini
  • Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation
    Wen Wu, Bo Li, Chao Zhang, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N Sainath, Phil Woodland
  • REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation
    Jinyuan Fang, Zaiqiao Meng, Craig MacDonald
  • Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks
    Yingji Zhang, Danilo Carvalho, Andre Freitas
  • MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation
    Yan Ma, Yu Qiao, Pengfei Liu
  • Open-Set Semi-Supervised Text Classification via Adversarial Disagreement Maximization
    Junfan Chen, Richong Zhang, Junchi Chen, Chunming Hu
  • ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
    Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang
  • A synthetic data approach for domain generalization of NLI models
    Mohammad Javad Hosseini, Andrey Petrov, Alex Fabrikant, Annie Louis
  • Enhancing Contrastive Learning with Noise-Guided Attack: Towards Continual Relation Extraction in the Wild
    Ting Wu, Jingyi Liu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang
  • LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models
    Jiaqi Zhao, Miao Zhang, Chao Zeng, Ming Wang, Xuebo Liu, Liqiang Nie
  • VariErr NLI: Separating Annotation Error from Human Label Variation
    Leon Weber-Genzel, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank
  • Towards Better Understanding of Contrastive Sentence Representation Learning: A Unified Paradigm for Gradient
    Mingxin Li, Richong Zhang, Zhijie Nie
  • Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation
    Xunjian Yin, Xu Zhang, Jie Ruan, Xiaojun Wan
  • ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval
    Soyoung Yoon, Eunbi Choi, Jiyeon Kim, Hyeongu Yun, Yireun Kim, seung-won hwang
  • Exploring the Potential of Large Language Models in Computational Argumentation
    Guizhen Chen, Liying Cheng, Anh Tuan Luu, Lidong Bing
  • TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks
    Viktor Moskvoretskii, Ekaterina Neminova, Alina Lobanova, Alexander Panchenko, Irina Nikishina
  • CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning
    Weiqi Wang, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding, Baixuan Xu, Zhaowei Wang, Jiaxin Bai, Xin Liu, Cheng Jiayang, Chunkit Chan, Yangqiu Song
  • MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter
    Jitai Hao, Weiwei Sun, Xin Xin, Qi Meng, Zhumin Chen, Pengjie Ren, Zhaochun Ren
  • Surgical Feature-Space Decomposition of LLMs: Why, When and How?
    Arnav Chavan, Nahush Lele, Deepak Gupta
  • Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance
    Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Junqi Dai, Qinyuan Cheng, Xuanjing Huang, Xipeng Qiu
  • Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering
    Junnan Dong, Qinggang Zhang, Huachi Zhou, Daochen Zha, Pai Zheng, Xiao Huang
  • Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression
    Peiyu Liu, Ze-Feng Gao, Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen
  • Emergent Word Order Universals from Cognitively-Motivated Language Models
    Tatsuki Kuribayashi, Ryo Ueda, Ryo Yoshida, Yohei Oseki, Ted Briscoe, Timothy Baldwin
  • VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models
    Seoyeon Kim, Kwangwook Seo, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee
  • Making Long-Context Language Models Better Multi-Hop Reasoners
    Yanyang Li, Shuo Liang, Michael Lyu, Liwei Wang
  • TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models
    Yihong Liu, Chunlan Ma, Haotian Ye, Hinrich Schuetze
  • Extreme Miscalibration and the Illusion of Adversarial Robustness
    Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis
  • HyCoRec: Hypergraph-Enhanced Multi-Preference Learning for Alleviating Matthew Effect in Conversational Recommendation
    Yongsen Zheng, Ruilin Xu, Ziliang Chen, Guohua Wang, Mingjie Qian, Jinghui Qin, Liang Lin
  • Co-training for Low Resource Scientific Natural Language Inference
    Mobashir Sadat, Cornelia Caragea
  • RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models
    Jiongxiao Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao
  • Time is Encoded in the Weights of Finetuned Language Models
    Kai Nylund, Suchin Gururangan, Noah A. Smith
  • Long-Context Language Modeling with Parallel Context Encoding
    Howard Yen, Tianyu Gao, Danqi Chen
  • SirLLM: Streaming Infinite Retentive LLM
    Yao Yao, Zuchao Li, Hai Zhao
  • IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models
    Tao Feng, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Reza Haf
  • Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale
    Xiang Hu, Pengyu Ji, Qingyang Zhu, Wei Wu, Kewei Tu
  • MELA: Multilingual Evaluation of Linguistic Acceptability
    Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu
  • Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View
    Jintian Zhang, Xin Xu, Ningyu Zhang, Ruibo Liu, Bryan Hooi, Shumin Deng
  • CopyNE: Better Contextual ASR by Copying Named Entities
    Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, Baoxing Huai
  • Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval
    Peter Baile Chen, Yi Zhang, Dan Roth
  • Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation
    Haonan Chen, Zhicheng Dou, Kelong Mao, Jiongnan Liu, Ziliang Zhao
  • ItD: Large Language Models Can Teach Themselves Induction through Deduction
    Wangtao Sun, Haotian Xu, Xuanqing Yu, Pei Chen, Shizhu He, Jun Zhao, Kang Liu
  • MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs
    Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li
  • MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin
    Tianshuo Zhou, Sen Mei, Xinze Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Ge Yu
  • Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent
    Heng-Da Xu, Xian-Ling Mao, Puhai Yang, Fanshu Sun, Heyan Huang
  • On Context Utilization in Summarization with Large Language Models
    Mathieu Ravaut, Aixin Sun, Nancy F. Chen, Shafiq Joty
  • INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning
    Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou
  • Enhancing In-Context Learning via Implicit Demonstration Augmentation
    Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang
  • PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA
    Sheng Wang, Boyang XUE, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu
  • Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics
    Chun Hei Lo, Wai Lam, Hong Cheng, Guy Emerson
  • Improving Event Definition Following For Zero-Shot Event Detection
    Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng
  • Through the MUD: A Multi-Defendant Charge Prediction Benchmark with Linked Crime Elements
    Xiao Wei, Xu Qi, Hang Yu, Qian Liu, Erik Cambria
  • Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding
    Yiruo Cheng, Kelong Mao, Zhicheng Dou
  • Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks
    Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, Tianxing He
  • CausalGym: Benchmarking causal interpretability methods on linguistic tasks
    Aryaman Arora, Dan Jurafsky, Christopher Potts
  • Training Language Models to Generate Text with Citations via Fine-grained Rewards
    Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang
  • Hypergraph based Understanding for Document Semantic Entity Recognition
    Qiwei Li, Zuchao Li, Ping Wang, Haojun Ai, Hai Zhao
  • GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers
    Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi
  • Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models
    Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang
  • AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning
    Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen
  • ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks
    Santosh T.Y.S.S, Tuan-Quang Vuong, Matthias Grabmair
  • Virtual Compiler Is All You Need For Assembly Code Search
    Zeyu Gao, Hao Wang, Yuanda Wang, Chao Zhang
  • MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning
    Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei
  • Can LLMs Learn from Previous Mistakes? Investigating LLMs’ Errors to Boost for Reasoning
    Yongqi Tong, Dawei Li, Sizhe Wang, Yujia Wang, Fei Teng, Jingbo Shang
  • An Iterative Associative Memory Model for Empathetic Response Generation
    Zhou Yang, Zhaochun Ren, Wang Yufeng, Haizhou Sun, Chao Chen, Xiaofei Zhu, Xiangwen Liao
  • Detoxifying Large Language Models via Knowledge Editing
    Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen
  • LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding
    Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li
  • Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models
    Yuyan Chen, Songzhou Yan, Panjun Liu, Yanghua Xiao
  • UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages
    Trinh Pham, Khoi M. Le, Anh Tuan Luu
  • VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval
    Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, Yongping Xiong
  • Black-Box Prompt Optimization: Aligning Large Language Models without Model Training
    Jiale Cheng, Xiao Liu, Kehan Zheng, Pei Ke, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang
  • Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark
    Chanjun Park, Hyeonwoo Kim, Dahyun Kim, SeongHwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, Hwalsuk Lee
  • Unified Hallucination Detection for Multimodal Large Language Models
    Xiang Chen, Chenxi Wang, Yida Xue, Ningyu Zhang, Xiaoyan Yang, Qiang Li, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen
  • Empowering Character-level Text Infilling by Eliminating Sub-Tokens
    Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Hongsheng Li
  • Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models
    Kun Luo, Zheng Liu, Shitao Xiao, Tong Zhou, Yubo Chen, Jun Zhao, Kang Liu
  • GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge?
    Dayoon Ko, Jinyoung Kim, Hahyeon Choi, Gunhee Kim
  • Attribute First, then Generate: Locally-attributable Grounded Text Generation
    Aviv Slobodkin, Eran Hirsch, Arie Cattan, Tal Schuster, Ido Dagan
  • T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text
    Aoxiong Yin, Haoyuan Li, Kai Shen, Siliang Tang, Yueting Zhuang
  • OceanGPT: A Large Language Model for Ocean Science Tasks
    Zhen Bi, Ningyu Zhang, Yida Xue, Yixin Ou, Daxiong Ji, Guozhou Zheng, Huajun Chen
  • Beyond Memorization: The Challenge of Random Memory Access in Language Models
    Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin
  • BIPED: Pedagogically Informed Tutoring System for ESL Education
    Soonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim
  • Timeline-based Sentence Decomposition with In Context Learning for Temporal Fact Extraction
    Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu
  • Collaboration or Corporate Capture? Quantifying NLP’s Reliance on Industry Artifacts and Contributions
    Will Aitken, Mohamed Abdalla, Karen Rudie, Catherine Stinson
  • Prompt Expansion for Adaptive Text-to-Image Generation
    Siddhartha Datta, Alexander Ku, Deepak Ramachandran, Peter Anderson
  • Progressively Modality Freezing for Multi-Modal Entity Alignment
    Yani Huang, Xuefeng Zhang, Richong Zhang, Junfan Chen, Jaein Kim
  • Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval
    Chaofan Li, Zheng Liu, Shitao Xiao, Yingxia Shao, Defu Lian
  • Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts
    Xuan-Phi Nguyen, Mahani Aljunied, Shafiq Joty, Lidong Bing
  • Metaphor Understanding Challenge Dataset for LLMs
    Xiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova
  • A Multi-Task Embedder For Retrieval Augmented LLMs
    Peitian Zhang, Zheng Liu, Shitao Xiao, Zhicheng Dou, Jian-Yun Nie
  • Language Models Don’t Learn the Physical Manifestation of Language
    Bruce W Lee, Jaehyuk Lim
  • Don’t Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration
    Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov
  • What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection
    Shangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov
  • Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives
    Wenqi Zhang, Yongliang Shen, Linjuan Wu, Qiuying Peng, Jun Wang, Yueting Zhuang, Weiming Lu
  • Relying on the Unreliable: The Impact of Language Models’ Reluctance to Express Uncertainty
    Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Maarten Sap
  • Mission: Impossible Language Models
    Julie Kallini, Isabel Papadimitriou, Richard Futrell, Kyle Mahowald, Christopher Potts
  • Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources
    Xiaochen Wang, Junyu Luo, Jiaqi Wang, Yuan Zhong, Xiaokun Zhang, Yaqing Wang, Parminder Bhatia, Cao Xiao, Fenglong Ma
  • Semisupervised Neural Proto-Language Reconstruction
    Liang Lu, Peirong Xie, David R Mortensen
  • When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP
    Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri
  • SBAAM! Eliminating Transcript Dependency in Automatic Subtitling
    Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli
  • Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing?
    Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli
  • StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection
    Sara Papi, Marco Gaido, Matteo Negri, Luisa Bentivogli
  • ARL2: Aligning Retrievers with Black-box Large Language Models via Self-guided Adaptive Relevance Labeling
    LingXi Zhang, Yue Yu, Kuan Wang, Chao Zhang
  • Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference
    Jihwan Bang, Juntae Lee, Kyuhong Shim, Seunghan Yang, Simyung Chang
  • FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model
    Yebin Lee, Imseong Park, Myungjoo Kang
  • MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations
    Yuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi
  • MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning
    Zhenlong Dai, Chang Yao, WenKang Han, Yuanying, Zhipeng Gao, Jingyuan Chen
  • DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
    Ajay Patel, Colin Raffel, Chris Callison-Burch
  • Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective
    Chenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou
  • Identifying while Learning for Document Event Causality Identification
    Cheng Liu, Wei Xiang, Bang Wang
  • OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems
    Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun
  • Insert or Attach: Taxonomy Completion via Box Embedding
    Wei Xue, Yongliang Shen, Wenqi Ren, Jietian Guo, Shiliang Pu, Weiming Lu
  • Semiparametric Token-Sequence Co-Supervision
    Hyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo
  • Instruction Fusion: Advancing Prompt Evolution through Hybridization
    Weidong Guo, Jiuding Yang, Kaitong Yang, Xiangyang Li, Zhuwei Rao, Yu Xu, Di Niu
  • TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation
    Yikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson, Yanghua Xiao, Jiangjie Chen
  • Exploring Memorization in Fine-tuned Language Models
    Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin
  • Towards Real-world Scenario: Imbalanced New Intent Discovery
    Shun Zhang, Yan Chaoran, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li
  • M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection
    Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov
  • Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue
    Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiaoyong Wei
  • SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training
    Nan He, Weichen Xiong, Hanwen Liu, Yi Liao, Lei Ding, Kai Zhang, Guohua Tang, Xiao Han, Yang Wei
  • Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models?
    Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He, Le Sun
  • Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning
    Zeqi Tan, Yongliang Shen, Xiaoxia Cheng, Chang Zong, Wenqi Zhang, Jian Shao, Weiming Lu, Yueting Zhuang
  • CaMML: Context-Aware Multimodal Learner for Large Models
    Yixin Chen, Shuai Zhang, Boran Han, Tong He, Bo Li
  • MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation
    Xiaozhi Wang, Hao Peng, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou, Xu Han, Yankai Lin, Zhiyuan Liu, Ruobing Xie, Jie Zhou, Juanzi Li
  • NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes
    Lizhou Fan, Wenyue Hua, Lingyao Li, Haoyang Ling, Yongfeng Zhang
  • Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models
    Zhiwei He, Binglin Zhou, Hongkun Hao, Aiwei Liu, Xing Wang, Zhaopeng Tu, Zhuosheng Zhang, Rui Wang
  • Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization?
    Roshan Sharma, Suwon Shon, Mark Lindsey, Hira Dhamyal, Bhiksha Raj
  • Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer Counselors
    Alicja Chaszczewicz, Raj Sanjay Shah, Ryan Louie, Bruce A Arnow, Robert Kraut, Diyi Yang
  • D2LLM: Decomposed and Distilled Large Language Models for Semantic Search
    Zihan Liao, Hang Yu, Jianguo Li, Jun Wang, Wei Zhang
  • In-context Mixing (ICM): Code-mixed Prompts for Multilingual LLMs
    Bhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya
  • Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language Models
    Liang Zhang, Qin Jin, Haoyang Huang, Dongdong Zhang, Furu Wei
  • Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries
    Yu-Hsiang Huang, Yuche Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-De Lin
  • Enhancing Reinforcement Learning with Label-Sensitive Reward for Natural Language Understanding
    Kuo Liao, Shuang Li, Meng Zhao, Liqun Liu, Mengge Xue, Zhenyu Hu, Honglin Han, Chengguo Yin
  • Intuitive or Dependent? Investigating LLMs’ Behavior Style to Conflicting Prompts
    Jiahao Ying, Yixin Cao, Kai Xiong, Long Cui, Yidong He, Yongbin Liu
  • CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending
    Shiyi Zhu, Jing Ye, Wei Jiang, Siqiao Xue, Qi Zhang, Yifan Wu, Jianguo Li
  • Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization
    Salman Elgamal, Ossama Obeid, MHD Tameem Kabbani, Go Inoue, Nizar Habash
  • InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification
    Jan Trienes, Sebastian Antony Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu, Byron C Wallace, Junyi Jessy Li
  • Disinformation Capabilities of Large Language Models
    Ivan Vykopal, Matúš Pikuliak, Ivan Srba, Robert Moro, Dominik Macko, Maria Bielikova
  • Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models
    Junhao Zheng, Shengjie Qiu, Qianli Ma
  • CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following
    Kaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding, Bowen Zhou
  • DAPR: A Benchmark on Document-Aware Passage Retrieval
    Kexin Wang, Nils Reimers, Iryna Gurevych
  • How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study
    Andreas Waldis, Yufang Hou, Iryna Gurevych
  • Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors
    Mengge Xue, Zhenyu Hu, Liqun Liu, Kuo Liao, Shuang Li, Honglin Han, Meng Zhao, Chengguo Yin
  • SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graph
    Hanzhu Chen, Xu Shen, Qitan Lv, Jie Wang, Xiaoqi Ni, Jieping Ye
  • Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages
    Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Wawan Cenggoro, Jhonson Lee, Salsabil Maulana Akbar, Emmanuel Dave, Nuurshadieq, Muhammad Ihza Mahendra, Rr Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung
  • Uncertainty-Guided Modal Rebalance for Hateful Memes Detection
    Chuanpeng Yang, Yaxin Liu, Fuqing Zhu, Jizhong Han, Songlin Hu
  • Must NLP be Extractive?
    Steven Bird
  • Spiral of Silence: How is Large Language Model Killing Information Retrieval?—A Case Study on Open Domain Question Answering
    Xiaoyang Chen, Ben He, Hongyu Lin, Xianpei Han, Tianshu Wang, Boxi Cao, Le Sun, Yingfei Sun
  • Missci: Reconstructing Fallacies in Misrepresented Science
    Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych
  • Uncovering the Full Potential of Visual Grounding Methods in VQA
    Daniel Reich, Tanja Schultz
  • Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs
    Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen
  • Favi-Score: A Measure for Favoritism in Automated Preference Ratings for Generative AI Evaluation
    Pius von Däniken, Jan Milan Deriu, Don Tuggener, Mark Cieliebak
  • LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback
    Timon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, Henning Wachsmuth
  • Graph Language Models
    Moritz Plenz, Anette Frank
  • Analyzing Semantic Change through Lexical Replacements
    Francesco Periti, Pierluigi Cassotti, Haim Dubossarsky, Nina Tahmasebi
  • Exploiting Intrinsic Multilateral Logical Rules for Weakly Supervised Natural Language Video Localization
    Zhe Xu, Kun Wei, Xu Yang, Cheng Deng
  • Latxa: An Open Language Model and Evaluation Suite for Basque
    Julen Etxaniz, Oscar Sainz, Naiara Perez Miguel, Itziar Aldabe, German Rigau, Eneko Agirre, Aitor Ormazabal, Mikel Artetxe, Aitor Soroa
  • Interpretability of Language Models via Task Spaces
    Lucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes
  • Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types
    Pierluigi Cassotti, Stefano De Pascale, Nina Tahmasebi
  • Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators
    Matéo Mahaut, Laura Aina, Paula Czarnowska, Momchil Hardalov, Thomas Müller, Lluis Marquez
  • StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback
    Shihan Dou, Yan Liu, Haoxiang Jia, Enyu Zhou, Limao Xiong, Junjie Shan, Huangcaishuang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang
  • One-Shot Learning as Instruction Data Prospector for Large Language Models
    Yunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang, Lei Zhang, Shuzheng Si, Ling-Hao Chen, Junhao Liu, Tongliang Liu, Fei Huang, Yongbin Li
  • Navigating the OverKill in Large Language Models
    Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin
  • Why are Sensitive Functions Hard for Transformers?
    Michael Hahn, Mark Rofin
  • A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains
    Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva
  • Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision
    Qian Ruan, Ilia Kuznetsov, Iryna Gurevych
  • NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents
    Tamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo
  • FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models
    Yuxin Jiang, Yufei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang
  • Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction
    Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu
  • Learning to Edit: Aligning LLMs with Knowledge Editing
    Yuxin Jiang, Yufei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang
  • DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning
    Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Weiran Xu, Jingang Wang, Mengdi Zhang, Xunliang Cai
  • IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators
    Indraneil Paul, Goran Glavaš, Iryna Gurevych
  • When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality
    Brielen Madureira, Patrick Kahardipraja, David Schlangen
  • SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models
    Md Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych
  • Planning Like Human: A Dual-process Framework for Dialogue Planning
    Tao He, Lizi Liao, Yixin Cao, Yuanxing Liu, Ming Liu, Zerui Chen, Bing Qin
  • Spectral Filters, Dark Signals, and Attention Sinks
    Nicola Cancedda
  • DiffuCOMET: Contextual Commonsense Knowledge Diffusion
    Silin Gao, Mete Ismayilzada, Mengjie Zhao, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut
  • Systematic Task Exploration with LLMs: A Study in Citation Text Generation
    Furkan Şahinuç, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych
  • The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning
    Rochelle Choenni, Anne Lauscher, Ekaterina Shutova
  • Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition
    Matteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling
  • MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling
    Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer
  • Temporal Knowledge Question Answering via Abstract Reasoning Induction
    Ziyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, Min Zhang
  • MultiLegalPile: A 689GB Multilingual Legal Corpus
    Joel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho
  • Who Wrote this Code? Watermarking for Code Generation
    Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, Gunhee Kim
  • MapCoder: Multi-Agent Code Generation for Competitive Problem Solving
    Md. Ashraful Islam, Mohammed Eunus Ali, Md Rizwan Parvez
  • RelayAttention for Efficient Large Language Model Serving with Long System Prompts
    Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau
  • Boosting Language Models Reasoning with Chain-of-Knowledge Prompting
    Jianing Wang, Qiushi Sun, Xiang Li, Ming Gao
  • Open Grounded Planning: Challenges and Benchmark Construction
    Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun
  • WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations
    Haolin Deng, Chang Wang, Li Xin, Dezhang Yuan, Junlang Zhan, Tian Hua Zhou, Jin Ma, Jun Gao, Ruifeng Xu
  • LLM Knows Body Language, Too: Translating Speech Voices into Human Gestures
    Chenghao Xu, Guangtao Lyu, Jiexi Yan, Muli Yang, Cheng Deng
  • QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction
    Xiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu
  • PITA: Prompting Task Interaction for Argumentation Mining
    Yang Sun, Muyi Wang, Jianzhu Bao, Bin Liang, Xiaoyan Zhao, Caihua Yang, Min Yang, Ruifeng Xu
  • Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models
    Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu
  • Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations
    Gregor Geigle, Radu Timofte, Goran Glavaš
  • Estimating Agreement by Chance for Sequence Annotation
    Diya Li, Carolyn Rose, Ao Yuan, Chunxiao Zhou
  • What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages
    Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell
  • Are Emergent Abilities in Large Language Models just In-Context Learning?
    Sheng Lu, Irina Bigoulaeva, Rachneet Singh Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych
  • WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning
    Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin
  • Eliciting Better Multilingual Structured Reasoning from LLMs through Code
    Bryan Li, Tamer Alkhouli, Daniele Bonadiman, Nikolaos Pappas, Saab Mansour
  • OLIVE: Object Level In-Context Visual Embeddings
    Timothy Ossowski, Junjie Hu
  • Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
    Jiuhai Chen, Jonas Mueller
  • Marathon: A Race Through the Realm of Long Context with Large Language Models
    Lei Zhang, Yunshui Li, Ziqiang Liu, Jiaxi Yang, Junhao Liu, Longze Chen, Run Luo, Min Yang
  • Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph
    Xiaochen Kev Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang
  • PCAD: Towards ASR-Robust Spoken Language Understanding via Prototype Calibration and Asymmetric Decoupling
    Xianwei Zhuang, Xuxin Cheng, Liming Liang, Yuxin Xie, Zhichang Wang, Zhiqi Huang, Yuexian Zou
  • Rethinking the Multimodal Correlation of Multimodal Sequential Learning via Generalizable Attentional Results Alignment
    Tao Jin, Wang Lin, Ye Wang, Linjun Li, Xize Cheng, Zhou Zhao
  • UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation
    Xun Liang, Shichao Song, Simin Niu, Zhiyu Li, Feiyu Xiong, Bo Tang, Yezhaohui Wang, Dawei He, Cheng Peng, Zhonghao Wang, Haiying Deng
  • PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers
    Weizhe Lin, Jingbiao Mei, Jinghong Chen, Bill Byrne
  • Triple-Encoders: Representations That Fire Together, Wire Together
    Justus-Jonas Erker, Florian Mai, Nils Reimers, Gerasimos Spanakis, Iryna Gurevych
  • Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning
    Jingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin
  • Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization
    Wenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen, Guiyang Hou, Zeqi Tan, Peng Li, Yueting Zhuang, Weiming Lu
  • Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing
    Behzad Shayegh, Yuqiao Wen, Lili Mou
  • Your Transformer is Secretly Linear
    Anton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Nikolai Gerasimenko, Ivan Oseledets, Denis Dimitrov, Andrey Kuznetsov
  • Noise Correction on Subjective Datasets
    Uthman Jinadu, Yi Ding
  • Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers
    Lütfi Kerem Senel, Besnik Fetahu, Davis Yoshida, Zhiyu Chen, Giuseppe Castellucci, Nikhita Vedula, Jason Ingyu Choi, Shervin Malmasi
  • Instruction-tuned Language Models are Better Knowledge Learners
    Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodriguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srini Iyer
  • What Do Language Models Hear? Probing for Auditory Representations in Language Models
    Jerry Ngo, Yoon Kim
  • Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs
    Zae Myung Kim, Kwang Hee Lee, Preston Zhu, Vipul Raheja, Dongyeop Kang
  • Jailbreak Open-Sourced Large Language Models via Enforced Decoding
    Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu
  • NICE: To Optimize In-Context Examples or Not?
    Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma
  • CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation
    Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng
  • Digital Socrates: Evaluating LLMs through Explanation Critiques
    Yuling Gu, Oyvind Tafjord, Peter Clark
  • SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding
    Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran
  • ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs
    Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran
  • Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once?
    Guijin Son, SangWon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim
  • ChatDev: Communicative Agents for Software Development
    Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun
  • Experiential Co-Learning of Software-Developing Agents
    Chen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, YiFei Wang, Weize Chen, Cheng Yang, Xin Cong, Xiaoyin Che, Zhiyuan Liu, Maosong Sun
  • Learning Geometry-Aware Representations for New Intent Discovery
    Kai Tang, Junbo Zhao, Xiao Ding, Runze Wu, Lei Feng, Gang Chen, Haobo Wang
  • Speaker Verification in Agent-generated Conversations
    Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Ee-Peng Lim
  • Benchmarking Data Science Agents
    Yuge Zhang, Qiyang Jiang, Xingyu Han, Nan Chen, Yuqing Yang, Kan Ren
  • Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models
    Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen
  • Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models
    Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang
  • A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques
    Megh Thakkar, Quentin Fournier, Matthew D Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar
  • Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation
    Xiang Luo, Zhiwen Tang, Jin Wang, Xuejie Zhang
  • PRP-Graph: Pairwise Ranking Prompting to LLMs with Graph Aggregation for Effective Text Re-ranking
    Jian Luo, Xuanang Chen, Ben He, Le Sun
  • RepCodec: A Speech Representation Codec for Speech Tokenization
    Zhichao Huang, Chutong Meng, Tom Ko
  • Disentangled Learning with Synthetic Parallel Data for Text Style Transfer
    Jingxuan Han, Quan Wang, Zikang Guo, Benfeng Xu, Licheng Zhang, Zhendong Mao
  • GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick
    Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao
  • PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety
    Zaibin Zhang, Yongting Zhang, Lijun Li, Jing Shao, Hongzhi Gao, Yu Qiao, Lijun Wang, Huchuan Lu, Feng Zhao
  • Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News Detection
    Zihan Ma, Minnan Luo, Hao Guo, Zhi Zeng, Yiran Hao, Xiang Zhao
  • Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions
    Liyan Xu, Jiangnan Li, Mo Yu, Jie Zhou
  • Stealthy Attack on Large Language Model based Recommendation
    Jinghao Zhang, Yuting Liu, Qiang Liu, Shu Wu, Guibing Guo, Liang Wang
  • Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning
    Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Lee, Jungseul Ok
  • Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models
    Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li
  • SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning
    Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian
  • Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning
    Yeachan Kim, Junho Kim, SangKeun Lee
  • SparseFlow: Accelerating Transformers by Sparsifying Information Flows
    Yeachan Kim, SangKeun Lee
  • ProtT3: Protein-to-Text Generation for Text-based Protein Understanding
    Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua
  • KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models
    Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang
  • EmoBench: Evaluating the Emotional Intelligence of Large Language Models
    Sahand Sabour, Siyang Liu, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna Shiergetya Sunaryo, Tatia M.C. Lee, Rada Mihalcea, Minlie Huang
  • Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation
    Dongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo
  • Are AI-Generated Text Detectors Robust to Adversarial Perturbations?
    Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang
  • FinTextQA: A Dataset for Long-form Financial Question Answering
    Jian Chen, Peilin Zhou, Yining Hua, Loh Ying Xin, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang
  • On Measuring Faithfulness or Self-consistency of Natural Language Explanations
    Letitia Parcalabescu, Anette Frank
  • ∞Bench: Extending Long Context Evaluation Beyond 100K Tokens
    Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun
  • Learning or Self-aligning? Rethinking Instruction Fine-tuning
    Mengjie Ren, Boxi Cao, Hongyu Lin, Cao Liu, Xianpei Han, Ke Zeng, Wan Guanglu, Xunliang Cai, Le Sun
  • Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key?
    Qineng Wang, Zihao Wang, Ying Su, Hanghang Tong, Yangqiu Song
  • Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQA
    Qunbo Wang, Ruyi Ji, Tianhao Peng, Wenjun Wu, Zechao Li, Jing Liu
  • TasTe: Teaching Large Language Models to Translate through Self-Reflection
    Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang
  • Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models
    Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li
  • Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language Models
    Tharindu Madusanka, Ian Pratt-Hartmann, Riza Batista-Navarro
  • UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion
    Wei Li, Xue Xu, Jiachen Liu, Xinyan Xiao
  • The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities
    David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran
  • Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models
    Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Rose Kirk, Hinrich Schuetze, Dirk Hovy
  • AI ‘News’ Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian
    Giovanni Puccetti, Anna Rogers, Chiara Alzetta, Felice Dell’Orletta, Andrea Esuli
  • Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts?
    Hexiang Tan, Fei Sun, Wanli Yang, Yuanzhuo Wang, Qi Cao, Xueqi Cheng
  • Unveiling Linguistic Regions in Large Language Models
    Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang
  • Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment
    Zhiqing Hong, Rongjie Huang, Xize Cheng, Yongqi Wang, Ruiqi Li, Fuming You, Zhou Zhao, Zhimeng Zhang
  • FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection
    Yufei Huang, Xu Han, Maosong Sun
  • Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models’ Understanding of Discourse Relations
    Yisong Miao, Hongfu Liu, Wenqiang Lei, Nancy F. Chen, Min-Yen Kan
  • An Open Multilingual System for Scoring Readability of Wikipedia
    Mykola Trokhymovych, Indira Sen, Martin Gerlach
  • Unlearning Traces the Influential Training Data of Language Models
    Masaru Isonuma, Ivan Titov
  • Exploring Alignment in Shared Cross-lingual Spaces
    Basel Mousi, Nadir Durrani, Fahim Dalvi, Majd Hawasly, Ahmed Abdelali
  • Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models
    Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael Lyu
  • Self-Evolving GPT: A Lifelong Autonomous Experiential Learner
    Jinglong Gao, Xiao Ding, Yiming Cui, Jianbai Zhao, Hepeng Wang, Ting Liu, Bing Qin
  • WRP: Weight Recover Prune for Structured Sparsity
    Zhendong Tan, Xingjun Zhang, Zheng Wei
  • Error-preserving Automatic Speech Recognition of Young English Learners’ Language
    Janick Michot, Manuela Hürlimann, Jan Milan Deriu, Luzia Sauer, Katsiaryna Mlynchyk, Mark Cieliebak
  • DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity Recognition
    Yuxiang Cai, Qiao Liu, Yanglei Gan, Run Lin, Changlin Li, Xueyi Liu, Da Luo, Jiaye Yang
  • Legal Case Retrieval: A Survey of the State of the Art
    Yi Feng, Chuanyi Li, Vincent Ng
  • Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models
    Mosh Levy, Alon Jacoby, Yoav Goldberg
  • Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation
    Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao
  • LLaMA Pro: Progressive LLaMA with Block Expansion
    Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ying Shan, Ping Luo
  • Generating Contrastive Narratives Using the Brownian Bridge Process for Narrative Coherence Learning
    Feiteng Mu, Wenjie Li
  • A Causal Approach for Counterfactual Reasoning in Narratives
    Feiteng Mu, Wenjie Li
  • SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation
    Matthias Lindemann, Alexander Koller, Ivan Titov
  • The Hidden Space of Transformer Language Adapters
    Jesujoba Oluwadara Alabi, Marius Mosbach, Matan Eyal, Dietrich Klakow, Mor Geva
  • A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts
    Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Robert Moro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee
  • Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations
    Guan-Ting Lin, Cheng-Han Chiang, Hung-yi Lee
  • RetinaQA: A Robust Knowledge Base Question Answering Model for both Answerable and Unanswerable Questions
    Prayushi Faldu, Indrajit Bhattacharya, Mausam
  • GroundingGPT: Language Enhanced Multi-modal Grounding Model
    Zhaowei Li, Xu Qi, Dong Zhang, Hang Song, YiQing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Vu Van Tu, Zhida Huang, Tao Wang
  • Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches
    Islam Eldifrawi, Shengrui Wang, Amine Trabelsi
  • Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages
    Carlos Mullov, Quan Pham, Alexander Waibel
  • SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget
    Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Linghe Kong, Yunxin Liu
  • PixT3: Pixel-based Table-To-Text Generation
    Iñigo Alonso, Eneko Agirre, Mirella Lapata
  • Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers
    Gal Yona, Roee Aharoni, Mor Geva
  • TAMS: Translation-Assisted Morphological Segmentation
    Enora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, Katharina von der Wense
  • Disambiguate Words like Composing Them: A Morphology-Informed Approach to Enhance Chinese Word Sense Disambiguation
    Yue Wang, Qiliang Liang, Yaqi Yin, Hansi Wang, Yang Liu
  • XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval
    Mohammad Abdullah Matin Khan, M Saiful Bari, Do Xuan Long, Weishi Wang, Md Rizwan Parvez, Shafiq Joty
  • ProxyQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models
    Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song
  • A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia
    Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West
  • Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA
    Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Eric Wang
  • WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models
    Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu
  • Translation-based Lexicalization Generation and Lexical Gap Detection: Application to Kinship Terms
    Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak
  • Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations
    Ritam Dutt, Zhen Wu, Jiaxin Shi, Divyanshu Sheth, Prakhar Gupta, Carolyn Rose
  • Robust Frame-Semantic Models with Lexical Unit Trees and Negative Samples
    Jacob Devasier, Yogesh Gurjar, Chengkai Li
  • Do Llamas Work in English? On the Latent Language of Multilingual Transformers
    Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West
  • Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation
    Yuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri
  • Lightweight reranking for language model generations
    Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang
  • ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews
    Mike D’Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, Tom Hope, Doug Downey
  • The Unreasonable Effectiveness of Easy Training Data for Hard Tasks
    Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe
  • PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning
    Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, Francesco Barbieri
  • MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning
    Inderjeet Jayakumar Nair, Lu Wang
  • ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs
    Justin Chen, Swarnadeep Saha, Mohit Bansal
  • Mirror: Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning
    Hanqi Yan, Qinglin Zhu, Xinyu Wang, Lin Gui, Yulan He
  • Where Do People Tell Stories Online? Story Detection Across Online Communities
    Maria Antoniak, Joel Mire, Maarten Sap, Elliott Ash, Andrew Piper
  • Large Language Models Are No Longer Shallow Parsers
    Yuanhe Tian, Fei Xia, Yan Song
  • Dialogue Summarization with Mixture of Experts based on Large Language Models
    Yuanhe Tian, Fei Xia, Yan Song
  • ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences
    Yuanhe Tian, Ruyi Gan, Yan Song, Jiaxing Zhang, Yongdong Zhang
  • An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs
    Daking Rai, Ziyu Yao
  • Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling
    Hang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex Pentland, Yoon Kim, Deb Roy, Jad Kabbara
  • Intrinsic Task-based Evaluation for Referring Expression Generation
    Guanyi Chen, Fahime Same, Kees Van Deemter
  • From Moments to Milestones: Incremental Timeline Summarization Leveraging Large Language Models
    Qisheng Hu, Geonsik Moon, Hwee Tou Ng
  • End-to-end Learning of Logical Rules for Enhancing Document-level Relation Extraction
    Kunxun Qi, Jianfeng Du, Hai Wan
  • Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?
    Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng
  • Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder
    Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang
  • G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation
    Xingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Shanbo Cheng
  • CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers
    Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng
  • Prompt Optimization via Adversarial In-Context Learning
    Do Xuan Long, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He
  • StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion
    Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang
  • Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering
    Zhengliang Shi, Shuo Zhang, Weiwei Sun, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren
  • Multimodal Contextualized Semantic Parsing from Speech
    Jordan Voas, David Harwath, Ray Mooney
  • LaMP: When Large Language Models Meet Personalization
    Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani
  • AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters
    Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge
  • MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues
    Ge Bai, Jie Liu, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang
  • EFSA: Towards Event-Level Financial Sentiment Analysis
    Tianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He, Xiang Ao
  • Media Framing: A Typology and Survey of Computational Approaches Across Disciplines
    Yulia Otmakhova, Shima Khanehzar, Lea Frermann
  • What Evidence Do Language Models Find Convincing?
    Alexander Wan, Eric Wallace, Dan Klein
  • Advancement in Graph Understanding: A Multimodal Benchmark and Fine-Tuning of Vision-Language Models
    Qihang Ai, Jiafan Li, Jincheng Dai, Jianwu Zhou, Lemao Liu, Haiyun Jiang, Shuming Shi
  • LangBridge: Multilingual Reasoning Without Multilingual Supervision
    Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo
  • Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs
    Siyuan Wang, Zhongyu Wei, Yejin Choi, Xiang Ren
  • SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving
    Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong
  • Unlocking the Power of Large Language Models for Entity Alignment
    Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, Yuanzhuo Wang
  • SPZ: A Semantic Perturbation-based Data Augmentation Method with Zonal-Mixing for Alzheimer’s Disease Detection
    FangFang Li, Cheng Huang, PuZhen Su, Jie Yin
  • Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents
    Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin
  • ReFT: Reasoning with Reinforced Fine-Tuning
    Luong Quoc Trung, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li
  • Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment
    Yunxin Li, Xinyu Chen, Baotian Hu, Haoyuan Shi, Min Zhang
  • FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation
    Zijian Feng, Hanzhang Zhou, Kezhi Mao, Zixiao Zhu
  • HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition
    Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang
  • Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of the State of the Art
    Shengjie Li, Vincent Ng
  • Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution
    Flor Miriam Plaza-del-Arco, Amanda Cercas Curry, Alba Cercas Curry, Gavin Abercrombie, Dirk Hovy
  • Label Augmentation for Zero-Shot Hierarchical Text Classification
    Lorenzo Paletto, Valerio Basile, Roberto Esposito
  • STICKERCONV: Generating Multimodal Empathetic Responses from Scratch
    Yiqun Zhang, Fanheng Kong, Peidong Wang, Shuang Sun, SWangLing, Shi Feng, Daling Wang, Yifei Zhang, Kaisong Song
  • EIT: Enhanced Interactive Transformer
    Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, JingBo Zhu
  • MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs
    Yavuz Faruk Bakman, Duygu Nur Yaldiz, Baturalp Buyukates, Chenyang Tao, Dimitrios Dimitriadis, Salman Avestimehr
  • EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models
    Rocktim Jyoti Das, Simeon Emilov Hristov, Haonan Li, Dimitar Iliyanov Dimitrov, Ivan Koychev, Preslav Nakov
  • Order-Agnostic Data Augmentation for Few-Shot Named Entity Recognition
    Huiming Wang, Liying Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing
  • Text Embedding Inversion Security for Multilingual Language Models
    Yiyi Chen, Heather Lent, Johannes Bjerva
  • Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
    Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou
  • Calibrating Large Language Models Using Their Generations Only
    Dennis Thomas Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh
  • PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator
    Chuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang
  • Synthesizing Text-to-SQL Data from Weak and Strong LLMs
    Jiaxi Yang, Binyuan Hui, Min Yang, Jian Yang, Junyang Lin, Chang Zhou
  • Iterative Forward Tuning Boosts In-Context Learning in Language Models
    Jiaxi Yang, Binyuan Hui, Min Yang, Bailin Wang, Bowen Li, Binhua Li, Fei Huang, Yongbin Li
  • STRUCTSUM Generation for Faster Text Comprehension
    Parag Jain, Andreea Marzoca, Francesco Piccinno
  • Analysing The Impact of Sequence Composition on Language Model Pre-Training
    Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Miłoś, Yuxiang Wu, Pasquale Minervini
  • NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time
    Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu
  • SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network
    Kexin Wang, Jiahong Zhang, Yong Ren, Man Yao, Di Shang, Bo Xu, Guoqi Li
  • Context-aware Difference Distilling for Multi-change Captioning
    Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang
  • Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion
    Wei Cheng, Yuhan Wu, Wei Hu
  • Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation
    Haohao Luo, Yang Deng, Ying Shen, See-Kiong Ng, Tat-Seng Chua
  • LLMEmbed: Rethinking Lightweight LLM’s Genuine Function in Text Classification
    Chun Liu, Hongguang Zhang, Kainan Zhao, Xinghai Ju, Lin Yang
  • LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter Fusion
    Yilong Chen, Junyuan Shang, Zhenyu Zhang, Shiyao Cui, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu
  • Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech Translation
    Tengfei Yu, Xuebo Liu, Liang Ding, Kehai Chen, Dacheng Tao, Min Zhang
  • To be Continuous, or to be Discrete, Those are Bits of Questions
    Yiran Wang, Masao Utiyama
  • Moûsai: Efficient Text-to-Music Diffusion Models
    Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf
  • PokeMQA: Programmable knowledge editing for Multi-hop Question Answering
    Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang, Xin Wang
  • MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention
    Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya
  • Efficient OCR for Building a Diverse Digital History
    Jacob Carlson, Tom Bryan, Melissa Dell
  • Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space
    Zongru Wu, Zhuosheng Zhang, Pengzhou Cheng, Gongshen Liu
  • ANAH: Analytical Annotation of Hallucinations in Large Language Models
    Ziwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen
  • Aligning Large Language Models for Controllable Recommendations
    Wensheng Lu, Jianxun Lian, Wei Zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie
  • Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods
    Haeun Yu, Pepa Atanasova, Isabelle Augenstein
  • Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement
    Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William Yang Wang
  • Full Parameter Fine-tuning for Large Language Models with Limited Resources
    Kai Lv, Yuqing Yang, Tengxiao Liu, Qipeng Guo, Xipeng Qiu
  • M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought
    Qiguang Chen, Libo Qin, Jin Zhang, Zhi Chen, Xiao Xu, Wanxiang Che
  • Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models
    Longze Chen, Ziqiang Liu, Wanwei He, Yinhe Zheng, Hao Sun, Yunshui Li, Run Luo, Min Yang
  • Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation
    Keqi Deng, Phil Woodland
  • Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL
    Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim
  • A Modular Approach for Multimodal Summarization of TV Shows
    Louis Mahon, Mirella Lapata
  • Think Twice: Perspective-Taking Improves Large Language Models’ Theory-of-Mind Capabilities
    Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency
  • BizBench: A Quantitative Reasoning Benchmark for Business and Finance
    Michael Krumdick, Rik Koncel-Kedziorski, Viet Dac Lai, Varshini Reddy, Charles Lovering, Chris Tanner
  • Direct Metric Optimization for Image Captioning through Reward-Weighted Augmented Data Utilization
    Takumi Takada, Yuma Suzuki, Hiroki Takushima, Hayato Tanoue, Haruki Sato, Aiswariya Manoj Kumar, Hiroki Nishihara, Takayuki Hori, Kazuya Ueki
  • Deciphering Hate: Identifying Hateful Memes and Their Targets
    Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah Masud Preum
  • Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings
    Yichen Jiang, Xiang Zhou, Mohit Bansal
  • Label-Efficient Model Selection for Text Generation
    Shir Ashury Tahan, Ariel Gera, Benjamin Sznajder, Leshem Choshen, Liat Ein-Dor, Eyal Shnarch
  • Machine Unlearning of Pre-trained Large Language Models
    Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, Xiang Yue
  • Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals
    Francesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf
  • FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence
    Sebastian Antony Joseph, Lily Chen, Jan Trienes, Hannah Louisa Göke, Monika Coers, Wei Xu, Byron C Wallace, Junyi Jessy Li
  • BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction
    Yinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao, Renhong Cheng
  • Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack
    Yu Fu, Yufei Li, Wen Xiao, Cong Liu, Yue Dong
  • Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn’t
    Chihiro Taguchi, David Chiang
  • Speech language models lack important brain-relevant semantics
    Subba Reddy Oota, Emin Çelik, Fatma Deniz, Mariya Toneva
  • DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding
    Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu
  • Bypassing LLM Watermarks with Color-Aware Substitutions
    Qilong Wu, Varun Chandrasekaran
  • Parallel Structures in Pre-training Data Yield In-Context Learning
    Yanda Chen, Chen Zhao, Zhou Yu, Kathleen McKeown, He He
  • OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models
    Hainiu Xu, Runcong Zhao, Lixing Zhu, Jinhua Du, Yulan He
  • Towards Privacy-Aware Sign Language Translation at Scale
    Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgoz, Jean Maillard
  • Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards
    Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang
  • Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters
    Yinghui Li, Zishan Xu, Shaoshen Chen, Haojing Huang, Yangning Li, Shirong Ma, Yong Jiang, Zhongli Li, Qingyu Zhou, Hai-Tao Zheng, Ying Shen
  • Steering Llama 2 via Contrastive Activation Addition
    Nina Rimsky, Nick Gabrieli, Julian Schulz, Meg Tong, Evan J Hubinger, Alexander Matt Turner
  • RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations
    Jing Huang, Zhengxuan Wu, Christopher Potts, Mor Geva, Atticus Geiger
  • Large Language Models as Zero-shot Dialogue State Tracker through Function Calling
    Zekun Li, Zhiyu Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Luna Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook
  • Faithful Chart Summarization with ChaTS-Pi
    Syrine Krichene, Francesco Piccinno, Fangyu Liu, Julian Martin Eisenschlos
  • Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation
    Cheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, Tong Zhang
  • MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking
    Ting-Chih Chen, Chia-Wei Tang, Chris Thomas
  • KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction
    Zixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Li Xiang, Zhilei Hu, Long Bai, Wei Li, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng
  • ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis
    Yanming Liu, Xinyue Peng, Tianyu Du, Jianwei Yin, Weihao Liu, Xuhong Zhang
  • EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities
    Nian Li, Chen Gao, Mingyu Li, Yong Li, Qingmin Liao
  • On the Multi-turn Instruction Following for Conversational Web Agents
    Yang Deng, Xuan Zhang, Wenxuan Zhang, Yifei Yuan, See-Kiong Ng, Tat-Seng Chua
  • Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents
    Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Liu Jianfeng, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang
  • MC²: Towards Transparent and Culturally-Aware NLP for Minority Languages in China
    Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng
  • Decoder-only Streaming Transformer for Simultaneous Translation
    Shoutao Guo, Shaolei Zhang, Yang Feng
  • Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization
    Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang
  • I am a Strange Dataset: Metalinguistic Tests for Language Models
    Tristan Thrush, Jared Moore, Miguel Monares, Christopher Potts, Douwe Kiela
  • SafetyBench: Evaluating the Safety of Large Language Models
    Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Minlie Huang
  • Deciphering Oracle Bone Language with Diffusion Models
    Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu
  • TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space
    Shaolei Zhang, Tian Yu, Yang Feng
  • ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training
    Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Jianan Zhao, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang
  • StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning
    Shaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng
  • Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models
    Tianjie Ju, Yijin Chen, Xinwei Yuan, Zhuosheng Zhang, Wei Du, Yubin Zheng, Gongshen Liu
  • Why Don’t Prompt-Based Fairness Metrics Correlate?
    Abdelrahman Zayed, Goncalo Mordido, Ioana Baldini, Sarath Chandar
  • NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
    Manuel Tonneau, Pedro Vitor Quinta de Castro, Karim Lasri, Ibrahim Sambo Farouq, Lakshmi Subramanian, Victor Orozco-Olvera, Samuel Fraiberger
  • M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset
    Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang
  • Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination
    Nakyeong Yang, Taegwan Kang, Stanley Jungkyu Choi, Honglak Lee, Kyomin Jung
  • Domain Adaptation for Subjective Induction Questions Answering on Products by Adversarial Disentangled Learning
    Yufeng Zhang, Jianxing Yu, Yanghui Rao, Libin Zheng, Qinliang Su, Huaijie Zhu, Jian Yin
  • Revisiting Demonstration Selection Strategies in In-Context Learning
    Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, Dacheng Tao
  • Multimodal Table Understanding
    Mingyu Zheng, Xinwei Feng, Qingyi Si, Qiaoqiao She, Zheng Lin, Wenbin Jiang, Weiping Wang
  • Ex³: Automatic Novel Writing by Extracting, Excelsior and Expanding
    Huang Lei, Jiaming Guo, Guanhua He, Xishan Zhang, Rui Zhang, Shaohui Peng, Shaoli Liu, Tianshi Chen
  • Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning
    Mayur Patidar, Riya Sawhney, Avinash Kumar Singh, Biswajit Chatterjee, Mausam, Indrajit Bhattacharya
  • WatME: Towards Lossless Watermarking Through Lexical Redundancy
    Liang Chen, Yatao Bian, Yang Deng, Deng Cai, Shuaiyi Li, Peilin Zhao, Kam-Fai Wong
  • Text-like Encoding of Collaborative Information in Large Language Models for Recommendation
    Yang Zhang, Keqin Bao, Ming Yan, Wenjie Wang, Fuli Feng, Xiangnan He
  • MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception
    Yuhao Wang, Yusheng Liao, Heyang Liu, Hongcheng Liu, Yanfeng Wang, Yu Wang
  • Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning
    Jiachun Li, Pengfei Cao, Chenhao Wang, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao
  • Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation
    Yi Liu, Xiangyu Liu, Xiangrong Zhu, Wei Hu
  • M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models
    Wai-Chung Kwan, Xingshan Zeng, Yufei Wang, Yusen Sun, Liangyou Li, Yuxin Jiang, Lifeng Shang, Qun Liu, Kam-Fai Wong
  • Reward-based Input Construction for Cross-document Relation Extraction
    Byeonghu Na, Suhyeon Jo, Yeongmin Kim, Il-chul Moon
  • Hyperspherical Multi-Prototype with Optimal Transport for Event Argument Extraction
    Guangjun Zhang, Hu Zhang, Yujie Wang, Ru Li, Hongye Tan, Jiye Liang
  • Understanding Retrieval Robustness for Retrieval-augmented Image Captioning
    Wenyan Li, Jiaang Li, Rita Ramos, Raphael Tang, Desmond Elliott
  • Semi-Supervised Spoken Language Glossification
    Huijie Yao, Wengang Zhou, Hao Zhou, Houqiang Li
  • SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
    Kanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Li Yantao, Jianbing Zhang, Zhiyong Wu
  • InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers
    Yakir Yehuda, Itzik Malkiel, Oren Barkan, Jonathan Weill, Royi Ronen, Noam Koenigstein
  • F-Eval: Assessing Fundamental Abilities with Refined Evaluation Methods
    Yu Sun, Keyu Chen, Shujie Wang, Peiji Li, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin
  • Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning
    Philipp Mondorf, Barbara Plank
  • Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback
    Maria Emilia Agis Lerner, Florian E. Dorner, Elliott Ash, Naman Goel
  • Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations
    Peiyi Wang, Lei Li, Zhihong Shao, Runxin Xu, Damai Dai, Yifei Li, Deli Chen, Yu Wu, Zhifang Sui
  • Large Language Models are not Fair Evaluators
    Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, Zhifang Sui
  • Improving Large Language Models in Event Relation Logical Prediction
    Meiqi Chen, Yubo Ma, Kaitao Song, Yixin Cao, Yan Zhang, Dongsheng Li
  • Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline
    Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin
  • Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation
    Wenting Chen, Linlin Shen, Jingyang Lin, Jiebo Luo, Xiang Li, Yixuan Yuan
  • T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step
    Zehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao
  • Are LLM-based Evaluators Confusing NLG Quality Criteria?
    Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan
  • Synergistic Interplay between Search and Large Language Models for Information Retrieval
    Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong Long, Dongyan Zhao, Daxin Jiang
  • Linear Transformers with Learnable Kernel Functions are Better In-Context Models
    Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov
  • Temperature-scaling surprisal estimates improve fit to human reading times – but does it do so for the “right reasons”?
    Tong Liu, Iza Škrjanec, Vera Demberg
  • Beyond Recognising Entailment: Formalising Natural Language Inference from an Argumentative Perspective
    Ameer Saadat-Yazdi, Nadin Kökciyan
  • RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization
    Jaavid Aktar Husain J, Raj Dabre, Aswanth Kumar M, Jay Gala, Thanmay Jayakumar, Ratish Puduppully, Anoop Kunchukuttan
  • AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling
    Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yu-Gang Jiang, Xipeng Qiu
  • CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models
    Zixin Chen, Hongzhan Lin, Ziyang Luo, Mingfei Cheng, Jing Ma, Guang Chen
  • Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation
    Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Xiaoming Simon Wang, Jiulong Shan, Meng Cao, Lijie Wen
  • Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines
    Michael Toker, Hadas Orgad, Mor Ventura, Dana Arad, Yonatan Belinkov
  • Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models
    Yuchong Sun, Che Liu, Kun Zhou, Jinwen Huang, Ruihua Song, Xin Zhao, Fuzheng Zhang, Di ZHANG, Kun Gai
  • Robust Singing Voice Transcription Serves Synthesis
    Ruiqi Li, Yu Zhang, Yongqi Wang, Zhiqing Hong, Rongjie Huang, Zhou Zhao
  • VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model
    Tianyu Chen, Lin Li, Zhu Liuchuan, Zongyang Li, Xueqing Liu, Guangtai Liang, Qianxiang Wang, Tao Xie
  • Self-Modifying State Modeling for Simultaneous Machine Translation
    Donglei Yu, Xiaomian Kang, Yuchen Liu, Yu Zhou, Chengqing Zong
  • MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation
    Jiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee K. Wong
  • BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents
    Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian
  • DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy
    Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan
  • LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent
    Robert Mahari, Dominik Stammbach, Elliott Ash, Alex Pentland
  • To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering
    Giacomo Frisoni, Alessio Cocchieri, Alex Presepi, Gianluca Moro, Zaiqiao Meng
  • MERA: A Comprehensive LLM Evaluation in Russian
    Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid S Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Anastasia Minaeva, Denis Dimitrov, Alexander Panchenko, Sergey Markov
  • SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer
    Jie Zhao, Ziyu Guan, Cai Xu, Wei Zhao, Yue Jiang
  • Causal Estimation of Memorisation Profiles
    Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel
  • CHECKWHY: Causal Fact Verification via Argument Structure
    Jiasheng Si, Yibo Zhao, Yingjie Zhu, Haiyang Zhu, Wenpeng Lu, Deyu Zhou
  • Dodo: Dynamic Contextual Compression for Decoder-only LMs
    Guanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme
  • POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation
    Shilong Pan, Zhiliang Tian, Liang Ding, Haoqi Zheng, Zhen Huang, Zhihua Wen, Dongsheng Li
  • NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism
    Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Cheng Peng, Yi Luo
  • MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization
    Shuaijie She, Wei Zou, Shujian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, Jiajun Chen
  • Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training
    Feiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang, Xiaojun Chen, Ruifeng Xu
  • Predicting Text Preference Via Structured Comparative Reasoning
    Jing Nathan Yan, Tianqi Liu, Justin T Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Charumathi Lakshmanan, Yair Kurzion, Alexander M Rush, Jialu Liu, Michael Bendersky
  • CoELM: Construction-Enhanced Language Modeling
    Lvxiaowei Xu, Zhilin Gong, Jianhua Dai, Tianxiang Wang, Ming Cai, Jiawei Peng
  • Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model
    Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers
  • Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation
    Songju Lei, Xize Cheng, Mengjiao Lyu, Jianqiao Hu, Jintao Tan, Runlin Liu, Lingyu Xiong, Tao Jin, Xiandong Li, Zhou Zhao
  • On the Impact of Calibration Data in Post-training Quantization and Pruning
    Miles Williams, Nikolaos Aletras
  • SymKGQA: Few-Shot Knowledge Graph Question Answering via Symbolic Program Generation and Execution
    Prerna Agarwal, Nishant Kumar, Srikanta J. Bedathur
  • Meta-Task Prompting Elicits Embeddings from Large Language Models
    Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates
  • A Sentiment Consolidation Framework for Meta-Review Generation
    Miao Li, Jey Han Lau, Eduard Hovy
  • Revisiting Structured Sentiment Analysis as Latent Dependency Graph Parsing
    Chengjie Zhou, Bobo Li, Hao Fei, Fei Li, Chong Teng, Donghong Ji
  • OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification
    Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe
  • Do Large Language Models Latently Perform Multi-Hop Reasoning?
    Sohee Yang, Elena Gribovskaya, Nora Kassner, Mor Geva, Sebastian Riedel
  • MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning
    Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou
  • Harnessing Toulmin’s theory for zero-shot argument explication
    Ankita Gupta, Ethan Zuckerman, Brendan O’Connor
  • BinaryAlign: Word Alignment as Binary Sequence Labeling
    Gaetan Lopez Latouche, Marc-André Carbonneau, Benjamin Swanson
  • Quantifying the Persona Effect in LLM Simulations
    Tiancheng Hu, Nigel Collier
  • On Efficient and Statistical Quality Estimation for Data Annotation
    Jan-Christoph Klie, Juan Haladjian, Marc Kirchner, Rahul Nair
  • EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection
    Chenye Zhao, Cornelia Caragea
  • Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question?
    Nishant Balepur, Abhilasha Ravichander, Rachel Rudinger
  • Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments
    Zhenrui Yue, Huimin Zeng, Lanyu Shang, Yifan Liu, Yang Zhang, Dong Wang
  • SyllabusQA: A Course Logistics Question Answering Dataset
    Nigel Fernandez, Alexander Scarlatos, Andrew Lan
  • American Sign Language Handshapes Reflect Pressures for Communicative Efficiency
    Kayo Yin, Terry Regier, Dan Klein
  • MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models
    Yilin Wen, Zifeng Wang, Jimeng Sun
  • AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts
    Daniel Braun, Florian Matthes
  • Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks
    Charlotte Siska, Katerina Marazopoulou, Melissa Ailem, James Bono
  • Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning
    Eric Pasewark, Kyle Montgomery, Kefei Duan, Dawn Song, Chenguang Wang
  • Bridging the Preference Gap between Retrievers and LLMs
    Zixuan Ke, Weize Kong, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky
  • Large Language Models Can Learn Temporal Reasoning
    Siheng Xiong, Ali Payani, Ramana Rao Kompella, Faramarz Fekri
  • Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research
    Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Evan Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo
  • Learning Relational Decomposition of Queries for Question Answering from Tables
    Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier
  • Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People
    Dun-Ming Huang, Pol Van Rijn, Ilia Sucholutsky, Raja Marjieh, Nori Jacoby
  • Pareto Optimal Learning for Estimating Large Language Model Errors
    Theodore Zhao, Mu Wei, J. Samuel Preston, Hoifung Poon
  • Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models
    Victor Agostinelli III, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen
  • Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM
    Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen
  • Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models
    Guanming Xiong, Junwei Bao, Wen Zhao
  • LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error
    Boshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su
  • HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts
    Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu
  • Aligning Large Language Models with Human Preferences through Representation Engineering
    Wenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Zhu JianHao, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
  • CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models
    Fuwen Luo, Chi Chen, Zihao Wan, Zhaolu Kang, Qidong Yan, Yingjie Li, Xiaolong Wang, Siyu Wang, Ziyue Wang, Xiaoyue Mi, Peng Li, Ning Ma, Maosong Sun, Yang Liu
  • ARAIDA: Analogical Reasoning-Augmented Interactive Data Annotation
    Chen Huang, Yiping Jin, Ilija Ilievski, Wenqiang Lei, Jiancheng Lv
  • PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary Representations
    Qihao Yang, Yong Li, Xuelin Wang, Fu Lee Wang, Tianyong Hao
  • Prompted Aspect Key Point Analysis for Quantitative Review Summarization
    An Quang Tang, Xiuzhen Zhang, Minh Ngoc Dinh, Erik Cambria
  • Ask Again, Then Fail: Large Language Models’ Vacillations in Judgment
    Qiming Xie, Zengzhi Wang, Yi Feng, Rui Xia
  • CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models
    Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua
  • Multimodal Reasoning with Multimodal Knowledge Graph
    Junlin Lee, Yequan Wang, Jing Li, Min Zhang
  • Confidence is not Timeless: Modeling Temporal Validity for Rule-based Temporal Knowledge Graph Forecasting
    Rikui Huang, Wei Wei, Xiaoye Qu, Shengzhe Zhang, Dangyang Chen, Yu Cheng
  • CARE: A Clue-guided Assistant for CSRs to Read User Manuals
    Weihong Du, Jia Liu, Zujie Wen, Dingnan Jin, Hongru Liang, Wenqiang Lei
  • Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes
    Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che
  • PAGED: A Benchmark for Procedural Graphs Extraction from Documents
    Weihong Du, Wenrui Liao, Hongru Liang, Wenqiang Lei
  • Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors
    Ying Zhou, Ben He, Le Sun
  • RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models
    Cheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, KaShun SHUM, Randy Zhong, Juntong Song, Tong Zhang
  • The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models
    Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Xin Zhao, Jian-Yun Nie, Ji-Rong Wen
  • Revisiting Knowledge Distillation for Autoregressive Language Models
    Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, Dacheng Tao
  • OLMo: Accelerating the Science of Language Models
    Dirk Groeneveld, Iz Beltagy, Evan Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, William H. Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi
  • Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine Translation
    Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, Jie Zhou
  • Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners
    Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, CHAO WENG, Zhou Zhao, Dong Yu
  • Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages
    Shih-Cheng Huang, Pin-Zu Li, YU-CHI HSU, Kuang-Ming Chen, Yu Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee
  • Emulated Disalignment: Safety Alignment for Large Language Models May Backfire!
    Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao
  • PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails
    Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash
  • Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance
    Bo Yuan, Yulin Chen, Yin Zhang, Wei Jiang
  • CLOMO: Counterfactual Logical Modification with Large Language Models
    Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng YANG, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song
  • Exploring Hybrid Question Answering via Program-based Prompting
    Qi Shi, Han Cui, Haofeng Wang, Qingfu Zhu, Wanxiang Che, Ting Liu
  • IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages
    Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar
  • Simple but Effective Compound Geometric Operations for Temporal Knowledge Graph Completion
    Rui Ying, Mengting Hu, Jianfeng Wu, Yalan Xie, Xiaoyi Liu, Zhunheng Wang, Ming Jiang, Hang Gao, Linlin Zhang, Renhong Cheng
  • Uncertainty Aware Learning for Language Model Alignment
    Yikun Wang, Rui Zheng, Liang Ding, Qi Zhang, Dahua Lin, Dacheng Tao
  • Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models
    Ying-Chun Lin, Jennifer Neville, Jack W Stokes, Longqi Yang, Tara Safavi, Mengting Wan, Scott Counts, Siddharth Suri, Reid Andersen, Xiaofeng Xu, Deepak Gupta, Sujay Kumar Jauhar, Xia Song, Georg Buscher, Saurabh Tiwary, Brent Hecht, Jaime Teevan
  • Fundamental Capabilities of Large Language Models and their Applications in Domain Scenarios: A Survey
    Jiawei Li, Yizhe Yang, Yu Bai, Xiaofeng Zhou, Yinghao Li, Huashan Sun, Yuhang Liu, Xingpeng Si, Yuhao Ye, Yixiao Wu, 林一冠, Bin Xu, Ren bowen, Chong Feng, Yang Gao, Heyan Huang
  • IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages
    Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad B, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M Khapra
  • Measuring Political Bias in Large Language Models: What Is Said and How It Is Said
    Yejin Bang, Delong Chen, Nayeon Lee, Pascale Fung
  • Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use
    Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan
  • Layer-Condensed KV Cache for Efficient Inference of Large Language Models
    Haoyi Wu, Kewei Tu
  • Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models
    Xiaolong Wang, Yile Wang, Yuanchi Zhang, Fuwen Luo, Peng Li, Maosong Sun, Yang Liu
  • Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages
    Yuanchi Zhang, Yile Wang, Zijun Liu, Shuo Wang, Xiaolong Wang, Peng Li, Maosong Sun, Yang Liu
  • Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations
    Jiaxing Sun, Weiquan Huang, Jiang Wu, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He
  • Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context Fusion
    Ziyue Wang, Chi Chen, Yiqi Zhu, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu
  • Model Composition for Multimodal Large Language Models
    Chi Chen, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu
  • Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding
    Jun Zhang, Jue WANG, Huan Li, Lidan Shou, Ke Chen, Gang Chen, Sharad Mehrotra
  • Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup
    Xuxin Cheng, Ziyu Yao, Yifei Xin, Hao An, Hongxiang Li, Yaowei Li, Yuexian Zou
  • Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models
    Changjiang Gao, Jixing Li, Jiajun Chen, Shujian Huang
  • MIST: Mutual Information Maximization for Short Text Clustering
    Krissanee Kamthawee, Can Udomcharoenchaikit, Sarana Nutanong
  • Self-chats from Large Language Models Make Small Emotional Support Chatbot Better
    Zhonghua Zheng, Lizi Liao, Yang Deng, Libo Qin, Liqiang Nie
  • Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment
    Janghwan Lee, Seongmin Park, Sukjin Hong, Minsoo Kim, Du-Seong Chang, Jungwook Choi
  • Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs
    Tianqing Fang, Zeming Chen, Yangqiu Song, Antoine Bosselut
  • An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing
    Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang
  • Learning to Plan and Generate Text with Citations
    Constanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata
  • Exploring Precision and Recall to assess the quality and diversity of LLMs
    Florian Le Bronnec, Alexandre Verine, Benjamin Negrevergne, Yann Chevaleyre, Alexandre Allauzen
  • Aligning Large Language Models by On-Policy Self-Judgment
    Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu
  • IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning
    Abhinav Joshi, Shounak Paul, Akshat Sharma, Pawan Goyal, Saptarshi Ghosh, Ashutosh Modi
  • JumpCoder: Go Beyond Autoregressive Coder via Online Modification
    Mouxiang Chen, Hao Tian, Zhongxin Liu, Xiaoxue Ren, Jianling Sun
  • Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning
    Shivalika Singh, Freddie Vargus, Daniel D’souza, Börje F. Karlsson, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura O’Mahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzemiński, Hakimeh Fadaei, Irem Ergun, Ifeoma Okoh, Aisha Alaagib, Oshan Ivantha Mudannayake, Zaid Alyafeai, Vu Minh Chien, Sebastian Ruder, Surya Guthikonda, Emad A. Alghamdi, Sebastian Gehrmann, Niklas Muennighoff, Max Bartolo, Julia Kreutzer, Ahmet Üstün, Marzieh Fadaee, Sara Hooker
  • Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks
    Anwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, Tanmoy Chakraborty
  • Split and Rephrase with Large Language Models
    Antonio David Ponce Martínez, Thierry Etchegoyhen, Jesus Javier Calleja Perez, Harritxu Gete
  • ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition
    Lu Ye, Ze Tao, Yong Huang, Yang Li
  • AlignBench: Benchmarking Chinese Alignment of Large Language Models
    Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Andrew Zhuoer Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Xiaotao Gu, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang
  • SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models
    Weixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che
  • DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution
    Yulong Mao, Kaiyu Huang, Changhao Guan, Ganglin Bao, Fengran Mo, Jinan Xu
  • Cross-Lingual Knowledge Editing in Large Language Models
    Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu, Fandong Meng
  • Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model
    Ahmet Üstün, Viraat Aryabumi, Zheng Xin Yong, Wei-Yin Ko, Daniel D’souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid, Freddie Vargus, Phil Blunsom, Shayne Longpre, Niklas Muennighoff, Marzieh Fadaee, Julia Kreutzer, Sara Hooker
  • Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques
    Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri
  • Learning Task Decomposition to Assist Humans in Competitive Programming
    Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang
  • An Entropy-based Text Watermarking Detection Method
    Yijian LU, Aiwei Liu, Dianzhi Yu, Jingjing Li, Irwin King
  • Enhancing Explainable Rating Prediction through Annotated Macro Concepts
    Huachi Zhou, Shuang Zhou, Hao Chen, Ninghao Liu, Fan Yang, Xiao Huang
  • How to Engage your Readers? Generating Guiding Questions to Promote Active Reading
    Peng Cui, Vilém Zouhar, Xiaoyu Zhang, Mrinmaya Sachan
  • Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective
    Zihao Yue, Liang Zhang, Qin Jin
  • Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation
    Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li
  • More frequent verbs are associated with more diverse valency frames: Efficient principles at the lexicon-grammar interface
    Siyu Tao, Lucia Donatelli, Michael Hahn
  • BatchEval: Towards Human-like Text Evaluation
    Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li
  • Quantifying Generalizations: Exploring the Divide Between Human and LLMs’ Sensitivity to Quantification
    Claudia Collacciani, Giulia Rambelli, Marianna Bolognesi
  • Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated Study on Lexicalized and Novel Compounds
    Giulia Rambelli, Emmanuele Chersoni, Claudia Collacciani, Marianna Bolognesi
  • CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation
    Quan Tu, Shilong Fan, Zihang Tian, Tianhao Shen, Shuo Shang, Xin Gao, Rui Yan
  • Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond
    Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua
  • Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction
    Yice Zhang, Jie Zeng, Weiming Hu, Ziyi Wang, Shiwei Chen, Ruifeng Xu
  • ToMBench: Benchmarking Theory of Mind in Large Language Models
    Zhuang Chen, Jincenzi Wu, Jinfeng Zhou, Bosi Wen, Guanqun Bi, Gongyao Jiang, Yaru Cao, Mengting Hu, Yunghwei Lai, Zexuan Xiong, Minlie Huang
  • Learning to Generate Answers with Citations via Factual Consistency Models
    Rami Aly, Zhiqiang Tang, Samson Tan, George Karypis
  • Improving Text Embeddings with Large Language Models
    Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei
  • Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning
    Tianduo Wang, Shichen Li, Wei Lu
  • UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset
    Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun
  • Document-level Claim Extraction and Decontextualisation for Fact-Checking
    Zhenyun Deng, Michael Sejr Schlichtkrull, Andreas Vlachos
  • PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning
    Xiaoqi Qiu, Yongjie Wang, Xu Guo, Zhiwei Zeng, Yu Yue, Yuhong Feng, Chunyan Miao
  • LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction
    Hanzhang Zhou, Junlang Qian, Zijian Feng, Hui Lu, Zixiao Zhu, Kezhi Mao
  • Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models
    Weihong Zhong, Xiaocheng Feng, Liang Zhao, Qiming Li, Lei Huang, Yuxuan Gu, Weitao Ma, Yuan Xu, Bing Qin
  • COKE: A Cognitive Knowledge Graph for Machine Theory of Mind
    Jincenzi Wu, Zhuang Chen, Jiawen Deng, Sahand Sabour, Helen M. Meng, Minlie Huang
  • mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models
    Huiyuan Lai, Malvina Nissim
  • GunStance: Stance Detection for Gun Control and Gun Regulation
    Nikesh Gyawali, Iustin Sirbu, Tiberiu Sosea, Sarthak Khanal, Doina Caragea, Traian Rebedea, Cornelia Caragea
  • Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation
    Zdeněk Kasner, Ondrej Dusek
  • Don’t Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection
    Min Zhang, Jianfeng He, Taoran Ji, Chang-Tien Lu
  • Don’t Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation
    Giorgos Vernikos, Andrei Popescu-Belis
  • Generating and Evaluating Plausible Explanations for Knowledge Graph Completion
    Antonio Di Mauro, Zhao Xu, Wiem Ben Rim, Timo Sztyler, Carolin Lawrence
  • One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation
    Tejpalsingh Siledar, Swaroop Nath, Sankara Sri Raghava Ravindra Muddu, Rupasai Rangaraju, Swaprava Nath, Pushpak Bhattacharyya, Suman Banerjee, Amey Patil, Sudhanshu Shekhar Singh, Muthusamy Chelliah, Nikesh Garera
  • MultiPICo: Multilingual Perspectivist Irony Corpus
    Silvia Casola, Simona Frenda, Soda Marem Lo, Erhan Sezerer, Antonio Uva, Valerio Basile, Cristina Bosco, Alessandro Pedrani, Chiara Rubagotti, Viviana Patti, Davide Bernardi
  • LANDeRMT: Detecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation
    Shaolin Zhu, Leiyu Pan, Bo Li, Deyi Xiong
  • A Joint Coreference-Aware Approach to Document-Level Target Sentiment Analysis
    Hongjie Cai, Heqing Ma, Jianfei Yu, Rui Xia
  • VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models
    Qingxing Cao, Junhao Cheng, Xiaodan Liang, Liang Lin
  • AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints
    Yu-Zhe Shi, Haofei Hou, Zhangqian Bi, Fanxu Meng, Xiang Wei, Lecheng Ruan, Qining Wang
  • Multipath parsing in the brain
    Berta Franzluebbers, Donald Dunagan, Miloš Stanojević, Jan Buys, John T. Hale
  • Search-Adaptor: Embedding Customization for Information Retrieval
    Jinsung Yoon, Yanfei Chen, Sercan O Arik, Tomas Pfister
  • Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMs
    Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker
  • VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation
    Max Ku, Dongfu Jiang, Cong Wei, Xiang Yue, Wenhu Chen
  • Tree Transformer’s Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects
    Lingling Zhou, Suzan Verberne, Gijs Wijnholds
  • Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs
    Elan Sopher Markowitz, Anil Ramakrishna, Jwala Dhamala, Ninareh Mehrabi, Charith Peris, Rahul Gupta, Kai-Wei Chang, Aram Galstyan
  • Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing
    Freda Shi, Kevin Gimpel, Karen Livescu
  • ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation
    Akshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev
  • AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents
    Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian
  • Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking
    Xiaokang Zhang, Zijun Yao, Jing Zhang, Kaifeng Yun, Jifan Yu, Juanzi Li, Jie Tang
  • What Do Language Models Learn in Context? The Structured Task Hypothesis
    Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell
  • Agent Lumos: Unified and Modular Training for Open-Source Language Agents
    Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin
  • Investigating Cultural Alignment of Large Language Models
    Badr AlKhamissi, Muhammad ElNokrashy, Mai Alkhamissi, Mona T. Diab
  • More Victories, Less Cooperation: Assessing Cicero’s Diplomacy Play
    Wichayaporn Wongkamjan, Feng Gu, Yanze Wang, Ulf Hermjakob, Jonathan May, Brandon M. Stewart, Jonathan K. Kummerfeld, Denis Peskoff, Jordan Lee Boyd-Graber
  • VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild
    Puyuan Peng, Po-Yao Huang, Shang-Wen Li, Abdelrahman Mohamed, David Harwath
  • RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors
    Liam Dugan, Alyssa Hwang, Filip Trhlík, Andrew Zhu, Josh Magnus Ludan, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch
  • Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles
    Julia Kruk, Michela Marchini, Rijul Magu, Caleb Ziems, David Muchlinski, Diyi Yang
  • On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning
    Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell
  • Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends
    Sanjana Ramprasad, Elisa Ferracane, Zachary Chase Lipton
  • MMToM-QA: Multimodal Theory of Mind Question Answering
    Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu
  • LLM in a flash: Efficient Large Language Model Inference with Limited Memory
    Keivan Alizadeh, Seyed Iman Mirzadeh, Dmitry Belenko, S. Karen Khatamifard, Minsik Cho, Carlo C del Mundo, Mohammad Rastegari, Mehrdad Farajtabar
  • Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models
    Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Fahad Shahbaz Khan
  • To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation
    Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed
  • DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents
    Yilun Zhao, Yitao Long, Hongjun Liu, Ryo Kamoi, Linyong Nan, Lyuhao Chen, Yixin Liu, Xiangru Tang, Rui Zhang, Arman Cohan
  • LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding
    Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu
  • Unintended Impacts of LLM Alignment on Global Representation
    Michael J Ryan, William Barr Held, Diyi Yang
  • Classist Tools: Social Class Correlates with Performance in NLP
    Amanda Cercas Curry, Giuseppe Attanasio, Zeerak Talat, Dirk Hovy
  • ActionIE: Action Extraction from Scientific Literature with Programming Languages
    Xianrui Zhong, Yufeng Du, Siru Ouyang, Ming Zhong, Tingfeng Luo, Qirong Ho, Hao Peng, Heng Ji, Jiawei Han
  • A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech
    Gaurav Verma, Rynaa Grover, Jiawei Zhou, Binny Mathew, Jordan Kraemer, Munmun De Choudhury, Srijan Kumar
  • Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs
    Zhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su
  • COSMIC: Mutual Information for Task-Agnostic Summarization Evaluation
    Maxime Darrin, Philippe Formont, Jackie CK Cheung, Pablo Piantanida
  • ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer
    Arkadiy Saakyan, Smaranda Muresan
  • EUROPA: A Legal Multilingual Keyphrase Generation Dataset
    Olivier Salaün, Frédéric Piedboeuf, Guillaume Le Berre, David Alfonso-Hermelo, Philippe Langlais
  • GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews
    Maxime Darrin, Ines Arous, Pablo Piantanida, Jackie CK Cheung
  • MAP’s not dead yet: Uncovering true language model modes by conditioning away degeneracy
    Davis Yoshida, Kartik Goyal, Kevin Gimpel
  • Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks
    Fakhraddin Alwajih, El Moatez Billah Nagoudi, Gagan Bhatia, Abdelrahman Mohamed, Muhammad Abdul-Mageed
  • Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks
    João Bordalo, Vasco Ramos, Rodrigo Valério, Diogo Glória-Silva, Yonatan Bitton, Michal Yarom, Idan Szpektor, Joao Magalhaes
  • Cheetah: Natural Language Generation for 517 African Languages
    Ife Adebara, AbdelRahim A. Elmadany, Muhammad Abdul-Mageed
  • TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning
    Yilun Zhao, Lyuhao Chen, Arman Cohan, Chen Zhao
  • KnowledgeFMath: A Knowledge-Intensive Math Reasoning Dataset in Finance Domains
    Yilun Zhao, Hongjun Liu, Yitao Long, Rui Zhang, Chen Zhao, Arman Cohan
  • API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs
    Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Vernon Austel, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras
  • LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks
    Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu, Maosong Sun
  • Harder Task Needs More Experts: Dynamic Routing in MoE Models
    Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng
  • XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception
    HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang
  • SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents
    Ruiyi Wang, Haofei Yu, Wenxin Sharon Zhang, Zhengyang Qi, Maarten Sap, Yonatan Bisk, Graham Neubig, Hao Zhu
  • 𝒳FT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts
    Yifeng Ding, Jiawei Liu, Yuxiang Wei, Lingming Zhang
  • Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning
    Tuc Van Nguyen, Thai Le
  • Learning to Decode Collaboratively with Multiple Language Models
    Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David Sontag
  • DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models
    Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu
  • Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning?
    Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min Zhang
  • CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation
    Pei Ke, Bosi Wen, Andrew Zhuoer Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang
  • LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments
    Junzhe Chen, Xuming Hu, Shuodi Liu, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Lijie Wen
  • Small But Funny: A Feedback-Driven Approach to Humor Distillation
    Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Vered Shwartz, Arash Einolghozati
  • Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models
    Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan, Qika Lin, Yu Qiao, Jun Liu
  • From Sights to Insights: Towards Summarization of Multimodal Clinical Documents
    Akash Ghosh, Mohit Singh Tomar, Abhisek Tiwari, Sriparna Saha, Jatin Avinash Salve, Setu Sinha
  • When Phrases Meet Probabilities: Enabling Open Relation Extraction with Cooperating Large Language Models
    Jiaxin Wang, Lingling Zhang, Wee Sun Lee, Yujie Zhong, Liwei Kang, Jun Liu
  • Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation
    Jan Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Maria Bielikova, Peter Brusilovsky
  • Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic
    Yassine El Kheir, Hamdy Mubarak, Ahmed Ali, Shammur Absar Chowdhury
  • Document-Level Machine Translation with Large-Scale Public Parallel Corpora
    Proyag Pal, Alexandra Birch, Kenneth Heafield
  • Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In!
    Stefano Perrella, Lorenzo Proietti, Alessandro Scirè, Edoardo Barba, Roberto Navigli
  • NounAtlas: Filling the Gap in Nominal Semantic Role Labeling
    Roberto Navigli, Marco Lo Pinto, Pasquale Silvestri, Dennis Rotondi, Simone Ciciliano, Alessandro Scirè
  • Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length
    Nur Lan, Emmanuel Chemla, Roni Katzir
  • Context versus Prior Knowledge in Language Models
    Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell
  • Word Matters: What Influences Domain Adaptation in Summarization?
    Yinghao Li, Siyu Miao, Heyan Huang, Yang Gao
  • Visualization Recommendation with Prompt-based Reprogramming of Large Language Models
    Xinhang Li, Jingbo Zhou, Wei Chen, Derong Xu, Tong Xu, Enhong Chen
  • HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs
    Pranoy Panda, Ankush Agarwal, Chaitanya Devaguptapu, Manohar Kaul, Prathosh AP
  • Toward In-Context Teaching: Adapting Examples to Students’ Misconceptions
    Alexis Ross, Jacob Andreas
  • Bridging Word-Pair and Token-Level Metaphor Detection with Explainable Domain Mining
    Yuan Tian, Ruike Zhang, Nan Xu, Wenji Mao
  • Faithful Logical Reasoning via Symbolic Chain-of-Thought
    Jundong Xu, Hao Fei, Liangming Pan, Qian Liu, Mong-Li Lee, Wynne Hsu
  • S²GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis
    Bingfeng Chen, Qihan Ouyang, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao
  • Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends
    Giuliano Martinelli, Edoardo Barba, Roberto Navigli
  • ESCoT: Towards Interpretable Emotional Support Dialogue Systems
    Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin
  • PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering
    Fangzhi Xu, Qika Lin, Tianzhe Zhao, Jiawei Han, Jun Liu
  • WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection
    Anudeex Shetty, Yue Teng, Ke He, Qiongkai Xu
  • Advancing Parameter Efficiency in Fine-tuning via Representation Editing
    Muling Wu, Wenhao Liu, Xiaohua Wang, Tianlong Li, Changze Lv, Zixuan Ling, Zhu JianHao, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang
  • Context Consistency between Training and Inference in Simultaneous Machine Translation
    Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang
  • Using Natural Language Explanations to Improve Robustness of In-context Learning
    Xuanli He, Yuxiang Wu, Oana-Maria Camburu, Pasquale Minervini, Pontus Stenetorp
  • The Earth is Flat because…: Investigating LLMs’ Belief towards Misinformation via Persuasive Conversation
    Rongwu Xu, Brian S. Lin, Shujian Yang, Tianqi Zhang, Weiyan Shi, Tianwei Zhang, Zhixuan Fang, Wei Xu, Han Qiu
  • Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers
    Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du
  • LooGLE: Can Long-Context Language Models Understand Long Contexts?
    Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang
  • ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models
    Hojae Han, Jaejin Kim, Jaeseok Yoo, Youngwon Lee, Seung-won Hwang
  • Let’s Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation
    Se Jin Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeonghun Yeo, Yong Man Ro
  • Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels
    Zixia Jia, Junpeng Li, Shichuan Zhang, Anji Liu, Zilong Zheng
  • MULFE: A Multi-Level Benchmark for Free Text Model Editing
    Chenhao Wang, Pengfei Cao, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao
  • MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech
    Shengpeng Ji, Ziyue Jiang, Wang Hanting, Jialung Zuo, Zhou Zhao
  • Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation
    Muraleekrishna Gopinathan, Martin Masek, Jumana Abu-Khalaf, David Suter
  • HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position
    Kechi Zhang, Ge Li, Huangzhao Zhang, Zhi Jin
  • Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training
    Junqing He, Kunhao Pan, Xiaoqun Dong, Zhuoyang Song, Yibo Liu, Qianguo Sun, Yuxin Liang, Hao Wang, Enming Zhang, Jiaxing Zhang
  • CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges
    Kechi Zhang, Jia Li, Ge Li, Xianjie Shi, Zhi Jin
  • When is Tree Search Useful for LLM Planning? It Depends on the Discriminator
    Ziru Chen, Michael White, Ray Mooney, Ali Payani, Yu Su, Huan Sun
  • LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models
    Mihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo, Santosh Mashetty, Arindam Mitra, Chitta Baral
  • ECBD: Evidence-Centered Benchmark Design for NLP
    Yu Lu Liu, Su Lin Blodgett, Jackie CK Cheung, Vera Liao, Alexandra Olteanu, Ziang Xiao
  • Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style Understanding
    Ruohao Guo, Wei Xu, Alan Ritter
  • Reducing Privacy Risks in Online Self-Disclosures with Language Models
    Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu
  • Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models
    Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang
  • REFINESUMM: Self-Refining MLLM for Generating a Multimodal Summarization Dataset
    Vaidehi Patil, Leonardo F. R. Ribeiro, Mengwen Liu, Mohit Bansal, Markus Dreyer
  • When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards
    Norah A. Alzahrani, Hisham Abdullah Alyahya, Yazeed Alnumay, Sultan AlRashed, Shaykhah Z. Alsubaie, Yousef Almushayqih, Faisal Abdulrahman Mirza, Nouf M. Alotaibi, Nora Al-Twairesh, Areeb Alowisheq, M Saiful Bari, Haidar Khan
  • LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts
    Helia Hashemi, Jason Eisner, Corby Rosset, Benjamin Van Durme, Chris Kedzie
  • LIEDER: Linguistically-Informed Evaluation for Discourse Entity Recognition
    Xiaomeng Zhu, Robert Frank
  • Evaluating Very Long-Term Conversational Memory of LLM Agents
    Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang
  • Prototypical Reward Network for Data-Efficient Model Alignment
    Jinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu
  • NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms
    Jonathan Zheng, Alan Ritter, Wei Xu
  • Impacts of Misspelled Queries on Translation and Product Search
    Greg Hanneman, Natawut Monaikul, Taichi Nakatani
  • Having Beer after Prayer? Measuring Cultural Bias in Large Language Models
    Tarek Naous, Michael J Ryan, Alan Ritter, Wei Xu
  • Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs
    Bilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia, Ming Jin
  • The MERSA Dataset and a Transformer-Based Approach for Speech Emotion Recognition
    Enshi Zhang, Rafael Trujillo, Christian Poellabauer
  • Transparent and Scrutable Recommendations Using Natural Language User Profiles
    Jerome Ramos, Hossein A. Rahmani, Xi Wang, Xiao Fu, Aldo Lipani
  • Fora: A corpus and framework for the study of facilitated dialogue
    Hope Schroeder, Deb Roy, Jad Kabbara
  • Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning
    Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky
  • What is the Best Way for ChatGPT to Translate Poetry?
    Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao
  • Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling
    Pratyush Maini, Skyler Seto, Richard He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly
  • DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal Intervention
    Junda Wu, Tong Yu, Xiang Chen, Haoliang Wang, Ryan A. Rossi, Sungchul Kim, Anup Rao, Julian McAuley
  • Representation Learning with Conditional Information Flow Maximization
    Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu
  • GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction
    Virginia K. Felkner, Jennifer A. Thompson, Jonathan May
  • Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models
    Martin Riddell, Ansong Ni, Arman Cohan
  • Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic
    Rishabh Bhardwaj, Duc Anh Do, Soujanya Poria
  • Tracking the Newsworthiness of Public Documents
    Alexander Spangher, Serdar Tumgoren, Ben Welsh, Nanyun Peng, Emilio Ferrara, Jonathan May
  • EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems
    Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh
  • Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
    Paul Roit, Aviv Slobodkin, Eran Hirsch, Arie Cattan, Ayal Klein, Valentina Pyatkin, Ido Dagan
  • Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language Models
    Shengzhi Li, Rongyu Lin, Shichao Pei
  • Word Embeddings Are Steers for Language Models
    Chi Han, Jialiang Xu, Manling Li, Yi Fung, Chenkai Sun, Nan Jiang, Tarek F. Abdelzaher, Heng Ji
  • Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation
    Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum
  • Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor
    Sangwon Yu, Changmin Lee, Hojin Lee, Sungroh Yoon
  • LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP
    Danlu Chen, Freda Shi, Aditi Agarwal, Jacobo Myerston, Taylor Berg-Kirkpatrick
  • Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning
    Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou
  • Confabulation: The Surprising Value of Large Language Model Hallucinations
    Peiqi Sui, Eamon Duede, Sophie Wu, Richard Jean So
  • IAPT: Instance-Aware Prompt Tuning for Large Language Models
    Wei Zhu, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie

Short Papers

  • Can Language Models Serve as Text-Based World Simulators?
    Ruoyao Wang, Graham Todd, Ziang Xiao, Xingdi Yuan, Marc-Alexandre Côté, Peter Clark, Peter Jansen
  • FanOutQA: A Multi-Hop, Multi-Document Question Answering Benchmark for Large Language Models
    Andrew Zhu, Alyssa Hwang, Liam Dugan, Chris Callison-Burch
  • Revisiting Code Similarity Evaluation with Abstract Syntax Tree Edit Distance
    Yewei Song, Cedric Lothritz, Daniel Tang, Tegawendé F. Bissyandé, Jacques Klein
  • Resisting the Lure of the Skyline: Grounding Practices in Active Learning for Morphological Inflection
    Saliha Muradoglu, Michael Ginn, Miikka Silfverberg, Mans Hulden
  • Speculative Contrastive Decoding
    Hongyi Yuan, Keming Lu, Fei Huang, Zheng Yuan, Chang Zhou
  • RDRec: Rationale Distillation for LLM-based Recommendation
    Xinfeng Wang, Jin Cui, Yoshimi Suzuki, Fumiyo Fukumoto
  • Isotropy, Clusters, and Classifiers
    Timothee Mickus, Stig-Arne Grönroos, Joseph Attieh
  • Language Models Do Hard Arithmetic Tasks Easily and Hardly Do Easy Arithmetic Tasks
    Andrew Gambardella, Yusuke Iwasawa, Yutaka Matsuo
  • Cleaner Pretraining Corpus Curation with Neural Web Scraping
    Zhipeng Xu, Zhenghao Liu, Yukun Yan, Zhiyuan Liu, Ge Yu, Chenyan Xiong
  • Simpson’s Paradox and the Accuracy-Fluency Tradeoff in Translation
    Zheng Wei Lim, Ekaterina Vylomova, Trevor Cohn, Charles Kemp
  • UltraSparseBERT: 99% Conditionally Sparse Language Modelling
    Peter Belcak, Roger Wattenhofer
  • SceMQA: A Scientific College Entrance Level Multimodal Question Answering Benchmark
    Zhenwen Liang, Kehan Guo, Gang Liu, Taicheng Guo, Yujun Zhou, Tianyu Yang, Jiajun Jiao, Renjie Pi, Jipeng Zhang, Xiangliang Zhang
  • On the Role of Long-tail Knowledge in Retrieval Augmented Large Language Models
    Dongyang Li, Junbing Yan, Taolin Zhang, Chengyu Wang, Xiaofeng He, Longtao Huang, Hui Xue, Jun Huang
  • IEPile: Unearthing Large Scale Schema-Conditioned Information Extraction Corpus
    Honghao Gui, Lin Yuan, Hongbin Ye, Ningyu Zhang, Mengshu Sun, Lei Liang, Huajun Chen
  • Bi-Directional Multi-Granularity Generation Framework for Knowledge Graph-to-Text with Large Language Model
    Haowei Du, Chen Li, Dinghao Zhang, Dongyan Zhao
  • Code-Switching Can be Better Aligners: Advancing Cross-Lingual SLU through Representation-Level and Prediction-Level Alignment
    Zhihong Zhu, Xuxin Cheng, Zhanpeng Chen, Xianwei Zhuang, Zhiqi Huang, Yuexian Zou
  • AFLoRA: Adaptive Freezing of Low Rank Adaptation in Parameter Efficient Fine-Tuning of Large Models
    Zeyu Liu, Souvik Kundu, Anni Li, Junrui Wan, Lianghao Jiang, Peter Anthony Beerel
  • DDPrompt: Differential Diversity Prompting in Large Language Models
    Lin Mu, Wenhao Zhang, Yiwen Zhang, Peiquan Jin
  • Monotonic Representation of Numeric Attributes in Language Models
    Benjamin Heinzerling, Kentaro Inui
  • Two Issues with Chinese Spelling Correction and A Refinement Solution
    Changxuan Sun, Linlin She, Xuesong Lu
  • Linear-time Minimum Bayes Risk Decoding with Reference Aggregation
    Jannis Vamvas, Rico Sennrich
  • DynaSemble: Dynamic Ensembling of Textual and Structure-Based Models for Knowledge Graph Completion
    Ananjan Nandi, Navdeep Kaur, Parag Singla, Mausam
  • Fine-Tuning Pre-Trained Language Models with Gaze Supervision
    Shuwen Deng, Paul Prasse, David Robert Reich, Tobias Scheffer, Lena Ann Jäger
  • Growing Trees on Sounds: Assessing Strategies for End-to-End Dependency Parsing of Speech
    Adrien Pupier, Maximin Coavoux, Jérôme Goulian, Benjamin Lecouteux
  • Sketch-Guided Constrained Decoding for Boosting Blackbox Large Language Models without Logit Access
    Saibo Geng, Berkay Döner, Chris Wendler, Martin Josifoski, Robert West
  • On the Semantic Latent Space of Diffusion-Based Text-To-Speech Models
    Miri Varshavsky, Roy Hirsch, Regev Cohen, Tomer Golany, Daniel Freedman, Ehud Rivlin
  • Learnable Privacy Neurons Localization in Language Models
    Ruizhe Chen, Tianxiang Hu, Yang Feng, Zuozhu Liu
  • Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Non-Literal Intent Resolution in LLMs
    Akhila Yerukola, Saujas Vaduguru, Daniel Fried, Maarten Sap
  • Generating Harder Cross-document Event Coreference Resolution Datasets using Metaphoric Paraphrasing
    Shafiuddin Rehan Ahmed, Zhiyong Wang, George Arthur Baker, Kevin Stowe, James H. Martin
  • Soft Self-Consistency Improves Language Model Agents
    Han Wang, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal
  • RecGPT: Generative Pre-training for Text-based Recommendation
    Hoang Ngo, Dat Quoc Nguyen
  • MTP: A Dataset for Multi-Modal Turning Points in Casual Conversations
    Gia-Bao Dinh Ho, Chang Wei Tan, Zahra Zamanzadeh Darban, Mahsa Salehi, Reza Haf, Wray Buntine
  • What Do Dialect Speakers Want? A Survey of Attitudes Towards Language Technology for German Dialects
    Verena Blaschke, Christoph Purschke, Hinrich Schuetze, Barbara Plank
  • What Does Parameter-free Probing Really Uncover?
    Tommi Buder-Gröndahl
  • ATLAS: Improving Lay Summarisation with Attribute-based Control
    Zhihao Zhang, Tomas Goldsack, Carolina Scarton, Chenghua Lin
  • EmbSpatial-Bench: Benchmarking Spatial Understanding for Embodied Tasks with Large Vision-Language Models
    Mengfei Du, Binhao Wu, Zejun Li, Xuanjing Huang, Zhongyu Wei
  • Understanding the Effects of Noise in Text-to-SQL: An Examination of the BIRD-Bench Benchmark
    Niklas Wretblad, Fredrik Gordh Riseby, Rahul Biswas, Amin Ahmadi, Oskar Holmström
  • Dwell in the Beginning: How Language Models Embed Long Documents for Dense Retrieval
    João Coelho, Bruno Martins, Joao Magalhaes, Jamie Callan, Chenyan Xiong
  • That’s Optional: A Contemporary Exploration of “that” Omission in English Subordinate Clauses
    Ella Rabinovich
  • Do Large Language Models Discriminate in Hiring Decisions on the Basis of Race, Ethnicity, and Gender?
    Haozhe An, Christabel Acquaye, Colin Wang, Zongxia Li, Rachel Rudinger
  • Explainability and Hate Speech: Structured Explanations Make Social Media Moderators Faster
    Agostina Calabrese, Leonardo Neves, Neil Shah, Maarten W. Bos, Björn Ross, Mirella Lapata, Francesco Barbieri
  • Getting Serious about Humor: Crafting Humor Datasets with Unfunny Large Language Models
    Zachary Horvitz, Jingru Chen, Rahul Aditya, Harshvardhan Srivastava, Robert West, Zhou Yu, Kathleen McKeown
  • Estimating the Level of Dialectness Predicts Inter-annotator Agreement in Multi-dialect Arabic Datasets
    Amr Keleg, Walid Magdy, Sharon Goldwater
  • Born Differently Makes a Difference: Counterfactual Study of Bias in Biography Generation from a Data-to-Text Perspective
    Biaoyan Fang, Ritvik Dinesh, Xiang Dai, Sarvnaz Karimi
  • Greed is All You Need: An Evaluation of Tokenizer Inference Methods
    Omri Uzan, Craig W Schmidt, Chris Tanner, Yuval Pinter
  • Sign Language Translation with Sentence Embedding Supervision
    Yasser Hamidullah, Josef van Genabith, Cristina España-Bonet
  • STREAM: Simplified Topic Retrieval, Exploration, and Analysis Module
    Anton Frederik Thielmann, Arik Reuter, Christoph Weisser, Gillian Kant, Manish Kumar, Benjamin Säfken
  • DocFinQA: A Long-Context Financial Reasoning Dataset
    Varshini Reddy, Rik Koncel-Kedziorski, Viet Dac Lai, Michael Krumdick, Charles Lovering, Chris Tanner
  • MaskLID: Code-Switching Language Identification through Iterative Masking
    Amir Hossein Kargaran, François Yvon, Hinrich Schuetze
  • An Empirical Analysis on Large Language Models in Debate Evaluation
    Xinyi Liu, Pinxin Liu, Hangfeng He
  • Fine-Tuned Machine Translation Metrics Struggle in Unseen Domains
    Vilém Zouhar, Shuoyang Ding, Anna Currey, Tatyana Badeka, Jenyuan Wang, Brian Thompson
  • IndicIRSuite: Multilingual Dataset and Neural Information Models for Indian Languages
    Saiful Haq, Ashutosh Sharma, Omar Khattab, Niyati Chhaya, Pushpak Bhattacharyya
  • AGR: Reinforced Causal Agent-Guided Self-explaining Rationalization
    Yunxiao Zhao, Zhiqiang Wang, Xiaoli Li, Jiye Liang, Ru Li
  • Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research
    Surangika Ranathunga, Nisansa de Silva, Dilith Jayakody, Aloka Fernando
  • The Probabilities Also Matter: A More Faithful Metric for Faithfulness of Free-Text Explanations in Large Language Models
    Noah Yamamoto Siegel, Oana-Maria Camburu, Nicolas Heess, Maria Perez-Ortiz
  • Don’t Buy it! Reassessing the Ad Understanding Abilities of Contrastive Multimodal Models
    Anna Bavaresco, Alberto Testoni, Raquel Fernández
  • Naming, Describing, and Quantifying Visual Objects in Humans and LLMs
    Alberto Testoni, Juell Sprott, Sandro Pezzelle
  • Are LLMs classical or nonmonotonic reasoners? Lessons from generics
    Alina Leidinger, Robert Van Rooij, Ekaterina Shutova
  • ConstitutionalExperts: Training a Mixture of Principle-based Prompts
    Savvas Petridis, Ben Wedin, Ann Yuan, James Wexler, Nithum Thain
  • Time Sensitive Knowledge Editing through Efficient Finetuning
    Xiou Ge, Ali Mousavi, Edouard Grave, Armand Joulin, Kun Qian, Benjamin Han, Mostafa Arefiyan, Yunyao Li
  • PRewrite: Prompt Rewriting with Reinforcement Learning
    Weize Kong, Spurthi Amba Hombaiah, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky
  • SeeGULL Multilingual: a Dataset of Geo-Culturally Situated Stereotypes
    Mukul Bhutani, Kevin Robinson, Vinodkumar Prabhakaran, Shachi Dave, Sunipa Dev
  • Paraphrasing in Affirmative Terms Improves Negation Understanding
    MohammadHossein Rezaei, Eduardo Blanco
  • Exploring Conditional Variational Mechanism to Pinyin Input Method for Addressing One-to-Many Mappings in Low-Resource Scenarios
    Bin Sun, Jianfeng Li, Hao Zhou, Fandong Meng, Kan Li, Jie Zhou
  • Consistency Training by Synthetic Question Generation for Conversational Question Answering
    Hamed Hematian Hemati, Hamid Beigy
  • How Good is Zero-Shot MT Evaluation for Low Resource Indian Languages?
    Anushka Singh, Ananya B. Sai, Raj Dabre, Ratish Puduppully, Anoop Kunchukuttan, Mitesh M Khapra
  • Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages
    Mofetoluwa Adeyemi, Akintunde Oladipo, Ronak Pradeep, Jimmy Lin
  • Cross-Modal Projection in Multimodal LLMs Doesn’t Really Project Visual Attributes to Textual Space
    Gaurav Verma, Minje Choi, Kartik Sharma, Jamelle Watson-Daniels, Sejoon Oh, Srijan Kumar
  • Guidance-Based Prompt Data Augmentation in Specialized Domains for Named Entity Recognition
    Hyeonseok Kang, Hyein Seo, Jeesu Jung, Sangkeun Jung, Du-Seong Chang, Riwoo Chung
  • Aligning Large Language Models via Fine-grained Supervision
    Dehong Xu, Liang Qiu, Minseok Kim, Faisal Ladhak, Jaeyoung Do
  • Annotating FrameNet via Structure-Conditioned Language Generation
    Xinyue Cui, Swabha Swayamdipta
  • DUAL-REFLECT: Enhancing Large Language Models for Reflective Translation through Dual Learning Feedback Mechanisms
    Andong Chen, Lianzhang Lou, Kehai Chen, Xuefeng Bai, Yang Xiang, Muyun Yang, Tiejun Zhao, Min Zhang
  • Towards Artwork Explanation in Large-scale Vision Language Models
    Kazuki Hayashi, Yusuke Sakai, Hidetaka Kamigaito, Katsuhiko Hayashi, Taro Watanabe
  • On the Hallucination in Simultaneous Machine Translation
    Meizhi Zhong, Kehai Chen, Zhengshan Xue, Lemao Liu, Mingming Yang, Min Zhang
  • Self-Augmented In-Context Learning for Unsupervised Word Translation
    Yaoyiran Li, Anna Korhonen, Ivan Vulić
  • RAM-EHR: Retrieval Augmentation Meets Clinical Predictions on Electronic Health Records
    Ran Xu, Wenqi Shi, Yue Yu, Yuchen Zhuang, Bowen Jin, May Dongmei Wang, Joyce C. Ho, Carl Yang