[![Contributors][contributors-shield]][contributors-url] [![Forks][forks-shield]][forks-url] [![Stargazers][stars-shield]][stars-url] [![Issues][issues-shield]][issues-url]
Usage instructions: here
Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-14 | Masked Image Modeling Boosting Semi-Supervised Semantic Segmentation | Yangyang Li et.al. | 2411.08756 | null |
2024-11-06 | Retentive Neural Quantum States: Efficient Ansätze for Ab Initio Quantum Chemistry | Oliver Knitter et.al. | 2411.03900 | null |
2024-10-31 | Domain-Adaptive Pre-training of Self-Supervised Foundation Models for Medical Image Classification in Gastrointestinal Endoscopy | Marcel Roth et.al. | 2410.21302 | null |
2024-10-10 | C^2DA: Contrastive and Context-aware Domain Adaptive Semantic Segmentation | Md. Al-Masrur Khan et.al. | 2410.19748 | link |
2024-10-22 | Towards Real Zero-Shot Camouflaged Object Segmentation without Camouflaged Annotations | Cheng Lei et.al. | 2410.16953 | null |
2024-10-21 | TIPS: Text-Image Pretraining with Spatial Awareness | Kevis-Kokitsi Maninis et.al. | 2410.16512 | null |
2024-10-31 | Stabilize the Latent Space for Image Autoregressive Modeling: A Unified Perspective | Yongxin Zhu et.al. | 2410.12490 | link |
2024-10-14 | Enhancing JEPAs with Spatial Conditioning: Robust and Efficient Representation Learning | Etai Littwin et.al. | 2410.10773 | null |
2024-10-14 | LADMIM: Logical Anomaly Detection with Masked Image Modeling in Discrete Latent Space | Shunsuke Sakai et.al. | 2410.10234 | null |
2024-10-10 | Meissonic: Revitalizing Masked Generative Transformers for Efficient High-Resolution Text-to-Image Synthesis | Jinbin Bai et.al. | 2410.08261 | link |
2024-10-25 | OneRef: Unified One-tower Expression Grounding and Segmentation with Mask Referring Modeling | Linhui Xiao et.al. | 2410.08021 | link |
2024-10-11 | Self-Supervised Learning for Real-World Object Detection: a Survey | Alina Ciocarlan et.al. | 2410.07442 | null |
2024-10-09 | Robust infrared small target detection using self-supervised and a contrario paradigms | Alina Ciocarlan et.al. | 2410.07437 | null |
2024-10-09 | Pair-VPR: Place-Aware Pre-training and Contrastive Pair Classification for Visual Place Recognition with Vision Transformers | Stephen Hausler et.al. | 2410.06614 | null |
2024-10-05 | RetCompletion: High-Speed Inference Image Completion with Retentive Network | Yueyang Cang et.al. | 2410.04056 | null |
2024-10-02 | Denoising with a Joint-Embedding Predictive Architecture | Dengsheng Chen et.al. | 2410.03755 | null |
2024-10-02 | Performant, Memory Efficient and Scalable Multi-Agent Reinforcement Learning | Omayma Mahjoub et.al. | 2410.01706 | null |
2024-09-30 | MaskMamba: A Hybrid Mamba-Transformer Model for Masked Image Generation | Wenchao Chen et.al. | 2409.19937 | null |
2024-09-28 | Restore Anything with Masks: Leveraging Mask Image Modeling for Blind All-in-One Image Restoration | Chu-Jie Qin et.al. | 2409.19403 | link |
2024-09-30 | UniEmoX: Cross-modal Semantic-Guided Large-Scale Pretraining for Universal Scene Emotion Perception | Chuang Chen et.al. | 2409.18877 | link |
2024-09-26 | Self-supervised Pretraining for Cardiovascular Magnetic Resonance Cine Segmentation | Rob A. J. de Mooij et.al. | 2409.18100 | link |
2024-09-20 | Leveraging Text Localization for Scene Text Removal via Text-aware Masked Image Modeling | Zixiao Wang et.al. | 2409.13431 | link |
2024-09-13 | Interactive Masked Image Modeling for Multimodal Object Detection in Remote Sensing | Minh-Duc Vu et.al. | 2409.08885 | null |
2024-09-13 | Hybrid-TTA: Continual Test-time Adaptation via Dynamic Domain Shift Detection | Hyewon Park et.al. | 2409.08566 | null |
2024-09-04 | MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling | Jihye Ahn et.al. | 2409.02846 | null |
2024-09-04 | SG-MIM: Structured Knowledge Guided Efficient Pre-training for Dense Prediction | Sumin Son et.al. | 2409.02513 | null |
2024-08-21 | AttDiCNN: Attentive Dilated Convolutional Neural Network for Automatic Sleep Staging using Visibility Graph and Force-directed Layout | Md Jobayer et.al. | 2409.01962 | null |
2024-09-14 | Dual Advancement of Representation Learning and Clustering for Sparse and Noisy Images | Wenlin Li et.al. | 2409.01781 | link |
2024-08-28 | Online pre-training with long-form videos | Itsuki Kato et.al. | 2408.15651 | null |
2024-08-23 | MICM: Rethinking Unsupervised Pretraining for Enhanced Few-shot Learning | Zhenyu Zhang et.al. | 2408.13385 | link |
2024-08-23 | Symmetric masking strategy enhances the performance of Masked Image Modeling | Khanh-Binh Nguyen et.al. | 2408.12772 | null |
2024-08-13 | Membership Inference Attack Against Masked Image Modeling | Zheng Li et.al. | 2408.06825 | null |
2024-08-13 | Masked Image Modeling: A Survey | Vlad Hondru et.al. | 2408.06687 | null |
2024-08-11 | HySparK: Hybrid Sparse Masking for Large Scale Medical Image Pre-Training | Fenghe Tang et.al. | 2408.05815 | link |
2024-08-20 | PersonViT: Large-scale Self-supervised Vision Transformer for Person Re-Identification | Bin Hu et.al. | 2408.05398 | link |
2024-08-15 | AMAES: Augmented Masked Autoencoder Pretraining on Public Brain MRI Data for 3D-Native Segmentation | Asbjørn Munk et.al. | 2408.00640 | null |
2024-07-29 | Short-Term Forecasting of Photovoltaic Power Generation Based on Entropy during the Foggy Winter | Xuan Yang et.al. | 2407.19663 | null |
2024-08-02 | XLIP: Cross-modal Attention Masked Modelling for Medical Language-Image Pre-Training | Biao Wu et.al. | 2407.19546 | link |
2024-07-23 | QPT V2: Masked Image Modeling Advances Visual Scoring | Qizhi Xie et.al. | 2407.16541 | link |
2024-07-22 | Towards Latent Masked Image Modeling for Self-Supervised Visual Representation Learning | Yibing Wei et.al. | 2407.15837 | link |
2024-07-20 | Self-supervised transformer-based pre-training method with General Plant Infection dataset | Zhengle Wang et.al. | 2407.14911 | null |
2024-07-20 | Universal Medical Imaging Model for Domain Generalization with Data Privacy | Ahmed Radwan et.al. | 2407.14719 | null |
2024-07-18 | Keypoint Aware Masked Image Modelling | Madhava Krishna et.al. | 2407.13873 | link |
2024-07-18 | X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs | Sirnam Swetha et.al. | 2407.13851 | null |
2024-07-16 | AEMIM: Adversarial Examples Meet Masked Image Modeling | Wenzhao Xiang et.al. | 2407.11537 | null |
2024-07-16 | EndoFinder: Online Image Retrieval for Explainable Colorectal Polyp Diagnosis | Ruijie Yang et.al. | 2407.11401 | null |
2024-07-13 | ST-RetNet: A Long-term Spatial-Temporal Traffic Flow Prediction Method | Baichao Long et.al. | 2407.11074 | null |
2024-07-12 | On the Role of Discrete Tokenization in Visual Representation Learning | Tianqi Du et.al. | 2407.09087 | null |
2024-07-12 | Tissue-Contrastive Semi-Masked Autoencoders for Segmentation Pretraining on Chest CT | Jie Zheng et.al. | 2407.08961 | null |
2024-07-15 | Spatial-Temporal Attention Model for Traffic State Estimation with Sparse Internet of Vehicles | Jianzhe Xue et.al. | 2407.08047 | null |
2024-07-09 | D-MASTER: Mask Annealed Transformer for Unsupervised Domain Adaptation in Breast Cancer Detection from Mammograms | Tajamul Ashraf et.al. | 2407.06585 | null |
2024-07-16 | AnatoMask: Enhancing Medical Image Segmentation with Reconstruction-guided Self-masking | Yuheng Li et.al. | 2407.06468 | link |
2024-06-25 | Investigating Self-Supervised Methods for Label-Efficient Learning | Srinivasa Rao Nandam et.al. | 2406.17460 | null |
2024-06-25 | Pseudo Labelling for Enhanced Masked Autoencoders | Srinivasa Rao Nandam et.al. | 2406.17450 | null |
2024-06-18 | GFM4MPM: Towards Geospatial Foundation Models for Mineral Prospectivity Mapping | Angel Daruna et.al. | 2406.12756 | null |
2024-06-17 | Scaling Efficient Masked Autoencoder Learning on Large Remote Sensing Dataset | Fengxiang Wang et.al. | 2406.11933 | link |
2024-06-15 | SemanticMIM: Marring Masked Image Modeling with Semantics Compression for General Visual Representation | Yike Yuan et.al. | 2406.10673 | link |
2024-06-11 | Visual Representation Learning with Stochastic Frame Prediction | Huiwon Jang et.al. | 2406.07398 | null |
2024-06-08 | Medical Vision Generalist: Unifying Medical Imaging Tasks in Context | Sucheng Ren et.al. | 2406.05565 | link |
2024-06-03 | Boosting Spatial-Spectral Masked Auto-Encoder Through Mining Redundant Spectra for HSI-SAR/LiDAR Classification | Junyan Lin et.al. | 2406.01235 | null |
2024-06-06 | Whole Heart 3D+T Representation Learning Through Sparse 2D Cardiac MR Images | Yundi Zhang et.al. | 2406.00329 | null |
2024-06-14 | Enhancing Vision-Language Model with Unmasked Token Alignment | Jihao Liu et.al. | 2405.19009 | link |
2024-05-28 | Visualizing the loss landscape of Self-supervised Vision Transformer | Youngwan Lee et.al. | 2405.18042 | null |
2024-05-23 | Masked Image Modelling for retinal OCT understanding | Theodoros Pissas et.al. | 2405.14788 | null |
2024-05-14 | CLIP with Quality Captions: A Strong Pretraining for Vision Tasks | Pavan Kumar Anasosalu Vasu et.al. | 2405.08911 | null |
2024-05-11 | Self-Supervised Pre-training with Symmetric Superimposition Modeling for Scene Text Recognition | Zuan Gao et.al. | 2405.05841 | null |
2024-05-09 | Efficient Pretraining Model based on Multi-Scale Local Visual Field Feature Reconstruction for PCB CT Image Element Segmentation | Chen Chen et.al. | 2405.05745 | null |
2024-05-06 | Intra-task Mutual Attention based Vision Transformer for Few-Shot Learning | Weihao Jiang et.al. | 2405.03109 | null |
2024-05-02 | Self-Supervised Learning for Interventional Image Analytics: Towards Robust Device Trackers | Saahil Islam et.al. | 2405.01156 | null |
2024-05-25 | An Experimental Study on Exploring Strong Lightweight Vision Transformers via Masked Image Modeling Pre-Training | Jin Gao et.al. | 2404.12210 | link |
2024-04-18 | How to Benchmark Vision Foundation Models for Semantic Segmentation? | Tommie Kerssies et.al. | 2404.12172 | null |
2024-04-15 | XoFTR: Cross-modal Feature Matching Transformer | Önder Tuzcuoğlu et.al. | 2404.09692 | null |
2024-04-13 | Label-free Anomaly Detection in Aerial Agricultural Images with Masked Image Modeling | Sambal Shikhar et.al. | 2404.08931 | null |
2024-04-12 | Masked Image Modeling as a Framework for Self-Supervised Learning across Eye Movements | Robin Weiler et.al. | 2404.08526 | link |
2024-04-12 | Emerging Property of Masked Token for Effective Pre-training | Hyesong Choi et.al. | 2404.08330 | null |
2024-04-12 | Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training | Hyesong Choi et.al. | 2404.08327 | null |
2024-04-12 | A Novel Vision Transformer based Load Profile Analysis using Load Images as Inputs | Hyeonjin Kim et.al. | 2404.08175 | null |
2024-04-03 | A Unified Membership Inference Method for Visual Self-supervised Encoder via Part-aware Capability | Jie Zhu et.al. | 2404.02462 | link |
2024-04-01 | Bridging Remote Sensors with Multisensor Geospatial Foundation Models | Boran Han et.al. | 2404.01260 | link |
2024-04-01 | SyncMask: Synchronized Attentional Masking for Fashion-centric Vision-Language Pretraining | Chull Hwan Song et.al. | 2404.01156 | null |
2024-03-31 | Learning to Rank Patches for Unbiased Image Redundancy Reduction | Yang Luo et.al. | 2404.00680 | link |
2024-03-31 | DailyMAE: Towards Pretraining Masked Autoencoders in One Day | Jiantao Wu et.al. | 2404.00509 | link |
2024-03-23 | Once for Both: Single Stage of Importance and Sparsity Search for Vision Transformer Compression | Hancheng Ye et.al. | 2403.15835 | link |
2024-03-14 | Explore In-Context Segmentation via Latent Diffusion Models | Chaoyang Wang et.al. | 2403.09616 | null |
2024-03-13 | MIM4D: Masked Modeling with Multi-View Video for Autonomous Driving Representation Learning | Jialv Zou et.al. | 2403.08760 | link |
2024-03-20 | Content-aware Masked Image Modeling Transformer for Stereo Image Compression | Xinjie Zhang et.al. | 2403.08505 | null |
2024-03-07 | Masked Capsule Autoencoders | Miles Everett et.al. | 2403.04724 | null |
2024-03-04 | Transformers Provably Learn Feature-Position Correlations in Masked Image Modeling | Yu Huang et.al. | 2403.02233 | null |
2024-03-01 | Learning and Leveraging World Models in Visual Representation Learning | Quentin Garrido et.al. | 2403.00504 | null |
2024-03-01 | Semantics-enhanced Cross-modal Masked Image Modeling for Vision-Language Pre-training | Haowei Liu et.al. | 2403.00249 | null |
2024-02-27 | Masked Gamma-SSL: Learning Uncertainty Estimation via Masked Image Modeling | David S. W. Williams et.al. | 2402.17622 | null |
2024-03-08 | A Simple Framework Uniting Visual In-context Learning with Masked Image Modeling to Improve Ultrasound Segmentation | Yuyue Zhou et.al. | 2402.14300 | link |
2024-02-15 | MIM-Refiner: A Contrastive Learning Boost from Intermediate Pre-Trained Representations | Benedikt Alkin et.al. | 2402.10093 | link |
2024-02-13 | Improving Token-Based World Models with Parallel Observation Prediction | Lior Cohen et.al. | 2402.05643 | link |
2024-02-07 | Sparse Anatomical Prompt Semi-Supervised Learning with Masked Image Modeling for CBCT Tooth Segmentation | Pengyu Dai et.al. | 2402.04587 | null |
2024-01-24 | Learning Representations for Clustering via Partial Information Discrimination and Cross-Level Interaction | Hai-Xin Zhang et.al. | 2401.13503 | link |
2024-01-23 | Correlation-Embedded Transformer Tracking: A Single-Branch Framework | Fei Xie et.al. | 2401.12743 | link |
2024-01-15 | Exploring Masked Autoencoders for Sensor-Agnostic Image Retrieval in Remote Sensing | Jakob Hackstein et.al. | 2401.07782 | link |
2024-01-15 | One for All: Toward Unified Foundation Models for Earth Vision | Zhitong Xiong et.al. | 2401.07527 | null |
2024-01-17 | Frequency Masking for Universal Deepfake Detection | Chandler Timm Doloriel et.al. | 2401.06506 | link |
2024-01-05 | Fus-MAE: A cross-attention-based data fusion approach for Masked Autoencoders in remote sensing | Hugo Chan-To-Hing et.al. | 2401.02764 | null |
2024-01-05 | MOODv2: Masked Image Modeling for Out-of-Distribution Detection | Jingyao Li et.al. | 2401.02611 | null |
2024-01-04 | SyCoCa: Symmetrizing Contrastive Captioners with Attentive Masking for Multimodal Alignment | Ziping Ma et.al. | 2401.02137 | null |
2024-01-03 | aMUSEd: An Open MUSE Reproduction | Suraj Patil et.al. | 2401.01808 | link |
2023-12-31 | Analyzing Local Representations of Self-supervised Vision Transformers | Ani Vanyan et.al. | 2401.00463 | null |
2023-12-30 | Masked Image Modeling via Dynamic Token Morphing | Taekyung Kim et.al. | 2401.00254 | null |
2024-01-02 | USFM: A Universal Ultrasound Foundation Model Generalized to Tasks and Organs towards Label Efficient Image Analysis | Jing Jiao et.al. | 2401.00153 | null |
2023-12-27 | Learning to Embed Time Series Patches Independently | Seunghan Lee et.al. | 2312.16427 | link |
2023-12-19 | DMT: Comprehensive Distillation with Multiple Self-supervised Teachers | Yuang Liu et.al. | 2312.11938 | null |
2023-12-13 | PAD: Self-Supervised Pre-Training with Patchwise-Scale Adapter for Infrared Images | Tao Zhang et.al. | 2312.08192 | link |
2023-12-12 | Pre-trained Universal Medical Image Transformer | Lingxiao Luo et.al. | 2312.07630 | link |
2023-12-08 | MIMIR: Masked Image Modeling for Mutual Information-based Adversarial Robustness | Xiaoyun Xu et.al. | 2312.04960 | link |
2023-12-07 | Intelligent Anomaly Detection for Lane Rendering Using Transformer with Self-Supervised Pre-Training and Customized Fine-Tuning | Yongqi Dong et.al. | 2312.04398 | null |
2023-12-11 | Learning Cortical Anomaly through Masked Encoding for Unsupervised Heterogeneity Mapping | Hao-Chun Yang et.al. | 2312.02762 | null |
2023-12-02 | Local Masking Meets Progressive Freezing: Crafting Efficient Vision Transformers for Self-Supervised Learning | Utku Mert Topcuoglu et.al. | 2312.02194 | link |
2023-12-01 | Improve Supervised Representation Learning with Masked Image Modeling | Kaifeng Chen et.al. | 2312.00950 | null |
2023-11-28 | BIM: Block-Wise Self-Supervised Learning with Masked Image Modeling | Yixuan Luo et.al. | 2311.17218 | null |
2023-11-29 | Cross-Axis Transformer with 2D Rotary Embeddings | Lily Erickson et.al. | 2311.07184 | null |
2023-11-08 | Self-Supervised Learning for Visual Relationship Detection through Masked Bounding Box Reconstruction | Zacharias Anastasakis et.al. | 2311.04834 | link |
2023-11-08 | SS-MAE: Spatial-Spectral Masked Auto-Encoder for Multi-Source Remote Sensing Image Classification | Junyan Lin et.al. | 2311.04442 | link |
2023-10-31 | HAP: Structure-Aware Masked Image Modeling for Human-Centric Perception | Junkun Yuan et.al. | 2310.20695 | null |
2023-10-30 | ViR: Vision Retention Networks | Ali Hatamizadeh et.al. | 2310.19731 | null |
2023-10-29 | BirdSAT: Cross-View Contrastive Masked Autoencoders for Bird Species Classification and Mapping | Srikumar Sastry et.al. | 2310.19168 | link |
2023-11-20 | Adversarial Examples Are Not Real Features | Ang Li et.al. | 2310.18936 | null |
2023-10-28 | Pre-training with Random Orthogonal Projection Image Modeling | Maryam Haghighat et.al. | 2310.18737 | null |
2023-10-28 | Feature Guided Masked Autoencoder for Self-supervised Learning in Remote Sensing | Yi Wang et.al. | 2310.18653 | link |
2023-10-20 | Longer-range Contextualized Masked Autoencoder | Taekyung Kim et.al. | 2310.13593 | null |
2023-10-19 | Minimalist and High-Performance Semantic Segmentation with Plain Vision Transformers | Yuanduo Hong et.al. | 2310.12755 | link |
2023-10-11 | Heuristic Vision Pre-Training with Self-Supervised and Supervised Multi-Task Learning | Zhiming Qian et.al. | 2310.07510 | null |
2023-10-10 | Pre-Trained Masked Image Model for Mobile Robot Navigation | Vishnu Dutt Sharma et.al. | 2310.07021 | null |
2023-10-31 | RetSeg: Retention-based Colorectal Polyps Segmentation Network | Khaled ELKarazle et.al. | 2310.05446 | null |
2023-10-06 | Self-Supervised Neuron Segmentation with Multi-Agent Reinforcement Learning | Yinda Chen et.al. | 2310.04148 | link |
2023-10-02 | Self-distilled Masked Attention guided masked image modeling with noise Regularized Teacher (SMART) for medical image analysis | Jue Jiang et.al. | 2310.01209 | null |
2023-10-15 | Information Flow in Self-Supervised Learning | Zhiquan Tan et.al. | 2309.17281 | link |
2023-09-26 | M | Muhammad Abdullah Jamal et.al. | 2309.15313 | null |
2023-10-08 | Masked Image Residual Learning for Scaling Deeper Vision Transformers | Guoxi Huang et.al. | 2309.14136 | link |
2023-10-11 | RMT: Retentive Networks Meet Vision Transformers | Qihang Fan et.al. | 2309.11523 | null |
2023-09-18 | Heterogeneous Generative Knowledge Distillation with Masked Image Modeling | Ziming Wang et.al. | 2309.09571 | null |
2023-09-18 | FactoFormer: Factorized Hyperspectral Transformers with Self-Supervised Pre-Training | Shaheer Mohamed et.al. | 2309.09431 | link |
2023-09-16 | RingMo-lite: A Remote Sensing Multi-task Lightweight Network with CNN-Transformer Hybrid Framework | Yuelei Wang et.al. | 2309.09003 | null |
2023-09-14 | Unleashing the Power of Depth and Pose Estimation Neural Networks by Designing Compatible Endoscopic Images | Junyang Wu et.al. | 2309.07390 | null |
2023-09-11 | SCD-Net: Spatiotemporal Clues Disentanglement Network for Self-supervised Skeleton-based Action Recognition | Cong Wu et.al. | 2309.05834 | null |
2023-09-11 | An Effective Two-stage Training Paradigm Detector for Small Dataset | Zheng Wang et.al. | 2309.05652 | null |
2023-09-09 | BiLMa: Bidirectional Local-Matching for Text-based Person Re-identification | Takuro Fujii et.al. | 2309.04675 | null |
2023-09-08 | AMLP: Adaptive Masking Lesion Patches for Self-supervised Medical Image Segmentation | Xiangtao Wang et.al. | 2309.04312 | null |

Publish Date | Title | Authors | arXiv ID | Code |
---|---|---|---|---|
2024-11-14 | MagicQuill: An Intelligent Interactive Image Editing System | Zichen Liu et.al. | 2411.09703 | null |
2024-11-14 | Advancing Fine-Grained Visual Understanding with Multi-Scale Alignment in Multi-Modal Models | Wei Wang et.al. | 2411.09691 | null |
2024-11-14 | Squeezed Attention: Accelerating Long Context Length LLM Inference | Coleman Hooper et.al. | 2411.09688 | null |
2024-11-14 | Local deployment of large-scale music AI models on commodity hardware | Xun Zhou et.al. | 2411.09625 | null |
2024-11-14 | PTR: Precision-Driven Tool Recommendation for Large Language Models | Hang Gao et.al. | 2411.09613 | null |
2024-11-14 | The Moral Foundations Weibo Corpus | Renjie Cao et.al. | 2411.09612 | null |
2024-11-14 | Initial Nugget Evaluation Results for the TREC 2024 RAG Track with the AutoNuggetizer Framework | Ronak Pradeep et.al. | 2411.09607 | null |
2024-11-14 | Accelerating Knowledge Graph and Ontology Engineering with Large Language Models | Cogan Shimizu et.al. | 2411.09601 | null |
2024-11-14 | LLaMA-Mesh: Unifying 3D Mesh Generation with Language Models | Zhengyi Wang et.al. | 2411.09595 | null |
2024-11-14 | Adopting RAG for LLM-Aided Future Vehicle Design | Vahid Zolfaghari et.al. | 2411.09590 | null |
2024-11-13 | The Limited Impact of Medical Adaptation of Large Language and Vision-Language Models | Daniel P. Jeong et.al. | 2411.08870 | null |
2024-11-13 | LLMStinger: Jailbreaking LLMs using RL fine-tuned LLMs | Piyush Jha et.al. | 2411.08862 | null |
2024-11-13 | Multimodal Instruction Tuning with Hybrid State Space Models | Jianing Zhou et.al. | 2411.08840 | null |
2024-11-13 | FinRobot: AI Agent for Equity Research and Valuation with Large Language Models | Tianyu Zhou et.al. | 2411.08804 | link |
2024-11-13 | Evaluating World Models with LLM for Decision Making | Chang Yang et.al. | 2411.08794 | null |
2024-11-13 | Can sparse autoencoders be used to decompose and interpret steering vectors? | Harry Mayne et.al. | 2411.08790 | link |
2024-11-13 | Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers | Clément Dumas et.al. | 2411.08745 | link |
2024-11-13 | A Comparative Study of Discrete Speech Tokens for Semantic-Related Tasks with Large Language Models | Dingdong Wang et.al. | 2411.08742 | null |
2024-11-14 | Dynamic Rewarding with Prompt Optimization Enables Tuning-free Self-Alignment of Language Models | Somanshu Singla et.al. | 2411.08733 | null |
2024-11-13 | Polymetis: Large Language Modeling for Multiple Material Domains | Chao Huang et.al. | 2411.08728 | null |
2024-11-12 | Learning with Less: Knowledge Distillation from Large Language Models via Unlabeled Data | Juanhui Li et.al. | 2411.08028 | null |
2024-11-12 | LLMPhy: Complex Physical Reasoning Using Large Language Models and World Models | Anoop Cherian et.al. | 2411.08027 | null |
2024-11-12 | Language Models as Causal Effect Generators | Lucius E. J. Bynum et.al. | 2411.08019 | link |
2024-11-12 | ExpressivityArena: Can LLMs Express Information Implicitly? | Joshua Tint et.al. | 2411.08010 | null |
2024-11-12 | Can adversarial attacks by large language models be attributed? | Manuel Cebrian et.al. | 2411.08003 | null |
2024-11-12 | Derivational Morphology Reveals Analogical Generalization in Large Language Models | Valentin Hofmann et.al. | 2411.07990 | null |
2024-11-12 | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation | Yiyang Ma et.al. | 2411.07975 | null |
2024-11-12 | From General to Specific: Utilizing General Hallucination to Automatically Measure the Role Relationship Fidelity for Specific Role-Play Agents | Chuyi Kong et.al. | 2411.07965 | null |
2024-11-12 | Towards Low-bit Communication for Tensor Parallel LLM Inference | Harry Dong et.al. | 2411.07942 | null |
2024-11-12 | Leveraging Multimodal Models for Enhanced Neuroimaging Diagnostics in Alzheimer's Disease | Francesco Chiumento et.al. | 2411.07871 | null |
2024-11-11 | UTMath: Math Evaluation with Unit Test via Reasoning-to-Coding Thoughts | Bo Yang et.al. | 2411.07240 | null |
2024-11-11 | OpenThaiGPT 1.5: A Thai-Centric Open Source Large Language Model | Sumeth Yuenyong et.al. | 2411.07238 | null |
2024-11-11 | Tooling or Not Tooling? The Impact of Tools on Language Agents for Chemistry Problem Solving | Botao Yu et.al. | 2411.07228 | null |
2024-11-11 | TreeCoders: Trees of Transformers | Pierre Colonna D'Istria et.al. | 2411.07218 | null |
2024-11-11 | Comparing Bottom-Up and Top-Down Steering Approaches on In-Context Learning Tasks | Madeline Brumley et.al. | 2411.07213 | null |
2024-11-11 | DLCR: A Generative Data Expansion Framework via Diffusion for Clothes-Changing Person Re-ID | Nyle Siddiqui et.al. | 2411.07205 | link |
2024-11-11 | The Super Weight in Large Language Models | Mengxia Yu et.al. | 2411.07191 | link |
2024-11-11 | NatureLM-audio: an Audio-Language Foundation Model for Bioacoustics | David Robinson et.al. | 2411.07186 | null |
2024-11-11 | Continual Memorization of Factoids in Large Language Models | Howard Chen et.al. | 2411.07175 | link |
2024-11-11 | A Domain-Agnostic Neurosymbolic Approach for Big Social Data Analysis: Evaluating Mental Health Sentiment on Social Media during COVID-19 | Vedant Khandelwal et.al. | 2411.07163 | null |
2024-11-08 | Recycled Attention: Efficient inference for long-context language models | Fangyuan Xu et.al. | 2411.05787 | null |
2024-11-08 | Fact or Fiction? Can LLMs be Reliable Annotators for Political Truths? | Veronica Chatrath et.al. | 2411.05775 | null |
2024-11-08 | Multi-hop Evidence Pursuit Meets the Web: Team Papelo at FEVER 2024 | Christopher Malon et.al. | 2411.05762 | null |
2024-11-08 | Unmasking the Limits of Large Language Models: A Systematic Evaluation of Masked Text Processing Ability through MskQA and MskCal | Fuka Matsuzaki et.al. | 2411.05665 | link |
2024-11-08 | The influence of persona and conversational task on social interactions with a LLM-controlled embodied conversational agent | Leon O. H. Kroczek et.al. | 2411.05653 | null |
2024-11-08 | LightVA: Lightweight Visual Analytics with LLM Agent-Based Task Planning and Execution | Yuheng Zhao et.al. | 2411.05651 | null |
2024-11-08 | Evaluating Large Language Model Capability in Vietnamese Fact-Checking Data Generation | Long Truong To et.al. | 2411.05641 | null |
2024-11-08 | Assessing Open-Source Large Language Models on Argumentation Mining Subtasks | Mohammad Yeghaneh Abkenar et.al. | 2411.05639 | null |
2024-11-08 | A Two-Step Concept-Based Approach for Enhanced Interpretability and Trust in Skin Lesion Diagnosis | Cristiano Patrício et.al. | 2411.05609 | null |
2024-11-08 | Evaluating and Adapting Large Language Models to Represent Folktales in Low-Resource Languages | JA Meaney et.al. | 2411.05593 | null |
2024-11-07 | SVDQuant: Absorbing Outliers by Low-Rank Components for 4-Bit Diffusion Models | Muyang Li et.al. | 2411.05007 | link |
2024-11-07 | Needle Threading: Can LLMs Follow Threads through Near-Million-Scale Haystacks? | Jonathan Roberts et.al. | 2411.05000 | null |
2024-11-07 | LLM2CLIP: Powerful Language Model Unlock Richer Visual Representation | Weiquan Huang et.al. | 2411.04997 | link |
2024-11-07 | Mixture-of-Transformers: A Sparse and Scalable Architecture for Multi-Modal Foundation Models | Weixin Liang et.al. | 2411.04996 | null |
2024-11-07 | Rethinking Bradley-Terry Models in Preference-Based Reward Modeling: Foundations, Theory, and Alternatives | Hao Sun et.al. | 2411.04991 | link |
2024-11-07 | Enhancing Reverse Engineering: Investigating and Benchmarking Large Language Models for Vulnerability Analysis in Decompiled Binaries | Dylan Manuel et.al. | 2411.04981 | null |
2024-11-07 | SuffixDecoding: A Model-Free Approach to Speeding Up Large Language Model Inference | Gabriele Oliaro et.al. | 2411.04975 | null |
2024-11-07 | BitNet a4.8: 4-bit Activations for 1-bit LLMs | Hongyu Wang et.al. | 2411.04965 | null |
2024-11-07 | Position Paper On Diagnostic Uncertainty Estimation from Large Language Models: Next-Word Probability Is Not Pre-test Probability | Yanjun Gao et.al. | 2411.04962 | null |
2024-11-07 | CAD-MLLM: Unifying Multimodality-Conditioned CAD Generation With MLLM | Jingwei Xu et.al. | 2411.04954 | null |
2024-11-06 | Medical Adaptation of Large Language and Vision-Language Models: Are We Making Progress? | Daniel P. Jeong et.al. | 2411.04118 | null |
2024-11-07 | How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis | Guan Zhe Hong et.al. | 2411.04105 | null |
2024-11-06 | Textual Decomposition Then Sub-motion-space Scattering for Open-Vocabulary Motion Generation | Ke Fan et.al. | 2411.04079 | null |
2024-11-06 | Beemo: Benchmark of Expert-edited Machine-generated Outputs | Ekaterina Artemova et.al. | 2411.04032 | null |
2024-11-06 | Prompt Engineering Using GPT for Word-Level Code-Mixed Language Identification in Low-Resource Dravidian Languages | Aniket Deroy et.al. | 2411.04025 | null |
2024-11-06 | | Themistoklis Haris et.al. | 2411.04013 | null |
2024-11-06 | Customized Multiple Clustering via Multi-Modal Subspace Proxy Learning | Jiawei Yao et.al. | 2411.03978 | null |
2024-11-06 | What Really is Commonsense Knowledge? | Quyet V. Do et.al. | 2411.03964 | null |
2024-11-06 | How Does A Text Preprocessing Pipeline Affect Ontology Syntactic Matching? | Zhangcheng Qiang et.al. | 2411.03962 | null |
2024-11-06 | Fine-Grained Guidance for Retrievers: Leveraging LLMs' Feedback in Retrieval-Augmented Generation | Yuhang Liu et.al. | 2411.03957 | null |
2024-11-05 | LLMs for Domain Generation Algorithm Detection | Reynier Leyva La O et.al. | 2411.03307 | null |
2024-11-05 | VERITAS: A Unified Approach to Reliability Evaluation | Rajkumar Ramamurthy et.al. | 2411.03300 | null |
2024-11-05 | Examining Human-AI Collaboration for Co-Writing Constructive Comments Online | Farhana Shahid et.al. | 2411.03295 | null |
2024-11-05 | Interaction2Code: How Far Are We From Automatic Interactive Webpage Generation? | Jingyu Xiao et.al. | 2411.03292 | null |
2024-11-05 | The Future of Intelligent Healthcare: A Systematic Analysis and Discussion on the Integration and Impact of Robots Using Large Language Models for Healthcare | Souren Pashangpour et.al. | 2411.03287 | null |
2024-11-05 | SMoA: Improving Multi-agent Large Language Models with Sparse Mixture-of-Agents | Dawei Li et.al. | 2411.03284 | link |
2024-11-05 | ShadowMamba: State-Space Model with Boundary-Region Selective Scan for Shadow Removal | Xiujin Zhu et.al. | 2411.03260 | null |
2024-11-05 | Spontaneous Emergence of Agent Individuality through Social Interactions in LLM-Based Communities | Ryosuke Takata et.al. | 2411.03252 | null |
2024-11-05 | DiffLM: Controllable Synthetic Data Generation via Diffusion Language Models | Ying Zhou et.al. | 2411.03250 | null |
2024-11-05 | From Pen to Prompt: How Creative Writers Integrate AI into their Writing Practice | Alicia Guo et.al. | 2411.03137 | null |
2024-11-04 | Training-free Regional Prompting for Diffusion Transformers | Anthony Chen et.al. | 2411.02395 | link |
2024-11-04 | Adaptive Length Image Tokenization via Recurrent Allocation | Shivam Duggal et.al. | 2411.02393 | link |
2024-11-04 | Improving Scientific Hypothesis Generation with Knowledge Grounded Large Language Models | Guangzhi Xiong et.al. | 2411.02382 | null |
2024-11-04 | Addressing Uncertainty in LLMs to Enhance Reliability in Generative AI | Ramneet Kaur et.al. | 2411.02381 | null |
2024-11-04 | DeeR-VLA: Dynamic Inference of Multimodal Large Language Models for Efficient Robot Execution | Yang Yue et.al. | 2411.02359 | link |
2024-11-04 | "Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization | Eldar Kurtic et.al. | 2411.02355 | null |
2024-11-04 | Social-RAG: Retrieving from Group Interactions to Socially Ground Proactive AI Generation to Group Preferences | Ruotong Wang et.al. | 2411.02353 | null |
2024-11-04 | Can Large Language Models generalize analogy solving like people can? | Claire E. Stevenson et.al. | 2411.02348 | null |
2024-11-04 | WebRL: Training LLM Web Agents via Self-Evolving Online Curriculum Reinforcement Learning | Zehan Qi et.al. | 2411.02337 | null |
2024-11-04 | Sparsing Law: Towards Large Language Models with Greater Activation Sparsity | Yuqi Luo et.al. | 2411.02335 | null |
2024-11-01 | DELTA: Dense Efficient Long-range 3D Tracking for any video | Tuan Duc Ngo et.al. | 2410.24211 | null |
2024-10-31 | Length-Induced Embedding Collapse in Transformer-based Models | Yuqi Zhou et.al. | 2410.24200 | null |
2024-11-01 | SelfCodeAlign: Self-Alignment for Code Generation | Yuxiang Wei et.al. | 2410.24198 | link |
2024-10-31 | Constraint Back-translation Improves Complex Instruction Following of Large Language Models | Yunjia Qi et.al. | 2410.24175 | null |
2024-10-31 | Thought Space Explorer: Navigating and Expanding Thought Space for Large Language Model Reasoning | Jinghan Zhang et.al. | 2410.24155 | null |
2024-10-31 | Language-Driven Policy Distillation for Cooperative Driving in Multi-Agent Reinforcement Learning | Jiaqi Liu et.al. | 2410.24152 | null |
2024-10-31 | Leveraging Large Language Models for Code Translation and Software Development in Scientific Computing | Akash Dhruv et.al. | 2410.24119 | link |
2024-10-31 | Repository-Level Compositional Code Translation and Validation | Ali Reza Ibrahimzada et.al. | 2410.24117 | null |
2024-10-31 | Matchmaker: Self-Improving Large Language Model Programs for Schema Matching | Nabeel Seedat et.al. | 2410.24105 | null |
2024-10-31 | Desert Camels and Oil Sheikhs: Arab-Centric Red Teaming of Frontier LLMs | Muhammed Saeed et.al. | 2410.24049 | null |
2024-10-30 | EMMA: End-to-End Multimodal Model for Autonomous Driving | Jyh-Jing Hwang et.al. | 2410.23262 | null |
2024-10-30 | Evaluating Cultural and Social Awareness of LLM Web Agents | Haoyi Qiu et.al. | 2410.23252 | null |
2024-10-30 | Carrot and Stick: Eliciting Comparison Data and Beyond | Yiling Chen et.al. | 2410.23243 | null |
2024-10-30 | A little less conversation, a little more action, please: Investigating the physical common-sense of LLMs in a 3D embodied environment | Matteo G. Mecattaf et.al. | 2410.23242 | null |
2024-10-30 | EMOTION: Expressive Motion Sequence Generation for Humanoid Robots with In-Context Learning | Peide Huang et.al. | 2410.23234 | null |
2024-10-31 | Grounding by Trying: LLMs with Reinforcement Learning-Enhanced Retrieval | Sheryl Hsu et.al. | 2410.23214 | null |
2024-10-30 | ProTransformer: Robustify Transformers via Plug-and-Play Paradigm | Zhichao Hou et.al. | 2410.23182 | null |
2024-10-30 | ReasoningRec: Bridging Personalized Recommendations and Human-Interpretable Explanations through LLM Reasoning | Millennium Bismay et.al. | 2410.23180 | link |
2024-10-30 | SciPIP: An LLM-based Scientific Paper Idea Proposer | Wenxiao Wang et.al. | 2410.23166 | null |
2024-10-30 | Real-Time Personalization for LLM-based Recommendation with Customized In-Context Learning | Keqin Bao et.al. | 2410.23136 | link |
2024-10-29 | Enhancing Code Annotation Reliability: Generative AI's Role in Comment Quality Assessment Models | Seetharam Killivalavan et.al. | 2410.22323 | null |
2024-10-29 | Online Detecting LLM-Generated Texts via Sequential Hypothesis Testing by Betting | Can Chen et.al. | 2410.22318 | link |
2024-10-29 | Natural Language Inference Improves Compositionality in Vision-Language Models | Paola Cascante-Bonilla et.al. | 2410.22315 | null |
2024-10-30 | GPT-4o reads the mind in the eyes | James W. A. Strachan et.al. | 2410.22309 | null |
2024-10-29 | SVIP: Towards Verifiable Inference of Open-source Large Language Models | Yifan Sun et.al. | 2410.22307 | null |
2024-10-29 | Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning | Yihe Deng et.al. | 2410.22304 | null |
2024-10-29 | LLMs are Highly-Constrained Biophysical Sequence Optimizers | Angelica Chen et.al. | 2410.22296 | null |
2024-10-29 | Fine-Tuning LLMs for Code Mutation: A New Era of Cyber Threats | Mohammad Setak et.al. | 2410.22293 | null |
2024-10-29 | Embedding-based classifiers can detect prompt injection attacks | Md. Ahsan Ayub et.al. | 2410.22284 | link |
2024-10-29 | Whose ChatGPT? Unveiling Real-World Educational Inequalities Introduced by Large Language Models | Renzhe Yu et.al. | 2410.22282 | null |
2024-10-28 | Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics | Yaniv Nikankin et.al. | 2410.21272 | null |
2024-10-28 | LARP: Tokenizing Videos with a Learned Autoregressive Generative Prior | Hanyu Wang et.al. | 2410.21264 | null |
2024-10-28 | LongReward: Improving Long-context Large Language Models with AI Feedback | Jiajie Zhang et.al. | 2410.21252 | null |
2024-10-28 | Zero-Shot Dense Retrieval with Embeddings from Relevance Feedback | Nour Jedidi et.al. | 2410.21242 | null |
2024-10-28 | Hierarchical Knowledge Graph Construction from Images for Scalable E-Commerce | Zhantao Yang et.al. | 2410.21237 | null |
2024-10-28 | Flaming-hot Initiation with Regular Execution Sampling for Large Language Models | Weizhe Chen et.al. | 2410.21236 | null |
2024-10-28 | LoRA vs Full Fine-tuning: An Illusion of Equivalence | Reece Shuttleworth et.al. | 2410.21228 | null |
2024-10-28 | Lifting the Veil on the Large Language Model Supply Chain: Composition, Risks, and Mitigations | Kaifeng Huang et.al. | 2410.21218 | null |
2024-10-28 | BongLLaMA: LLaMA for Bangla Language | Abdullah Khan Zehady et.al. | 2410.21200 | null |
2024-10-29 | Document Parsing Unveiled: Techniques, Challenges, and Prospects for Structured Information Extraction | Qintong Zhang et.al. | 2410.21169 | null |
2024-10-25 | The Potential and Value of AI Chatbot in Personalized Cognitive Training | Zilong Wang et.al. | 2410.19733 | null |
2024-10-25 | Counting Ability of Large Language Models and Impact of Tokenization | Xiang Zhang et.al. | 2410.19730 | null |
2024-10-25 | FISHNET: Financial Intelligence from Sub-querying, Harmonizing, Neural-Conditioning, Expert Swarms, and Task Planning | Nicole Cho et.al. | 2410.19727 | null |
2024-10-25 | 2D-DPO: Scaling Direct Preference Optimization with 2-Dimensional Supervision | Shilong Li et.al. | 2410.19720 | null |
2024-10-25 | TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning | Xiangyu Zeng et.al. | 2410.19702 | null |
2024-10-25 | IPPON: Common Sense Guided Informative Path Planning for Object Goal Navigation | Kaixian Qu et.al. | 2410.19697 | null |
2024-10-25 | Less is More: Extreme Gradient Boost Rank-1 Adaption for Efficient Finetuning of LLMs | Yifei Zhang et.al. | 2410.19694 | null |
2024-10-25 | APRICOT: Active Preference Learning and Constraint-Aware Task Planning with LLMs | Huaxiaoyue Wang et.al. | 2410.19656 | null |
2024-10-25 | Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina | Yuan Gao et.al. | 2410.19599 | null |
2024-10-25 | Diverse Sign Language Translation | Xin Shen et.al. | 2410.19586 | null |
2024-10-24 | Unbounded: A Generative Infinite Game of Character Life Simulation | Jialu Li et.al. | 2410.18975 | null |
2024-10-24 | Ferret-UI 2: Mastering Universal User Interface Understanding Across Platforms | Zhangheng Li et.al. | 2410.18967 | null |
2024-10-24 | Does Data Contamination Detection Work (Well) for LLMs? A Survey and Evaluation on Detection Assumptions | Yujuan Fu et.al. | 2410.18966 | null |
2024-10-24 | OSCAR: Operating System Control via State-Aware Reasoning and Re-Planning | Xiaoqiang Wang et.al. | 2410.18963 | null |
2024-10-24 | Bridge-Coder: Unlocking LLMs' Potential to Overcome Language Gaps in Low-Resource Code | Jipeng Zhang et.al. | 2410.18957 | null |
2024-10-24 | BioMistral-NLU: Towards More Generalizable Medical Language Understanding through Instruction Tuning | Yujuan Velvin Fu et.al. | 2410.18955 | null |
2024-10-24 | Dynamic Vocabulary Pruning in Early-Exit LLMs | Jort Vincenti et.al. | 2410.18952 | link |
2024-10-24 | SafeBench: A Safety Evaluation Framework for Multimodal Large Language Models | Zonghao Ying et.al. | 2410.18927 | null |
2024-10-24 | From Blind Solvers to Logical Thinkers: Benchmarking LLMs' Logical Integrity on Faulty Mathematical Problems | A M Muntasir Rahman et.al. | 2410.18921 | null |
2024-10-25 | A Survey on Speech Large Language Models | Jing Peng et.al. | 2410.18908 | null |
2024-10-23 | TP-Eval: Tap Multimodal LLMs' Potential in Evaluation by Customizing Prompts | Yuxuan Xie et.al. | 2410.18071 | null |
2024-10-23 | LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering | Qingfei Zhao et.al. | 2410.18050 | link |
2024-10-23 | Key Algorithms for Keyphrase Generation: Instruction-Based LLMs for Russian Scientific Keyphrases | Anna Glazkova et.al. | 2410.18040 | null |
2024-10-23 | MiLoRA: Efficient Mixture of Low-Rank Adaptation for Large Language Models Fine-tuning | Jingfan Zhang et.al. | 2410.18035 | null |
2024-10-23 | GraphTeam: Facilitating Large Language Model-based Graph Analysis via Multi-Agent Collaboration | Xin Li et.al. | 2410.18032 | link |
2024-10-23 | MiniFed : Integrating LLM-based Agentic-Workflow for Simulating FOMC Meeting | Sungil Seok et.al. | 2410.18012 | null |
2024-10-23 | ExpertFlow: Optimized Expert Activation and Token Allocation for Efficient Mixture-of-Experts Inference | Xin He et.al. | 2410.17954 | null |
2024-10-23 | SimRAG: Self-Improving Retrieval-Augmented Generation for Adapting Large Language Models to Specialized Domains | Ran Xu et.al. | 2410.17952 | null |
2024-10-23 | Benchmarking Floworks against OpenAI & Anthropic: A Novel Framework for Enhanced LLM Function Calling | Nirav Bhan et.al. | 2410.17950 | null |
2024-10-23 | Guide for Defense (G4D): Dynamic Guidance for Robust and Balanced Defense in Large Language Models | He Cao et.al. | 2410.17922 | null |
2024-10-22 | Large Language Models Empowered Personalized Web Agents | Hongru Cai et.al. | 2410.17236 | null |
2024-10-22 | Automated Spinal MRI Labelling from Reports Using a Large Language Model | Robin Y. Park et.al. | 2410.17235 | link |
2024-10-22 | Fine-Tuning Large Language Models to Appropriately Abstain with Semantic Entropy | Benedict Aaron Tjandra et.al. | 2410.17234 | null |
2024-10-22 | Few-shot In-Context Preference Learning Using Large Language Models | Chao Yu et.al. | 2410.17233 | null |
2024-10-22 | Context-aware Prompt Tuning: Advancing In-Context Learning with Adversarial Methods | Tsachi Blau et.al. | 2410.17222 | null |
2024-10-22 | Exploring Possibilities of AI-Powered Legal Assistance in Bangladesh through Large Language Modeling | Azmine Toushik Wasi et.al. | 2410.17210 | link |
2024-10-22 | VoiceBench: Benchmarking LLM-Based Voice Assistants | Yiming Chen et.al. | 2410.17196 | link |
2024-10-23 | Non-myopic Generation of Language Model for Reasoning and Planning | Chang Ma et.al. | 2410.17195 | null |
2024-10-22 | From Attention to Activation: Unravelling the Enigmas of Large Language Models | Prannay Kaul et.al. | 2410.17174 | null |
2024-10-22 | Improving Pinterest Search Relevance Using Large Language Models | Han Wang et.al. | 2410.17152 | null |
2024-10-21 | Reflection-Bench: probing AI intelligence with reflection | Lingyu Li et.al. | 2410.16270 | link |
2024-10-22 | Mini-InternVL: A Flexible-Transfer Pocket Multimodal Model with 5% Parameters and 90% Performance | Zhangwei Gao et.al. | 2410.16261 | link |
2024-10-21 | Elucidating the design space of language models for image generation | Xuantong Liu et.al. | 2410.16257 | null |
2024-10-21 | CompassJudger-1: All-in-one Judge Model Helps Model Evaluation and Evolution | Maosong Cao et.al. | 2410.16256 | link |
2024-10-21 | Can Knowledge Editing Really Correct Hallucinations? | Baixiang Huang et.al. | 2410.16251 | link |
2024-10-21 | Analyzing Context Contributions in LLM-based Machine Translation | Emmanouil Zaranis et.al. | 2410.16246 | null |
2024-10-21 | MoRE: Multi-Modal Contrastive Pre-training with Transformers on X-Rays, ECGs, and Diagnostic Report | Samrajya Thapa et.al. | 2410.16239 | link |
2024-10-21 | IBGP: Imperfect Byzantine Generals Problem for Zero-Shot Robustness in Communicative Multi-Agent Systems | Yihuan Mao et.al. | 2410.16237 | null |
2024-10-21 | LLaVA-KD: A Framework of Distilling Multimodal Large Language Models | Yuxuan Cai et.al. | 2410.16236 | null |
2024-10-21 | ToW: Thoughts of Words Improve Reasoning in Large Language Models | Zhikun Xu et.al. | 2410.16235 | null |
2024-10-18 | Are AI Detectors Good Enough? A Survey on Quality of Datasets With Machine-Generated Texts | German Gritsai et.al. | 2410.14677 | null |
2024-10-18 | SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment | Qin Liu et.al. | 2410.14676 | null |
2024-10-18 | Enhancing Large Language Models' Situated Faithfulness to External Contexts | Yukun Huang et.al. | 2410.14675 | link |
2024-10-18 | Decomposing The Dark Matter of Sparse Autoencoders | Joshua Engels et.al. | 2410.14670 | link |
2024-10-18 | MiCEval: Unveiling Multimodal Chain of Thought's Quality via Image Description and Reasoning Steps | Xiongtao Zhou et.al. | 2410.14668 | link |
2024-10-18 | A Large Language Model-Driven Reward Design Framework via Dynamic Feedback for Reinforcement Learning | Shengjie Sun et.al. | 2410.14660 | null |
2024-10-18 | EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search | Oliver Sieberling et.al. | 2410.14649 | null |
2024-10-18 | Distance between Relevant Information Pieces Causes Bias in Long-Context LLMs | Runchu Tian et.al. | 2410.14641 | link |
2024-10-18 | GenEOL: Harnessing the Generative Power of LLMs for Training-Free Sentence Embeddings | Raghuveer Thirukovalluru et.al. | 2410.14635 | null |
2024-10-18 | DiSCo Meets LLMs: A Unified Approach for Sparse Retrieval and Contextual Distillation in Conversational Search | Simon Lupart et.al. | 2410.14609 | null |
2024-10-17 | Fluid: Scaling Autoregressive Text-to-image Generative Models with Continuous Tokens | Lijie Fan et.al. | 2410.13863 | null |
2024-10-17 | PUMA: Empowering Unified MLLM with Multi-granular Visual Generation | Rongyao Fang et.al. | 2410.13861 | link |
2024-10-17 | | Yaxin Luo et.al. | 2410.13859 | null |
2024-10-17 | How Numerical Precision Affects Mathematical Reasoning Capabilities of LLMs | Guhao Feng et.al. | 2410.13857 | null |
2024-10-17 | Can MLLMs Understand the Deep Implication Behind Chinese Images? | Chenhao Zhang et.al. | 2410.13854 | link |
2024-10-17 | Retrospective Learning from Interactions | Zizhao Chen et.al. | 2410.13852 | null |
2024-10-17 | SimLayerKV: A Simple Framework for Layer-Level KV Cache Reduction | Xuan Zhang et.al. | 2410.13846 | link |
2024-10-17 | Active-Dormant Attention Heads: Mechanistically Demystifying Extreme-Token Phenomena in LLMs | Tianyu Guo et.al. | 2410.13835 | null |
2024-10-17 | AgentOccam: A Simple Yet Strong Baseline for LLM-Based Web Agents | Ke Yang et.al. | 2410.13825 | null |
2024-10-18 | Harnessing Webpage UIs for Text-Rich Visual Understanding | Junpeng Liu et.al. | 2410.13824 | null |
2024-10-16 | Meta-Chunking: Learning Efficient Text Segmentation via Logical Perception | Jihao Zhao et.al. | 2410.12788 | null |
2024-10-16 | In-Context Learning Enables Robot Action Prediction in LLMs | Yida Yin et.al. | 2410.12782 | null |
2024-10-16 | Long-LRM: Long-sequence Large Reconstruction Model for Wide-coverage Gaussian Splats | Chen Ziwen et.al. | 2410.12781 | null |
2024-10-16 | Identifying Task Groupings for Multi-Task Learning Using Pointwise V-Usable Information | Yingya Li et.al. | 2410.12774 | null |
2024-10-16 | StyleDistance: Stronger Content-Independent Style Embeddings with Synthetic Parallel Examples | Ajay Patel et.al. | 2410.12757 | null |
2024-10-17 | CREAM: Consistency Regularized Self-Rewarding Language Models | Zhaoyang Wang et.al. | 2410.12735 | null |
2024-10-16 | FusionLLM: A Decentralized LLM Training System on Geo-distributed GPUs with Adaptive Compression | Zhenheng Tang et.al. | 2410.12707 | null |
2024-10-16 | Embedding an Ethical Mind: Aligning Text-to-Image Synthesis via Lightweight Value Optimization | Xingqi Wang et.al. | 2410.12700 | link |
2024-10-17 | Automatic Mapping of Anatomical Landmarks from Free-Text Using Large Language Models: Insights from Llama-2 | Mohamad Abdi et.al. | 2410.12686 | null |
2024-10-16 | Evaluating Morphological Compositional Generalization in Large Language Models | Mete Ismayilzada et.al. | 2410.12656 | null |
2024-10-15 | GaVaMoE: Gaussian-Variational Gated Mixture of Experts for Explainable Recommendation | Fei Tang et.al. | 2410.11841 | null |
2024-10-15 | MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding | Yue Cao et.al. | 2410.11829 | link |
2024-10-15 | SGEdit: Bridging LLM with Text2Image Generative Model for Scene Graph-based Image Editing | Zhiyuan Zhang et.al. | 2410.11815 | null |
2024-10-15 | NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models | Han Han et.al. | 2410.11805 | null |
2024-10-15 | FoundTS: Comprehensive and Unified Benchmarking of Foundation Models for Time Series Forecasting | Zhe Li et.al. | 2410.11802 | null |
2024-10-15 | Selection-p: Self-Supervised Task-Agnostic Prompt Compression for Faithfulness and Transferability | Tsz Ting Chung et.al. | 2410.11786 | null |
2024-10-15 | G-Designer: Architecting Multi-agent Communication Topologies via Graph Neural Networks | Guibin Zhang et.al. | 2410.11782 | null |
2024-10-15 | Language Models Encode Numbers Using Digit Representations in Base 10 | Amit Arnold Levy et.al. | 2410.11781 | null |
2024-10-15 | MLLM can see? Dynamic Correction Decoding for Hallucination Mitigation | Chenxi Wang et.al. | 2410.11779 | link |
2024-10-15 | Layer-wise Importance Matters: Less Memory for Better Performance in Parameter-efficient Fine-tuning of Large Language Models | Kai Yao et.al. | 2410.11772 | link |
2024-10-14 | DuoAttention: Efficient Long-Context LLM Inference with Retrieval and Streaming Heads | Guangxuan Xiao et.al. | 2410.10819 | link |
2024-10-14 | Your Mixture-of-Experts LLM Is Secretly an Embedding Model For Free | Ziyue Li et.al. | 2410.10814 | null |
2024-10-14 | LongMemEval: Benchmarking Chat Assistants on Long-Term Interactive Memory | Di Wu et.al. | 2410.10813 | link |
2024-10-14 | Mix Data or Merge Models? Optimizing for Diverse Multi-Task Learning | Aakanksha et.al. | 2410.10801 | null |
2024-10-15 | MMAR: Towards Lossless Multi-Modal Auto-Regressive Probabilistic Modeling | Jian Yang et.al. | 2410.10798 | null |
2024-10-14 | Context-Parametric Inversion: Why Instruction Finetuning May Not Actually Improve Context Reliance | Sachin Goyal et.al. | 2410.10796 | link |
2024-10-14 | Focused ReAct: Improving ReAct through Reiterate and Early Stop | Shuoqiu Li et.al. | 2410.10779 | null |
2024-10-14 | AFlow: Automating Agentic Workflow Generation | Jiayi Zhang et.al. | 2410.10762 | link |
2024-10-14 | Denial-of-Service Poisoning Attacks against Large Language Models | Kuofeng Gao et.al. | 2410.10760 | link |
2024-10-14 | SplitLLM: Collaborative Inference of LLMs for Model Placement and Throughput Optimization | Akrit Mudvari et.al. | 2410.10759 | null |
2024-10-11 | AttnGCG: Enhancing Jailbreaking Attacks on LLMs with Attention Manipulation | Zijun Wang et.al. | 2410.09040 | link |
2024-10-11 | Semi-Supervised Learning of Noisy Mixture of Experts Models | Oh-Ran Kwon et.al. | 2410.09039 | null |
2024-10-11 | SimpleStrat: Diversifying Language Model Generation with Stratification | Justin Wong et.al. | 2410.09038 | null |
2024-10-11 | Mentor-KD: Making Small Language Models Better Multi-step Reasoners | Hojae Lee et.al. | 2410.09037 | link |
2024-10-11 | PEAR: A Robust and Flexible Automation Framework for Ptychography Enabled by Multiple Large Language Model Agents | Xiangyu Yin et.al. | 2410.09034 | null |
2024-10-11 | The Impact of Visual Information in Chinese Characters: Evaluating Large Models' Ability to Recognize and Utilize Radicals | Xiaofeng Wu et.al. | 2410.09013 | null |
2024-10-11 | Software Engineering and Foundation Models: Insights from Industry Blogs Using a Jury of Foundation Models | Hao Li et.al. | 2410.09012 | null |
2024-10-11 | SuperCorrect: Supervising and Correcting Language Models with Error-Driven Insights | Ling Yang et.al. | 2410.09008 | link |
2024-10-11 | From Interaction to Impact: Towards Safer AI Agents Through Understanding and Evaluating UI Operation Impacts | Zhuohao Jerry Zhang et.al. | 2410.09006 | null |
2024-10-11 | Hypothesis-only Biases in Large Language Model-Elicited Natural Language Inference | Grace Proebsting et.al. | 2410.08996 | null |
2024-10-10 | Mono-InternVL: Pushing the Boundaries of Monolithic Multimodal Large Language Models with Endogenous Visual Pre-training | Gen Luo et.al. | 2410.08202 | null |
2024-10-10 | From Exploration to Mastery: Enabling LLMs to Master Tools via Self-Driven Interactions | Changle Qu et.al. | 2410.08197 | link |
2024-10-10 | MathCoder2: Better Math Reasoning from Continued Pretraining on Model-translated Mathematical Code | Zimu Lu et.al. | 2410.08196 | link |
2024-10-10 | GenARM: Reward Guided Generation with Autoregressive Reward Model for Test-time Alignment | Yuancheng Xu et.al. | 2410.08193 | null |
2024-10-10 | Sample then Identify: A General Framework for Risk Control and Assessment in Multimodal Large Language Models | Qingni Wang et.al. | 2410.08174 | null |
2024-10-10 | On the Evaluation of Generative Robotic Simulations | Feng Chen et.al. | 2410.08172 | null |
2024-10-10 | Agent S: An Open Agentic Framework that Uses Computers Like a Human | Saaket Agashe et.al. | 2410.08164 | link |
2024-10-10 | Rewarding Progress: Scaling Automated Process Verifiers for LLM Reasoning | Amrith Setlur et.al. | 2410.08146 | null |
2024-10-10 | Insight Over Sight? Exploring the Vision-Knowledge Conflicts in Multimodal LLMs | Xiaoyuan Liu et.al. | 2410.08145 | null |
2024-10-10 | DelTA: An Online Document-Level Translation Agent Based on Multi-Level Memory | Yutong Wang et.al. | 2410.08143 | link |
2024-10-09 | Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models | Fei Wang et.al. | 2410.07176 | null |
2024-10-09 | Do better language models have crisper vision? | Jona Ruthardt et.al. | 2410.07173 | null |
2024-10-09 | Deciphering Cross-Modal Alignment in Large Vision-Language Models with Modality Integration Rate | Qidong Huang et.al. | 2410.07167 | link |
2024-10-09 | Embodied Agent Interface: Benchmarking LLMs for Embodied Decision Making | Manling Li et.al. | 2410.07166 | link |
2024-10-09 | Simplicity Prevails: Rethinking Negative Preference Optimization for LLM Unlearning | Chongyu Fan et.al. | 2410.07163 | null |
2024-10-09 | Trans4D: Realistic Geometry-Aware Transition for Compositional Text-to-4D Synthesis | Bohan Zeng et.al. | 2410.07155 | link |
2024-10-09 | Stuffed Mamba: State Collapse and State Capacity of RNN-Based Long-Context Modeling | Yingfa Chen et.al. | 2410.07145 | null |
2024-10-09 | Mental Disorders Detection in the Era of Large Language Models | Gleb Kuzmin et.al. | 2410.07129 | null |
2024-10-09 | Personalized Visual Instruction Tuning | Renjie Pi et.al. | 2410.07113 | null |
2024-10-09 | I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy | Gian Maria Campedelli et.al. | 2410.07109 | null |
2024-10-07 | Data Advisor: Dynamic Data Curation for Safety Alignment of Large Language Models | Fei Wang et.al. | 2410.05269 | null |
2024-10-07 | PrefixQuant: Static Quantization Beats Dynamic through Prefixed Outliers in LLMs | Mengzhao Chen et.al. | 2410.05265 | link |
2024-10-07 | TurtleBench: Evaluating Top Language Models via Real-World Yes/No Puzzles | Qingchen Yu et.al. | 2410.05262 | link |
2024-10-07 | Differential Transformer | Tianzhu Ye et.al. | 2410.05258 | null |
2024-10-07 | GLEE: A Unified Framework and Benchmark for Language-based Economic Environments | Eilam Shapira et.al. | 2410.05254 | link |
2024-10-07 | Causal Micro-Narratives | Mourad Heddaya et.al. | 2410.05252 | null |
2024-10-07 | SFTMix: Elevating Language Model Instruction Tuning with Mixup Recipe | Yuxin Xiao et.al. | 2410.05248 | null |
2024-10-07 | Navigating the Digital World as Humans Do: Universal Visual Grounding for GUI Agents | Boyu Gou et.al. | 2410.05243 | null |
2024-10-07 | GSM-Symbolic: Understanding the Limitations of Mathematical Reasoning in Large Language Models | Iman Mirzadeh et.al. | 2410.05229 | null |
2024-10-07 | Cookbook: A framework for improving LLM generative abilities via programmatic data generating templates | Avanika Narayan et.al. | 2410.05224 | null |
2024-10-04 | Enhance Reasoning by Learning from Mistakes: Peer-Review Knowledge Distillation from Multiple Large Language Models | Zhuochun Li et.al. | 2410.03663 | null |
2024-10-04 | RAFT: Realistic Attacks to Fool Text Detectors | James Wang et.al. | 2410.03658 | null |
2024-10-04 | Aligning LLMs with Individual Preferences via Interaction | Shujin Wu et.al. | 2410.03642 | link |
2024-10-04 | Large Language Model Performance Benchmarking on Mobile Platforms: A Thorough Evaluation | Jie Xiao et.al. | 2410.03613 | null |
2024-10-04 | TICKing All the Boxes: Generated Checklists Improve LLM Evaluation and Generation | Jonathan Cook et.al. | 2410.03608 | null |
2024-10-04 | Efficiently Identifying Watermarked Segments in Mixed-Source Texts | Xuandong Zhao et.al. | 2410.03600 | null |
2024-10-04 | Understanding Reasoning in Chain-of-Thought from the Hopfieldian View | Lijie Hu et.al. | 2410.03595 | null |
2024-10-04 | Look Twice Before You Answer: Memory-Space Visual Retracing for Hallucination Mitigation in Multimodal Large Language Models | Xin Zou et.al. | 2410.03577 | null |
2024-10-04 | Towards Linguistically-Aware and Language-Independent Tokenization for Large Language Models (LLMs) | Abrar Rahman et.al. | 2410.03568 | null |
2024-10-04 | Structure-Enhanced Protein Instruction Tuning: Towards General-Purpose Protein Understanding | Wei Wu et.al. | 2410.03553 | null |
2024-10-03 | FakeShield: Explainable Image Forgery Detection and Localization via Multi-modal Large Language Models | Zhipei Xu et.al. | 2410.02761 | null |
2024-10-03 | Loong: Generating Minute-level Long Videos with Autoregressive Language Models | Yuqing Wang et.al. | 2410.02757 | null |
2024-10-03 | SIEVE: General Purpose Data Filtering System Matching GPT-4o Accuracy at 1% the Cost | Jifan Zhang et.al. | 2410.02755 | null |
2024-10-03 | Training Language Models on Synthetic Edit Sequences Improves Code Synthesis | Ulyana Piterbarg et.al. | 2410.02749 | null |
2024-10-03 | CriSPO: Multi-Aspect Critique-Suggestion-guided Automatic Prompt Optimization for Text Generation | Han He et.al. | 2410.02748 | null |
2024-10-03 | Contrastive Localized Language-Image Pre-Training | Hong-You Chen et.al. | 2410.02746 | null |
2024-10-03 | Neutral residues: revisiting adapters for model extension | Franck Signe Talla et.al. | 2410.02744 | null |
2024-10-03 | MA-RLHF: Reinforcement Learning from Human Feedback with Macro Actions | Yekun Chai et.al. | 2410.02743 | null |
2024-10-03 | Grounding Large Language Models In Embodied Environment With Imperfect World Models | Haolan Liu et.al. | 2410.02742 | null |
2024-10-03 | Salient Information Prompting to Steer Content in Prompt-based Abstractive Summarization | Lei Xu et.al. | 2410.02741 | null |
2024-10-02 | Locret: Enhancing Eviction in Long-Context LLM Inference with Trained Retaining Heads | Yuxiang Huang et.al. | 2410.01805 | link |
2024-10-02 | Efficient | Alex W. Neal Riasanovsky et.al. | 2410.01799 | null |
2024-10-02 | Knowledge-Driven Feature Selection and Engineering for Genotype Data with Large Language Models | Joseph Lee et.al. | 2410.01795 | link |
2024-10-02 | When a language model is optimized for reasoning, does it still show embers of autoregression? An analysis of OpenAI o1 | R. Thomas McCoy et.al. | 2410.01792 | null |
2024-10-02 | Investigating on RLHF methodology | Alexey Kutalev et.al. | 2410.01789 | null |
2024-10-02 | OmniGenBench: Automating Large-scale in-silico Benchmarking for Genomic Foundation Models | Heng Yang et.al. | 2410.01784 | link |
2024-10-02 | Open-RAG: Enhanced Retrieval-Augmented Reasoning with Open-Source Large Language Models | Shayekh Bin Islam et.al. | 2410.01782 | null |
2024-10-02 | Trained Transformer Classifiers Generalize and Exhibit Benign Overfitting In-Context | Spencer Frei et.al. | 2410.01774 | null |
2024-10-03 | Quantifying Generalization Complexity for Large Language Models | Zhenting Qi et.al. | 2410.01769 | null |
2024-10-03 | Leopard: A Vision Language Model For Text-Rich Multi-Image Tasks | Mengzhao Jia et.al. | 2410.01744 | null |
2024-09-30 | MM1.5: Methods, Analysis & Insights from Multimodal LLM Fine-tuning | Haotian Zhang et.al. | 2409.20566 | null |
2024-09-30 | Propose, Assess, Search: Harnessing LLMs for Goal-Oriented Planning in Instructional Videos | Md Mohaiminul Islam et.al. | 2409.20557 | null |
2024-09-30 | LLM Hallucinations in Practical Code Generation: Phenomena, Mechanism, and Mitigation | Ziyao Zhang et.al. | 2409.20550 | null |
2024-09-30 | Robi Butler: Remote Multimodal Interactions with Household Robot Assistant | Anxing Xiao et.al. | 2409.20548 | null |
2024-09-30 | Uncertainty-Informed Screening for Safer Solvents Used in the Synthesis of Perovskite via Language Models | Arpan Mukherjee et.al. | 2409.20512 | null |
2024-09-30 | COLLAGE: Collaborative Human-Agent Interaction Generation using Hierarchical Latent Diffusion and Language Models | Divyanshu Daiya et.al. | 2409.20502 | null |
2024-10-01 | Instance-adaptive Zero-shot Chain-of-Thought Prompting | Xiaosong Yuan et.al. | 2409.20441 | null |
2024-09-30 | Wait, but Tylenol is Acetaminophen... Investigating and Improving Language Models' Ability to Resist Requests for Misinformation | Shan Chen et.al. | 2409.20385 | null |
2024-09-30 | The Perfect Blend: Redefining RLHF with Mixture of Judges | Tengyu Xu et.al. | 2409.20370 | null |
2024-09-30 | VideoINSTA: Zero-shot Long Video Understanding via Informative Spatial-Temporal Reasoning with LLMs | Ruotong Liao et.al. | 2409.20365 | null |
2024-09-27 | LML: Language Model Learning a Dataset for Data-Augmented Prediction | Praneeth Vadlapati et.al. | 2409.18957 | link |
2024-09-27 | Ruler: A Model-Agnostic Method to Control Generated Length for Large Language Models | Jiaming Li et.al. | 2409.18943 | link |
2024-09-27 | From Seconds to Hours: Reviewing MultiModal Large Language Models on Comprehensive Long Video Understanding | Heqing Zou et.al. | 2409.18938 | null |
2024-09-27 | AIPatient: Simulating Patients with EHRs and LLM Powered Agentic Workflow | Huizi Yu et.al. | 2409.18924 | null |
2024-09-27 | Soft Measures for Extracting Causal Collective Intelligence | Maryam Berijanian et.al. | 2409.18911 | link |
2024-09-27 | IDGen: Item Discrimination Induced Prompt Generation for LLM Evaluation | Fan Lin et.al. | 2409.18892 | null |
2024-09-27 | Predicting and analyzing memorization within fine-tuned Large Language Models | Jérémie Dentan et.al. | 2409.18858 | null |
2024-09-27 | Mitigating Selection Bias with Node Pruning and Auxiliary Options | Hyeong Kyu Choi et.al. | 2409.18857 | null |
2024-09-27 | LLMs4Synthesis: Leveraging Large Language Models for Scientific Synthesis | Hamed Babaei Giglou et.al. | 2409.18812 | null |
2024-09-27 | Open-Nav: Exploring Zero-Shot Vision-and-Language Navigation in Continuous Environment with Open-Source LLMs | Yanyuan Qiao et.al. | 2409.18794 | null |
2024-09-26 | EgoLM: Multi-Modal Language Model of Egocentric Motions | Fangzhou Hong et.al. | 2409.18127 | null |
2024-09-26 | Multi-View and Multi-Scale Alignment for Contrastive Language-Image Pre-training in Mammography | Yuexi Du et.al. | 2409.18119 | null |
2024-09-26 | E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding | Ye Liu et.al. | 2409.18111 | link |
2024-09-26 | Infering Alt-text For UI Icons With Large Language Models During App Development | Sabrina Haque et.al. | 2409.18060 | null |
2024-09-26 | DualAD: Dual-Layer Planning for Reasoning in Autonomous Driving | Dingrui Wang et.al. | 2409.18053 | null |
2024-09-26 | EMOVA: Empowering Language Models to See, Hear and Speak with Vivid Emotions | Kai Chen et.al. | 2409.18042 | null |
2024-09-26 | Compositional Hardness of Code in Large Language Models -- A Probabilistic Perspective | Yotam Wolf et.al. | 2409.18028 | null |
2024-09-26 | An Adversarial Perspective on Machine Unlearning for AI Safety | Jakub Łucki et.al. | 2409.18025 | null |
2024-09-26 | DARE: Diverse Visual Question Answering with Robustness Evaluation | Hannah Sterz et.al. | 2409.18023 | null |
2024-09-26 | Role-RL: Online Long-Context Processing with Role Reinforcement Learning for Distinct LLMs in Their Optimal Roles | Lewei He et.al. | 2409.18014 | null |
2024-09-25 | Attention Prompting on Image for Large Vision-Language Models | Runpeng Yu et.al. | 2409.17143 | link |
2024-09-25 | FineZip: Pushing the Limits of Large Language Models for Practical Lossless Text Compression | Fazal Mittu et.al. | 2409.17141 | link |
2024-09-25 | Turn Every Application into an Agent: Towards Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents | Junting Lu et.al. | 2409.17140 | null |
2024-09-25 | Programming Every Example: Lifting Pre-training Data Quality like Experts at Scale | Fan Zhou et.al. | 2409.17115 | link |
2024-09-25 | Accumulator-Aware Post-Training Quantization | Ian Colbert et.al. | 2409.17092 | null |
2024-09-25 | VPTQ: Extreme Low-bit Vector Post-Training Quantization for Large Language Models | Yifei Liu et.al. | 2409.17066 | link |
2024-09-25 | Using LLM for Real-Time Transcription and Summarization of Doctor-Patient Interactions into ePuskesmas in Indonesia | Azmul Asmar Irfan et.al. | 2409.17054 | null |
2024-09-25 | How to Connect Speech Foundation Models and Large Language Models? What Matters and What Does Not | Francesco Verdini et.al. | 2409.17044 | null |
2024-09-25 | Counterfactual Token Generation in Large Language Models | Ivi Chatzi et.al. | 2409.17027 | null |
2024-09-25 | LLM-CARD: Towards a Description and Landscape of Large Language Models | Shengwei Tian et.al. | 2409.17011 | null |
2024-09-24 | LLM Echo Chamber: personalized and automated disinformation | Tony Ma et.al. | 2409.16241 | link |
2024-09-24 | Towards Enhancing Linked Data Retrieval in Conversational UIs using Large Language Models | Omar Mussa et.al. | 2409.16220 | null |
2024-09-24 | LLMCount: Enhancing Stationary mmWave Detection with Multimodal-LLM | Boyan Li et.al. | 2409.16209 | null |
2024-09-25 | CJEval: A Benchmark for Assessing Large Language Models Using Chinese Junior High School Exam Data | Qian-Wen Zhang et.al. | 2409.16202 | link |
2024-09-24 | HelloBench: Evaluating Long Text Generation Capabilities of Large Language Models | Haoran Que et.al. | 2409.16191 | link |
2024-09-24 | Cyber Knowledge Completion Using Large Language Models | Braden K Webb et.al. | 2409.16176 | null |
2024-09-24 | Merging LoRAs like Playing LEGO: Pushing the Modularity of LoRA to Extremes Through Rank-Wise Clustering | Ziyu Zhao et.al. | 2409.16167 | null |
2024-09-24 | Controlling Risk of Retrieval-augmented Generation: A Counterfactual Prompting Framework | Lu Chen et.al. | 2409.16146 | null |
2024-09-24 | MOSS: Enabling Code-Driven Evolution and Context Management for AI Agents | Ming Zhu et.al. | 2409.16120 | link |
2024-09-24 | Exploring Hint Generation Approaches in Open-Domain Question Answering | Jamshid Mozafari et.al. | 2409.16096 | link |
2024-09-20 | Gender Representation and Bias in Indian Civil Service Mock Interviews | Somonnoy Banerjee et.al. | 2409.12194 | null |
2024-09-18 | To CoT or not to CoT? Chain-of-thought helps mainly on math and symbolic reasoning | Zayne Sprague et.al. | 2409.12183 | null |
2024-09-18 | Finetuning Language Models to Emit Linguistic Expressions of Uncertainty | Arslan Chaudhry et.al. | 2409.12180 | null |
2024-09-18 | Decoding Style: Efficient Fine-Tuning of LLMs for Image-Guided Outfit Recommendation with Preference | Najmeh Forouzandehmehr et.al. | 2409.12150 | null |
2024-09-18 | MAgICoRe: Multi-Agent, Iterative, Coarse-to-Fine Refinement for Reasoning | Justin Chih-Yao Chen et.al. | 2409.12147 | link |
2024-09-18 | MoRAG -- Multi-Fusion Retrieval Augmented Generation for Human Motion | Kalakonda Sai Shashank et.al. | 2409.12140 | null |
2024-09-24 | Takin: A Cohort of Superior Quality Zero-shot Speech Generation Models | Sijing Chen et.al. | 2409.12139 | null |
2024-09-18 | Qwen2.5-Math Technical Report: Toward Mathematical Expert Model via Self-Improvement | An Yang et.al. | 2409.12122 | null |
2024-09-18 | Low Frame-rate Speech Codec: a Codec Designed for Fast High-quality Speech LLM Training and Inference | Edresson Casanova et.al. | 2409.12117 | null |
2024-09-18 | Measuring Human and AI Values based on Generative Psychometrics with Large Language Models | Haoran Ye et.al. | 2409.12106 | link |
2024-09-17 | AraDiCE: Benchmarks for Dialectal and Cultural Capabilities in LLMs | Basel Mousi et.al. | 2409.11404 | null |
2024-09-17 | NVLM: Open Frontier-Class Multimodal LLMs | Wenliang Dai et.al. | 2409.11402 | null |
2024-09-17 | Says Who? Effective Zero-Shot Annotation of Focalization | Rebecca M. M. Hicke et.al. | 2409.11390 | null |
2024-09-17 | Diversify and Conquer: Diversity-Centric Data Selection with Iterative Refinement | Simon Yu et.al. | 2409.11378 | null |
2024-09-17 | Towards Time Series Reasoning with LLMs | Winnie Chow et.al. | 2409.11376 | null |
2024-09-17 | Multi-OCT-SelfNet: Integrating Self-Supervised Learning with Multi-Source Data Fusion for Enhanced Multi-Class Retinal Disease Classification | Fatema-E- Jannat et.al. | 2409.11375 | null |
2024-09-17 | CoCA: Regaining Safety-awareness of Multimodal Large Language Models with Constitutional Calibration | Jiahui Gao et.al. | 2409.11365 | null |
2024-09-17 | AI Suggestions Homogenize Writing Toward Western Styles and Diminish Cultural Nuances | Dhruv Agarwal et.al. | 2409.11360 | null |
2024-09-17 | THaMES: An End-to-End Tool for Hallucination Mitigation and Evaluation in Large Language Models | Mengfei Liang et.al. | 2409.11353 | null |
2024-09-17 | Leveraging Distillation Techniques for Document Understanding: A Case Study with FLAN-T5 | Marcel Lamott et.al. | 2409.11282 | null |
2024-09-16 | RetrievalAttention: Accelerating Long-Context LLM Inference via Vector Retrieval | Di Liu et.al. | 2409.10516 | null |
2024-09-16 | Context-aware Code Segmentation for C-to-Rust Translation using Large Language Models | Momoko Shiraishi et.al. | 2409.10506 | null |
2024-09-16 | DILA: Dictionary Label Attention for Mechanistic Interpretability in High-dimensional Multi-label Medical Coding Prediction | John Wu et.al. | 2409.10504 | null |
2024-09-16 | Causal Language Modeling Can Elicit Search and Reasoning Capabilities on Logic Puzzles | Kulin Shah et.al. | 2409.10502 | null |
2024-09-16 | Code Vulnerability Detection: A Comparative Analysis of Emerging Large Language Models | Shaznin Sultana et.al. | 2409.10490 | null |
2024-09-16 | XLM for Autonomous Driving Systems: A Comprehensive Review | Sonda Fourati et.al. | 2409.10484 | null |
2024-09-17 | Schrodinger's Memory: Large Language Models | Wei Wang et.al. | 2409.10482 | null |
2024-09-16 | LLM as BT-Planner: Leveraging LLMs for Behavior Tree Generation in Robot Task Planning | Jicong Ao et.al. | 2409.10444 | null |
2024-09-16 | A Large-Scale Privacy Assessment of Android Third-Party SDKs | Mark Huasong Meng et.al. | 2409.10411 | null |
2024-09-17 | Learnings from a Large-Scale Deployment of an LLM-Powered Expert-in-the-Loop Healthcare Chatbot | Bhuvan Sachdeva et.al. | 2409.10354 | null |
2024-09-13 | Agents in Software Engineering: Survey, Landscape, and Vision | Yanxian Huang et.al. | 2409.09030 | link |
2024-09-13 | Contri(e)ve: Context + Retrieve for Scholarly Question Answering | Kanchan Shivashankar et.al. | 2409.09010 | null |
2024-09-13 | Safeguarding Decentralized Social Media: LLM Agents for Automating Community Rule Compliance | Lucio La Cava et.al. | 2409.08963 | null |
2024-09-13 | Emerging Reliance Behaviors in Human-AI Text Generation: Hallucinations, Data Quality Assessment, and Cognitive Forcing Functions | Zahra Ashktorab et.al. | 2409.08937 | null |
2024-09-13 | SynSUM -- Synthetic Benchmark with Structured and Unstructured Medical Records | Paloma Rabaey et.al. | 2409.08936 | link |
2024-09-13 | LLM-based Weak Supervision Framework for Query Intent Classification in Video Search | Farnoosh Javadi et.al. | 2409.08931 | null |
2024-09-13 | AnyBipe: An End-to-End Framework for Training and Deploying Bipedal Robots Guided by Large Language Models | Yifei Yao et.al. | 2409.08904 | null |
2024-09-13 | A Market for Lemons? Strategic Directions for a Vigilant Application of Artificial Intelligence in Entrepreneurship Research | Martin Obschonka et.al. | 2409.08890 | null |
2024-09-13 | Exploring Graph Structure Comprehension Ability of Multimodal Large Language Models: Case Studies | Zhiqiang Zhong et.al. | 2409.08864 | null |
2024-09-13 | FP-VEC: Fingerprinting Large Language Models via Efficient Vector Addition | Zhenhua Xu et.al. | 2409.08846 | null |
2024-09-12 | Windows Agent Arena: Evaluating Multi-Modal OS Agents at Scale | Rogerio Bonatti et.al. | 2409.08264 | link |
2024-09-12 | OmniQuery: Contextually Augmenting Captured Multimodal Memory to Enable Personal Question Answering | Jiahao Nick Li et.al. | 2409.08250 | null |
2024-09-12 | Source2Synth: Synthetic Data Generation and Curation Grounded in Real Data Sources | Alisia Lupidi et.al. | 2409.08239 | null |
2024-09-12 | LLM Honeypot: Leveraging Large Language Models as Advanced Interactive Honeypot Systems | Hakan T. Otal et.al. | 2409.08234 | link |
2024-09-12 | What Makes a Maze Look Like a Maze? | Joy Hsu et.al. | 2409.08202 | null |
2024-09-12 | Fine-tuning Large Language Models for Entity Matching | Aaron Steiner et.al. | 2409.08185 | link |
2024-09-12 | Faster Speech-LLaMA Inference with Multi-token Prediction | Desh Raj et.al. | 2409.08148 | null |
2024-09-12 | LLM-POTUS Score: A Framework of Analyzing Presidential Debates with Large Language Models | Zhengliang Liu et.al. | 2409.08147 | null |
2024-09-12 | The CLC-UKET Dataset: Benchmarking Case Outcome Prediction for the UK Employment Tribunal | Huiyuan Xie et.al. | 2409.08098 | null |
2024-09-12 | Securing Large Language Models: Addressing Bias, Misinformation, and Prompt Attacks | Benji Peng et.al. | 2409.08087 | null |
2024-09-11 | "My Grade is Wrong!": A Contestable AI Framework for Interactive Feedback in Evaluating Student Essays | Shengxin Hong et.al. | 2409.07453 | null |
2024-09-11 | SUPER: Evaluating Agents on Setting Up and Executing Tasks from Research Repositories | Ben Bogin et.al. | 2409.07440 | link |
2024-09-11 | CLNX: Bridging Code and Natural Language for C/C++ Vulnerability-Contributing Commits Identification | Zeqing Qin et.al. | 2409.07407 | null |
2024-09-11 | AdaCAD: Adaptively Decoding to Balance Conflicts between Contextual and Parametric Knowledge | Han Wang et.al. | 2409.07394 | link |
2024-09-11 | Demo: SGCode: A Flexible Prompt-Optimizing System for Secure Generation of Code | Khiem Ton et.al. | 2409.07368 | null |
2024-09-11 | Think Together and Work Better: Combining Humans' and LLMs' Think-Aloud Outcomes for Effective Text Evaluation | SeongYeub Chu et.al. | 2409.07355 | link |
2024-09-11 | Learning to Compress Contexts for Efficient Knowledge-based Visual Question Answering | Weixi Weng et.al. | 2409.07331 | null |
2024-09-11 | MEDIC: Towards a Comprehensive Framework for Evaluating LLMs in Clinical Applications | Praveen K Kanithi et.al. | 2409.07314 | null |
2024-09-11 | STORE: Streamlining Semantic Tokenization and Generative Recommendation with A Single LLM | Qijiong Liu et.al. | 2409.07276 | null |
2024-09-11 | MiniDrive: More Efficient Vision-Language Models with Multi-Level 2D Features as Text Tokens for Autonomous Driving | Enming Zhang et.al. | 2409.07267 | link |
2024-09-10 | E2LLM: Encoder Elongated Large Language Models for Long-Context Understanding and Reasoning | Zihan Liao et.al. | 2409.06679 | null |
2024-09-10 | LLaMA-Omni: Seamless Speech Interaction with Large Language Models | Qingkai Fang et.al. | 2409.06666 | link |
2024-09-10 | Human Perception of LLM-generated Text Content in Social Media Environments | Kristina Radivojevic et.al. | 2409.06653 | null |
2024-09-10 | Optimal Workload Placement on Multi-Instance GPUs | Bekir Turkkan et.al. | 2409.06646 | null |
2024-09-10 | MoWE-Audio: Multitask AudioLLMs with Mixture of Weak Encoders | Wenyu Zhang et.al. | 2409.06635 | null |
2024-09-10 | A Practice of Post-Training on Llama-3 70B with Optimal Selection of Additional Language Mixture Ratio | Ningyuan Xi et.al. | 2409.06624 | null |
2024-09-10 | Alleviating Hallucinations in Large Language Models with Scepticism Modeling | Yetao Wu et.al. | 2409.06601 | null |
2024-09-10 | GroUSE: A Benchmark to Evaluate Evaluators in Grounded Question Answering | Sacha Muller et.al. | 2409.06595 | null |
2024-09-10 | MAPS: Energy-Reliability Tradeoff Management in Autonomous Vehicles Through LLMs Penetrated Science | Mahdieh Aliazam et.al. | 2409.06558 | null |
2024-09-10 | Questioning Internal Knowledge Structure of Large Language Models Through the Lens of the Olympic Games | Juhwan Choi et.al. | 2409.06518 | null |
2024-09-09 | MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct | Run Luo et.al. | 2409.05840 | null |
2024-09-09 | Are Large Language Models a Threat to Programming Platforms? An Exploratory Study | Md Mustakim Billah et.al. | 2409.05824 | null |
2024-09-09 | GASP: Gaussian Splatting for Physic-Based Simulations | Piotr Borycki et.al. | 2409.05819 | null |
2024-09-09 | Benchmarking Chinese Knowledge Rectification in Large Language Models | Tianhe Lu et.al. | 2409.05806 | link |
2024-09-09 | Evidence from fMRI Supports a Two-Phase Abstraction Process in Language Models | Emily Cheng et.al. | 2409.05771 | null |
2024-09-09 | Model Input Verification of Large Scale Simulations | Rumyana Neykova et.al. | 2409.05768 | null |
2024-09-09 | A Novel Idea Generation Tool using a Structured Conversational AI (CAI) System | B. Sankar et.al. | 2409.05747 | null |
2024-09-09 | LLMs Will Always Hallucinate, and We Need to Live With This | Sourav Banerjee et.al. | 2409.05746 | null |
2024-09-09 | A System and Benchmark for LLM-based Q&A on Heterogeneous Data | Achille Fokoue et.al. | 2409.05735 | null |
2024-09-09 | Towards Democratizing Multilingual Large Language Models For Medicine Through A Two-Stage Instruction Fine-tuning Approach | Meng Zhou et.al. | 2409.05732 | null |
2024-09-06 | RLPF: Reinforcement Learning from Prediction Feedback for User Summarization with LLMs | Jiaxing Wu et.al. | 2409.04421 | null |
2024-09-06 | Question-Answering Dense Video Events | Hangyu Qin et.al. | 2409.04388 | null |
2024-09-06 | Learning vs Retrieval: The Role of In-Context Examples in Regression with LLMs | Aliakbar Nafar et.al. | 2409.04318 | null |
2024-09-06 | An optically accelerated extreme learning machine using hot atomic vapors | Pierre Azam et.al. | 2409.04312 | null |
2024-09-06 | Using Large Language Models to Generate Authentic Multi-agent Knowledge Work Datasets | Desiree Heim et.al. | 2409.04286 | null |
2024-09-06 | Advancing Automated Knowledge Transfer in Evolutionary Multitasking via Large Language Models | Yuxiao Huang et.al. | 2409.04270 | null |
2024-09-06 | GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding | Ziyin Zhang et.al. | 2409.04183 | null |
2024-09-06 | Combining LLMs and Knowledge Graphs to Reduce Hallucinations in Question Answering | Larissa Pusch et.al. | 2409.04181 | null |
2024-09-06 | From Calculation to Adjudication: Examining LLM judges on Mathematical Reasoning Tasks | Andreas Stephan et.al. | 2409.04168 | null |
2024-09-06 | Can OpenSource beat ChatGPT? -- A Comparative Study of Large Language Models for Text-to-Code Generation | Luis Mayer et.al. | 2409.04164 | null |
2024-09-05 | Attention Heads of Large Language Models: A Survey | Zifan Zheng et.al. | 2409.03752 | link |
2024-09-05 | LLM-CI: Assessing Contextual Integrity Norms in Language Models | Yan Shvartzshnaider et.al. | 2409.03735 | null |
2024-09-05 | Safety vs. Performance: How Multi-Objective Learning Reduces Barriers to Market Entry | Meena Jagadeesan et.al. | 2409.03734 | null |
2024-09-05 | Planning In Natural Language Improves LLM Search For Code Generation | Evan Wang et.al. | 2409.03733 | null |
2024-09-06 | RAG based Question-Answering for Contextual Response Prediction System | Sriram Veturi et.al. | 2409.03708 | null |
2024-09-05 | TRACE-cs: Trustworthy Reasoning for Contrastive Explanations in Course Scheduling Problems | Stylianos Loukas Vasileiou et.al. | 2409.03671 | null |
2024-09-05 | A Fused Large Language Model for Predicting Startup Success | Abdurahman Maarouf et.al. | 2409.03668 | null |
2024-09-05 | The representation landscape of few-shot learning and fine-tuning in large language models | Diego Doimo et.al. | 2409.03662 | link |
2024-09-06 | LLM-based multi-agent poetry generation in non-cooperative environments | Ran Zhang et.al. | 2409.03659 | link |
2024-09-05 | From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents | Jifan Yu et.al. | 2409.03512 | null |
2024-09-04 | RoboTwin: Dual-Arm Robot Benchmark with Generative Digital Twins (early version) | Yao Mu et.al. | 2409.02920 | null |
2024-09-04 | LongCite: Enabling LLMs to Generate Fine-grained Citations in Long-context QA | Jiajie Zhang et.al. | 2409.02897 | null |
2024-09-04 | LongLLaVA: Scaling Multi-modal LLMs to 1000 Images Efficiently via Hybrid Architecture | Xidong Wang et.al. | 2409.02889 | link |
2024-09-04 | Historical German Text Normalization Using Type- and Token-Based Language Modeling | Anton Ehrmanntraut et.al. | 2409.02841 | null |
2024-09-04 | Exploring Sentiment Dynamics and Predictive Behaviors in Cryptocurrency Discussions by Few-Shot Learning with Large Language Models | Moein Shahiki Tash et.al. | 2409.02836 | null |
2024-09-04 | CMM-Math: A Chinese Multimodal Math Dataset To Evaluate and Enhance the Mathematics Reasoning of Large Multimodal Models | Wentao Liu et.al. | 2409.02834 | null |
2024-09-04 | ExpLLM: Towards Chain of Thought for Facial Expression Recognition | Xing Lan et.al. | 2409.02828 | null |
2024-09-04 | Design Contradictions: Help or Hindrance? | Aron E. Owen et.al. | 2409.02823 | null |
2024-09-04 | Language Understanding as a Constraint on Consensus Size in LLM Societies | Giordano De Marzo et.al. | 2409.02822 | null |
2024-09-04 | Towards a Unified View of Preference Learning for Large Language Models: A Survey | Bofei Gao et.al. | 2409.02795 | null |
2024-08-30 | SYNTHEVAL: Hybrid Behavioral Testing of NLP Models with Synthetic CheckLists | Raoyuan Zhao et.al. | 2408.17437 | link |
2024-08-30 | Advancing Multi-talker ASR Performance with Large Language Models | Mohan Shi et.al. | 2408.17431 | null |
2024-08-30 | Getting Inspiration for Feature Elicitation: App Store- vs. LLM-based Approach | Jialiang Wei et.al. | 2408.17404 | null |
2024-08-30 | NDP: Next Distribution Prediction as a More Broad Target | Junhao Ruan et.al. | 2408.17377 | null |
2024-08-30 | Assessing Generative Language Models in Classification Tasks: Performance and Self-Evaluation Capabilities in the Environmental and Climate Change Domain | Francesca Grasso et.al. | 2408.17362 | link |
2024-08-30 | Forget to Flourish: Leveraging Machine-Unlearning on Pretrained Language Models for Privacy Leakage | Md Rafi Ur Rashid et.al. | 2408.17354 | null |
2024-08-30 | Bridging Domain Knowledge and Process Discovery Using Large Language Models | Ali Norouzifar et.al. | 2408.17316 | link |
2024-08-30 | Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts | Rhui Dih Lee et.al. | 2408.17280 | null |
2024-08-30 | Joint Estimation and Prediction of City-wide Delivery Demand: A Large Language Model Empowered Graph-based Learning Approach | Tong Nie et.al. | 2408.17258 | null |
2024-08-30 | VisionTS: Visual Masked Autoencoders Are Free-Lunch Zero-Shot Time Series Forecasters | Mouxiang Chen et.al. | 2408.17253 | link |
2024-08-29 | How Far Can Cantonese NLP Go? Benchmarking Cantonese Capabilities of Large Language Models | Jiyue Jiang et.al. | 2408.16756 | null |
2024-08-29 | Reinforcement Learning without Human Feedback for Last Mile Fine-Tuning of Large Language Models | Alec Solway et.al. | 2408.16753 | null |
2024-08-29 | Assessing Large Language Models for Online Extremism Research: Identification, Explanation, and New Knowledge | Beidi Dong et.al. | 2408.16749 | null |
2024-08-29 | Theoretical and Methodological Framework for Studying Texts Produced by Large Language Models | Jiří Milička et.al. | 2408.16740 | null |
2024-08-29 | GradBias: Unveiling Word Influence on Bias in Text-to-Image Generative Models | Moreno D'Incà et.al. | 2408.16700 | link |
2024-08-29 | Entropic Distribution Matching in Supervised Fine-tuning of LLMs: Less Overfitting and Better Diversity | Ziniu Li et.al. | 2408.16673 | null |
2024-08-29 | Towards Efficient Modelling of String Dynamics: A Comparison of State Space and Koopman based Deep Learning Methods | Rodrigo Diaz et.al. | 2408.16650 | null |
2024-08-29 | Examination of Code generated by Large Language Models | Robin Beer et.al. | 2408.16601 | link |
2024-08-29 | Enhancing Dialogue Generation in Werewolf Game Through Situation Analysis and Persuasion Strategies | Zhiyang Qi et.al. | 2408.16586 | null |
2024-08-29 | CNIMA: A Universal Evaluation Framework and Automated Approach for Assessing Second Language Dialogues | Rena Gao et.al. | 2408.16518 | null |
2024-08-28 | Eagle: Exploring The Design Space for Multimodal LLMs with Mixture of Encoders | Min Shi et.al. | 2408.15998 | link |
2024-08-28 | BattleAgentBench: A Benchmark for Evaluating Cooperation and Competition Capabilities of Language Models in Multi-Agent Systems | Wei Wang et.al. | 2408.15971 | null |
2024-08-28 | More Text, Less Point: Towards 3D Data-Efficient Point-Language Understanding | Yuan Tang et.al. | 2408.15966 | null |
2024-08-28 | Atari-GPT: Investigating the Capabilities of Multimodal Large Language Models as Low-Level Policies for Atari Games | Nicholas R. Waytowich et.al. | 2408.15950 | null |
2024-08-28 | Leveraging Open Knowledge for Advancing Task Expertise in Large Language Models | Yuncheng Yang et.al. | 2408.15915 | null |
2024-08-28 | Decentralized LLM Inference over Edge Networks with Energy Harvesting | Aria Khoshsirat et.al. | 2408.15907 | null |
2024-08-28 | LLM-Based Multi-Hop Question Answering with Knowledge Graph Integration in Evolving Environments | Ruirui Chen et.al. | 2408.15903 | null |
2024-08-28 | Nexus: Specialization meets Adaptability for Efficiently Training Mixture of Experts | Nikolas Gritsch et.al. | 2408.15901 | null |
2024-08-28 | Bias in LLMs as Annotators: The Effect of Party Cues on Labelling Decision by Large Language Models | Sebastian Vallejo Vera et.al. | 2408.15895 | null |
2024-08-28 | Persuasion Games using Large Language Models | Ganesh Prasath Ramani et.al. | 2408.15879 | null |
2024-08-27 | Generative Verifiers: Reward Modeling as Next-Token Prediction | Lunjun Zhang et.al. | 2408.15240 | null |
2024-08-27 | LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet | Nathaniel Li et.al. | 2408.15221 | null |
2024-08-27 | Investigating Coverage Criteria in Large Language Models: An In-Depth Study Through Jailbreak Attacks | Shide Zhou et.al. | 2408.15207 | null |
2024-08-27 | Leveraging Hallucinations to Reduce Manual Prompt Dependency in Promptable Segmentation | Jian Hu et.al. | 2408.15205 | null |
2024-08-27 | Can Unconfident LLM Annotations Be Used for Confident Conclusions? | Kristina Gligorić et.al. | 2408.15204 | null |
2024-08-27 | Unlocking Potential in Pre-Trained Music Language Models for Versatile Multi-Track Music Arrangement | Longshen Ou et.al. | 2408.15176 | null |
2024-08-27 | X-Reflect: Cross-Reflection Prompting for Multimodal Recommendation | Hanjia Lyu et.al. | 2408.15172 | null |
2024-08-27 | Measuring text summarization factuality using atomic facts entailment metrics in the context of retrieval augmented generation | N. E. Kriman et.al. | 2408.15171 | null |
2024-08-27 | BaichuanSEED: Sharing the Potential of ExtensivE Data Collection and Deduplication by Introducing a Competitive Large Language Model Baseline | Guosheng Dong et.al. | 2408.15079 | null |
2024-08-27 | Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models | Ned Cooper et.al. | 2408.15066 | null |
2024-08-27 | Step-by-Step Unmasking for Parameter-Efficient Fine-tuning of Large Language Models | Aradhye Agarwal et.al. | 2408.14470 | link |
2024-08-26 | Grounded Multi-Hop VideoQA in Long-Form Egocentric Videos | Qirui Chen et.al. | 2408.14469 | null |
2024-08-26 | Explicit Inductive Inference using Large Language Models | Tianyang Liu et.al. | 2408.14467 | null |
2024-08-26 | Evaluating Large Language Models on Spatial Tasks: A Multi-Task Benchmarking Study | Liuchang Xu et.al. | 2408.14438 | null |
2024-08-26 | CHARTOM: A Visual Theory-of-Mind Benchmark for Multimodal Large Language Models | Shubham Bharti et.al. | 2408.14419 | null |
2024-08-26 | MEDSAGE: Enhancing Robustness of Medical Dialogue Summarization to ASR Errors with LLM-generated Synthetic Dialogues | Kuluhan Binici et.al. | 2408.14418 | null |
2024-08-26 | Language-specific Calibration for Pruning Multilingual Language Models | Simon Kurz et.al. | 2408.14398 | null |
2024-08-26 | Reprogramming Foundational Large Language Models(LLMs) for Enterprise Adoption for Spatio-Temporal Forecasting Applications: Unveiling a New Era in Copilot-Guided Cross-Modal Time Series Representation Learning | Sakhinana Sagar Srinivas et.al. | 2408.14387 | null |
2024-08-26 | Probing Causality Manipulation of Large Language Models | Chenyang Zhang et.al. | 2408.14380 | link |
2024-08-26 | SWE-bench-java: A GitHub Issue Resolving Benchmark for Java | Daoguang Zan et.al. | 2408.14354 | link |
2024-08-23 | MME-RealWorld: Could Your Multimodal LLM Challenge High-Resolution Real-World Scenarios that are Difficult for Humans? | Yi-Fan Zhang et.al. | 2408.13257 | null |
2024-08-23 | Domain-specific long text classification from sparse relevant information | Célia D'Cruz et.al. | 2408.13253 | null |
2024-08-23 | Multi-Layer Transformers Gradient Can be Approximated in Almost Linear Time | Yingyu Liang et.al. | 2408.13233 | null |
2024-08-23 | EUR-USD Exchange Rate Forecasting Based on Information Fusion with Large Language Models and Deep Learning Methods | Hongcheng Ding et.al. | 2408.13214 | null |
2024-08-23 | DOMAINEVAL: An Auto-Constructed Benchmark for Multi-Domain Code Generation | Qiming Zhu et.al. | 2408.13204 | null |
2024-08-23 | Can LLM be a Good Path Planner based on Prompt Engineering? Mitigating the Hallucination for Path Planning | Hourui Deng et.al. | 2408.13184 | null |
2024-08-23 | IntelliCare: Improving Healthcare Analysis with Variance-Controlled Patient-Level Knowledge from Large Language Models | Zhihao Yu et.al. | 2408.13073 | null |
2024-08-23 | Guiding IoT-Based Healthcare Alert Systems with Large Language Models | Yulan Gao et.al. | 2408.13071 | null |
2024-08-23 | VFM-Det: Towards High-Performance Vehicle Detection via Large Foundation Models | Wentao Wu et.al. | 2408.13031 | link |
2024-08-23 | In-Context Learning with Reinforcement Learning for Incomplete Utterance Rewriting | Haowei Du et.al. | 2408.13028 | null |
2024-08-22 | Controllable Text Generation for Large Language Models: A Survey | Xun Liang et.al. | 2408.12599 | link |
2024-08-22 | xGen-VideoSyn-1: High-fidelity Text-to-Video Synthesis with Compressed Representations | Can Qin et.al. | 2408.12590 | null |
2024-08-22 | RuleAlign: Making Large Language Models Better Physicians with Diagnostic Rule Alignment | Xiaohan Wang et.al. | 2408.12579 | null |
2024-08-22 | Jamba-1.5: Hybrid Transformer-Mamba Models at Scale | Jamba Team et.al. | 2408.12570 | null |
2024-08-22 | ssProp: Energy-Efficient Training for Convolutional Neural Networks with Scheduled Sparse Back Propagation | Lujia Zhong et.al. | 2408.12561 | link |
2024-08-22 | Towards Evaluating and Building Versatile Large Language Models for Medicine | Chaoyi Wu et.al. | 2408.12547 | link |
2024-08-22 | MEDCO: Medical Education Copilots Based on A Multi-Agent Framework | Hao Wei et.al. | 2408.12496 | null |
2024-08-22 | GenderCARE: A Comprehensive Framework for Assessing and Reducing Gender Bias in Large Language Models | Kunsheng Tang et.al. | 2408.12494 | link |
2024-08-23 | Vintern-1B: An Efficient Multimodal Large Language Model for Vietnamese | Khang T. Doan et.al. | 2408.12480 | null |
2024-08-22 | Frame Order Matters: A Temporal Sequence-Aware Model for Few-Shot Action Recognition | Bozheng Li et.al. | 2408.12475 | null |
2024-08-21 | SEA: Supervised Embedding Alignment for Token-Level Visual-Textual Integration in MLLMs | Yuanyang Yin et.al. | 2408.11813 | null |
2024-08-21 | Story3D-Agent: Exploring 3D Storytelling Visualization with Large Language Models | Yuzhou Huang et.al. | 2408.11801 | null |
2024-08-21 | PermitQA: A Benchmark for Retrieval Augmented Generation in Wind Siting and Permitting domain | Rounak Meyur et.al. | 2408.11800 | null |
2024-08-21 | EE-MLLM: A Data-Efficient and Compute-Efficient Multimodal Large Language Model | Feipeng Ma et.al. | 2408.11795 | null |
2024-08-21 | Leveraging Chemistry Foundation Models to Facilitate Structure Focused Retrieval Augmented Generation in Multi-Agent Workflows for Catalyst and Materials Design | Nathaniel H. Park et.al. | 2408.11793 | null |
2024-08-21 | Critique-out-Loud Reward Models | Zachary Ankner et.al. | 2408.11791 | link |
2024-08-21 | DreamFactory: Pioneering Multi-Scene Long Video Generation with a Multi-Agent Framework | Zhifei Xie et.al. | 2408.11788 | null |
2024-08-21 | Personality Alignment of Large Language Models | Minjun Zhu et.al. | 2408.11779 | link |
2024-08-21 | Leveraging Fine-Tuned Retrieval-Augmented Generation with Long-Context Support: For 3GPP Standards | Omar Erak et.al. | 2408.11775 | link |
2024-08-21 | Against All Odds: Overcoming Typology, Script, and Language Confusion in Multilingual Embedding Inversion Attacks | Yiyi Chen et.al. | 2408.11749 | null |
2024-08-20 | Revisiting VerilogEval: Newer LLMs, In-Context Learning, and Specification-to-RTL Tasks | Nathaniel Pinckney et.al. | 2408.11053 | null |
2024-08-20 | FLAME: Learning to Navigate with Multimodal LLM in Urban Environments | Yunzhe Xu et.al. | 2408.11051 | link |
2024-08-21 | MagicDec: Breaking the Latency-Throughput Tradeoff for Long Context Generation with Speculative Decoding | Jian Chen et.al. | 2408.11049 | null |
2024-08-20 | Reconciling Methodological Paradigms: Employing Large Language Models as Novice Qualitative Research Assistants in Talent Management Research | Sreyoshi Bhaduri et.al. | 2408.11043 | null |
2024-08-20 | Scaling Law with Learning Rate Annealing | Howe Tissue et.al. | 2408.11029 | null |
2024-08-20 | Athena: Safe Autonomous Agents with Verbal Contrastive Learning | Tanmana Sadhu et.al. | 2408.11021 | null |
2024-08-20 | While GitHub Copilot Excels at Coding, Does It Ensure Responsible Output? | Wen Cheng et.al. | 2408.11006 | link |
2024-08-20 | CTP-LLM: Clinical Trial Phase Transition Prediction Using Large Language Models | Michael Reinisch et.al. | 2408.10995 | null |
2024-08-20 | Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models | Yuyan Chen et.al. | 2408.10947 | null |
2024-08-20 | Large Language Model Driven Recommendation | Anton Korikov et.al. | 2408.10946 | null |
2024-08-19 | Demystifying the Communication Characteristics for Distributed Transformer Models | Quentin Anthony et.al. | 2408.10197 | null |
2024-08-19 | SMILE: Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models | Anke Tang et.al. | 2408.10174 | link |
2024-08-19 | Customizing Language Models with Instance-wise LoRA for Sequential Recommendation | Xiaoyu Kong et.al. | 2408.10159 | null |
2024-08-19 | Multilingual Needle in a Haystack: Investigating Long-Context Behavior of Multilingual Large Language Models | Amey Hengle et.al. | 2408.10151 | null |
2024-08-19 | In-Context Learning with Representations: Contextual Generalization of Trained Transformers | Tong Yang et.al. | 2408.10147 | null |
2024-08-19 | Instruction Finetuning for Leaderboard Generation from Empirical AI Research | Salomon Kabongo et.al. | 2408.10141 | null |
2024-08-19 | Molecular Graph Representation Learning Integrating Large Language Models with Domain-specific Small Models | Tianyu Zhang et.al. | 2408.10124 | link |
2024-08-20 | PLUTUS: A Well Pre-trained Large Unified Transformer can Unveil Financial Time Series Regularities | Yuanjian Xu et.al. | 2408.10111 | null |
2024-08-19 | ARMADA: Attribute-Based Multimodal Data Augmentation | Xiaomeng Jin et.al. | 2408.10086 | null |
2024-08-19 | FFAA: Multimodal Large Language Model based Explainable Open-World Face Forgery Analysis Assistant | Zhengchao Huang et.al. | 2408.10072 | null |
2024-08-19 | PEDAL: Enhancing Greedy Decoding with Large Language Models using Diverse Exemplars | Sumanth Prabhu et.al. | 2408.08869 | null |
2024-08-16 | Visual Agents as Fast and Slow Thinkers | Guangyan Sun et.al. | 2408.08862 | null |
2024-08-16 | ECG-Chat: A Large ECG-Language Model for Cardiac Disease Diagnosis | Yubao Zhao et.al. | 2408.08849 | null |
2024-08-16 | PsychoLex: Unveiling the Psychological Mind of Large Language Models | Mohammad Amin Abbasi et.al. | 2408.08848 | null |
2024-08-16 | FLEXTAF: Enhancing Table Reasoning with Flexible Tabular Formats | Xuanliang Zhang et.al. | 2408.08841 | link |
2024-08-16 | Artificial Intelligence and Strategic Decision-Making: Evidence from Entrepreneurs and Investors | Felipe A. Csaszar et.al. | 2408.08811 | null |
2024-08-16 | Constructing Domain-Specific Evaluation Sets for LLM-as-a-judge | Ravi Raju et.al. | 2408.08808 | null |
2024-08-16 | EmoDynamiX: Emotional Support Dialogue Strategy Prediction by Modelling MiXed Emotions and Discourse Dynamics | Chenwei Wan et.al. | 2408.08782 | link |
2024-08-16 | Large Language Models Might Not Care What You Are Saying: Prompt Format Beats Descriptions | Chenming Tang et.al. | 2408.08780 | null |
2024-08-16 | DAC: Decomposed Automation Correction for Text-to-SQL | Dingzirui Wang et.al. | 2408.08779 | link |
2024-08-15 | Can Large Language Models Understand Symbolic Graphics Programs? | Zeju Qiu et.al. | 2408.08313 | null |
2024-08-15 | ScalingFilter: Assessing Data Quality through Inverse Utilization of Scaling Laws | Ruihang Li et.al. | 2408.08310 | null |
2024-08-15 | Benchmarking the Capabilities of Large Language Models in Transportation System Engineering: Accuracy, Consistency, and Reasoning Behaviors | Usman Syed et.al. | 2408.08302 | null |
2024-08-15 | HELP: Hierarchical Embeddings-based Log Parsing | Andy Xu et.al. | 2408.08300 | null |
2024-08-15 | The ShareLM Collection and Plugin: Contributing Human-Model Chats for the Benefit of the Community | Shachar Don-Yehiya et.al. | 2408.08291 | null |
2024-08-15 | Autonomous Behavior Planning For Humanoid Loco-manipulation Through Grounded Language Model | Jin Wang et.al. | 2408.08282 | null |
2024-08-15 | BAM! Just Like That: Simple and Efficient Parameter Upcycling for Mixture of Experts | Qizhen Zhang et.al. | 2408.08274 | null |
2024-08-15 | DaRec: A Disentangled Alignment Framework for Large Language Model and Recommender System | Xihong Yang et.al. | 2408.08231 | null |
2024-08-15 | RED-CT: A Systems Design Methodology for Using LLM-labeled Data to Train and Deploy Edge Classifiers for Computational Social Science | David Farr et.al. | 2408.08217 | null |
2024-08-15 | Does Reasoning Emerge? Examining the Probabilities of Causation in Large Language Models | Javier González et.al. | 2408.08210 | null |
2024-08-14 | The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models | Karime Maamari et.al. | 2408.07702 | null |
2024-08-15 | Model Merging in LLMs, MLLMs, and Beyond: Methods, Theories, Applications and Opportunities | Enneng Yang et.al. | 2408.07666 | link |
2024-08-14 | Spoken Stereoset: On Evaluating Social Bias Toward Speaker in Speech Large Language Models | Yi-Cheng Lin et.al. | 2408.07665 | null |
2024-08-14 | Alignment-Enhanced Decoding: Defending via Token-Level Adaptive Refining of Probability Distributions | Quan Liu et.al. | 2408.07663 | link |
2024-08-14 | WeKnow-RAG: An Adaptive Approach for Retrieval-Augmented Generation Integrating Web Search and Knowledge Graphs | Weijian Xie et.al. | 2408.07611 | null |
2024-08-14 | Transformers and Large Language Models for Efficient Intrusion Detection Systems: A Comprehensive Survey | Hamza Kheddar et.al. | 2408.07583 | null |
2024-08-15 | MathScape: Evaluating MLLMs in multimodal Math Scenarios through a Hierarchical Benchmark | Minxuan Zhou et.al. | 2408.07543 | null |
2024-08-15 | Usefulness of data flow diagrams and large language models for security threat validation: a registered report | Winnie Bahati Mbaka et.al. | 2408.07537 | null |
2024-08-14 | Development of a Multi-Agent Clinical Decision Support System for Korean Triage and Acuity Scale (KTAS)-Based Triage and Treatment Planning in Emergency Departments | Seungjun Han et.al. | 2408.07531 | null |
2024-08-14 | Large Language Models Know What Makes Exemplary Contexts | Quanyu Long et.al. | 2408.07505 | null |
2024-08-13 | Diversity Empowers Intelligence: Integrating Expertise of Software Engineering Agents | Kexun Zhang et.al. | 2408.07060 | null |
2024-08-13 | LongWriter: Unleashing 10,000+ Word Generation from Long Context LLMs | Yushi Bai et.al. | 2408.07055 | link |
2024-08-13 | Casper: Prompt Sanitization for Protecting User Privacy in Web-Based Large Language Models | Chun Jie Chong et.al. | 2408.07004 | null |
2024-08-13 | LLMs can Schedule | Henrik Abgaryan et.al. | 2408.06993 | link |
2024-08-13 | OpenResearcher: Unleashing AI for Accelerated Scientific Research | Yuxiang Zheng et.al. | 2408.06941 | link |
2024-08-13 | Evaluating Cultural Adaptability of a Large Language Model via Simulation of Synthetic Personas | Louis Kwok et.al. | 2408.06929 | null |
2024-08-13 | Re-TASK: Revisiting LLM Tasks from Capability, Skill, and Knowledge Perspectives | Zhihu Wang et.al. | 2408.06904 | null |
2024-08-13 | Leveraging Language Models for Emotion and Behavior Analysis in Education | Kaito Tanaka et.al. | 2408.06874 | null |
2024-08-13 | LoRA | Jia-Chen Zhang et.al. | 2408.06854 | null |
2024-08-13 | Causal Agent based on Large Language Model | Kairong Han et.al. | 2408.06849 | link |
2024-08-12 | Animate, or Inanimate, That is the Question for Large Language Models | Leonardo Ranaldi et.al. | 2408.06332 | null |
2024-08-12 | Can We Rely on LLM Agents to Draft Long-Horizon Plans? Let's Take TravelPlanner as an Example | Yanan Chen et.al. | 2408.06318 | null |
2024-08-12 | The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | Chris Lu et.al. | 2408.06292 | link |
2024-08-12 | MovieSum: An Abstractive Summarization Dataset for Movie Screenplays | Rohit Saxena et.al. | 2408.06281 | link |
2024-08-13 | Review-driven Personalized Preference Reasoning with Large Language Models for Recommendation | Jieyong Kim et.al. | 2408.06276 | null |
2024-08-13 | FuxiTranyu: A Multilingual Large Language Model Trained with Balanced Data | Haoran Sun et.al. | 2408.06273 | null |
2024-08-12 | A RAG-Based Question-Answering Solution for Cyber-Attack Investigation and Attribution | Sampath Rajapaksha et.al. | 2408.06272 | null |
2024-08-12 | Anchored Preference Optimization and Contrastive Revisions: Addressing Underspecification in Alignment | Karel D'Oosterlinck et.al. | 2408.06266 | null |
2024-08-12 | On Effects of Steering Latent Representation for Large Language Model Unlearning | Dang Huu-Tien et.al. | 2408.06223 | null |
2024-08-12 | Improving Structural Diversity of Blackbox LLMs via Chain-of-Specification Prompting | Halley Young et.al. | 2408.06186 | null |
2024-08-10 | Preserving Privacy in Large Language Models: A Survey on Current Threats and Solutions | Michele Miranda et.al. | 2408.05212 | null |
2024-08-09 | VITA: Towards Open-Source Interactive Omni Multimodal LLM | Chaoyou Fu et.al. | 2408.05211 | null |
2024-08-09 | Evaluating the capability of large language models to personalize science texts for diverse middle-school-age learners | Michael Vaccaro Jr et.al. | 2408.05204 | null |
2024-08-09 | TaSL: Task Skill Localization and Consolidation for Language Model Continual Learning | Yujie Feng et.al. | 2408.05200 | null |
2024-08-09 | AttackER: Towards Enhancing Cyber-Attack Attribution with a Named Entity Recognition Dataset | Pritam Deka et.al. | 2408.05149 | null |
2024-08-09 | A Hybrid RAG System with Comprehensive Enhancement on Complex Reasoning | Ye Yuan et.al. | 2408.05141 | null |
2024-08-09 | Is ChatGPT a Good Software Librarian? An Exploratory Study on the Use of ChatGPT for Software Library Recommendations | Jasmine Latendresse et.al. | 2408.05128 | null |
2024-08-09 | Large Language Models and Thematic Analysis: Human-AI Synergy in Researching Hate Speech on Social Media | Petre Breazu et.al. | 2408.05126 | null |
2024-08-09 | Sportify: Question Answering with Embedded Visualizations and Personified Narratives for Sports Video | Chunggi Lee et.al. | 2408.05123 | null |
2024-08-09 | A Survey of NL2SQL with Large Language Models: Where are we, and where are we going? | Xinyu Liu et.al. | 2408.05109 | null |
2024-08-08 | Better Alignment with Instruction Back-and-Forth Translation | Thao Nguyen et.al. | 2408.04614 | null |
2024-08-09 | Img-Diff: Contrastive Data Synthesis for Multimodal Large Language Models | Qirui Jiao et.al. | 2408.04594 | link |
2024-08-08 | Towards Resilient and Efficient LLMs: A Comparative Study of Efficiency, Performance, and Adversarial Robustness | Xiaojing Fan et.al. | 2408.04585 | null |
2024-08-08 | SCENE: Evaluating Explainable AI Techniques Using Soft Counterfactuals | Haoran Zheng et.al. | 2408.04575 | null |
2024-08-08 | Learning Fine-Grained Grounded Citations for Attributed Large Language Models | Lei Huang et.al. | 2408.04568 | link |
2024-08-08 | Bias-Aware Low-Rank Adaptation: Mitigating Catastrophic Inheritance of Large Language Models | Yupeng Chang et.al. | 2408.04556 | link |
2024-08-08 | Compromesso! Italian Many-Shot Jailbreaks Undermine the Safety of Large Language Models | Fabio Pernisi et.al. | 2408.04522 | null |
2024-08-08 | What You Need is What You Get: Theory of Mind for an LLM-Based Code Understanding Assistant | Jonan Richards et.al. | 2408.04477 | null |
2024-08-08 | Can LLMs Beat Humans in Debating? A Dynamic Multi-agent Framework for Competitive Debate | Yiqun Zhang et.al. | 2408.04472 | link |
2024-08-08 | RiskAwareBench: Towards Evaluating Physical Risk Awareness for High-level Planning of LLM-based Embodied Agents | Zihao Zhu et.al. | 2408.04449 | null |
2024-08-07 | How Well Can Vision Language Models See Image Details? | Chenhui Gou et.al. | 2408.03940 | null |
2024-08-07 | SLIM-RAFT: A Novel Fine-Tuning Approach to Improve Cross-Linguistic Performance for Mercosur Common Nomenclature | Vinícius Di Oliveira et.al. | 2408.03936 | null |
2024-08-07 | CodexGraph: Bridging Large Language Models and Code Repositories via Code Graph Databases | Xiangyan Liu et.al. | 2408.03910 | link |
2024-08-07 | Decoding Biases: Automated Methods and LLM Judges for Gender Bias Detection in Language Models | Shachi H Kumar et.al. | 2408.03907 | null |
2024-08-07 | From Data to Story: Towards Automatic Animated Data Video Creation with LLM-based Multi-Agent Systems | Leixian Shen et.al. | 2408.03876 | null |
2024-08-07 | PackMamba: Efficient Processing of Variable-Length Sequences in Mamba training | Haoran Xu et.al. | 2408.03865 | null |
2024-08-07 | GAIA -- A Large Language Model for Advanced Power Dispatch | Yuheng Cheng et.al. | 2408.03847 | null |
2024-08-07 | MaxMind: A Memory Loop Network to Enhance Software Productivity based on Large Language Models | Yuchen Dong et.al. | 2408.03841 | null |
2024-08-07 | WalledEval: A Comprehensive Safety Evaluation Toolkit for Large Language Models | Prannaya Gupta et.al. | 2408.03837 | null |
2024-08-07 | Target Prompting for Information Extraction with Vision Language Model | Dipankar Medhi et.al. | 2408.03834 | null |
2024-08-06 | TextIM: Part-aware Interactive Motion Synthesis from Text | Siyuan Fan et.al. | 2408.03302 | null |
2024-08-06 | KaPO: Knowledge-aware Preference Optimization for Controllable Knowledge Selection in Retrieval-Augmented Language Models | Ruizhe Zhang et.al. | 2408.03297 | null |
2024-08-07 | StructEval: Deepen and Broaden Large Language Model Assessment via Structured Evaluation | Boxi Cao et.al. | 2408.03281 | link |
2024-08-06 | Synthesizing Text-to-SQL Data from Weak and Strong LLMs | Jiaxi Yang et.al. | 2408.03256 | null |
2024-08-06 | Unveiling Factual Recall Behaviors of Large Language Models through Knowledge Neurons | Yifei Wang et.al. | 2408.03247 | null |
2024-08-06 | Crab Pulsar: IXPE Observations Reveal Unified Polarization Properties Across Optical and Soft X-Ray Bands | Denis González-Caniulef et.al. | 2408.03245 | null |
2024-08-06 | Leveraging Parameter Efficient Training Methods for Low Resource Text Classification: A Case Study in Marathi | Pranita Deshmukh et.al. | 2408.03172 | null |
2024-08-06 | Conditioning LLMs with Emotion in Neural Machine Translation | Charles Brazier et.al. | 2408.03150 | null |
2024-08-06 | Inference Optimizations for Large Language Models: Effects, Challenges, and Practical Considerations | Leo Donisch et.al. | 2408.03130 | null |
2024-08-06 | Lisbon Computational Linguists at SemEval-2024 Task 2: Using A Mistral 7B Model and Data Augmentation | Artur Guimarães et.al. | 2408.03127 | null |
2024-08-05 | Can Reinforcement Learning Unlock the Hidden Dangers in Aligned Large Language Models? | Mohammad Bahrami Karkevandi et.al. | 2408.02651 | null |
2024-08-05 | SEAS: Self-Evolving Adversarial Safety Optimization for Large Language Models | Muxi Diao et.al. | 2408.02632 | null |
2024-08-05 | LaMamba-Diff: Linear-Time High-Fidelity Diffusion Models Based on Local Attention and Mamba | Yunxiang Fu et.al. | 2408.02615 | null |
2024-08-05 | Progressively Selective Label Enhancement for Language Model Alignment | Biao Liu et.al. | 2408.02599 | null |
2024-08-05 | Leveraging the Power of LLMs: A Fine-Tuning Approach for High-Quality Aspect-Based Summarization | Ankan Mullick et.al. | 2408.02584 | null |
2024-08-05 | Evaluating and Enhancing LLMs Agent based on Theory of Mind in Guandan: A Multi-Player Cooperative Game under Imperfect Information | Yauwai Yim et.al. | 2408.02559 | null |
2024-08-05 | Generative AI as a Service in 6G Edge-Cloud: Generation Task Offloading by In-context Learning | Hao Zhou et.al. | 2408.02549 | null |
2024-08-05 | RAG Foundry: A Framework for Enhancing LLMs for Retrieval Augmented Generation | Daniel Fleischer et.al. | 2408.02545 | null |
2024-08-05 | Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions | Xinbei Ma et.al. | 2408.02544 | null |
2024-08-05 | Towards Coarse-grained Visual Language Navigation Task Planning Enhanced by Event Knowledge Graph | Zhao Kaichen et.al. | 2408.02535 | null |
2024-08-02 | Prompt Recursive Search: A Living Framework with Adaptive Growth in LLM Auto-Prompting | Xiangyu Zhao et.al. | 2408.01423 | null |
2024-08-02 | Mission Impossible: A Statistical Perspective on Jailbreaking LLMs | Jingtong Su et.al. | 2408.01420 | null |
2024-08-02 | DebateQA: Evaluating Question Answering on Debatable Knowledge | Rongwu Xu et.al. | 2408.01419 | null |
2024-08-02 | Talk Less, Interact Better: Evaluating In-context Conversational Adaptation in Multimodal LLMs | Yilun Hua et.al. | 2408.01417 | null |
2024-08-02 | Pre-trained Language Models Improve the Few-shot Prompt Ability of Decision Transformer | Yu Yang et.al. | 2408.01402 | null |
2024-08-02 | Coalitions of Large Language Models Increase the Robustness of AI Agents | Prattyush Mangal et.al. | 2408.01380 | null |
2024-08-02 | Toward Automatic Relevance Judgment using Vision--Language Models for Image--Text Retrieval Evaluation | Jheng-Hong Yang et.al. | 2408.01363 | null |
2024-08-05 | Hallu-PI: Evaluating Hallucination in Multi-modal Large Language Models within Perturbed Inputs | Peng Ding et.al. | 2408.01355 | null |
2024-08-02 | MCGMark: An Encodable and Robust Online Watermark for LLM-Generated Malicious Code | Kaiwen Ning et.al. | 2408.01354 | null |
2024-08-02 | Prompt Refinement or Fine-tuning? Best Practices for using LLMs in Computational Social Science Tasks | Anders Giovanni Møller et.al. | 2408.01346 | null |
2024-08-01 | AgentGen: Enhancing Planning Abilities for Large Language Model based Agent via Environment and Task Generation | Mengkang Hu et.al. | 2408.00764 | null |
2024-08-01 | Tamper-Resistant Safeguards for Open-Weight LLMs | Rishub Tamirisa et.al. | 2408.00761 | null |
2024-08-01 | DynamoLLM: Designing LLM Inference Clusters for Performance and Energy Efficiency | Jovan Stojkovic et.al. | 2408.00741 | null |
2024-08-01 | Improving Retrieval-Augmented Generation in Medicine with Iterative Follow-up Questions | Guangzhi Xiong et.al. | 2408.00727 | null |
2024-08-01 | An Empirical Analysis of Compute-Optimal Inference for Problem-Solving with Language Models | Yangzhen Wu et.al. | 2408.00724 | null |
2024-08-01 | Pathway to Secure and Trustworthy 6G for LLMs: Attacks, Defense, and Opportunities | Sunder Ali Khowaja et.al. | 2408.00722 | null |
2024-08-02 | Improving Text Embeddings for Smaller Language Models Using Contrastive Fine-tuning | Trapoom Ukarapol et.al. | 2408.00690 | null |
2024-08-01 | Can Developers Prompt? A Controlled Experiment for Code Documentation Generation | Hans-Alexander Kruse et.al. | 2408.00686 | null |
2024-08-01 | AutoM3L: An Automated Multimodal Machine Learning Framework with Large Language Models | Daqin Luo et.al. | 2408.00665 | null |
2024-08-01 | Disentangling Dense Embeddings with Sparse Autoencoders | Charles O'Neill et.al. | 2408.00657 | null |
2024-07-31 | Paying More Attention to Image: A Training-Free Method for Alleviating Hallucination in LVLMs | Shi Liu et.al. | 2407.21771 | null |
2024-07-31 | ReplanVLM: Replanning Robotic Tasks with Visual Language Models | Aoran Mei et.al. | 2407.21762 | null |
2024-07-31 | Adaptive Retrieval-Augmented Generation for Conversational Systems | Xi Wang et.al. | 2407.21712 | null |
2024-07-31 | CEAR: Automatic construction of a knowledge graph of chemical entities and roles from scientific literature | Stefan Langer et.al. | 2407.21708 | null |
2024-07-31 | TransferTOD: A Generalizable Chinese Multi-Domain Task-Oriented Dialogue System with Transfer Capabilities | Ming Zhang et.al. | 2407.21693 | null |
2024-07-31 | Synth-Empathy: Towards High-Quality Synthetic Empathy Data | Hao Liang et.al. | 2407.21669 | null |
2024-07-31 | LLM-for-X: Application-agnostic Integration of Large Language Models to Support Personal Writing Workflows | Lukas Teufelberger et.al. | 2407.21593 | null |
2024-07-31 | A Performance Study of LLM-Generated Code on Leetcode | Tristan Coignion et.al. | 2407.21579 | null |
2024-07-31 | PMoE: Progressive Mixture of Experts with Asymmetric Transformer for Continual Learning | Min Jae Jung et.al. | 2407.21571 | null |
2024-07-31 | CXSimulator: A User Behavior Simulation using LLM Embeddings for Web-Marketing Campaign Assessment | Akira Kasuga et.al. | 2407.21553 | null |
2024-07-30 | ThinK: Thinner Key Cache by Query-Driven Pruning | Yuhui Xu et.al. | 2407.21018 | null |
2024-07-30 | CLEFT: Language-Image Contrastive Learning with Efficient Large Language Model and Prompt Fine-Tuning | Yuexi Du et.al. | 2407.21011 | link |
2024-07-31 | MoFO: Momentum-Filtered Optimizer for Mitigating Forgetting in LLM Fine-Tuning | Yupeng Chen et.al. | 2407.20999 | null |
2024-07-30 | From Feature Importance to Natural Language Explanations Using LLMs with RAG | Sule Tekkesinoglu et.al. | 2407.20990 | null |
2024-07-30 | Large Language Models (LLMs) for Semantic Communication in Edge-based IoT Networks | Alakesh Kalita et.al. | 2407.20970 | null |
2024-07-30 | Automated Review Generation Method Based on Large Language Models | Shican Wu et.al. | 2407.20906 | link |
2024-07-30 | ThinkRepair: Self-Directed Automated Program Repair | Xin Yin et.al. | 2407.20898 | link |
2024-07-30 | Effective Black Box Testing of Sentiment Analysis Classification Networks | Parsa Karbasizadeh et.al. | 2407.20884 | null |
2024-07-30 | Breaking Agents: Compromising Autonomous LLM Agents Through Malfunction Amplification | Boyang Zhang et.al. | 2407.20859 | null |
2024-07-30 | Learn by Selling: Equipping Large Language Models with Product Knowledge for Context-Driven Recommendations | Sarthak Anand et.al. | 2407.20856 | null |
2024-07-29 | Specify and Edit: Overcoming Ambiguity in Text-Based Image Editing | Ekaterina Iakovleva et.al. | 2407.20232 | null |
2024-07-29 | Can Editing LLMs Inject Harm? | Canyu Chen et.al. | 2407.20224 | null |
2024-07-29 | QAEA-DR: A Unified Text Augmentation Framework for Dense Retrieval | Hongming Tan et.al. | 2407.20207 | null |
2024-07-29 | MindSearch: Mimicking Human Minds Elicits Deep AI Searcher | Zehui Chen et.al. | 2407.20183 | link |
2024-07-29 | Advancing Multimodal Large Language Models in Chart Question Answering with Visualization-Referenced Instruction Tuning | Xingchen Zeng et.al. | 2407.20174 | link |
2024-07-29 | Diffusion Feedback Helps CLIP See Better | Wenxuan Wang et.al. | 2407.20171 | null |
2024-07-29 | Language-Conditioned Offline RL for Multi-Robot Navigation | Steven Morad et.al. | 2407.20164 | null |
2024-07-29 | rLLM: Relational Table Learning with LLMs | Weichen Li et.al. | 2407.20157 | link |
2024-07-29 | ByteCheckpoint: A Unified Checkpointing System for LLM Development | Borui Wan et.al. | 2407.20143 | null |
2024-07-29 | Orca: Ocean Significant Wave Height Estimation with Spatio-temporally Aware Large Language Models | Zhe Li et.al. | 2407.20053 | null |
2024-07-26 | Small Molecule Optimization with Large Language Models | Philipp Guevorguian et.al. | 2407.18897 | link |
2024-07-26 | Human-artificial intelligence teaming for scientific information extraction from data-driven additive manufacturing research using large language models | Mutahar Safdar et.al. | 2407.18827 | null |
2024-07-26 | Automatic Detection of Moral Values in Music Lyrics | Vjosa Preniqi et.al. | 2407.18787 | null |
2024-07-26 | The power of Prompts: Evaluating and Mitigating Gender Bias in MT with LLMs | Aleix Sant et.al. | 2407.18786 | null |
2024-07-26 | TAGIFY: LLM-powered Tagging Interface for Improved Data Findability on OGD portals | Kevin Kliimask et.al. | 2407.18764 | null |
2024-07-29 | Knowledge Graph Structure as Prompt: Improving Small Language Models Capabilities for Knowledge-based Causal Discovery | Yuni Susanti et.al. | 2407.18752 | link |
2024-07-26 | Towards Effective and Efficient Continual Pre-training of Large Language Models | Jie Chen et.al. | 2407.18743 | null |
2024-07-26 | Towards Generalized Offensive Language Identification | Alphaeus Dmonte et.al. | 2407.18738 | null |
2024-07-26 | LLASP: Fine-tuning Large Language Models for Answer Set Programming | Erica Coppolillo et.al. | 2407.18723 | null |
2024-07-26 | Neurosymbolic AI for Enhancing Instructability in Generative AI | Amit Sheth et.al. | 2407.18722 | null |
2024-07-26 | Recursive Introspection: Teaching Language Model Agents How to Self-Improve | Yuxiao Qu et.al. | 2407.18219 | null |
2024-07-26 | Exploring Scaling Trends in LLM Robustness | Nikolaus Howe et.al. | 2407.18213 | null |
2024-07-25 | Unlocking Tokens as Data Points for Generalization Bounds on Larger Language Models | Sanae Lotfi et.al. | 2407.18158 | null |
2024-07-26 | Dallah: A Dialect-Aware Multimodal Large Language Model for Arabic | Fakhraddin Alwajih et.al. | 2407.18129 | null |
2024-07-25 | Fine-Tuning Large Language Models for Stock Return Prediction Using Newsflow | Tian Guo et.al. | 2407.18103 | null |
2024-07-25 | PEFT-U: Parameter-Efficient Fine-Tuning for User Personalization | Christopher Clarke et.al. | 2407.18078 | link |
2024-07-25 | C2P: Featuring Large Language Models with Causal Reasoning | Abdolmahdi Bagheri et.al. | 2407.18069 | null |
2024-07-25 | ComPeer: A Generative Conversational Agent for Proactive Peer Support | Tianjian Liu et.al. | 2407.18064 | null |
2024-07-25 | Audio Entailment: Assessing Deductive Reasoning for Audio Understanding | Soham Deshmukh et.al. | 2407.18062 | link |
2024-07-25 | Difficulty Estimation and Simplification of French Text Using LLMs | Henri Jamet et.al. | 2407.18061 | null |
2024-07-24 | I Could've Asked That: Reformulating Unanswerable Questions | Wenting Zhao et.al. | 2407.17469 | link |
2024-07-24 | WildHallucinations: Evaluating Long-form Factuality in LLMs with Real-World Entity Queries | Wenting Zhao et.al. | 2407.17468 | null |
2024-07-24 | CMR Scaling Law: Predicting Critical Mixture Ratios for Continual Pre-training of Language Models | Jiawei Gu et.al. | 2407.17467 | null |
2024-07-24 | | Yunhao Fang et.al. | 2407.17453 | null |
2024-07-24 | Can Watermarking Large Language Models Prevent Copyrighted Text Generation and Hide Training Data? | Michael-Andrei Panaitescu-Liess et.al. | 2407.17417 | null |
2024-07-24 | (PASS) Visual Prompt Locates Good Structure Sparsity through a Recurrent HyperNetwork | Tianjin Huang et.al. | 2407.17412 | null |
2024-07-24 | Grammar-based Game Description Generation using Large Language Models | Tsunehiko Tanaka et.al. | 2407.17404 | null |
2024-07-24 | 3D Question Answering for City Scene Understanding | Penglei Sun et.al. | 2407.17398 | null |
2024-07-24 | ViPer: Visual Personalization of Generative Models via Individual Preference Learning | Sogand Salehi et.al. | 2407.17365 | null |
2024-07-24 | Scalify: scale propagation for efficient low-precision LLM training | Paul Balança et.al. | 2407.17353 | link |
2024-07-23 | Can Large Language Models Automatically Jailbreak GPT-4V? | Yuanwei Wu et.al. | 2407.16686 | null |
2024-07-23 | RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent | Huiyu Xu et.al. | 2407.16667 | null |
2024-07-23 | Course-Correction: Safety Alignment Using Synthetic Preferences | Rongwu Xu et.al. | 2407.16637 | null |
2024-07-23 | Lawma: The Power of Specialization for Legal Tasks | Ricardo Dominguez-Olmedo et.al. | 2407.16615 | null |
2024-07-23 | Shared Imagination: LLMs Hallucinate Alike | Yilun Zhou et.al. | 2407.16604 | null |
2024-07-23 | Exploring Automatic Cryptographic API Misuse Detection in the Era of LLMs | Yifan Xia et.al. | 2407.16576 | null |
2024-07-23 | Retrieve, Generate, Evaluate: A Case Study for Medical Paraphrases Generation with Small Language Models | Ioana Buhnila et.al. | 2407.16565 | null |
2024-07-23 | Patched RTC: evaluating LLMs for diverse software development tasks | Asankhaya Sharma et.al. | 2407.16557 | null |
2024-07-24 | MicroEmo: Time-Sensitive Multimodal Emotion Recognition with Micro-Expression Dynamics in Video Dialogues | Liyun Zhang et.al. | 2407.16552 | null |
2024-07-23 | HAPFI: History-Aware Planning based on Fused Information | Sujin Jeon et.al. | 2407.16533 | null |
2024-07-22 | AutoAD-Zero: A Training-Free Framework for Zero-Shot Audio Description | Junyu Xie et.al. | 2407.15850 | link |
2024-07-22 | LLMmap: Fingerprinting For Large Language Models | Dario Pasquini et.al. | 2407.15847 | null |
2024-07-22 | SlowFast-LLaVA: A Strong Training-Free Baseline for Video Large Language Models | Mingze Xu et.al. | 2407.15841 | null |
2024-07-22 | MMInstruct: A High-Quality Multi-Modal Instruction Tuning Dataset with Extensive Diversity | Yangzhou Liu et.al. | 2407.15838 | null |
2024-07-22 | dMel: Speech Tokenization made Simple | He Bai et.al. | 2407.15835 | null |
2024-07-22 | Accelerating Pre-training of Multimodal LLMs via Chain-of-Sight | Ziyuan Huang et.al. | 2407.15819 | null |
2024-07-22 | Extracting Structured Insights from Financial News: An Augmented LLM Driven Approach | Rian Dolphin et.al. | 2407.15788 | null |
2024-07-22 | MoRSE: Bridging the Gap in Cybersecurity Expertise with Retrieval Augmented Generation | Marco Simoni et.al. | 2407.15748 | null |
2024-07-22 | OMoS-QA: A Dataset for Cross-Lingual Extractive Question Answering in a German Migration Context | Steffen Kleinle et.al. | 2407.15736 | null |
2024-07-22 | TaskGen: A Task-Based, Memory-Infused Agentic Framework using StrictJSON | John Chong Min Tan et.al. | 2407.15734 | null |
2024-07-19 | Internal Consistency and Self-Feedback in Large Language Models: A Survey | Xun Liang et.al. | 2407.14507 | link |
2024-07-19 | On Pre-training of Multimodal Language Models Customized for Chart Understanding | Wan-Cyuan Fan et.al. | 2407.14506 | null |
2024-07-19 | Evaluating the Reliability of Self-Explanations in Large Language Models | Korbinian Randl et.al. | 2407.14487 | link |
2024-07-19 | Contrastive Learning with Counterfactual Explanations for Radiology Report Generation | Mingjie Li et.al. | 2407.14474 | null |
2024-07-19 | Check-Eval: A Checklist-based Approach for Evaluating Text Quality | Jayr Pereira et.al. | 2407.14467 | null |
2024-07-19 | Undermining Mental Proof: How AI Can Make Cooperation Harder by Making Thinking Easier | Zachary Wojtowicz et.al. | 2407.14452 | null |
2024-07-19 | Token-level Correlation-guided Compression for Efficient Multimodal Document Understanding | Renshan Zhang et.al. | 2407.14439 | link |
2024-07-19 | The Vision of Autonomic Computing: Can LLMs Make It a Reality? | Zhiyang Zhang et.al. | 2407.14402 | null |
2024-07-19 | Open Artificial Knowledge | Vadim Borisov et.al. | 2407.14371 | null |
2024-07-19 | Enhancing Zero-shot Audio Classification using Sound Attribute Knowledge from Large Language Models | Xuenan Xu et.al. | 2407.14355 | null |
2024-07-18 | SegPoint: Segment Any Point Cloud via Large Language Model | Shuting He et.al. | 2407.13761 | null |
2024-07-18 | Streetscapes: Large-scale Consistent Street View Generation Using Autoregressive Video Diffusion | Boyang Deng et.al. | 2407.13759 | null |
2024-07-18 | Black-Box Opinion Manipulation Attacks to Retrieval-Augmented Generation of Large Language Models | Zhuo Chen et.al. | 2407.13757 | null |
2024-07-18 | CellularLint: A Systematic Approach to Identify Inconsistent Behavior in Cellular Network Specifications | Mirza Masfiqur Rahman et.al. | 2407.13742 | null |
2024-07-18 | Baba Is AI: Break the Rules to Beat the Benchmark | Nathan Cloos et.al. | 2407.13729 | null |
2024-07-18 | CoDefeater: Using LLMs To Find Defeaters in Assurance Cases | Usman Gohar et.al. | 2407.13717 | null |
2024-07-18 | Understanding Reference Policies in Direct Preference Optimization | Yixin Liu et.al. | 2407.13709 | null |
2024-07-18 | A Comprehensive Review of Recommender Systems: Transitioning from Theory to Practice | Shaina Raza et.al. | 2407.13699 | null |
2024-07-18 | Prover-Verifier Games improve legibility of LLM outputs | Jan Hendrik Kirchner et.al. | 2407.13692 | null |
2024-07-18 | COMCAT: Leveraging Human Judgment to Improve Automatic Documentation and Summarization | Skyler Grandel et.al. | 2407.13648 | null |
2024-07-17 | EchoSight: Advancing Visual-Language Models with Wiki Knowledge | Yibin Yan et.al. | 2407.12735 | null |
2024-07-17 | NL2Contact: Natural Language Guided 3D Hand-Object Contact Modeling with Diffusion Model | Zhongqun Zhang et.al. | 2407.12727 | null |
2024-07-17 | Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models? | Ben Yao et.al. | 2407.12725 | null |
2024-07-17 | The Future of Learning: Large Language Models through the Lens of Students | He Zhang et.al. | 2407.12723 | null |
2024-07-17 | MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models | Leyang Shen et.al. | 2407.12709 | link |
2024-07-17 | Patch-Level Training for Large Language Models | Chenze Shao et.al. | 2407.12665 | link |
2024-07-17 | Zero-shot Text-guided Infinite Image Synthesis with LLM guidance | Soyeong Kwon et.al. | 2407.12642 | null |
2024-07-17 | Harnessing the Power of Artificial Intelligence to Vitalize Endangered Indigenous Languages: Technologies and Experiences | Claudio Pinhanez et.al. | 2407.12620 | null |
2024-07-17 | AudienceView: AI-Assisted Interpretation of Audience Feedback in Journalism | William Brannon et.al. | 2407.12613 | link |
2024-07-17 | E5-V: Universal Embeddings with Multimodal Large Language Models | Ting Jiang et.al. | 2407.12580 | link |
2024-07-16 | UrbanWorld: An Urban World Model for 3D City Generation | Yu Shang et.al. | 2407.11965 | null |
2024-07-16 | NeedleBench: Can LLMs Do Retrieval and Reasoning in 1 Million Context Window? | Mo Li et.al. | 2407.11963 | link |
2024-07-16 | Code Documentation and Analysis to Secure Software Development | Paul Attie et.al. | 2407.11934 | null |
2024-07-16 | What's Wrong? Refining Meeting Summaries with LLM Feedback | Frederic Kirstein et.al. | 2407.11919 | null |
2024-07-16 | Ascend-CC: Confidential Computing on Heterogeneous NPU for Emerging Generative AI Workloads | Aritra Dhar et.al. | 2407.11888 | null |
2024-07-16 | Schema Matching with Large Language Models: an Experimental Study | Marcel Parciak et.al. | 2407.11852 | link |
2024-07-16 | LoFTI: Localization and Factuality Transfer to Indian Locales | Sona Elza Simon et.al. | 2407.11833 | link |
2024-07-16 | GPT Assisted Annotation of Rhetorical and Linguistic Features for Interpretable Propaganda Technique Detection in News Text | Kyle Hamilton et.al. | 2407.11827 | null |
2024-07-16 | PipeInfer: Accelerating LLM Inference using Asynchronous Pipelined Speculation | Branden Butler et.al. | 2407.11798 | null |
2024-07-16 | Large Language Models as Misleading Assistants in Conversation | Betty Li Hou et.al. | 2407.11789 | null |
2024-07-15 | VGBench: Evaluating Large Language Models on Vector Graphics Understanding and Generation | Bocheng Zou et.al. | 2407.10972 | link |
2024-07-15 | Q-Sparse: All Large Language Models can be Fully Sparsely-Activated | Hongyu Wang et.al. | 2407.10969 | null |
2024-07-15 | Fast Matrix Multiplications for Lookup Table-Quantized LLMs | Han Guo et.al. | 2407.10960 | null |
2024-07-15 | MMM: Multilingual Mutual Reinforcement Effect Mix Datasets & Test with Open-domain Information Extraction Large Language Models | Chengguang Gan et.al. | 2407.10953 | null |
2024-07-15 | Can Textual Semantics Mitigate Sounding Object Segmentation Preference? | Yaoting Wang et.al. | 2407.10947 | link |
2024-07-15 | GRUtopia: Dream General Robots in a City at Scale | Hanqing Wang et.al. | 2407.10943 | link |
2024-07-15 | OPa-Ma: Text Guided Mamba for 360-degree Image Out-painting | Penglei Gao et.al. | 2407.10923 | null |
2024-07-15 | FinDKG: Dynamic Knowledge Graphs with Large Language Models for Detecting Global Trends in Financial Markets | Xiaohui Victor Li et.al. | 2407.10909 | null |
2024-07-15 | Hey, That's My Model! Introducing Chain & Hash, An LLM Fingerprinting Technique | Mark Russinovich et.al. | 2407.10887 | null |
2024-07-15 | SLIP: Securing LLMs IP Using Weights Decomposition | Yehonathan Refael et.al. | 2407.10886 | null |
2024-07-12 | FairyLandAI: Personalized Fairy Tales utilizing ChatGPT and DALLE-3 | Georgios Makridis et.al. | 2407.09467 | null |
2024-07-12 | Human-like Episodic Memory for Infinite Context LLMs | Zafeirios Fountas et.al. | 2407.09450 | null |
2024-07-12 | ASTPrompter: Weakly Supervised Automated Language Model Red-Teaming to Identify Likely Toxic Prompts | Amelia F. Hardy et.al. | 2407.09447 | null |
2024-07-12 | MUSCLE: A Model Update Strategy for Compatible LLM Evolution | Jessica Echterhoff et.al. | 2407.09435 | null |
2024-07-12 | Open (Clinical) LLMs are Sensitive to Instruction Phrasings | Alberto Mario Ceballos Arroyo et.al. | 2407.09429 | null |
2024-07-12 | TelecomGPT: A Framework to Build Telecom-Specfic Large Language Models | Hang Zou et.al. | 2407.09424 | null |
2024-07-12 | Mitigating Entity-Level Hallucination in Large Language Models | Weihang Su et.al. | 2407.09417 | link |
2024-07-12 | SPIQA: A Dataset for Multimodal Question Answering on Scientific Papers | Shraman Pramanick et.al. | 2407.09413 | link |
2024-07-12 | PersonaRAG: Enhancing Retrieval-Augmented Generation Systems with User-Centric Agents | Saber Zerhoudi et.al. | 2407.09394 | null |
2024-07-12 | GAVEL: Generating Games Via Evolution and Language Models | Graham Todd et.al. | 2407.09388 | null |
2024-07-11 | MAVIS: Mathematical Visual Instruction Tuning | Renrui Zhang et.al. | 2407.08739 | link |
2024-07-11 | Real-Time Anomaly Detection and Reactive Planning with Large Language Models | Rohan Sinha et.al. | 2407.08735 | null |
2024-07-11 | Is Your Model Really A Good Math Reasoner? Evaluating Mathematical Reasoning with Checklist | Zihao Zhou et.al. | 2407.08733 | null |
2024-07-11 | A Taxonomy for Data Contamination in Large Language Models | Medha Palavalli et.al. | 2407.08716 | null |
2024-07-11 | GTA: A Benchmark for General Tool Agents | Jize Wang et.al. | 2407.08713 | link |
2024-07-11 | Live2Diff: Live Stream Translation via Uni-directional Attention in Video Diffusion Models | Zhening Xing et.al. | 2407.08701 | null |
2024-07-11 | Mitigating Catastrophic Forgetting in Language Transfer via Model Merging | Anton Alexandrov et.al. | 2407.08699 | null |
2024-07-11 | Cloud Atlas: Efficient Fault Localization for Cloud Systems using Language Models and Causal Insight | Zhiqiang Xie et.al. | 2407.08694 | null |
2024-07-11 | SEED-Story: Multimodal Long Story Generation with Large Language Model | Shuai Yang et.al. | 2407.08683 | link |
2024-07-11 | Uncertainty Estimation of Large Language Models in Medical Question Answering | Jiaxin Wu et.al. | 2407.08662 | null |
2024-07-10 | Training on the Test Task Confounds Evaluation and Emergence | Ricardo Dominguez-Olmedo et.al. | 2407.07890 | link |
2024-07-10 | Towards Robust Alignment of Language Models: Distributionally Robustifying Direct Preference Optimization | Junkang Wu et.al. | 2407.07880 | link |
2024-07-10 | FACTS About Building Retrieval Augmented Generation-based Chatbots | Rama Akkiraju et.al. | 2407.07858 | null |
2024-07-10 | OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training | Sami Jaghouar et.al. | 2407.07852 | link |
2024-07-10 | Natural Language Mechanisms via Self-Resolution with Foundation Models | Nicolas Della Penna et.al. | 2407.07845 | null |
2024-07-10 | Transformer Alignment in Large Language Models | Murdock Aubry et.al. | 2407.07810 | null |
2024-07-10 | Attribute or Abstain: Large Language Models as Long Document Assistants | Jan Buchmann et.al. | 2407.07799 | link |
2024-07-11 | Evaluating Large Language Models with Grid-Based Game Competitions: An Extensible LLM Benchmark and Leaderboard | Oguzhan Topsakal et.al. | 2407.07796 | link |
2024-07-10 | Flooding Spread of Manipulated Knowledge in LLM-Based Multi-Agent Communities | Tianjie Ju et.al. | 2407.07791 | link |
2024-07-10 | WorldAPIs: The World Is Worth How Many APIs? A Thought Experiment | Jiefu Ou et.al. | 2407.07778 | null |
2024-07-09 | AnyTaskTune: Advanced Domain-Specific Solutions through Task-Fine-Tuning | Jiaxi Cui et.al. | 2407.07094 | link |
2024-07-09 | FBI-LLM: Scaling Up Fully Binarized LLMs from Scratch via Autoregressive Distillation | Liqun Ma et.al. | 2407.07093 | link |
2024-07-09 | Hypothetical Minds: Scaffolding Theory of Mind for Multi-Agent Tasks with Large Language Models | Logan Cross et.al. | 2407.07086 | link |
2024-07-09 | Adapting LLMs to Hebrew: Unveiling DictaLM 2.0 with Enhanced Vocabulary and Instruction Capabilities | Shaltiel Shmidman et.al. | 2407.07080 | null |
2024-07-09 | Lookback Lens: Detecting and Mitigating Contextual Hallucinations in Large Language Models Using Only Attention Maps | Yung-Sung Chuang et.al. | 2407.07071 | link |
2024-07-09 | Prompting Techniques for Secure Code Generation: A Systematic Investigation | Catherine Tony et.al. | 2407.07064 | null |
2024-07-10 | Internet of Agents: Weaving a Web of Heterogeneous Agents for Collaborative Intelligence | Weize Chen et.al. | 2407.07061 | link |
2024-07-10 | Multimodal Self-Instruct: Synthetic Abstract Image and Visual Reasoning Instruction Using Language Model | Wenqi Zhang et.al. | 2407.07053 | link |
2024-07-09 | Using Large Language Models for Generating Smart Contracts for Health Insurance from Textual Policies | Inwon Kang et.al. | 2407.07019 | null |
2024-07-09 | End-To-End Causal Effect Estimation from Unstructured Natural Language Data | Nikita Dhawan et.al. | 2407.07018 | null |
2024-07-08 | Video-STaR: Self-Training Enables Video Instruction Tuning with Any Supervision | Orr Zohar et.al. | 2407.06189 | link |
2024-07-08 | CrowdMoGen: Zero-Shot Text-Driven Collective Motion Generation | Xinying Guo et.al. | 2407.06188 | null |
2024-07-08 | On Speeding Up Language Model Evaluation | Jin Peng Zhou et.al. | 2407.06172 | null |
2024-07-08 | What's Wrong with Your Code Generated by Large Language Models? An Extensive Study | Shihan Dou et.al. | 2407.06153 | null |
2024-07-09 | Using Grammar Masking to Ensure Syntactic Validity in LLM-based Modeling Tasks | Lukas Netz et.al. | 2407.06146 | null |
2024-07-08 | ANOLE: An Open, Autoregressive, Native Large Multimodal Models for Interleaved Image-Text Generation | Ethan Chern et.al. | 2407.06135 | link |
2024-07-09 | Evaluating the Semantic Profiling Abilities of LLMs for Natural Language Utterances in Data Visualization | Hannah K. Bako et.al. | 2407.06129 | link |
2024-07-08 | Depression Detection and Analysis using Large Language Models on Textual and Audio-Visual Modalities | Avinash Anand et.al. | 2407.06125 | null |
2024-07-08 | Artificial Intuition: Efficient Classification of Scientific Abstracts | Harsh Sakhrani et.al. | 2407.06093 | null |
2024-07-08 | Merge, Ensemble, and Cooperate! A Survey on Collaborative Strategies in the Era of Large Language Models | Jinliang Lu et.al. | 2407.06089 | null |
2024-07-05 | Me, Myself, and AI: The Situational Awareness Dataset (SAD) for LLMs | Rudolf Laine et.al. | 2407.04694 | null |
2024-07-05 | ANAH-v2: Scaling Analytical Hallucination Annotation of Large Language Models | Yuzhe Gu et.al. | 2407.04693 | null |
2024-07-05 | Rethinking Visual Prompting for Multimodal Large Language Models with External Knowledge | Yuanze Lin et.al. | 2407.04681 | null |
2024-07-05 | Seed-ASR: Understanding Diverse Speech and Contexts with LLM-based Speech Recognition | Ye Bai et.al. | 2407.04675 | null |
2024-07-05 | Lazarus: Resilient and Elastic Training of Mixture-of-Experts Models with Adaptive Expert Placement | Yongji Wu et.al. | 2407.04656 | null |
2024-07-05 | Entity Decomposition with Filtering: A Zero-Shot Clinical Named Entity Recognition Framework | Reza Averly et.al. | 2407.04629 | null |
2024-07-05 | On scalable oversight with weak LLMs judging strong LLMs | Zachary Kenton et.al. | 2407.04622 | null |
2024-07-05 | Leveraging Large Language Models for Integrated Satellite-Aerial-Terrestrial Networks: Recent Advances and Future Directions | Shumaila Javaid et.al. | 2407.04581 | null |
2024-07-05 | VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models | Hang Gao et.al. | 2407.04573 | null |
2024-07-05 | PoPreRo: A New Dataset for Popularity Prediction of Romanian Reddit Posts | Ana-Cristina Rogoz et.al. | 2407.04541 | link |
2024-07-03 | Universal Length Generalization with Turing Programs | Kaiying Hou et.al. | 2407.03310 | null |
2024-07-03 | Large Language Models for JSON Schema Discovery | Michael J. Mior et.al. | 2407.03286 | null |
2024-07-03 | LLM Internal States Reveal Hallucination Risk Faced With a Query | Ziwei Ji et.al. | 2407.03282 | null |
2024-07-03 | Programming universal unitary transformations on a general-purpose silicon photonics platform | Jose Roberto Rausell-Campo et.al. | 2407.03235 | null |
2024-07-03 | Improving Retrieval-augmented Text-to-SQL with AST-based Ranking and Schema Pruning | Zhili Shen et.al. | 2407.03227 | null |
2024-07-03 | How Does Quantization Affect Multilingual LLMs? | Kelly Marchisio et.al. | 2407.03211 | null |
2024-07-03 | TheoremLlama: Transforming General-Purpose LLMs into Lean4 Experts | Ruida Wang et.al. | 2407.03203 | link |
2024-07-03 | Fine-Tuning with Divergent Chains of Thought Boosts Reasoning Through Self-Correction in Language Models | Haritz Puerto et.al. | 2407.03181 | link |
2024-07-03 | Investigating Decoder-only Large Language Models for Speech-to-text Translation | Chao-Wei Huang et.al. | 2407.03169 | null |
2024-07-03 | SOS! Soft Prompt Attack Against Open-Source Large Language Models | Ziqing Yang et.al. | 2407.03160 | null |
2024-07-02 | MInference 1.0: Accelerating Pre-filling for Long-Context LLMs via Dynamic Sparse Attention | Huiqiang Jiang et.al. | 2407.02490 | link |
2024-07-02 | Neurocache: Efficient Vector Retrieval for Long-range Language Modeling | Ali Safaya et.al. | 2407.02486 | link |
2024-07-02 | RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs | Yue Yu et.al. | 2407.02485 | null |
2024-07-02 | MMedAgent: Learning to Use Medical Tools with Multi-modal Agent | Binxu Li et.al. | 2407.02483 | null |
2024-07-02 | Understanding Alignment in Multimodal LLMs: A Comprehensive Study | Elmira Amirloo et.al. | 2407.02477 | null |
2024-07-02 | Open Scene Graphs for Open World Object-Goal Navigation | Joel Loo et.al. | 2407.02473 | null |
2024-07-02 | Reliable Confidence Intervals for Information Retrieval Evaluation Using Generative A.I. | Harrie Oosterhuis et.al. | 2407.02464 | null |
2024-07-03 | Video Watermarking: Safeguarding Your Video from (Unauthorized) Annotations by Video-based LLMs | Jinmin Li et.al. | 2407.02411 | null |
2024-07-02 | CEB: Compositional Evaluation Benchmark for Fairness in Large Language Models | Song Wang et.al. | 2407.02408 | null |
2024-07-02 | Assessing the Code Clone Detection Capability of Large Language Models | Zixian Zhang et.al. | 2407.02402 | null |
2024-06-28 | Web2Code: A Large-scale Webpage-to-Code Dataset and Evaluation Framework for Multimodal LLMs | Sukmin Yun et.al. | 2406.20098 | link |
2024-06-28 | LLaRA: Supercharging Robot Learning Data for Vision-Language Policy | Xiang Li et.al. | 2406.20095 | link |
2024-06-28 | Scaling Synthetic Data Creation with 1,000,000,000 Personas | Xin Chan et.al. | 2406.20094 | null |
2024-06-28 | LLaVolta: Efficient Multi-modal Models via Stage-wise Visual Context Compression | Jieneng Chen et.al. | 2406.20092 | link |
2024-06-28 | ProgressGym: Alignment with a Millennium of Moral Progress | Tianyi Qiu et.al. | 2406.20087 | null |
2024-06-28 | Auto Cherry-Picker: Learning from High-quality Generative Data Driven by Language | Yicheng Chen et.al. | 2406.20085 | null |
2024-06-28 | Molecular Facts: Desiderata for Decontextualization in LLM Fact Verification | Anisha Gunjal et.al. | 2406.20079 | link |
2024-07-02 | BMW Agents -- A Framework For Task Automation Through Multi-Agent Collaboration | Noel Crawford et.al. | 2406.20041 | null |
2024-06-28 | LEMoE: Advanced Mixture of Experts Adaptor for Lifelong Model Editing of Large Language Models | Renzhi Wang et.al. | 2406.20030 | null |
2024-06-28 | ToolBeHonest: A Multi-level Hallucination Diagnostic Benchmark for Tool-Augmented Large Language Models | Yuxiang Zhang et.al. | 2406.20015 | link |
2024-06-27 | ReXTime: A Benchmark Suite for Reasoning-Across-Time in Videos | Jr-Jen Chen et.al. | 2406.19392 | link |
2024-06-27 | The Remarkable Robustness of LLMs: Stages of Inference? | Vedang Lad et.al. | 2406.19384 | link |
2024-06-27 | Mamba or RWKV: Exploring High-Quality and High-Efficiency Segment Anything Model | Haobo Yuan et.al. | 2406.19369 | null |
2024-06-27 | The Model Arena for Cross-lingual Sentiment Analysis: A Comparative Study in the Era of Large Language Models | Xiliang Zhu et.al. | 2406.19358 | null |
2024-06-27 | DiVERT: Distractor Generation with Variational Errors Represented as Text for Math Multiple-choice Questions | Nigel Fernandez et.al. | 2406.19356 | null |
2024-06-27 | IndoToxic2024: A Demographically-Enriched Dataset of Hate Speech and Toxicity Types for Indonesian Language | Lucky Susanto et.al. | 2406.19349 | null |
2024-06-27 | Efficient World Models with Context-Aware Tokenization | Vincent Micheli et.al. | 2406.19320 | link |
2024-06-27 | Jump Starting Bandits with LLM-Generated Prior Knowledge | Parand A. Alamdari et.al. | 2406.19317 | null |
2024-06-27 | From Artificial Needles to Real Haystacks: Improving Retrieval Capabilities in LLMs by Finetuning on Synthetic Data | Zheyang Xiong et.al. | 2406.19292 | null |
2024-06-27 | PhysioLLM: Supporting Personalized Health Insights with Wearables and Large Language Models | Cathy Mengying Fang et.al. | 2406.19283 | null |
2024-06-26 | Symbolic Learning Enables Self-Evolving Agents | Wangchunshu Zhou et.al. | 2406.18532 | link |
2024-06-26 | PrExMe! Large Scale Prompt Exploration of Open Source LLMs for Machine Translation and Summarization Evaluation | Christoph Leiter et.al. | 2406.18528 | null |
2024-06-26 | CharXiv: Charting Gaps in Realistic Chart Understanding in Multimodal LLMs | Zirui Wang et.al. | 2406.18521 | null |
2024-06-26 | "Is ChatGPT a Better Explainer than My Professor?": Evaluating the Explanation Capabilities of LLMs in Conversation Compared to a Human Baseline | Grace Li et.al. | 2406.18512 | null |
2024-06-26 | Mental Modeling of Reinforcement Learning Agents by Language Models | Wenhao Lu et.al. | 2406.18505 | null |
2024-06-26 | Is In-Context Learning a Type of Gradient-Based Learning? Evidence from the Inverse Frequency Effect in Structural Priming | Zhenghao Zhou et.al. | 2406.18501 | null |
2024-06-26 | LoongTrain: Efficient Training of Long-Sequence LLMs with Head-Context Parallelism | Diandian Gu et.al. | 2406.18485 | null |
2024-06-26 | Role-Play Zero-Shot Prompting with Large Language Models for Open-Domain Human-Machine Conversation | Ahmed Njifenjou et.al. | 2406.18460 | null |
2024-06-26 | Cascading Large Language Models for Salient Event Graph Generation | Xingwei Tan et.al. | 2406.18449 | null |
2024-06-26 | New intelligent empowerment for digital transformation | Peng Yifeng et.al. | 2406.18440 | null |
2024-06-25 | MG-LLaVA: Towards Multi-Granularity Visual Instruction Tuning | Xiangyu Zhao et.al. | 2406.17770 | link |
2024-06-25 | BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning | Ercong Nie et.al. | 2406.17764 | null |
2024-06-25 | CaLMQA: Exploring culturally specific long-form question answering across 23 languages | Shane Arora et.al. | 2406.17761 | link |
2024-06-25 | Accelerating Clinical Evidence Synthesis with Large Language Models | Zifeng Wang et.al. | 2406.17755 | null |
2024-06-25 | Measuring and Benchmarking Large Language Models' Capabilities to Generate Persuasive Language | Amalie Brogaard Pauli et.al. | 2406.17753 | null |
2024-06-25 | LLM Targeted Underperformance Disproportionately Impacts Vulnerable Users | Elinor Poole-Dayan et.al. | 2406.17737 | null |
2024-06-25 | FedBiOT: LLM Local Fine-tuning in Federated Learning without Full Model | Feijie Wu et.al. | 2406.17706 | null |
2024-06-25 | From Distributional to Overton Pluralism: Investigating Large Language Model Alignment | Thom Lake et.al. | 2406.17692 | link |
2024-06-26 | VarBench: Robust Language Model Benchmarking Through Dynamic Variable Perturbation | Kun Qian et.al. | 2406.17681 | link |
2024-06-25 | Quantifying AI Psychology: A Psychometrics Benchmark for Large Language Models | Yuan Li et.al. | 2406.17675 | null |
2024-06-24 | EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees | Yuhui Li et.al. | 2406.16858 | null |
2024-06-24 | From Decoding to Meta-Generation: Inference-time Algorithms for Large Language Models | Sean Welleck et.al. | 2406.16838 | null |
2024-06-24 | USDC: A Dataset of $\underline{U}$ser $\underline{S}$tance and $\underline{D}$ogmatism in Long $\underline{C}$onversations | Mounika Marreddy et.al. | 2406.16833 | null |
2024-06-24 | Ragnarök: A Reusable RAG Framework and Baselines for TREC 2024 Retrieval-Augmented Generation Track | Ronak Pradeep et.al. | 2406.16828 | null |
2024-06-24 | RES-Q: Evaluating Code-Editing Large Language Model Systems at the Repository Scale | Beck LaBash et.al. | 2406.16801 | link |
2024-06-25 | Lottery Ticket Adaptation: Mitigating Destructive Interference in LLMs | Ashwinee Panda et.al. | 2406.16797 | link |
2024-06-24 | M2Lingual: Enhancing Multilingual, Multi-Turn Instruction Alignment in Large Language Models | Rishabh Maheshwary et.al. | 2406.16783 | null |
2024-06-24 | It Is Not About What You Say, It Is About How You Say It: A Surprisingly Simple Approach for Improving Reading Comprehension | Sagi Shaier et.al. | 2406.16779 | null |
2024-06-24 | Blending LLMs into Cascaded Speech Translation: KIT's Offline Speech Translation System for IWSLT 2024 | Sai Koneru et.al. | 2406.16777 | null |
2024-06-24 | WARP: On the Benefits of Weight Averaged Rewarded Policies | Alexandre Ramé et.al. | 2406.16768 | null |
2024-06-21 | GenoTEX: A Benchmark for Evaluating LLM-Based Exploration of Gene Expression Data in Alignment with Bioinformaticians | Haoyang Liu et.al. | 2406.15341 | link |
2024-06-21 | Gradient-Mask Tuning Elevates the Upper Limits of LLM Performance | Haoling Li et.al. | 2406.15330 | null |
2024-06-21 | Bug In the Code Stack: Can LLMs Find Bugs in Large Python Code Stacks | Hokyung Lee et.al. | 2406.15325 | null |
2024-06-21 | Towards Fine-Grained Citation Evaluation in Generated Text: A Comparative Analysis of Faithfulness Metrics | Weijia Zhang et.al. | 2406.15264 | null |
2024-06-21 | Detecting Synthetic Lyrics with Few-Shot Inference | Yanis Labrak et.al. | 2406.15231 | null |
2024-06-21 | A LLM-Based Ranking Method for the Evaluation of Automatic Counter-Narrative Generation | Irune Zubiaga et.al. | 2406.15227 | null |
2024-06-21 | Unsupervised Extraction of Dialogue Policies from Conversations | Makesh Narsimhan Sreedhar et.al. | 2406.15214 | null |
2024-06-21 | Prompting Whisper for QA-driven Zero-shot End-to-end Spoken Language Understanding | Mohan Li et.al. | 2406.15209 | null |
2024-06-21 | Exploring the Efficacy of Robotic Assistants with ChatGPT and Claude in Enhancing ADHD Therapy: Innovating Treatment Paradigms | Santiago Berrezueta-Guzman et.al. | 2406.15198 | null |
2024-06-21 | UDA: A Benchmark Suite for Retrieval Augmented Generation in Real-world Document Analysis | Yulong Hui et.al. | 2406.15187 | link |
2024-06-20 | Model Merging and Safety Alignment: One Bad Model Spoils the Bunch | Hasan Abed Al Kader Hammoud et.al. | 2406.14563 | null |
2024-06-20 | Whiteboard-of-Thought: Thinking Step-by-Step Across Modalities | Sachit Menon et.al. | 2406.14562 | null |
2024-06-21 | Asynchronous Large Language Model Enhanced Planner for Autonomous Driving | Yuan Chen et.al. | 2406.14556 | null |
2024-06-20 | GraphReader: Building Graph-based Agent to Enhance Long-Context Abilities of Large Language Models | Shilong Li et.al. | 2406.14550 | null |
2024-06-20 | Uncovering Latent Memories: Assessing Data Leakage and Memorization Patterns in Large Language Models | Sunny Duan et.al. | 2406.14549 | null |
2024-06-20 | Connecting the Dots: LLMs can Infer and Verbalize Latent Structure from Disparate Training Data | Johannes Treutlein et.al. | 2406.14546 | link |
2024-06-20 | Unmasking Database Vulnerabilities: Zero-Knowledge Schema Inference Attacks in Text-to-SQL Systems | Đorđe Klisura et.al. | 2406.14545 | null |
2024-06-20 | Prism: A Framework for Decoupling and Assessing the Capabilities of VLMs | Yuxuan Qiao et.al. | 2406.14544 | link |
2024-06-21 | Are LLMs Naturally Good at Synthetic Tabular Data Generation? | Shengzhe Xu et.al. | 2406.14541 | link |
2024-06-20 | PostMark: A Robust Blackbox Watermark for Large Language Models | Yapei Chang et.al. | 2406.14517 | link |
2024-06-18 | DrVideo: Document Retrieval Based Long Video Understanding | Ziyu Ma et.al. | 2406.12846 | null |
2024-06-18 | Interpretable Preferences via Multi-Objective Reward Modeling and Mixture-of-Experts | Haoxiang Wang et.al. | 2406.12845 | link |
2024-06-18 | Synergizing Foundation Models and Federated Learning: A Survey | Shenghui Li et.al. | 2406.12844 | null |
2024-06-18 | LaMDA: Large Model Fine-Tuning via Spectrally Decomposed Low-Dimensional Adaptation | Seyedarmin Azizi et.al. | 2406.12832 | link |
2024-06-18 | Is It Good Data for Multilingual Instruction Tuning or Just Bad Multilingual Evaluation for Large Language Models? | Pinzhen Chen et.al. | 2406.12822 | null |
2024-06-18 | Can Large Language Models Always Solve Easy Problems if They Can Solve Harder Ones? | Zhe Yang et.al. | 2406.12809 | null |
2024-06-18 | Identifying Performance-Sensitive Configurations in Software Systems through Code Analysis with LLM Agents | Zehao Wang et.al. | 2406.12806 | null |
2024-06-18 | Supporting Human Raters with the Detection of Harmful Content using Large Language Models | Kurt Thomas et.al. | 2406.12800 | null |
2024-06-18 | ChatGLM: A Family of Large Language Models from GLM-130B to GLM-4 All Tools | Team GLM et.al. | 2406.12793 | null |
2024-06-18 | UBENCH: Benchmarking Uncertainty in Large Language Models with Multiple Choice Questions | Xunzhi Wang et.al. | 2406.12784 | null |
2024-06-17 | LLaNA: Large Language and NeRF Assistant | Andrea Amaduzzi et.al. | 2406.11840 | null |
2024-06-17 | mDPO: Conditional Preference Optimization for Multimodal Large Language Models | Fei Wang et.al. | 2406.11839 | null |
2024-06-17 | Unveiling Encoder-Free Vision-Language Models | Haiwen Diao et.al. | 2406.11832 | link |
2024-06-17 | Exploring the Role of Large Language Models in Prompt Encoding for Diffusion Models | Bingqi Ma et.al. | 2406.11831 | null |
2024-06-17 | WPO: Enhancing RLHF with Weighted Preference Optimization | Wenxuan Zhou et.al. | 2406.11827 | link |
2024-06-17 | Embodied Instruction Following in Unknown Environments | Zhenyu Wu et.al. | 2406.11818 | null |
2024-06-17 | VideoLLM-online: Online Video Large Language Model for Streaming Video | Joya Chen et.al. | 2406.11816 | null |
2024-06-17 | How Do Large Language Models Acquire Factual Knowledge During Pretraining? | Hoyeon Chang et.al. | 2406.11813 | null |
2024-06-17 | RepLiQA: A Question-Answering Dataset for Benchmarking LLMs on Unseen Reference Content | Joao Monteiro et.al. | 2406.11811 | null |
2024-06-17 | Safety Arithmetic: A Framework for Test-time Safety Alignment of Language Models by Steering Parameters and Activations | Rima Hazra et.al. | 2406.11801 | link |
2024-06-14 | Quantifying Variance in Evaluation Benchmarks | Lovish Madaan et.al. | 2406.10229 | null |
2024-06-14 | Semantic Membership Inference Attack against Large Language Models | Hamid Mozaffari et.al. | 2406.10218 | null |
2024-06-14 | Regularizing Hidden States Enables Learning Generalizable Reward Model for LLMs | Rui Yang et.al. | 2406.10216 | null |
2024-06-14 | Be like a Goldfish, Don't Memorize! Mitigating Memorization in Generative LLMs | Abhimanyu Hans et.al. | 2406.10209 | link |
2024-06-14 | TRIP-PAL: Travel Planning with Guarantees by Combining Large Language Models and Automated Planners | Tomas de la Rosa et.al. | 2406.10196 | null |
2024-06-14 | Detecting and Evaluating Medical Hallucinations in Large Vision Language Models | Jiawei Chen et.al. | 2406.10185 | null |
2024-06-14 | Practical offloading for fine-tuning LLM on commodity GPU via learned subspace projectors | Siyuan Chen et.al. | 2406.10181 | null |
2024-06-14 | Datasets for Multilingual Answer Sentence Selection | Matteo Gabburo et.al. | 2406.10172 | null |
2024-06-14 | Sycophancy to Subterfuge: Investigating Reward-Tampering in Large Language Models | Carson Denison et.al. | 2406.10162 | link |
2024-06-14 | BABILong: Testing the Limits of LLMs with Long Context Reasoning-in-a-Haystack | Yuri Kuratov et.al. | 2406.10149 | null |
2024-06-13 | VideoGPT+: Integrating Image and Video Encoders for Enhanced Video Understanding | Muhammad Maaz et.al. | 2406.09418 | link |
2024-06-13 | Explore the Limits of Omni-modal Pretraining at Scale | Yiyuan Zhang et.al. | 2406.09412 | link |
2024-06-13 | Aligning Vision Models with Human Aesthetics in Retrieval: Benchmarks and Algorithms | Miaosen Zhang et.al. | 2406.09397 | null |
2024-06-13 | Too Many Frames, not all Useful: Efficient Strategies for Long-Form Video QA | Jongwoo Park et.al. | 2406.09396 | null |
2024-06-13 | Needle In A Video Haystack: A Scalable Synthetic Framework for Benchmarking Video MLLMs | Zijia Zhao et.al. | 2406.09367 | link |
2024-06-13 | ElicitationGPT: Text Elicitation Mechanisms via Language Models | Yifan Wu et.al. | 2406.09363 | null |
2024-06-13 | DiscreteSLU: A Large Language Model with Self-Supervised Discrete Speech Units for Spoken Language Understanding | Suwon Shon et.al. | 2406.09345 | null |
2024-06-13 | REVS: Unlearning Sensitive Information in Language Models via Rank Editing in the Vocabulary Space | Tomer Ashuach et.al. | 2406.09325 | null |
2024-06-13 | Bag of Tricks: Benchmarking of Jailbreak Attacks on LLMs | Zhao Xu et.al. | 2406.09324 | link |
2024-06-13 | JailbreakEval: An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models | Delong Ran et.al. | 2406.09321 | link |
2024-06-12 | Improving LLMs for Recommendation with Out-Of-Vocabulary Tokens | Ting-Ji Huang et.al. | 2406.08477 | null |
2024-06-13 | Real2Code: Reconstruct Articulated Objects via Code Generation | Zhao Mandi et.al. | 2406.08474 | null |
2024-06-12 | Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing | Zhangchen Xu et.al. | 2406.08464 | null |
2024-06-12 | TasTe: Teaching Large Language Models to Translate through Self-Reflection | Yutong Wang et.al. | 2406.08434 | link |
2024-06-12 | Next-Generation Database Interfaces: A Survey of LLM-based Text-to-SQL | Zijin Hong et.al. | 2406.08426 | null |
2024-06-12 | State Soup: In-Context Skill Learning, Retrieval and Mixing | Maciej Pióro et.al. | 2406.08423 | null |
2024-06-12 | OmniCorpus: An Unified Multimodal Corpus of 10 Billion-Level Images Interleaved with Text | Qingyun Li et.al. | 2406.08418 | link |
2024-06-12 | Discovering Preference Optimization Algorithms with and for Large Language Models | Chris Lu et.al. | 2406.08414 | link |
2024-06-12 | Memory Is All You Need: An Overview of Compute-in-Memory Architectures for Accelerating Large Language Model Inference | Christopher Wolters et.al. | 2406.08413 | null |
2024-06-12 | Understanding Sounds, Missing the Questions: The Challenge of Object Hallucination in Large Audio-Language Models | Chun-Yi Kuan et.al. | 2406.08402 | link |
2024-06-11 | Open-LLM-Leaderboard: From Multi-choice to Open-style Questions for LLMs Evaluation, Benchmark, and Arena | Aidar Myrzakhan et.al. | 2406.07545 | link |
2024-06-11 | QuickLLaMA: Query-aware Inference Acceleration for Large Language Models | Jingyao Li et.al. | 2406.07528 | link |
2024-06-11 | Beyond Model Collapse: Scaling Up with Synthesized Data Requires Reinforcement | Yunzhen Feng et.al. | 2406.07515 | null |
2024-06-11 | THaLLE: Text Hyperlocally Augmented Large Language Extension -- Technical Report | KBTG Labs et.al. | 2406.07505 | null |
2024-06-11 | Image Textualization: An Automatic Framework for Creating Accurate and Detailed Image Descriptions | Renjie Pi et.al. | 2406.07502 | link |
2024-06-11 | TextGrad: Automatic "Differentiation" via Text | Mert Yuksekgonul et.al. | 2406.07496 | link |
2024-06-12 | CADS: A Systematic Literature Review on the Challenges of Abstractive Dialogue Summarization | Frederic Kirstein et.al. | 2406.07494 | null |
2024-06-11 | PITCH: Productivity and Mental Well-being Coaching through Daily Conversational Interaction | Adnan Abbas et.al. | 2406.07485 | null |
2024-06-11 | Advancing Annotation of Stance in Social Media Posts: A Comparative Analysis of Large Language Models and Crowd Sourcing | Mao Li et.al. | 2406.07483 | null |
2024-06-11 | VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs | Zesen Cheng et.al. | 2406.07476 | link |
2024-06-10 | Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation | Peize Sun et.al. | 2406.06525 | link |
2024-06-10 | UMBRELA: UMbrela is the (Open-Source Reproduction of the) Bing RELevance Assessor | Shivani Upadhyay et.al. | 2406.06519 | link |
2024-06-10 | NarrativeBridge: Enhancing Video Captioning with Causal-Temporal Narrative | Asmar Nadeem et.al. | 2406.06499 | null |
2024-06-10 | Parallelizing Linear Transformers with the Delta Rule over Sequence Length | Songlin Yang et.al. | 2406.06484 | null |
2024-06-10 | Towards a Personal Health Large Language Model | Justin Cosentino et.al. | 2406.06474 | null |
2024-06-10 | AID: Adapting Image2Video Diffusion Models for Instruction-guided Video Prediction | Zhen Xing et.al. | 2406.06465 | null |
2024-06-11 | Transforming Wearable Data into Health Insights using Large Language Model Agents | Mike A. Merrill et.al. | 2406.06464 | null |
2024-06-11 | Reasoning in Token Economies: Budget-Aware Evaluation of LLM Reasoning Strategies | Junlin Wang et.al. | 2406.06461 | null |
2024-06-10 | Evaluating the Retrieval Component in LLM-Based Question Answering Systems | Ashkan Alinejad et.al. | 2406.06458 | null |
2024-06-10 | A Large Language Model Pipeline for Breast Cancer Oncology | Tristen Pool et.al. | 2406.06455 | null |
2024-06-07 | 3D-GRAND: Towards Better Grounding and Less Hallucination for 3D-LLMs | Jianing Yang et.al. | 2406.05132 | null |
2024-06-07 | An Empirical Study on Parameter-Efficient Fine-Tuning for MultiModal Large Language Models | Xiongtao Zhou et.al. | 2406.05130 | null |
2024-06-07 | Towards Semantic Equivalence of Tokenization in Multimodal LLM | Shengqiong Wu et.al. | 2406.05127 | null |
2024-06-07 | LINX: A Language Driven Generative System for Goal-Oriented Automated Data Exploration | Tavor Lipman et.al. | 2406.05107 | null |
2024-06-07 | Multi-Head RAG: Solving Multi-Aspect Problems with LLMs | Maciej Besta et.al. | 2406.05085 | link |
2024-06-07 | Are Large Language Models More Empathetic than Humans? | Anuradha Welivita et.al. | 2406.05063 | null |
2024-06-07 | Robustness Assessment of Mathematical Reasoning in the Presence of Missing and Contradictory Conditions | Shi-Yu Tian et.al. | 2406.05055 | null |
2024-06-07 | Hints-In-Browser: Benchmarking Language Models for Programming Feedback Generation | Nachiket Kotalwar et.al. | 2406.05053 | null |
2024-06-07 | Bootstrapping Referring Multi-Object Tracking | Yani Zhang et.al. | 2406.05039 | null |
2024-06-07 | Efficient 3D Shape Generation via Diffusion Mamba with Bidirectional SSMs | Shentong Mo et.al. | 2406.05038 | null |
2024-06-06 | Verbalized Machine Learning: Revisiting Machine Learning with Language Models | Tim Z. Xiao et.al. | 2406.04344 | null |
2024-06-06 | RoboMamba: Multimodal State Space Model for Efficient Robot Reasoning and Manipulation | Jiaming Liu et.al. | 2406.04339 | null |
2024-06-06 | Coherent Zero-Shot Visual Instruction Generation | Quynh Phung et.al. | 2406.04337 | null |
2024-06-06 | DeepStack: Deeply Stacking Visual Tokens is Surprisingly Simple and Effective for LMMs | Lingchen Meng et.al. | 2406.04334 | null |
2024-06-06 | PaCE: Parsimonious Concept Engineering for Large Language Models | Jinqi Luo et.al. | 2406.04331 | link |
2024-06-06 | Step-aware Preference Optimization: Aligning Preference with Denoising Performance at Each Step | Zhanhao Liang et.al. | 2406.04314 | null |
2024-06-06 | Semantically Diverse Language Generation for Uncertainty Estimation in Language Models | Lukas Aichberger et.al. | 2406.04306 | link |
2024-06-06 | Quixer: A Quantum Transformer Model | Nikhil Khatri et.al. | 2406.04305 | null |
2024-06-06 | Text-to-Drive: Diverse Driving Behavior Synthesis via Large Language Models | Phat Nguyen et.al. | 2406.04300 | null |
2024-06-07 | What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages | Nadav Borenstein et.al. | 2406.04289 | null |
2024-06-05 | Wings: Learning Multimodal LLMs without Text-only Forgetting | Yi-Kai Zhang et.al. | 2406.03496 | null |
2024-06-06 | Seq1F1B: Efficient Sequence-Level Pipeline Parallelism for Large Language Model Training | Ao Sun et.al. | 2406.03488 | null |
2024-06-05 | Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends | Sanjana Ramprasad et.al. | 2406.03487 | null |
2024-06-05 | BIPED: Pedagogically Informed Tutoring System for ESL Education | Soonwoo Kwon et.al. | 2406.03486 | null |
2024-06-05 | Does your data spark joy? Performance gains from domain upsampling at the end of training | Cody Blakeney et.al. | 2406.03476 | null |
2024-06-05 | AD-H: Autonomous Driving with Hierarchical Agents | Zaibin Zhang et.al. | 2406.03474 | null |
2024-06-05 | What is the Best Way for ChatGPT to Translate Poetry? | Shanshan Wang et.al. | 2406.03450 | null |
2024-06-05 | Pre-trained Large Language Models Use Fourier Features to Compute Addition | Tianyi Zhou et.al. | 2406.03445 | null |
2024-06-05 | Cycles of Thought: Measuring LLM Confidence through Stable Explanations | Evan Becker et.al. | 2406.03441 | null |
2024-06-05 | Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach | Saehyung Lee et.al. | 2406.03411 | link |
2024-06-04 | Learning to grok: Emergence of in-context learning and skill composition in modular arithmetic tasks | Tianyu He et.al. | 2406.02550 | link |
2024-06-04 | Leveraging Visual Tokens for Extended Text Contexts in Multi-Modal Learning | Alex Jinpeng Wang et.al. | 2406.02547 | link |
2024-06-04 | To Believe or Not to Believe Your LLM | Yasin Abbasi Yadkori et.al. | 2406.02543 | null |
2024-06-04 | Loki: Low-Rank Keys for Efficient Sparse Attention | Prajwal Singhania et.al. | 2406.02542 | null |
2024-06-04 | Parrot: Multilingual Visual Instruction Tuning | Hai-Long Sun et.al. | 2406.02539 | null |
2024-06-04 | Mitigate Position Bias in Large Language Models via Scaling a Single Dimension | Yijiong Yu et.al. | 2406.02536 | null |
2024-06-04 | SpecExec: Massively Parallel Speculative Decoding for Interactive LLM Inference on Consumer Devices | Ruslan Svirschevski et.al. | 2406.02532 | null |
2024-06-04 | Scalable MatMul-free Language Modeling | Rui-Jie Zhu et.al. | 2406.02528 | link |
2024-06-04 | CheckEmbed: Effective Verification of LLM Solutions to Open-Ended Tasks | Maciej Besta et.al. | 2406.02524 | null |
2024-06-04 | RoboCasa: Large-Scale Simulation of Everyday Tasks for Generalist Robots | Soroush Nasiriany et.al. | 2406.02523 | null |
2024-05-31 | Video-MME: The First-Ever Comprehensive Evaluation Benchmark of Multi-modal LLMs in Video Analysis | Chaoyou Fu et.al. | 2405.21075 | null |
2024-05-31 | Grammar-Aligned Decoding | Kanghee Park et.al. | 2405.21047 | null |
2024-05-31 | Direct Alignment of Language Models via Quality-Aware Self-Refinement | Runsheng Yu et.al. | 2405.21040 | null |
2024-05-31 | Standards for Belief Representations in LLMs | Daniel A. Herrmann et.al. | 2405.21030 | null |
2024-05-31 | LACIE: Listener-Aware Finetuning for Confidence Calibration in Large Language Models | Elias Stengel-Eskin et.al. | 2405.21028 | link |
2024-05-31 | You Only Scan Once: Efficient Multi-dimension Sequential Modeling with LightNet | Zhen Qin et.al. | 2405.21022 | null |
2024-05-31 | Improved Techniques for Optimization-Based Jailbreaking on Large Language Models | Xiaojun Jia et.al. | 2405.21018 | link |
2024-05-31 | DeCo: Decoupling Token Compression from Semantic Abstraction in Multimodal Large Language Models | Linli Yao et.al. | 2405.20985 | null |
2024-05-31 | Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training | Feiteng Fang et.al. | 2405.20978 | null |
2024-05-31 | SaySelf: Teaching LLMs to Express Confidence with Self-Reflective Rationales | Tianyang Xu et.al. | 2405.20974 | link |
2024-05-30 | MotionLLM: Understanding Human Behaviors from Human Motions and Videos | Ling-Hao Chen et.al. | 2405.20340 | null |
2024-05-30 | Visual Perception by Large Language Model's Weights | Feipeng Ma et.al. | 2405.20339 | null |
2024-05-30 | OccSora: 4D Occupancy Generation Models as World Simulators for Autonomous Driving | Lening Wang et.al. | 2405.20337 | link |
2024-05-30 | Xwin-LM: Strong and Scalable Alignment Practice for LLMs | Bolin Ni et.al. | 2405.20335 | link |
2024-05-31 | ParSEL: Parameterized Shape Editing with Language | Aditya Ganeshan et.al. | 2405.20319 | null |
2024-05-30 | CausalQuest: Collecting Natural Causal Questions for AI Agents | Roberto Ceraolo et.al. | 2405.20318 | link |
2024-05-30 | ANAH: Analytical Annotation of Hallucinations in Large Language Models | Ziwei Ji et.al. | 2405.20315 | link |
2024-05-30 | Sequence-Augmented SE(3)-Flow Matching For Conditional Protein Backbone Generation | Guillaume Huguet et.al. | 2405.20313 | null |
2024-05-30 | Large Language Models Can Self-Improve At Web Agent Tasks | Ajay Patel et.al. | 2405.20309 | null |
2024-05-30 | Group Robust Preference Optimization in Reward-free RLHF | Shyam Sundhar Ramesh et.al. | 2405.20304 | link |
2024-05-29 | X-VILA: Cross-Modality Alignment for Large Language Model | Hanrong Ye et.al. | 2405.19335 | null |
2024-05-29 | LLMs Meet Multimodal Generation and Editing: A Survey | Yingqing He et.al. | 2405.19334 | link |
2024-05-29 | Multi-Modal Generative Embedding Model | Feipeng Ma et.al. | 2405.19333 | null |
2024-05-29 | Self-Exploring Language Models: Active Preference Elicitation for Online Alignment | Shenao Zhang et.al. | 2405.19332 | link |
2024-05-29 | Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation | Atrisha Sarkar et.al. | 2405.19328 | null |
2024-05-30 | MAP-Neo: Highly Capable and Transparent Bilingual Large Language Model Series | Ge Zhang et.al. | 2405.19327 | null |
2024-05-29 | Reasoning3D -- Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models | Tianrun Chen et.al. | 2405.19326 | null |
2024-05-29 | Nearest Neighbor Speculative Decoding for LLM Generation and Attribution | Minghan Li et.al. | 2405.19325 | null |
2024-05-29 | Are Large Language Models Chameleons? | Mingmeng Geng et.al. | 2405.19323 | null |
2024-05-29 | Value-Incentivized Preference Optimization: A Unified Approach to Online and Offline RLHF | Shicong Cen et.al. | 2405.19320 | null |
2024-05-28 | DiG: Scalable and Efficient Diffusion Models with Gated Linear Attention | Lianghui Zhu et.al. | 2405.18428 | link |
2024-05-29 | ViG: Linear-complexity Visual Sequence Learning with Gated Linear Attention | Bencheng Liao et.al. | 2405.18425 | link |
2024-05-28 | Don't Forget to Connect! Improving RAG with Graph-based Reranking | Jialin Dong et.al. | 2405.18414 | null |
2024-05-29 | Instruct-MusicGen: Unlocking Text-to-Music Editing for Music Language Models via Instruction Tuning | Yixiao Zhang et.al. | 2405.18386 | link |
2024-05-28 | OwLore: Outlier-weighed Layerwise Sampled Low-Rank Projection for Memory-Efficient LLM Fine-tuning | Pengxiang Li et.al. | 2405.18380 | link |
2024-05-28 | LLaMA-NAS: Efficient Neural Architecture Search for Large Language Models | Anthony Sarah et.al. | 2405.18377 | null |
2024-05-28 | Empowering Source-Free Domain Adaptation with MLLM-driven Curriculum Learning | Dongjie Chen et.al. | 2405.18376 | link |
2024-05-28 | Thai Winograd Schemas: A Benchmark for Thai Commonsense Reasoning | Phakphum Artkaew et.al. | 2405.18375 | null |
2024-05-28 | PromptWizard: Task-Aware Agent-driven Prompt Optimization Framework | Eshaan Agarwal et.al. | 2405.18369 | null |
2024-05-28 | Is a 3D-Tokenized LLM the Key to Reliable Autonomous Driving? | Yifan Bai et.al. | 2405.18361 | null |
2024-05-27 | Matryoshka Multimodal Models | Mu Cai et.al. | 2405.17430 | null |
2024-05-27 | NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models | Chankyu Lee et.al. | 2405.17428 | null |
2024-05-27 | Reason3D: Searching and Reasoning 3D Segmentation via Large Language Model | Kuan-Chih Huang et.al. | 2405.17427 | link |
2024-05-27 | LARM: Large Auto-Regressive Model for Long-Horizon Embodied Intelligence | Zhuoling Li et.al. | 2405.17424 | null |
2024-05-27 | Self-Corrected Multimodal Large Language Model for End-to-End Robot Manipulation | Jiaming Liu et.al. | 2405.17418 | null |
2024-05-27 | THREAD: Thinking Deeper with Recursive Spawning | Philip Schroeder et.al. | 2405.17402 | null |
2024-05-27 | MindMerger: Efficient Boosting LLM Reasoning in non-English Languages | Zixian Huang et.al. | 2405.17386 | null |
2024-05-27 | Unlocking the Secrets of Linear Complexity Sequence Model from A Unified Perspective | Zhen Qin et.al. | 2405.17383 | null |
2024-05-27 | ReMoDetect: Reward Models Recognize Aligned LLM's Generations | Hyunseok Lee et.al. | 2405.17382 | null |
2024-05-27 | Various Lengths, Constant Speed: Efficient Language Modeling with Lightning Attention | Zhen Qin et.al. | 2405.17381 | link |
2024-05-24 | Scaling Laws for Discriminative Classification in Large Language Models | Dean Wyatte et.al. | 2405.15765 | null |
2024-05-24 | Large Language Models Reflect Human Citation Patterns with a Heightened Citation Bias | Andres Algaba et.al. | 2405.15739 | null |
2024-05-24 | LM4LV: A Frozen Large Language Model for Low-level Vision Tasks | Boyang Zheng et.al. | 2405.15734 | null |
2024-05-24 | Understanding the differences in Foundation Models: Attention, State Space Models, and Recurrent Neural Networks | Jerome Sieber et.al. | 2405.15731 | link |
2024-05-24 | Optimizing Large Language Models for OpenAPI Code Completion | Bohdan Petryshyn et.al. | 2405.15729 | null |
2024-05-24 | Prompt-Aware Adapter: Towards Learning Adaptive Visual Tokens for Multimodal Large Language Models | Yue Zhang et.al. | 2405.15684 | null |
2024-05-24 | What Do You See? Enhancing Zero-Shot Image Classification with Multimodal Large Language Models | Abdelrahman Abdelhamed et.al. | 2405.15668 | null |
2024-05-24 | Class Machine Unlearning for Complex Data via Concepts Inference and Data Poisoning | Wenhan Chang et.al. | 2405.15662 | null |
2024-05-24 | | Simen Gaure et.al. | 2405.15652 | null |
2024-05-24 | LLM-based Robot Task Planning with Exceptional Handling for General Purpose Service Robots | Ruoyu Wang et.al. | 2405.15646 | null |
2024-05-23 | A Nurse is Blue and Elephant is Rugby: Cross Domain Alignment in Large Language Models Reveal Human-like Patterns | Asaf Yehudai et.al. | 2405.14863 | null |
2024-05-23 | Bitune: Bidirectional Instruction-Tuning | Dawid J. Kopiczko et.al. | 2405.14862 | null |
2024-05-23 | PV-Tuning: Beyond Straight-Through Estimation for Extreme LLM Compression | Vladimir Malinovskii et.al. | 2405.14852 | null |
2024-05-23 | HippoRAG: Neurobiologically Inspired Long-Term Memory for Large Language Models | Bernal Jiménez Gutiérrez et.al. | 2405.14831 | null |
2024-05-23 | Can LLMs Solve longer Math Word Problems Better? | Xin Xu et.al. | 2405.14804 | null |
2024-05-23 | Lessons from the Trenches on Reproducible Evaluation of Language Models | Stella Biderman et.al. | 2405.14782 | null |
2024-05-23 | WISE: Rethinking the Knowledge Memory for Lifelong Model Editing of Large Language Models | Peng Wang et.al. | 2405.14768 | link |
2024-05-23 | FinRobot: An Open-Source AI Agent Platform for Financial Applications using Large Language Models | Hongyang Yang et.al. | 2405.14767 | link |
2024-05-23 | Evaluating Large Language Models for Public Health Classification and Extraction Tasks | Joshua Harris et.al. | 2405.14766 | null |
2024-05-23 | Large language models can be zero-shot anomaly detectors for time series? | Sarah Alnegheimish et.al. | 2405.14755 | null |
2024-05-21 | Reducing Transformer Key-Value Cache Size with Cross-Layer Attention | William Brandon et.al. | 2405.12981 | null |
2024-05-21 | Energy Rank Alignment: Using Preference Optimization to Search Chemical Space at Scale | Shriram Chennakesavalu et.al. | 2405.12961 | null |
2024-05-21 | Aggregation of Reasoning: A Hierarchical Framework for Enhancing Answer Selection in Large Language Models | Zhangyue Yin et.al. | 2405.12939 | null |
2024-05-21 | Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs | Bilgehan Sel et.al. | 2405.12933 | null |
2024-05-21 | Code-mixed Sentiment and Hate-speech Prediction | Anjali Yadav et.al. | 2405.12929 | null |
2024-05-21 | Streamlining Software Reviews: Efficient Predictive Modeling with Minimal Examples | Tim Menzies et.al. | 2405.12920 | null |
2024-05-21 | G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation | Xingyuan Pan et.al. | 2405.12915 | null |
2024-05-21 | An Empirical Study and Analysis of Text-to-Image Generation Using Large Language Model-Powered Textual Representation | Zhiyu Tan et.al. | 2405.12914 | null |
2024-05-21 | Topic Modelling Case Law Using a Large Language Model and a New Taxonomy for UK Law: AI Insights into Summary Judgment | Holli Sargeant et.al. | 2405.12910 | link |
2024-05-21 | Adversarial DPO: Harnessing Harmful Data for Reducing Toxicity with Minimal Impact on Coherence and Evasiveness in Dialogue Agents | San Kim et.al. | 2405.12900 | null |
2024-05-20 | Adapting Large Multimodal Models to Distribution Shifts: The Role of In-Context Learning | Guanglin Zhou et.al. | 2405.12217 | link |
2024-05-20 | MathBench: Evaluating the Theory and Application Proficiency of LLMs with a Hierarchical Mathematics Benchmark | Hongwei Liu et.al. | 2405.12209 | link |
2024-05-20 | Developers' Perceptions on the Impact of ChatGPT in Software Development: A Survey | Thiago S. Vaillant et.al. | 2405.12195 | null |
2024-05-20 | CT-Eval: Benchmarking Chinese Text-to-Table Performance in Large Language Models | Haoxiang Shi et.al. | 2405.12174 | null |
2024-05-20 | Fennec: Fine-grained Language Model Evaluation and Correction Extended through Branching and Bridging | Xiaobo Liang et.al. | 2405.12163 | link |
2024-05-20 | Eliciting Problem Specifications via Large Language Models | Robert E. Wray et.al. | 2405.12147 | null |
2024-05-20 | MoRA: High-Rank Updating for Parameter-Efficient Fine-Tuning | Ting Jiang et.al. | 2405.12130 | link |
2024-05-20 | Reindex-Then-Adapt: Improving Large Language Models for Conversational Recommendation | Zhankui He et.al. | 2405.12119 | null |
2024-05-20 | Imp: Highly Capable Large Multimodal Models for Mobile Devices | Zhenwei Shao et.al. | 2405.12107 | link |
2024-05-20 | DOP: Diagnostic-Oriented Prompting for Large Language Models in Mathematical Correction | Hao Chen et.al. | 2405.12100 | null |
2024-05-17 | A Survey on Large Language Models with Multilingualism: Recent Advances and New Frontiers | Kaiyu Huang et.al. | 2405.10936 | link |
2024-05-17 | The Local Interaction Basis: Identifying Computationally-Relevant and Sparsely Interacting Features in Neural Networks | Lucius Bushnaq et.al. | 2405.10928 | null |
2024-05-17 | COGNET-MD, an evaluation framework and dataset for Large Language Model benchmarks in the medical domain | Dimitrios P. Panagoulias et.al. | 2405.10893 | null |
2024-05-17 | Application of Artificial Intelligence in Schizophrenia Rehabilitation Management: Systematic Literature Review | Hongyi Yang et.al. | 2405.10883 | null |
2024-05-17 | The Future of Large Language Model Pre-training is Federated | Lorenzo Sani et.al. | 2405.10853 | null |
2024-05-17 | Large Language Model (LLM) for Telecommunications: A Comprehensive Survey on Principles, Key Techniques, and Opportunities | Hao Zhou et.al. | 2405.10825 | null |
2024-05-17 | ActiveLLM: Large Language Model-based Active Learning for Textual Few-Shot Scenarios | Markus Bayer et.al. | 2405.10808 | null |
2024-05-17 | Distinctive and Natural Speaker Anonymization via Singular Value Transformation-assisted Matrix | Jixun Yao et.al. | 2405.10786 | null |
2024-05-17 | Empowering Small-Scale Knowledge Graphs: A Strategy of Leveraging General-Purpose Knowledge Graphs for Enriched Embeddings | Albert Sawczyn et.al. | 2405.10745 | null |
2024-05-17 | Efficient Multimodal Large Language Models: A Survey | Yizhang Jin et.al. | 2405.10739 | link |
2024-05-16 | UniRAG: Universal Retrieval Augmentation for Multi-Modal Large Language Models | Sahel Sharifymoghaddam et.al. | 2405.10311 | null |
2024-05-16 | 4D Panoptic Scene Graph Generation | Jingkang Yang et.al. | 2405.10305 | link |
2024-05-16 | HW-GPT-Bench: Hardware-Aware Architecture Benchmark for Language Models | Rhea Sanjay Sukthanker et.al. | 2405.10299 | link |
2024-05-16 | Timeline-based Sentence Decomposition with In-Context Learning for Temporal Fact Extraction | Jianhao Chen et.al. | 2405.10288 | null |
2024-05-16 | Revisiting OPRO: The Limitations of Small-Scale LLMs as Optimizers | Tuo Zhang et.al. | 2405.10276 | null |
2024-05-16 | Keep It Private: Unsupervised Privatization of Online Text | Calvin Bao et.al. | 2405.10260 | link |
2024-05-16 | When LLMs step into the 3D World: A Survey and Meta-Analysis of 3D Tasks via Multi-modal Large Language Models | Xianzheng Ma et.al. | 2405.10255 | null |
2024-05-16 | A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks | Xuanfan Ni et.al. | 2405.10251 | null |
2024-05-16 | IntelliExplain: Enhancing Interactive Code Generation through Natural Language Explanations for Non-Professional Programmers | Hao Yan et.al. | 2405.10250 | null |
2024-05-16 | CPsyExam: A Chinese Benchmark for Evaluating Psychology using Examinations | Jiahao Zhao et.al. | 2405.10212 | null |
2024-05-15 | Modeling Bilingual Sentence Processing: Evaluating RNN and Transformer Architectures for Cross-Language Structural Priming | Bushi Xiao et.al. | 2405.09508 | null |
2024-05-15 | Beyond Flesch-Kincaid: Prompt-based Metrics Improve Difficulty Classification of Educational Texts | Donya Rooein et.al. | 2405.09482 | null |
2024-05-15 | Tell Me Why: Explainable Public Health Fact-Checking with Large Language Models | Majid Zarharan et.al. | 2405.09454 | link |
2024-05-15 | Facilitating Opinion Diversity through Hybrid NLP Approaches | Michiel van der Meer et.al. | 2405.09439 | null |
2024-05-15 | MicroPython Testbed for Federated Learning Algorithms | Miroslav Popovic et.al. | 2405.09423 | null |
2024-05-15 | Matching domain experts by training from scratch on domain knowledge | Xiaoliang Luo et.al. | 2405.09395 | null |
2024-05-15 | PolygloToxicityPrompts: Multilingual Evaluation of Neural Toxic Degeneration in Large Language Models | Devansh Jain et.al. | 2405.09373 | null |
2024-05-15 | Analysis of the Geometric Structure of Neural Networks and Neural ODEs via Morse Functions | Christian Kuehn et.al. | 2405.09351 | null |
2024-05-15 | Large Language Model Bias Mitigation from the Perspective of Knowledge Editing | Ruizhe Chen et.al. | 2405.09341 | null |
2024-05-15 | Prompting-based Synthetic Data Generation for Few-Shot Question Answering | Maximilian Schmidt et.al. | 2405.09335 | null |
2024-05-14 | Towards Enhanced RAC Accessibility: Leveraging Datasets and LLMs | Edison Jair Bejarano Sepulveda et.al. | 2405.08792 | null |
2024-05-14 | Incorporating Clinical Guidelines through Adapting Multi-modal Large Language Model for Prostate Cancer PI-RADS Scoring | Tiantian Zhang et.al. | 2405.08786 | null |
2024-05-14 | Is the Pope Catholic? Yes, the Pope is Catholic. Generative Evaluation of Intent Resolution in LLMs | Akhila Yerukola et.al. | 2405.08760 | link |
2024-05-14 | Distributed Threat Intelligence at the Edge Devices: A Large Language Model-Driven Approach | Syed Mhamudul Hasan et.al. | 2405.08755 | null |
2024-05-14 | Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding | Zhimin Li et.al. | 2405.08748 | link |
2024-05-15 | ALMol: Aligned Language-Molecule Translation LLMs through Offline Preference Contrastive Optimisation | Dimitris Gkoumas et.al. | 2405.08619 | null |
2024-05-14 | A Comprehensive Survey of Large Language Models and Multimodal Large Language Models in Medicine | Hanguang Xiao et.al. | 2405.08603 | null |
2024-05-15 | EVDA: Evolving Deepfake Audio Detection Continual Learning Benchmark | Xiaohui Zhang et.al. | 2405.08596 | null |
2024-05-14 | Falcon 7b for Software Mention Detection in Scholarly Documents | AmeerAli Khan et.al. | 2405.08514 | null |
2024-05-14 | Archimedes-AUEB at SemEval-2024 Task 5: LLM explains Civil Procedure | Odysseas S. Chlapanis et.al. | 2405.08502 | null |
2024-05-14 | MambaOut: Do We Really Need Mamba for Vision? | Weihao Yu et.al. | 2405.07992 | link |
2024-05-13 | Plot2Code: A Comprehensive Benchmark for Evaluating Multi-modal Large Language Models in Code Generation from Scientific Plots | Chengyue Wu et.al. | 2405.07990 | null |
2024-05-13 | A Generalist Learner for Multifaceted Medical Image Interpretation | Hong-Yu Zhou et.al. | 2405.07988 | null |
2024-05-13 | OverlapMamba: Novel Shift State Space Model for LiDAR-based Place Recognition | Qiuchi Xiang et.al. | 2405.07966 | link |
2024-05-13 | PyZoBot: A Platform for Conversational Information Extraction and Synthesis from Curated Zotero Reference Libraries through Advanced Retrieval-Augmented Generation | Suad Alshammari et.al. | 2405.07963 | null |
2024-05-13 | AgentClinic: a multimodal agent benchmark to evaluate AI in simulated clinical environments | Samuel Schmidgall et.al. | 2405.07960 | null |
2024-05-13 | EconLogicQA: A Question-Answering Benchmark for Evaluating Large Language Models in Economic Sequential Reasoning | Yinzhu Quan et.al. | 2405.07938 | null |
2024-05-14 | PARDEN, Can You Repeat That? Defending against Jailbreaks via Repetition | Ziyang Zhang et.al. | 2405.07932 | link |
2024-05-13 | Can Better Text Semantics in Prompt Tuning Improve VLM Generalization? | Hari Chandana Kuchibhotla et.al. | 2405.07921 | null |
2024-05-13 | A Systematic Investigation of Distilling Large Language Models into Cross-Encoders for Passage Re-ranking | Ferdinand Schlatt et.al. | 2405.07920 | null |
2024-05-10 | Linearizing Large Language Models | Jean Mercat et.al. | 2405.06640 | link |
2024-05-10 | Value Augmented Sampling for Language Model Alignment and Personalization | Seungwook Han et.al. | 2405.06639 | link |
2024-05-10 | Characterizing the Accuracy - Efficiency Trade-off of Low-rank Decomposition in Language Models | Chakshu Moar et.al. | 2405.06626 | null |
2024-05-10 | Non-Uniform Spatial Alignment Errors in sUAS Imagery From Wide-Area Disasters | Thomas Manzini et.al. | 2405.06593 | null |
2024-05-10 | What Can Natural Language Processing Do for Peer Review? | Ilia Kuznetsov et.al. | 2405.06563 | null |
2024-05-10 | Mitigating Hallucinations in Large Language Models via Self-Refinement-Enhanced Knowledge Retrieval | Mengjia Niu et.al. | 2405.06545 | null |
2024-05-10 | Prompting Large Language Models with Knowledge Graphs for Question Answering Involving Long-tail Facts | Wenyu Huang et.al. | 2405.06524 | null |
2024-05-10 | UniDM: A Unified Framework for Data Manipulation with Large Language Models | Yichen Qian et.al. | 2405.06510 | null |
2024-05-10 | Storypark: Leveraging Large Language Models to Enhance Children Story Learning Through Child-AI collaboration Storytelling | Lyumanshan Ye et.al. | 2405.06495 | null |
2024-05-10 | Can Large Language Models Replicate ITS Feedback on Open-Ended Math Questions? | Hunter McNichols et.al. | 2405.06414 | null |
2024-05-09 | Natural Language Processing RELIES on Linguistics | Juri Opitz et.al. | 2405.05966 | null |
2024-05-09 | OpenBA-V2: Reaching 77.3% High Compression Ratio with Fast Multi-Stage Pruning | Dan Qiao et.al. | 2405.05957 | link |
2024-05-09 | Probing Multimodal LLMs as World Models for Driving | Shiva Sreeram et.al. | 2405.05956 | link |
2024-05-09 | Smurfs: Leveraging Multiple Proficiency Agents with Context-Efficiency for Tool Planning | Junzhi Chen et.al. | 2405.05955 | null |
2024-05-09 | CuMo: Scaling Multimodal LLM with Co-Upcycled Mixture-of-Experts | Jiachen Li et.al. | 2405.05949 | link |
2024-05-09 | Trustworthy AI-Generative Content in Intelligent 6G Network: Adversarial, Privacy, and Fairness | Siyuan Li et.al. | 2405.05930 | null |
2024-05-09 | Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations? | Zorik Gekhman et.al. | 2405.05904 | null |
2024-05-09 | Co-driver: VLM-based Autonomous Driving Assistant with Human-like Behavior and Understanding for Complex Road Scenes | Ziang Guo et.al. | 2405.05885 | null |
2024-05-09 | FlockGPT: Guiding UAV Flocking with Linguistic Orchestration | Artem Lykov et.al. | 2405.05872 | null |
2024-05-09 | Robots Can Feel: LLM-based Framework for Robot Ethical Reasoning | Artem Lykov et.al. | 2405.05824 | link |
2024-05-09 | You Only Cache Once: Decoder-Decoder Architectures for Language Models | Yutao Sun et.al. | 2405.05254 | null |
2024-05-08 | Open Source Language Models Can Provide Feedback: Evaluating LLMs' Ability to Help Students Using GPT-4-As-A-Judge | Charles Koutcheme et.al. | 2405.05253 | link |
2024-05-09 | LLMs with Personalities in Multi-issue Negotiation Games | Sean Noh et.al. | 2405.05248 | null |
2024-05-08 | SuFIA: Language-Guided Augmented Dexterity for Robotic Surgical Assistants | Masoud Moghani et.al. | 2405.05226 | null |
2024-05-08 | Conv-Basis: A New Paradigm for Efficient Attention Inference and Gradient Computation in Transformers | Jiuxiang Gu et.al. | 2405.05219 | null |
2024-05-08 | MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning | Inderjeet Nair et.al. | 2405.05189 | null |
2024-05-08 | Air Gap: Protecting Privacy-Conscious Conversational Agents | Eugene Bagdasaryan et.al. | 2405.05175 | null |
2024-05-08 | XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples | Peiqin Lin et.al. | 2405.05116 | null |
2024-05-08 | QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs | Weijia Zhang et.al. | 2405.05109 | null |
2024-05-08 | Concerns on Bias in Large Language Models when Creating Synthetic Personae | Helena A. Haxvig et.al. | 2405.05080 | null |
2024-05-07 | ChatHuman: Language-driven 3D Human Understanding with Retrieval-Augmented Tool Reasoning | Jing Lin et.al. | 2405.04533 | null |
2024-05-07 | QServe: W4A8KV4 Quantization and System Co-design for Efficient LLM Serving | Yujun Lin et.al. | 2405.04532 | link |
2024-05-07 | NaturalCodeBench: Examining Coding Performance Mismatch on HumanEval and Natural User Prompts | Shudan Zhang et.al. | 2405.04520 | null |
2024-05-07 | xLSTM: Extended Long Short-Term Memory | Maximilian Beck et.al. | 2405.04517 | null |
2024-05-07 | A Transformer with Stack Attention | Jiaoda Li et.al. | 2405.04515 | link |
2024-05-08 | Unveiling Disparities in Web Task Handling Between Human and Web Agent | Kihoon Son et.al. | 2405.04497 | null |
2024-05-07 | Toward In-Context Teaching: Adapting Examples to Students' Misconceptions | Alexis Ross et.al. | 2405.04495 | null |
2024-05-07 | The Silicone Ceiling: Auditing GPT's Race and Gender Biases in Hiring | Lena Armstrong et.al. | 2405.04412 | null |
2024-05-07 | Vision Mamba: A Comprehensive Survey and Taxonomy | Xiao Liu et.al. | 2405.04404 | link |
2024-05-07 | Learning To See But Forgetting To Follow: Visual Instruction Tuning Makes LLMs More Prone To Jailbreak Attacks | Georgios Pantazopoulos et.al. | 2405.04403 | link |
2024-05-06 | Complex Video Reasoning and Robustness Evaluation Suite for Video-LMMs | Muhammad Uzair Khattak et.al. | 2405.03690 | null |
2024-05-06 | Large Language Models Reveal Information Operation Goals, Tactics, and Narrative Frames | Keith Burghardt et.al. | 2405.03688 | null |
2024-05-06 | Language-Image Models with 3D Understanding | Jang Hyun Cho et.al. | 2405.03685 | null |
2024-05-06 | AtomGPT: Atomistic Generative Pre-trained Transformer for Forward and Inverse Materials Design | Kamal Choudhary et.al. | 2405.03680 | null |
2024-05-06 | When LLMs Meet Cybersecurity: A Systematic Literature Review | Jie Zhang et.al. | 2405.03644 | null |
2024-05-06 | A Controlled Experiment on the Energy Efficiency of the Source Code Generated by Code Llama | Vlad-Andrei Cursaru et.al. | 2405.03616 | null |
2024-05-06 | Enabling High-Sparsity Foundational Llama Models with Efficient Pretraining and Deployment | Abhinav Agarwalla et.al. | 2405.03594 | null |
2024-05-06 | AlphaMath Almost Zero: process Supervision without process | Guoxin Chen et.al. | 2405.03553 | null |
2024-05-06 | MAmmoTH2: Scaling Instructions from the Web | Xiang Yue et.al. | 2405.03548 | null |
2024-05-06 | Position Paper: Leveraging Foundational Models for Black-Box Optimization: Benefits, Challenges, and Future Directions | Xingyou Song et.al. | 2405.03547 | null |
2024-05-03 | Leveraging Large Language Models to Enhance Domain Expert Inclusion in Data Science Workflows | Jasmine Y. Shih et.al. | 2405.02260 | null |
2024-05-03 | What matters when building vision-language models? | Hugo Laurençon et.al. | 2405.02246 | null |
2024-05-03 | REASONS: A benchmark for REtrieval and Automated citationS Of scieNtific Sentences using Public and Proprietary LLMs | Deepa Tilwani et.al. | 2405.02228 | null |
2024-05-03 | FairEvalLLM. A Comprehensive Framework for Benchmarking Fairness in Large Language Model Recommender Systems | Yashar Deldjoo et.al. | 2405.02219 | null |
2024-05-03 | Automatic Programming: Large Language Models and Beyond | Michael R. Lyu et.al. | 2405.02213 | null |
2024-05-03 | Assessing and Verifying Task Utility in LLM-Powered Applications | Negar Arabzadeh et.al. | 2405.02178 | null |
2024-05-03 | The AI Review Lottery: Widespread AI-Assisted Peer Reviews Boost Paper Scores and Acceptance Rates | Giuseppe Russo Latona et.al. | 2405.02150 | null |
2024-05-03 | MedReadMe: A Systematic Study for Fine-grained Sentence Readability in Medical Domain | Chao Jiang et.al. | 2405.02144 | null |
2024-05-03 | Optimising Calls to Large Language Models with Uncertainty-Based Two-Tier Selection | Guillem Ramírez et.al. | 2405.02134 | null |
2024-05-06 | Unveiling the Potential of LLM-Based ASR on Chinese Open-Source Datasets | Xuelong Geng et.al. | 2405.02132 | null |
2024-05-02 | Plan-Seq-Learn: Language Model Guided RL for Solving Long Horizon Robotics Tasks | Murtaza Dalal et.al. | 2405.01534 | null |
2024-05-02 | OmniDrive: A Holistic LLM-Agent Framework for Autonomous Driving with 3D Perception, Reasoning and Planning | Shihao Wang et.al. | 2405.01533 | null |
2024-05-02 | FLAME: Factuality-Aware Alignment for Large Language Models | Sheng-Chieh Lin et.al. | 2405.01525 | null |
2024-05-02 | Transformer-Aided Semantic Communications | Matin Mortaheb et.al. | 2405.01521 | null |
2024-05-02 | Analyzing the Role of Semantic Representations in the Era of Large Language Models | Zhijing Jin et.al. | 2405.01502 | link |
2024-05-02 | Supporting Business Document Workflows via Collection-Centric Information Foraging with Large Language Models | Raymond Fok et.al. | 2405.01501 | null |
2024-05-02 | Controllable Text Generation in the Instruction-Tuning Era | Dhananjay Ashok et.al. | 2405.01490 | null |
2024-05-02 | NeMo-Aligner: Scalable Toolkit for Efficient Model Alignment | Gerald Shen et.al. | 2405.01481 | link |
2024-05-02 | A Systematic Literature Review on Large Language Models for Automated Program Repair | Quanjun Zhang et.al. | 2405.01466 | null |
2024-05-02 | Natural Language to Verilog: Design of a Recurrent Spiking Neural Network using Large Language Models and ChatGPT | Paola Vitolo et.al. | 2405.01419 | null |
2024-05-01 | Is Bigger Edit Batch Size Always Better? -- An Empirical Study on Model Editing with Llama-3 | Junsang Yoon et.al. | 2405.00664 | null |
2024-05-01 | HalluVault: A Novel Logic Programming-aided Metamorphic Testing Framework for Detecting Fact-Conflicting Hallucinations in Large Language Models | Ningke Li et.al. | 2405.00648 | null |
2024-05-01 | When Quantization Affects Confidence of Large Language Models? | Irina Proskurina et.al. | 2405.00632 | null |
2024-05-01 | "I'm Not Sure, But...": Examining the Impact of Large Language Models' Uncertainty Expression on User Reliance and Trust | Sunnie S. Y. Kim et.al. | 2405.00623 | null |
2024-05-01 | Addressing Topic Granularity and Hallucination in Large Language Models for Topic Modelling | Yida Mu et.al. | 2405.00611 | null |
2024-05-01 | Investigating Automatic Scoring and Feedback using Large Language Models | Gloria Ashiya Katuka et.al. | 2405.00602 | null |
2024-05-01 | Are Models Biased on Text without Gender-related Language? | Catarina G Belém et.al. | 2405.00588 | link |
2024-05-01 | The Real, the Better: Aligning Large Language Models with Online Human Behaviors | Guanying Jiang et.al. | 2405.00578 | null |
2024-05-01 | EALD-MLLM: Emotion Analysis in Long-sequential and De-identity videos with Multi-modal Large Language Model | Deng Li et.al. | 2405.00574 | null |
2024-05-01 | NumLLM: Numeric-Sensitive Large Language Model for Chinese Finance | Huan-Yi Su et.al. | 2405.00566 | null |
2024-04-30 | Visual Fact Checker: Enabling High-Fidelity Detailed Caption Generation | Yunhao Ge et.al. | 2404.19752 | null |
2024-04-30 | PrivComp-KG: Leveraging Knowledge Graph and Large Language Models for Privacy Policy Compliance Verification | Leon Garza et.al. | 2404.19744 | null |
2024-04-30 | Better & Faster Large Language Models via Multi-token Prediction | Fabian Gloeckle et.al. | 2404.19737 | null |
2024-04-30 | A Framework for Leveraging Human Computation Gaming to Enhance Knowledge Graphs for Accuracy Critical Generative AI Applications | Steph Buongiorno et.al. | 2404.19729 | null |
2024-04-30 | PANGeA: Procedural Artificial Narrative using Generative AI for Turn-Based Video Games | Steph Buongiorno et.al. | 2404.19721 | null |
2024-04-30 | Assessing LLMs in Malicious Code Deobfuscation of Real-world Malware Campaigns | Constantinos Patsakis et.al. | 2404.19715 | null |
2024-04-30 | Automated Generation of High-Quality Medical Simulation Scenarios Through Integration of Semi-Structured Data and Large Language Models | Scott Sumpter et.al. | 2404.19713 | null |
2024-04-30 | When to Retrieve: Teaching LLMs to Utilize Information Retrieval Effectively | Tiziano Labruna et.al. | 2404.19705 | null |
2024-04-30 | Naturally Supervised 3D Visual Grounding with Language-Regularized Concept Learners | Chun Feng et.al. | 2404.19696 | null |
2024-04-30 | On Training a Neural Network to Explain Binaries | Alexander Interrante-Grant et.al. | 2404.19631 | null |
2024-04-29 | Hallucination of Multimodal Large Language Models: A Survey | Zechen Bai et.al. | 2404.18930 | link |
2024-04-29 | DPO Meets PPO: Reinforced Token Optimization for RLHF | Han Zhong et.al. | 2404.18922 | null |
2024-04-29 | TheaterGen: Character Management with LLM for Consistent Multi-turn Image Generation | Junhao Cheng et.al. | 2404.18919 | null |
2024-04-29 | Kangaroo: Lossless Self-Speculative Decoding via Double Early Exiting | Fangcheng Liu et.al. | 2404.18911 | null |
2024-04-29 | Human-in-the-Loop Synthetic Text Data Inspection with Provenance Tracking | Hong Jin Kang et.al. | 2404.18881 | link |
2024-04-29 | More RLHF, More Trust? On The Impact of Human Preference Alignment On Language Model Trustworthiness | Aaron J. Li et.al. | 2404.18870 | link |
2024-04-29 | Truth-value judgment in language models: belief directions are context sensitive | Stefan F. Schouten et.al. | 2404.18865 | null |
2024-04-29 | Performance-Aligned LLMs for Generating Fast Code | Daniel Nichols et.al. | 2404.18864 | null |
2024-04-29 | A Survey on Vision Mamba: Models, Applications and Challenges | Rui Xu et.al. | 2404.18861 | link |
2024-04-29 | VERT: Verified Equivalent Rust Transpilation with Few-Shot Learning | Aidan Z. H. Yang et.al. | 2404.18852 | null |
2024-04-26 | Probabilistic Inference in Language Models via Twisted Sequential Monte Carlo | Stephen Zhao et.al. | 2404.17546 | null |
2024-04-26 | Large Language Model Agent as a Mechanical Designer | Yayati Jadhav et.al. | 2404.17525 | null |
2024-04-29 | On the Use of Large Language Models to Generate Capability Ontologies | Luis Miguel Vieira da Silva et.al. | 2404.17524 | null |
2024-04-26 | Enhancing Legal Compliance and Regulation Analysis with Large Language Models | Shabnam Hassani et.al. | 2404.17522 | null |
2024-04-26 | A Comprehensive Evaluation on Event Reasoning of Large Language Models | Zhengwei Tao et.al. | 2404.17513 | link |
2024-04-26 | Ruffle&Riley: Insights from Designing and Evaluating a Large Language Model-Based Conversational Tutoring System | Robin Schmucker et.al. | 2404.17460 | null |
2024-04-26 | "ChatGPT Is Here to Help, Not to Replace Anybody" -- An Evaluation of Students' Opinions On Integrating ChatGPT In CS Courses | Bruno Pereira Cipriano et.al. | 2404.17443 | null |
2024-04-26 | InspectorRAGet: An Introspection Platform for RAG Evaluation | Kshitij Fadnis et.al. | 2404.17347 | null |
2024-04-26 | When to Trust LLMs: Aligning Confidence with Response Quality | Shuchang Tao et.al. | 2404.17287 | null |
2024-04-26 | Reinforcement Retrieval Leveraging Fine-grained Feedback for Fact Checking News Claims with Black-Box LLM | Xuan Zhang et.al. | 2404.17283 | link |
2024-04-25 | Make-it-Real: Unleashing Large Multimodal Model's Ability for Painting 3D Objects with Realistic Materials | Ye Fang et.al. | 2404.16829 | null |
2024-04-25 | How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites | Zhe Chen et.al. | 2404.16821 | link |
2024-04-25 | IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages | Harman Singh et.al. | 2404.16816 | null |
2024-04-26 | Make Your LLM Fully Utilize the Context | Shengnan An et.al. | 2404.16811 | link |
2024-04-25 | Improving Diversity of Commonsense Generation by Large Language Models via In-Context Learning | Tianhui Zhang et.al. | 2404.16807 | null |
2024-04-25 | Weak-to-Strong Extrapolation Expedites Alignment | Chujie Zheng et.al. | 2404.16792 | link |
2024-04-25 | SEED-Bench-2-Plus: Benchmarking Multimodal Large Language Models with Text-Rich Visual Comprehension | Bohao Li et.al. | 2404.16790 | link |
2024-04-25 | Continual Learning of Large Language Models: A Comprehensive Survey | Haizhou Shi et.al. | 2404.16789 | link |
2024-04-25 | Prefix Text as a Yarn: Eliciting Non-English Alignment in Foundation Language Model | Runzhe Zhan et.al. | 2404.16766 | null |
2024-04-25 | RadGenome-Chest CT: A Grounded Vision-Language Dataset for Chest CT Analysis | Xiaoman Zhang et.al. | 2404.16754 | null |
2024-04-24 | Hybrid LLM/Rule-based Approaches to Business Insights Generation from Structured Data | Aliaksei Vertsel et.al. | 2404.15604 | null |
2024-04-24 | ImplicitAVE: An Open-Source Dataset and Multimodal LLMs Benchmark for Implicit Attribute Value Extraction | Henry Peng Zou et.al. | 2404.15592 | link |
2024-04-24 | Can Foundational Large Language Models Assist with Conducting Pharmaceuticals Manufacturing Investigations? | Hossein Salami et.al. | 2404.15578 | null |
2024-04-23 | PRISM: Patient Records Interpretation for Semantic Clinical Trial Matching using Large Language Models | Shashi Kant Gupta et.al. | 2404.15549 | null |
2024-04-23 | Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models | Mihir Parmar et.al. | 2404.15522 | link |
2024-04-23 | Visual Delta Generator with Large Multi-modal Models for Semi-supervised Composed Image Retrieval | Young Kyun Jang et.al. | 2404.15516 | null |
2024-04-23 | ToM-LM: Delegating Theory Of Mind Reasoning to External Symbolic Executors in Large Language Models | Weizhi Tang et.al. | 2404.15515 | null |
2024-04-23 | IryoNLP at MEDIQA-CORR 2024: Tackling the Medical Error Detection & Correction Task On the Shoulders of Medical Agents | Jean-Philippe Corbeil et.al. | 2404.15488 | link |
2024-04-23 | Large Language Models Spot Phishing Emails with Surprising Accuracy: A Comparative Analysis of Performance | Het Patel et.al. | 2404.15485 | null |
2024-04-23 | Can Large Language Models Learn the Physics of Metamaterials? An Empirical Study with ChatGPT | Darui Lu et.al. | 2404.15458 | null |
2024-04-23 | Aligning LLM Agents by Learning Latent Preference from User Edits | Ge Gao et.al. | 2404.15269 | null |
2024-04-23 | XFT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts | Yifeng Ding et.al. | 2404.15247 | link |
2024-04-23 | Revisiting Unnaturalness for Automated Program Repair in the Era of Large Language Models | Aidan Z. H. Yang et.al. | 2404.15236 | null |
2024-04-23 | Re-Thinking Inverse Graphics With Large Language Models | Peter Kulits et.al. | 2404.15228 | null |
2024-04-23 | Setting up the Data Printer with Improved English to Ukrainian Machine Translation | Yurii Paniv et.al. | 2404.15196 | null |
2024-04-23 | Regressive Side Effects of Training Language Models to Mimic Student Misconceptions | Shashank Sonkar et.al. | 2404.15156 | null |
2024-04-23 | Bias patterns in the application of LLMs for clinical decision support: A comprehensive study | Raphael Poulain et.al. | 2404.15149 | null |
2024-04-23 | Rethinking LLM Memorization through the Lens of Adversarial Compression | Avi Schwarzschild et.al. | 2404.15146 | null |
2024-04-23 | Multimodal Large Language Model is a Human-Aligned Annotator for Text-to-Image Generation | Xun Wu et.al. | 2404.15100 | null |
2024-04-23 | A Short Review for Ontology Learning from Text: Stride from Shallow Learning, Deep Learning to Large Language Models Trend | Rick Du et.al. | 2404.14991 | null |
2024-04-22 | AutoAD III: The Prequel -- Back to the Pixels | Tengda Han et.al. | 2404.14412 | null |
2024-04-22 | SpaceByte: Towards Deleting Tokenization from Large Language Modeling | Kevin Slagle et.al. | 2404.14408 | link |
2024-04-22 | RTP-LX: Can LLMs Evaluate Toxicity in Multilingual Scenarios? | Adrian de Wynter et.al. | 2404.14397 | null |
2024-04-22 | A Survey on Self-Evolution of Large Language Models | Zhengwei Tao et.al. | 2404.14387 | null |
2024-04-22 | Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph | Xiaochen Kev Gao et.al. | 2404.14372 | link |
2024-04-23 | Preference Fine-Tuning of LLMs Should Leverage Suboptimal, On-Policy Data | Fahim Tajwar et.al. | 2404.14367 | link |
2024-04-22 | Better Synthetic Data by Retrieving and Transforming Existing Datasets | Saumya Gandhi et.al. | 2404.14361 | link |
2024-04-22 | Rethinking Legal Compliance Automation: Opportunities with Large Language Models | Shabnam Hassani et.al. | 2404.14356 | null |
2024-04-22 | Automated Long Answer Grading with RiceChem Dataset | Shashank Sonkar et.al. | 2404.14316 | null |
2024-04-22 | Explaining Arguments' Strength: Unveiling the Role of Attacks and Supports (Technical Report) | Xiang Yin et.al. | 2404.14304 | null |
2024-04-19 | MoVA: Adapting Mixture of Vision Experts to Multimodal Context | Zhuofan Zong et.al. | 2404.13046 | link |
2024-04-19 | Unified Scene Representation and Reconstruction for 3D Large Language Models | Tao Chu et.al. | 2404.13044 | null |
2024-04-19 | Data Alignment for Zero-Shot Concept Generation in Dermatology AI | Soham Gadgil et.al. | 2404.13043 | null |
2024-04-19 | Sample Design Engineering: An Empirical Study of What Makes Good Downstream Fine-Tuning Samples for LLMs | Biyang Guo et.al. | 2404.13033 | link |
2024-04-19 | When Life gives you LLMs, make LLM-ADE: Large Language Models with Adaptive Data Engineering | Stephen Choi et.al. | 2404.13028 | null |
2024-04-19 | Groma: Localized Visual Tokenization for Grounding Multimodal Large Language Models | Chuofan Ma et.al. | 2404.13013 | null |
2024-04-19 | Rethinking the Evaluation of Dialogue Systems: Effects of User Feedback on Crowdworkers and LLMs | Clemencia Siro et.al. | 2404.12994 | link |
2024-04-19 | FineRec: Exploring Fine-grained Sequential Recommendation | Xiaokun Zhang et.al. | 2404.12975 | null |
2024-04-19 | Eyes Can Deceive: Benchmarking Counterfactual Reasoning Abilities of Multi-modal Large Language Models | Yian Li et.al. | 2404.12966 | null |
2024-04-19 | Towards Reliable Latent Knowledge Estimation in LLMs: In-Context Learning vs. Prompting Based Factual Knowledge Extraction | Qinyuan Wu et.al. | 2404.12957 | null |
2024-04-18 | BLINK: Multimodal Large Language Models Can See but Not Perceive | Xingyu Fu et.al. | 2404.12390 | null |
2024-04-18 | MedThink: Explaining Medical Visual Question Answering via Multimodal Decision-Making Rationale | Xiaotang Gai et.al. | 2404.12372 | null |
2024-04-18 | When LLMs are Unfit Use FastFit: Fast and Effective Text Classification with Many Classes | Asaf Yehudai et.al. | 2404.12365 | null |
2024-04-19 | Towards a Foundation Model for Partial Differential Equations: Multi-Operator Learning and Extrapolation | Jingmin Sun et.al. | 2404.12355 | link |
2024-04-18 | V2Xum-LLM: Cross-Modal Video Summarization with Temporal Prompt Instruction Tuning | Hang Hua et.al. | 2404.12353 | null |
2024-04-18 | Large Language Models in Targeted Sentiment Analysis | Nicolay Rusnachenko et.al. | 2404.12342 | link |
2024-04-18 | Normative Requirements Operationalization with Large Language Models | Nick Feng et.al. | 2404.12335 | null |
2024-04-18 | Large Language Models for Synthetic Participatory Planning of Shared Automated Electric Mobility Systems | Jiangbo Yu et.al. | 2404.12317 | null |
2024-04-18 | Simultaneous Interpretation Corpus Construction by Large Language Models in Distant Language Pair | Yusuke Sakai et.al. | 2404.12299 | null |
2024-04-18 | Augmenting emotion features in irony detection with Large language modeling | Yucheng Lin et.al. | 2404.12291 | null |
2024-04-17 | A Deep Dive into Large Language Models for Automated Bug Localization and Repair | Soneya Binta Hossain et.al. | 2404.11595 | null |
2024-04-17 | LLMTune: Accelerate Database Knob Tuning with Large Language Models | Xinmei Huang et.al. | 2404.11581 | null |
2024-04-17 | MoA: Mixture-of-Attention for Subject-Context Disentanglement in Personalized Image Generation | Kuan-Chieh et.al. | 2404.11565 | null |
2024-04-17 | Quantifying Multilingual Performance of Large Language Models Across Languages | Zihao Li et.al. | 2404.11553 | null |
2024-04-17 | Pack of LLMs: Model Fusion at Test-Time via Perplexity Optimization | Costas Mavromatis et.al. | 2404.11531 | null |
2024-04-17 | Embedding Privacy in Computational Social Science and Artificial Intelligence Research | Keenan Jones et.al. | 2404.11515 | null |
2024-04-17 | Towards Coarse-to-Fine Evaluation of Inference Efficiency for Large Language Models | Yushuo Chen et.al. | 2404.11502 | link |
2024-04-17 | Paraphrase and Solve: Exploring and Exploiting the Impact of Surface Form on Mathematical Reasoning in Large Language Models | Yue Zhou et.al. | 2404.11500 | link |
2024-04-18 | Octopus v3: Technical Report for On-device Sub-billion Multimodal AI Agent | Wei Chen et.al. | 2404.11459 | null |
2024-04-17 | Unifying Bias and Unfairness in Information Retrieval: A Survey of Challenges and Opportunities with Large Language Models | Sunhao Dai et.al. | 2404.11457 | link |
2024-04-16 | Nearly Optimal Algorithms for Contextual Dueling Bandits from Adversarial Feedback | Qiwei Di et.al. | 2404.10776 | null |
2024-04-16 | Deep Learning and LLM-based Methods Applied to Stellar Lightcurve Classification | Yu-Yang Li et.al. | 2404.10757 | null |
2024-04-16 | Is DPO Superior to PPO for LLM Alignment? A Comprehensive Study | Shusheng Xu et.al. | 2404.10719 | null |
2024-04-16 | An empirical study on code review activity prediction in practice | Doriane Olewicki et.al. | 2404.10703 | null |
2024-04-16 | Automating REST API Postman Test Cases Using LLM | S Deepika Sri et.al. | 2404.10678 | null |
2024-04-16 | Self-playing Adversarial Language Game Enhances LLM Reasoning | Pengyu Cheng et.al. | 2404.10642 | link |
2024-04-16 | HLAT: High-quality Large Language Model Pre-trained on AWS Trainium | Haozheng Fan et.al. | 2404.10630 | null |
2024-04-16 | Private Attribute Inference from Images with Vision-Language Models | Batuhan Tömekçe et.al. | 2404.10618 | null |
2024-04-16 | Automated Evaluation of Large Vision-Language Models on Self-driving Corner Cases | Yanze Li et.al. | 2404.10595 | null |
2024-04-16 | Construction of Domain-specified Japanese Large Language Model for Finance through Continual Pre-training | Masanori Hirano et.al. | 2404.10555 | null |
2024-04-15 | KG-CTG: Citation Generation through Knowledge Graph-guided Large Language Models | Avinash Anand et.al. | 2404.09763 | null |
2024-04-15 | Resilience of Large Language Models for Noisy Instructions | Bin Wang et.al. | 2404.09754 | null |
2024-04-15 | Personalized Collaborative Fine-Tuning for On-Device Large Language Models | Nicolas Wagner et.al. | 2404.09753 | null |
2024-04-15 | Quantization of Large Language Models with an Overdetermined Basis | Daniil Merkulov et.al. | 2404.09737 | null |
2024-04-15 | Unveiling Imitation Learning: Exploring the Impact of Data Falsity to Large Language Model | Hyunsoo Cho et.al. | 2404.09717 | null |
2024-04-15 | Enhancing Robot Explanation Capabilities through Vision-Language Models: a Preliminary Study by Interpreting Visual Inputs for Improved Human-Robot Interaction | David Sobrín-Hidalgo et.al. | 2404.09705 | null |
2024-04-15 | Generative AI for Game Theory-based Mobile Networking | Long He et.al. | 2404.09699 | null |
2024-04-15 | Are Large Language Models Reliable Argument Quality Annotators? | Nailia Mirzakhmedova et.al. | 2404.09696 | null |
2024-04-15 | LoRAP: Transformer Sub-Layers Deserve Differentiated Structured Compression for Large Language Models | Guangyan Li et.al. | 2404.09695 | null |
2024-04-15 | Multi-News+: Cost-efficient Dataset Cleansing via LLM-based Data Annotation | Juhwan Choi et.al. | 2404.09682 | null |
2024-04-15 | Do LLMs Understand Visual Anomalies? Uncovering LLM Capabilities in Zero-shot Anomaly Detection | Jiaqi Zhu et.al. | 2404.09654 | null |
2024-04-15 | Bridging Vision and Language Spaces with Assignment Prediction | Jungin Park et.al. | 2404.09632 | link |
2024-04-12 | Enhancing Visual Question Answering through Question-Driven Image Captions as Prompts | Övgü Özdemir et.al. | 2404.08589 | link |
2024-04-12 | Enhancing Autonomous Vehicle Training with Language Model Integration and Critical Scenario Generation | Hanlin Tian et.al. | 2404.08570 | null |
2024-04-12 | RLHF Deciphered: A Critical Analysis of Reinforcement Learning from Human Feedback for LLMs | Shreyas Chaudhari et.al. | 2404.08555 | null |
2024-04-12 | Online Safety Analysis for LLMs: a Benchmark, an Assessment, and a Path Forward | Xuan Xie et.al. | 2404.08517 | null |
2024-04-12 | Efficient Interactive LLM Serving with Proxy Model-based Sequence Length Prediction | Haoran Qiu et.al. | 2404.08509 | link |
2024-04-12 | LaSagnA: Language-based Segmentation Assistant for Complex Queries | Cong Wei et.al. | 2404.08506 | link |
2024-04-12 | Strategic Interactions between Large Language Models-based Agents in Beauty Contests | Siting Lu et.al. | 2404.08492 | null |
2024-04-12 | Thematic Analysis with Large Language Models: does it work with languages other than English? A targeted test in Italian | Stefano De Paoli et.al. | 2404.08488 | null |
2024-04-12 | Comparing Apples to Oranges: LLM-powered Multimodal Intention Prediction in an Object Categorization Task | Hassan Ali et.al. | 2404.08424 | null |
2024-04-12 | AdapterSwap: Continuous Training of LLMs with Data Removal and Access-Control Guarantees | William Fleshman et.al. | 2404.08417 | null |
2024-04-11 | OpenBias: Open-set Bias Detection in Text-to-Image Generative Models | Moreno D'Incà et.al. | 2404.07990 | null |
2024-04-11 | Manipulating Large Language Models to Increase Product Visibility | Aounon Kumar et.al. | 2404.07981 | link |
2024-04-11 | LLoCO: Learning Long Contexts Offline | Sijun Tan et.al. | 2404.07979 | link |
2024-04-11 | Ferret-v2: An Improved Baseline for Referring and Grounding with Large Language Models | Haotian Zhang et.al. | 2404.07973 | null |
2024-04-11 | Leveraging Large Language Models (LLMs) to Support Collaborative Human-AI Online Risk Data Annotation | Jinkyung Park et.al. | 2404.07926 | null |
2024-04-11 | LaVy: Vietnamese Multimodal Large Language Model | Chi Tran et.al. | 2404.07922 | null |
2024-04-11 | AmpleGCG: Learning a Universal and Transferable Generative Model of Adversarial Suffixes for Jailbreaking Both Open and Closed LLMs | Zeyi Liao et.al. | 2404.07921 | link |
2024-04-11 | DesignQA: A Multimodal Benchmark for Evaluating Large Language Models' Understanding of Engineering Documentation | Anna C. Doris et.al. | 2404.07917 | link |
2024-04-11 | HGRN2: Gated Linear RNNs with State Expansion | Zhen Qin et.al. | 2404.07904 | link |
2024-04-11 | High-Dimension Human Value Representation in Large Language Models | Samuel Cahyawijaya et.al. | 2404.07900 | null |
2024-04-10 | UMBRAE: Unified Multimodal Decoding of Brain Signals | Weihao Xia et.al. | 2404.07202 | null |
2024-04-10 | Leave No Context Behind: Efficient Infinite Context Transformers with Infini-attention | Tsendsuren Munkhdalai et.al. | 2404.07143 | null |
2024-04-10 | Continuous Language Model Interpolation for Dynamic and Controllable Text Generation | Sara Kangaslahti et.al. | 2404.07117 | link |
2024-04-11 | From Model-centered to Human-Centered: Revision Distance as a Metric for Text Evaluation in LLMs-based Applications | Yongqiang Ma et.al. | 2404.07108 | null |
2024-04-10 | 3DMambaComplete: Exploring Structured State Space Model for Point Cloud Completion | Yixuan Li et.al. | 2404.07106 | null |
2024-04-10 | Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs | Bowen Jin et.al. | 2404.07103 | null |
2024-04-10 | Dynamic Generation of Personalities with Large Language Models | Jianzhi Liu et.al. | 2404.07084 | null |
2024-04-10 | VLLMs Provide Better Context for Emotion Understanding Through Common Sense Reasoning | Alexandros Xenos et.al. | 2404.07078 | link |
2024-04-10 | Exploring Concept Depth: How Large Language Models Acquire Knowledge at Different Layers? | Mingyu Jin et.al. | 2404.07066 | link |
2024-04-10 | Groundedness in Retrieval-augmented Long-form Generation: An Empirical Study | Alessandro Stolfo et.al. | 2404.07060 | null |
2024-04-09 | Pitfalls of Conversational LLMs on News Debiasing | Ipek Baris Schlicht et.al. | 2404.06488 | null |
2024-04-10 | Ada-LEval: Evaluating long-context LLMs with length-adaptable benchmarks | Chonghua Wang et.al. | 2404.06480 | link |
2024-04-09 | Automated Federated Pipeline for Parameter-Efficient Fine-Tuning of Large Language Models | Zihan Fang et.al. | 2404.06448 | null |
2024-04-09 | Large Language Models to the Rescue: Deadlock Resolution in Multi-Robot Systems | Kunal Garg et.al. | 2404.06413 | null |
2024-04-09 | AgentQuest: A Modular Benchmark Framework to Measure Progress and Improve LLM Agents | Luca Gioacchini et.al. | 2404.06411 | link |
2024-04-09 | Take a Look at it! Rethinking How to Evaluate Language Model Jailbreak | Hongyu Cai et.al. | 2404.06407 | link |
2024-04-09 | Apprentices to Research Assistants: Advancing Research with Large Language Models | M. Namvarpour et.al. | 2404.06404 | null |
2024-04-09 | MiniCPM: Unveiling the Potential of Small Language Models with Scalable Training Strategies | Shengding Hu et.al. | 2404.06395 | link |
2024-04-10 | MuPT: A Generative Symbolic Music Pretrained Transformer | Xingwei Qu et.al. | 2404.06393 | null |
2024-04-09 | Latent Distance Guided Alignment Training for Large Language Models | Haotian Luo et.al. | 2404.06390 | null |
2024-04-08 | MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding | Bo He et.al. | 2404.05726 | null |
2024-04-08 | Ferret-UI: Grounded Mobile UI Understanding with Multimodal LLMs | Keen You et.al. | 2404.05719 | null |
2024-04-08 | Evaluating Mathematical Reasoning Beyond Accuracy | Shijie Xia et.al. | 2404.05692 | link |
2024-04-08 | Retrieval-Augmented Open-Vocabulary Object Detection | Jooyeon Kim et.al. | 2404.05687 | link |
2024-04-08 | MoMA: Multimodal LLM Adapter for Fast Personalized Image Generation | Kunpeng Song et.al. | 2404.05674 | null |
2024-04-08 | CoReS: Orchestrating the Dance of Reasoning and Segmentation | Xiaoyi Bao et.al. | 2404.05673 | null |
2024-04-09 | Fighting crime with Transformers: Empirical analysis of address parsing methods in payment data | Haitham Hammami et.al. | 2404.05632 | link |
2024-04-08 | LTNER: Large Language Model Tagging for Named Entity Recognition with Contextualized Entity Marking | Faren Yan et.al. | 2404.05624 | null |
2024-04-08 | MedExpQA: Multilingual Benchmarking of Large Language Models for Medical Question Answering | Iñigo Alonso et.al. | 2404.05590 | null |
2024-04-08 | 360°REA: Towards A Reusable Experience Accumulation with 360° Assessment for Multi-Agent System | Shen Gao et.al. | 2404.05569 | null |
2024-04-05 | Physical Property Understanding from Language-Embedded Feature Fields | Albert J. Zhai et.al. | 2404.04242 | null |
2024-04-05 | Cleared for Takeoff? Compositional & Conditional Reasoning may be the Achilles Heel to (Flight-Booking) Language Agents | Harsh Kohli et.al. | 2404.04237 | null |
2024-04-05 | Social Skill Training with Large Language Models | Diyi Yang et.al. | 2404.04204 | null |
2024-04-05 | Ambiguity in the use of SIR models to fit epidemic incidence data | B Shayak et.al. | 2404.04181 | null |
2024-04-05 | Chinese Tiny LLM: Pretraining a Chinese-Centric Large Language Model | Xinrun Du et.al. | 2404.04167 | null |
2024-04-05 | Large language models as oracles for instantiating ontologies with domain-specific knowledge | Giovanni Ciatto et.al. | 2404.04108 | link |
2024-04-05 | Robust Preference Optimization with Provable Noise Tolerance for LLMs | Xize Liang et.al. | 2404.04102 | null |
2024-04-05 | Assessing the quality of information extraction | Filip Seitl et.al. | 2404.04068 | null |
2024-04-05 | CLUE: A Clinical Language Understanding Evaluation for LLMs | Amin Dada et.al. | 2404.04067 | null |
2024-04-05 | VoicePilot: Harnessing LLMs as Speech Interfaces for Physically Assistive Robots | Akhil Padmanabha et.al. | 2404.04066 | null |
2024-04-04 | AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent | Hanyu Lai et.al. | 2404.03648 | link |
2024-04-04 | Capabilities of Large Language Models in Control Engineering: A Benchmark Study on GPT-4, Claude 3 Opus, and Gemini 1.0 Ultra | Darioush Kevian et.al. | 2404.03647 | null |
2024-04-04 | Training LLMs over Neurally Compressed Text | Brian Lester et.al. | 2404.03626 | null |
2024-04-04 | Unveiling LLMs: The Evolution of Latent Representations in a Temporal Knowledge Graph | Marco Bronzini et.al. | 2404.03623 | null |
2024-04-04 | Visualization-of-Thought Elicits Spatial Reasoning in Large Language Models | Wenshan Wu et.al. | 2404.03622 | null |
2024-04-04 | DeViDe: Faceted medical knowledge for improved medical vision-language pre-training | Haozhe Luo et.al. | 2404.03618 | null |
2024-04-04 | Sailor: Open Language Models for South-East Asia | Longxu Dou et.al. | 2404.03608 | link |
2024-04-04 | Evaluating LLMs at Detecting Errors in LLM Responses | Ryo Kamoi et.al. | 2404.03602 | link |
2024-04-04 | Intent Detection and Entity Extraction from BioMedical Literature | Ankan Mullick et.al. | 2404.03598 | link |
2024-04-04 | SemGrasp: Semantic Grasp Generation via Language Aligned Discretization | Kailin Li et.al. | 2404.03590 | null |
2024-04-03 | ALOHa: A New Measure for Hallucination in Captioning Models | Suzanne Petryk et.al. | 2404.02904 | null |
2024-04-03 | MatAtlas: Text-driven Consistent Geometry Texturing and Material Assignment | Duygu Ceylan et.al. | 2404.02899 | null |
2024-04-03 | ChatGLM-Math: Improving Math Problem-Solving in Large Language Models with a Self-Critique Pipeline | Yifan Xu et.al. | 2404.02893 | null |
2024-04-03 | Linear Attention Sequence Parallelism | Weigao Sun et.al. | 2404.02882 | link |
2024-04-03 | Integrating Explanations in Learning LTL Specifications from Demonstrations | Ashutosh Gupta et.al. | 2404.02872 | null |
2024-04-03 | Toward Inference-optimal Mixture-of-Expert Large Language Models | Longfei Yun et.al. | 2404.02852 | null |
2024-04-03 | I-Design: Personalized LLM Interior Designer | Ata Çelen et.al. | 2404.02838 | null |
2024-04-03 | Cherry on Top: Parameter Heterogeneity and Quantization in Large Language Models | Wanyun Cui et.al. | 2404.02837 | null |
2024-04-03 | Retrieving Examples from Memory for Retrieval Augmented Neural Machine Translation: A Systematic Comparison | Maxime Bouthors et.al. | 2404.02835 | null |
2024-04-03 | Empowering Biomedical Discovery with AI Agents | Shanghua Gao et.al. | 2404.02831 | null |
2024-04-02 | Topic-based Watermarks for LLM-Generated Text | Alexander Nemecek et.al. | 2404.02138 | null |
2024-04-02 | Exploring Automated Distractor Generation for Math Multiple-choice Questions via Large Language Models | Wanyong Feng et.al. | 2404.02124 | null |
2024-04-02 | CLAPNQ: Cohesive Long-form Answers from Passages in Natural Questions for RAG systems | Sara Rosenthal et.al. | 2404.02103 | link |
2024-04-02 | Advancing LLM Reasoning Generalists with Preference Trees | Lifan Yuan et.al. | 2404.02078 | link |
2024-04-02 | SPMamba: State-space model is all you need in speech separation | Kai Li et.al. | 2404.02063 | link |
2024-04-02 | Digital Forgetting in Large Language Models: A Survey of Unlearning Methods | Alberto Blanco-Justicia et.al. | 2404.02062 | null |
2024-04-02 | Long-context LLMs Struggle with Long In-context Learning | Tianle Li et.al. | 2404.02060 | link |
2024-04-02 | Deconstructing In-Context Learning: Understanding Prompts via Corruption | Namrata Shivagunde et.al. | 2404.02054 | link |
2024-04-02 | A Survey on Large Language Model-Based Game Agents | Sihao Hu et.al. | 2404.02039 | link |
2024-04-02 | MultiParaDetox: Extending Text Detoxification with Parallel Data to New Languages | Daryna Dementieva et.al. | 2404.02037 | null |
2024-03-29 | Gecko: Versatile Text Embeddings Distilled from Large Language Models | Jinhyuk Lee et.al. | 2403.20327 | null |
2024-03-29 | Convolutional Prompting meets Language Models for Continual Learning | Anurag Roy et.al. | 2403.20317 | null |
2024-03-29 | Towards Greener LLMs: Bringing Energy-Efficiency to the Forefront of LLM Inference | Jovan Stojkovic et.al. | 2403.20306 | null |
2024-03-29 | Can LLMs Correct Physicians, Yet? Investigating Effective Interaction Methods in the Medical Domain | Burcu Sayin et.al. | 2403.20288 | null |
2024-03-29 | LUQ: Long-text Uncertainty Quantification for LLMs | Caiqi Zhang et.al. | 2403.20279 | null |
2024-04-01 | Draw-and-Understand: Leveraging Visual Prompts to Enable MLLMs to Comprehend What You Want | Weifeng Lin et.al. | 2403.20271 | link |
2024-03-29 | Latxa: An Open Language Model and Evaluation Suite for Basque | Julen Etxaniz et.al. | 2403.20266 | link |
2024-03-29 | ELITR-Bench: A Meeting Assistant Benchmark for Long-Context Language Models | Thibaut Thonet et.al. | 2403.20262 | null |
2024-03-29 | Using LLMs to Model the Beliefs and Preferences of Targeted Populations | Keiichi Namikoshi et.al. | 2403.20252 | null |
2024-03-29 | Unleashing the Potential of Large Language Models for Predictive Tabular Tasks in Data Science | Yazheng Yang et.al. | 2403.20208 | null |
2024-03-28 | InterDreamer: Zero-Shot Text to 3D Dynamic Human-Object Interaction | Sirui Xu et.al. | 2403.19652 | null |
2024-03-28 | MagicLens: Self-Supervised Image Retrieval with Open-Ended Instructions | Kai Zhang et.al. | 2403.19651 | null |
2024-03-28 | Change-Agent: Towards Interactive Comprehensive Change Interpretation and Analysis from Change Detection and Change Captioning | Chenyang Liu et.al. | 2403.19646 | link |
2024-03-28 | Retrieval-Enhanced Knowledge Editing for Multi-Hop Question Answering in Language Models | Yucheng Shi et.al. | 2403.19631 | null |
2024-03-29 | Genetic Quantization-Aware Approximation for Non-Linear Operations in Transformers | Pingcheng Dong et.al. | 2403.19591 | link |
2024-03-28 | WaterJudge: Quality-Detection Trade-off when Watermarking Large Language Models | Piotr Molenda et.al. | 2403.19548 | null |
2024-03-28 | LLMs as Academic Reading Companions: Extending HCI Through Synthetic Personae | Celia Chen et.al. | 2403.19506 | null |
2024-03-28 | Evolving Assembly Code in an Adversarial Environment | Irina Maliukov et.al. | 2403.19489 | null |
2024-03-28 | JDocQA: Japanese Document Question Answering Dataset for Generative Language Models | Eri Onami et.al. | 2403.19454 | null |
2024-03-28 | Mixed Preference Optimization: Reinforcement Learning with Data Selection and Better Reference Model | Qi Gou et.al. | 2403.19443 | null |
2024-03-27 | Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models | Yanwei Li et.al. | 2403.18814 | link |
2024-03-27 | Long-form factuality in large language models | Jerry Wei et.al. | 2403.18802 | link |
2024-03-27 | 3P-LLM: Probabilistic Path Planning using Large Language Model for Autonomous Robot Navigation | Ehsan Latif et.al. | 2403.18778 | null |
2024-03-27 | CheckEval: Robust Evaluation Framework using Large Language Model via Checklist | Yukyung Lee et.al. | 2403.18771 | null |
2024-03-27 | MLDT: Multi-Level Decomposition for Complex Long-Horizon Robotic Task Planning with Open-Source Large Language Model | Yike Wu et.al. | 2403.18760 | null |
2024-03-27 | Understanding the Learning Dynamics of Alignment with Human Feedback | Shawn Im et.al. | 2403.18742 | null |
2024-03-27 | PhysicsAssistant: An LLM-Powered Interactive Learning Robot for Physics Lab Investigations | Ehsan Latif et.al. | 2403.18721 | null |
2024-03-27 | NL-ITI: Optimizing Probing and Intervention for Improvement of ITI Method | Jakub Hoscilowicz et.al. | 2403.18680 | link |
2024-03-27 | An Exploratory Study on Upper-Level Computing Students' Use of Large Language Models as Tools in a Semester-Long Project | Ben Arie Tanay et.al. | 2403.18679 | null |
2024-03-27 | SDSAT: Accelerating LLM Inference through Speculative Decoding with Semantic Adaptive Tokens | Chengbo Liu et.al. | 2403.18647 | null |
2024-03-26 | Towards Explaining Hypercomplex Neural Networks | Eleonora Lopez et.al. | 2403.17929 | null |
2024-03-26 | MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution | Wei Tao et.al. | 2403.17927 | null |
2024-03-26 | LISA: Layerwise Importance Sampling for Memory-Efficient Large Language Model Fine-Tuning | Rui Pan et.al. | 2403.17919 | null |
2024-03-26 | Addressing Social Misattributions of Large Language Models: An HCXAI-based Approach | Andrea Ferrario et.al. | 2403.17873 | null |
2024-03-26 | Exploring LLMs as a Source of Targeted Synthetic Textual Data to Minimize High Confidence Misclassifications | Philip Lippmann et.al. | 2403.17860 | null |
2024-03-26 | ChroniclingAmericaQA: A Large-scale Question Answering Dataset based on Historical American Newspaper Pages | Bhawna Piryani et.al. | 2403.17859 | link |
2024-03-26 | Verbing Weirds Language (Models): Evaluation of English Zero-Derivation in Five LLMs | David R. Mortensen et.al. | 2403.17856 | null |
2024-03-26 | ArabicaQA: A Comprehensive Dataset for Arabic Question Answering | Abdelrahman Abdallah et.al. | 2403.17848 | link |
2024-03-26 | Assessment of Multimodal Large Language Models in Alignment with Human Values | Zhelun Shi et.al. | 2403.17830 | null |
2024-03-26 | Accelerating Radio Spectrum Regulation Workflows with Large Language Models (LLMs) | Amir Ghasemi et.al. | 2403.17819 | null |
2024-03-25 | Synapse: Learning Preferential Concepts from Visual Demonstrations | Sadanand Modak et.al. | 2403.16689 | null |
2024-03-25 | Investigation of the effectiveness of applying ChatGPT in Dialogic Teaching Using Electroencephalography | Jiayue Zhang et.al. | 2403.16687 | null |
2024-03-26 | RU22Fact: Optimizing Evidence for Multilingual Explainable Fact-Checking on Russia-Ukraine Conflict | Yirong Zeng et.al. | 2403.16662 | link |
2024-03-26 | CLHA: A Simple yet Effective Contrastive Learning Framework for Human Alignment | Feiteng Fang et.al. | 2403.16649 | null |
2024-03-25 | Virtual Co-Pilot: Multimodal Large Language Model-enabled Quick-access Procedures for Single Pilot Operations | Fan Li et.al. | 2403.16645 | null |
2024-03-25 | Conversational Grounding: Annotation and Analysis of Grounding Acts and Grounding Units | Biswesh Mohapatra et.al. | 2403.16609 | null |
2024-03-25 | TrustAI at SemEval-2024 Task 8: A Comprehensive Analysis of Multi-domain Machine Generated Text Detection Techniques | Ashok Urlana et.al. | 2403.16592 | null |
2024-03-25 | Can Large Language Models (or Humans) Distill Text? | Nicolas Audinet de Pieuchon et.al. | 2403.16584 | null |
2024-03-25 | NSINA: A News Corpus for Sinhala | Hansi Hettiarachchi et.al. | 2403.16571 | link |
2024-03-25 | Elysium: Exploring Object-level Perception in Videos via MLLM | Han Wang et.al. | 2403.16558 | link |
2024-03-22 | LLaVA-PruMerge: Adaptive Token Reduction for Efficient Large Multimodal Models | Yuzhang Shang et.al. | 2403.15388 | null |
2024-03-22 | Can large language models explore in-context? | Akshay Krishnamurthy et.al. | 2403.15371 | null |
2024-03-22 | CoLLEGe: Concept Embedding Generation for Large Language Models | Ryan Teehan et.al. | 2403.15362 | null |
2024-03-22 | Sphere Neural-Networks for Rational Reasoning | Tiansi Dong et.al. | 2403.15297 | null |
2024-03-22 | Measuring Gender and Racial Biases in Large Language Models | Jiafu An et.al. | 2403.15281 | null |
2024-03-22 | Bioinformatics and Biomedical Informatics with ChatGPT: Year One Review | Jinge Wang et.al. | 2403.15274 | null |
2024-03-22 | Event Temporal Relation Extraction based on Retrieval-Augmented on LLMs | Xiaobin Zhang et.al. | 2403.15273 | null |
2024-03-22 | Imagination Augmented Generation: Learning to Imagine Richer Context for Question Answering over Large Language Models | Huanxuan Liao et.al. | 2403.15268 | link |
2024-03-22 | FollowIR: Evaluating and Teaching Information Retrieval Models to Follow Instructions | Orion Weller et.al. | 2403.15246 | null |
2024-03-22 | An Exploratory Investigation into Code License Infringements in Large Language Model Training Datasets | Jonathan Katzy et.al. | 2403.15230 | null |
2024-03-21 | MathVerse: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? | Renrui Zhang et.al. | 2403.14624 | null |
2024-03-21 | Parameter-Efficient Fine-Tuning for Large Models: A Comprehensive Survey | Zeyu Han et.al. | 2403.14608 | null |
2024-03-21 | Large Language Models for Multi-Choice Question Classification of Medical Subjects | Víctor Ponce-López et.al. | 2403.14582 | null |
2024-03-21 | RAmBLA: A Framework for Evaluating the Reliability of LLMs as Assistants in the Biomedical Domain | William James Bolton et.al. | 2403.14578 | link |
2024-03-21 | A Chain-of-Thought Prompting Approach with LLMs for Evaluating Students' Formative Assessment Responses in Science | Clayton Cohn et.al. | 2403.14565 | null |
2024-03-21 | EDT: Improving Large Language Models' Generation by Entropy-based Dynamic Temperature Sampling | Shimao Zhang et.al. | 2403.14541 | null |
2024-03-22 | Cobra: Extending Mamba to Multi-Modal Large Language Model for Efficient Inference | Han Zhao et.al. | 2403.14520 | null |
2024-03-21 | The Ethics of ChatGPT in Medicine and Healthcare: A Systematic Review on Large Language Models (LLMs) | Joschka Haltaufderheide et.al. | 2403.14473 | null |
2024-03-21 | Detoxifying Large Language Models via Knowledge Editing | Mengru Wang et.al. | 2403.14472 | link |
2024-03-21 | ChatGPT Alternative Solutions: Large Language Models Survey | Hanieh Alipour et.al. | 2403.14469 | null |
2024-03-20 | RAR: Retrieving And Ranking Augmented MLLMs for Visual Recognition | Ziyu Liu et.al. | 2403.13805 | null |
2024-03-20 | Learning from Models and Data for Visual Grounding | Ruozhen He et.al. | 2403.13804 | null |
2024-03-20 | ZigMa: Zigzag Mamba Diffusion Model | Vincent Tao Hu et.al. | 2403.13802 | null |
2024-03-20 | Reverse Training to Nurse the Reversal Curse | Olga Golovneva et.al. | 2403.13799 | null |
2024-03-20 | Chain-of-Interaction: Enhancing Large Language Models for Psychiatric Behavior Understanding by Dyadic Contexts | Guangzeng Han et.al. | 2403.13786 | null |
2024-03-20 | EthioLLM: Multilingual Large Language Models for Ethiopian Languages with Task Evaluation | Atnafu Lambebo Tonja et.al. | 2403.13737 | null |
2024-03-20 | Large Language Models meet Network Slicing Management and Orchestration | Abdulhalim Dandoush et.al. | 2403.13721 | null |
2024-03-21 | RoleInteract: Evaluating the Social Interaction of Role-Playing Agents | Hongzhan Chen et.al. | 2403.13679 | null |
2024-03-20 | H-vmunet: High-order Vision Mamba UNet for Medical Image Segmentation | Renkai Wu et.al. | 2403.13642 | link |
2024-03-21 | Do Not Worry if You Do Not Have Data: Building Pretrained Language Models Using Translationese | Meet Doshi et.al. | 2403.13638 | null |
2024-03-19 | Dated Data: Tracing Knowledge Cutoffs in Large Language Models | Jeffrey Cheng et.al. | 2403.12958 | null |
2024-03-19 | Automatic Information Extraction From Employment Tribunal Judgements Using Large Language Models | Joana Ribeiro de Faria et.al. | 2403.12936 | null |
2024-03-19 | Rapid AIdeation: Generating Ideas With the Self and in Collaboration With Large Language Models | Gionnieve Lim et.al. | 2403.12928 | null |
2024-03-19 | Supporting Energy Policy Research with Large Language Models | Grant Buster et.al. | 2403.12924 | null |
2024-03-19 | Semantic Layering in Room Segmentation via LLMs | Taehyeon Kim et.al. | 2403.12920 | null |
2024-03-19 | Toward Sustainable GenAI using Generation Directives for Carbon-Friendly Large Language Model Inference | Baolin Li et.al. | 2403.12900 | null |
2024-03-19 | mPLUG-DocOwl 1.5: Unified Structure Learning for OCR-free Document Understanding | Anwen Hu et.al. | 2403.12895 | link |
2024-03-20 | MEDBind: Unifying Language and Multimodal Medical Data Embeddings | Yuan Gao et.al. | 2403.12894 | null |
2024-03-19 | HYDRA: A Hyper Agent for Dynamic Compositional Visual Reasoning | Fucai Ke et.al. | 2403.12884 | null |
2024-03-19 | Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models | Zehui Chen et.al. | 2403.12881 | link |
2024-03-18 | HDLdebugger: Streamlining HDL debugging with Large Language Models | Xufeng Yao et.al. | 2403.11671 | null |
2024-03-18 | Let's Focus on Neuron: Neuron-Level Supervised Fine-tuning for Large Language Model | Haoyun Xu et.al. | 2403.11621 | null |
2024-03-18 | Linguacodus: A Synergistic Framework for Transformative Code Generation in Machine Learning Pipelines | Ekaterina Trofimova et.al. | 2403.11585 | null |
2024-03-18 | Sensitivity Assessment of Multi-Criteria Decision-Making Methods in Chemical Engineering Optimization Applications | Seyed Reza Nabavi et.al. | 2403.11569 | null |
2024-03-18 | Reinforcement Learning with Token-level Feedback for Controllable Text Generation | Wendi Li et.al. | 2403.11558 | null |
2024-03-18 | LLM^3: Large Language Model-based Task and Motion Planning with Motion Failure Reasoning | Shu Wang et.al. | 2403.11552 | link |
2024-03-18 | DEE: Dual-stage Explainable Evaluation Method for Text Generation | Shenyu Zhang et.al. | 2403.11509 | null |
2024-03-18 | VideoAgent: A Memory-augmented Multimodal Agent for Video Understanding | Yue Fan et.al. | 2403.11481 | null |
2024-03-18 | HateCOT: An Explanation-Enhanced Dataset for Generalizable Offensive Speech Detection via Large Language Models | Huy Nghiem et.al. | 2403.11456 | link |
2024-03-18 | LLM Guided Evolution - The Automation of Models Advancing Models | Clint Morris et.al. | 2403.11446 | null |
2024-03-15 | VideoAgent: Long-form Video Understanding with Large Language Model as Agent | Xiaohan Wang et.al. | 2403.10517 | null |
2024-03-15 | Demystifying Faulty Code with LLM: Step-by-Step Reasoning for Explainable Fault Localization | Ratnadira Widyasari et.al. | 2403.10507 | null |
2024-03-15 | ATOM: Asynchronous Training of Massive Models for Deep Learning in a Decentralized Environment | Xiaofeng Wu et.al. | 2403.10504 | null |
2024-03-15 | Reconfigurable Robot Identification from Motion Data | Yuhang Hu et.al. | 2403.10496 | null |
2024-03-15 | Can a GPT4-Powered AI Agent Be a Good Enough Performance Attribution Analyst? | Bruno de Melo et.al. | 2403.10482 | null |
2024-03-15 | Enhancing LLM Factual Accuracy with RAG to Counter Hallucinations: A Case Study on Domain-Specific Queries in Private Knowledge-Bases | Jiarui Li et.al. | 2403.10446 | link |
2024-03-15 | Optimal Block-Level Draft Verification for Accelerating Speculative Decoding | Ziteng Sun et.al. | 2403.10444 | null |
2024-03-15 | Using an LLM to Turn Sign Spottings into Spoken Language Sentences | Ozge Mercanoglu Sincan et.al. | 2403.10434 | null |
2024-03-15 | SocialGenPod: Privacy-Friendly Generative AI Social Web Applications with Decentralised Personal Data Stores | Vidminas Vizgirda et.al. | 2403.10408 | link |
2024-03-15 | A Thorough Comparison of Cross-Encoders and LLMs for Reranking SPLADE | Hervé Déjean et.al. | 2403.10407 | null |
2024-03-14 | Dynamic Memory Compression: Retrofitting LLMs for Accelerated Inference | Piotr Nawrot et.al. | 2403.09636 | null |
2024-03-14 | 3D-VLA: A 3D Vision-Language-Action Generative World Model | Haoyu Zhen et.al. | 2403.09631 | null |
2024-03-14 | Video Mamba Suite: State Space Model as a Versatile Alternative for Video Understanding | Guo Chen et.al. | 2403.09626 | link |
2024-03-14 | Compute-first optical detection for noise-resilient visual perception | Jungmin Kim et.al. | 2403.09612 | null |
2024-03-14 | MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training | Brandon McKinzie et.al. | 2403.09611 | null |
2024-03-14 | Large Language Models and Causal Inference in Collaboration: A Comprehensive Survey | Xiaoyu Liu et.al. | 2403.09606 | null |
2024-03-14 | Logical Discrete Graphical Models Must Supplement Large Language Models for Information Synthesis | Gregory Coppola et.al. | 2403.09599 | null |
2024-03-15 | ExploRLLM: Guiding Exploration in Reinforcement Learning with Large Language Models | Runyu Ma et.al. | 2403.09583 | null |
2024-03-14 | Eyes Closed, Safety On: Protecting Multimodal LLMs via Image-to-Text Transformation | Yunhao Gou et.al. | 2403.09572 | null |
2024-03-14 | Enhancing Trust in Autonomous Agents: An Architecture for Accountability and Explainability through Blockchain and Large Language Models | Laura Fernández-Becerra et.al. | 2403.09567 | null |
2024-03-13 | Simple and Scalable Strategies to Continually Pre-train Large Language Models | Adam Ibrahim et.al. | 2403.08763 | null |
2024-03-13 | Steering LLMs Towards Unbiased Responses: A Causality-Guided Debiasing Framework | Jingling Li et.al. | 2403.08743 | null |
2024-03-13 | The Garden of Forking Paths: Observing Dynamic Parameters Distribution in Large Language Models | Carlo Nicolini et.al. | 2403.08739 | null |
2024-03-13 | Strengthening Multimodal Large Language Model with Bootstrapped Preference Optimization | Renjie Pi et.al. | 2403.08730 | null |
2024-03-14 | SOTOPIA- | Ruiyi Wang et.al. | 2403.08715 | link |
2024-03-13 | Review of Generative AI Methods in Cybersecurity | Yagmur Yigit et.al. | 2403.08701 | null |
2024-03-13 | TeaMs-RL: Teaching LLMs to Teach Themselves Better Instructions via Reinforcement Learning | Shangding Gu et.al. | 2403.08694 | null |
2024-03-14 | Zero-shot and Few-shot Generation Strategies for Artificial Clinical Records | Erlend Frayling et.al. | 2403.08664 | null |
2024-03-13 | Human Alignment of Large Language Models through Online Preference Optimisation | Daniele Calandriello et.al. | 2403.08635 | null |
2024-03-13 | MedInsight: A Multi-Source Context Augmentation Framework for Generating Patient-Centric Medical Responses using Large Language Models | Subash Neupane et.al. | 2403.08607 | null |
2024-03-12 | Beyond Text: Frozen Large Language Models in Visual Signal Comprehension | Lei Zhu et.al. | 2403.07874 | link |
2024-03-12 | Rethinking Generative Large Language Model Evaluation for Semantic Comprehension | Fangyun Wei et.al. | 2403.07872 | null |
2024-03-12 | Exploring Safety Generalization Challenges of Large Language Models via Code | Qibing Ren et.al. | 2403.07865 | null |
2024-03-12 | DeliGrasp: Inferring Object Mass, Friction, and Compliance with LLMs for Adaptive and Minimally Deforming Grasp Policies | William Xie et.al. | 2403.07832 | null |
2024-03-12 | The Missing Piece in Model Editing: A Deep Dive into the Hidden Damage Brought By Model Editing | Jianchen Wang et.al. | 2403.07825 | null |
2024-03-12 | Branch-Train-MiX: Mixing Expert LLMs into a Mixture-of-Experts LLM | Sainbayar Sukhbaatar et.al. | 2403.07816 | null |
2024-03-12 | Fine-tuning Large Language Models with Sequential Instructions | Hanxu Hu et.al. | 2403.07794 | link |
2024-03-12 | Transforming Competition into Collaboration: The Revolutionary Role of Multi-Agent Systems and Language Models in Modern Organizations | Carlos Jose Xavier Cruz et.al. | 2403.07769 | link |
2024-03-12 | Synth | Sahand Sharifzadeh et.al. | 2403.07750 | null |
2024-03-12 | FineMath: A Fine-Grained Mathematical Evaluation Benchmark for Chinese Large Language Models | Yan Liu et.al. | 2403.07747 | null |
2024-03-11 | Hybrid Human-LLM Corpus Construction and LLM Evaluation for Rare Linguistic Phenomena | Leonie Weissweiler et.al. | 2403.06965 | null |
2024-03-11 | Materials science in the era of large language models: a perspective | Ge Lei et.al. | 2403.06949 | null |
2024-03-11 | Naming, Describing, and Quantifying Visual Objects in Humans and LLMs | Alberto Testoni et.al. | 2403.06935 | null |
2024-03-11 | ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis | Yanming Liu et.al. | 2403.06932 | link |
2024-03-12 | MEND: Meta dEmonstratioN Distillation for Efficient and Effective In-Context Learning | Yichuan Li et.al. | 2403.06914 | null |
2024-03-11 | Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents | Nishchal Prasad et.al. | 2403.06872 | null |
2024-03-11 | Development of a Reliable and Accessible Caregiving Language Model (CaLM) | Bambang Parmanto et.al. | 2403.06857 | null |
2024-03-11 | DriveDreamer-2: LLM-Enhanced World Models for Diverse Driving Video Generation | Guosheng Zhao et.al. | 2403.06845 | null |
2024-03-11 | RA-ISF: Learning to Answer and Understand from Retrieval Augmentation via Iterative Self-Feedback | Yanming Liu et.al. | 2403.06840 | link |
2024-03-11 | ACFIX: Guiding LLMs with Mined Common RBAC Practices for Context-Aware Repair of Access Control Vulnerabilities in Smart Contracts | Lyuye Zhang et.al. | 2403.06838 | null |
2024-03-08 | Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context | Machel Reid et.al. | 2403.05530 | null |
2024-03-08 | GEAR: An Efficient KV Cache Compression Recipe for Near-Lossless Generative Inference of LLM | Hao Kang et.al. | 2403.05527 | link |
2024-03-08 | Beyond Finite Data: Towards Data-free Out-of-distribution Generalization via Extrapola | Yijiang Li et.al. | 2403.05523 | null |
2024-03-08 | Will GPT-4 Run DOOM? | Adrian de Wynter et.al. | 2403.05468 | null |
2024-03-08 | Cost-Performance Optimization for Processing Low-Resource Language Tasks Using Commercial LLMs | Arijit Nag et.al. | 2403.05434 | null |
2024-03-08 | Explaining Pre-Trained Language Models with Attribution Scores: An Analysis in Low-Resource Settings | Wei Zhou et.al. | 2403.05338 | null |
2024-03-08 | ChatASU: Evoking LLM's Reflexion to Truly Understand Aspect Sentiment in Dialogues | Yiding Liu et.al. | 2403.05326 | null |
2024-03-08 | RAT: Retrieval Augmented Thoughts Elicit Context-Aware Reasoning in Long-Horizon Generation | Zihao Wang et.al. | 2403.05313 | null |
2024-03-08 | Tapilot-Crossing: Benchmarking and Evolving LLMs Towards Interactive Data Analysis Agents | Jinyang Li et.al. | 2403.05307 | null |
2024-03-08 | ACLSum: A New Dataset for Aspect-based Summarization of Scientific Publications | Sotaro Takeshita et.al. | 2403.05303 | link |
2024-03-07 | iScore: Visual Analytics for Interpreting How Language Models Automatically Score Summaries | Adam Coscia et.al. | 2403.04760 | link |
2024-03-07 | KnowledgeVIS: Interpreting Language Models by Comparing Fill-in-the-Blank Prompts | Adam Coscia et.al. | 2403.04758 | link |
2024-03-07 | LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error | Boshi Wang et.al. | 2403.04746 | link |
2024-03-07 | ObjectCompose: Evaluating Resilience of Vision-Based Models on Object-to-Background Compositional Changes | Hashmat Shadab Malik et.al. | 2403.04701 | null |
2024-03-07 | Fact-Checking the Output of Large Language Models via Token-Level Uncertainty Quantification | Ekaterina Fadeeva et.al. | 2403.04696 | null |
2024-03-07 | Telecom Language Models: Must They Be Large? | Nicola Piovesan et.al. | 2403.04666 | null |
2024-03-07 | Teaching Large Language Models to Reason with Reinforcement Learning | Alex Havrilla et.al. | 2403.04642 | null |
2024-03-07 | CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios | Qilang Ye et.al. | 2403.04640 | link |
2024-03-07 | A Detailed Audio-Text Data Simulation Pipeline using Single-Event Sounds | Xuenan Xu et.al. | 2403.04594 | null |
2024-03-07 | Wiki-TabNER: Advancing Table Interpretation Through Named Entity Recognition | Aneta Koleva et.al. | 2403.04577 | null |
2024-03-06 | Bridging Language and Items for Retrieval and Recommendation | Yupeng Hou et.al. | 2403.03952 | link |
2024-03-06 | Did Translation Models Get More Robust Without Anyone Even Noticing? | Ben Peters et.al. | 2403.03923 | null |
2024-03-06 | Fuzzing BusyBox: Leveraging LLM and Crash Reuse for Embedded Bug Unearthing | Asmita et.al. | 2403.03897 | null |
2024-03-06 | SaulLM-7B: A pioneering Large Language Model for Law | Pierre Colombo et.al. | 2403.03883 | null |
2024-03-06 | Learning to Decode Collaboratively with Multiple Language Models | Shannon Zejiang Shen et.al. | 2403.03870 | link |
2024-03-06 | On the Origins of Linear Representations in Large Language Models | Yibo Jiang et.al. | 2403.03867 | null |
2024-03-06 | KIWI: A Dataset of Knowledge-Intensive Writing Instructions for Answering Research Questions | Fangyuan Xu et.al. | 2403.03866 | null |
2024-03-06 | Are Language Models Puzzle Prodigies? Algorithmic Puzzles Unveil Serious Challenges in Multimodal Reasoning | Deepanway Ghosal et.al. | 2403.03864 | link |
2024-03-06 | X-Shot: A Unified System to Handle Frequent, Few-shot and Zero-shot Learning Simultaneously in Classification | Hanzi Xu et.al. | 2403.03863 | link |
2024-03-06 | Emojinize: Enriching Any Text with Emoji Translations | Lars Henning Klein et.al. | 2403.03857 | null |
2024-03-05 | The WMDP Benchmark: Measuring and Reducing Malicious Use With Unlearning | Nathaniel Li et.al. | 2403.03218 | null |
2024-03-05 | CLEVR-POC: Reasoning-Intensive Visual Question Answering in Partially Observable Environments | Savitha Sam Abraham et.al. | 2403.03203 | null |
2024-03-05 | Towards Democratized Flood Risk Management: An Advanced AI Assistant Enabled by GPT-4 for Enhanced Interpretability and Public Engagement | Rafaela Martelo et.al. | 2403.03188 | link |
2024-03-05 | How Well Can Transformers Emulate In-context Newton's Method? | Angeliki Giannou et.al. | 2403.03183 | null |
2024-03-05 | Behavior Generation with Latent Actions | Seungjae Lee et.al. | 2403.03181 | link |
2024-03-05 | SNIFFER: Multimodal Large Language Model for Explainable Out-of-Context Misinformation Detection | Peng Qi et.al. | 2403.03170 | null |
2024-03-05 | PARADISE: Evaluating Implicit Planning Skills of Language Models with Procedural Warnings and Tips Dataset | Arda Uzunoğlu et.al. | 2403.03167 | link |
2024-03-05 | Quantum Many-Body Physics Calculations with Large Language Models | Haining Pan et.al. | 2403.03154 | null |
2024-03-05 | Language Guided Exploration for RL Agents in Text Environments | Hitesh Golchha et.al. | 2403.03141 | null |
2024-03-05 | Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution | Flor Miriam Plaza-del-Arco et.al. | 2403.03121 | null |
2024-03-02 | LM4OPT: Unveiling the Potential of Large Language Models in Formulating Mathematical Optimization Problems | Tasnim Ahmed et.al. | 2403.01342 | null |
2024-03-02 | Chaining thoughts and LLMs to learn DNA structural biophysics | Tyler D. Ross et.al. | 2403.01332 | null |
2024-03-02 | VBART: The Turkish LLM | Meliksah Turker et.al. | 2403.01308 | null |
2024-03-02 | Improving the Validity of Automatically Generated Feedback via Reinforcement Learning | Alexander Scarlatos et.al. | 2403.01304 | link |
2024-03-02 | NoMAD-Attention: Efficient LLM Inference on CPUs Through Multiply-add-free Attention | Tianyi Zhang et.al. | 2403.01273 | null |
2024-03-02 | Employing LLMs for Incident Response Planning and Review | Sam Hays et.al. | 2403.01271 | null |
2024-03-02 | Dissecting Language Models: Machine Unlearning via Selective Pruning | Nicholas Pochinkov et.al. | 2403.01267 | null |
2024-03-02 | Accelerating Greedy Coordinate Gradient via Probe Sampling | Yiran Zhao et.al. | 2403.01251 | link |
2024-03-02 | SceneCraft: An LLM Agent for Synthesizing 3D Scene as Blender Code | Ziniu Hu et.al. | 2403.01248 | null |
2024-03-02 | Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal | Jianheng Huang et.al. | 2403.01244 | null |
2024-02-29 | The All-Seeing Project V2: Towards General Relation Comprehension of the Open World | Weiyun Wang et.al. | 2402.19474 | link |
2024-02-29 | Loose LIPS Sink Ships: Asking Questions in Battleship with Language-Informed Program Sampling | Gabriel Grand et.al. | 2402.19471 | null |
2024-02-29 | Towards Tracing Trustworthiness Dynamics: Revisiting Pre-training Period of Large Language Models | Chen Qian et.al. | 2402.19465 | link |
2024-02-29 | Curiosity-driven Red-teaming for Large Language Models | Zhang-Wei Hong et.al. | 2402.19464 | link |
2024-02-29 | ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL | Yifei Zhou et.al. | 2402.19446 | link |
2024-02-29 | Compositional API Recommendation for Library-Oriented Code Generation | Zexiong Ma et.al. | 2402.19431 | null |
2024-02-29 | Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models | Soham De et.al. | 2402.19427 | null |
2024-02-29 | Crafting Knowledge: Exploring the Creative Mechanisms of Chat-Based Search Engines | Lijia Ma et.al. | 2402.19421 | null |
2024-02-29 | On the Scaling Laws of Geographical Representation in Language Models | Nathan Godey et.al. | 2402.19406 | null |
2024-02-29 | Entity-Aware Multimodal Alignment Framework for News Image Captioning | Junzhe Zhang et.al. | 2402.19404 | null |
2024-02-28 | Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards | Haoxiang Wang et.al. | 2402.18571 | link |
2024-02-28 | A Categorization of Complexity Classes for Information Retrieval and Synthesis Using Natural Logic | Gregory Coppola et.al. | 2402.18566 | null |
2024-02-28 | Implicit Bias of Next-Token Prediction | Christos Thrampoulidis et.al. | 2402.18551 | null |
2024-02-28 | RNNs are not Transformers (Yet): The Key Bottleneck on In-context Retrieval | Kaiyue Wen et.al. | 2402.18510 | link |
2024-02-28 | Orchid: Flexible and Data-Dependent Convolution for Sequence Modeling | Mahdi Karami et.al. | 2402.18508 | null |
2024-02-28 | Few-Shot Fairness: Unveiling LLM's Potential for Fairness-Aware Classification | Garima Chhikara et.al. | 2402.18502 | null |
2024-02-28 | Language Models Represent Beliefs of Self and Others | Wentao Zhu et.al. | 2402.18496 | null |
2024-02-28 | Meta-Task Prompting Elicits Embedding from Large Language Models | Yibin Lei et.al. | 2402.18458 | null |
2024-02-28 | Beyond Natural Language: LLMs Leveraging Alternative Formats for Enhanced Reasoning and Communication | Weize Chen et.al. | 2402.18439 | link |
2024-02-28 | Decomposed Prompting: Unveiling Multilingual Linguistic Structure Knowledge in English-Centric Large Language Models | Ercong Nie et.al. | 2402.18397 | null |
2024-02-27 | ShapeLLM: Universal 3D Object Understanding for Embodied Interaction | Zekun Qi et.al. | 2402.17766 | link |
2024-02-27 | The Era of 1-bit LLMs: All Large Language Models are in 1.58 Bits | Shuming Ma et.al. | 2402.17764 | null |
2024-02-27 | Massive Activations in Large Language Models | Mingjie Sun et.al. | 2402.17762 | link |
2024-02-27 | Evaluating Very Long-Term Conversational Memory of LLM Agents | Adyasha Maharana et.al. | 2402.17753 | null |
2024-02-27 | Tower: An Open Multilingual Large Language Model for Translation-Related Tasks | Duarte M. Alves et.al. | 2402.17733 | null |
2024-02-27 | AmbigNLG: Addressing Task Ambiguity in Instruction for NLG | Ayana Niwa et.al. | 2402.17717 | null |
2024-02-27 | Case-Based or Rule-Based: How Do Transformers Do the Math? | Yi Hu et.al. | 2402.17709 | link |
2024-02-27 | NextLevelBERT: Investigating Masked Language Modeling with Higher-Level Representations for Long Documents | Tamara Czinczoll et.al. | 2402.17682 | null |
2024-02-27 | The Emergence of Large Language Models in Static Analysis: A First Look through Micro-Benchmarks | Ashwin Prasad Shivarpatna Venkatesh et.al. | 2402.17679 | null |
2024-02-27 | Beyond prompt brittleness: Evaluating the reliability and consistency of political worldviews in LLMs | Tanise Ceron et.al. | 2402.17649 | null |
2024-02-26 | Integrating Large Language Models with Graphical Session-Based Recommendation | Naicheng Guo et.al. | 2402.16539 | null |
2024-02-26 | LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments | Junzhe Chen et.al. | 2402.16499 | null |
2024-02-26 | Unveiling ChatGPT's Usage in Open Source Projects: A Mining-based Study | Rosalia Tufano et.al. | 2402.16480 | null |
2024-02-26 | Defending LLMs against Jailbreaking Attacks via Backtranslation | Yihan Wang et.al. | 2402.16459 | null |
2024-02-26 | ProLLaMA: A Protein Large Language Model for Multi-Task Protein Language Processing | Liuzhenghao Lv et.al. | 2402.16445 | null |
2024-02-26 | ShieldLM: Empowering LLMs as Aligned, Customizable and Explainable Safety Detectors | Zhexin Zhang et.al. | 2402.16444 | link |
2024-02-26 | Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models | Tianyi Tang et.al. | 2402.16438 | null |
2024-02-26 | RoCoIns: Enhancing Robustness of Large Language Models through Code-Style Instructions | Yuansen Zhang et.al. | 2402.16431 | null |
2024-02-26 | From RAGs to riches: Using large language models to write documents for clinical trials | Nigel Markey et.al. | 2402.16406 | null |
2024-02-26 | MoZIP: A Multilingual Benchmark to Evaluate Large Language Models in Intellectual Property | Shiwen Ni et.al. | 2402.16389 | link |
2024-02-26 | Immunization against harmful fine-tuning attacks | Domenic Rosati et.al. | 2402.16382 | null |
2024-02-23 | AgentOhana: Design Unified Data and Training Pipeline for Effective Agent Learning | Jianguo Zhang et.al. | 2402.15506 | null |
2024-02-23 | API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs | Kinjal Basu et.al. | 2402.15491 | null |
2024-02-23 | Prejudice and Caprice: A Statistical Framework for Measuring Social Discrimination in Large Language Models | Yiran Liu et.al. | 2402.15481 | null |
2024-02-23 | Repetition Improves Language Model Embeddings | Jacob Mitchell Springer et.al. | 2402.15449 | link |
2024-02-23 | A Data-Centric Approach To Generate Faithful and High Quality Patient Summaries with Large Language Models | Stefan Hegselmann et.al. | 2402.15422 | link |
2024-02-23 | PREDILECT: Preferences Delineated with Zero-Shot Language-based Reasoning in Reinforcement Learning | Simon Holk et.al. | 2402.15420 | null |
2024-02-23 | Explorations of Self-Repair in Language Models | Cody Rushing et.al. | 2402.15390 | link |
2024-02-23 | Safe Task Planning for Language-Instructed Multi-Robot Systems using Conformal Prediction | Jun Wang et.al. | 2402.15368 | null |
2024-02-23 | Farsight: Fostering Responsible AI Awareness During AI Application Prototyping | Zijie J. Wang et.al. | 2402.15350 | link |
2024-02-23 | NuNER: Entity Recognition Encoder Pre-training via LLM-Annotated Data | Sergei Bogdanov et.al. | 2402.15343 | null |
2024-02-22 | PALO: A Polyglot Large Multimodal Model for 5B People | Muhammad Maaz et.al. | 2402.14818 | link |
2024-02-22 | CriticBench: Benchmarking LLMs for Critique-Correct Reasoning | Zicheng Lin et.al. | 2402.14809 | link |
2024-02-22 | RelayAttention for Efficient Large Language Model Serving with Long System Prompts | Lei Zhu et.al. | 2402.14808 | null |
2024-02-22 | A Decision-Language Model (DLM) for Dynamic Restless Multi-Armed Bandit Tasks in Public Health | Nikhil Behari et.al. | 2402.14807 | null |
2024-02-22 | Identifying Multiple Personalities in Large Language Models with External Evaluation | Xiaoyang Song et.al. | 2402.14805 | null |
2024-02-22 | Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models | Xudong Lu et.al. | 2402.14800 | link |
2024-02-22 | Zero-shot cross-lingual transfer in instruction tuning of large language model | Nadezhda Chirkova et.al. | 2402.14778 | null |
2024-02-22 | DualFocus: Integrating Macro and Micro Perspectives in Multi-modal Large Language Models | Yuhang Cao et.al. | 2402.14767 | link |
2024-02-22 | MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues | Ge Bai et.al. | 2402.14762 | null |
2024-02-22 | Generalizing Reward Modeling for Out-of-Distribution Preference Learning | Chen Jia et.al. | 2402.14760 | null |
2024-02-21 | Coercing LLMs to do and reveal (almost) anything | Jonas Geiping et.al. | 2402.14020 | link |
2024-02-21 | Is LLM-as-a-Judge Robust? Investigating Universal Adversarial Attacks on Zero-shot LLM Assessment | Vyas Raina et.al. | 2402.14016 | null |
2024-02-21 | OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems | Chaoqun He et.al. | 2402.14008 | null |
2024-02-21 | Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models | Zhiwei He et.al. | 2402.14007 | null |
2024-02-21 | Hallucinations or Attention Misdirection? The Path to Strategic Value Extraction in Business Using Large Language Models | Aline Ioste et.al. | 2402.14002 | null |
2024-02-21 | Towards Building Multilingual Language Model for Medicine | Pengcheng Qiu et.al. | 2402.13963 | link |
2024-02-21 | Making Reasoning Matter: Measuring and Improving Faithfulness of Chain-of-Thought Reasoning | Debjit Paul et.al. | 2402.13950 | null |
2024-02-21 | Do Efficient Transformers Really Save Computation? | Kai Yang et.al. | 2402.13934 | null |
2024-02-21 | Large Language Models are Vulnerable to Bait-and-Switch Attacks for Generating Harmful Content | Federico Bianchi et.al. | 2402.13926 | null |
2024-02-21 | SYNFAC-EDIT: Synthetic Imitation Edit Feedback for Factual Alignment in Clinical Summarization | Prakamya Mishra et.al. | 2402.13919 | null |
2024-02-20 | Unlocking Insights: Semantic Search in Jupyter Notebooks | Lan Li et.al. | 2402.13234 | null |
2024-02-20 | Investigating Cultural Alignment of Large Language Models | Badr AlKhamissi et.al. | 2402.13231 | link |
2024-02-20 | Smaug: Fixing Failure Modes of Preference Optimisation with DPO-Positive | Arka Pal et.al. | 2402.13228 | null |
2024-02-20 | AgentMD: Empowering Language Agents for Risk Prediction with Large-Scale Clinical Tool Learning | Qiao Jin et.al. | 2402.13225 | null |
2024-02-20 | RoCode: A Dataset for Measuring Code Intelligence from Problem Definitions in Romanian | Adrian Cosma et.al. | 2402.13222 | link |
2024-02-20 | How Easy is It to Fool Your Multimodal LLMs? An Empirical Analysis on Deceptive Prompts | Yusu Qian et.al. | 2402.13220 | null |
2024-02-20 | Softmax Probabilities (Mostly) Predict Large Language Model Correctness on Multiple-Choice Q&A | Benjamin Plaut et.al. | 2402.13213 | link |
2024-02-20 | Soft Self-Consistency Improves Language Model Agents | Han Wang et.al. | 2402.13212 | link |
2024-02-20 | Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation | Dongjin Kang et.al. | 2402.13211 | null |
2024-02-20 | Bayesian Reward Models for LLM Alignment | Adam X. Yang et.al. | 2402.13210 | null |
2024-02-19 | Sequoia: Scalable, Robust, and Hardware-aware Speculative Decoding | Zhuoming Chen et.al. | 2402.12374 | null |
2024-02-19 | A Critical Evaluation of AI Feedback for Aligning Large Language Models | Archit Sharma et.al. | 2402.12366 | link |
2024-02-19 | Nonlinear Discrete-Time Observers with Physics-Informed Neural Networks | Hector Vargas Alvarez et.al. | 2402.12360 | null |
2024-02-19 | Graph-Based Retriever Captures the Long Tail of Biomedical Knowledge | Julien Delile et.al. | 2402.12352 | null |
2024-02-19 | GTBench: Uncovering the Strategic Reasoning Limitations of LLMs via Game-Theoretic Evaluations | Jinhao Duan et.al. | 2402.12348 | link |
2024-02-19 | Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! | Zhanhui Zhou et.al. | 2402.12343 | null |
2024-02-19 | Shall We Talk: Exploring Spontaneous Collaborations of Competing LLM Agents | Zengqing Wu et.al. | 2402.12327 | link |
2024-02-19 | ARKS: Active Retrieval in Knowledge Soup for Code Generation | Hongjin Su et.al. | 2402.12317 | null |
2024-02-19 | Is Open-Source There Yet? A Comparative Study on Commercial and Open-Source LLMs in Their Ability to Label Chest X-Ray Reports | Felix J. Dorfner et.al. | 2402.12298 | null |
2024-02-19 | Adaptive Skeleton Graph Decoding | Shuowei Jin et.al. | 2402.12280 | null |
2024-02-16 | PaLM2-VAdapter: Progressively Aligned Language Model Makes a Strong Vision-language Adapter | Junfei Xiao et.al. | 2402.10896 | null |
2024-02-16 | RLVF: Learning from Verbal Feedback without Overgeneralization | Moritz Stephan et.al. | 2402.10893 | null |
2024-02-16 | Instruction Diversity Drives Generalization To Unseen Tasks | Dylan Zhang et.al. | 2402.10891 | null |
2024-02-16 | When is Tree Search Useful for LLM Planning? It Depends on the Discriminator | Ziru Chen et.al. | 2402.10890 | null |
2024-02-16 | Multi-modal preference alignment remedies regression of visual instruction tuning on language model | Shengzhi Li et.al. | 2402.10884 | null |
2024-02-16 | EcoRank: Budget-Constrained Text Re-ranking Using Large Language Models | Muhammad Shihab Rashid et.al. | 2402.10866 | null |
2024-02-16 | Time Series Forecasting with LLMs: Understanding and Enhancing Model Capabilities | Mingyu Jin et.al. | 2402.10835 | null |
2024-02-16 | RAG-Driver: Generalisable Driving Explanations with Retrieval-Augmented In-Context Learning in Multi-Modal Large Language Model | Jianhao Yuan et.al. | 2402.10828 | null |
2024-02-16 | Quantifying the Persona Effect in LLM Simulations | Tiancheng Hu et.al. | 2402.10811 | null |
2024-02-16 | Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond | Yongqi Li et.al. | 2402.10805 | null |
2024-02-15 | Hierarchical State Space Models for Continuous Sequence-to-Sequence Modeling | Raunaq Bhirangi et.al. | 2402.10211 | null |
2024-02-15 | Self-Play Fine-Tuning of Diffusion Models for Text-to-Image Generation | Huizhuo Yuan et.al. | 2402.10210 | null |
2024-02-15 | Rewards-in-Context: Multi-objective Alignment of Foundation Models with Dynamic Preference Adjustment | Rui Yang et.al. | 2402.10207 | null |
2024-02-15 | Chain-of-Thought Reasoning Without Prompting | Xuezhi Wang et.al. | 2402.10200 | null |
2024-02-15 | A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents | Lingbo Mo et.al. | 2402.10196 | link |
2024-02-15 | BitDelta: Your Fine-Tune May Only Be Worth One Bit | James Liu et.al. | 2402.10193 | link |
2024-02-15 | Uncertainty Decomposition and Quantification for In-Context Learning of Large Language Models | Chen Ling et.al. | 2402.10189 | link |
2024-02-15 | Rethinking Information Structures in RLHF: Reward Generalization from a Graph Theory Perspective | Tianyi Qiu et.al. | 2402.10184 | null |
2024-02-15 | TDAG: A Multi-Agent Framework based on Dynamic Task Decomposition and Agent Generation | Yaoxiang Wang et.al. | 2402.10178 | null |
2024-02-15 | OpenMathInstruct-1: A 1.8 Million Math Instruction Tuning Dataset | Shubham Toshniwal et.al. | 2402.10176 | link |
2024-02-14 | AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential Reasoning Ability | Siwei Yang et.al. | 2402.09404 | link |
2024-02-14 | Reinforcement Learning from Human Feedback with Active Queries | Kaixuan Ji et.al. | 2402.09401 | null |
2024-02-14 | Get More with LESS: Synthesizing Recurrence with KV Cache Compression for Efficient LLM Inference | Harry Dong et.al. | 2402.09398 | null |
2024-02-14 | LlaSMol: Advancing Large Language Models for Chemistry with a Large-Scale, Comprehensive, High-Quality Instruction Tuning Dataset | Botao Yu et.al. | 2402.09391 | link |
2024-02-14 | HGOT: Hierarchical Graph of Thoughts for Retrieval-Augmented In-Context Learning in Factuality Evaluation | Yihao Fang et.al. | 2402.09390 | null |
2024-02-14 | Massively Multi-Cultural Knowledge Acquisition & LM Benchmarking | Yi Fung et.al. | 2402.09369 | null |
2024-02-14 | Copyright Traps for Large Language Models | Matthieu Meeus et.al. | 2402.09363 | null |
2024-02-14 | HiRE: High Recall Approximate Top-$k$ Estimation for Efficient LLM Inference | Yashas Samaga B L et.al. | 2402.09360 | null |
2024-02-14 | Developing a Framework for Auditing Large Language Models Using Human-in-the-Loop | Maryam Amirizaniani et.al. | 2402.09346 | null |
2024-02-14 | AuditLLM: A Tool for Auditing Large Language Models Using Multiprobe Approach | Maryam Amirizaniani et.al. | 2402.09334 | null |
2024-02-13 | COLD-Attack: Jailbreaking LLMs with Stealthiness and Controllability | Xingang Guo et.al. | 2402.08679 | link |
2024-02-13 | Human Curriculum Effects Emerge with In-Context Learning in Neural Networks | Jacob Russin et.al. | 2402.08674 | null |
2024-02-13 | Improving Generalization in Semantic Parsing by Increasing Natural Language Variation | Irina Saparina et.al. | 2402.08666 | null |
2024-02-13 | The Last JITAI? The Unreasonable Effectiveness of Large Language Models in Issuing Just-in-Time Adaptive Interventions: Fostering Physical Activity in a Prospective Cardiac Rehabilitation Setting | David Haag et.al. | 2402.08658 | null |
2024-02-13 | PIN: Positional Insert Unlocks Object Localisation Abilities in VLMs | Michael Dorkenwald et.al. | 2402.08657 | null |
2024-02-13 | Tandem Transformers for Inference Efficient LLMs | Aishwarya P S et.al. | 2402.08644 | null |
2024-02-13 | SemRel2024: A Collection of Semantic Textual Relatedness Datasets for 14 Languages | Nedjma Ousidhoum et.al. | 2402.08638 | null |
2024-02-13 | Knowledge Editing on Black-box Large Language Models | Xiaoshuai Song et.al. | 2402.08631 | null |
2024-02-13 | Test-Time Backdoor Attacks on Multimodal Large Language Models | Dong Lu et.al. | 2402.08577 | link |
2024-02-13 | Agent Smith: A Single Image Can Jailbreak One Million Multimodal LLM Agents Exponentially Fast | Xiangming Gu et.al. | 2402.08567 | link |
2024-02-12 | WildfireGPT: Tailored Large Language Model for Wildfire Analysis | Yangxinyu Xie et.al. | 2402.07877 | null |
2024-02-12 | Policy Improvement using Language Feedback Models | Victor Zhong et.al. | 2402.07876 | null |
2024-02-12 | Scaling Laws for Fine-Grained Mixture of Experts | Jakub Krajewski et.al. | 2402.07871 | null |
2024-02-12 | PoisonedRAG: Knowledge Poisoning Attacks to Retrieval-Augmented Generation of Large Language Models | Wei Zou et.al. | 2402.07867 | link |
2024-02-12 | AI-Augmented Predictions: LLM Assistants Improve Human Forecasting Accuracy | Philipp Schoenegger et.al. | 2402.07862 | null |
2024-02-12 | Lissard: Long and Simple Sequential Reasoning Datasets | Mirelle Bueno et.al. | 2402.07859 | null |
2024-02-12 | Mercury: An Efficiency Benchmark for LLM Code Synthesis | Mingzhe Du et.al. | 2402.07844 | null |
2024-02-12 | Do Membership Inference Attacks Work on Large Language Models? | Michael Duan et.al. | 2402.07841 | null |
2024-02-12 | Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model | Ahmet Üstün et.al. | 2402.07827 | null |
2024-02-12 | Differentially Private Zeroth-Order Methods for Scalable Large Language Model Finetuning | Z Liu et.al. | 2402.07818 | null |
2024-02-09 | Understanding the Effects of Iterative Prompting on Truthfulness | Satyapriya Krishna et.al. | 2402.06625 | null |
2024-02-09 | Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning | Shivalika Singh et.al. | 2402.06619 | null |
2024-02-09 | On the Out-Of-Distribution Generalization of Multimodal Large Language Models | Xingxuan Zhang et.al. | 2402.06599 | null |
2024-02-09 | CigaR: Cost-efficient Program Repair with LLMs | Dávid Hidvégi et.al. | 2402.06598 | null |
2024-02-09 | Understanding the Weakness of Large Language Model Agents within a Complex Android Environment | Mingzhe Xing et.al. | 2402.06596 | link |
2024-02-09 | G-SciEdBERT: A Contextualized LLM for Science Assessment Tasks in German | Ehsan Latif et.al. | 2402.06584 | null |
2024-02-09 | The Quantified Boolean Bayesian Network: Theory and Experiments with a Logical Graphical Model | Gregory Coppola et.al. | 2402.06557 | null |
2024-02-09 | Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA | Marek Šuppa et.al. | 2402.06549 | null |
2024-02-09 | Calibrating Long-form Generations from Large Language Models | Yukun Huang et.al. | 2402.06544 | null |
2024-02-09 | Introspective Planning: Guiding Language-Enabled Agents to Refine Their Own Uncertainty | Kaiqu Liang et.al. | 2402.06529 | null |
2024-02-08 | SPHINX-X: Scaling Data and Parameters for a Family of Multi-modal Large Language Models | Peng Gao et.al. | 2402.05935 | link |
2024-02-08 | Driving Everywhere with Large Language Model Policy Adaptation | Boyi Li et.al. | 2402.05932 | null |
2024-02-08 | WebLINX: Real-World Website Navigation with Multi-Turn Dialogue | Xing Han Lù et.al. | 2402.05930 | null |
2024-02-08 | On the Convergence of Zeroth-Order Federated Tuning in Large Language Models | Zhenqing Ling et.al. | 2402.05926 | null |
2024-02-08 | Efficient Stagewise Pretraining via Progressive Subnetworks | Abhishek Panigrahi et.al. | 2402.05913 | null |
2024-02-08 | FACT-GPT: Fact-Checking Augmentation via Claim Matching with LLMs | Eun Cheol Choi et.al. | 2402.05904 | null |
2024-02-08 | Large Language Model Meets Graph Neural Network in Knowledge Distillation | Shengxiang Hu et.al. | 2402.05894 | null |
2024-02-08 | Generative Echo Chamber? Effects of LLM-Powered Search Systems on Diverse Information Seeking | Nikhil Sharma et.al. | 2402.05880 | null |
2024-02-08 | PromptCrypt: Prompt Encryption for Secure Communication with Large Language Models | Guo Lin et.al. | 2402.05868 | link |
2024-02-08 | How Well Can LLMs Negotiate? NegotiationArena Platform and Analysis | Federico Bianchi et.al. | 2402.05863 | link |
2024-02-07 | Opening the AI black box: program synthesis via mechanistic interpretability | Eric J. Michaud et.al. | 2402.05110 | null |
2024-02-07 | You Can REST Now: Automated Specification Inference and Black-Box Testing of RESTful APIs with Large Language Models | Alix Decrop et.al. | 2402.05102 | null |
2024-02-07 | Hydragen: High-Throughput LLM Inference with Shared Prefixes | Jordan Juravsky et.al. | 2402.05099 | null |
2024-02-07 | Mamba-UNet: UNet-Like Pure Visual Mamba for Medical Image Segmentation | Ziyang Wang et.al. | 2402.05079 | link |
2024-02-07 | SALAD-Bench: A Hierarchical and Comprehensive Safety Benchmark for Large Language Models | Lijun Li et.al. | 2402.05044 | link |
2024-02-07 | A Sober Look at LLMs for Material Discovery: Are They Actually Good for Bayesian Optimization Over Molecules? | Agustinus Kristiadi et.al. | 2402.05015 | link |
2024-02-07 | Pedagogical Alignment of Large Language Models | Shashank Sonkar et.al. | 2402.05000 | null |
2024-02-07 | An Enhanced Prompt-Based LLM Reasoning Scheme via Knowledge Graph-Integrated Collaboration | Yihao Li et.al. | 2402.04978 | null |
2024-02-07 | ChatScratch: An AI-Augmented System Toward Autonomous Visual Programming Learning for Children Aged 6-12 | Liuqing Chen et.al. | 2402.04975 | null |
2024-02-07 | Reconfidencing LLMs from the Grouping Loss Perspective | Lihu Chen et.al. | 2402.04957 | null |
2024-02-06 | AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls | Yu Du et.al. | 2402.04253 | null |
2024-02-06 | HarmBench: A Standardized Evaluation Framework for Automated Red Teaming and Robust Refusal | Mantas Mazeika et.al. | 2402.04249 | link |
2024-02-06 | Prioritizing Safeguarding Over Autonomy: Risks of LLM Agents for Science | Xiangru Tang et.al. | 2402.04247 | null |
2024-02-06 | Can Generative Agents Predict Emotion? | Ciaran Regan et.al. | 2402.04232 | null |
2024-02-06 | Explaining Autonomy: Enhancing Human-Robot Interaction through Explanation Generation with Large Language Models | David Sobrín-Hidalgo et.al. | 2402.04206 | null |
2024-02-06 | SHIELD: An Evaluation Benchmark for Face Spoofing and Forgery Detection with Multimodal Large Language Models | Yichen Shi et.al. | 2402.04178 | link |
2024-02-06 | Scaling Laws for Downstream Task Performance of Large Language Models | Berivan Isik et.al. | 2402.04177 | null |
2024-02-06 | Multi-line AI-assisted Code Authoring | Omer Dunay et.al. | 2402.04141 | null |
2024-02-06 | U-shaped Vision Mamba for Single Image Dehazing | Zhuoran Zheng et.al. | 2402.04139 | null |
2024-02-06 | Scientific Language Modeling: A Quantitative Review of Large Language Models in Molecular Science | Pengfei Liu et.al. | 2402.04119 | link |
2024-02-05 | Nevermind: Instruction Override and Moderation in Large Language Models | Edward Kim et.al. | 2402.03303 | null |
2024-02-05 | Swin-UMamba: Mamba-based UNet with ImageNet-based pretraining | Jiarun Liu et.al. | 2402.03302 | link |
2024-02-05 | GUARD: Role-playing to Generate Natural-language Jailbreakings to Test Guideline Adherence of Large Language Models | Haibo Jin et.al. | 2402.03299 | null |
2024-02-05 | Make Every Move Count: LLM-based High-Quality RTL Code Generation Using MCTS | Matthew DeLorenzo et.al. | 2402.03289 | null |
2024-02-05 | Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models | Anthony Sicilia et.al. | 2402.03284 | null |
2024-02-05 | Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models | Zhiyuan Hu et.al. | 2402.03271 | link |
2024-02-05 | Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills | Kolby Nottingham et.al. | 2402.03244 | null |
2024-02-05 | JOBSKAPE: A Framework for Generating Synthetic Job Postings to Enhance Skill Matching | Antoine Magron et.al. | 2402.03242 | link |
2024-02-05 | English Prompts are Better for NLI-based Zero-Shot Emotion Classification than Target-Language Prompts | Patrick Barreiß et.al. | 2402.03223 | null |
2024-02-05 | Unified Hallucination Detection for Multimodal Large Language Models | Xiang Chen et.al. | 2402.03190 | link |
2024-02-02 | TravelPlanner: A Benchmark for Real-World Planning with Language Agents | Jian Xie et.al. | 2402.01622 | null |
2024-02-02 | Stochastic Two Points Method for Deep Model Zeroth-order Optimization | Yijiang Pang et.al. | 2402.01621 | null |
2024-02-02 | MAGDi: Structured Distillation of Multi-Agent Interaction Graphs Improves Reasoning in Smaller Language Models | Justin Chih-Yao Chen et.al. | 2402.01620 | link |
2024-02-02 | KB-Plugin: A Plug-and-play Framework for Large Language Models to Induce Programs over Low-resourced Knowledge Bases | Jiajie Zhang et.al. | 2402.01619 | link |
2024-02-02 | Style Vectors for Steering Generative Large Language Model | Kai Konen et.al. | 2402.01618 | link |
2024-02-02 | Foundation Model Sherpas: Guiding Foundation Models through Knowledge and Reasoning | Debarun Bhattacharjya et.al. | 2402.01602 | null |
2024-02-02 | BAT: Learning to Reason about Spatial Sounds with Large Language Models | Zhisheng Zheng et.al. | 2402.01591 | null |
2024-02-02 | Homogenization Effects of Large Language Models on Human Creative Ideation | Barrett R. Anderson et.al. | 2402.01536 | null |
2024-02-02 | Decoding Speculative Decoding | Minghao Yan et.al. | 2402.01528 | null |
2024-02-02 | K-Level Reasoning with Large Language Models | Yadong Zhang et.al. | 2402.01521 | null |
2024-02-01 | Evaluating Large Language Models for Generalization and Robustness via Data Compression | Yucheng Li et.al. | 2402.00861 | null |
2024-02-01 | Can Large Language Models Understand Context? | Yilun Zhu et.al. | 2402.00858 | null |
2024-02-01 | SymbolicAI: A framework for logic-based approaches combining generative models and solvers | Marius-Constantin Dinu et.al. | 2402.00854 | link |
2024-02-01 | Score-based Causal Representation Learning: Linear and General Transformations | Burak Varıcı et.al. | 2402.00849 | null |
2024-02-01 | Tiny Titans: Can Smaller Large Language Models Punch Above Their Weight in the Real World for Meeting Summarization? | Xue-Yong Fu et.al. | 2402.00841 | null |
2024-02-01 | Common errors in Generative AI systems used for knowledge extraction in the climate action domain | Denis Havlik et.al. | 2402.00830 | null |
2024-02-01 | Formal-LLM: Integrating Formal Language and Natural Language for Controllable LLM-based Agents | Zelong Li et.al. | 2402.00798 | link |
2024-02-01 | LLMs learn governing principles of dynamical systems, revealing an in-context neural scaling law | Toni J. B. Liu et.al. | 2402.00795 | null |
2024-02-01 | CroissantLLM: A Truly Bilingual French-English Language Model | Manuel Faysse et.al. | 2402.00786 | link |
2024-02-01 | Dense Reward for Free in Reinforcement Learning from Human Feedback | Alex J. Chan et.al. | 2402.00782 | link |
2024-01-31 | Do Language Models Exhibit the Same Cognitive Biases in Problem Solving as Human Learners? | Andreas Opedal et.al. | 2401.18070 | null |
2024-01-31 | LongAlign: A Recipe for Long Context Alignment of Large Language Models | Yushi Bai et.al. | 2401.18058 | link |
2024-01-31 | Paramanu: A Family of Novel Efficient Indic Generative Foundation Language Models | Mitodru Niyogi et.al. | 2401.18034 | null |
2024-01-31 | Supporting Anticipatory Governance using LLMs: Evaluating and Aligning Large Language Models with the News Media to Anticipate the Negative Impacts of AI | Mowafak Allaham et.al. | 2401.18028 | null |
2024-01-31 | Prompt-Driven LLM Safeguarding via Directed Representation Optimization | Chujie Zheng et.al. | 2401.18018 | link |
2024-01-31 | EEG-GPT: Exploring Capabilities of Large Language Models for EEG Classification and Interpretation | Jonathan W. Kim et.al. | 2401.18006 | null |
2024-01-31 | Evaluating the Effectiveness of GPT-4 Turbo in Creating Defeaters for Assurance Cases | Kimya Khakzad Shahandashti et.al. | 2401.17991 | null |
2024-01-31 | Enhancing Multimodal Large Language Models with Vision Detection Models: An Empirical Study | Qirui Jiao et.al. | 2401.17981 | null |
2024-01-31 | HyperZ$\cdot$Z$\cdot$W Operator Connects Slow-Fast Networks for Full Context Interaction | Harvie Zhang et.al. | 2401.17948 | null |
2024-01-31 | LOCOST: State-Space Models for Long Document Abstractive Summarization | Florian Le Bronnec et.al. | 2401.17919 | link |
2024-01-30 | Weaver: Foundation Models for Creative Writing | Tiannan Wang et.al. | 2401.17268 | null |
2024-01-30 | Weak-to-Strong Jailbreaking on Large Language Models | Xuandong Zhao et.al. | 2401.17256 | link |
2024-01-30 | LLaMP: Large Language Model Made Powerful for High-fidelity Materials Knowledge Retrieval and Distillation | Yuan Chiang et.al. | 2401.17244 | null |
2024-01-31 | GazeGPT: Augmenting Human Capabilities using Gaze-contingent Contextual AI for Smart Eyewear | Robert Konrad et.al. | 2401.17217 | null |
2024-01-30 | Data-efficient Fine-tuning for LLM-based Recommendation | Xinyu Lin et.al. | 2401.17197 | null |
2024-01-30 | Transfer Learning for Text Diffusion Models | Kehang Han et.al. | 2401.17181 | null |
2024-01-30 | Conditional and Modal Reasoning in Large Language Models | Wesley H. Holliday et.al. | 2401.17169 | null |
2024-01-30 | Planning, Creation, Usage: Benchmarking LLMs for Comprehensive Tool Utilization in Real-World Complex Scenarios | Shijue Huang et.al. | 2401.17167 | link |
2024-01-30 | Learning Agent-based Modeling with LLM Companions: Experiences of Novices and Experts Using ChatGPT & NetLogo Chat | John Chen et.al. | 2401.17163 | null |
2024-01-30 | Large Language Model Evaluation via Matrix Entropy | Lai Wei et.al. | 2401.17139 | link |
2024-01-29 | Scaling Sparse Fine-Tuning to Large Language Models | Alan Ansell et.al. | 2401.16405 | null |
2024-01-29 | Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling | Pratyush Maini et.al. | 2401.16380 | null |
2024-01-29 | The role of library versions in Developer-ChatGPT conversations | Rachna Raj et.al. | 2401.16340 | null |
2024-01-29 | Machine Translation Meta Evaluation through Translation Accuracy Challenge Sets | Nikita Moghe et.al. | 2401.16313 | null |
2024-01-29 | Security Code Review by LLMs: A Deep Dive into Responses | Jiaxin Yu et.al. | 2401.16310 | null |
2024-01-29 | CO2: Efficient Distributed Training with Full Communication-Computation Overlap | Weigao Sun et.al. | 2401.16265 | null |
2024-01-29 | An Empirical Study on Usage and Perceptions of LLMs in a Software Engineering Project | Sanka Rasnayaka et.al. | 2401.16186 | null |
2024-01-29 | LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs' Vulnerability Reasoning | Yuqiang Sun et.al. | 2401.16185 | null |
2024-01-29 | LLaMandement: Large Language Models for Summarization of French Legislative Proposals | Joseph Gesnouin et.al. | 2401.16182 | null |
2024-01-29 | On Decentralized Linearly Separable Computation With the Minimum Computation Cost | Haoning Chen et.al. | 2401.16181 | null |
2024-01-26 | EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty | Yuhui Li et.al. | 2401.15077 | null |
2024-01-26 | From GPT-4 to Gemini and Beyond: Assessing the Landscape of MLLMs on Generalizability, Trustworthiness and Causality through Four Modalities | Chaochao Lu et.al. | 2401.15071 | null |
2024-01-26 | Health Text Simplification: An Annotated Corpus for Digestive Cancer Education and Novel Strategies for Reinforcement Learning | Md Mushfiqur Rahman et.al. | 2401.15043 | null |
2024-01-26 | PROXYQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models | Haochen Tan et.al. | 2401.15042 | null |
2024-01-26 | On the generalization capacity of neural networks during generic multimodal reasoning | Takuya Ito et.al. | 2401.15030 | null |
2024-01-26 | SliceGPT: Compress Large Language Models by Deleting Rows and Columns | Saleh Ashkboos et.al. | 2401.15024 | null |
2024-01-26 | Reassessing Java Code Readability Models with a Human-Centered Approach | Agnia Sergeyuk et.al. | 2401.14936 | null |
2024-01-26 | Appropriateness of LLM-equipped Robotic Well-being Coach Language in the Workplace: A Qualitative Evaluation | Micol Spitale et.al. | 2401.14935 | null |
2024-01-26 | Do LLMs Dream of Ontologies? | Marco Bombieri et.al. | 2401.14931 | null |
2024-01-26 | The Power of Noise: Redefining Retrieval for RAG Systems | Florin Cuconasu et.al. | 2401.14887 | null |
2024-01-25 | The Typing Cure: Experiences with Large Language Model Chatbots for Mental Health Support | Inhwa Song et.al. | 2401.14362 | null |
2024-01-25 | ServerlessLLM: Locality-Enhanced Serverless Inference for Large Language Models | Yao Fu et.al. | 2401.14351 | null |
2024-01-25 | Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts | Maciej Besta et.al. | 2401.14295 | null |
2024-01-25 | RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization | Jaavid Aktar Husain et.al. | 2401.14280 | null |
2024-01-25 | ZS4C: Zero-Shot Synthesis of Compilable Code for Incomplete Code Snippets using ChatGPT | Azmain Kabir et.al. | 2401.14279 | null |
2024-01-25 | GPTVoiceTasker: LLM-Powered Virtual Assistant for Smartphone | Minh Duc Vu et.al. | 2401.14268 | null |
2024-01-25 | Transformers and Cortical Waves: Encoders for Pulling In Context Across Time | Lyle Muller et.al. | 2401.14267 | null |
2024-01-25 | Improving Natural Language Capability of Code Large Language Model | Wei Li et.al. | 2401.14242 | link |
2024-01-25 | DeepSeek-Coder: When the Large Language Model Meets Programming -- The Rise of Code Intelligence | Daya Guo et.al. | 2401.14196 | link |
2024-01-25 | How Can Large Language Models Understand Spatial-Temporal Data? | Lei Liu et.al. | 2401.14192 | null |
2024-01-24 | How Good is ChatGPT at Face Biometrics? A First Look into Recognition, Soft Biometrics, and Explainability | Ivan DeAndres-Tame et.al. | 2401.13641 | null |
2024-01-25 | MM-LLMs: Recent Advances in MultiModal Large Language Models | Duzhen Zhang et.al. | 2401.13601 | null |
2024-01-24 | Consistency Guided Knowledge Retrieval and Denoising in LLMs for Zero-shot Document-level Relation Triplet Extraction | Qi Sun et.al. | 2401.13598 | null |
2024-01-24 | Graph Guided Question Answer Generation for Procedural Question-Answering | Hai X. Pham et.al. | 2401.13594 | null |
2024-01-24 | Evaluation of General Large Language Models in Contextually Assessing Semantic Concepts Extracted from Adult Critical Care Electronic Health Record Notes | Darren Liu et.al. | 2401.13588 | null |
2024-01-24 | Fine-grained Contract NER using instruction based model | Hiranmai Sri Adibhatla et.al. | 2401.13545 | null |
2024-01-24 | SpeechGPT-Gen: Scaling Chain-of-Information Speech Generation | Dong Zhang et.al. | 2401.13527 | link |
2024-01-24 | Research about the Ability of LLM in the Tamper-Detection Area | Xinyu Yang et.al. | 2401.13504 | null |
2024-01-24 | How AI Ideas Affect the Creativity, Diversity, and Evolution of Human Ideas: Evidence From a Large, Dynamic Experiment | Joshua Ashkinaze et.al. | 2401.13481 | null |
2024-01-24 | Clue-Guided Path Exploration: An Efficient Knowledge Base Question-Answering Framework with Low Computational Resource Consumption | Dehao Tao et.al. | 2401.13444 | null |
2024-01-23 | HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments | Qinhong Zhou et.al. | 2401.12975 | link |
2024-01-23 | Raidar: geneRative AI Detection viA Rewriting | Chengzhi Mao et.al. | 2401.12970 | null |
2024-01-23 | AutoRT: Embodied Foundation Models for Large Scale Orchestration of Robotic Agents | Michael Ahn et.al. | 2401.12963 | null |
2024-01-23 | Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion | Dylan Zhang et.al. | 2401.12947 | null |
2024-01-23 | Red Teaming Visual Language Models | Mukai Li et.al. | 2401.12915 | null |
2024-01-23 | From Understanding to Utilization: A Survey on Explainability for Large Language Models | Haoyan Luo et.al. | 2401.12874 | null |
2024-01-23 | KAM-CoT: Knowledge Augmented Multimodal Chain-of-Thoughts Reasoning | Debjyoti Mondal et.al. | 2401.12863 | null |
2024-01-23 | How well can large language models explain business processes? | Dirk Fahland et.al. | 2401.12846 | null |
2024-01-23 | Benchmarking LLMs via Uncertainty Quantification | Fanghua Ye et.al. | 2401.12794 | null |
2024-01-23 | Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study | W. Ronny Huang et.al. | 2401.12789 | null |
2024-01-22 | Less Could Be Better: Parameter-efficient Fine-tuning Advances Medical Vision Foundation Models | Chenyu Lian et.al. | 2401.12215 | link |
2024-01-22 | CheXagent: Towards a Foundation Model for Chest X-Ray Interpretation | Zhihong Chen et.al. | 2401.12208 | null |
2024-01-22 | APT: Adaptive Pruning and Tuning Pretrained Language Models for Efficient Training and Inference | Bowen Zhao et.al. | 2401.12200 | null |
2024-01-22 | Text Embedding Inversion Attacks on Multilingual Language Models | Yiyi Chen et.al. | 2401.12192 | null |
2024-01-22 | WARM: On the Benefits of Weight Averaged Reward Models | Alexandre Ramé et.al. | 2401.12187 | null |
2024-01-22 | CodeTailor: Personalized Parsons Puzzles are Preferred Over AI-Generated Solutions to Support Learning | Xinying Hou et.al. | 2401.12125 | null |
2024-01-22 | The Curious Case of Nonverbal Abstract Reasoning with Multi-Modal Large Language Models | Kian Ahrabian et.al. | 2401.12117 | null |
2024-01-22 | An Empirical Analysis of In-context Learning Abilities of LLMs for MT | Pranjal A. Chitale et.al. | 2401.12097 | null |
2024-01-22 | Revisiting Demonstration Selection Strategies in In-Context Learning | Keqin Peng et.al. | 2401.12087 | null |
2024-01-22 | Temporal Blind Spots in Large Language Models | Jonas Wallat et.al. | 2401.12078 | link |
2024-01-19 | Reinforcement learning for question answering in programming domain using public community scoring as a human feedback | Alexey Gorbatovski et.al. | 2401.10882 | null |
2024-01-19 | Pruning for Protection: Increasing Jailbreak Resistance in Aligned LLMs Without Fine-Tuning | Adib Hasan et.al. | 2401.10862 | null |
2024-01-19 | Using LLMs to discover emerging coded antisemitic hate-speech emergence in extremist social media | Dhanush Kikkisetti et.al. | 2401.10841 | null |
2024-01-19 | A survey on recent advances in named entity recognition | Imed Keraghel et.al. | 2401.10825 | null |
2024-01-19 | Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads | Tianle Cai et.al. | 2401.10774 | link |
2024-01-19 | Mitigating Hallucinations of Large Language Models via Knowledge Consistent Alignment | Fanqi Wan et.al. | 2401.10768 | null |
2024-01-19 | Interactions with Prompt Problems: A New Way to Teach Programming with Large Language Models | James Prather et.al. | 2401.10759 | null |
2024-01-19 | FinLLMs: A Framework for Financial Reasoning Dataset Generation with Large Language Models | Ziqiang Yuan et.al. | 2401.10744 | null |
2024-01-19 | In-IDE Human-AI Experience in the Era of Large Language Models; A Literature Review | Agnia Sergeyuk et.al. | 2401.10739 | null |
2024-01-19 | Dynamic Q&A of Clinical Documents with Large Language Models | Ran Elgedawy et.al. | 2401.10733 | null |
2024-01-18 | Towards Language-Driven Video Inpainting via Multimodal Large Language Models | Jianzong Wu et.al. | 2401.10226 | null |
2024-01-18 | ChatQA: Building GPT-4 Level Conversational QA Models | Zihan Liu et.al. | 2401.10225 | null |
2024-01-18 | Beyond Reference-Based Metrics: Analyzing Behaviors of Open LLMs on Data-to-Text Generation | Zdeněk Kasner et.al. | 2401.10186 | null |
2024-01-18 | Comparing Traditional and LLM-based Search for Image Geolocation | Albatool Wazzan et.al. | 2401.10184 | null |
2024-01-18 | Spatial-Temporal Large Language Model for Traffic Prediction | Chenxi Liu et.al. | 2401.10134 | null |
2024-01-18 | Code Prompting Elicits Conditional Reasoning Abilities in Text+Code LLMs | Haritz Puerto et.al. | 2401.10065 | link |
2024-01-18 | DiffusionGPT: LLM-Driven Text-to-Image Generation System | Jie Qin et.al. | 2401.10061 | null |
2024-01-18 | Large Language Models for Scientific Information Extraction: An Empirical Study for Virology | Mahsa Shamsabadi et.al. | 2401.10040 | null |
2024-01-18 | LOCALINTEL: Generating Organizational Threat Intelligence from Global and Local Cyber Knowledge | Shaswata Mitra et.al. | 2401.10036 | null |
2024-01-18 | Evolutionary Computation in the Era of Large Language Model: Survey and Roadmap | Xingyu Wu et.al. | 2401.10034 | null |
2024-01-17 | Vision Mamba: Efficient Visual Representation Learning with Bidirectional State Space Model | Lianghui Zhu et.al. | 2401.09417 | link |
2024-01-17 | Vlogger: Make Your Dream A Vlog | Shaobin Zhuang et.al. | 2401.09414 | link |
2024-01-17 | Deciphering Textual Authenticity: A Generalized Strategy through the Lens of Large Language Semantics for Detecting Human vs. Machine-Generated Text | Mazal Bethany et.al. | 2401.09407 | null |
2024-01-17 | Stuck in the Quicksand of Numeracy, Far from AGI Summit: Evaluating LLMs' Mathematical Competency through Ontology-guided Perturbations | Pengfei Hong et.al. | 2401.09395 | null |
2024-01-17 | Large Language Models Are Neurosymbolic Reasoners | Meng Fang et.al. | 2401.09334 | null |
2024-01-17 | Material Informatics through Neural Networks on Ab-Initio Electron Charge Densities: the Role of Transfer Learning | Dario Massa et.al. | 2401.09301 | null |
2024-01-17 | Beyond Anti-Forgetting: Multimodal Continual Instruction Tuning with Positive Forward Transfer | Junhao Zheng et.al. | 2401.09181 | null |
2024-01-17 | InternEvo: Efficient Long-sequence Large Language Model Training via Hybrid Parallelism and Redundant Sharding | Qiaoling Chen et.al. | 2401.09149 | null |
2024-01-17 | BibSonomy Meets ChatLLMs for Publication Management: From Chat to Publication Management: Organizing your related work using BibSonomy & LLMs | Tom Völker et.al. | 2401.09092 | null |
2024-01-17 | Understanding the concerns and choices of public when using large language models for healthcare | Yunpeng Xiao et.al. | 2401.09090 | null |
2024-01-16 | RAG vs Fine-tuning: Pipelines, Tradeoffs, and a Case Study on Agriculture | Aman Gupta et.al. | 2401.08406 | null |
2024-01-16 | DoraemonGPT: Toward Understanding Dynamic Scenes with Large Language Models | Zongxin Yang et.al. | 2401.08392 | link |
2024-01-16 | Exploiting Inter-Layer Expert Affinity for Accelerating Mixture-of-Experts Model Inference | Jinghan Yao et.al. | 2401.08383 | link |
2024-01-16 | Hallucination Detection and Hallucination Mitigation: An Investigation | Junliang Luo et.al. | 2401.08358 | null |
2024-01-16 | Salute the Classic: Revisiting Challenges of Machine Translation in the Age of Large Language Models | Jianhui Pang et.al. | 2401.08350 | null |
2024-01-16 | Understanding User Experience in Large Language Model Interactions | Jiayin Wang et.al. | 2401.08329 | null |
2024-01-16 | RoTBench: A Multi-Level Benchmark for Evaluating the Robustness of Large Language Models in Tool Learning | Junjie Ye et.al. | 2401.08326 | null |
2024-01-16 | Application of LLM Agents in Recruitment: A Novel Framework for Resume Screening | Chengguang Gan et.al. | 2401.08315 | null |
2024-01-16 | Anchor function: a type of benchmark functions for studying language models | Zhongwang Zhang et.al. | 2401.08309 | null |
2024-01-16 | DAPT: A Dual Attention Framework for Parameter-Efficient Continual Learning of Large Language Models | Weixiang Zhao et.al. | 2401.08295 | null |
2024-01-12 | Mind Your Format: Towards Consistent Evaluation of In-Context Learning Improvements | Anton Voronov et.al. | 2401.06766 | null |
2024-01-12 | APAR: LLMs Can Do Auto-Parallel Auto-Regressive Decoding | Mingdao Liu et.al. | 2401.06761 | null |
2024-01-12 | Few-Shot Detection of Machine-Generated Text using Style Representations | Rafael Rivera Soto et.al. | 2401.06712 | null |
2024-01-12 | Multi-Candidate Speculative Decoding | Sen Yang et.al. | 2401.06706 | link |
2024-01-12 | An Experimental Design Framework for Label-Efficient Supervised Finetuning of Large Language Models | Gantavya Bhatt et.al. | 2401.06692 | null |
2024-01-12 | Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation | Giorgos Vernikos et.al. | 2401.06688 | null |
2024-01-12 | LLMRS: Unlocking Potentials of LLM-Based Recommender Systems for Software Purchase | Angela John et.al. | 2401.06676 | null |
2024-01-12 | Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation | Jan Cegin et.al. | 2401.06643 | link |
2024-01-12 | OOP: Object-Oriented Programming Evaluation Benchmark for Large Language Models | Shuai Wang et.al. | 2401.06628 | null |
2024-01-12 | How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs | Yi Zeng et.al. | 2401.06373 | null |
2024-01-11 | TOFU: A Task of Fictitious Unlearning for LLMs | Pratyush Maini et.al. | 2401.06121 | null |
2024-01-11 | Extreme Compression of Large Language Models via Additive Quantization | Vage Egiazarian et.al. | 2401.06118 | link |
2024-01-11 | Patchscope: A Unifying Framework for Inspecting Hidden Representations of Language Models | Asma Ghandeharioun et.al. | 2401.06102 | null |
2024-01-11 | A Closer Look at AUROC and AUPRC under Class Imbalance | Matthew B. A. McDermott et.al. | 2401.06091 | link |
2024-01-11 | Autocompletion of Chief Complaints in the Electronic Health Records using Large Language Models | K M Sajjadul Islam et.al. | 2401.06088 | null |
2024-01-11 | Improving Large Language Models via Fine-grained Reinforcement Learning with Minimum Editing Constraint | Zhipeng Chen et.al. | 2401.06081 | link |
2024-01-11 | Secrets of RLHF in Large Language Models Part II: Reward Modeling | Binghai Wang et.al. | 2401.06080 | link |
2024-01-12 | LEGO: Language Enhanced Multi-modal Grounding Model | Zhaowei Li et.al. | 2401.06071 | link |
2024-01-11 | DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Damai Dai et.al. | 2401.06066 | link |
2024-01-11 | LLM-as-a-Coauthor: The Challenges of Detecting LLM-Human Mixcase | Chujie Gao et.al. | 2401.05952 | link |
2024-01-10 | Leveraging Print Debugging to Improve Code Generation in Large Language Models | Xueyu Hu et.al. | 2401.05319 | null |
2024-01-10 | Theory of Mind abilities of Large Language Models in Human-Robot Interaction: An Illusion? | Mudit Verma et.al. | 2401.05302 | null |
2024-01-10 | I am a Strange Dataset: Metalinguistic Tests for Language Models | Tristan Thrush et.al. | 2401.05300 | link |
2024-01-10 | INACIA: Integrating Large Language Models in Brazilian Audit Courts: Opportunities and Challenges | Jayr Pereira et.al. | 2401.05273 | null |
2024-01-10 | CASA: Causality-driven Argument Sufficiency Assessment | Xiao Liu et.al. | 2401.05249 | link |
2024-01-10 | Pre-trained Large Language Models for Financial Sentiment Analysis | Wei Luo et.al. | 2401.05215 | link |
2024-01-10 | Knowledge Sharing in Manufacturing using Large Language Models: User Evaluation and Model Benchmarking | Samuel Kernan Freire et.al. | 2401.05200 | null |
2024-01-10 | Monte Carlo Tree Search for Recipe Generation using GPT-2 | Karan Taneja et.al. | 2401.05199 | null |
2024-01-10 | Divide and Conquer for Large Language Models Reasoning | Zijie Meng et.al. | 2401.05190 | link |
2024-01-10 | Can ChatGPT Rival Neural Machine Translation? A Comparative Study | Zhaokun Jiang et.al. | 2401.05176 | null |
2024-01-09 | U-Mamba: Enhancing Long-range Dependency for Biomedical Image Segmentation | Jun Ma et.al. | 2401.04722 | null |
2024-01-09 | Model Editing Can Hurt General Abilities of Large Language Models | Jia-Chen Gu et.al. | 2401.04700 | link |
2024-01-09 | Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers | Gal Yona et.al. | 2401.04695 | null |
2024-01-09 | RoSA: Accurate Parameter-Efficient Fine-Tuning via Robust Adaptation | Mahdi Nikdan et.al. | 2401.04679 | null |
2024-01-09 | Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models | Zhen Qin et.al. | 2401.04658 | null |
2024-01-09 | Applying Large Language Models API to Issue Classification Problem | Gabriel Aracena et.al. | 2401.04637 | null |
2024-01-09 | DebugBench: Evaluating Debugging Capability of Large Language Models | Runchu Tian et.al. | 2401.04621 | link |
2024-01-09 | Agent Alignment in Evolving Social Norms | Shimin Li et.al. | 2401.04620 | null |
2024-01-09 | Language Detection for Transliterated Content | Selva Kumar S et.al. | 2401.04619 | null |
2024-01-09 | An Assessment on Comprehending Mental Health through Large Language Models | Mihael Arcan et.al. | 2401.04592 | null |
2024-01-08 | Unveiling Bias in Fairness Evaluations of Large Language Models: A Critical Literature Review of Music and Movie Recommendation Systems | Chandan Kumar Sah et.al. | 2401.04057 | null |
2024-01-08 | Sparse Meets Dense: A Hybrid Approach to Enhance Scientific Document Retrieval | Priyanka Mandikal et.al. | 2401.04055 | null |
2024-01-08 | Advancing Spatial Reasoning in Large Language Models: An In-Depth Evaluation and Enhancement Using the StepGame Benchmark | Fangjun Li et.al. | 2401.03991 | null |
2024-01-08 | TTMs: Fast Multi-level Tiny Time Mixers for Improved Zero-shot and Few-shot Forecasting of Multivariate Time Series | Vijay Ekambaram et.al. | 2401.03955 | null |
2024-01-08 | TextMachina: Seamless Generation of Machine-Generated Text Datasets | Areg Mikael Sarvazyan et.al. | 2401.03946 | null |
2024-01-08 | SpeechAgents: Human-Communication Simulation with Multi-Modal Multi-Agent Systems | Dong Zhang et.al. | 2401.03945 | link |
2024-01-08 | A Philosophical Introduction to Language Models -- Part I: Continuity With Classic Debates | Raphaël Millière et.al. | 2401.03910 | null |
2024-01-08 | FlightLLM: Efficient Large Language Model Inference with a Complete Mapping Flow on FPGA | Shulin Zeng et.al. | 2401.03868 | null |
2024-01-08 | Boldly Going Where No Benchmark Has Gone Before: Exposing Bias and Shortcomings in Code Generation Evaluation | Ankit Yadav et.al. | 2401.03855 | null |
2024-01-08 | Aligned with LLM: a new multi-modal training paradigm for encoding fMRI activity in visual cortex | Shuxiao Ma et.al. | 2401.03851 | null |
2024-01-05 | DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | DeepSeek-AI et.al. | 2401.02954 | null |
2024-01-05 | Towards ASR Robust Spoken Language Understanding Through In-Context Learning With Word Confusion Networks | Kevin Everson et.al. | 2401.02921 | null |
2024-01-05 | Introducing Bode: A Fine-Tuned Large Language Model for Portuguese Prompt-Based Task | Gabriel Lino Garcia et.al. | 2401.02909 | null |
2024-01-05 | MLLM-Protector: Ensuring MLLM's Safety without Hurting Performance | Renjie Pi et.al. | 2401.02906 | link |
2024-01-05 | AFSPP: Agent Framework for Shaping Preference and Personality with Large Language Models | Zihong He et.al. | 2401.02870 | null |
2024-01-05 | Generative Large Language Models are autonomous practitioners of evidence-based medicine | Akhil Vaid et.al. | 2401.02851 | null |
2024-01-05 | Thousands of AI Authors on the Future of AI | Katja Grace et.al. | 2401.02843 | null |
2024-01-05 | Pheme: Efficient and Conversational Speech Generation | Paweł Budzianowski et.al. | 2401.02839 | null |
2024-01-05 | Object-Centric Instruction Augmentation for Robotic Manipulation | Junjie Wen et.al. | 2401.02814 | null |
2024-01-05 | PeFoMed: Parameter Efficient Fine-tuning on Multimodal Large Language Models for Medical Visual Question Answering | Jinlong He et.al. | 2401.02797 | link |
2024-01-04 | Learning to Prompt with Text Only Supervision for Vision-Language Models | Muhammad Uzair Khattak et.al. | 2401.02418 | link |
2024-01-04 | LLaMA Pro: Progressive LLaMA with Block Expansion | Chengyue Wu et.al. | 2401.02415 | link |
2024-01-04 | Correctness Comparison of ChatGPT-4, Bard, Claude-2, and Copilot for Spatial Tasks | Hartwig H. Hochmair et.al. | 2401.02404 | null |
2024-01-04 | DIALIGHT: Lightweight Multilingual Development and Evaluation of Task-Oriented Dialogue Systems with Large Language Models | Songbo Hu et.al. | 2401.02208 | null |
2024-01-04 | Exploring Boundary of GPT-4V on Marine Analysis: A Preliminary Case Study | Ziqiang Zheng et.al. | 2401.02147 | null |
2024-01-04 | DCR-Consistency: Divide-Conquer-Reasoning for Consistency Evaluation and Improvement of Large Language Models | Wendi Cui et.al. | 2401.02132 | null |
2024-01-04 | ICE-GRT: Instruction Context Enhancement by Generative Reinforcement based Transformers | Chen Zheng et.al. | 2401.02072 | null |
2024-01-04 | An Example of Evolutionary Computation + Large Language Model Beating Human: Design of Efficient Guided Local Search | Fei Liu et.al. | 2401.02051 | null |
2024-01-04 | Understanding LLMs: A Comprehensive Overview from Training to Inference | Yiheng Liu et.al. | 2401.02038 | null |
2024-01-04 | Text2MDT: Extracting Medical Decision Trees from Medical Texts | Wei Zhu et.al. | 2401.02034 | null |
2024-01-03 | Mining Temporal Attack Patterns from Cyberthreat Intelligence Reports | Md Rayhanur Rahman et.al. | 2401.01883 | null |
2024-01-03 | A Vision Check-up for Language Models | Pratyusha Sharma et.al. | 2401.01862 | null |
2024-01-03 | Multilingual Instruction Tuning With Just a Pinch of Multilinguality | Uri Shaham et.al. | 2401.01854 | null |
2024-01-03 | Large Language Models Relearn Removed Concepts | Michelle Lo et.al. | 2401.01814 | null |
2024-01-03 | Navigating Uncertainty: Optimizing API Dependency for Hallucination Reduction in Closed-Book Question Answering | Pierre Erbacher et.al. | 2401.01780 | null |
2024-01-04 | Cross-target Stance Detection by Exploiting Target Analytical Perspectives | Daijun Ding et.al. | 2401.01761 | null |
2024-01-03 | Economics Arena for Large Language Models | Shangmin Guo et.al. | 2401.01735 | null |
2024-01-03 | Evaluating Large Language Models in Semantic Parsing for Conversational Question Answering over Knowledge Graphs | Phillip Schneider et.al. | 2401.01711 | link |
2024-01-03 | De-Hallucinator: Iterative Grounding for LLM-Based Code Completion | Aryaz Eghbali et.al. | 2401.01701 | null |
2024-01-03 | WordArt Designer API: User-Driven Artistic Typography Synthesis with Large Language Models on ModelScope | Jun-Yan He et.al. | 2401.01699 | null |
2024-01-02 | Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models | Zixiang Chen et.al. | 2401.01335 | null |
2024-01-02 | LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning | Hongye Jin et.al. | 2401.01325 | null |
2024-01-02 | A Comprehensive Survey of Hallucination Mitigation Techniques in Large Language Models | S. M Towhidul Islam Tonmoy et.al. | 2401.01313 | null |
2024-01-02 | LLM Harmony: Multi-Agent Communication for Problem Solving | Sumedh Rasal et.al. | 2401.01312 | null |
2024-01-02 | Large Legal Fictions: Profiling Legal Hallucinations in Large Language Models | Matthew Dahl et.al. | 2401.01301 | null |
2024-01-02 | A Comprehensive Study of Knowledge Editing for Large Language Models | Ningyu Zhang et.al. | 2401.01286 | link |
2024-01-02 | CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation | Quan Tu et.al. | 2401.01275 | null |
2024-01-02 | LLbezpeky: Leveraging Large Language Models for Vulnerability Detection | Noble Saji Mathews et.al. | 2401.01269 | null |
2024-01-02 | Fairness Certification for Natural Language Processing and Large Language Models | Vincent Freiberger et.al. | 2401.01262 | null |
2024-01-02 | VideoDrafter: Content-Consistent Multi-Scene Video Generation with LLM | Fuchen Long et.al. | 2401.01256 | null |
2023-12-29 | Jatmo: Prompt Injection Defense by Task-Specific Finetuning | Julien Piet et.al. | 2312.17673 | null |
2023-12-29 | Gemini in Reasoning: Unveiling Commonsense in Multimodal Large Language Models | Yuqing Wang et.al. | 2312.17661 | link |
2023-12-29 | Large Language Models for Generative Information Extraction: A Survey | Derong Xu et.al. | 2312.17617 | null |
2023-12-29 | Action-Item-Driven Summarization of Long Meeting Transcripts | Logan Golia et.al. | 2312.17581 | link |
2023-12-29 | Building Efficient Universal Classifiers with Natural Language Inference | Moritz Laurer et.al. | 2312.17543 | null |
2023-12-29 | Enhancing Quantitative Reasoning Skills of Large Language Models through Dimension Perception | Yuncheng Huang et.al. | 2312.17532 | null |
2023-12-29 | Overview of the PromptCBLUE Shared Task in CHIP2023 | Wei Zhu et.al. | 2312.17522 | null |
2023-12-29 | Cooperation on the Fly: Exploring Language Agents for Ad Hoc Teamwork in the Avalon Game | Zijing Shi et.al. | 2312.17515 | null |
2023-12-29 | Differentially Private Low-Rank Adaptation of Large Language Model Using Federated Learning | Xiao-Yang Liu et.al. | 2312.17493 | null |
2023-12-29 | The Right Prompts for the Job: Repair Code-Review Defects with Large Language Model | Zelin Zhao et.al. | 2312.17485 | null |
2023-12-28 | The LLM Surgeon | Tycho F. A. van der Ouderaa et.al. | 2312.17244 | null |
2023-12-28 | An Improved Baseline for Reasoning Segmentation with Large Language Model | Senqiao Yang et.al. | 2312.17240 | null |
2023-12-28 | Fast Inference of Mixture-of-Experts Language Models with Offloading | Artyom Eliseev et.al. | 2312.17238 | link |
2023-12-28 | A Simple LLM Framework for Long-Range Video Question-Answering | Ce Zhang et.al. | 2312.17235 | null |
2023-12-28 | Virtual Scientific Companion for Synchrotron Beamlines: A Prototype | Daniel Potemkin et.al. | 2312.17180 | null |
2023-12-28 | Non-Vacuous Generalization Bounds for Large Language Models | Sanae Lotfi et.al. | 2312.17173 | null |
2023-12-28 | Large Language Model for Causal Decision Making | Haitao Jiang et.al. | 2312.17122 | null |
2023-12-28 | How Far Are We from Believable AI Agents? A Framework for Evaluating the Believability of Human Behavior Simulation | Yang Xiao et.al. | 2312.17115 | null |
2023-12-28 | Challenge LLMs to Reason About Reasoning: A Benchmark to Unveil Cognitive Depth in LLMs | Zhongshen Zeng et.al. | 2312.17080 | link |
2023-12-28 | Improving In-context Learning via Bidirectional Alignment | Chengwei Qin et.al. | 2312.17055 | null |
2023-12-26 | Principled Instructions Are All You Need for Questioning LLaMA-1/2, GPT-3.5/4 | Sondos Mahmoud Bsharat et.al. | 2312.16171 | link |
2023-12-26 | Zero-Shot Cross-Lingual Reranking with Large Language Models for Low-Resource Languages | Mofetoluwa Adeyemi et.al. | 2312.16159 | null |
2023-12-26 | RoleEval: A Bilingual Role Evaluation Benchmark for Large Language Models | Tianhao Shen et.al. | 2312.16132 | null |
2023-12-26 | Large Language Model Situational Awareness Based Planning | Liman Wang et.al. | 2312.16127 | null |
2023-12-26 | A bi-objective | Aditi Singla et.al. | 2312.16119 | null |
2023-12-26 | Can ChatGPT Read Who You Are? | Erik Derner et.al. | 2312.16070 | null |
2023-12-26 | A Prompt Learning Framework for Source Code Summarization | Weisong Sun et.al. | 2312.16066 | link |
2023-12-26 | Large Language Models as Traffic Signal Control Agents: Capacity and Opportunity | Siqi Lai et.al. | 2312.16044 | link |
2023-12-26 | RecRanker: Instruction Tuning Large Language Model as Ranker for Top-k Recommendation | Sichun Luo et.al. | 2312.16018 | null |
2023-12-26 | Aligning Large Language Models with Human Preferences through Representation Engineering | Wenhao Liu et.al. | 2312.15997 | null |
2023-12-22 | A Survey of Reinforcement Learning from Human Feedback | Timo Kaufmann et.al. | 2312.14925 | null |
2023-12-22 | NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes | Lizhou Fan et.al. | 2312.14890 | link |
2023-12-22 | SutraNets: Sub-series Autoregressive Networks for Long-Sequence, Probabilistic Forecasting | Shane Bergsma et.al. | 2312.14880 | null |
2023-12-22 | Pangu-Agent: A Fine-Tunable Generalist Agent with Structured Reasoning | Filippos Christianos et.al. | 2312.14878 | null |
2023-12-22 | Robust Knowledge Extraction from Large Language Models using Social Choice Theory | Nico Potyka et.al. | 2312.14877 | null |
2023-12-22 | Numerical Reasoning for Financial Reports | Abhinav Arun et.al. | 2312.14870 | null |
2023-12-22 | VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation | Max Ku et.al. | 2312.14867 | null |
2023-12-22 | YAYI 2: Multilingual Open-Source Large Language Models | Yin Luo et.al. | 2312.14862 | null |
2023-12-22 | Turbulence: Systematically and Automatically Testing Instruction-Tuned Large Language Models for Code | Shahin Honarvar et.al. | 2312.14856 | null |
2023-12-22 | Plan, Posture and Go: Towards Open-World Text-to-Motion Generation | Jinpeng Liu et.al. | 2312.14828 | null |
2023-12-21 | VideoPoet: A Large Language Model for Zero-Shot Video Generation | Dan Kondratyuk et.al. | 2312.14125 | null |
2023-12-21 | LiDAR-LLM: Exploring the Potential of Large Language Models for 3D LiDAR Understanding | Senqiao Yang et.al. | 2312.14074 | null |
2023-12-21 | A Strong Baseline for Temporal Video-Text Alignment | Zeqian Li et.al. | 2312.14055 | null |
2023-12-21 | T-Eval: Evaluating the Tool Utilization Capability Step by Step | Zehui Chen et.al. | 2312.14033 | link |
2023-12-21 | ChatGPT as a commenter to the news: can LLMs generate human-like opinions? | Rayden Tseng et.al. | 2312.13961 | link |
2023-12-21 | Typhoon: Thai Large Language Models | Kunat Pipatanakul et.al. | 2312.13951 | null |
2023-12-21 | AsyncMLD: Asynchronous Multi-LLM Framework for Dialogue Recommendation System | Naoki Yoshimaru et.al. | 2312.13925 | null |
2023-12-21 | Domain-Specific Fine-Tuning of Large Language Models for Interactive Robot Programming | Benjamin Alt et.al. | 2312.13905 | null |
2023-12-21 | Diversifying Knowledge Enhancement of Biomedical Language Models using Adapter Modules and Knowledge Graphs | Juraj Vladika et.al. | 2312.13881 | null |
2023-12-21 | Capture the Flag: Uncovering Data Insights with Large Language Models | Issam Laradji et.al. | 2312.13876 | null |
2023-12-20 | dIR -- Discrete Information Retrieval: Conversational Search over Unstructured (and Structured) Data with Large Language Models | Pablo M. Rodriguez Bertorello et.al. | 2312.13264 | null |
2023-12-20 | Automated DevOps Pipeline Generation for Code Repositories using Large Language Models | Deep Mehta et.al. | 2312.13225 | null |
2023-12-20 | LlaMaVAE: Guiding Large Language Model Generation via Continuous Latent Sentence Spaces | Yingji Zhang et.al. | 2312.13208 | null |
2023-12-20 | Contextual Code Switching for Machine Translation using Language Models | Arshad Kaji et.al. | 2312.13179 | null |
2023-12-20 | Generative agents in the streets: Exploring the use of Large Language Models (LLMs) in collecting urban perceptions | Deepank Verma et.al. | 2312.13126 | null |
2023-12-20 | ASSISTGUI: Task-Oriented Desktop Graphical User Interface Automation | Difei Gao et.al. | 2312.13108 | null |
2023-12-20 | Exploring Multimodal Large Language Models for Radiology Report Error-checking | Jinge Wu et.al. | 2312.13103 | null |
2023-12-20 | In Generative AI we Trust: Can Chatbots Effectively Verify Political Information? | Elizaveta Kuznetsova et.al. | 2312.13096 | null |
2023-12-20 | Lampr: Boosting the Effectiveness of Language-Generic Program Reduction via Large Language Models | Mengxiao Zhang et.al. | 2312.13064 | null |
2023-12-20 | Retrieval-augmented Multilingual Knowledge Editing | Weixuan Wang et.al. | 2312.13040 | link |
2023-12-17 | Language-conditioned Learning for Robotic Manipulation: A Survey | Hongkuan Zhou et.al. | 2312.10807 | null |
2023-12-17 | A mathematical perspective on Transformers | Borjan Geshkovski et.al. | 2312.10794 | link |
2023-12-17 | Understanding the Instruction Mixture for Large Language Model | Renxi Wang et.al. | 2312.10793 | null |
2023-12-17 | kNN-ICL: Compositional Task-Oriented Parsing Generalization with Nearest Neighbor In-Context Learning | Wenting Zhao et.al. | 2312.10771 | null |
2023-12-17 | A Mutation-Based Method for Multi-Modal Jailbreaking Attack Detection | Xiaoyu Zhang et.al. | 2312.10766 | null |
2023-12-17 | M3DBench: Let's Instruct Large Models with Multi-modal 3D Prompts | Mingsheng Li et.al. | 2312.10763 | link |
2023-12-17 | Multi-Label Classification of COVID-Tweets Using Large Language Models | Aniket Deroy et.al. | 2312.10748 | link |
2023-12-17 | Knowledge Trees: Gradient Boosting Decision Trees on Knowledge Neurons as Probing Classifier | Sergey A. Saltykov et.al. | 2312.10746 | null |
2023-12-17 | A Unified Framework for Multi-Domain CTR Prediction via Large Language Models | Zichuan Fu et.al. | 2312.10743 | null |
2023-12-17 | Mixed Distillation Helps Smaller Language Model Better Reasoning | Li Chenglin et.al. | 2312.10730 | null |
2023-12-15 | Osprey: Pixel Understanding with Visual Instruction Tuning | Yuqian Yuan et.al. | 2312.10032 | link |
2023-12-15 | Challenges with unsupervised LLM knowledge discovery | Sebastian Farquhar et.al. | 2312.10029 | null |
2023-12-15 | Faithful Persona-based Conversational Dataset Generation with Large Language Models | Pegah Jandaghi et.al. | 2312.10007 | null |
2023-12-15 | Symplectic Autoencoders for Model Reduction of Hamiltonian Systems | Benedikt Brantner et.al. | 2312.10004 | null |
2023-12-15 | ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent | Renat Aksitov et.al. | 2312.10003 | null |
2023-12-15 | LLaMAntino: LLaMA 2 Models for Effective Text Generation in Italian Language | Pierpaolo Basile et.al. | 2312.09993 | null |
2023-12-15 | The Art of Balancing: Revolutionizing Mixture of Experts for Maintaining World Knowledge in Language Model Alignment | Shihan Dou et.al. | 2312.09979 | null |
2023-12-15 | Distilling Large Language Models for Matching Patients to Clinical Trials | Mauro Nievas et.al. | 2312.09958 | null |
2023-12-15 | Prompting Datasets: Data Discovery with Conversational Agents | Johanna Walker et.al. | 2312.09947 | null |
2023-12-15 | Neurosymbolic Value-Inspired AI (Why, What, and How) | Amit Sheth et.al. | 2312.09928 | null |
2023-12-14 | DriveMLM: Aligning Multi-Modal Large Language Models with Behavioral Planning States for Autonomous Driving | Wenhai Wang et.al. | 2312.09245 | link |
2023-12-14 | Auto MC-Reward: Automated Dense Reward Design with Large Language Models for Minecraft | Hao Li et.al. | 2312.09238 | null |
2023-12-14 | Pixel Aligned Language Models | Jiarui Xu et.al. | 2312.09237 | null |
2023-12-14 | Successor Heads: Recurring, Interpretable Attention Heads In The Wild | Rhys Gould et.al. | 2312.09230 | null |
2023-12-14 | Measurement in the Age of LLMs: An Application to Ideological Scaling | Sean O'Hagan et.al. | 2312.09203 | null |
2023-12-14 | General Object Foundation Model for Images and Videos at Scale | Junfeng Wu et.al. | 2312.09158 | null |
2023-12-14 | The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation | Rongwu Xu et.al. | 2312.09085 | null |
2023-12-14 | Language Modeling on a SpiNNaker 2 Neuromorphic Chip | Khaleelulla Khan Nazeer et.al. | 2312.09084 | null |
2023-12-14 | Towards Verifiable Text Generation with Evolving Memory and Self-Reflection | Hao Sun et.al. | 2312.09075 | null |
2023-12-14 | Agent Attention: On the Integration of Softmax and Linear Attention | Dongchen Han et.al. | 2312.08874 | link |
2023-12-13 | An Invitation to Deep Reinforcement Learning | Bernhard Jaeger et.al. | 2312.08365 | null |
2023-12-13 | Distributed Inference and Fine-tuning of Large Language Models Over The Internet | Alexander Borzunov et.al. | 2312.08361 | null |
2023-12-13 | FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects | Bowen Wen et.al. | 2312.08344 | null |
2023-12-13 | LD-SDM: Language-Driven Hierarchical Species Distribution Modeling | Srikumar Sastry et.al. | 2312.08334 | null |
2023-12-13 | Efficient Toxic Content Detection by Bootstrapping and Distilling Large Language Models | Jiang Zhang et.al. | 2312.08303 | null |
2023-12-13 | Conceptualizing Suicidal Behavior: Utilizing Explanations of Predicted Outcomes to Analyze Longitudinal Social Media Data | Van Minh Nguyen et.al. | 2312.08299 | link |
2023-12-14 | High-throughput Biomedical Relation Extraction for Semi-Structured Web Articles Empowered by Large Language Models | Songchi Zhou et.al. | 2312.08274 | null |
2023-12-13 | GuardRails: Automated Suggestions for Clarifying Ambiguous Purpose Statements | Mrigank Pawagi et.al. | 2312.08189 | null |
2023-12-13 | Chat-3D v2: Bridging 3D Scene and Large Language Models with Object Identifiers | Haifeng Huang et.al. | 2312.08168 | link |
2023-12-14 | Fine-Grained Image-Text Alignment in Medical Imaging Enables Cyclic Image-Report Generation | Wenting Chen et.al. | 2312.08078 | null |
2023-12-12 | VILA: On Pre-training for Visual Language Models | Ji Lin et.al. | 2312.07533 | null |
2023-12-12 | LMDrive: Closed-Loop End-to-End Driving with Large Language Models | Hao Shao et.al. | 2312.07488 | null |
2023-12-12 | Comparable Demonstrations are Important in In-Context Learning: A Novel Perspective on Demonstration Selection | Caoyun Fan et.al. | 2312.07476 | null |
2023-12-12 | MP5: A Multi-modal Open-ended Embodied System in Minecraft via Active Perception | Yiran Qin et.al. | 2312.07472 | null |
2023-12-12 | FairSISA: Ensemble Post-Processing to Improve Fairness of Unlearning in LLMs | Swanand Ravindra Kadhe et.al. | 2312.07420 | null |
2023-12-12 | On Diverse Preferences for Large Language Model Alignment | Dun Zeng et.al. | 2312.07401 | null |
2023-12-12 | Large Language Models are Clinical Reasoners: Reasoning-Aware Diagnosis Framework with Prompt-Generated Rationales | Taeyoon Kwon et.al. | 2312.07399 | null |
2023-12-12 | LLMEval: A Preliminary Study on How to Evaluate Large Language Models | Yue Zhang et.al. | 2312.07398 | null |
2023-12-12 | Sequential Planning in Large Partially Observable Environments guided by LLMs | Swarna Kamal Paul et.al. | 2312.07368 | null |
2023-12-12 | Can ChatGPT Play the Role of a Teaching Assistant in an Introductory Programming Course? | Anishka et.al. | 2312.07343 | null |
2023-12-11 | Building Domain-Specific LLMs Faithful To The Islamic Worldview: Mirage or Technical Possibility? | Shabaz Patel et.al. | 2312.06652 | link |
2023-12-11 | 4M: Massively Multimodal Masked Modeling | David Mizrahi et.al. | 2312.06647 | null |
2023-12-11 | AnyHome: Open-Vocabulary Generation of Structured and Textured 3D Homes | Zehao Wen et.al. | 2312.06644 | null |
2023-12-11 | Gated Linear Attention Transformers with Hardware-Efficient Training | Songlin Yang et.al. | 2312.06635 | null |
2023-12-11 | Emergence of Scale-Free Networks in Social Interactions among Large Language Models | Giordano De Marzo et.al. | 2312.06619 | null |
2023-12-11 | Neural Text to Articulate Talk: Deep Text to Audiovisual Speech Synthesis achieving both Auditory and Photo-realism | Georgios Milis et.al. | 2312.06613 | link |
2023-12-11 | From Text to Motion: Grounding GPT-4 in a Humanoid Robot "Alter3" | Takahide Yoshida et.al. | 2312.06571 | null |
2023-12-11 | LLM360: Towards Fully Transparent Open-Source LLMs | Zhengzhong Liu et.al. | 2312.06550 | link |
2023-12-11 | Transformers Implement Functional Gradient Descent to Learn Non-Linear Functions In Context | Xiang Cheng et.al. | 2312.06528 | null |
2023-12-11 | Grounded Question-Answering in Long Egocentric Videos | Shangzhe Di et.al. | 2312.06505 | null |
2023-12-08 | Language Models, Agent Models, and World Models: The LAW for Machine Reasoning and Planning | Zhiting Hu et.al. | 2312.05230 | null |
2023-12-08 | DeltaZip: Multi-Tenant Language Model Serving via Delta Compression | Xiaozhe Yao et.al. | 2312.05215 | null |
2023-12-08 | HALO: An Ontology for Representing Hallucinations in Generative Models | Navapat Nananukul et.al. | 2312.05209 | null |
2023-12-08 | DelucionQA: Detecting Hallucinations in Domain-specific Question Answering | Mobashir Sadat et.al. | 2312.05200 | link |
2023-12-08 | PathFinder: Guided Search over Multi-Step Reasoning Paths | Olga Golovneva et.al. | 2312.05180 | null |
2023-12-08 | Vision-based Learning for Drones: A Survey | Jiaping Xiao et.al. | 2312.05019 | null |
2023-12-08 | SparQ Attention: Bandwidth-Efficient LLM Inference | Luka Ribar et.al. | 2312.04985 | null |
2023-12-08 | The ICL Consistency Test | Lucas Weber et.al. | 2312.04945 | null |
2023-12-08 | Retrieval-based Video Language Model for Efficient Long Video Question Answering | Jiaqi Xu et.al. | 2312.04931 | null |
2023-12-08 | Towards Efficient Secure Aggregation in FL: Partial Vector Freezing for Cost Compression | Siqing Zhang et.al. | 2312.04920 | null |
2023-12-07 | Large Language Models for Mathematicians | Simon Frieder et.al. | 2312.04556 | null |
2023-12-07 | Improved Visual Grounding through Self-Consistent Explanations | Ruozhen He et.al. | 2312.04554 | null |
2023-12-07 | Generating Illustrated Instructions | Sachit Menon et.al. | 2312.04552 | null |
2023-12-07 | Using Large Language Models for Hyperparameter Optimization | Michael R. Zhang et.al. | 2312.04528 | null |
2023-12-07 | An LLM Compiler for Parallel Function Calling | Sehoon Kim et.al. | 2312.04511 | link |
2023-12-07 | A Block Metropolis-Hastings Sampler for Controllable Energy-based Text Generation | Jarad Forristal et.al. | 2312.04510 | null |
2023-12-07 | AVA: Towards Autonomous Visualization Agents through Visual Perception-Driven Decision-Making | Shusen Liu et.al. | 2312.04494 | null |
2023-12-07 | Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use | Yuhan Chen et.al. | 2312.04455 | link |
2023-12-07 | OpenAsp: A Benchmark for Multi-document Open Aspect-based Summarization | Shmuel Amar et.al. | 2312.04440 | link |
2023-12-07 | LaMPilot: An Open Benchmark Dataset for Autonomous Driving with Language Model Programs | Yunsheng Ma et.al. | 2312.04372 | null |
2023-12-06 | OneLLM: One Framework to Align All Modalities with Language | Jiaming Han et.al. | 2312.03700 | link |
2023-12-06 | An Integration of Pre-Trained Speech and Language Models for End-to-End Speech Recognition | Yukiya Hono et.al. | 2312.03668 | null |
2023-12-06 | Generative agent-based modeling with actions grounded in physical, social, or digital space using Concordia | Alexander Sasha Vezhnevets et.al. | 2312.03664 | null |
2023-12-06 | Not All Large Language Models (LLMs) Succumb to the "Reversal Curse": A Comparative Study of Deductive Logical Reasoning in BERT and GPT Models | Jingye Yang et.al. | 2312.03633 | null |
2023-12-06 | Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models | Dominik Wagner et.al. | 2312.03632 | null |
2023-12-06 | XAIQA: Explainer-Based Data Augmentation for Extractive Question Answering | Joel Stremmel et.al. | 2312.03567 | null |
2023-12-06 | When an Image is Worth 1,024 x 1,024 Words: A Case Study in Computational Pathology | Wenhui Wang et.al. | 2312.03558 | null |
2023-12-06 | Holmes: Towards Distributed Training Across Clusters with Heterogeneous NIC Environment | Fei Yang et.al. | 2312.03549 | null |
2023-12-06 | GPT-4 Enhanced Multimodal Grounding for Autonomous Driving: Leveraging Cross-Modal Attention with Large Language Models | Haicheng Liao et.al. | 2312.03543 | link |
2023-12-06 | Improving the Generalization of Segmentation Foundation Model under Distribution Shift via Weakly Supervised Adaptation | Haojie Zhang et.al. | 2312.03502 | link |
2023-12-05 | GPT4Point: A Unified Framework for Point-Language Understanding and Generation | Zhangyang Qi et.al. | 2312.02980 | null |
2023-12-05 | Rank-without-GPT: Building GPT-Independent Listwise Rerankers on Open-Source Large Language Models | Xinyu Zhang et.al. | 2312.02969 | null |
2023-12-05 | MVHumanNet: A Large-scale Dataset of Multi-view Daily Dressing Human Captures | Zhangyang Xiong et.al. | 2312.02963 | null |
2023-12-05 | Let the LLMs Talk: Simulating Human-to-Human Conversational QA via Zero-Shot LLM-to-LLM Interactions | Zahra Abbasiantaeb et.al. | 2312.02913 | link |
2023-12-05 | Toward autocorrection of chemical process flowsheets using large language models | Lukas Schulze Balhorn et.al. | 2312.02873 | null |
2023-12-05 | Weakly Supervised Detection of Hallucinations in LLM Activations | Miriam Rateike et.al. | 2312.02798 | null |
2023-12-05 | Large Language Models on Graphs: A Comprehensive Survey | Bowen Jin et.al. | 2312.02783 | link |
2023-12-05 | Generating Fine-Grained Human Motions Using ChatGPT-Refined Descriptions | Xu Shi et.al. | 2312.02772 | null |
2023-12-05 | Towards Measuring Representational Similarity of Large Language Models | Max Klabunde et.al. | 2312.02730 | link |
2023-12-05 | RankZephyr: Effective and Robust Zero-Shot Listwise Reranking is a Breeze! | Ronak Pradeep et.al. | 2312.02724 | link |
2023-12-04 | Steerers: A framework for rotation equivariant keypoint descriptors | Georg Bökman et.al. | 2312.02152 | link |
2023-12-04 | Learning Polynomial Problems with | Hannah Lawrence et.al. | 2312.02146 | null |
2023-12-05 | Competition-Level Problems are Effective LLM Evaluators | Yiming Huang et.al. | 2312.02143 | null |
2023-12-04 | TPPoet: Transformer-Based Persian Poem Generation using Minimal Data and Advanced Decoding Techniques | Amir Panahandeh et.al. | 2312.02125 | null |
2023-12-04 | Magicoder: Source Code Is All You Need | Yuxiang Wei et.al. | 2312.02120 | link |
2023-12-04 | Tree of Attacks: Jailbreaking Black-Box LLMs Automatically | Anay Mehrotra et.al. | 2312.02119 | link |
2023-12-04 | Physics simulation capabilities of LLMs | Mohamad Ali-Dib et.al. | 2312.02091 | null |
2023-12-04 | A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia | Giovanni Monea et.al. | 2312.02073 | null |
2023-12-04 | Know Your Audience: Do LLMs Adapt to Different Age and Education Levels? | Donya Rooein et.al. | 2312.02065 | null |
2023-12-04 | TimeChat: A Time-sensitive Multimodal Large Language Model for Long Video Understanding | Shuhuai Ren et.al. | 2312.02051 | null |
2023-12-01 | Beyond ChatBots: ExploreLLM for Structured Thoughts and Personalized Model Responses | Xiao Ma et.al. | 2312.00763 | null |
2023-12-01 | Mamba: Linear-Time Sequence Modeling with Selective State Spaces | Albert Gu et.al. | 2312.00752 | null |
2023-12-01 | Deciphering Digital Detectives: Understanding LLM Behaviors and Capabilities in Multi-Agent Mystery Games | Dekun Wu et.al. | 2312.00746 | null |
2023-12-01 | SeaLLMs -- Large Language Models for Southeast Asia | Xuan-Phi Nguyen et.al. | 2312.00738 | link |
2023-12-01 | The Efficiency Spectrum of Large Language Models: An Algorithmic Survey | Tianyu Ding et.al. | 2312.00678 | link |
2023-12-01 | Nonparametric Variational Regularisation of Pretrained Transformers | Fabio Fehr et.al. | 2312.00662 | null |
2023-12-01 | Pathway to a fully data-driven geotechnics: lessons from materials informatics | Stephen Wu et.al. | 2312.00581 | null |
2023-12-01 | Instruction-tuning Aligns LLMs to the Human Brain | Khai Loong Aw et.al. | 2312.00575 | null |
2023-12-01 | Explanatory Argument Extraction of Correct Answers in Resident Medical Exams | Iakes Goenaga et.al. | 2312.00567 | null |
2023-12-01 | Questioning Biases in Case Judgment Summaries: Legal Datasets or Large Language Models? | Aniket Deroy et.al. | 2312.00554 | null |
2023-11-30 | PoseGPT: Chatting about 3D Human Pose | Yao Feng et.al. | 2311.18836 | null |
2023-11-30 | What Do Llamas Really Think? Revealing Preference Biases in Language Model Representations | Raphael Tang et.al. | 2311.18812 | link |
2023-11-30 | Unnatural Error Correction: GPT-4 Can Almost Perfectly Handle Unnatural Scrambled Text | Qi Cao et.al. | 2311.18805 | null |
2023-11-30 | X-InstructBLIP: A Framework for aligning X-Modal instruction-aware representations to LLMs and Emergent Cross-modal Reasoning | Artemis Panagopoulou et.al. | 2311.18799 | link |
2023-11-30 | CoDi-2: In-Context, Interleaved, and Interactive Any-to-Any Generation | Zineng Tang et.al. | 2311.18775 | null |
2023-11-30 | MLLMs-Augmented Visual-Language Representation Learning | Yanqing Liu et.al. | 2311.18765 | link |
2023-11-30 | TaskBench: Benchmarking Large Language Models for Task Automation | Yongliang Shen et.al. | 2311.18760 | link |
2023-11-30 | AlignBench: Benchmarking Chinese Alignment of Large Language Models | Xiao Liu et.al. | 2311.18743 | link |
2023-11-30 | CritiqueLLM: Scaling LLM-as-Critic for Effective and Explainable Evaluation of Large Language Model Generation | Pei Ke et.al. | 2311.18702 | link |
2023-11-30 | RaDialog: A Large Vision-Language Model for Radiology Report Generation and Conversational Assistance | Chantal Pellegrini et.al. | 2311.18681 | link |
2023-11-29 | OPERA: Alleviating Hallucination in Multi-Modal Large Language Models via Over-Trust Penalty and Retrospection-Allocation | Qidong Huang et.al. | 2311.17911 | link |
2023-11-29 | Look Before You Leap: Unveiling the Power of GPT-4V in Robotic Vision-Language Planning | Yingdong Hu et.al. | 2311.17842 | null |
2023-11-30 | How to Build an AI Tutor that Can Adapt to Any Course and Provide Accurate Answers Using Large Language Model and Retrieval-Augmented Generation | Chenxi Dong et.al. | 2311.17696 | null |
2023-11-29 | AviationGPT: A Large Language Model for the Aviation Domain | Liya Wang et.al. | 2311.17686 | null |
2023-11-29 | TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models | Zheng Chu et.al. | 2311.17667 | link |
2023-11-29 | VIM: Probing Multimodal Large Language Models for Visual Embedded Instruction Following | Yujie Lu et.al. | 2311.17647 | null |
2023-11-29 | ShapeGPT: 3D Shape Generation with A Unified Multi-modal Language Model | Fukun Yin et.al. | 2311.17618 | null |
2023-11-29 | Integrable symplectic maps with a polygon tessellation | Timofey Zolkin et.al. | 2311.17616 | null |
2023-11-29 | Query-Relevant Images Jailbreak Large Multi-Modal Models | Xin Liu et.al. | 2311.17600 | null |
2023-11-29 | LanGWM: Language Grounded World Model | Rudra P. K. Poudel et.al. | 2311.17593 | null |
2023-11-28 | LLaMA-VID: An Image is Worth 2 Tokens in Large Language Models | Yanwei Li et.al. | 2311.17043 | link |
2023-11-28 | Efficient In-Context Learning in Vision-Language Models for Egocentric Videos | Keunwoo Peter Yu et.al. | 2311.17041 | null |
2023-11-28 | MVBench: A Comprehensive Multi-modal Video Understanding Benchmark | Kunchang Li et.al. | 2311.17005 | link |
2023-11-28 | Ranni: Taming Text-to-Image Diffusion for Accurate Instruction Following | Yutong Feng et.al. | 2311.17002 | null |
2023-11-28 | ChatGPT's One-year Anniversary: Are Open-Source Large Language Models Catching up? | Hailin Chen et.al. | 2311.16989 | null |
2023-11-28 | COLE: A Hierarchical Generation Framework for Graphic Design | Peidong Jia et.al. | 2311.16974 | null |
2023-11-28 | LLaFS: When Large-Language Models Meet Few-Shot Segmentation | Lanyun Zhu et.al. | 2311.16926 | link |
2023-11-28 | Analyzing the Influence of Language Model-Generated Responses in Mitigating Hate Speech on Social Media Directed at Ukrainian Refugees in Poland | Jakub Podolak et.al. | 2311.16905 | null |
2023-11-28 | The Falcon Series of Open Language Models | Ebtesam Almazrouei et.al. | 2311.16867 | null |
2023-11-28 | RELIC: Investigating Large Language Model Responses using Self-Consistency | Furui Cheng et.al. | 2311.16842 | null |
2023-11-27 | Video-Bench: A Comprehensive Benchmark and Toolkit for Evaluating Video-based Large Language Models | Munan Ning et.al. | 2311.16103 | link |
2023-11-27 | Have we built machines that think like people? | Luca M. Schulze Buschoff et.al. | 2311.16093 | null |
2023-11-27 | MEDITRON-70B: Scaling Medical Pretraining for Large Language Models | Zeming Chen et.al. | 2311.16079 | link |
2023-11-27 | BioLORD-2023: Semantic Textual Representations Fusing LLM and Clinical Knowledge Graph Insights | François Remy et.al. | 2311.16075 | null |
2023-11-27 | Decoding Logic Errors: A Comparative Study on Bug Detection by Students and Large Language Models | Stephen MacNeil et.al. | 2311.16017 | null |
2023-11-27 | Sparsify-then-Classify: From Internal Neurons of Large Language Models To Efficient Text Classifiers | Yilun Liu et.al. | 2311.15983 | null |
2023-11-27 | Towards Responsible Governance of Biological Design Tools | Richard Moulange et.al. | 2311.15936 | null |
2023-11-27 | WorldSense: A Synthetic Benchmark for Grounded Reasoning in Large Language Models | Youssef Benchekroun et.al. | 2311.15930 | null |
2023-11-27 | EVCap: Retrieval-Augmented Image Captioning with External Visual-Name Memory for Open-World Comprehension | Jiaxuan Li et.al. | 2311.15879 | null |
2023-11-27 | RO-LLaMA: Generalist LLM for Radiation Oncology via Noise Augmentation and Consistency Regularization | Kwanyoung Kim et.al. | 2311.15876 | null |
2023-11-24 | Charting New Territories: Exploring the Geographic and Geospatial Capabilities of Multimodal LLMs | Jonathan Roberts et.al. | 2311.14656 | link |
2023-11-24 | One Pass Streaming Algorithm for Super Long Token Attention Approximation in Sublinear Space | Raghav Addanki et.al. | 2311.14652 | null |
2023-11-24 | Large Language Models as Automated Aligners for benchmarking Vision-Language Models | Yuanfeng Ji et.al. | 2311.14580 | null |
2023-11-24 | Griffon: Spelling out All Object Locations at Any Granularity with Large Language Models | Yufei Zhan et.al. | 2311.14552 | null |
2023-11-24 | Data-Efficient Alignment of Large Language Models with Human Feedback Through Natural Language | Di Jin et.al. | 2311.14543 | null |
2023-11-24 | Machine Translation for Ge'ez Language | Aman Kassahun Wassie et.al. | 2311.14530 | null |
2023-11-24 | Benchmarking Large Language Models for Log Analysis, Security, and Interpretation | Egil Karlsen et.al. | 2311.14519 | null |
2023-11-24 | Controlled Text Generation via Language Model Arithmetic | Jasper Dekoninck et.al. | 2311.14479 | link |
2023-11-24 | Universal Jailbreak Backdoors from Poisoned Human Feedback | Javier Rando et.al. | 2311.14455 | link |
2023-11-24 | Potential Societal Biases of ChatGPT in Higher Education: A Scoping Review | Ming Li et.al. | 2311.14381 | null |
2023-11-22 | Visual In-Context Prompting | Feng Li et.al. | 2311.13601 | link |
2023-11-22 | | Fahdi Kanavati et.al. | 2311.13580 | null |
2023-11-22 | Physical Reasoning and Object Planning for Household Embodied Agents | Ayush Agrawal et.al. | 2311.13577 | link |
2023-11-22 | Drilling Down into the Discourse Structure with LLMs for Long Document Question Answering | Inderjeet Nair et.al. | 2311.13565 | null |
2023-11-22 | Soulstyler: Using Large Language Model to Guide Image Style Transfer for Target Object | Junhao Chen et.al. | 2311.13562 | link |
2023-11-22 | ADriver-I: A General World Model for Autonomous Driving | Fan Jia et.al. | 2311.13549 | null |
2023-11-22 | Linear Log-Normal Attention with Unbiased Concentration | Yury Nahshan et.al. | 2311.13541 | null |
2023-11-22 | Speak Like a Native: Prompting Large Language Models in a Native Style | Zhicheng Yang et.al. | 2311.13538 | null |
2023-11-22 | Current Topological and Machine Learning Applications for Bias Detection in Text | Colleen Farrelly et.al. | 2311.13495 | null |
2023-11-22 | Transfer Attacks and Defenses for Large Language Models on Coding Tasks | Chi Zhang et.al. | 2311.13445 | null |
2023-11-21 | Prompting Frameworks for Large Language Models: A Survey | Xiaoxia Liu et.al. | 2311.12785 | null |
2023-11-21 | Towards Natural Language-Guided Drones: GeoText-1652 Benchmark with Spatially Relation Matching | Meng Chu et.al. | 2311.12751 | null |
2023-11-21 | Keeping Users Engaged During Repeated Administration of the Same Questionnaire: Using Large Language Models to Reliably Diversify Questions | Hye Sun Yun et.al. | 2311.12707 | null |
2023-11-21 | Can Large Language Models Understand Content and Propagation for Misinformation Detection: An Empirical Study | Mengyang Chen et.al. | 2311.12699 | null |
2023-11-21 | From Concept to Manufacturing: Evaluating Vision-Language Models for Engineering Design | Cyril Picard et.al. | 2311.12668 | null |
2023-11-21 | GPT4Motion: Scripting Physical Motions in Text-to-Video Generation via Blender-Oriented GPT Planning | Jiaxi Lv et.al. | 2311.12631 | null |
2023-11-21 | IMGTB: A Framework for Machine-Generated Text Detection Benchmarking | Michal Spiegel et.al. | 2311.12574 | null |
2023-11-21 | Scheduling Distributed Flexible Assembly Lines using Safe Reinforcement Learning with Soft Shielding | Lele Li et.al. | 2311.12572 | null |
2023-11-21 | In-Context Learning Functions with Varying Number of Minima | David Oniani et.al. | 2311.12538 | link |
2023-11-21 | Oasis: Data Curation and Assessment System for Pretraining of Large Language Models | Tong Zhou et.al. | 2311.12537 | link |
2023-11-20 | On the Potential and Limitations of Few-Shot In-Context Learning to Generate Metamorphic Specifications for Tax Preparation Software | Dananjay Srinivas et.al. | 2311.11979 | null |
2023-11-20 | LLMs as Visual Explainers: Advancing Image Classification with Evolving Visual Descriptions | Songhao Han et.al. | 2311.11904 | null |
2023-11-20 | VLM-Eval: A General Evaluation on Video Large Language Models | Shuailin Li et.al. | 2311.11865 | null |
2023-11-20 | Generating Valid and Natural Adversarial Examples with Large Language Models | Zimu Wang et.al. | 2311.11861 | null |
2023-11-20 | LION : Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge | Gongwei Chen et.al. | 2311.11860 | null |
2023-11-20 | Evil Geniuses: Delving into the Safety of LLM-based Agents | Yu Tian et.al. | 2311.11855 | link |
2023-11-20 | How to Use Large Language Models for Text Coding: The Case of Fatherhood Roles in Public Policy Documents | Lorenzo Lupo et.al. | 2311.11844 | null |
2023-11-20 | System 2 Attention (is something you might need too) | Jason Weston et.al. | 2311.11829 | null |
2023-11-20 | Large Language Models and Explainable Law: a Hybrid Methodology | Marco Billi et.al. | 2311.11811 | null |
2023-11-20 | DocPedia: Unleashing the Power of Large Multimodal Model in the Frequency Domain for Versatile Document Understanding | Hao Feng et.al. | 2311.11810 | null |
2023-11-17 | Camels in a Changing Climate: Enhancing LM Adaptation with Tulu 2 | Hamish Ivison et.al. | 2311.10702 | null |
2023-11-17 | PEFT-MedAware: Large Language Model for Medical Awareness | Keivalya Pandya et.al. | 2311.10697 | null |
2023-11-17 | Distilling and Retrieving Generalizable Knowledge for Robot Manipulation via Language Corrections | Lihan Zha et.al. | 2311.10678 | link |
2023-11-17 | A Self-enhancement Approach for Domain-specific Chatbot Training via Knowledge Mining and Digest | Ruohong Zhang et.al. | 2311.10614 | null |
2023-11-17 | SSB: Simple but Strong Baseline for Boosting Performance of Open-Set Semi-Supervised Learning | Yue Fan et.al. | 2311.10572 | null |
2023-11-17 | Towards General Loop Invariant Generation via Coordinating Symbolic Execution and Large Language Models | Chang Liu et.al. | 2311.10483 | null |
2023-11-17 | DynaPipe: Optimizing Multi-task Training through Dynamic Pipelines | Chenyu Jiang et.al. | 2311.10418 | link |
2023-11-17 | Bias A-head? Analyzing Bias in Transformer-Based Language Model Attention Heads | Yi Yang et.al. | 2311.10395 | null |
2023-11-17 | Automatic Smart Contract Comment Generation via Large Language Models and In-Context Learning | Junjie Zhao et.al. | 2311.10388 | null |
2023-11-17 | Retrieval Augmented Generation of Symbolic Music with LLMs | Nicolas Jonason et.al. | 2311.10384 | null |
2023-11-16 | DRESS: Instructing Large Vision-Language Models to Align and Interact with Humans via Natural Language Feedback | Yangyi Chen et.al. | 2311.10081 | null |
2023-11-16 | ChatGPT-3.5, ChatGPT-4, Google Bard, and Microsoft Bing to Improve Health Literacy and Communication in Pediatric Populations and Beyond | Kanhai S. Amin et.al. | 2311.10075 | null |
2023-11-16 | Is "A Helpful Assistant" the Best Role for Large Language Models? A Systematic Evaluation of Social Roles in System Prompts | Mingqian Zheng et.al. | 2311.10054 | null |
2023-11-16 | Fast return-level estimates for flood insurance via an improved Bennett inequality for random variables with differing upper bounds | Anna Maria Barlow et.al. | 2311.10001 | null |
2023-11-16 | Hijacking Large Language Models via Adversarial In-Context Learning | Yao Qiang et.al. | 2311.09948 | null |
2023-11-16 | Language Generation from Human Brain Activities | Ziyi Ye et.al. | 2311.09889 | null |
2023-11-16 | INTERVENOR: Prompt the Coding Ability of Large Language Models with the Interactive Chain of Repairing | Hanbin Wang et.al. | 2311.09868 | link |
2023-11-16 | Which Modality should I use -- Text, Motif, or Image? : Understanding Graphs with Large Language Models | Debarati Das et.al. | 2311.09862 | null |
2023-11-17 | PsyBench: a balanced and in-depth Psychological Chinese Evaluation Benchmark for Foundation Models | Junlei Zhang et.al. | 2311.09861 | null |
2023-11-16 | Leveraging LLMs in Scholarly Knowledge Graph Question Answering | Tilahun Abedissa Taffa et.al. | 2311.09841 | link |
2023-11-15 | Assessing Translation capabilities of Large Language Models involving English and Indian Languages | Vandan Mujadia et.al. | 2311.09216 | null |
2023-11-15 | Mind's Mirror: Distilling Self-Evaluation Capability and Comprehensive Thinking from Large Language Models | Weize Liu et.al. | 2311.09214 | null |
2023-11-15 | Chain-of-Note: Enhancing Robustness in Retrieval-Augmented Language Models | Wenhao Yu et.al. | 2311.09210 | null |
2023-11-15 | TableLlama: Towards Open Large Generalist Models for Tables | Tianshu Zhang et.al. | 2311.09206 | null |
2023-11-15 | Fusion-Eval: Integrating Evaluators with LLMs | Lei Shu et.al. | 2311.09204 | null |
2023-11-15 | Never Lost in the Middle: Improving Large Language Models via Attention Strengthening Question Answering | Junqing He et.al. | 2311.09198 | null |
2023-11-15 | Structural Priming Demonstrates Abstract Grammatical Representations in Multilingual Language Models | James A. Michaelov et.al. | 2311.09194 | null |
2023-11-15 | PsyEval: A Comprehensive Large Language Model Evaluation Benchmark for Mental Health | Haoan Jin et.al. | 2311.09189 | null |
2023-11-15 | Towards Verifiable Text Generation with Symbolic References | Lucas Torroba Hennigen et.al. | 2311.09188 | null |
2023-11-15 | Benchmarking Generation and Evaluation Capabilities of Large Language Models for Instruction Controllable Summarization | Yixin Liu et.al. | 2311.09184 | link |
2023-11-14 | Towards Open-Ended Visual Recognition with Large Language Model | Qihang Yu et.al. | 2311.08400 | link |
2023-11-14 | Are Large Language Models Temporally Grounded? | Yifu Qiu et.al. | 2311.08398 | link |
2023-11-14 | Zero-shot audio captioning with audio-language model guidance and audio context keywords | Leonard Salewski et.al. | 2311.08396 | link |
2023-11-14 | On What Basis? Predicting Text Preference Via Structured Comparative Reasoning | Jing Nathan Yan et.al. | 2311.08390 | null |
2023-11-14 | TSST: A Benchmark and Evaluation Models for Text Speech-Style Transfer | Huashan Sun et.al. | 2311.08389 | null |
2023-11-14 | Direct Preference Optimization for Neural Machine Translation with Minimum Bayes Risk Decoding | Guangyu Yang et.al. | 2311.08380 | null |
2023-11-14 | A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts | Nafis Irtiza Tripto et.al. | 2311.08374 | null |
2023-11-14 | SimpleSafetyTests: a Test Suite for Identifying Critical Safety Risks in Large Language Models | Bertie Vidgen et.al. | 2311.08370 | null |
2023-11-14 | How You Prompt Matters! Even Task-Oriented Constraints in Instructions Affect LLM-Generated Text Detection | Ryuto Koike et.al. | 2311.08369 | null |
2023-11-14 | Plum: Prompt Learning using Metaheuristic | Rui Pan et.al. | 2311.08364 | link |
2023-11-13 | SPHINX: The Joint Mixing of Weights, Tasks, and Visual Embeddings for Multi-modal Large Language Models | Ziyi Lin et.al. | 2311.07575 | link |
2023-11-13 | To See is to Believe: Prompting GPT-4V for Better Visual Instruction Tuning | Junke Wang et.al. | 2311.07574 | null |
2023-11-13 | Using Natural Language Explanations to Improve Robustness of In-context Learning for Natural Language Inference | Xuanli He et.al. | 2311.07556 | null |
2023-11-13 | It's Not Easy Being Wrong: Evaluating Process of Elimination Reasoning in Large Language Models | Nishant Balepur et.al. | 2311.07532 | link |
2023-11-13 | A Benchmark to Understand the Role of Knowledge Graphs on Large Language Model's Accuracy for Question Answering on Enterprise SQL Databases | Juan Sequeda et.al. | 2311.07509 | null |
2023-11-13 | A Step Closer to Comprehensive Answers: Constrained Multi-Stage Question Decomposition with Large Language Models | Hejing Cao et.al. | 2311.07491 | null |
2023-11-13 | Psychometric Predictive Power of Large Language Models | Tatsuki Kuribayashi et.al. | 2311.07484 | null |
2023-11-13 | Finding and Editing Multi-Modal Neurons in Pre-Trained Transformer | Haowen Pan et.al. | 2311.07470 | null |
2023-11-13 | InCA: Rethinking In-Car Conversational System Assessment Leveraging Large Language Models | Ken E. Friedl et.al. | 2311.07469 | null |
2023-11-13 | Are We Falling in a Middle-Intelligence Trap? An Analysis and Mitigation of the Reversal Curse | Ang Lv et.al. | 2311.07468 | null |
2023-11-10 | Parameter-Efficient Orthogonal Finetuning via Butterfly Factorization | Weiyang Liu et.al. | 2311.06243 | null |
2023-11-10 | Summon a Demon and Bind it: A Grounded Theory of LLM Red Teaming in the Wild | Nanna Inie et.al. | 2311.06237 | null |
2023-11-10 | Data Contamination Quiz: A Tool to Detect and Estimate Contamination in Large Language Models | Shahriar Golchin et.al. | 2311.06233 | null |
2023-11-10 | Vox Populi, Vox ChatGPT: Large Language Models, Education and Democracy | Niina Zuber et.al. | 2311.06207 | null |
2023-11-10 | Syntax-semantics interface: an algebraic model | Matilde Marcolli et.al. | 2311.06189 | null |
2023-11-10 | Language Models can be Logical Solvers | Jiazhan Feng et.al. | 2311.06158 | null |
2023-11-10 | Is it indeed bigger better? The comprehensive study of claim detection LMs applied for disinformation tackling | Martin Hyben et.al. | 2311.06121 | null |
2023-11-10 | Making LLMs Worth Every Penny: Resource-Limited Text Classification in Banking | Lefteris Loukas et.al. | 2311.06102 | null |
2023-11-10 | Practical Membership Inference Attacks against Fine-tuned Large Language Models via Self-prompt Calibration | Wenjie Fu et.al. | 2311.06062 | null |
2023-11-10 | Structure of the space of folding protein sequences defined by large language models | A. Zambon et.al. | 2311.06034 | null |
2023-11-09 | Efficient Parallelization Layouts for Large-Scale Distributed Model Training | Johannes Hagemann et.al. | 2311.05610 | link |
2023-11-09 | FigStep: Jailbreaking Large Vision-language Models via Typographic Visual Prompts | Yichen Gong et.al. | 2311.05608 | link |
2023-11-09 | Accuracy of a Vision-Language Model on Challenging Medical Cases | Thomas Buckley et.al. | 2311.05591 | link |
2023-11-09 | Conversational AI Threads for Visualizing Multidimensional Datasets | Matt-Heun Hong et.al. | 2311.05590 | null |
2023-11-09 | Zero-Shot Goal-Directed Dialogue via RL on Imagined Conversations | Joey Hong et.al. | 2311.05584 | null |
2023-11-09 | Removing RLHF Protections in GPT-4 via Fine-Tuning | Qiusi Zhan et.al. | 2311.05553 | null |
2023-11-09 | ChatGPT and other Large Language Models for Cybersecurity of Smart Grid Applications | Aydin Zaboli et.al. | 2311.05462 | null |
2023-11-09 | Automated Mobile Sensing Strategies Generation for Human Behaviour Understanding | Nan Gao et.al. | 2311.05457 | null |
2023-11-09 | Cognitively Inspired Components for Social Conversational Agents | Alex Clay et.al. | 2311.05450 | null |
2023-11-09 | TencentLLMEval: A Hierarchical Evaluation of Real-World Capabilities for Human-Aligned LLMs | Shuyi Xie et.al. | 2311.05374 | link |
2023-11-08 | Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models | Rocktim Jyoti Das et.al. | 2311.04902 | link |
2023-11-08 | GENOME: GenerativE Neuro-symbOlic visual reasoning by growing and reusing ModulEs | Zhenfang Chen et.al. | 2311.04901 | null |
2023-11-08 | How Abstract Is Linguistic Generalization in Large Language Models? Experiments with Argument Structure | Michael Wilson et.al. | 2311.04900 | link |
2023-11-08 | AutoChip: Automating HDL Generation Using LLM Feedback | Shailja Thakur et.al. | 2311.04887 | null |
2023-11-08 | SEMQA: Semi-Extractive Multi-Source Question Answering | Tal Schuster et.al. | 2311.04886 | link |
2023-11-08 | LongQLoRA: Efficient and Effective Method to Extend Context Length of Large Language Models | Jianxin Yang et.al. | 2311.04879 | link |
2023-11-08 | Rethinking Benchmark and Contamination for Language Models with Rephrased Samples | Shuo Yang et.al. | 2311.04850 | link |
2023-11-08 | Using large language models to study human memory for meaningful narratives | Antonios Georgiou et.al. | 2311.04742 | link |
2023-11-08 | Evaluating Generative Ad Hoc Information Retrieval | Lukas Gienapp et.al. | 2311.04694 | null |
2023-11-08 | Pre-training LLMs using human-like development data corpus | Khushi Bhardwaj et.al. | 2311.04666 | null |
2023-11-07 | Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves | Yihe Deng et.al. | 2311.04205 | null |
2023-11-07 | Enhancing LLM Intelligence with ARM-RAG: Auxiliary Rationale Memory for Retrieval Augmented Generation | Eric Melz et.al. | 2311.04177 | null |
2023-11-07 | Perturbed examples reveal invariances shared by language models | Ruchit Rawal et.al. | 2311.04166 | null |
2023-11-08 | Black-Box Prompt Optimization: Aligning Large Language Models without Model Training | Jiale Cheng et.al. | 2311.04155 | link |
2023-11-07 | Unveiling Safety Vulnerabilities of Large Language Models | George Kour et.al. | 2311.04124 | null |
2023-11-07 | Do LLMs exhibit human-like response biases? A case study in survey design | Lindia Tjuatja et.al. | 2311.04076 | link |
2023-11-07 | Beyond Imitation: Leveraging Fine-grained Quality Signals for Alignment | Geyang Guo et.al. | 2311.04072 | null |
2023-11-07 | Extracting human interpretable structure-property relationships in chemistry using XAI and large language models | Geemi P. Wellawatte et.al. | 2311.04047 | null |
2023-11-07 | Reinforcement Learning Fine-tuning of Language Models is Biased Towards More Extractable Features | Diogo Cruz et.al. | 2311.04046 | null |
2023-11-07 | Aspects of human memory and Large Language Models | Romuald A. Janik et.al. | 2311.03839 | link |
2023-11-06 | GLaMM: Pixel Grounding Large Multimodal Model | Hanoona Rasheed et.al. | 2311.03356 | null |
2023-11-06 | CoVLM: Composing Visual Entities and Relationships in Large Language Models Via Communicative Decoding | Junyan Li et.al. | 2311.03354 | null |
2023-11-06 | Scalable and Transferable Black-Box Jailbreaks for Language Models via Persona Modulation | Rusheb Shah et.al. | 2311.03348 | null |
2023-11-06 | DAIL: Data Augmentation for In-Context Learning via Self-Paraphrase | Dawei Li et.al. | 2311.03319 | null |
2023-11-06 | Unraveling Downstream Gender Bias from Large Language Models: A Study on AI Educational Writing Assistance | Thiemo Wambsganss et.al. | 2311.03311 | link |
2023-11-06 | Ziya2: Data-centric Learning is All LLMs Need | Ruyi Gan et.al. | 2311.03301 | null |
2023-11-06 | S-LoRA: Serving Thousands of Concurrent LoRA Adapters | Ying Sheng et.al. | 2311.03285 | null |
2023-11-06 | Instructed Language Models with Retrievers Are Powerful Entity Linkers | Zilin Xiao et.al. | 2311.03250 | link |
2023-11-06 | ALYMPICS: Language Agents Meet Game Theory | Shaoguang Mao et.al. | 2311.03220 | null |
2023-11-06 | DeepInception: Hypnotize Large Language Model to Be Jailbreaker | Xuan Li et.al. | 2311.03191 | null |
2023-11-03 | Post Turing: Mapping the landscape of LLM Evaluation | Alexey Tikhonov et.al. | 2311.02049 | null |
2023-11-03 | Conditions on Preference Relations that Guarantee the Existence of Optimal Policies | Jonathan Colaco Carr et.al. | 2311.01990 | null |
2023-11-03 | Don't Make Your LLM an Evaluation Benchmark Cheater | Kun Zhou et.al. | 2311.01964 | null |
2023-11-03 | Hint-enhanced In-Context Learning wakes Large Language Models up for knowledge-intensive tasks | Yifan Wang et.al. | 2311.01949 | null |
2023-11-03 | Supermind Ideator: Exploring generative AI to support creative problem-solving | Steven R. Rick et.al. | 2311.01937 | null |
2023-11-03 | GateLoop: Fully Data-Controlled Linear Recurrence for Sequence Modeling | Tobias Katsch et.al. | 2311.01927 | null |
2023-11-03 | ChartGPT: Leveraging LLMs to Generate Charts from Abstract Natural Language | Yuan Tian et.al. | 2311.01920 | null |
2023-11-03 | Large Language Models Illuminate a Progressive Pathway to Artificial Healthcare Assistant: A Review | Mingze Yuan et.al. | 2311.01918 | link |
2023-11-03 | LLM-driven Multimodal Target Volume Contouring in Radiation Oncology | Yujin Oh et.al. | 2311.01908 | null |
2023-11-03 | Indicative Summarization of Long Discussions | Shahbaz Syed et.al. | 2311.01882 | link |
2023-11-02 | TopicGPT: A Prompt-based Topic Modeling Framework | Chau Minh Pham et.al. | 2311.01449 | link |
2023-11-02 | Deep Double Descent for Time Series Forecasting: Avoiding Undertrained Models | Valentino Assandri et.al. | 2311.01442 | null |
2023-11-02 | REAL: Resilience and Adaptation using Large Language Models on Autonomous Aerial Robots | Andrea Tagliabue et.al. | 2311.01403 | null |
2023-11-02 | Collaborative Large Language Model for Recommender Systems | Yaochen Zhu et.al. | 2311.01343 | link |
2023-11-02 | The Effect of Scaling, Retrieval Augmentation and Form on the Factual Consistency of Language Models | Lovisa Hagström et.al. | 2311.01307 | null |
2023-11-02 | AWEQ: Post-Training Quantization with Activation-Weight Equalization for Large Language Models | Baisong Li et.al. | 2311.01305 | null |
2023-11-02 | FlashDecoding++: Faster Large Language Model Inference on GPUs | Ke Hong et.al. | 2311.01282 | null |
2023-11-02 | Let's Discover More API Relations: A Large Language Model-based AI Chain for Unsupervised API Relation Inference | Qing Huang et.al. | 2311.01266 | null |
2023-11-02 | Expressive TTS Driven by Natural Language Prompts Using Few Human Annotations | Hanglei Zhang et.al. | 2311.01260 | null |
2023-11-02 | An energy-based comparative analysis of common approaches to text classification in the Legal domain | Sinan Gultekin et.al. | 2311.01256 | null |
2023-11-01 | Unleashing the Creative Mind: Language Model As Hierarchical Policy For Improved Exploration on Challenging Problem Solving | Zhan Ling et.al. | 2311.00694 | link |
2023-11-01 | Improving Interpersonal Communication by Simulating Audiences with Language Models | Ryan Liu et.al. | 2311.00687 | link |
2023-11-01 | Little Giants: Exploring the Potential of Small LLMs as Evaluation Metrics in Summarization in the Eval4NLP 2023 Shared Task | Neema Kotonya et.al. | 2311.00686 | null |
2023-11-01 | Attention Alignment and Flexible Positional Embeddings Improve Transformer Length Extrapolation | Ta-Chung Chi et.al. | 2311.00684 | null |
2023-11-01 | Are Large Language Models Reliable Judges? A Study on the Factuality Evaluation Capabilities of LLMs | Xue-Yong Fu et.al. | 2311.00681 | null |
2023-11-01 | Emotion Detection for Misinformation: A Review | Zhiwei Liu et.al. | 2311.00671 | null |
2023-11-01 | De-Diffusion Makes Text a Strong Cross-Modal Interface | Chen Wei et.al. | 2311.00618 | null |
2023-11-01 | Crosslingual Retrieval Augmented In-context Learning for Bangla | Xiaoqian Li et.al. | 2311.00587 | null |
2023-11-01 | The Development of LLMs for Embodied Navigation | Jinzhou Lin et.al. | 2311.00530 | null |
2023-11-01 | Efficient LLM Inference on CPUs | Haihao Shen et.al. | 2311.00502 | link |
2023-10-31 | Learning From Mistakes Makes LLM Better Reasoner | Shengnan An et.al. | 2310.20689 | link |
2023-10-31 | Defining a New NLP Playground | Sha Li et.al. | 2310.20633 | null |
2023-10-31 | LoRA Fine-tuning Efficiently Undoes Safety Training in Llama 2-Chat 70B | Simon Lermen et.al. | 2310.20624 | null |
2023-10-31 | Unleashing the Power of Pre-trained Language Models for Offline Reinforcement Learning | Ruizhe Shi et.al. | 2310.20587 | null |
2023-10-31 | CapsFusion: Rethinking Image-Text Data at Scale | Qiying Yu et.al. | 2310.20550 | null |
2023-10-31 | LLMs may Dominate Information Access: Neural Retrievers are Biased Towards LLM-Generated Texts | Sunhao Dai et.al. | 2310.20501 | null |
2023-10-31 | Leveraging Word Guessing Games to Assess the Intelligence of Large Language Models | Tian Liang et.al. | 2310.20499 | null |
2023-10-31 | Large Language Model Can Interpret Latent Space of Sequential Recommender | Zhengyi Yang et.al. | 2310.20487 | null |
2023-10-31 | The SourceData-NLP dataset: integrating curation into scientific publishing for training large language models | Jorge Abreu-Vicente et.al. | 2310.20440 | null |
2023-10-31 | FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models | Yuxin Jiang et.al. | 2310.20410 | null |
2023-10-30 | M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models | Wai-Chung Kwan et.al. | 2310.19240 | null |
2023-10-30 | Building Real-World Meeting Summarization Systems using Large Language Models: A Practical Perspective | Md Tahmid Rahman Laskar et.al. | 2310.19233 | null |
2023-10-30 | EHRTutor: Enhancing Patient Understanding of Discharge Instructions | Zihao Zhang et.al. | 2310.19212 | null |
2023-10-30 | Leveraging generative artificial intelligence to simulate student learning behavior | Songlin Xu et.al. | 2310.19206 | null |
2023-10-29 | From Chatbots to PhishBots? -- Preventing Phishing scams created using ChatGPT, Google Bard and Claude | Sayak Saha Roy et.al. | 2310.19181 | null |
2023-10-29 | Atom: Low-bit Quantization for Efficient and Accurate LLM Serving | Yilong Zhao et.al. | 2310.19102 | null |
2023-10-29 | Roles of Scaling and Instruction Tuning in Language Perception: Model vs. Human Attention | Changjiang Gao et.al. | 2310.19084 | link |
2023-10-29 | Reward Finetuning for Faster and More Accurate Unsupervised Object Discovery | Katie Z Luo et.al. | 2310.19080 | null |
2023-10-29 | Myriad: Large Multimodal Model by Applying Vision Experts for Industrial Anomaly Detection | Yuanze Li et.al. | 2310.19070 | null |
2023-10-29 | Multimodal ChatGPT for Medical Applications: an Experimental Study of GPT-4V | Zhiling Yan et.al. | 2310.19061 | link |
2023-10-27 | FP8-LM: Training FP8 Large Language Models | Houwen Peng et.al. | 2310.18313 | link |
2023-10-27 | Image Clustering Conditioned on Text Criteria | Sehyun Kwon et.al. | 2310.18297 | link |
2023-10-27 | ArcheType: A Novel Framework for Open-Source Column Type Annotation using Large Language Models | Benjamin Feuer et.al. | 2310.18208 | link |
2023-10-27 | Lost in Translation, Found in Spans: Identifying Claims in Multilingual Social Media | Shubham Mittal et.al. | 2310.18205 | null |
2023-10-27 | Personas as a Way to Model Truthfulness in Language Models | Nitish Joshi et.al. | 2310.18168 | null |
2023-10-27 | MPrompt: Exploring Multi-level Prompt Tuning for Machine Reading Comprehension | Guoxin Chen et.al. | 2310.18167 | null |
2023-10-27 | Disentangled Representation Learning with Large Language Models for Text-Attributed Graphs | Yijian Qin et.al. | 2310.18152 | null |
2023-10-27 | DELPHI: Data for Evaluating LLMs' Performance in Handling Controversial Issues | David Q. Sun et.al. | 2310.18130 | link |
2023-10-27 | Ask more, know better: Reinforce-Learned Prompt Questions for Decision Making with Large Language Models | Xue Yan et.al. | 2310.18127 | null |
2023-10-27 | Knowledge Corpus Error in Question Answering | Yejoon Lee et.al. | 2310.18076 | link |
2023-10-26 | In-Context Learning Dynamics with Random Binary Sequences | Eric J. Bigelow et.al. | 2310.17639 | null |
2023-10-26 | JudgeLM: Fine-tuned Large Language Models are Scalable Judges | Lianghui Zhu et.al. | 2310.17631 | link |
2023-10-26 | InstOptima: Evolutionary Multi-objective Instruction Optimization via Large Language Model-based Instruction Operators | Heng Yang et.al. | 2310.17630 | null |
2023-10-26 | Proving Test Set Contamination in Black Box Language Models | Yonatan Oren et.al. | 2310.17623 | null |
2023-10-26 | An Open Source Data Contamination Report for Llama Series Models | Yucheng Li et.al. | 2310.17589 | link |
2023-10-26 | Interactive Robot Learning from Verbal Correction | Huihan Liu et.al. | 2310.17555 | null |
2023-10-27 | Can large language models replace humans in the systematic review process? Evaluating GPT-4's efficacy in screening and extracting data from peer-reviewed and grey literature in multiple languages | Qusai Khraisha et.al. | 2310.17526 | null |
2023-10-27 | The Expressive Power of Low-Rank Adaptation | Yuchen Zeng et.al. | 2310.17513 | link |
2023-10-26 | CompeteAI: Understanding the Competition Behaviors in Large Language Model-based Agents | Qinlin Zhao et.al. | 2310.17512 | null |
2023-10-26 | Improving Zero-shot Reader by Reducing Distractions from Irrelevant Documents in Open-Domain Question Answering | Sukmin Cho et.al. | 2310.17490 | null |
2023-10-25 | LLM-FP4: 4-Bit Floating-Point Quantized Transformers | Shih-yang Liu et.al. | 2310.16836 | link |
2023-10-25 | Can GPT models Follow Human Summarization Guidelines? Evaluating ChatGPT and GPT-4 for Dialogue Summarization | Yongxin Zhou et.al. | 2310.16810 | null |
2023-10-25 | QMoE: Practical Sub-1-Bit Compression of Trillion-Parameter Models | Elias Frantar et.al. | 2310.16795 | link |
2023-10-25 | Detecting Pretraining Data from Large Language Models | Weijia Shi et.al. | 2310.16789 | null |
2023-10-26 | DEFT: Data Efficient Fine-Tuning for Large Language Models via Unsupervised Core-Set Selection | Devleena Das et.al. | 2310.16776 | null |
2023-10-25 | SuperHF: Supervised Iterative Learning from Human Feedback | Gabriel Mukobi et.al. | 2310.16763 | link |
2023-10-25 | HI-TOM: A Benchmark for Evaluating Higher-Order Theory of Mind Reasoning in Large Language Models | Yinghui He et.al. | 2310.16755 | link |
2023-10-25 | HANSEN: Human and AI Spoken Text Benchmark for Authorship Analysis | Nafis Irtiza Tripto et.al. | 2310.16746 | null |
2023-10-25 | Disentangling Extraction and Reasoning in Multi-hop Spatial Reasoning | Roshanak Mirzaee et.al. | 2310.16731 | null |
2023-10-26 | SkyMath: Technical Report | Liu Yang et.al. | 2310.16713 | null |
2023-10-24 | MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning | Zayne Sprague et.al. | 2310.16049 | link |
2023-10-24 | AI Alignment and Social Choice: Fundamental Limitations and Policy Implications | Abhilash Mishra et.al. | 2310.16048 | null |
2023-10-24 | Woodpecker: Hallucination Correction for Multimodal Large Language Models | Shukang Yin et.al. | 2310.16045 | link |
2023-10-25 | WebWISE: Web Interface Control and Sequential Exploration with Large Language Models | Heyi Tao et.al. | 2310.16042 | null |
2023-10-24 | Instruct and Extract: Instruction Tuning for On-Demand Information Extraction | Yizhu Jiao et.al. | 2310.16040 | null |
2023-10-24 | What's Left? Concept Grounding with Logic-Enhanced Foundation Models | Joy Hsu et.al. | 2310.16035 | link |
2023-10-24 | Visual Cropping Improves Zero-Shot Question Answering of Multimodal Large Language Models | Jiarui Zhang et.al. | 2310.16033 | null |
2023-10-24 | What Algorithms can Transformers Learn? A Study in Length Generalization | Hattie Zhou et.al. | 2310.16028 | null |
2023-10-24 | White-box Compiler Fuzzing Empowered by Large Language Models | Chenyuan Yang et.al. | 2310.15991 | null |
2023-10-24 | Dissecting In-Context Learning of Translations in GPTs | Vikas Raunak et.al. | 2310.15987 | null |
2023-10-23 | Large Language Models are Visual Reasoning Coordinators | Liangyu Chen et.al. | 2310.15166 | link |
2023-10-23 | LINC: A Neurosymbolic Approach for Logical Reasoning by Combining Language Models with First-Order Logic Provers | Theo X. Olausson et.al. | 2310.15164 | link |
2023-10-23 | Linear Representations of Sentiment in Large Language Models | Curt Tigges et.al. | 2310.15154 | null |
2023-10-23 | S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models | Fangyu Lei et.al. | 2310.15147 | link |
2023-10-23 | SpecTr: Fast Speculative Decoding via Optimal Transport | Ziteng Sun et.al. | 2310.15141 | null |
2023-10-23 | AutoDAN: Automatic and Interpretable Adversarial Attacks on Large Language Models | Sicheng Zhu et.al. | 2310.15140 | null |
2023-10-23 | Quantifying the Dialect Gap and its Correlates Across Languages | Anjali Kantharuban et.al. | 2310.15135 | null |
2023-10-23 | Open-Ended Instructable Embodied Agents with Memory-Augmented Large Language Models | Gabriel Sarch et.al. | 2310.15127 | null |
2023-10-23 | Branch-Solve-Merge Improves Large Language Model Evaluation and Generation | Swarnadeep Saha et.al. | 2310.15123 | null |
2023-10-23 | Causal Inference Using LLM-Guided Discovery | Aniket Vashishtha et.al. | 2310.15117 | null |
2023-10-20 | Improving Long-form Speech Translation through Segmentation with Large Language Models and Finite State Decoding Constraints | Arya D. McCarthy et.al. | 2310.13678 | null |
2023-10-20 | StereoMap: Quantifying the Awareness of Human-like Stereotypes in Large Language Models | Sullam Jeoung et.al. | 2310.13673 | link |
2023-10-20 | Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models | Ruida Wang et.al. | 2310.13671 | link |
2023-10-20 | BotChat: Evaluating LLMs' Capabilities of Having Multi-Turn Dialogues | Haodong Duan et.al. | 2310.13650 | link |
2023-10-20 | Contrastive Preference Learning: Learning from Human Feedback without RL | Joey Hejna et.al. | 2310.13639 | link |
2023-10-20 | Three Questions Concerning the Use of Large Language Models to Facilitate Mathematics Learning | An-Zi Yen et.al. | 2310.13615 | null |
2023-10-20 | MarineGPT: Unlocking Secrets of Ocean to the Public | Ziqiang Zheng et.al. | 2310.13596 | link |
2023-10-20 | Entangled Preferences: The History and Risks of Reinforcement Learning and Human Feedback | Nathan Lambert et.al. | 2310.13595 | null |
2023-10-20 | Why Can Large Language Models Generate Correct Chain-of-Thoughts? | Rasul Tutunov et.al. | 2310.13571 | null |
2023-10-20 | Cache & Distil: Optimising API Calls to Large Language Models | Guillem Ramírez et.al. | 2310.13561 | null |
2023-10-19 | Frozen Transformers in Language Models Are Effective Visual Encoder Layers | Ziqi Pang et.al. | 2310.12973 | link |
2023-10-19 | CLAIR: Evaluating Image Captions with Large Language Models | David Chan et.al. | 2310.12971 | null |
2023-10-19 | AutoMix: Automatically Mixing Language Models | Aman Madaan et.al. | 2310.12963 | link |
2023-10-19 | An Emulator for Fine-Tuning Large Language Models using Small Language Models | Eric Mitchell et.al. | 2310.12962 | null |
2023-10-19 | SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving | Xueliang Zhao et.al. | 2310.12960 | null |
2023-10-19 | Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation | Sangho Suh et.al. | 2310.12953 | null |
2023-10-19 | 3D-GPT: Procedural 3D Modeling with Large Language Models | Chunyi Sun et.al. | 2310.12945 | null |
2023-10-19 | Eureka: Human-Level Reward Design via Coding Large Language Models | Yecheng Jason Ma et.al. | 2310.12931 | null |
2023-10-19 | Experimental Narratives: A Comparison of Human Crowdsourced Storytelling and AI Storytelling | Nina Begus et.al. | 2310.12902 | null |
2023-10-19 | StoryAnalogy: Deriving Story-level Analogies from Large Language Models to Unlock Analogical Understanding | Cheng Jiayang et.al. | 2310.12874 | null |
2023-10-18 | Pseudointelligence: A Unifying Framework for Language Model Evaluation | Shikhar Murty et.al. | 2310.12135 | null |
2023-10-18 | Monarch Mixer: A Simple Sub-Quadratic GEMM-Based Architecture | Daniel Y. Fu et.al. | 2310.12109 | null |
2023-10-18 | Non-Intrusive Adaptation: Input-Centric Parameter-efficient Fine-Tuning for Versatile Multimodal Modeling | Yaqing Wang et.al. | 2310.12100 | null |
2023-10-18 | Unveiling the Siren's Song: Towards Reliable Fact-Conflicting Hallucination Detection | Xiang Chen et.al. | 2310.12086 | link |
2023-10-18 | On the Benefit of Generative Foundation Models for Human Activity Recognition | Zikang Leng et.al. | 2310.12085 | null |
2023-10-18 | SPEED: Speculative Pipelined Execution for Efficient Decoding | Coleman Hooper et.al. | 2310.12072 | null |
2023-10-18 | Evaluating the Symbol Binding Ability of Large Language Models for Multiple-Choice Questions in Vietnamese General Education | Duc-Vu Nguyen et.al. | 2310.12059 | null |
2023-10-18 | Concept-Guided Chain-of-Thought Prompting for Pairwise Comparison Scaling of Texts with Large Language Models | Patrick Y. Wu et.al. | 2310.12049 | null |
2023-10-18 | LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation | Shengqiang Zhang et.al. | 2310.12020 | null |
2023-10-18 | Fast Multipole Attention: A Divide-and-Conquer Attention Mechanism for Long Sequences | Yanming Kang et.al. | 2310.11960 | null |
2023-10-17 | VeRA: Vector-based Random Matrix Adaptation | Dawid Jan Kopiczko et.al. | 2310.11454 | null |
2023-10-17 | BitNet: Scaling 1-bit Transformers for Large Language Models | Hongyu Wang et.al. | 2310.11453 | null |
2023-10-17 | Seeking Neural Nuggets: Knowledge Transfer in Large Language Models from a Parametric Perspective | Ming Zhong et.al. | 2310.11451 | null |
2023-10-18 | EvalCrafter: Benchmarking and Evaluating Large Video Generation Models | Yaofang Liu et.al. | 2310.11440 | null |
2023-10-17 | An Empirical Study of Translation Hypothesis Ensembling with Large Language Models | António Farinhas et.al. | 2310.11430 | link |
2023-10-17 | Neural Attention: Enhancing QKV Calculation in Self-Attention Mechanism with Neural Networks | Muhan Zhang et.al. | 2310.11398 | link |
2023-10-17 | Last One Standing: A Comparative Analysis of Security and Privacy of Soft Prompt Tuning, LoRA, and In-Context Learning | Rui Wen et.al. | 2310.11397 | null |
2023-10-17 | Towards Automatic Satellite Images Captions Generation Using Large Language Models | Yingxu He et.al. | 2310.11392 | null |
2023-10-17 | DialogueLLM: Context and Emotion Knowledge-Tuned LLaMA Models for Emotion Recognition in Conversations | Yazhou Zhang et.al. | 2310.11374 | null |
2023-10-17 | Quantifying Language Models' Sensitivity to Spurious Features in Prompt Design or: How I learned to start worrying about prompt formatting | Melanie Sclar et.al. | 2310.11324 | null |
2023-10-13 | Vision-by-Language for Training-Free Compositional Image Retrieval | Shyamgopal Karthik et.al. | 2310.09291 | null |
2023-10-13 | User Inference Attacks on Large Language Models | Nikhil Kandpal et.al. | 2310.09266 | null |
2023-10-13 | **PromptRE: Weakly-Supervised Document-Level Relation Extraction via Prompting-Based Data Progr |