Natural Language Processing

A Simple Yet Effective Method for Non-Refusing Context Relevant Fine-grained Safety Steering in LLMs

Iterative Multilingual Spectral Attribute Erasure

Knowing Before Saying: LLM Representations Encode Information About Chain-of-Thought Success Before Completion

Padding Tone: A Mechanistic Analysis of Padding Tokens in T2I Models