Seq-VCR: Preventing Collapse in Intermediate Transformer Representations for Enhanced Reasoning

We propose Sequential Variance-Covariance Regularization (Seq-VCR), a method that prevents representation collapse in the intermediate layers of Transformer models, significantly improving their performance on complex reasoning tasks without requiring chain-of-thought supervision.
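As the name suggests, the regularizer penalizes collapsed intermediate representations through variance and covariance terms. The following is a minimal sketch of such a penalty, assuming a VICReg-style formulation: a hinge on each dimension's standard deviation (keeping features spread out) plus a penalty on off-diagonal covariances (keeping features decorrelated). The function name, `gamma` target, and `eps` floor are illustrative assumptions, not the paper's exact hyperparameters.

```python
import numpy as np

def variance_covariance_penalty(h, gamma=1.0, eps=1e-4):
    """Sketch of a variance-covariance regularizer on hidden states.

    h: (batch, dim) array of intermediate representations.
    The variance term pushes each dimension's std toward gamma
    (discouraging collapse); the covariance term pushes off-diagonal
    covariances toward zero (decorrelating features).
    Names and defaults here are illustrative, not the paper's.
    """
    h = h - h.mean(axis=0, keepdims=True)          # center per dimension
    std = np.sqrt(h.var(axis=0) + eps)             # per-dimension std
    var_loss = np.mean(np.maximum(0.0, gamma - std))  # hinge: penalize low std
    n, d = h.shape
    cov = (h.T @ h) / (n - 1)                      # empirical covariance
    off_diag = cov - np.diag(np.diag(cov))         # zero out the diagonal
    cov_loss = np.sum(off_diag ** 2) / d           # penalize correlations
    return var_loss, cov_loss
```

On fully collapsed inputs (all rows identical) the variance term approaches `gamma`, while well-spread inputs incur little or no penalty, which is the behavior an anti-collapse regularizer needs.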