
Abstract

Reranking models are critical to improving the quality of retrieval systems: they refine initial search results according to query relevance. Among reranking approaches, cross-encoders are particularly effective thanks to the deep semantic understanding afforded by their transformer-based architectures. However, their high computational demands pose significant challenges for real-time applications and scalability.

This paper introduces E2Rank, a layer-wise reranking model that balances efficiency and effectiveness by leveraging intermediate transformer outputs: successively deeper layers are applied to a progressively narrowed candidate set, reducing computational cost with minimal impact on ranking quality. Our training approach, which combines model merging with layer-wise contrastive training, yields substantial gains in effectiveness. Extensive experiments on standard benchmarks demonstrate that E2Rank achieves state-of-the-art performance, outperforming existing rerankers in both effectiveness and computational efficiency.
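To make the cascading idea concrete, the sketch below shows one way layer-wise reranking can be organized. It is an illustrative outline, not the paper's implementation: `cascaded_rerank`, `layer_scorers`, and `keep_sizes` are hypothetical names, and each scorer stands in for a relevance score computed from an intermediate transformer layer (e.g., a pooled hidden state fed to a lightweight scoring head).

```python
# Minimal sketch of layer-wise cascaded reranking (illustrative, not the paper's code).
# Assumes `layer_scorers` is ordered shallow -> deep, where layer_scorers[i](query, doc)
# returns a relevance score derived from the transformer's output at that depth.
# `keep_sizes[i]` is the number of candidates kept after stage i.

from typing import Callable, List, Sequence, Tuple

Scorer = Callable[[str, str], float]

def cascaded_rerank(
    query: str,
    candidates: Sequence[str],
    layer_scorers: List[Scorer],
    keep_sizes: List[int],
) -> List[Tuple[str, float]]:
    """Progressively rescore a shrinking candidate set with deeper (costlier) layers."""
    assert len(keep_sizes) == len(layer_scorers) - 1
    pool = list(candidates)
    scored: List[Tuple[str, float]] = []
    for stage, scorer in enumerate(layer_scorers):
        # Score every candidate still in the pool with this stage's scorer.
        scored = sorted(
            ((doc, scorer(query, doc)) for doc in pool),
            key=lambda pair: pair[1],
            reverse=True,
        )
        if stage < len(keep_sizes):
            # Keep only the top-k survivors for the next, more expensive stage.
            pool = [doc for doc, _ in scored[: keep_sizes[stage]]]
    return scored  # final ranking produced by the deepest stage

# Toy usage with stand-in scorers (replace with real layer-wise model scores):
if __name__ == "__main__":
    cheap = lambda q, d: float(len(set(q.split()) & set(d.split())))  # shallow proxy
    deep = lambda q, d: cheap(q, d) + (1.0 if q.lower() in d.lower() else 0.0)
    docs = [
        "rerankers refine search results",
        "cats sleep a lot",
        "search rerankers use transformers",
    ]
    print(cascaded_rerank("search rerankers", docs, [cheap, deep], keep_sizes=[2]))
```

The key property the sketch captures is that only the small surviving pool pays the cost of the deeper layers, which is where the efficiency gain of a cascade comes from.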
