Google has announced MUVERA, a new multi-vector retrieval algorithm that speeds up retrieval and ranking while improving accuracy. It can be applied to search, recommender systems such as YouTube, and natural language processing (NLP).
The announcement does not explicitly say that MUVERA is used in Google Search, but the research paper presents it as making multi-vector retrieval efficient at web scale. It does so by working with existing infrastructure through maximum inner product search (MIPS) while reducing latency and memory footprint.
Understanding Vector Embedding
A vector embedding is a multidimensional representation of the relationships between words, topics, and phrases. It lets machines discern similarity through patterns, such as words that appear in similar contexts or phrases that mean the same thing. Related words and phrases occupy nearby positions in this space, for example:
- “King Lear” is near “Shakespeare tragedy.”
- “A Midsummer Night’s Dream” is close to “Shakespeare comedy.”
- Both are located near “Shakespeare.”
The distances between words, phrases, and concepts, defined by mathematical similarity measures, indicate their relatedness. These patterns enable machines to infer similarities effectively.
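One common similarity measure is cosine similarity. The sketch below shows how it could be computed for the Shakespeare examples above. The three-dimensional vectors are invented purely for illustration; real embeddings come from a trained model and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; values near 1.0 mean 'very similar'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical toy embeddings, made up for this example only.
embeddings = {
    "King Lear": [0.9, 0.1, 0.3],
    "Shakespeare tragedy": [0.8, 0.2, 0.4],
    "A Midsummer Night's Dream": [0.1, 0.9, 0.3],
    "Shakespeare comedy": [0.2, 0.8, 0.4],
}

lear_vs_tragedy = cosine_similarity(
    embeddings["King Lear"], embeddings["Shakespeare tragedy"]
)
lear_vs_comedy = cosine_similarity(
    embeddings["King Lear"], embeddings["Shakespeare comedy"]
)

# With these toy vectors, "King Lear" scores higher against "Shakespeare tragedy".
print(lear_vs_tragedy > lear_vs_comedy)
```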
MUVERA Enhances Multi-Vector Embeddings
According to the MUVERA research paper, neural embeddings have been integral to information retrieval for a decade. The ColBERT multi-vector model, introduced in 2020, marked a significant advancement but faced computational challenges.
“Recently, beginning with the landmark ColBERT paper, multi-vector models, which produce a set of embedding per data point, have achieved markedly superior performance for IR tasks. Unfortunately, using these models for IR is computationally expensive due to the increased complexity of multi-vector retrieval and scoring.”
Google’s MUVERA announcement acknowledges these challenges:
“… recent advances, particularly the introduction of multi-vector models like ColBERT, have demonstrated significantly improved performance in IR tasks. While this multi-vector approach boosts accuracy and enables retrieving more relevant documents, it introduces substantial computational challenges. In particular, the increased number of embeddings and the complexity of multi-vector similarity scoring make retrieval significantly more expensive.”
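To see where the extra cost comes from, consider the kind of similarity function multi-vector models use. The sketch below assumes the Chamfer (MaxSim-style) similarity associated with ColBERT: each query vector is matched to its best-scoring document vector, and those best scores are summed. The vectors and function names are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def chamfer_similarity(query_vecs, doc_vecs):
    """For each query vector take its best match among the document vectors, then sum."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

# Hypothetical multi-vector representations: one small vector per token.
query = [[1.0, 0.0], [0.0, 1.0]]
doc_a = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
doc_b = [[0.6, 0.0], [0.1, 0.3]]

score_a = chamfer_similarity(query, doc_a)
score_b = chamfer_similarity(query, doc_b)
print(score_a > score_b)
```

Scoring one document takes a dot product for every pair of query and document vectors, whereas a single-vector model needs only one dot product per document. Repeated across a web-scale index, that difference is the expense the quotes describe.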
A Potential Successor to RankEmbed?
Testimony from the United States Department of Justice antitrust lawsuit revealed that a signal called RankEmbed is used in creating search engine results pages (SERPs). The testimony described it as follows:
“RankEmbed is a dual encoder model that embeds both query and document into embedding space. Embedding space considers semantic properties of query and document in addition to other signals. Retrieval and ranking are then a dot product (distance measure in the embedding space)… Extremely fast; high quality on common queries but can perform poorly for tail queries…”
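The "retrieval and ranking are then a dot product" step can be illustrated generically. The sketch below is not RankEmbed itself, whose internals are not public; it only shows how any dual-encoder setup could rank documents once the query and each document have been reduced to a single vector. The vectors and document IDs are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def rank_by_dot_product(query_vec, doc_vecs):
    """Return document IDs ordered from highest to lowest dot product with the query."""
    scores = {doc_id: dot(query_vec, vec) for doc_id, vec in doc_vecs.items()}
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical single-vector embeddings, one per document.
documents = {
    "doc_tragedy": [0.8, 0.2, 0.4],
    "doc_comedy": [0.2, 0.8, 0.4],
    "doc_history": [0.5, 0.5, 0.1],
}
query_embedding = [0.9, 0.1, 0.3]

ranking = rank_by_dot_product(query_embedding, documents)
print(ranking)
```

Because each document costs a single dot product, this approach is very fast, which matches the "extremely fast" characterization in the testimony.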
MUVERA targets the cost of multi-vector models, which go beyond single-vector dual-encoder models like RankEmbed by offering greater semantic depth and better handling of tail queries. It employs a technique called Fixed Dimensional Encoding (FDE), which partitions the embedding space into sections and combines the vectors that fall into each section, producing a single fixed-length vector. That vector can be searched with fast single-vector methods, which speeds up retrieval with little loss of accuracy.
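The partition-and-combine idea can be sketched in a few lines. This is a simplified illustration, not Google's implementation. It assumes the sections are defined by random hyperplanes (SimHash-style), that query vectors falling in a section are summed, and that document vectors falling in a section are averaged. The paper describes further refinements that are omitted here, and all function names are hypothetical.

```python
import random

def make_hyperplanes(num_planes, dim, seed=0):
    """Random Gaussian hyperplanes that split the space into 2**num_planes sections."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_planes)]

def bucket_of(vec, hyperplanes):
    """Section id: one bit per hyperplane, set when vec lies on its positive side."""
    bucket = 0
    for plane in hyperplanes:
        side = sum(v * p for v, p in zip(vec, plane)) > 0
        bucket = (bucket << 1) | int(side)
    return bucket

def fixed_dimensional_encoding(vectors, hyperplanes, average):
    """Combine the vectors in each section, then concatenate all sections."""
    dim = len(vectors[0])
    num_buckets = 2 ** len(hyperplanes)
    sums = [[0.0] * dim for _ in range(num_buckets)]
    counts = [0] * num_buckets
    for vec in vectors:
        b = bucket_of(vec, hyperplanes)
        counts[b] += 1
        for i, v in enumerate(vec):
            sums[b][i] += v
    encoding = []
    for b in range(num_buckets):
        # Assumption: documents use the per-section mean, queries the per-section sum.
        scale = 1.0 / counts[b] if average and counts[b] else 1.0
        encoding.extend(v * scale for v in sums[b])
    return encoding

planes = make_hyperplanes(num_planes=3, dim=4)
query_vectors = [[0.1, 0.9, 0.0, 0.2], [0.7, 0.1, 0.3, 0.0]]
doc_vectors = [[0.2, 0.8, 0.1, 0.1], [0.6, 0.2, 0.2, 0.1], [0.0, 0.1, 0.9, 0.3]]

query_fde = fixed_dimensional_encoding(query_vectors, planes, average=False)
doc_fde = fixed_dimensional_encoding(doc_vectors, planes, average=True)

# A single dot product between the two encodings stands in for multi-vector scoring.
approx_score = sum(q * d for q, d in zip(query_fde, doc_fde))
print(len(query_fde), len(doc_fde))
```

Both encodings have the same length (number of sections times the embedding dimension) no matter how many vectors went in, which is what lets a standard single-vector index store and search them.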
According to the announcement:
“Unlike single-vector embeddings, multi-vector models represent each data point with a set of embeddings, and leverage more sophisticated similarity functions that can capture richer relationships between datapoints.
While this multi-vector approach boosts accuracy and enables retrieving more relevant documents, it introduces substantial computational challenges. In particular, the increased number of embeddings and the complexity of multi-vector similarity scoring make retrieval significantly more expensive.
In ‘MUVERA: Multi-Vector Retrieval via Fixed Dimensional Encodings’, we introduce a novel multi-vector retrieval algorithm designed to bridge the efficiency gap between single- and multi-vector retrieval.
…This new approach allows us to leverage the highly-optimized MIPS algorithms to retrieve an initial set of candidates that can then be re-ranked with the exact multi-vector similarity, thereby enabling efficient multi-vector retrieval without sacrificing accuracy.”
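Multi-vector models can return more precise results than dual-encoder models, but they are computationally expensive. MUVERA reduces that cost by retrieving an initial set of candidates with single-vector MIPS and then re-ranking only those candidates with the exact multi-vector similarity, so the accuracy gains no longer require the same computing demands.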
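The retrieve-then-re-rank flow from the announcement can be sketched as a two-stage function. This is an illustration under stated assumptions: a brute-force scan stands in for an optimized MIPS index, a simple sum or mean of the vectors stands in for the fixed-length encoding, and the exact score is assumed to be the Chamfer similarity. All names and data are hypothetical.

```python
def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def chamfer(query_vecs, doc_vecs):
    """Exact multi-vector score: best document match per query vector, summed."""
    return sum(max(dot(q, d) for d in doc_vecs) for q in query_vecs)

def pooled(vectors, average):
    """Stand-in for a fixed-length encoding: sum (queries) or mean (documents)."""
    dim = len(vectors[0])
    scale = 1.0 / len(vectors) if average else 1.0
    return [scale * sum(v[i] for v in vectors) for i in range(dim)]

def retrieve_then_rerank(query_vecs, corpus, num_candidates, top_k):
    query_single = pooled(query_vecs, average=False)
    single = {doc_id: pooled(vecs, average=True) for doc_id, vecs in corpus.items()}
    # Stage 1: cheap single-vector search (brute force here, a MIPS index in practice).
    candidates = sorted(
        single, key=lambda d: dot(query_single, single[d]), reverse=True
    )[:num_candidates]
    # Stage 2: exact multi-vector scoring, applied to the candidates only.
    reranked = sorted(
        candidates, key=lambda d: chamfer(query_vecs, corpus[d]), reverse=True
    )
    return reranked[:top_k]

query = [[1.0, 0.0], [0.0, 1.0]]
corpus = {
    "a": [[0.9, 0.1], [0.1, 0.9]],
    "b": [[0.7, 0.7]],
    "c": [[0.2, 0.1], [0.1, 0.2]],
}

results = retrieve_then_rerank(query, corpus, num_candidates=2, top_k=2)
print(results)
```

In this toy data the cheap first stage places document "b" ahead of "a", and the exact second stage reverses that order. The expensive scoring runs on two candidates instead of the whole corpus, which is the efficiency argument in miniature.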
Implications for SEO
MUVERA illustrates how search ranking increasingly depends on similarity judgments rather than traditional keyword signals. SEOs and publishers should focus on aligning content with the overall context and intent of queries. For example, a search for “corduroy jackets men’s medium” is more likely to surface pages that actually offer those products than pages that merely mention the keywords.
For businesses looking to enhance their online presence, Cyberset offers a range of services, including search engine optimization, content marketing, and social media marketing. These services can help align your content with modern search algorithms, ensuring better visibility and engagement.
Read Google’s full announcement:
MUVERA: Making multi-vector retrieval as fast as single-vector search