GEO: Generative Engine Optimization

Pranjal Aggarwal♢*, Vishvak Murahari♠*, Tanmay Rajpurohit, Ashwin Kalyan,
Karthik R Narasimhan, Ameet Deshpande
Princeton University, Allen Institute of Technology, Georgia Tech, IIT Delhi

TLDR: GEO is a novel paradigm that helps content creators optimize their content for
the next generation of search engines augmented with Large Language Models.

Generative Engines

The advent of large language models (LLMs) has ushered in a new paradigm of search engines that use generative models to gather and summarize information to answer user queries. These Generative Engines (GEs) have the potential to generate accurate and personalized responses, and are rapidly replacing traditional search engines and improving user utility.

What is GEO about?


However, Generative Engines present numerous challenges for content creators:

  • Ranking and Visibility: Assessing site performance in traditional search engines is straightforward, because websites are listed in ranked order with their content shown verbatim. Generative Engines, however, generate rich and structured responses, often embedding citations in a single block. This makes the notion of ranking and visibility highly nuanced and multi-faceted.
  • Optimization of Websites: Unlike traditional search engines, where significant research has been conducted on improving website visibility, it remains unclear how to optimize visibility in generative engine responses.
  • Black Box Nature: Because generative engines are black boxes, content creators have little to no control over when and how their content is displayed, which makes optimizing their content even more difficult.

To address these challenges, we propose Generative Engine Optimization (GEO), a novel paradigm designed to help content creators enhance their content's visibility. We contribute several impression metrics, a benchmark for evaluation, and a suite of optimization strategies that empower creators to improve content visibility on deployed commercial Generative Engines.

Key Highlights

  • 🚀 Content Optimization for Generative Engines: GEO allows website owners to optimize content specifically for generative engines.
  • 🔎 Customized Visibility Metrics: Proposes specialized impression metrics tailored to generative engines, so that you can judge your content optimizations effectively.
  • 📊 Comprehensive Benchmark: Introduces GEO-BENCH, a diverse benchmark of 10K queries for evaluation, with a corresponding leaderboard for up-to-date comparison against the best methods.
  • ⚡️ Significant Visibility Gains: Simple GEO strategies improve visibility by up to 40% in experiments on deployed commercial generative engines.
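
This page does not spell out how the impression metrics are computed. As a rough illustration of what a visibility metric for a generative engine might look like, the sketch below assumes a response whose sentences carry bracketed citations such as `[1]`, and credits each source with the words of the sentences that cite it. The citation format, the function names `word_share_by_source` and `relative_gain`, and the even split among co-cited sources are all assumptions made for this example, not the metrics defined by GEO.

```python
import re
from collections import defaultdict

# Assumption: citations appear as bracketed numbers such as [1] inside a sentence.
CITATION = re.compile(r"\[(\d+)\]")


def word_share_by_source(response: str) -> dict[int, float]:
    """Share of cited words attributed to each source (hypothetical metric)."""
    words_per_source: dict[int, float] = defaultdict(float)
    # Naive sentence split on whitespace that follows '.', '!' or '?'.
    for sentence in re.split(r"(?<=[.!?])\s+", response.strip()):
        source_ids = [int(s) for s in CITATION.findall(sentence)]
        if not source_ids:
            continue  # uncited sentences give no credit to any source
        # Count words after removing the citation markers themselves.
        n_words = len(re.findall(r"\w+", CITATION.sub(" ", sentence)))
        for source_id in source_ids:
            # Split a sentence's words evenly among the sources it cites.
            words_per_source[source_id] += n_words / len(source_ids)
    total = sum(words_per_source.values())
    if not total:
        return {}
    return {sid: words / total for sid, words in words_per_source.items()}


def relative_gain(before: float, after: float) -> float:
    """Relative visibility change, in percent, between two share values."""
    return (after - before) / before * 100.0
```

Under this toy metric, a source whose share rises from 0.25 to 0.35 after its content is edited would show a relative gain of 40%.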

Results Summary

GEO outperforms baselines and traditional SEO methods by wide margins. We evaluate on both our proposed Generative Engine and a deployed commercial GE to compare performance. The results show that creators can start using GEO out of the box to improve their content for the next shift in search 🔥.


Our analysis suggests that (a) GEO is more helpful when optimization methods are tailored to the target domain, and (b) in the future, GEO will bring larger benefits to websites that rank lower in search engines.


We show qualitatively that, with GEO, creators can make minor changes that nonetheless lead to large visibility gains for their content.


GEO-BENCH

Generative Engines are becoming the common way to search for information, and GEO enables creators to optimize their content. But how should creators evaluate their methods and assess their performance? Current datasets are scattered and do not directly represent the kinds of queries asked in the new paradigm. We present GEO-BENCH, a diverse benchmark of Generative Engine queries drawn from different sources!

1. GEO-BENCH

  • 10K Queries: 10K queries spanning different domains, difficulty levels, and categories. The train set has 8K queries; the validation and test sets have 1K queries each.
  • Datasets: The benchmark is composed of queries from nine datasets: 1. MS MARCO, 2. ORCAS-1, 3. Natural Questions, 4. AllSouls, 5. LIMA, 6. Davinci-Debate, 7. Perplexity.ai Discover, 8. ELI-5, 9. GPT-4 Generated Queries
  • Sources & Tags: Every query comes with 5 cleaned, relevant sources from Google Search. Queries are labeled with tags drawn from a set of more than 50 diverse tags to allow for targeted optimization and analysis.
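
To make the structure described above concrete, here is a minimal sketch of how one benchmark record could be represented and filtered by tag. The class name `BenchQuery`, its field names, and the helper `filter_by_tag` are hypothetical and chosen only for illustration; the released GEO-BENCH files may be organized differently. Only the split sizes and the five sources per query come from the description above.

```python
from dataclasses import dataclass, field

# Split sizes as described above: 8K train, 1K validation, 1K test.
SPLIT_SIZES = {"train": 8_000, "val": 1_000, "test": 1_000}
SOURCES_PER_QUERY = 5  # each query ships with 5 cleaned sources


@dataclass
class BenchQuery:
    """One benchmark record (hypothetical schema, for illustration only)."""
    query: str
    sources: list[str]                          # cleaned text of the 5 sources
    tags: set[str] = field(default_factory=set)
    split: str = "train"

    def __post_init__(self) -> None:
        # Reject records that do not match the described structure.
        if len(self.sources) != SOURCES_PER_QUERY:
            raise ValueError(f"expected {SOURCES_PER_QUERY} sources, got {len(self.sources)}")
        if self.split not in SPLIT_SIZES:
            raise ValueError(f"unknown split: {self.split!r}")


def filter_by_tag(queries: list[BenchQuery], tag: str) -> list[BenchQuery]:
    """Keep only the queries carrying the given tag, e.g. for domain-specific analysis."""
    return [q for q in queries if tag in q.tags]
```

Filtering on tags in this way is one simple route to the targeted, per-domain analysis that the tags are meant to support.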

2. GEO-BENCH Leaderboard

  • Public Leaderboard: We present a leaderboard for GEO-BENCH, which will be updated regularly with the latest results.
  • Different Domains: Does your method target a particular domain? The leaderboard provides subsets for different domains, datasets, and tags.
  • Submit your methods: Submit your methods to the leaderboard and compete with the best methods.

A step forward in facilitating research in the era of the new search paradigm!

BibTeX

@misc{aggarwal2023geo,
      title={GEO: Generative Engine Optimization}, 
      author={Pranjal Aggarwal and Vishvak Murahari and Tanmay Rajpurohit and Ashwin Kalyan and Karthik R Narasimhan and Ameet Deshpande},
      year={2023},
      eprint={2311.09735},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}