Saibo Geng

PhD student at EPFL

EPFL Data Science Lab

Biography

Hey there! I’m currently a Ph.D. student at EPFL’s Data Science Lab, where I’m advised by Prof. Robert West.

During my time at EPFL, I had the chance to dive into the world of knowledge-graph-enhanced LLMs with Prof. Antoine Bosselut (NLP Lab), text summarization with Diego Antognini (now at Google), language models for law with Rémi Lebret (AI Lab), and LLM evaluation with Maxime Peyrard (now a professor at CNRS, France). Prior to joining EPFL, I received my B.S. in Physics from Université Paris-Saclay, France.

My research interests include:

  • Developing new constrained-decoding techniques to improve LLM inference quality and/or efficiency. Examples include:

    • speculative execution
    • new decoding strategies (e.g., extensions to beam search)
    • “classifier in the loop” decoding for responsible AI
    • improving AI planning
    • explorations of attention-masking-based constraints
  • Re-imagining the use and construction of context-free grammars (CFGs) and related formalisms to fit generative AI. Examples include:

    • better tools for constructing formal grammars
    • extensions to Earley parsing
    • efficient batch processing for constrained generation
  • Designing principled evaluation frameworks and benchmarks for measuring the effects of constrained decoding on a model. Areas of interest include:

    • efficiency (token throughput and latency)
    • generation quality
    • impacts of constrained decoding on AI safety
  • Considering how these techniques are presented to developers, who may not be well versed in grammars and constrained generation, through an intuitive, idiomatic programming syntax.

Interests
  • Grammar-constrained Decoding
  • Efficient Decoding Methods
  • Domain-Specific Language Generation with LLMs
Education
  • PhD in Computer Science, 2022-Now

    EPFL

Updates

Transformers-CFG is now available on GitHub
A grammar-constrained decoding library for LLMs that supports CFG, EBNF, and Unicode grammars
Low-memory Beam Search is merged into Hugging Face Transformers
A low-memory beam search implementation that trades speed for reduced memory usage