Abstract Synthesis

Semantic Programming by Example with Pre-trained Models - Gust Verbruggen

Gust Verbruggen, a senior AI researcher and member of the PROSE team at Microsoft, discusses his paper "Semantic Programming by Example with Pre-trained Models," which introduces a framework for integrating inductive program synthesis with large language models.

The project emerged from an effort to extend Flash Fill-style program synthesis beyond purely syntactic string transformations. Motivated by a key limitation of symbolic systems, namely their inability to access semantic knowledge unless it is manually encoded, Verbruggen and his collaborators explored how GPT-3 could serve as a semantic oracle within the PROSE framework. The result is a neurosymbolic architecture that preserves the efficiency and guarantees of symbolic synthesis while selectively delegating semantic subproblems to a language model.

In This Episode

• Limitations of both program synthesis and LLMs

• Programming by example

• Syntactic versus semantic transformations

• Integrating GPT-3 as semantic operators

• Semantic map, position, and condition operators

• Deductive backpropagation in PROSE

• Deferred query execution for efficiency

• Greedy clustering to control search explosion

• Ranking programs to minimize semantic calls

References

• https://www.microsoft.com/en-us/research/group/prose/

• https://www.microsoft.com/en-us/research/project/prose-framework/

• https://www.dagstuhl.de/en/seminars/seminar-calendar

• Sumit Gulwani's Flash Fill talk: https://youtu.be/421gU482xFE

About the Paper

"Semantic Programming by Example with Pre-trained Models"

Gust Verbruggen, Vu Le, Sumit Gulwani

Proceedings of the ACM on Programming Languages (OOPSLA), 2021

This paper presents a framework for augmenting inductive program synthesis with semantic operators powered by large language models. By decomposing tasks into syntactic and semantic subproblems, the system delegates only the irreducibly semantic components to a pre-trained model, while maintaining symbolic guarantees elsewhere. A deferred query execution strategy allows efficient learning without excessive model calls.
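To make the decomposition concrete, here is a minimal, hypothetical sketch of a "semantic map" operator: the symbolic synthesizer handles the syntactic structure of the task, and only the irreducibly semantic step is sent to a pre-trained model as a few-shot prompt. The names `semantic_map` and `stub_oracle` are invented for illustration (they are not the paper's API), and the oracle is a lookup-table stub standing in for GPT-3:

```python
def semantic_map(examples, inputs, oracle):
    """Build a few-shot prompt from the given (input, output) examples
    and ask the oracle (a pre-trained model in the paper; a stub here)
    to transform each remaining input."""
    prompt = "".join(f"{i} -> {o}\n" for i, o in examples)
    return [oracle(prompt + f"{x} -> ") for x in inputs]

# Stub oracle standing in for GPT-3: answers "country -> capital" queries.
CAPITALS = {"France": "Paris", "Japan": "Tokyo", "Peru": "Lima"}

def stub_oracle(prompt):
    # Read the unfinished last line of the prompt and "complete" it.
    query = prompt.splitlines()[-1].split(" ->")[0]
    return CAPITALS[query]

examples = [("France", "Paris"), ("Japan", "Tokyo")]
print(semantic_map(examples, ["Peru"], stub_oracle))  # ['Lima']
```

In the actual system, such semantic operators are composed with ordinary syntactic PROSE operators, and strategies like deferred query execution and ranking keep the number of model calls small.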

https://dl.acm.org/doi/10.1145/3485477

About the Guest

Gust Verbruggen is a researcher at KU Leuven and a member of Microsoft’s PROSE team. His work focuses on program synthesis, data wrangling, and neurosymbolic integration, particularly in real-world automation settings such as spreadsheets and code refactoring tools.

• https://www.microsoft.com/en-us/research/people/gverbruggen/

• https://scholar.google.com/citations?user=TmU3sKMAAAAJ&hl=en

Credits

• Host & Music: Bryan Landers, Technical Staff, Ndea

• Editor: Alejandro Ramirez

• https://x.com/ndea

• https://x.com/bryanlanders

• https://ndea.com