Can we combine integer linear programming with exemplar selection to improve In-Context Learning?
Yes! All you need is to optimize your Knapsack.
The paper by Jonathan Tonglet, Manon Reusens, Philipp Borchert and Bart Baesens on #SEER was just accepted to #EMNLP2023 – learn more in this thread (1/).
The performance of In-Context Learning (ICL) depends heavily on which exemplars are selected for the prompt.
Jonathan Tonglet et al. show how this combinatorial optimization problem can be formulated as a Knapsack Integer Linear Program and optimized efficiently with deterministic solvers.
The Knapsack consists of an objective function, a capacity constraint and optional additional constraints. (2/)
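For readers who want to see the shape of such a program, here is a generic 0/1 Knapsack written out. This is a textbook sketch, not the exact formulation from the paper: the symbols are assumptions chosen for illustration, with x_i = 1 if candidate exemplar i is selected, w_i its value in the objective, c_i its cost, C the capacity, and G a hypothetical subset of candidates that an optional constraint requires at least k picks from.

```latex
\[
\begin{aligned}
\max_{x}\quad & \sum_{i=1}^{n} w_i x_i \\
\text{s.t.}\quad & \sum_{i=1}^{n} c_i x_i \le C && \text{(capacity constraint)} \\
& \sum_{i \in G} x_i \ge k && \text{(optional additional constraint)} \\
& x_i \in \{0, 1\}, \quad i = 1, \dots, n
\end{aligned}
\]
```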
In their #EMNLP2023 paper, the authors use a capacity constraint to control the size of the prompt in tokens, and diversity constraints to favor exemplars that share the same reasoning properties as the test problem.
They propose #SEER, a method to automatically generate a Knapsack program for HybridQA problems. It achieves superior performance to exemplar selection baselines on the FinQA and TAT-QA datasets (3/) #EMNLP2023
Tokens are the main pricing unit for commercial LLMs. Thanks to its capacity constraint, #SEER directly optimizes the prompt size to meet restricted token budgets. (4/)
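To make the token budget idea concrete, here is a toy sketch in Python. It is not the SEER implementation: it enumerates every subset by brute force as a stand-in for an ILP solver, which only works for a handful of candidates. The candidate pool, the similarity scores, the token counts, the reasoning types, and the function name `select_exemplars` are all hypothetical.

```python
from itertools import combinations

# Hypothetical candidate pool:
# (name, similarity to the test problem, length in tokens, reasoning type)
CANDIDATES = [
    ("ex_a", 0.90, 400, "table"),
    ("ex_b", 0.85, 350, "text"),
    ("ex_c", 0.80, 300, "table"),
    ("ex_d", 0.60, 150, "text"),
    ("ex_e", 0.55, 120, "hybrid"),
]


def select_exemplars(candidates, token_budget, required_type=None, min_required=0):
    """Pick the subset with the highest total similarity that fits the budget.

    Brute-force enumeration of the 0/1 Knapsack (illustration only).
    """
    best_score, best_subset = 0.0, ()
    for size in range(1, len(candidates) + 1):
        for subset in combinations(candidates, size):
            # Capacity constraint: total prompt size must fit the token budget.
            if sum(c[2] for c in subset) > token_budget:
                continue
            # Optional diversity-style constraint: at least `min_required`
            # selected exemplars must have the required reasoning type.
            if required_type is not None:
                matching = sum(1 for c in subset if c[3] == required_type)
                if matching < min_required:
                    continue
            # Objective: total similarity of the selected exemplars.
            score = sum(c[1] for c in subset)
            if score > best_score:
                best_score, best_subset = score, subset
    return [c[0] for c in best_subset], best_score


names, score = select_exemplars(CANDIDATES, token_budget=800)
diverse_names, diverse_score = select_exemplars(
    CANDIDATES, token_budget=800, required_type="hybrid", min_required=1
)
print(names, round(score, 2))
print(diverse_names, round(diverse_score, 2))
```

With an 800-token budget, the highest-similarity exemplar `ex_a` is left out because three shorter exemplars together score higher. A real solver reaches the same kind of trade-off without enumerating every subset.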
Interested in our research? Our code and results are openly available:
https://github.com/jtonglet/SEER
Jonathan Tonglet, Manon Reusens, Philipp Borchert and Bart Baesens:
SEER: A Knapsack approach to Exemplar Selection for In-Context HybridQA https://arxiv.org/abs/2310.06675v2
(5/) #EMNLP2023