Sep 12, 2025
Nice deep dive on using LLMs for retrieval/recommendation. It’s a two-parter, and there’s also a great guide to building a retrieval engine using a constrained decoding approach with vLLM and a Hugging Face-hosted model. The whole thing is about 30 LoC.
LLMs as Retrieval and Recommendation Engines
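For a feel of what "constrained decoding" means for retrieval, here is a minimal sketch in plain Python. It is not the code from the linked guide and does not use vLLM: the catalog, the `<eos>` marker, and the toy scorer (a stand-in for a model's next-token scores) are all assumptions made for illustration. The idea it shows is that at every step only tokens that continue some catalog entry are allowed, so the output is always a real item.

```python
from typing import Callable, List, Sequence

END = "<eos>"  # hypothetical end-of-item marker, not a real model token


def build_trie(items: Sequence[str]) -> dict:
    """Build a word-level prefix tree over the catalog titles."""
    root: dict = {}
    for item in items:
        node = root
        for tok in item.split() + [END]:
            node = node.setdefault(tok, {})
    return root


def constrained_greedy_decode(
    trie: dict, score: Callable[[List[str], str], float]
) -> str:
    """Greedy decoding where each step may only pick a child of the current trie node."""
    node, out = trie, []
    while True:
        allowed = sorted(node)  # the "mask": only valid continuations, sorted for stable ties
        best = max(allowed, key=lambda t: score(out, t))
        if best == END:
            return " ".join(out)
        out.append(best)
        node = node[best]


def make_toy_scorer(query: str) -> Callable[[List[str], str], float]:
    """Stand-in for an LLM: a token scores 1.0 if it appears in the query, else 0.0."""
    query_words = set(query.lower().split())

    def score(prefix: List[str], token: str) -> float:
        return 1.0 if token.lower() in query_words else 0.0

    return score


# Hypothetical catalog, for illustration only.
CATALOG = ["The Matrix", "The Matrix Reloaded", "The Godfather", "Blade Runner"]
trie = build_trie(CATALOG)
result = constrained_greedy_decode(trie, make_toy_scorer("that blade runner movie"))
print(result)
```

With a real model, the scorer would be the model's logits and the trie would be built over its tokenizer's token IDs, but the masking step stays the same. Even a query that matches nothing still decodes to some catalog entry, which is the point of the constraint.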