Implementations.
Reference reproductions of recent LLM papers. Kept simple, kept readable, and kept on github.com/llmsresearch.
-
01. llmsresearch / paperbanana: PaperBanana
Open-source implementation and extension of Google Research's PaperBanana for automated academic figures, diagrams, and research visuals, expanded to new domains including slide generation.
-
02. llmsresearch / AVA: Anytime Verified Agents (AVA)
A framework for building reliable LLM-based reasoning systems that adaptively allocate computational resources under budget constraints. Combines adaptive search strategies, uncertainty estimation, and verification cascades to maximize reliability while respecting token, tool-call, and verification budgets.
-
03. llmsresearch / kvcompress: KV-Cache Compression
KV-cache compression for LLMs. Reference implementations of the TurboAngle and TurboQuant codecs, with Triton GPU kernels for inference-time KV-cache reduction.
-
04. llmsresearch / specstream: SpecStream
Fast LLM inference with 2.8x speedup using speculative decoding. A streaming implementation focused on practical end-to-end latency improvements.
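For readers new to the technique, the following is a minimal sketch of greedy speculative decoding. It is not SpecStream's code: the `NextToken` interface, the toy `draft` and `target` functions, and the parameter names are all hypothetical. A cheap draft model proposes `k` tokens, the target model checks them, and the longest agreeing prefix is kept along with one token from the target.

```python
from typing import Callable, List

# Hypothetical interface: a "model" maps a token context to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(context: List[int], draft: NextToken,
                     target: NextToken, k: int) -> List[int]:
    """Propose k draft tokens, keep the prefix the target agrees with."""
    proposal: List[int] = []
    for _ in range(k):
        proposal.append(draft(context + proposal))

    accepted: List[int] = []
    for tok in proposal:
        # A real system scores all k positions in one batched target pass;
        # this sketch calls the target once per position for clarity.
        expected = target(context + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
    # Every draft token matched, so the target contributes one extra token.
    accepted.append(target(context + accepted))
    return accepted


def generate(prompt: List[int], draft: NextToken, target: NextToken,
             k: int, max_new: int) -> List[int]:
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(out, draft, target, k))
    return out[:len(prompt) + max_new]


# Toy models (assumptions for illustration): the target counts upward mod 10,
# and the draft agrees except after token 4.
def target(ctx: List[int]) -> int:
    return (ctx[-1] + 1) % 10


def draft(ctx: List[int]) -> int:
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


result = generate([0], draft, target, k=3, max_new=6)
```

With greedy verification the output matches what the target model alone would produce; any speedup comes from the target checking several draft tokens per pass rather than generating them one at a time.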
-
05. llmsresearch / scone: SCONE
Implementation and evaluation of the research paper "Scaling Embedding Layers in Language Models".
-
06. llmsresearch / rearag: ReaRAG
Implementation of ReaRAG, a knowledge-guided reasoning model that enhances factual accuracy using iterative retrieval-augmented generation. Adapted for modular LLM integration and customizable retrieval strategies.
-
07. llmsresearch / coupledadam: Coupled Adam
Implementation of the research paper "Better Embeddings with Coupled Adam".
-
08. llmsresearch / darwinlm: DarwinLM
Implementation of the research paper "DarwinLM: Evolutionary Structured Pruning of Large Language Models".
-
09. llmsresearch / param-delta: ParamΔ
Implementation of ParamΔ (Parameter Delta), which enables continuous model post-training at no training cost through parameter-delta merging.
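As a rough illustration of parameter-delta merging, the sketch below assumes the merge adds the difference between a post-trained model and its original base to a newer base model, parameter by parameter. The `Weights` type and the function name `apply_param_delta` are hypothetical and not taken from the repository, which may organize this differently.

```python
from typing import Dict, List

# Hypothetical flat representation: parameter name -> list of values.
Weights = Dict[str, List[float]]


def apply_param_delta(base: Weights, post_trained: Weights,
                      new_base: Weights) -> Weights:
    """Return new_base + (post_trained - base) for every parameter.

    Assumes all three models share parameter names and shapes.
    """
    merged: Weights = {}
    for name, new_vals in new_base.items():
        old_vals = base[name]
        post_vals = post_trained[name]
        if not (len(new_vals) == len(old_vals) == len(post_vals)):
            raise ValueError(f"shape mismatch for parameter {name!r}")
        merged[name] = [n + (p - o)
                        for n, p, o in zip(new_vals, post_vals, old_vals)]
    return merged


# Toy values chosen only to make the arithmetic easy to follow.
base = {"w": [1.0, 2.0]}
post_trained = {"w": [1.5, 1.0]}   # delta relative to base is [+0.5, -1.0]
new_base = {"w": [2.0, 2.0]}
merged = apply_param_delta(base, post_trained, new_base)
```

No gradient steps are involved: the merge is element-wise arithmetic over existing checkpoints, which is the sense in which the post-training is transferred at no training cost.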
Browse all repositories at github.com/orgs/llmsresearch/repositories.