Implementations.

Reference reproductions of recent LLM papers. Kept simple, kept readable, and kept on github.com/llmsresearch.

01

llmsresearch / paperbanana

PaperBanana

Open-source implementation and extension of Google Research's PaperBanana for automated academic figures, diagrams, and research visuals, expanded to new domains including slide generation.

Python Figures Visualization MIT
02

llmsresearch / AVA

Anytime Verified Agents (AVA)

A framework for building reliable LLM-based reasoning systems that adaptively allocate computational resources under budget constraints. Combines adaptive search strategies, uncertainty estimation, and verification cascades to maximize reliability while respecting token, tool-call, and verification budgets.

Python Agents Reasoning MIT
03

llmsresearch / kvcompress

KV-Cache Compression

KV-cache compression for LLMs. Reference implementations of the TurboAngle and TurboQuant codecs, with Triton GPU kernels for inference-time KV-cache reduction.

Python Triton KV cache Inference
04

llmsresearch / specstream

SpecStream

Fast LLM inference with a 2.8x speedup from speculative decoding. A streaming implementation focused on practical end-to-end latency improvements.

Python Inference Speculative decoding MIT
05

llmsresearch / scone

SCONE

Implementation and evaluation of the research paper Scaling Embedding Layers in Language Models.

Python Embeddings Scaling MIT
06

llmsresearch / rearag

ReaRAG

Implementation of ReaRAG, a knowledge-guided reasoning model that enhances factual accuracy using iterative retrieval-augmented generation. Adapted for modular LLM integration and customizable retrieval strategies.

Python Retrieval RAG MIT
07

llmsresearch / coupledadam

Coupled Adam

Implementation of the research paper Better Embeddings with Coupled Adam.

Python Optimization Embeddings MIT
08

llmsresearch / darwinlm

DarwinLM

Implementation of the research paper DarwinLM: Evolutionary Structured Pruning of Large Language Models.

Python Pruning Compression
09

llmsresearch / param-delta

ParamΔ

Implementation of ParamΔ (Parameter Delta), which enables continuous model post-training at no training cost through parameter delta merging.

Python Fine-tuning Adaptation

Browse all repositories at github.com/orgs/llmsresearch/repositories.