PROJECTS
LMSYS Org develops open models, datasets, systems, and evaluation tools for large models.
EVALUATION
Chatbot Arena
A crowdsourced benchmark platform for large language models (LLMs) that features anonymous, randomized battles between models. Results power a public leaderboard based on Elo ratings.
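To make the leaderboard mechanic concrete, here is a minimal sketch of a standard two-player Elo update as applied to one battle. The K-factor of 32 and starting ratings of 1000 are illustrative assumptions, not LMSYS's exact parameters.

```python
def expected_score(r_a, r_b):
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def elo_update(r_a, r_b, score_a, k=32):
    """Return updated ratings after one battle.

    score_a is 1.0 if A wins, 0.0 if A loses, 0.5 for a tie.
    Ratings are zero-sum: whatever A gains, B loses.
    """
    e_a = expected_score(r_a, r_b)
    r_a_new = r_a + k * (score_a - e_a)
    r_b_new = r_b + k * ((1.0 - score_a) - (1.0 - e_a))
    return r_a_new, r_b_new

# Example: two equally rated models battle and A wins.
ra, rb = elo_update(1000.0, 1000.0, 1.0)  # → (1016.0, 984.0)
```

An upset (a low-rated model beating a high-rated one) moves ratings more than an expected result, which is why many randomized battles are needed before the ranking stabilizes.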
MT-Bench
A set of challenging, multi-turn, and open-ended questions for evaluating chat assistants. It uses LLM-as-a-judge to evaluate model responses.
SYSTEMS
FastChat
An open and scalable platform for training, fine-tuning, serving, and evaluating LLM-based chatbots.
SGLang
An efficient interface and runtime for complex LLM programs.
S-LoRA
A system for serving thousands of concurrent LoRA adapters.
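The arithmetic that makes serving thousands of adapters feasible is LoRA's low-rank factorization: the large base weight W is shared across all requests, and each adapter contributes only a small pair of factors (B, A). A minimal pure-Python sketch, with all names and shapes illustrative:

```python
def matmul(X, Y):
    """Plain list-of-lists matrix multiply."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*Y)]
            for row in X]

def madd(X, Y):
    """Elementwise matrix addition."""
    return [[a + b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)]

def lora_forward(x, W, A, B):
    """Compute x @ (W + B @ A) as x @ W + (x @ B) @ A.

    W is d_in x d_out (shared base weight); B is d_in x r and A is
    r x d_out (per-adapter low-rank factors, r << d_in). The factored
    form never materializes the full d_in x d_out update, so each extra
    adapter costs only O(r * (d_in + d_out)) memory on top of one W.
    """
    return madd(matmul(x, W), matmul(matmul(x, B), A))
```

Both forms give identical outputs; the factored form is what lets one server keep a single copy of W in GPU memory while swapping tiny (B, A) pairs per request.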
Lookahead Decoding
An exact, fast, parallel decoding algorithm without the need for draft models or data stores.
DATASETS
LMSYS-Chat-1M
This dataset contains one million real-world conversations with 25 state-of-the-art LLMs.
Chatbot Arena Conversations
This dataset contains 33K cleaned conversations with pairwise human preferences collected on Chatbot Arena.
ToxicChat
This dataset contains 10K high-quality examples for content moderation in real-world user-AI interactions, based on user queries from the Vicuna online demo.