BLOG

Latest updates and releases by LMSYS Org are announced through our blogpost series.

Building a Truly "Open" OpenAI API Server with Open Models Locally

by: Shuo Yang and Siyuan Zhuang, June 9, 2023


Many applications have been built on closed-source OpenAI APIs, but now you can effortlessly port them to use open-source alternatives without modifying the code. FastChat's OpenAI-compatible API server enables this seamless transition. In this blog post, we show how you can do this and use LangChai...

Chatbot Arena Leaderboard Updates (Week 4)

by: LMSYS Org, May 25, 2023


In this update, we are excited to welcome the following chatbots joining the Arena: Google PaLM 2, chat-tuned with the code name chat-bison@001 on Google Cloud Vertex AI Anthropic Claude-instant-v1 MosaicML MPT-7B-chat Vicuna-7B A new Elo rating leaderboard based on the 27K anonymous voting data c...

Chatbot Arena Leaderboard Updates (Week 2)

by: LMSYS Org, May 10, 2023


We release an updated leaderboard with more models and new data we collected last week, after the announcement of the anonymous Chatbot Arena. We are actively iterating on the design of the arena and leaderboard scores. In this update, we have added 4 new yet strong players into the Arena, including...

Chatbot Arena: Benchmarking LLMs in the Wild with Elo Ratings

by: Lianmin Zheng*, Ying Sheng*, Wei-Lin Chiang, Hao Zhang, Joseph E. Gonzalez, Ion Stoica, May 3, 2023


We present Chatbot Arena, a benchmark platform for large language models (LLMs) that features anonymous, randomized battles in a crowdsourced manner. In this blog post, we are releasing our initial results and a leaderboard based on the Elo rating system, which is a widely-used rating system in ches...

Vicuna: An Open-Source Chatbot Impressing GPT-4 with 90%* ChatGPT Quality

by: The Vicuna Team, March 30, 2023


We introduce Vicuna-13B, an open-source chatbot trained by fine-tuning LLaMA on user-shared conversations collected from ShareGPT. Preliminary evaluation using GPT-4 as a judge shows Vicuna-13B achieves more than 90%* quality of OpenAI ChatGPT and Google Bard while outperforming other models like LL...