SGLang


SGLang (short for Structured Generation Language) is an open-source framework for programming and serving large language models and multimodal models. It was introduced by researchers affiliated with LMSYS[1] and other institutions as a system combining a Python-embedded language for structured generation with a runtime for high-throughput inference.[2][3][4]

SGLang
Developer: LMSYS
Initial release: January 17, 2024
Written in: Python, Rust, CUDA, C++
Type: Large language model inference engine
License: Apache License 2.0
Website: sglang.io
Repository: github.com/sgl-project/sglang

The project is designed for low-latency, high-throughput inference workloads, and its documentation describes support for features such as structured outputs, speculative decoding, continuous batching, quantization, and compatibility with OpenAI-style APIs.[5]
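
Because the server exposes an OpenAI-style API, existing client code can typically be pointed at an SGLang endpoint with few changes. The following is a minimal sketch using the official openai Python client; the port (30000), the placeholder model name, and the dummy API key are illustrative assumptions rather than values prescribed by the project.

    # Query a locally running SGLang server through its
    # OpenAI-compatible endpoint. Assumes a server was started
    # beforehand (e.g. via SGLang's server launcher); the port and
    # model name below are examples only.
    from openai import OpenAI

    client = OpenAI(
        base_url="http://127.0.0.1:30000/v1",  # assumed local endpoint
        api_key="EMPTY",                       # no real key needed locally
    )

    response = client.chat.completions.create(
        model="default",  # placeholder; use the served model's name
        messages=[{"role": "user", "content": "Say hello in one sentence."}],
        max_tokens=32,
    )
    print(response.choices[0].message.content)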

History

SGLang was publicly introduced in January 2024 by researchers affiliated with Stanford, UC Berkeley, Texas A&M, and Shanghai Jiao Tong University.[2] The accompanying paper was later published in the proceedings of NeurIPS 2024.[3] In January 2026, TechCrunch reported that contributors associated with the project had formed the startup RadixArk to commercialize services around SGLang while continuing its open-source development.[6][7]

Architecture

According to the NeurIPS paper, SGLang consists of two main components: a front-end language embedded in Python and a back-end runtime for executing language model programs efficiently.[3] The front end provides primitives for generation, selection, and parallel control flow, while the runtime uses a set of optimizations intended to reduce repeated computation and improve throughput.[3]
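
A short sketch of what a front-end program can look like, built from the generation (gen), selection (select), and parallelism (fork) primitives named in the paper; the exact function names, arguments, and the endpoint URL below are illustrative and may differ between versions.

    import sglang as sgl

    # Attach the front end to a running SGLang server; the URL is an
    # assumed local default, not part of the paper.
    sgl.set_default_backend(sgl.RuntimeEndpoint("http://127.0.0.1:30000"))

    @sgl.function
    def judge_essay(s, essay):
        s += "Essay: " + essay + "\n"
        # Selection: constrained choice among a fixed set of options.
        s += "Verdict: " + sgl.select("verdict", choices=["pass", "fail"]) + "\n"
        # Parallel control flow: fork the state into two branches that
        # share the prefix above and can run concurrently.
        forks = s.fork(2)
        for f, aspect in zip(forks, ["clarity", "argument"]):
            # Generation: free-form continuation with a token budget.
            f += "Comment on " + aspect + ": " + sgl.gen("comment", max_tokens=48)
        s += "Clarity: " + forks[0]["comment"] + "\n"
        s += "Argument: " + forks[1]["comment"] + "\n"

    state = judge_essay.run(essay="Example essay text.")
    print(state["verdict"])

Because both forked branches extend the same prompt prefix, a runtime that caches key–value state can serve them without recomputing the shared portion, which is the kind of repeated computation the back end is designed to avoid.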

Among the techniques described by the project are RadixAttention for reusing key–value cache state across multiple generation calls, compressed finite-state machines for faster constrained decoding, and speculative execution for API-based models.[3] The current documentation also describes support for serving both language models and multimodal models across a range of hardware back ends.[5]
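
The prefix reuse behind RadixAttention can be pictured as a tree keyed by token IDs, in which nodes stand for cached key–value state and a new request first walks the tree to find its longest already-cached prefix. The sketch below is a conceptual toy (an uncompressed trie rather than a true radix tree) and is not SGLang's implementation, which manages actual GPU key–value tensors along with an eviction policy.

    # Toy model of prefix reuse: a trie over token IDs where each node
    # stands in for cached KV state. Illustrative only; not SGLang code.
    class Node:
        def __init__(self):
            self.children = {}  # token id -> Node

    class PrefixCache:
        def __init__(self):
            self.root = Node()

        def insert(self, tokens):
            """Record that KV state now exists for every prefix of tokens."""
            node = self.root
            for t in tokens:
                node = node.children.setdefault(t, Node())

        def match_prefix(self, tokens):
            """Length of the longest already-cached prefix of tokens."""
            node, length = self.root, 0
            for t in tokens:
                if t not in node.children:
                    break
                node, length = node.children[t], length + 1
            return length

    cache = PrefixCache()
    cache.insert([1, 2, 3, 4])               # first request populates the tree
    print(cache.match_prefix([1, 2, 3, 9]))  # prints 3: three tokens reusable

In the real system, the matched prefix lets the runtime skip recomputing attention keys and values for those tokens across multiple generation calls.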

References
