Garak (software)

Vulnerability scanner for large language models From Wikipedia, the free encyclopedia

garak is a computer security tool that provides information about  LLM security vulnerabilities and aids in penetration testing and red teaming of language models and dialog systems. It is supported by Nvidia. Officially the name is short for "generative AI red-teaming & assessment kit".

Original authorLeon Derczynski
Initial releaseJune 13, 2023; 2 years ago (2023-06-13)
Stable release
0.13.0 / September 2, 2025 (2025-09-02)
Quick facts Original author, Developer ...
garak
Original authorLeon Derczynski
DeveloperNvidia
Initial releaseJune 13, 2023; 2 years ago (2023-06-13)
Stable release
0.13.0 / September 2, 2025 (2025-09-02)
Written inPython
Operating systemCross-platform
TypeSecurity
LicenseFramework: Apache License
Websitegarak.ai
Repositorygithub.com/NVIDIA/garak
Close

garak is described as the leading LLM vulnerability scanner in an independent 2024 review by Fujitsu Research.[1] It is used and recommended as tooling in articles from Microsoft,[2] Trend Micro,[3] Nvidia[4] and Cisco,[5] and has been covered in major IT news outlets.[6][7]

History

garak was developed in Spring 2023 by Prof. Leon Derczynski of ITU Copenhagen[8] during a sabbatical at the University of Washington. It was first released under GPL on 13 June 2023.[9] The license was later updated to Apache 2.0. The software is now homed at Nvidia, where it lives as an open-source project with long-term support, and has been available via the Nvidia public GitHub since November 2024.[10]

Framework

The main components in garak are probes, generators, and detectors.[11] Probes manage attacks and implement an adversarial technique. Generators abstract away targets, which may be an LLM, a dialogue system, or anything that can take text and return text (plus optionally other modalities). Probes attempt to attack generators and pass the resulting output to a detector. The detectors assess whether or not the output indicates a successful attack. The whole is compiled into reporting by an HTML page and a JSON object summarizing results.

See also

References

Related Articles

Wikiwand AI