In 2024 we looked into leveraging open-weight LLMs for source code analysis. The answer was clearly negative: even a small code base could easily take 200K tokens, more than any context window offered by open-weight models at the time.
The table below summarizes the top LLMs by context window as of today. Context windows have increased significantly compared to the roughly 32,000 tokens that were typical last year. However, we are still orders of magnitude away from being able to feed entire code bases to LLMs.
| Model | Tokens | Open Weight |
|---|---|---|
| Gemini 1.5 Pro (Google) | 2,000,000 | No |
| GPT-4.1 (OpenAI) | 1,000,000 | No |
| Claude (Anthropic) | 200,000 | No |
| DeepSeek | 128,000 | Yes |
| Llama 3.1 (Meta) | 128,000 | Yes |
Later in the year, a tool named vulnhuntr was published (https://github.com/protectai/vulnhuntr). It works around limited context windows by analyzing code paths from source to sink and feeding the code blocks along the call chain to the LLM iteratively, in a multi-step process. Initially, the LLM is given the code blocks of a set of files (e.g. sources where API entry points are defined). The LLM then explicitly asks for the code blocks of the functions and classes it needs to continue the analysis.
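Conceptually, the iteration looks roughly like the sketch below. Note that `llm.ask`, the `repo` helpers, and the response schema are hypothetical names assumed for illustration; they do not reflect vulnhuntr's actual API.

```python
import json

def build_prompt(context):
    # Hypothetical prompt assembly: concatenate the gathered code blocks.
    return "\n\n".join(f"# {name}\n{code}" for name, code in context.items())

def analyze(llm, repo, entry_file):
    # Start from a potential source, e.g. a file defining API entry points.
    context = {entry_file: repo.read_file(entry_file)}
    while True:
        # Ask the LLM to analyze the code gathered so far.
        response = json.loads(llm.ask(build_prompt(context)))
        requested = response.get("context_code", [])
        if not requested:
            # The LLM has seen the full call chain; return its final report.
            return response
        for item in requested:
            # Resolve each requested function/class and add it to the context.
            context[item["name"]] = repo.find_symbol(item["name"])
```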
On top of the clever approach, the authors published a track record of vulnerabilities discovered with the tool, making it even more appealing.
There is one relevant limitation though: only Python code bases are supported. Statically determining the call chain from source to sink is in fact very difficult for dynamically typed languages.
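A toy example illustrates the problem: without type annotations, a static analyzer cannot tell which concrete implementation a call resolves to (`run_query` below is a hypothetical database helper).

```python
class SafeHandler:
    def process(self, data):
        print(data)  # harmless

class DbHandler:
    def process(self, data):
        run_query(f"SELECT * FROM t WHERE x = {data}")  # injectable sink

def dispatch(handler, data):
    # Without type information, a static analyzer cannot tell whether this
    # call reaches the harmless print() or the injectable query above.
    handler.process(data)
```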
We decided that it was worth extending the tool to support additional languages. Therefore we created xvulnhuntr (https://github.com/CompassSecurity/xvulnhuntr), a fork of the original project where the ‘x’ stands for extended.

xvulnhuntr also supports C#, Java, and Go. For each language there is a dedicated helper tool, developed in the corresponding language. Given a repository path and a class or function name as input, the helper returns JSON containing the file name of the match and the source code of the matching function or class. This modular approach makes it easy to extend support to other typed languages.
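For illustration, the main tool might call such a helper along the following lines; the binary name, flags, and JSON keys shown here are assumptions for the sketch, not xvulnhuntr's actual interface.

```python
import json
import subprocess

def lookup_symbol(repo_path, symbol):
    # Invoke the language-specific helper (placeholder name and flags).
    result = subprocess.run(
        ["java-helper", "--repo", repo_path, "--symbol", symbol],
        capture_output=True, text=True, check=True,
    )
    # Assumed output shape: {"file": "...", "source": "..."}
    match = json.loads(result.stdout)
    return match["file"], match["source"]
```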
We also put deliberate effort into making xvulnhuntr easier for developers to contribute to. In contrast to the original project, xvulnhuntr can be run against a local test suite with mocked API responses (see the sketch after the list below). This provides multiple advantages:
- reproducibility: while LLMs are not deterministic (even with temperature set to zero), mocked responses allow easier debugging
- speed: mocked responses avoid latency from the LLM provider
- cost reduction: no need to waste tokens during development
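As a rough sketch of what mocked runs enable (the class and method names here are illustrative, not xvulnhuntr's actual test API):

```python
class MockLLM:
    """Replays canned responses instead of calling a real LLM provider."""

    def __init__(self, responses):
        self._responses = iter(responses)

    def ask(self, prompt):
        # Deterministic, instant, and free: no provider latency or token cost.
        return next(self._responses)

def test_run_is_reproducible():
    canned = ['{"context_code": [], "vulnerabilities": ["SQLI"]}']
    # Unlike a live LLM, identical inputs always yield identical outputs.
    assert MockLLM(canned).ask("prompt") == MockLLM(canned).ask("prompt")
```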
Next Steps
So far we have primarily focused on developing the tool itself. We expect bugs to surface and refinements to follow as the tool is run against a variety of code bases. We are also interested in evaluating how analyses from different LLM providers compare to each other. Finally, we welcome and encourage contributions. Until then, happy LLM-powered hacking!