Both presenters, Pascal Junod and Jean-Roland Schuler, work for the HES-SO, the University of Applied Sciences Western Switzerland. This talk is the follow-up to last year’s presentation and covers the improvements made since. While Pascal Junod, from the HES-SO HEIVd (Yverdon-les-Bains), focused on obfuscating binaries starting from their source code, Jean-Roland Schuler, of the HES-SO EIA in Fribourg, applied other obfuscation techniques directly to ELF binaries for which no source code was available anymore.
The first part was an introduction to obfuscation and why it differs from standard protection techniques. The main idea is that you can usually protect yourself against security threats. But when confronted with an attacker who has a copy of your software, you can’t: the attacker can isolate himself to reverse engineer the product, and you have no way of detecting the ongoing analysis. Combined with the rise of capabilities available in the cloud (e.g. AWS used to perform brute-force attacks), this means software copy protections and DRM schemes have less and less chance of remaining unbroken. The HES-SO code obfuscation project started from the observation that lots of obfuscation products are available, but mostly for high-level languages. For C / C++, only a few options were available, and none of them was open source.
Let’s dive into the source code obfuscation part done by Pascal Junod: it is based on the LLVM compiler, which produces an “intermediate language” close to a kind of assembly pseudo-code. Why use LLVM and not the famous GCC? According to the presenters and their research, GCC is a beast in terms of complexity and badly documented. LLVM, on the other hand, is more modular and offers clearly documented APIs at various steps of the compilation process. The project was born in 2000 and has been pushed, for license reasons, by Apple since 2005; Apple supports the project by employing the core developers. The obfuscation process obviously takes place after the compiler optimization process and relies on various techniques such as operator substitution or code flattening. The results were tested on several libraries such as libtomcrypt or ImageMagick. Libtomcrypt is a good candidate, as the project provides good test cases, and errors introduced by the obfuscation process will almost certainly alter the result at the end of the process. An impressive image of the code flattening feature was shown for ImageMagick, where a single clause got flattened into 3000 cases.
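To give a rough feeling for the two techniques named above, here is a hand-written sketch in C. It is not the output of the presenters’ tool, which works on LLVM’s intermediate representation; all function names are made up for this illustration. The first pair shows operator substitution (an addition replaced by an equivalent XOR/AND expression), the second pair shows a loop flattened into a dispatcher built around a `switch`. The small `check_equivalence` helper mirrors the point about test cases: an obfuscated variant has to return exactly what the original does.

```c
#include <assert.h>
#include <stdint.h>

/* Original: plain addition. */
static uint32_t add_plain(uint32_t a, uint32_t b) {
    return a + b;
}

/* Operator substitution: for unsigned (wrap-around) arithmetic,
   a + b == (a ^ b) + 2 * (a & b). */
static uint32_t add_substituted(uint32_t a, uint32_t b) {
    return (a ^ b) + ((a & b) << 1);
}

/* Original: sum of 1..n written as a structured loop. */
static uint32_t sum_plain(uint32_t n) {
    uint32_t total = 0;
    for (uint32_t i = 1; i <= n; i++) {
        total += i;
    }
    return total;
}

/* Flattened variant: every basic block of the loop becomes a case,
   and a state variable decides which block runs next. The loop
   structure is no longer visible in the control-flow graph. */
static uint32_t sum_flattened(uint32_t n) {
    uint32_t total = 0;
    uint32_t i = 0;
    int state = 0;
    for (;;) {
        switch (state) {
        case 0: /* loop initialisation */
            i = 1;
            state = 1;
            break;
        case 1: /* loop condition */
            state = (i <= n) ? 2 : 3;
            break;
        case 2: /* loop body, using the substituted addition */
            total = add_substituted(total, i);
            i++;
            state = 1;
            break;
        default: /* state 3: loop exit */
            return total;
        }
    }
}

/* Hypothetical self-check: both variants must agree on every input tried. */
static void check_equivalence(void) {
    for (uint32_t n = 0; n < 100; n++) {
        assert(add_substituted(n, 0xFFFFFFF0u) == add_plain(n, 0xFFFFFFF0u));
        assert(sum_flattened(n) == sum_plain(n));
    }
}
```

Calling `check_equivalence()` from a `main` function exercises both pairs. A real flattening pass applies the same idea to every function automatically, which is how a single construct can end up spread over thousands of cases.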
Inserting obfuscation into an existing ELF binary is much trickier, as you first need to find places to insert the additional code. Several techniques were presented, along with their limitations. In such cases, the best approach identified is to rely on junk code and opaque predicates, which may also come in handy for software watermarking.
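As an illustration of what an opaque predicate guarding junk code looks like, here is a minimal sketch written at source level for readability. The talk deals with inserting such constructs into the binary itself, so this is only an analogy; the checksum function, its names and the `WATERMARK` constant are all invented for the example. Note that an optimizing compiler may well fold such a simple predicate away, which is not an issue when the code is injected after compilation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical watermark value, made up for this sketch. */
#define WATERMARK 0x0BF5CA7Eu

/* Original: a simple rolling checksum. */
static uint32_t checksum_plain(const uint8_t *data, uint32_t len) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < len; i++) {
        acc = acc * 31u + data[i];
    }
    return acc;
}

/* Same checksum with an opaque predicate and junk code added. */
static uint32_t checksum_opaque(const uint8_t *data, uint32_t len) {
    uint32_t acc = 0;
    for (uint32_t i = 0; i < len; i++) {
        /* Opaque predicate: i * (i + 1) is a product of two consecutive
           integers and therefore even, also after wrap-around modulo
           2^32. The condition is always true, but that is not obvious
           to someone reading the disassembly. */
        if (((i * (i + 1u)) & 1u) == 0u) {
            acc = acc * 31u + data[i];
        } else {
            /* Junk code: never executed. The embedded constant could
               double as a watermark identifying this particular copy. */
            acc ^= WATERMARK;
            acc = (acc << 7) | (acc >> 25);
        }
    }
    return acc;
}

/* Hypothetical self-check: the junk branch must never change the result. */
static void check_opaque_equivalence(void) {
    const uint8_t sample[] = {0, 1, 2, 127, 128, 255, 42, 7};
    for (uint32_t len = 0; len <= 8; len++) {
        assert(checksum_opaque(sample, len) == checksum_plain(sample, len));
    }
}
```

Since the `else` branch is dead, its content can be varied freely from one copy of the program to the next without affecting behaviour, which is what makes the construct attractive for watermarking.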
The questions focused mainly on the performance loss due to obfuscation, which is kept reasonably low (between 30 and 50% on libtomcrypt). The future of the project is unclear for the moment: publishing it under an open-source license may help the good guys, but malware authors would certainly be interested in it as well.
Update: on 16.11.2012, Pascal announced the final workshop about Obfuscator taking place on Monday 3rd of December – for further details, see his blog article http://crypto.junod.info/2012/11/16/workshop-final-du-projet-obfuscator/