How To Make Your Own Sandbox: An Introduction to Virtualization Techniques

If you have ever written a large piece of system software, you have probably used some sort of virtual machine, be it VMware, Virtual PC or something else. Have you ever asked yourself how those machines work? I’ve been excited by these wonderful technologies for quite a long time. They looked like a piece of magic to me. And the best way to uncover the magic and understand the details is to write a virtualization solution from scratch. By now, I have my own virtualization solution: a sandboxing tool. It was a challenging task to accomplish. There are a lot of questions you have to answer while writing such a product, and Googlable answers to most of them are in short supply. So I would like to share my experience with the public. This is going to be a series of articles on virtualization.

Developing a virtual machine is not a task for a novice programmer, so I assume you have experience in Windows programming and, in particular, have mastered these skills: you are good at C/C++ coding, have experience in Win32 API programming, have read some books on Windows internals, such as the one by Mark Russinovich, and have some basic assembly knowledge. It would be a big advantage if you have programmed kernel mode drivers for Windows, but, although a sandboxing solution requires some kernel mode code, I assume you have little or no experience in this area. I’ll cover driver development topics in great detail in this tutorial.

Virtual machines can be divided into two big classes: ‘hardcore’ virtual machines, which emulate the hardware completely, such as VMware, and light-weight virtual machines, which instead emulate critical operating system actions, such as file system operations, operations on the registry and some other OS primitives, such as mutexes. Examples of light-weight virtual machines include Featherweight Virtual Machine (FVM), Sandboxie, and Cybergenic Shade. Featherweight Virtual Machine is an open source project, but it has some drawbacks, such as the way it intercepts kernel mode calls. It uses a hooking technique, which means that it modifies OS kernel code, something forbidden on x64 versions of Windows. Such a patch causes Patch Guard, a special OS component, to crash the system with a BSOD, because Microsoft now treats these patches as malicious. So FVM is a good starting point for getting a view of virtualization as a whole, but it is not quite compatible with modern OSes.

Most of the challenges arise when it comes to preserving compatibility with Patch Guard, and we will look at them in detail in further articles. A lot of programmers invent techniques to bypass Patch Guard, but such a bypass in fact weakens the OS, making it vulnerable to legacy kernel mode infections that otherwise could not run on a Patch Guarded OS. So our goal is to preserve compatibility with Patch Guard instead of disabling or bypassing it. Our sandbox should add some armor to the OS, making it more resistant to malware attacks, so it would be a crime to weaken it on the other side by disabling or bypassing Patch Guard.

In this tutorial, we are going to focus on the development of a light-weight virtualization solution. The main idea is to intercept requests for critical system operations and to redirect them to some sort of virtual storage; for file system operations in particular, that means a dedicated folder. Say some application we want to sandbox wants to modify a file named C:\Myfile. We must make a copy of this file in a virtual folder, say C:\Sandbox\C\Myfile, and redirect all the operations the application performs on C:\Myfile to its virtual “sibling” C:\Sandbox\C\Myfile. The same is done for registry operations and some other system mechanisms.
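
The path mapping from this example can be sketched in a few lines of user mode C++. This is only an illustration under stated assumptions: the function name ToSandboxPath and the constant kSandboxRoot are hypothetical, the sandbox root is taken from the example above, and only plain "X:\..." paths are handled. A real minifilter would see NT device paths rather than drive letters, so the actual translation would differ.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical sandbox root, taken from the example in the text.
const std::string kSandboxRoot = "C:\\Sandbox";

// Maps an absolute DOS-style path such as C:\Myfile to its sandboxed
// sibling C:\Sandbox\C\Myfile: the drive letter becomes a subfolder.
std::string ToSandboxPath(const std::string& original) {
    // Assumption of this sketch: only "X:\..." paths are supported.
    assert(original.size() >= 3 &&
           std::isalpha(static_cast<unsigned char>(original[0])) &&
           original[1] == ':' && original[2] == '\\');

    std::string result = kSandboxRoot;
    result += '\\';
    // Upper-case the drive letter so c:\ and C:\ share one sandbox folder.
    result += static_cast<char>(
        std::toupper(static_cast<unsigned char>(original[0])));
    // Skip "X:" and keep the rest, including its leading backslash.
    result += original.substr(2);
    return result;
}
```

Keeping the drive letter as a folder name means files from different volumes never collide inside the sandbox folder.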

Let’s summarize what exactly it means to virtualize an operation, starting with file system virtualization. As you should already know, when an application wants to access a file, it first opens it. From the Windows point of view, it calls an API such as CreateFile(). If the call succeeds, the application can read from and write to the file. When the work is done, the file is closed with the CloseHandle() API. So we could intercept CreateFile() and modify its parameters, such as the name of the file the program wants to open. By doing so, we would force the app to open a different file.

But it’s a bad idea to intercept CreateFile() itself, because there are other ways an app can open a file. For example, it could call NtCreateFile(), a native API, which is in fact what CreateFile() calls internally. So if we intercept NtCreateFile(), we also catch every CreateFile() call above it, because it eventually ends up in NtCreateFile() and therefore in our code. But NtCreateFile() is not the lowest level either. So where is the ‘lowest’ CreateFile() equivalent, the one reached by every application that wants to open or create a file? It is inside kernel mode code. All file system operations are handled by file system drivers. Windows supports so-called file system filters, more specifically minifilters, which exist precisely to filter file system operations. So by writing a minifilter driver we can intercept all the file system operations we need. This is our first goal: to intercept file system operations from kernel mode by writing a minifilter driver. By doing so, we can force the OS to open a completely different file; in our case it will be a sibling, a copy of the original one.

An attentive reader will have noticed that copying is a very resource-consuming operation, so there are some optimizations we should apply. First, we can check whether the file is being opened for read-only access. If so, there is no need to make a copy: as long as there will be no modifications to the file, access to the original can be granted. But if the app has already modified the file at some point since it was sandboxed, a sandboxed copy of the file may exist as a result of those modifications. So, in general, we should first check whether a sandboxed sibling of the file exists. If it does, access to the sibling should be provided. Only if there is no virtual sibling may we disable virtualization for this particular read-only create/open request, thus giving access to the original file.
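
The decision rules described above can be condensed into a small sketch. It is plain user mode C++ rather than minifilter code, and every name in it (CreateAction, SandboxIndex, DecideCreateAction) is hypothetical. The set of strings is only a stand-in for checking whether the sibling exists in the sandbox folder; a real driver would query the file system instead.

```cpp
#include <cassert>
#include <set>
#include <string>

// Outcome of inspecting one create/open request.
enum class CreateAction {
    UseOriginal,       // pass the request through untouched
    UseSandboxedCopy,  // redirect to the sibling that already exists
    CopyThenRedirect   // copy the original into the sandbox, then redirect
};

// Stand-in for the sandbox folder: the set of sandboxed paths that exist.
using SandboxIndex = std::set<std::string>;

CreateAction DecideCreateAction(const SandboxIndex& sandbox,
                                const std::string& sandboxedPath,
                                bool wantsWriteAccess) {
    assert(!sandboxedPath.empty());

    // 1. An earlier write may already have produced a sibling. It wins even
    //    for read-only opens, so the app keeps seeing its own changes.
    if (sandbox.count(sandboxedPath) != 0) {
        return CreateAction::UseSandboxedCopy;
    }
    // 2. No sibling and read-only access: the original is safe to hand out.
    if (!wantsWriteAccess) {
        return CreateAction::UseOriginal;
    }
    // 3. No sibling and write access requested: copy before the first write.
    return CreateAction::CopyThenRedirect;
}
```

The order of the checks matters: testing for the sibling first is what prevents a read-only open from returning stale data from the original file.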

As you can see, we don’t need to intercept read or write requests: it is quite enough to intercept the kernel mode equivalent of CreateFile() (OpenFile()) to redirect all work with the file to our virtual folder. But virtualizing the file system is not enough. We should also virtualize the registry and some other OS primitives. For now, though, let’s just focus on file system virtualization.