Some notes on reinstating shadow copies on Windows 8

Most of the time, the thing that people need from a backup system isn’t disaster recovery. It’s accident reversal. Yes, it’s important to have backups stored on separate disks, ideally at separate locations, to handle hardware failures. But that’s not the most common data loss scenario, just the most severe. Far more common is “whoops, I didn’t mean to save over that file”.

Shadow Copies are an excellent solution for this common problem. When Shadow Copies are enabled, Windows periodically creates snapshots of the disk using the Volume Shadow Copy Service (VSS). As long as the snapshot exists, you can get a view of the disk as it existed at the time of the snapshot. For Shadow Copies, the operating system reserves a chunk of disk space (which is configurable) for snapshot storage. Once this disk space is full, snapshots start getting purged, starting with the oldest.

How long a snapshot lasts depends on the amount of disk space you assign to the shadows and the amount of churn your disk experiences. Every modification made after the snapshot causes the snapshot delta file to grow larger, so disks with lots of writes will tend to burn through their snapshots quite quickly. Disks with less write activity will tend to retain a longer history.
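As a rough illustration of that trade-off, the sketch below estimates how many days of history fit in the shadow storage if it fills at a steady rate. The function name and the figures are hypothetical, and the model is deliberately crude: it assumes every byte written displaces a byte of shadow storage, whereas repeated writes to the same part of the disk between snapshots cost less in practice.

```cpp
#include <cstdint>

// Hypothetical helper: a back-of-the-envelope estimate, not how VSS accounts for space.
// Assumes the shadow storage fills at a constant rate.
constexpr std::uint64_t estimate_retention_days(std::uint64_t shadow_storage_bytes,
                                                std::uint64_t churn_bytes_per_day) {
    // With no churn nothing is ever purged; 0 is returned to mean "no estimate".
    return churn_bytes_per_day == 0 ? 0 : shadow_storage_bytes / churn_bytes_per_day;
}
```

With 10 GiB of shadow storage and 512 MiB of changes per day, this works out to about 20 days of history.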

Server versions of Windows, since Windows Server 2003, have had Shadow Copies as a supported feature with a user-visible interface. Shadow Copy configuration is done in Computer Management\System Tools\Shared Folders, where you can configure which volumes get snapshotted, how much storage is assigned to snapshots, and how often the snapshots are made. Shadow Copies are off by default; turn them on, and their schedule is organized around the working week, with two snapshots per day Monday to Friday, at 0700 hrs and 1200 hrs.

Browsing and restoring from Shadow Copies is done through Explorer. The Properties dialog of any folder protected by Shadow Copies has a Previous Versions tab, which allows you to pick an earlier version and restore from it.

Windows Vista and Windows 7 incorporate most of the machinery used to support Shadow Copies, and in these operating systems too, it’s a supported feature. They lack much of the configuration user interface, however. The operating system does make periodic snapshots, but these are primarily created for the purpose of System Restore; a new snapshot is made at midnight each night, by default.

Whether server or workstation, all versions of Windows have essentially the same underlying functionality. The behind-the-scenes components (the Volume Shadow Copy Service, or VSS; the filesystem drivers; the plugins from apps like SQL Server, which make sure that their files are consistent before a snapshot is preserved) are all the same.

This means that Windows 8 is perfectly capable of making periodic snapshots. What it’s lacking is some of the user interface components, both in the GUI and at the command-line. As with the underlying parts, the user interface parts are all there; it’s just that they’re deliberately crippled.

There are perhaps three important command-line tools for manipulating VSS: vssadmin.exe, diskshadow.exe, and vshadow.exe. vssadmin.exe is found on both server and desktop Windows. On the desktop, it can be used for querying VSS (enumerating disks with snapshot storage, stored snapshots, VSS plugins, and so on), deleting shadow copies, and changing the amount of storage dedicated to snapshots. On the server, it gains the all-important ability to create snapshots and revert the disk to a previous version (equivalent to System Restore).

diskshadow.exe is found only on server versions. It’s modelled after tools like diskpart.exe, providing a kind of interactive shell. It gives fairly extensive control over the creation of shadow copies, including the creation of multi-volume snapshots, exposing existing shadows as drive letters or mount points, and certain more advanced tasks. It’s quite useful to have, but it’s not required for basic shadow copy functionality.

vshadow.exe is found in the Windows SDK. It’s notionally a “sample” client for VSS to showcase its features. It is, in practice, a fairly fully-featured front-end to VSS.

Because it’s already found in all versions of Windows, vssadmin.exe is the best place to start with shadow copies. It’s also the program used to actually create shadow copies in server versions of Windows; it’s simply executed as a scheduled task. Although using vshadow.exe would work, making vssadmin.exe do the right thing also means that we don’t need to install the SDK onto systems just to get working shadow copies.

There’s just one small problem with that. As mentioned, its reported capabilities are different on desktop and server. However, a quick check of the executable itself shows that the program is the same regardless of which operating system variant it’s being used with.

On desktop, we see the reduced feature set:

C:\>vssadmin
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2012 Microsoft Corp.

Error: Invalid command.

---- Commands Supported ----

Delete Shadows        - Delete volume shadow copies
List Providers        - List registered volume shadow copy providers
List Shadows          - List existing volume shadow copies
List ShadowStorage    - List volume shadow copy storage associations
List Volumes          - List volumes eligible for shadow copies
List Writers          - List subscribed volume shadow copy writers
Resize ShadowStorage  - Resize a volume shadow copy storage association

On a server:

C:\>vssadmin
vssadmin 1.1 - Volume Shadow Copy Service administrative command-line tool
(C) Copyright 2001-2012 Microsoft Corp.

Error: Invalid command.

---- Commands Supported ----

Add ShadowStorage     - Add a new volume shadow copy storage association
Create Shadow         - Create a new volume shadow copy
Delete Shadows        - Delete volume shadow copies
Delete ShadowStorage  - Delete volume shadow copy storage associations
List Providers        - List registered volume shadow copy providers
List Shadows          - List existing volume shadow copies
List ShadowStorage    - List volume shadow copy storage associations
List Volumes          - List volumes eligible for shadow copies
List Writers          - List subscribed volume shadow copy writers
Resize ShadowStorage  - Resize a volume shadow copy storage association
Revert Shadow         - Revert a volume to a shadow copy
Query Reverts         - Query the progress of in-progress revert operations.

With identical binaries on both systems, it’s clear that it must be doing some kind of runtime operating system detection to figure out which features to offer.

There are a few different ways of figuring out what operating system a Windows program is running on. To find out which one vssadmin.exe uses, I turned to the very useful API Monitor by Rohitab Batra. Sure enough, early in the program’s execution was a call to GetVersionExW(), one of the APIs for figuring out what version of Windows is being used.

I then wanted to verify how the API was being used: which information was read from it, and what was done with that information. To do this, I used Visual Studio 2012. I created a new blank solution, then added an existing project, using the vssadmin.exe executable itself as the project file. This rather obscure operation lets you debug an existing binary. I also made sure to turn on symbol server support, so that public symbols for Windows programs would be downloaded from Microsoft’s servers.

Right click the “project”, choose Debug… Start new instance, and Visual Studio loads the program and stops at the entrypoint. With the program suspended, we can then set a breakpoint on GetVersionExW() to see the context in which the API is called.

The call stack shows that the call is made from a function CVssSKU::Initialize(). This is a fairly short and simple function:

000007F60C71B9FC  push        rbx  
000007F60C71B9FE  sub         rsp,150h  
000007F60C71BA05  mov         rax,qword ptr [__security_cookie (07F60C723110h)]  
000007F60C71BA0C  xor         rax,rsp  
000007F60C71BA0F  mov         qword ptr [rsp+140h],rax  
000007F60C71BA17  xor         ebx,ebx  
000007F60C71BA19  cmp         dword ptr [CVssSKU::ms_bInitialized (07F60C723814h)],ebx  
000007F60C71BA1F  jne         CVssSKU::Initialize+8Dh (07F60C71BA89h)  
000007F60C71BA21  lea         rcx,[rsp+20h]  
000007F60C71BA26  mov         dword ptr [rsp+20h],11Ch  
000007F60C71BA2E  call        qword ptr [__imp_GetVersionExW (07F60C725788h)]  
000007F60C71BA34  mov         al,byte ptr [rsp+13Ah]  
000007F60C71BA3B  lea         r11d,[rbx+1]  
000007F60C71BA3F  cmp         al,2  
000007F60C71BA41  je          CVssSKU::Initialize+59h (07F60C71BA55h)  
000007F60C71BA43  cmp         al,3  
000007F60C71BA45  je          CVssSKU::Initialize+59h (07F60C71BA55h)  
000007F60C71BA47  cmp         al,r11b  
000007F60C71BA4A  sete        bl  
000007F60C71BA4D  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],ebx  
000007F60C71BA53  jmp         CVssSKU::Initialize+86h (07F60C71BA82h)  
000007F60C71BA55  test        byte ptr [rsp+138h],40h  
000007F60C71BA5D  je          CVssSKU::Initialize+75h (07F60C71BA71h)  
000007F60C71BA5F  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],4  
000007F60C71BA69  mov         dword ptr [CVssSKU::ms_bTransportableShadowsAllowed (07F60C72381Ch)],ebx  
000007F60C71BA6F  jmp         CVssSKU::Initialize+86h (07F60C71BA82h)  
000007F60C71BA71  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],2  
000007F60C71BA7B  mov         dword ptr [CVssSKU::ms_bTransportableShadowsAllowed (07F60C72381Ch)],r11d  
000007F60C71BA82  mov         dword ptr [CVssSKU::ms_bInitialized (07F60C723814h)],r11d  
000007F60C71BA89  mov         rcx,qword ptr [rsp+140h]  
000007F60C71BA91  xor         rcx,rsp  
000007F60C71BA94  call        __security_check_cookie (07F60C71D5E0h)  
000007F60C71BA99  add         rsp,150h  
000007F60C71BAA0  pop         rbx  
000007F60C71BAA1  ret  

From here there are a few important details. First, that the structure is on the stack, at an offset of 0x20, and second, that its field dwOSVersionInfoSize is initialized to 0x11c (284 bytes):

000007F60C71BA26  mov         dword ptr [rsp+20h],11Ch  

This means that the program is using the larger, more detailed OSVERSIONINFOEXW structure.

Third, after the call is made (with no check of the return value), one value is read from the structure:

000007F60C71BA34  mov         al,byte ptr [rsp+13Ah]  

This is the field at offset 0x11a within the structure itself, which is:

BYTE  wProductType;

There are three possible values for this field. If it contains 2 or 3, then the Windows version is Domain Controller or Server, respectively. If it contains 1, then it’s workstation/desktop Windows.
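The offsets quoted above can be checked against the documented field order of OSVERSIONINFOEXW. The sketch below uses a hypothetical portable mirror of the structure, with fixed-width types standing in for DWORD, WORD, BYTE and the 16-bit Windows WCHAR; on Windows itself you would use the real structure from Windows.h instead.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical mirror of OSVERSIONINFOEXW, for checking the layout only.
struct os_version_info_ex_mirror {
    std::uint32_t dwOSVersionInfoSize;
    std::uint32_t dwMajorVersion;
    std::uint32_t dwMinorVersion;
    std::uint32_t dwBuildNumber;
    std::uint32_t dwPlatformId;
    char16_t      szCSDVersion[128];  // WCHAR is 16 bits wide on Windows
    std::uint16_t wServicePackMajor;
    std::uint16_t wServicePackMinor;
    std::uint16_t wSuiteMask;
    std::uint8_t  wProductType;
    std::uint8_t  wReserved;
};

// 0x11c is the value the function stores in dwOSVersionInfoSize.
static_assert(sizeof(os_version_info_ex_mirror) == 0x11c, "size of the extended structure");
// The structure starts at rsp+20h, so [rsp+13Ah] is offset 0x11a.
static_assert(offsetof(os_version_info_ex_mirror, wProductType) == 0x11a, "offset of wProductType");
static_assert(offsetof(os_version_info_ex_mirror, wSuiteMask) == 0x118, "offset of wSuiteMask");

// The three documented values of wProductType.
constexpr std::uint8_t product_workstation       = 1;  // VER_NT_WORKSTATION
constexpr std::uint8_t product_domain_controller = 2;  // VER_NT_DOMAIN_CONTROLLER
constexpr std::uint8_t product_server            = 3;  // VER_NT_SERVER
```

If any of the assertions failed, the file would not compile, so a successful build confirms the arithmetic.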

The function treats 2 and 3 identically:

000007F60C71BA3F  cmp         al,2  
000007F60C71BA41  je          CVssSKU::Initialize+59h (07F60C71BA55h)  
000007F60C71BA43  cmp         al,3  
000007F60C71BA45  je          CVssSKU::Initialize+59h (07F60C71BA55h)

If neither test branch is taken, it falls through to the workstation case. We’ll consider that case first. There may be some nuance to this that I don’t understand (I find reading assembly troublesome at the best of times). The line

000007F60C71BA17  xor         ebx,ebx  

zeroes the rbx register (on x64, writing to the 32-bit ebx also clears the upper 32 bits). The line

000007F60C71BA3B  lea         r11d,[rbx+1]  

sets r11 to the value of rbx + 1 (i.e. 1). Why it does it this way I don’t understand. rbx is a preserved register; its value is supposed to be maintained across function calls. That means rbx is still 0 after the call to GetVersionExW(), so the result must be a constant 1. Why not just set r11 to a constant 1? Indeed, why set r11 to anything at all? Why not continue to use rbx?

Anyway, the bottom 8 bits of r11 are then compared with the wProductType:

000007F60C71BA47  cmp         al,r11b  

If they’re equal, then the bottom 8 bits of rbx are set to 1:

000007F60C71BA4A  sete        bl  

This value is then stored for subsequent lookups:

000007F60C71BA4D  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],ebx  

Note how the public symbols are helpful here; it is fairly clear that CVssSKU::ms_eSKU is used as a record of which Windows SKU is being used.

The code then jumps to the common tail of the function

000007F60C71BA53  jmp         CVssSKU::Initialize+86h (07F60C71BA82h)  

which sets a flag to indicate that the initialization has been called

000007F60C71BA82  mov         dword ptr [CVssSKU::ms_bInitialized (07F60C723814h)],r11d  

and then does the stack cookie check and exits

000007F60C71BA89  mov         rcx,qword ptr [rsp+140h]  
000007F60C71BA91  xor         rcx,rsp  
000007F60C71BA94  call        __security_check_cookie (07F60C71D5E0h)  
000007F60C71BA99  add         rsp,150h  
000007F60C71BAA0  pop         rbx  
000007F60C71BAA1  ret  

So that’s what’s happening on desktop Windows. How about the servers?

In that case, a second test is performed.

000007F60C71BA55  test        byte ptr [rsp+138h],40h  

This examines the byte at offset 0x118 in the structure and does a bitwise AND with 0x40. This is the first byte of

  WORD  wSuiteMask;

Because x64 is a little-endian architecture, the first byte of the 16-bit word is the bottom 8 bits. For some reason I cannot fathom, this appears to be a test of whether VER_SUITE_EMBEDDEDNT is specified.

If the result of the bitwise AND is zero then the ZF flag is set, and the je jump is taken. If the branch is not taken (i.e. if it is running on Embedded NT) then

000007F60C71BA5F  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],4  
000007F60C71BA69  mov         dword ptr [CVssSKU::ms_bTransportableShadowsAllowed (07F60C72381Ch)],ebx  

The SKU is set to 4, and the value 0 (remembering that in this branch the rbx register is never modified after being initially zeroed) is stored in the CVssSKU::ms_bTransportableShadowsAllowed variable to denote that transportable shadow copies aren’t permitted.

Transportable shadow copies are shadow copies with extra metadata preserved in Windows Cabinet (.cab) files so that they can be mounted on different systems.

It then jumps to the common tail, as before

000007F60C71BA6F  jmp         CVssSKU::Initialize+86h (07F60C71BA82h)  

Otherwise, for non-embedded NT, we have

000007F60C71BA71  mov         dword ptr [CVssSKU::ms_eSKU (07F60C723818h)],2  
000007F60C71BA7B  mov         dword ptr [CVssSKU::ms_bTransportableShadowsAllowed (07F60C72381Ch)],r11d  

The SKU is set to 2, and transportable shadow copies are permitted.

These various stored values (CVssSKU::ms_eSKU and CVssSKU::ms_bTransportableShadowsAllowed) are then examined at various places in vssadmin.exe. It doesn’t appear to perform any other interrogation or examination of the underlying operating system or its capabilities; this check is it.
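Put back into C++, the decision made by CVssSKU::Initialize() looks roughly like the sketch below. This is a hypothetical reconstruction: the type, function and constant names are invented for illustration, and only the numeric values come from the disassembly. The real function leaves the transportable flag untouched on the non-server path, so the false returned there is an assumption about its initial value.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two static members the real function writes.
struct sku_info {
    int  sku;                    // value stored in CVssSKU::ms_eSKU
    bool transportable_allowed;  // value stored in ms_bTransportableShadowsAllowed
};

constexpr std::uint8_t  product_workstation       = 1;     // VER_NT_WORKSTATION
constexpr std::uint8_t  product_domain_controller = 2;     // VER_NT_DOMAIN_CONTROLLER
constexpr std::uint8_t  product_server            = 3;     // VER_NT_SERVER
constexpr std::uint16_t suite_embedded_nt         = 0x40;  // VER_SUITE_EMBEDDEDNT

constexpr sku_info classify_sku(std::uint8_t product_type, std::uint16_t suite_mask) {
    return (product_type == product_domain_controller || product_type == product_server)
        ? ((suite_mask & suite_embedded_nt) != 0
            ? sku_info{4, false}   // embedded server: no transportable shadows
            : sku_info{2, true})   // ordinary server: the full feature set
        // Not a server: sete yields 1 for a workstation and 0 for anything else.
        : sku_info{product_type == product_workstation ? 1 : 0, false};
}
```

For example, classify_sku(3, 0) gives SKU 2 with transportable shadow copies allowed, while classify_sku(1, 0) gives SKU 1, the reduced desktop feature set.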

To trick vssadmin.exe into giving us the full set of server features, we need only change the value of wProductType, replacing a 1 with a 2 or a 3. The value 3 makes the most sense, as obviously Windows workstations are not domain controllers. If we change the value in the debugger, lo and behold, the full set of options and features is shown in the help text, and the extra, hidden commands spring into life.

Obviously, stopping the program in the debugger and changing values in memory is not particularly convenient. What we really want to do is to hook the call to GetVersionExW() and replace the value automatically.

The standard technique for doing this on Windows is to spawn the process, inject a DLL into the process, and use that DLL to modify functions appropriately. It’s a time-honoured and trusted technique, it’s robust and well-understood, and so it’s what we’ll do here. First things first: DLL injection.

This is actually pretty simple. The outline is:

  1. Start the process in the suspended state.
  2. Allocate some memory within the process to store parameters, with VirtualAllocEx().
  3. Copy the parameters to the memory, with WriteProcessMemory().
  4. Use CreateRemoteThread() to start a thread within the process, with LoadLibrary() as the thread’s function, and the allocated memory as the thread’s parameter.
  5. Do the actual work from the DLL’s DllMain().

We can do this because essentially LoadLibrary() has a compatible type with ThreadProc() (the type of thread functions). The types are

HMODULE WINAPI LoadLibraryW(const wchar_t* lpFileName);

and

DWORD WINAPI ThreadProc(void* lpParameter);

There is one minor difference—the width of the return value (pointer-sized for LoadLibrary(), 32-bit for ThreadProc()) but it’s immaterial, as return values are passed in the rax register anyway, so there’s no practical difference between 32- and 64-bit integers.

The code to do this is pretty simple. It looks something like this:


#define _CRT_SECURE_NO_WARNINGS 1
#define NOMINMAX
#define STRICT

#include <SDKDDKVer.h>
#include <Windows.h>

#include <memory>
#include <cwchar> // std::wcscpy, std::wcsrchr

int main() {
    const wchar_t* vssadmin_path =  L"%systemroot%\\system32\\vssadmin.exe";
    DWORD buffer_size = ::ExpandEnvironmentStringsW(vssadmin_path, nullptr, 0);
    std::unique_ptr<wchar_t[]> vssadmin(new wchar_t[buffer_size]);
    ::ExpandEnvironmentStringsW(vssadmin_path, vssadmin.get(), buffer_size);

    ::STARTUPINFOW si = { sizeof(STARTUPINFOW) };
    ::PROCESS_INFORMATION pi = { 0 };
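    if(!::CreateProcessW(vssadmin.get(), ::GetCommandLineW(), nullptr, nullptr, FALSE, CREATE_SUSPENDED | CREATE_PRESERVE_CODE_AUTHZ_LEVEL | INHERIT_PARENT_AFFINITY, nullptr, nullptr, &si, &pi)) {
        // nothing to inject into if the process could not be started
        return static_cast<int>(::GetLastError());
    }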

    // we want to load the DLL from the directory with the wrapper program in it, not from vssadmin.exe's directory
    // so we need to provide the full path to the DLL
    const wchar_t* dll_base_name = L"patcher.dll";
    wchar_t dll_name[MAX_PATH] = {0}; // HATE
    ::GetModuleFileNameW(nullptr, dll_name, sizeof(dll_name) / sizeof(*dll_name));
    std::wcscpy(std::wcsrchr(dll_name, L'\\') + 1, dll_base_name);

    void* target_memory = ::VirtualAllocEx(pi.hProcess, nullptr, sizeof(dll_name), MEM_COMMIT | MEM_RESERVE, PAGE_READWRITE);
    SIZE_T bytes_written = 0;
    ::WriteProcessMemory(pi.hProcess, target_memory, dll_name, sizeof(dll_name), &bytes_written);
    HANDLE remote_thread = ::CreateRemoteThread(pi.hProcess, nullptr, 0, reinterpret_cast<PTHREAD_START_ROUTINE>(&LoadLibraryW), target_memory, 0, nullptr);
    ::WaitForSingleObject(remote_thread, INFINITE);
    ::CloseHandle(remote_thread);
    ::ResumeThread(pi.hThread);
    ::WaitForSingleObject(pi.hProcess, INFINITE);
    DWORD exit_code = 0;
    ::GetExitCodeProcess(pi.hProcess, &exit_code);
    ::CloseHandle(pi.hThread);
    ::CloseHandle(pi.hProcess);
    return exit_code;
}

That lets us get code running inside the vssadmin.exe process. But what should that code do?

There are two ways of using DLLs in Windows. Although all DLLs are “dynamically linked”, there’s dynamic and there’s dynamic. In the really dynamic case, you look up the name of the function you want to use with GetProcAddress() at runtime. This allows for things like loading DLLs supplied by a user (such as plugins) or tentative loading of functions that may or may not be present.

However, most Windows system DLLs aren’t used this way at all. They use load-time dynamic linking. Each executable embeds a data structure called the import address table (IAT). This table lists the names of all the DLLs that the executable uses, the names of all the functions in each DLL that the executable uses, and the address of each of those functions. On disk, the executable contains only placeholders for the addresses. When it loads the executable, Windows replaces those placeholders with the actual addresses required.

vssadmin.exe's use of GetVersionExW() uses this load-time dynamic linking. The vssadmin.exe executable has an IAT, and that IAT includes the import of the function GetVersionExW() from kernel32.dll.

To hook the function, what our DLL needs to do is to examine vssadmin.exe's IAT, find the specific entry we're interested in, and replace the address with an address of a function that we control.
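Stripped of the PE parsing, the idea can be sketched with a toy table of named function pointers. Everything in this sketch is hypothetical (the types, the function names and the two-entry table), and it needs C++14 or later because the functions are constexpr so that the result can be checked at compile time. The real replacement function would call the original and patch its result rather than return a constant.

```cpp
#include <cstddef>

// Toy model of an import address table; a real IAT lives inside the PE image
// and has to be located by walking the image's headers.
using product_type_fn = int (*)();

struct import_entry {
    const char*     name;     // name of the imported function
    product_type_fn address;  // slot that the loader fills in
};

constexpr bool same_name(const char* a, const char* b) {
    while (*a != '\0' && *a == *b) { ++a; ++b; }
    return *a == *b;
}

// Replaces the named slot and returns the address it used to hold,
// or nullptr if the table has no such import.
constexpr product_type_fn hook_import(import_entry* table, std::size_t count,
                                      const char* name, product_type_fn replacement) {
    for (std::size_t i = 0; i < count; ++i) {
        if (same_name(table[i].name, name)) {
            product_type_fn original = table[i].address;
            table[i].address = replacement;
            return original;
        }
    }
    return nullptr;
}

constexpr int real_product_type()   { return 1; }  // workstation
constexpr int forced_product_type() { return 3; }  // server
constexpr int unrelated_import()    { return 0; }

// Hooks a two-entry table, then calls through the patched slot and the saved original.
constexpr int demo_hook() {
    import_entry table[] = {
        {"GetTickCount", &unrelated_import},
        {"GetVersionExW", &real_product_type},
    };
    product_type_fn original = hook_import(table, 2, "GetVersionExW", &forced_product_type);
    return table[1].address() * 10 + original();
}
```

Here demo_hook() evaluates to 31: the patched slot now reports 3, while the saved original still reports 1.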

We don’t need to go into the finer details of the PE format. The format is documented by Microsoft, and there’s lots of code out there to do this kind of thing. Here’s mine:


#define _CRT_SECURE_NO_WARNINGS 1
#define NOMINMAX
#define STRICT

#include <SDKDDKVer.h>
#include <Windows.h>

#define MakePtr(cast, base, offset) reinterpret_cast<cast>(reinterpret_cast<size_t>(base) + static_cast<size_t>(offset))

PROC hook_iat(HMODULE importing_module, const char* exporting_module, const char* function_name, PROC hooking_proc)
{
    if(!importing_module) {
        return nullptr;
    }

    PROC original_proc = ::GetProcAddress(::GetModuleHandleA(exporting_module), function_name);
    if(!original_proc) {
        return nullptr;
    }

    if(::IsBadCodePtr(hooking_proc)) {
        return nullptr;
    }

    IMAGE_DOS_HEADER* dos_header = reinterpret_cast<IMAGE_DOS_HEADER*>(importing_module);
    if(::IsBadReadPtr(dos_header, sizeof(IMAGE_DOS_HEADER)) || dos_header->e_magic != IMAGE_DOS_SIGNATURE) {
        return nullptr;
    }

    IMAGE_NT_HEADERS* pe_header = MakePtr(IMAGE_NT_HEADERS*, dos_header, dos_header->e_lfanew);
    if(::IsBadReadPtr(pe_header, sizeof(IMAGE_NT_HEADERS)) || pe_header->Signature != IMAGE_NT_SIGNATURE) {
        return nullptr;
    }

    if(pe_header->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress == 0) {
        return nullptr;
    }

    for(IMAGE_IMPORT_DESCRIPTOR* import_descriptor = MakePtr(IMAGE_IMPORT_DESCRIPTOR*, dos_header, pe_header->OptionalHeader.DataDirectory[IMAGE_DIRECTORY_ENTRY_IMPORT].VirtualAddress);
        import_descriptor->Name;
        ++import_descriptor) {
        if(_stricmp(MakePtr(const char*, dos_header, import_descriptor->Name), exporting_module) == 0) {
            for(IMAGE_THUNK_DATA* thunk = MakePtr(IMAGE_THUNK_DATA*, dos_header, import_descriptor->FirstThunk);
                thunk->u1.Function;
                ++thunk) {

                if(thunk->u1.Function == reinterpret_cast<size_t>(original_proc)) {
                    DWORD original_protection = 0;
                    // temporarily make the IAT slot writable, patch it, then restore the old protection
                    if(!::VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function), PAGE_READWRITE, &original_protection)) {
                        return nullptr;
                    }
                    thunk->u1.Function = reinterpret_cast<size_t>(hooking_proc);
                    DWORD ignored = 0;
                    ::VirtualProtect(&thunk->u1.Function, sizeof(thunk->u1.Function), original_protection, &ignored);
                    return original_proc;
                }
            }
        }
    }
    return nullptr; // Function not found
}

typedef BOOL (WINAPI* gve)(OSVERSIONINFOW*);

gve GetVersionExOriginal;

BOOL WINAPI GetVersionExForcedServer(OSVERSIONINFOW* version_info) {
    BOOL return_value = GetVersionExOriginal(version_info);
    switch(version_info->dwOSVersionInfoSize) {
    case sizeof(OSVERSIONINFOW):
        break;
    case sizeof(OSVERSIONINFOEXW): {
            OSVERSIONINFOEXW* version_info_ex = reinterpret_cast<OSVERSIONINFOEXW*>(version_info);
            if(version_info_ex->wProductType == VER_NT_WORKSTATION) {
                version_info_ex->wProductType = VER_NT_SERVER;
            }
        }
        break;
    }
    return return_value;
}

BOOL APIENTRY DllMain(HMODULE module, DWORD reason, void* reserved) {
    switch(reason) {
    case DLL_PROCESS_ATTACH: {
            GetVersionExOriginal = reinterpret_cast<gve>(hook_iat(::GetModuleHandleW(nullptr), "kernel32.dll", "GetVersionExW", reinterpret_cast<PROC>(&GetVersionExForcedServer)));
        }
        break;
    case DLL_THREAD_ATTACH:
    case DLL_THREAD_DETACH:
    case DLL_PROCESS_DETACH: {
            
        }
        break;
    }
    return TRUE;
}

Compile that all and run the resulting program from an elevated command prompt. It’ll give you the full set of vssadmin.exe options, including the important create shadow command used to actually create shadow copy snapshots.

From here it’s just a matter of creating a scheduled task to run the wrapper program periodically.

David Chartier HD mini lite (Pro): Since I may have created some confusion over how I feel about Tumblr’s...

chartier:

Since I may have created some confusion over how I feel about Tumblr’s ads: I applaud Tumblr’s attempt at bringing in some more revenue, and I apologize for misunderstanding William Wright’s response earlier.

Paid themes and highlighted posts are some of my favorite attempts by any…

What I find interesting about Tumblr’s advertisements is that they are advertising solely to Tumblr users, and not the (presumably) much larger number of Tumblog readers. It means that the blogs themselves avoid looking like Geocities (unless you choose a theme that looks like Geocities).

Promoted posts are similar in their target.

Fisking hell.

I love a good fisking. Sadly, this post by Nadim Kobeissi was not a good fisking of my article.

Now, much to my distaste, I shall fisk through Mr. Bright’s article, a true labyrinth of misinterpretations, inaccuracies and horrible, headache-inducing journalism. First, regarding above claim that the data is anonymized, the whole point of my research was to show you that it isn’t. It is not. IP addresses are communicated to Microsoft in the clear.

That IP addresses are known to both endpoints of an IP connection is not noteworthy. SmartScreen is one of many, many Windows services that require IP connectivity to Redmond’s servers. Windows Update, Windows Activation, the Microsoft Store, crash reporting, CEIP and many more all require occasional connections to servers controlled by the software giant.

Kobeissi’s articles both equivocate somewhat between “users” and “IP addresses”, which is not entirely accurate, but perhaps close enough. The value of an IP address varies from person to person. In the olden days of widespread dial-up, end-user IP addresses could be expected to vary daily, if not more often, due to the vagaries of dial-up access. Thanks to the broadband revolution, this is less often the case; IP addresses can identify single households or offices for long periods of time. For those with static addresses, they can do so more or less indefinitely. This is not quite a per-user identification,

If this information leakage offends you, SmartScreen is neither the only offender, nor is it the first—and if it offends you, then you are probably not using Windows in any case, due to the many ways in which Microsoft can see your IP address.

That this communication is “in the clear” is similarly unexceptional. Every IP address is communicated “in the clear”; to do otherwise requires some kind of IP-in-IP encapsulation (e.g. IPsec ESP tunnel mode), though even here, the “outer” IP address is still in the clear, for obvious reasons.

Kobeissi says that the server Microsoft sends the information to supports the SSLv2 protocol, which is known to be insecure.

Yes, However, far before Mr. Bright wrote his article, I updated my research with the following new finding:

Update 3: Approximately 14 hours after this article was published, another scan of Microsoft’s SmartScreen servers reveals that they have been reconfigured to no longer support SSLv2. The servers now only support SSLv3 connections.

This is unfortunate. I wrote my article before Kobeissi updated his post. It wasn’t published, however, until the following day. These things happen.

This update went completely ignored by Mr. Bright, who brings up SSLv2 again and again in his article, much to my sadness. He later goes on to state that my research was about security risks (No, it’s about privacy risks with only a few notes on security) and that focusing on SSLv2 is unwarranted.

  1. I mention SSLv2 on four occasions. First in a brief description of Kobeissi’s findings; second in a description of the alleged privacy problem; third in explaining why server-side support is not a concern; fourth in Microsoft’s statement. Is this truly “again and again”?

  2. I use the term “security risk” on one occasion. I stand by that usage, as Kobeissi positioned his article as a “security” piece at least as much as he positioned it as a privacy one. The SSL issues, in particular, are security problems as much as they are privacy problems. The original article complains that Microsoft transmits the data “Not Very Securely”, and Kobeissi says that he was “tinkering around from a security/privacy perspective”. He repeats security concerns twice; “The Microsoft server is configured to support SSLv2 which is known to be insecure and susceptible to interception.” and again “Windows 8 appears to send this information to Microsoft to a server that relies on Certificate Authorities for authentication and supports an outdated and insecure method of encrypted communication.”

  3. I do not “focus” on SSLv2. The largest part of the article is about SmartScreen; its history, its purpose, a few pieces about its implementation, and its configuration. I do, however, address the matter of SSLv2, because Kobeissi’s initial research raised it as a concern.

My research was about how Microsoft is making itself an omniscient and single point of data collection regarding what every Windows 8 computer is downloading and installing, and that this is very dangerous from a privacy perspective. In actuality, I’ve found Microsoft’s SmartScreen servers to be vulnerable to the BEAST attack (In retrospect, I don’t think this is the case.)

This is a remarkable statement. Although Kobeissi added the parenthetical after a comment was posted, he initially claimed that SmartScreen was vulnerable to the BEAST attack. The BEAST attack against SSLv3 and TLS 1.0 requires the use of a CBC-mode cipher suite, and Microsoft’s servers do support one (indeed, they use the AES-CBC algorithm in preference to the BEAST-proof RC4 algorithm, even when the client supports RC4), but it also requires the ability for a hostile party to inject adaptive chosen plaintext into the encrypted stream in order to recover data from it. This is possible with, for example, Java applets (and WebSockets code written against older versions of the WebSockets specification), but it isn’t possible with the essentially hard-coded SmartScreen check.

On one level, that’s OK. Kobeissi made a mistake (or perhaps had not studied BEAST or previous literature in any great detail) and when the error was pointed out he updated his post accordingly.

On a deeper level, however, it’s problematic. At the very least, this indicates poor attention to detail on Kobeissi’s part: the fact that he has heard of BEAST and knows that SSLv3 and TLS 1.0 are susceptible is enough for him to mention it, regardless of its relevance. But it does more than that. He does not simply say that there may be a possibility of a BEAST attack. No, he states unambiguously “In actuality, I’ve found Microsoft’s SmartScreen servers to be vulnerable to the BEAST attack”. In actuality, he has found no such thing.

But lo and behold, Mr. Bright brings up SSLv2 yet again:

There are some technical problems with Kobeissi’s complaint. Although he says that the server supports SSLv2, that is only part of the story.

Here my state of frustration at Peter Bright’s lacking in the journalistic faculties morphs into a state of boyish wonder; Could it be? Has Peter Bright exceeded the natural limits of misinformed, under-researched journalism? Has he set a new standard? For the entire Internet, perhaps? Am I witnessing history?

Or perhaps, he is witnessing an article written before he updated his post.

This still means that Microsoft could determine which programs individual IP addresses are using.

But lo, a ray of light! A sign of redemption; Peter Bright might actually focus on what I’m trying to say after all!

Here Kobeissi attempts to change history. His original post goes far beyond saying “Microsoft could log IP addresses”. He claims, for example, that “The user is not informed [of SmartScreen’s behaviour] while installing and setting up Windows 8”. This is false. He claims, still, even after his later SSLv2 update, “Windows 8 appears to send this information to Microsoft to a server that […] supports an outdated and insecure method of encrypted communication.” He suggests that “SmartScreen is not easy to disable”, which is farcical. He might wish his original claim had been as narrow as “Microsoft can correlate executable hashes to IP addresses” but it was not.

When asked for comment, a Microsoft spokeswoman told us:

“We can confirm that we are not building a historical database of program and user IP data. Like all online services, IP addresses are necessary to connect to our service, but we periodically delete them from our logs. As our privacy statements indicate, we take steps to protect our users’ privacy on the backend. We don’t use this data to identify, contact or target advertising to our users and we don’t share it with third parties.”

The company has also talked in the past about the privacy implications of earlier iteration of SmartScreen. Although Microsoft does collect some data (for example, it distinguishes between popular downloads and unpopular downloads, as part of its application reputation feature), that same data is also anonymized.

As such, the privacy risk here is minimal.

Peter Bright, a tech journalist, just came to the conclusion that a privacy risk was minimal because the corporation being accused of the privacy risk asked him to please trust them; they swear it’s minimal.

My conclusion is that it’s minimal because Microsoft does not, contrary to common belief, set out to deliberately open itself up to legal liability. The company has clear privacy policies in place, and operates in a regulatory climate that is hostile to privacy breaches, whether perceived or real (especially in the EU). Creating such a database would plainly violate the terms of Microsoft’s own privacy policy and as such open up the company to considerable legal liability. The downsides are obvious.

The upsides, however, are not. No advantage to Microsoft of building a persistent database cross-referencing IP addresses to applications is immediately apparent; nor is any such advantage described in Kobeissi’s post.

My reaction to just how ready Mr. Bright is to dismiss my body of evidence that Microsoft could, at any time, record what every default Windows 8 configuration is installing because a Microsoft spokesperson told him to was first this, this, then this, then realizing how painfully backwards parts of the Internet can be, coming to peace with it, and writing this article where I nicely explain to Peter Bright why he sucks at being a tech journalist.

My heart, it is breaking.

It might be unfashionable, but although I am happy to regard corporations as essentially amoral entities, it’s simply not enough to say “Well, they could do something bad” and regard that as reason enough to condemn their actions. There needs to be some evidence of some degree of wrong-doing first. Such evidence is entirely lacking from his analysis.

E&OE.

How do I fix the white balance in this stupid thing? (Taken with Instagram)
