Writing a decent win32 keylogger [2/3]

Written by Martin Balc'h - 21/12/2023 - in Tools, System

In this series of articles, we talk about the ins and outs of building a keylogger for Windows that is able to support all keyboard layouts and reconstruct Unicode characters correctly regardless of the language (excluding those using input method editors).

In the first part, after a brief introduction to the concepts of scan codes, virtual keys, characters and glyphs, we describe three different ways to capture keystrokes (GetKeyState, SetWindowsHookEx, GetRawInputData) and the differences between those techniques.

In the second part, we detail how Windows stores keyboard layout information in the kbd*.dll files and how to parse them.

In the third and last part, we go through the process of using the extracted information to convert scan codes into characters and all the tricky cases presented by different layouts, such as ligatures, dead keys, extra shift-states and SGCAPS.
Finally, we present our methodology to test and validate our reconstruction: a testing tool automates the injection of scan codes and retrieves the reference text produced by Windows, which we then compare with our reconstructed text.


In the previous article, we saw a few different techniques to capture key-presses on Windows. In this article, we explain how Windows translates scan-codes into characters and how we can parse keyboard layout DLLs to extract the data required to emulate that process.

Translating to characters

Now that we have seen a few ways to retrieve key presses and context information, let’s find out how Windows converts scan codes first to virtual keys, and then to characters. You can find information on Windows' keyboard input model here.

Here is a simplified overview of the process:

[Figure: overview of the Windows keyboard input model]

As we saw earlier, the activated input language will select a keyboard layout. Windows supports more than a hundred different layouts out of the box, with the option to create or import more. For each layout you will find a DLL whose name starts with ‘KBD’ located in C:\Windows\System32\, such as KBDFR.DLL, KBDUS.DLL, and so on.

You can find the list of usable keyboard layouts by enumerating the following registry key:

HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Keyboard Layouts

Those DLLs contain everything Windows needs to convert scan codes (from the keyboard) into characters (displayed inside application windows). We will now describe their contents and how you can parse them into a more readily usable intermediate format.

KBD*.DLL structure

While keyboard layout DLLs can export two functions, only one of them is really useful for our purposes, and we will not cover the second (KbdNlsLayerDescriptor) in this article. If you want to read more about it, you can go here for the definitions and here for a reference implementation.

The only function of interest to us is KbdLayerDescriptor which is defined like this:

PKBDTABLES KbdLayerDescriptor(VOID);

By calling it, you get a pointer to a KBDTABLES structure, which is defined as:

typedef struct tagKbdLayer {
    // shift states modifiers info (shift, control, alt, alt-gr, etc.)
    PMODIFIERS pCharModifiers;

    // virtual keys to character conversion tables
    PVK_TO_WCHAR_TABLE pVkToWcharTable;

    // list of supported diacritics for this layout (aka dead-keys)
    PDEADKEY pDeadKey;

    // names of keys (eg: RETURN => ENTRÉE)
    PVSC_LPWSTR pKeyNames;
    PVSC_LPWSTR pKeyNamesExt;
    WCHAR *KBD_LONG_POINTER *KBD_LONG_POINTER pKeyNamesDead;

    // scan code to virtual keys conversion
    USHORT  *KBD_LONG_POINTER pusVSCtoVK;
    BYTE    bMaxVSCtoVK;
    PVSC_VK pVSCtoVK_E0;
    PVSC_VK pVSCtoVK_E1;

    // locale specific flags (ALT-GR, Left to right, ...)
    DWORD fLocaleFlags;

    // ligatures
    BYTE       nLgMax;
    BYTE       cbLgEntry;
    PLIGATURE1 pLigature;

    // type & subtype
    DWORD      dwType;
    DWORD      dwSubType;

} KBDTABLES, * PKBDTABLES;

Exploring the KBDTABLES structure

In order to understand the contents of this structure, we wrote a tool to dump the DLLs to JSON files which will be easier to work with afterwards.

You can also check out the XML files generated by kbdlayout.info for each keyboard layout. The “XML internal tables” file is basically an XML representation of the data contained in the keyboard layout DLLs. The “XML for processing” files are a higher level view of the data, in a more structured, easier to use format. Those tables were very useful to us to make sure our extraction process was correct, thanks Jan!

You can find the full source code of the tool here.

The program is written in C++ (C with 1 class to be truthful ;p) and uses the JSON library by Niels Lohmann which is very useful, powerful and easy to use (<3 single header libs).

Loading the dll & getting the pointer

Let’s start by loading the DLL, then retrieve a pointer to KbdLayerDescriptor and call it to get our KBDTABLES pointer.

// load FR keyboard layout
const char * dll = "KBDFR";
HMODULE hmod = LoadLibraryA(dll);
if(!hmod)
    return 1;   // handle error: layout DLL not found (see GetLastError())

// find the exported function KbdLayerDescriptor
FARPROC func = GetProcAddress(hmod, "KbdLayerDescriptor");
if(!func)
    return 1;   // handle error: not a keyboard layout DLL

// cast the function pointer and call it
PKBDTABLES kbd = ((PKBDTABLES(*)())func)();

Keyboard layout locale flags

Now that we have our kbd pointer, we can start to extract information. We start by parsing the keyboard layout locale flags.

// create a json object we will fill with our extracted data
j = json({});
// parse flags
j["flag_altgr"]     = kbd->fLocaleFlags & KLLF_ALTGR       ? 1 : 0;
j["flag_shiftlock"] = kbd->fLocaleFlags & KLLF_SHIFTLOCK   ? 1 : 0;
j["flag_ltr"]       = kbd->fLocaleFlags & KLLF_LRM_RLM     ? 1 : 0;

There are 3 different locale flags:

0x1: KLLF_ALTGR, if set, indicates that for this keyboard layout, the right hand ALT key should be handled as CONTROL + ALT
0x2: KLLF_SHIFTLOCK, unused but if set, indicates that pressing the SHIFT key will reset the status of the CAPSLOCK key
0x4: KLLF_LRM_RLM, only used for keyboard layouts with right-to-left scripts, inserts left-to-right marker (LRM) and right-to-left marker (RLM) on specific key presses (left/right shift/control and backspace combinations)

Parsing shift states & modifiers

Some layouts have more modifier keys than the common SHIFT, CONTROL, ALT and the relatively common ALT-GR. For instance, Japanese keyboards use a dedicated KANA key and the Canadian Multilingual layout uses the right control key as an extra modifier key. This information therefore has to be stored in the keyboard layout. Here are the relevant structures:

typedef struct {
    BYTE Vk;
    BYTE ModBits;
} VK_TO_BIT, *PVK_TO_BIT;

typedef struct {
    PVK_TO_BIT pVkToBit;
    WORD       wMaxModBits;
    BYTE       ModNumber[];
} MODIFIERS, *PMODIFIERS;

The KBDTABLES structure contains a pointer to this MODIFIERS struct which contains:

a list (pVkToBit) of virtual keys which act as modifiers, with an associated modifier bit value
a list (ModNumber) of wMaxModBits + 1 values which map an input modifier bit-field to a column index (the shift state)

Here are example values taken from the French keyboard layout:

MODIFIERS mods_fr = {
    pVkToBit: [
        { Vk=16, ModBits=1 },                   // VK_SHIFT
        { Vk=17, ModBits=2 },                   // VK_CONTROL
        { Vk=18, ModBits=4 },                   // VK_MENU (ALT)
    ],
    wMaxModBits:    6,
    ModNumber:      [0, 1, 2, 4, 15, 15, 3],
}

Now let us see what the ModNumber list means:

Modifier	Mod value	ModNumber	Comment
no modifier	0x0	0
shift	0x1	1
control	0x2	2
shift+control	0x3	4
alt	0x4	15	no valid combo with ALT only
alt+shift	0x5	15	no valid combo with ALT+SHIFT only
alt+control	0x6	3	ALT-GR = ALT + CONTROL

So for the legacy AZERTY French keyboard layout, three modifiers (shift, control and alt-gr) can apply to a single key, and the column order in the virtual key to character tables (described later) will be:

0 = no modifier
1 = shift
2 = control
3 = alt-gr
4 = shift + control
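
To make the mapping concrete, here is a minimal, portable sketch of the lookup, not the code of our tool. The `VkToBit` and `Modifiers` types, `kFrench` and `shiftStateColumn` are hypothetical stand-ins that mirror the structures above with standard C++ types, so the example builds without the Windows headers. The values are the French ones quoted above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Portable stand-ins for VK_TO_BIT / MODIFIERS (hypothetical names).
struct VkToBit { uint8_t vk; uint8_t modBits; };

struct Modifiers {
    std::vector<VkToBit> vkToBit;    // modifier keys and their bit
    uint16_t maxModBits;             // highest valid index into modNumber
    std::vector<uint8_t> modNumber;  // modifier bit-field -> column index
};

constexpr uint8_t kInvalidColumn = 15;  // "no valid combo" marker from the table

// French values as quoted in the article.
const Modifiers kFrench = {
    { {16, 1}, {17, 2}, {18, 4} },      // VK_SHIFT, VK_CONTROL, VK_MENU
    6,
    { 0, 1, 2, 4, 15, 15, 3 },
};

// OR together the bits of every pressed modifier key, then map the resulting
// bit-field to a column of the virtual key to character tables.
uint8_t shiftStateColumn(const Modifiers& mods, const std::vector<uint8_t>& pressedVks)
{
    assert(mods.modNumber.size() == static_cast<std::size_t>(mods.maxModBits) + 1);

    unsigned bits = 0;
    for (uint8_t vk : pressedVks)
        for (const VkToBit& entry : mods.vkToBit)
            if (entry.vk == vk)
                bits |= entry.modBits;

    if (bits > mods.maxModBits)
        return kInvalidColumn;          // combination not described by the layout
    return mods.modNumber[bits];
}
```

With these values, holding CONTROL and ALT (bit-field 0x6) selects column 3, while ALT alone (0x4) yields the invalid marker 15.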

Now this is the code we wrote to dump those shift states to JSON:

// shift states & modifiers
j["shiftstates"] = json::array();
for(int i=0; i<=kbd->pCharModifiers->wMaxModBits; i++)
    j["shiftstates"].push_back(kbd->pCharModifiers->ModNumber[i]);

j["modifiers"] = json::array();
for(int i=0; kbd->pCharModifiers->pVkToBit[i].Vk; i++)
{
    json o = json();
    o["modbits"] = kbd->pCharModifiers->pVkToBit[i].ModBits;
    o["vk"] = kbd->pCharModifiers->pVkToBit[i].Vk;
    o["vkn"] = VKN(kbd->pCharModifiers->pVkToBit[i].Vk);
    j["modifiers"].push_back(o);
}

The VKN macro returns the ASCII string representation of the virtual key name (see vk_names.h and vk_names.py in the repo).

Parsing VkToWcharTable

Now things get a little more dicey. The VkToWcharTable structure pointed to in KBDTABLES is defined like this:

typedef struct tagKbdLayer {
    ...
    PVK_TO_WCHAR_TABLE pVkToWcharTable;
    ...
} KBDTABLES, * PKBDTABLES;

typedef struct _VK_TO_WCHAR_TABLE {
    PVK_TO_WCHARS1 pVkToWchars;
    BYTE           nModifications;
    BYTE           cbSize;
} VK_TO_WCHAR_TABLE, *PVK_TO_WCHAR_TABLE;

This refers to PVK_TO_WCHARS1, a pointer to a structure defined by a macro:

#define TYPEDEF_VK_TO_WCHARS(n) typedef struct _VK_TO_WCHARS##n { \
    BYTE  VirtualKey; \
    BYTE  Attributes; \
    WCHAR wch[n]; \
} VK_TO_WCHARS##n, *PVK_TO_WCHARS##n;

The macro is invoked ten times:

TYPEDEF_VK_TO_WCHARS(1)
TYPEDEF_VK_TO_WCHARS(2)
TYPEDEF_VK_TO_WCHARS(3)
TYPEDEF_VK_TO_WCHARS(4)
TYPEDEF_VK_TO_WCHARS(5)
TYPEDEF_VK_TO_WCHARS(6)
TYPEDEF_VK_TO_WCHARS(7)
TYPEDEF_VK_TO_WCHARS(8)
TYPEDEF_VK_TO_WCHARS(9)
TYPEDEF_VK_TO_WCHARS(10)

This defines 10 almost identical structures, named VK_TO_WCHARS1, VK_TO_WCHARS2, … up to VK_TO_WCHARS10, the only difference between them being the size of the wch buffer. Those structures contain:

VirtualKey: a virtual key
Attributes: a set of flags for this entry of the conversion table from virtual key to character
wch: a list of characters that can be output when this virtual key is pressed (based upon the current shift state)

So to sum up, we have a pointer to an array of VK_TO_WCHAR_TABLE structures, each of which contains:

a pointer to VK_TO_WCHARS structures (of a specific size)
the number of shift-state columns present in each entry (nModifications)
the size in bytes of one entry, i.e. the offset to the next entry in the VK_TO_WCHARS array (cbSize)
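
The relation between the number of characters per entry and cbSize can be sketched as follows. This is an illustration under assumptions rather than code from our tool: `VkToWchars`, `entrySize` and `countEntries` are hypothetical names, and `char16_t` stands in for the 16-bit WCHAR so the example builds without the Windows headers.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Portable mirror of the macro-generated VK_TO_WCHARSn structures
// (hypothetical name; char16_t stands in for the 16-bit WCHAR).
template <std::size_t N>
struct VkToWchars {
    uint8_t  virtualKey;
    uint8_t  attributes;
    char16_t wch[N];
};

// Expected value of cbSize for a table whose entries hold n characters.
constexpr std::size_t entrySize(std::size_t n)
{
    return 2 * sizeof(uint8_t) + n * sizeof(char16_t);
}

// Count the entries of a packed table by stepping cbSize bytes at a time
// until the terminating entry (VirtualKey == 0) is reached.
std::size_t countEntries(const uint8_t* table, std::size_t cbSize)
{
    assert(table != nullptr && cbSize >= entrySize(1));
    std::size_t count = 0;
    for (const uint8_t* p = table; p[0] != 0; p += cbSize)
        ++count;
    return count;
}

// The two BYTE members pack next to each other, so no padding is expected.
static_assert(sizeof(VkToWchars<1>)  == entrySize(1),  "unexpected padding");
static_assert(sizeof(VkToWchars<10>) == entrySize(10), "unexpected padding");
```

Because only the stride changes between the ten structure variants, a single walking routine parameterised by cbSize is enough to traverse any of them.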

Here is our code to dump the tables to JSON. Note how we cheat by using only VK_TO_WCHARS10 pointers and move to the next entry by recasting the pointer at the right address, instead of using the proper pointer type for each size (which would complicate the code).

j["vk_to_wchars"] = json::array();
for(int i=0; kbd->pVkToWcharTable[i].cbSize; i++)
{
    json o = json();
    o["index"] = i+1;
    o["num_mods"] = kbd->pVkToWcharTable[i].nModifications;
    o["table"] = json::array();

    PVK_TO_WCHARS10 pvk2wch = (PVK_TO_WCHARS10)kbd->pVkToWcharTable[i].pVkToWchars;
    while(pvk2wch->VirtualKey)
    {
        json it = json();
        it["vk"] = pvk2wch->VirtualKey;
        it["vkn"] = VKN(pvk2wch->VirtualKey);
        it["attrs"] = pvk2wch->Attributes;
        it["wch"] = json::array();

        // k indexes the shift-state column (named k to avoid shadowing the json object j)
        for(int k=0; k<kbd->pVkToWcharTable[i].nModifications; k++)
            it["wch"].push_back(pvk2wch->wch[k]);

        pvk2wch = (PVK_TO_WCHARS10)((char*)pvk2wch + kbd->pVkToWcharTable[i].cbSize);
        o["table"].push_back(it);
    }
    j["vk_to_wchars"].push_back(o);
}

Parsing VSCtoVK

Now let us talk about the data structures that allow us to convert scan codes into virtual keys. The first one is pusVSCtoVK, which is just a pointer to an array of USHORT, accompanied by bMaxVSCtoVK, which gives us the number of items in the array. The index is the scan code, and the USHORT value is the virtual key (low byte) OR'ed with optional flags (high byte).

The parsing code is straightforward:

j["vsc_to_vk"] = json::array();
USHORT * vvk = kbd->pusVSCtoVK;
if(vvk)
{
    for(int i=0; i<kbd->bMaxVSCtoVK; i++)
    {
        json o = json();
        o["sc"] = i;
        o["vk"] = vvk[i] & 0xff;            // mask out the virtual key flags
        o["vkn"] = VKN(vvk[i] & 0xff);      // mask out the virtual key flags
        o["flags"] = vkftos(vvk[i]);        // utility function to convert the flags to json
        j["vsc_to_vk"].push_back(o);
    }
}

With our function vkftos defined like this:

json vkftos(int vk)
{
    json o = json::array();
    if(vk & KBDEXT)         o.push_back("KBDEXT");
    if(vk & KBDMULTIVK)     o.push_back("KBDMULTIVK");
    if(vk & KBDSPECIAL)     o.push_back("KBDSPECIAL");
    if(vk & KBDNUMPAD)      o.push_back("KBDNUMPAD");
    if(vk & KBDUNICODE)     o.push_back("KBDUNICODE");
    if(vk & KBDINJECTEDVK)  o.push_back("KBDINJECTEDVK");
    if(vk & KBDMAPPEDVK)    o.push_back("KBDMAPPEDVK");
    if(vk & KBDBREAK)       o.push_back("KBDBREAK");
    return o;
}

All flag values and defines can be found here.

Now there are two more tables (pVSCtoVK_E0 and pVSCtoVK_E1) which map extended scan codes to virtual keys. Both tables work the same:

typedef struct tagKbdLayer {
    ...
    PVSC_VK pVSCtoVK_E0;
    PVSC_VK pVSCtoVK_E1;
    ...
} KBDTABLES, * PKBDTABLES;

typedef struct _VSC_VK {
    BYTE Vsc;
    USHORT Vk;
} VSC_VK, *PVSC_VK;

So we have an array of VSC_VK structures with two members, one for the virtual scan code and one for the associated virtual key. You have to keep reading items from the list until you reach an entry whose virtual scan code is zero.
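
As an illustration of how such a zero-terminated table can be searched, here is a small portable sketch. `VscVk`, `lookupExtended` and `kSampleE0` are hypothetical names, the two sample entries are only there for the example, and the KBDEXT value is assumed to match kbd.h.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for VSC_VK (hypothetical name).
struct VscVk { uint8_t vsc; uint16_t vk; };

constexpr uint16_t kKbdExt = 0x0100;  // assumed value of KBDEXT in kbd.h

// Find the entry for an extended scan code in a table terminated by an entry
// whose scan code is zero. Returns 0 when the scan code is not mapped.
uint16_t lookupExtended(const VscVk* table, uint8_t scanCode)
{
    assert(scanCode != 0);            // zero is reserved for the terminator
    for (const VscVk* e = table; e != nullptr && e->vsc != 0; ++e)
        if (e->vsc == scanCode)
            return e->vk;             // virtual key in the low byte, flags above
    return 0;
}

// Hypothetical two-entry E0 table used only for illustration.
const VscVk kSampleE0[] = {
    { 0x1D, static_cast<uint16_t>(0xA3 | kKbdExt) },  // right CONTROL
    { 0x38, static_cast<uint16_t>(0xA5 | kKbdExt) },  // right ALT
    { 0, 0 },                                         // terminator
};
```

The caller still has to split the returned value: `value & 0xff` gives the virtual key and the remaining bits are the flags.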

Here is our parsing code:

j["vsc_to_vk_e0"] = json::array();
PVSC_VK vv0 = kbd->pVSCtoVK_E0;
for(int i=0; vv0 && vv0[i].Vsc; i++)
{
    json o = json();
    o["sc"] = vv0[i].Vsc;
    o["vk"] = vv0[i].Vk & 0xff;
    o["vkn"] = VKN(vv0[i].Vk & 0xff);
    o["flags"] = vkftos(vv0[i].Vk);
    j["vsc_to_vk_e0"].push_back(o);
}

j["vsc_to_vk_e1"] = json::array();
PVSC_VK vv1 = kbd->pVSCtoVK_E1;
for(int i=0; vv1 && vv1[i].Vsc; i++)
{
    json o = json();
    o["sc"] = vv1[i].Vsc;
    o["vk"] = vv1[i].Vk & 0xff;
    o["vkn"] = VKN(vv1[i].Vk & 0xff);
    o["flags"] = vkftos(vv1[i].Vk);
    j["vsc_to_vk_e1"].push_back(o);
}

Parsing dead keys

An interesting feature that is not used by all keyboard layouts is the support of “dead keys”. A dead key is a key that will not output a character when pressed but will instead wait for the next key press to output one or more characters to the screen. One such example is the ^ key (circumflex accent) on a French keyboard:

first key	second key	output
^	e	ê
^	i	î
^	' ' (space)	^
^	p	^p

Such information is stored in the PDEADKEY pDeadKey variable whose structure is pretty simple:

typedef struct {
    DWORD  dwBoth;
    WCHAR  wchComposed;
    USHORT uFlags;
} DEADKEY, *PDEADKEY;

For each of those entries, the upper 16 bits of the dwBoth variable represent the 1st character (the ‘dead’ character) and the lower 16 bits will be the character that the dead key can be combined with (for example, the letter E). The wchComposed variable is the combination of both those characters (in our example ê). This table will contain the list of all valid combinations:

// list of all valid dead characters for fr_FR legacy AZERTY layout
âêîôûÂÊÎÔÛ^äëïöüÿÄËÏÖÜ¨ãÃñÑõÕ~àèìòùÀÈÌÒÙ`
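
The packing of dwBoth and the resulting lookup can be sketched like this. It is a minimal portable illustration, not our tool's code: `DeadKey`, `packBoth`, `compose` and the two-entry `kCircumflex` excerpt are hypothetical, and `char16_t` stands in for WCHAR.

```cpp
#include <cassert>
#include <cstdint>

// Portable stand-in for DEADKEY (hypothetical name).
struct DeadKey { uint32_t both; char16_t composed; uint16_t flags; };

// Pack the dead character (upper 16 bits) and the base character (lower 16 bits).
constexpr uint32_t packBoth(char16_t dead, char16_t base)
{
    return (static_cast<uint32_t>(dead) << 16) | base;
}

// Illustrative excerpt for the circumflex dead key; the list ends with a zero entry.
const DeadKey kCircumflex[] = {
    { packBoth(u'^', u'e'), u'\u00EA', 0 },  // ^ + e -> ê
    { packBoth(u'^', u'i'), u'\u00EE', 0 },  // ^ + i -> î
    { 0, 0, 0 },
};

// Return the composed character, or 0 when the pair is not a valid combination.
char16_t compose(const DeadKey* table, char16_t dead, char16_t base)
{
    assert(dead != 0);
    const uint32_t wanted = packBoth(dead, base);
    for (const DeadKey* d = table; d != nullptr && d->both != 0; ++d)
        if (d->both == wanted)
            return d->composed;
    return 0;
}
```

When `compose` returns 0, as for ^ followed by p, the pair has no entry in the table and both characters are output one after the other.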

We can parse all that information like this:

j["deadkeys"] = json::array();
PDEADKEY pd = kbd->pDeadKey;
for(int i=0; pd && pd[i].dwBoth != 0; i++)
{
    json o = json();
    o["vk1"] = pd[i].dwBoth >> 16;
    o["vk2"] = pd[i].dwBoth & 0xffff;
    o["combined"] = pd[i].wchComposed;
    o["flags"] = pd[i].uFlags;
    j["deadkeys"].push_back(o);
}

Parsing ligatures

The only thing left to parse in order to fully emulate Windows' translation of key presses into characters is ligatures. A ligature is the representation of two or more characters as a single glyph. The word “cœur” (which means “heart”) contains such a ligature: the characters o and e are merged into a single glyph œ. Funnily enough, we can’t type that word with a standard French keyboard 💔. As an additional note, TTF fonts have support for ligatures, and can sometimes automatically handle such cases to display the proper joined character without requiring the input text to use the specific Unicode code points for the ligature characters.

Now let us see an example from a keyboard layout that supports ligatures: Arabic. When you press the B key, the output will be ﻻ, which is the combination of ل (Arabic letter LAM) and ا (Arabic letter ALEF). If you were to press backspace just after pressing the B key, you would remove only the ALEF character, and the LAM character would remain.

Here are the relevant data structures in the KBDTABLES struct:

typedef struct tagKbdLayer {
    ...
    BYTE       nLgMax;
    BYTE       cbLgEntry;
    PLIGATURE1 pLigature;
    ...
} KBDTABLES, *PKBDTABLES;

With PLIGATURE1 defined by the following macro, for ligatures up to 5 characters long:

#define TYPEDEF_LIGATURE(n) typedef struct _LIGATURE##n { \
    BYTE  VirtualKey; \
    WORD  ModificationNumber; \
    WCHAR wch[n]; \
} LIGATURE##n, *PLIGATURE##n;

TYPEDEF_LIGATURE(1)
TYPEDEF_LIGATURE(2)
TYPEDEF_LIGATURE(3)
TYPEDEF_LIGATURE(4)
TYPEDEF_LIGATURE(5)

The nLgMax variable indicates the maximum number of characters for a single ligature for the current keyboard layout. The cbLgEntry variable indicates the size in bytes of a single ligature entry. We parse the ligature table like this (using only PLIGATURE5 pointers for shorter code):

j["ligatures"] = json::array();
PLIGATURE5 lg = (PLIGATURE5)((BYTE*)kbd->pLigature);
for(int i=0; lg && lg->VirtualKey; i++, lg = (PLIGATURE5)((BYTE*)kbd->pLigature + i*kbd->cbLgEntry))
{
    json o = json();
    o["vk"] = lg->VirtualKey;
    o["modnum"] = lg->ModificationNumber;
    o["chars"] = json::array();
    for(int k=0; k<kbd->nLgMax; k++)
        o["chars"].push_back(lg->wch[k]);
    j["ligatures"].push_back(o);
}
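
Since every entry stores nLgMax characters, shorter ligatures have to be padded. The sketch below assumes the padding value is WCH_NONE (0xF000 in kbd.h) and shows how the padding could be trimmed when post-processing the dump. `kWchNone`, `ligatureEntrySize` and `ligatureText` are hypothetical names, and `char16_t` stands in for WCHAR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

constexpr char16_t kWchNone = 0xF000;  // assumed WCH_NONE padding value (kbd.h)

// Expected cbLgEntry for entries holding up to n characters: one BYTE, one
// alignment padding byte, one WORD, then n 16-bit characters.
constexpr std::size_t ligatureEntrySize(std::size_t n)
{
    return 4 + n * sizeof(char16_t);
}

// Keep the characters of one entry up to the first padding value.
std::u16string ligatureText(const char16_t* wch, std::size_t nLgMax)
{
    assert(wch != nullptr);
    std::u16string text;
    for (std::size_t k = 0; k < nLgMax && wch[k] != kWchNone; ++k)
        text.push_back(wch[k]);
    return text;
}
```

For the Arabic example above, an entry holding LAM (U+0644), ALEF (U+0627) and one padding value would yield a two-character string.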

… And that’s it, we’re done extracting data from those keyboard layout DLLs! There is more that we haven’t covered, such as key names, because we won’t be needing them to reconstruct our text.

In the next article, we will explain how to emulate Windows' scan code to character translation with the data we extracted from the keyboard layout DLLs. You can find it here.
