using Programming;

A Blog about some of the intrinsics related to programming and how one can get the best out of various languages.

Demonstrating Insecurity of Managed Windows Program Memory

Is Memory in a Managed Windows Program Secure?

Recently I was on one of the (many) Stack Exchange sites answering a question a user posted, like usual. This question was a bit different though: the asker was concerned about the best way to make sure people couldn't read the password the user entered out of memory.

Unfortunately, this is not a problem that can really be solved on consumer devices. Anyone with enough knowledge (and it doesn't take a lot) can read that memory. I'm going to demonstrate how today with a simple Visual Studio programme.

Essentially, what I'm going to do is 'connect' to a fake SQL server (nothing about it will really exist) and then demonstrate how one can (with a copy of Visual Studio) extract the entire Connection String of that SQL connection. It's actually quite trivial and, with enough practice, can be done in seconds.

Of course there are other ways to do this: it can be done programmatically, there are other bits of software for it, etc. I'm just going to demonstrate how any developer can do it with the tools they have at their disposal.

Creating our test projects

So the first step is to create a test project we can use to 'attack'. We're going to consider this an attacker/victim scenario, since that's what one of the real world applications is.

Our code is going to be pretty simple:

using Evbpc.Framework.Utilities.Prompting;
using System;
using System.Data.SqlClient;

namespace VictimApplication
{
    class Program
    {
        static void Main(string[] args)
        {
            var consolePrompt = new ConsolePrompt(null);

            var connectionString = new SqlConnectionStringBuilder();
            connectionString.DataSource = consolePrompt.Prompt<string>("Enter the SQL server hostname/ip", PromptOptions.Required);
            connectionString.UserID = consolePrompt.Prompt<string>("Enter the SQL server user id", PromptOptions.Required);
            connectionString.Password = consolePrompt.Prompt<string>("Enter the SQL server password", PromptOptions.Required);
            connectionString.InitialCatalog = consolePrompt.Prompt<string>("Enter the SQL server database", PromptOptions.Required);

            using (var sqlConnection = new SqlConnection(connectionString.ToString()))
            {
                try
                {
                    Console.WriteLine("Connecting...");
                    sqlConnection.Open();

                    using (var command = new SqlCommand("SELECT 15", sqlConnection))
                    {
                        Console.WriteLine($"Command output: {command.ExecuteScalar()}");
                    }
                }
                catch
                {
                    Console.WriteLine("Could not establish a connection to the server.");
                }
            }

            Console.WriteLine("Press enter to exit.");
            Console.ReadLine();
        }
    }
}

Do note this uses my ConsolePrompt from GitHub.
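It may help to see why the password is readable at all: the builder flattens every setting into one ordinary managed string. Here is a minimal sketch of that, not the victim's actual code. It assumes the base DbConnectionStringBuilder from System.Data.Common (so it needs no SQL client package), and the class name, method name and values are made up for illustration.

```csharp
using System;
using System.Data.Common;

public static class ConnectionStringDemo
{
    // Builds a connection string from the given values, much like the victim does.
    // (Hypothetical helper; the keys mirror the builder properties used above.)
    public static string Build(string server, string user, string password, string database)
    {
        var builder = new DbConnectionStringBuilder();
        builder["Data Source"] = server;
        builder["User ID"] = user;
        builder["Password"] = password;
        builder["Initial Catalog"] = database;
        return builder.ConnectionString;
    }

    public static void Main()
    {
        var connectionString = Build("localhost", "user", "hunter2", "master");

        // The password is part of a plain string living on the managed heap.
        Console.WriteLine(connectionString.Contains("hunter2")); // True
    }
}
```

Anything that can walk the managed heap can read that string, which is exactly what we are about to do.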

So we have our victim application, now we'll go ahead and attack it.

Attaching Visual Studio to a running application

You're probably expecting a title like 'attacking an application with Visual Studio' but that's not as descriptive as what we're doing. Yes, this is how you attack it, but attack sounds nefarious. We're not doing anything nefarious, we're just attaching a debugger to a running application.

So we're going to open a new instance of Visual Studio, and not open or create a project. Just open the instance.

Screenshot 1 - Fresh Visual Studio Instance

So, we've opened Visual Studio (I'm using 2015 but this should work on 2010+). The next thing we'll do is launch our application.

Screenshot 2 - Launch Application Outside Debugger

Right, so we have the application running outside the debugger. No other instances of Visual Studio need to be open, nothing else needs to be running, just that application and our fresh instance. The next step is to attach the debugger to a process.

This is under Debug -> Attach to Process. You should see a new window open, and we want to find our 'Victim Application' (VictimApplication.exe).
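The Attach to Process window is, at heart, a list of running processes. As a rough sketch of the same lookup in code, assuming the victim was built as VictimApplication.exe (the class and method names here are made up for illustration):

```csharp
using System;
using System.Diagnostics;

public static class ProcessFinder
{
    // Returns the ids of every running process with the given name.
    // The name is passed without the ".exe" extension.
    public static int[] FindProcessIds(string processName)
    {
        var processes = Process.GetProcessesByName(processName);
        var ids = new int[processes.Length];
        for (var i = 0; i < processes.Length; i++)
        {
            ids[i] = processes[i].Id;
            processes[i].Dispose();
        }
        return ids;
    }

    public static void Main()
    {
        foreach (var id in FindProcessIds("VictimApplication"))
        {
            Console.WriteLine($"Found VictimApplication with process id {id}");
        }
    }
}
```

This only locates the process; attaching and reading its memory is the part Visual Studio does for us in the next steps.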

Screenshot 3 - Attach to Process

We'll go ahead and attach it. Our screen should change to look like we're in a regular debug session, even though we didn't launch the program through Visual Studio.

Screenshot 4 - Debug Session is Green err Blue

Now we still have our other window open with our running application in it. All we have to do next is start checking it out and see what we can inspect.

This next part isn't required, but it should help you familiarize yourself with what we're going to do. Let's hit the 'Break All' button (CTRL + ALT + Break with default shortcuts).

Screenshot 5 - Break mode

As of this moment the program is paused. Since it's a console application, you can still type into it, but your typing will not be processed by the program at this point.

Screenshot 6 - Text not handled by program

Next we'll hit 'Show Diagnostic Tools' and then select the 'Memory Usage' tab. Once we have done that, we'll hit 'Take Snapshot'.

Screenshot 7 - Taking our first memory snapshot

So now we're at the point we can start inspecting objects in our program. The first thing we'll want to do is click the blue 429 (your number may vary) link to the list of objects.

We'll then sort them by name since we're not concerned about the count, we just want to look through them.

I'm going to inspect our ConsolePrompt as an example, which in this case is listed as Evbpc.Framework.Utilities.Prompting.ConsolePrompt. When you find an object you want to inspect, hover over it and you should see an icon that looks like a square grid with a circular shape on the top-left corner. Click that and a new page should open.

Screenshot 8 - Selecting an object to inspect

We'll then see a new page with all the instances of that object listed. If you hover over the Value, you should get a tool-tip that has a breakdown of the object itself, which you can explore just like normal. We'll see that the Logger is in fact null like we wanted.

Screenshot 9 - Exploring our object

Now that we've played with our explorer, we can go ahead and close that breakdown and continue with our program. We'll hit 'Continue' and resume execution. If you typed a server host into the console, you'll see as soon as we hit continue that the program continues to the next step. We'll fill out all our requirements and then break our program again when it starts connecting, and take another memory snapshot.

Screenshot 10 - Connecting to our server

We see that the new snapshot has 3,825 objects allocated, and the difference is an increase of 3,396. Our graph shows that we allocated a lot more memory (relatively speaking) and we can now go ahead and inspect our snapshot to try to find our password. We'll be looking for a string type with a value of pass.

We know it'll be part of the SqlConnection, so we'll sort that by name and then go down to SqlConnection and explore it like before.

Screenshot 11 - Find our SqlConnection

Upon exploring it we'll just use a different method of extracting our string. Click 'Referenced Objects' at the bottom of our window, and hover over the middle String object (mine is 0x2FB0BD8).

Screenshot 12 - Extracting our Connection String

And there we have it. We have successfully extracted our password from a separate Visual Studio instance while the original application was running completely separately.

Debug Symbols and why they are important

Of course, our demonstration was made slightly easier by the inclusion of the .pdb files (debug symbols). Usually you won't have access to these for the running application, so you'll sometimes have to look a little harder to find what you're looking for.

If you don't know what Debug Symbols are, Wikipedia has a nice description. Essentially, the pdb file (the name stands for 'Program Database') is the symbol map for .NET programs. It maps the compiled code back to the source it was built from: source file names, line numbers, and the names of local variables.

Finding our String without SqlConnection

The last thing we'll do is find our string value without exploring the SqlConnection object. We're only going to look with the Diff list, and run from there.

So, we'll restart our application, then attach the debugger, then enter our host and user, then take a memory snapshot like we did earlier.

Screenshot 13 - Round 2 First Snapshot

Then we'll hit 'Continue', enter our password, and take another snapshot.

Screenshot 14 - Round 2 Second Snapshot

The next step is to disable 'Just My Code' in the filter. If we don't do this it becomes much more difficult to locate what we changed.

Screenshot 15 - Round 2 Disable Just My Code

So we see that it created one string: the Count Diff. is +1 on the String type, which helps us narrow down what we're looking for. If we click once into it and view our 'Paths to Root', we discover that we have +1 in String [Local Variable]. So we're in the right place.

Screenshot 16 - Round 2 String Local Variable

We'll inspect the String like before (square icon with round outset) and we'll see that by default it sorts the list by 'Inclusive Size (Bytes)'. We'll sort it by 'Instance' instead. Theoretically our password should be the last instance listed. If we scroll to the bottom of the list we see that, indeed, it is.

We also see that our user id is right above it.

Screenshot 17 - Round 2 Find our Password


And there we have it! We learned how to inspect objects in our program when it was launched outside Visual Studio by attaching a debug instance of Visual Studio to it.

SQL Server Datatypes: How to avoid VarChar

I've seen, time and time again, programmers make many of the same mistakes regarding their SQL datatypes, and one of them is to use VarChar for almost everything. I've seen it so many times that if I had a nickel for each time I saw it, well, let's just say my McLaren P1 would be yellow.

Why do people use VarChar so much?

Well, to be honest, it's easy. We, as people, are generally lazy, and it's easy to store anything in a VarChar(50), or worse, a VarChar(MAX)! Why is this a bad thing? Well for some data, it's not, but for others, it's just not the best option. As developers and programmers, we almost always have a choice as to how we should store our data, and sometimes, it's easy to make an inefficient one.

Let's take a solid example. I was over on Stack Overflow one day, and I noticed a developer doing something odd: the developer was storing an IP address (we'll assume IPv4 of 192.168.0.1 which is a pretty common IP for default gateways in small home and office networks) in a VarChar or a Char field. I'm not sure on the precision of it, or which it was (as the developer left out the DDL), but for sake of argument let's assume it was the smallest precision required to store any IP Address, and as such a VarChar(15).

The developer, much like the rest of us, was trying to find a way to shrink the amount of data used. So, the developer proposed that, instead of storing 1.1.1.1, we just omit all the characters except the last two (in this example: .1), and keep only the fourth octet in the database. The downfall of this is quite obvious: we now have no way of distinguishing whether our value is 1.1.1.1, 2.2.2.1, 3.3.3.1 or any other address ending in .1. But, there's a better way.

Let's take a peek at what we know at this point:
  1. The data being stored is binary data;
  2. It's being stored in a string field;
  3. The maximum length on the string field is 15 characters;
Now this doesn't just apply to IP Addresses, it also applies to hashes, encrypted data and other binary objects.

At first glance this might not seem so bad. The IP Address as a string is 192.168.0.1. The maximum data-size is going to be 17 bytes, as the VarChar type takes one byte per character, and two bytes of overhead. The size for our specific address is 13 bytes (11 characters plus the two bytes of overhead), by the same math. The developer took the time to address the issue of fitting the data within the seemingly smallest datatype possible. But what did the developer forget?

First, we're trying to store binary data. The smallest way to store this (at least in string format) is either in hexadecimal or Base64 encoding. Let's assume we use hexadecimal (it really doesn't matter either way). We're storing data that is four bytes, which means we need eight characters. Our example leaves us with 0xC0A80001 or, for short: C0A80001. So, this alone allows us to reduce our maximum storage space to a little over half its original size (10 bytes instead of 17), and our utilized space (for this example) to 10 bytes from 13. With just one quick optimization we converted our 15-character string to an 8-character hexadecimal string. Now that we know that, we can make another optimization and change it to a Char(8) type. This removes the two bytes of overhead, and leaves our example at a cool 8 bytes of storage space.

But, we're forgetting one small thing: SQL Server (at least, Microsoft SQL Server) has a Binary type. Much like the Char type, the Binary type has a fixed size. The difference is that the Binary type can store raw byte data. It takes a length, just like the Char does, so in our case, it would be Binary(4) (to store four bytes for one IPv4 address). The binary type will only store the raw data for the address, so we're left with:
  1. Byte 1: 0xC0
  2. Byte 2: 0xA8
  3. Byte 3: 0x00
  4. Byte 4: 0x01
Microsoft SQL Server also has a VarBinary type which works just like the VarChar type. It supports the same size limits: 1-8000 or MAX. It also requires two bytes of overhead for each row, just like a VarChar type.
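For the application side of this, here is a minimal sketch of turning the dotted string into the four raw bytes a Binary(4) column would hold, and into the 8-character hexadecimal form discussed earlier. The class and method names are made up for illustration; the conversion itself uses the standard System.Net.IPAddress type.

```csharp
using System;
using System.Net;

public static class IpStorage
{
    // Returns the four raw bytes of an IPv4 address, first octet first.
    public static byte[] ToBytes(string dottedQuad)
    {
        return IPAddress.Parse(dottedQuad).GetAddressBytes();
    }

    // Returns the same bytes as an 8-character hexadecimal string.
    public static string ToHex(string dottedQuad)
    {
        // BitConverter produces "C0-A8-00-01"; strip the dashes.
        return BitConverter.ToString(ToBytes(dottedQuad)).Replace("-", "");
    }

    public static void Main()
    {
        const string address = "192.168.0.1";
        Console.WriteLine(ToHex(address));          // C0A80001
        Console.WriteLine(ToBytes(address).Length); // 4
    }
}
```

The byte array is what you would pass as the parameter value for the Binary(4) column; the hexadecimal string is only there to show the intermediate step.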

The nice thing about using a Binary type for this field is that it allows us to save a significant amount of space. By optimizing this field, we've saved up to 13 bytes of storage per row (a VarChar(15) can take as many as 17 bytes, while the Binary(4) always takes 4). How significant is that? If we had 500,000,000 rows, we've saved up to 6.5GB of data. (And for big-data applications, 500,000,000 rows is insignificant.)

You might say, "well my application is small data, 500,000,000 rows is a pretty significant number, and 6.5GB for that many records is small." While that may be true, this is just one field we've optimized.

The DateTime example

Let's take another example: I've seen a lot of people use the VarChar type for DateTime data as well, when it's completely unnecessary. SQL Server has several types for DateTime data, the more useful being DateTime, DateTime2, and DateTimeOffset. Microsoft recommends that you no longer use DateTime for new work, as the DateTime2 and DateTimeOffset types align with the SQL standard, and are more portable. The DateTime2 and DateTimeOffset fields also have better precision and a larger range.

Why is this so important? You can just as easily store a date as a string in a VarChar field, and then parse it later. The problem with that is that you can't filter quite so easily for certain criteria. It's easy (at least with a DateTime2 field) to filter for dates within a certain range, on a certain date, etc. It's less intuitive with any string type.

The other problem is less obvious: with a VarChar type, there is no validation done that guarantees the input string is a DateTime string. This means it's up to whatever logic you have manipulating the database to make this guarantee.
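Both problems are easy to reproduce outside the database. This is a small sketch, assuming dates were stored as text in a hypothetical M/d/yyyy format; the class and method names are made up for illustration. It shows text comparison disagreeing with date comparison, and a string that looks like a date but isn't one.

```csharp
using System;
using System.Globalization;

public static class DateStorage
{
    // Hypothetical format used for the dates that were stored as text.
    private const string Format = "M/d/yyyy";

    // True only when the text is a real calendar date in the expected format.
    public static bool IsValidDate(string text)
    {
        DateTime parsed;
        return DateTime.TryParseExact(text, Format, CultureInfo.InvariantCulture,
            DateTimeStyles.None, out parsed);
    }

    // Compares the two values as plain text, as a string column would.
    public static bool TextSaysEarlier(string a, string b)
    {
        return string.CompareOrdinal(a, b) < 0;
    }

    // Compares the two values as dates, as a date column would.
    public static bool DateSaysEarlier(string a, string b)
    {
        return DateTime.ParseExact(a, Format, CultureInfo.InvariantCulture)
             < DateTime.ParseExact(b, Format, CultureInfo.InvariantCulture);
    }

    public static void Main()
    {
        // February comes before October, but "2" sorts after "1" as text.
        Console.WriteLine(DateSaysEarlier("2/1/2015", "10/1/2015")); // True
        Console.WriteLine(TextSaysEarlier("2/1/2015", "10/1/2015")); // False

        // Nothing stops "2/30/2015" from being written to a VarChar column.
        Console.WriteLine(IsValidDate("2/30/2015")); // False
    }
}
```

With a real date type the server does both jobs for you: it rejects the impossible date on insert and compares values chronologically.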

What about the NVarChar and NChar types?

I've not discussed these so far because we were talking about binary data, which in most any form is stored in some ASCII or raw form. These types (NVarChar and NChar) are Unicode (UTF-16, specifically) variants of the VarChar and Char types, respectively. These types take two bytes per character, with the variable-length type taking an extra two bytes of overhead. In our example, were the first field type an NVarChar(15) it would have taken up to 32 bytes of data. (As 30 bytes for the 15 characters plus two bytes of overhead.) The specifiable sizes for these two fields are any integers in the range 1-4000, or MAX for NVarChar.
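The two-bytes-per-character cost is easy to check from code. Here is a rough sketch that uses the .NET ASCII and UTF-16 encodings as stand-ins for what the server stores. The helper names and the two-byte overhead constant are illustrative, taken from the figures above rather than from any SQL Server API.

```csharp
using System;
using System.Text;

public static class CharacterStorage
{
    // Overhead for a variable-length value, per the figures in this article.
    private const int VariableLengthOverhead = 2;

    // One byte per character, plus overhead (VarChar).
    public static int VarCharBytes(string value)
    {
        return Encoding.ASCII.GetByteCount(value) + VariableLengthOverhead;
    }

    // Two bytes per character in UTF-16, plus overhead (NVarChar).
    public static int NVarCharBytes(string value)
    {
        return Encoding.Unicode.GetByteCount(value) + VariableLengthOverhead;
    }

    public static void Main()
    {
        // The longest possible dotted IPv4 address: 15 characters.
        const string longest = "255.255.255.255";
        Console.WriteLine(VarCharBytes(longest));  // 17
        Console.WriteLine(NVarCharBytes(longest)); // 32
    }
}
```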

What do the numbers in parentheses represent?

Many fields have an optional size, precision or other parameter to represent different amounts and forms of data that can be stored within them. For all fields we're discussing in this article, the number in parentheses represents how many characters (for the Char, VarChar, NChar and NVarChar types), or how many bytes (for the Binary and VarBinary types) the field can store.

What are the VarChar, NVarChar and VarBinary types doing internally?

All three of these types work in a very specific way, internally. You can see that the maximum size any of the three of them can take is up to 8000 bytes, but what does that mean?

Internally, in Microsoft SQL Server, the variable length fields (which have the optional MAX specification) store data in one of two ways:
  1. For data that fits within 8000 bytes, the data is stored in-row;
  2. For data greater than 8000 bytes, the data is stored out-of-row and a pointer to the data is stored in-row;
This should help clarify what the server is doing, and what the specifications mean, and why I always cringe when I see VarChar(MAX) or NVarChar(MAX), in a situation that doesn't call for it.

In summation:

As always: know your data, know your users, and most of all, know your environment.