Secure Code Review for l33t hax0rs

Code review is, hopefully, part of regular development practices for any organization. Adding security elements to code review is the most effective measure in preventing vulnerabilities, even before the first commit.

This series of short articles is a primer that includes the basic elements to get you started. We will review several software examples in order to figure out the good from the ugly.

The Biggest Security Issues

OWASP Top 10 and MITRE Top 25 include lists of common programming flaws that lead to vulnerabilities and exploits in software. These lists are reviewed periodically by industry experts and are included in compliance standards such as PCI-DSS.

Our security code review examples will focus on the following OWASP Top 10and MITRE Top 25 entries:

  • Injection

  • Memory Flaws

  • Sensitive Data Exposure

  • Cross-Site Scripting

  • Broken Access Control.

The Basic Secure Coding Practices

One of the goals of regular code review is to spot missing best practices. Security code review is about identifying the missing secure coding practices. These practices are also known as software defences or in Threat Modeling terms countermeasures. There are many types of software defences but some are more important and effective than others.

The short list below is based on the number of security flaws that each defence prevents as well as the severity of the flaws based on OWASP Top 10and MITRE Top 25 industry classifications.

  • Input Validation, prevents a wide range of attacks such as Injection, Cross-Site Scripting and Path Manipulation.

  • Parameterized Statements, prevent Injection attacks when Input Validation cannot be employed. Injection is #1 in the OWASP Top 10.

  • Memory Safe Functions and Safe Memory Management practicesprevent memory flaws such as Buffer Overflow which is at #3 in the Mitre Top 25 immediately after SQL Injection and OS Command Injection.

  • Data Encryption in transit and at rest prevents data breaches which are covered at #3 in the OWASP Top 10 category Sensitive Data Exposure.

  • Neutralizing Output prevents Cross-Site Scripting flaws which plague web applications and should be the primary concern for front-end developers.

  • Indirect Object References prevent dangerous flaws associated with file management such as Path Traversal a vulnerability in the Broken Access Control OWASP Top 10 category.

There are many threat categories and best practices that will not be discussed in this series. It is important to realize security code review is not a replacement for other application security controls such as static analysis or penetration testing.

Security code review is also only a small part of the code review process. It should not take too long. As such we must prioritize the things we are looking for to get the most bang for the buck.

In addition many threat categories are handled in the application framework as opposed to every day code changes. For example, #2 in OWASP Top 10 is Broken Authentication. It is not likely that you will deal with this category often during code review because authentication is usually handled by the framework.

The articles will contain examples from Java, C/C++ and JavaScript programming languages. However the concepts can be easily transferred to other languages. Input Validation in Java is similar to input validation in Python and PHP.

In next article in this series we will review in detail the concept of Input Validation.

GIGO

Would you let someone in your house if you thought they should not be there?

You wouldn’t. So why allow any input to your application? Why allow symbols into a variable that is intended to be numeric?

Even when validation is used, a common mistake is to use black lists. For example an application will prevent symbols that are known to cause trouble. The weakness of this countermeasure is that some symbols may be overlooked.

Would you maintain a black list of people that cannot come to your house?

You wouldn’t. So why block quotes when you know the input should be numeric? You should only allow numbers.

In the two code samples below one has a security issue due to improper input validation. Can you tell which?

gigo.png

Both code samples, shell out, to execute OS commands, in this case sending a ping to a server. Shelling out is an insecure practice because it can lead to OS Command Injection, however the bottom code mitigates the issue because it uses an allow list to prevent hazardous characters while the top code uses a block list which is, in this case, insufficient since an attacker could pass in something like updateserver.com `command` or updateserver.com |commandand neither ` nor | have been included in the block list.

As you can see, allow lists are much more effective at preventing application security issues. A simple multi-purpose function that checks if the input is alphanumeric can prevent multiple types of flaws:

public class InputValidation {

/**
Sample java input validation function that validates that the input is alphanumeric or part of a list of allowed exceptions
 */
public static boolean isAlphanumOrExcepted(int maxSize, String val, char ... excepted){
    boolean result = true;
    int count = val.length();
    if(count>maxSize) return false;

    for(int i=0;i<count;i++){
        char c = val.charAt(i);
        boolean isOk = false;
        //if  alphabetic turns true , this works for Unicode chars
        isOk = isOk | Character.isAlphabetic(c);
        //if it's a digit turns true, this works for Unicode chars
        isOk = isOk | Character.isDigit(c);
        //if it's in the list of exceptions turns true
        for(char ex : excepted){
            isOk = isOk | ex==c;
        }

        if(isOk == false){ //if one of the characters didn't meet the requirements return false
            return false;
        }
    }
    return result;
}

And here is how to use this simple function to prevent a wide range of attacks:

`` protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {

String input = request.getParameter("input");

if(InputValidation.isAlphanumericOrExcepted(EXPECTED_SIZE,input,'.','-','_')){
    //process the input
}
else{
    //return 400 Invalid Input and log attack attempt
}

}

One little function can prevent multiple attack types. The table below demonstrates how the function prevents SQL Injection, OS Injection, Cross-Site Scripting and Path Traversal.

table.png

The function also works for multi-language support. For example the character è will be considered a letter and will be allowed.

In an HTTP request there are many parameters that are simply numeric or alphanumeric. Let’s analyze the URL below which is part of a Twitter API request generated by executing a search for security.

https://api.twitter.com/2/search/adaptive.json?include_profile_interstitial_type=1&include_blocking=1&include_blocked_by=1&include_followed_by=1&include_want_retweets=1&include_mute_edge=1&include_can_dm=1&include_can_media_tag=1&skip_status=1&cards_platform=Web-12&include_cards=1&include_composer_source=true&include_ext_alt_text=true&include_reply_count=1&tweet_mode=extended&include_entities=true&include_user_entities=true&include_ext_media_color=true&send_error_codes=true&q=security&count=20&query_source=typd&pc=1&spelling_corrections=1&ext=mediaStats%2ChighlightedLabel

The only parameter here that may need to be excluded from validation is q . More than 90% of the request parameters can benefit from an alphanumeric allow list. By applying input validation to 90% of the input on the request, we reduce 90% of the attack surface. This is why input validation is the most effective way of reducing vulnerabilities.

Binding Parameters

In the previous article we reviewed Input Validation. While Input Validationis an effective deterrent to a large number of attacks, including Injection, not all input can be filtered.

For example imagine someone’s name is O’Brien. The single quote in O’Brienhappens to also be part of SQL command syntax. For example a website may perform a database record search like this:

simple.png

Notice that the single quote in the name O’Brien is causing a syntax error. The SQL command processor considers the string ends with O and the rest, BRIEN%, is just an unrecognized command.

In order to work around this problem one must escape the single quote with a another single quote like in the image below.

simple2.png

However when this query is executed by a program, things look different. The name would be read from a variable and the database query would be constructed dynamically.

Let’s take a look at the following code snippet.

simple3.png

The variable lastName contains input coming from the user. It is concatenated to a constant SQL query string and the resulting command is passed to the database server.

There is no input validation and no input escaping whatsoever. This means that a user entering O’Brien would cause an SQL syntax error, which is a bug. However a malicious user would take advantage of this behaviour. What would happen if the user entered something like the string below?

simple4.png

The SQL query being passed to the database would end up being two different commands. One selects all users in the database, while the other deletes the users table.

simple5.png

If these concepts are new to you, now you can finally enjoy this old hacker joke about little Bobby Tables.

simple6.png

Preventing Injection

As the old comic suggests, sanitizing the user input could have prevented the issue. “Sanitizing” could have been done with Input Validation or escaping of the single quote. However not all input can be validated and not all SQL Injection is done with single quotes.

In the example below Injection takes advantage of an un-sanitized ORDER BY parameter:

simple7.png

In this case Input Validation could be employed because column names should be alphanumeric however this leads to a more complex defence strategy where the application must employ Input Validation for values going into the ORDER BY section and Escaping for values going into the WHEREclause.

There is however a simpler way. Do not use concatenation at all. This approach also holds true for other Injection scenarios like Command Injection.

simple8.png

This is a scenario where we can employ Occam’s Razor to identify the simplest most universal defence to Injection attacks. The solution here is using Parameterized Statements.

Parameterized Statements interact with the command processor to separate the variables from the SQL query, thus effectively neutralizing characters that may influence the query.

simple9.png

In Java this can be achieved by using a Prepared Statement as per the example belowThe question mark in the query string at line 3 is a placeholder for the parameter value.

simple10.png

When dealing with OS Commands, is best to avoid them completely and write the equivalent functionality in your programming language of choice. For example if you must read a file in Java use APIs such as File and BufferedReader rather than the Linux cat command. However in certain cases there’s no choice and the code must “Shell Out”. For those situations the equivalent of a Prepared Statement is passing command line arguments as a separate array, thus avoiding concatenation.

simple11.png

Object-Relational Mapping (ORM)

There’s an even better way to abstract SQL statements from application code. ORM frameworks allow developers to work with objects rather than SQL queries. Instead of working with the users table developers would work with a users collection and execute a method on that collection to return the corresponding record.

simple12.png

Under the covers the framework would perform a transformation similar to the diagram below.

simple13.png

Caution

There are cases where Injection will still occur in spite of Parameterized Statements being used. For example when the injection occurs in a database stored procedure or the OS shell script being executed

In order to reduce the likelihood of such scenarios occurring, Input Validation should still be used as much as possible.

To sum it all up

When reviewing a code change that involves a command processor look for the following:

  • Input Validation for known alphanumeric values

  • Employing Parameterized Statements

  • Using a ORM framework where applicable

Ticker Tape Challenges

Problems with Memory

Memory related vulnerabilities are very dangerous. They are often classified as 0-days. 0-days are vulnerabilities in common software leveraged by malware and viruses. The 0day name came from illegal software released by an organized group who called themselves 0day. Their software releases were the latest and greatest video games. Never before did the public see these games. Hence vulnerabilities later became known as 0days. The challenge with memory vulnerabilities is that potential operating system components and programs such as system services, browsers, crypto libraries and document readers are written in C/C++ and are particularly exposed to these types of flaws.

There are many flavors of memory vulnerabilities but we will focus on the most common mistakes:

  1. Buffer Copy without Checking Size of Input — CWE 120

  2. Incorrect Calculation of Buffer Size — CWE 131

  3. Uncontrolled Format String — CWE 134

  4. Off by One — CWE 193

How do we know they are the most common? All of these mistakes are documented by MITRE with the top 3 being included in the MITRE Sans Top 25.

They all lead to arbitrary access of memory outside the intended boundaries. This is also known as Overflow.

Memory Overflow Explained

Imagine the program uses two variables a and b. The contents and length of variable b are controlled by the user. The variable memory locations are represented in the simplistic example below. Each variable is intended to store 3 characters. At first variable a contains the string “Hi!”.

valid1.png

If the user enters the string AAAAA (5 As) for variable b the following will happen.

valid2.png

Variable b only had 3 locations reserved for its value plus one for the string terminator \0 . By entering 5 characters the user is now writing into space reserved for variable a.

Variables are pointers to memory locations. The value of b will now become AAAAAi! while the value of a will become Ai!. If the program outputs the value of b then the attacker will be able to know part of the value of a which is known as a memory leak.

If the user enters a sufficiently long value, they will hit the instruction area and alter the program code.

valid3.png

Memory SafeR Functions

Overflow can be prevented by controlling the number of characters read into the buffer. This is done using memory safe functions. It is important to note that simply using these functions will not prevent overflow. They also must be used correctly.

The image below lists several functions that allow the program to limit the size of the value read into a buffer: fgetssnprintfstrncpystrncmp.

valid4.png

If the BUFF_SIZE argument is larger than the size of the buffer, overflow will still occur.

In the example below we have two different code snippets that both read a password from the standard input. Can you spot the one that allows Buffer Overflow because it does not check the size of the input?

valid5.png

If you identified bottom.cpp as the vulnerable code you were correct. The top example, is making use of fgets and it restricts the number of characters to 9.

Incorrect Calculation of Buffer Size

Some of our keen readers may have noticed that if the size of userPass is less than 9, then overflow will still occur.

This is an example of Incorrect Calculation of Buffer Size. Defining constants for the size argument rather than using numerals and paying close attention during code review to the code checking boundaries, can prevent this type of flaw. Let’s take a look at the example below and see if we can spot the vulnerable code.

valid6.png

If you identified top.cpp as the vulnerable code you were correct. The bottom example, is making use of the constant BUFFER_SIZE to ensure consistency. Besides being a safer option the code is also easier to maintain if the constant value must be modified in the future. Many secure coding practices have other benefits besides security.

Off-by-one

Off-by-one is another variation of buffer size flaw. This type of programming mistake is introduced when employing comparison operators. A simple extra equal sign, for example using<= instead of< can lead to the program crashing. Let’s take a look at an example. Can you spot the vulnerable code?

valid7.png

If you spotted the error in top.cpp you are correct.

Memory Injection?

Format String Injection is a type of vulnerability caused by concatenating or using user input in a format parameter. Code that logs user values using functions such as printf is particularly exposed to this type of vulnerability.

Take for example the snippets below. Can you spot which of the two snippets allows the user to control the format string?

valid8.png

If you identified top.cpp as the code that allows Format String Injection you were correct. If the user includes %d or %p in the password, printf will take the next value following the format string and include it in the display. This can lead to information disclosure or program crashes.

By the way there are a couple of bonus insecure practices in both code snippets :) . Did you spot them? One is the use of printf, a dangerous function as mentioned earlier in the article. It’s also a bad practice to display the user password.

Compiler Flags

This is not really a code review item but is worth mentioning because is a countermeasure that can be applied at build time to prevent the exploitation of memory flaws.

Compiler flags enable operating system defences such as ASLR in Windows or PIE/SSP in Linux. They tell the operating system to employ countermeasures such as randomizing memory, which is making it hard for attackers to insert arbitrary code.

Even with compiler flags in place attackers can still crash the program so the main effect of compiler flags is reducing the impact of the attack. The best defence is to prevent the flaws in the code, from the start, by employing the best practices discussed in this article.

To sum it all up…

  • Safer functions allow limiting the number of bytes read into the buffer

  • Even with safe functions special attention should be paid to size specified

  • Use constants to prevent mistakes

  • Careful with the <= operator

  • Do not allow user input in format strings