Security flaws that introduce vulnerabilities into software under development are normally repaired in the source code itself. Although a vulnerability cannot always be classified as being of only one type, Seacord and Householder (2005) describe three broad classes that have been used to categorize identified vulnerabilities: “buffer overflows, format string vulnerabilities, and integer range errors (including integer overflow)”. According to Haugh and Bishop (2003), buffer overflows are “one of the most common security flaws”. Many of these errors, however, result from inaccurate dynamic memory management. By knowing the mitigation strategies available against those flaws, programmers may be able to produce more secure source code. This article details buffer overflow errors and problems in the manipulation of dynamic memory with practical samples, and presents a prevention strategy for each.
Code security threats are usually exploited through buffer overflow vulnerabilities and imprecise dynamic memory management. A buffer overflow takes place when the data being written exceed the memory space reserved for the target data structure. That type of error may affect both the stack and the heap regions of memory. Depending on the memory positions affected and the magnitude of the overflow, the consequences include data corruption, unexpected behavior, or abnormal program termination. The code below shows an excerpt presented by Seacord (2013) in which the buffer overflow flaw may be exploited in a simple way. This vulnerability is known as stack smashing, and it occurs when a buffer overflow overwrites data in the memory allocated to the execution stack.
The critical issue in the code above lies in the use of the gets() function, which creates a security breach that may give a malicious user unauthorized access. According to Seacord (2013), this function “copies characters from standard input into password until end-of-file is encountered or a new line character is read”. Since space was allocated for only eleven characters plus the null terminator, a malicious user can cause a buffer overflow simply by entering a string longer than that threshold. Another vulnerability in the same code is that the return value of gets() is not checked. If the function fails, there is no control over the value stored in the buffer. Consequently, the behavior of strcmp() is undefined.
Those two vulnerabilities can be exploited by an intruder in an even more harmful way. A maliciously prepared string may lead to unauthorized access being granted. This is related to the way stack memory is managed. Although the exact organization of process memory depends on the implementation (operating system, compiler, linker, and loader), the invocation of the IsPasswordOK() function can be described in three general steps. Before those steps are executed, the return address of the main() function is pushed onto the stack, and the value of the EBP (Extended Base Pointer) register is stored there as well. Then IsPasswordOK() is invoked: (i) the local boolean variable PwStatus is pushed onto the stack to hold the status returned by the function, (ii) the return address of the caller, main(), is pushed onto the stack, and (iii) the value of the EBP register is pushed onto the stack and the register is then updated.
Given the way the stack works, the following crafted string can be used to corrupt program execution: 1234567890123456j►*!. Seacord (2013) explains that the hexadecimal values of the last four characters are j = 0x6A, ► = 0x10, * = 0x2A, and ! = 0x21. Together, they form a 4-byte address in memory. When loaded, these four characters overwrite the return address on the stack, so that IsPasswordOK() does not return to the next statement in main(). Instead, it returns to the else branch of the password validation. In this case, the string entered is not checked and access is granted. The figure below presents the program stack for the crafted input string.
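The arithmetic behind the crafted string can be sketched in a few lines of C. The sketch assumes a 32-bit little-endian x86 frame made of a 12-byte buffer, a 4-byte saved EBP, and a 4-byte return address; this layout is an illustrative assumption, since the real one depends on the compiler and platform. The names region_for_offset, little_endian_u32, and crafted_input are hypothetical and do not come from Seacord (2013).

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed frame layout for IsPasswordOK(), lowest address first.
 * These sizes are illustrative assumptions for a 32-bit x86 build. */
#define PASSWORD_SIZE    12u  /* eleven characters plus the null terminator */
#define SAVED_EBP_SIZE    4u
#define RETURN_ADDR_SIZE  4u

typedef enum {
    REGION_PASSWORD,
    REGION_SAVED_EBP,
    REGION_RETURN_ADDR,
    REGION_BEYOND_FRAME
} frame_region;

/* Reports which part of the assumed frame the input byte at `offset` lands in. */
frame_region region_for_offset(size_t offset)
{
    if (offset < PASSWORD_SIZE)
        return REGION_PASSWORD;
    if (offset < PASSWORD_SIZE + SAVED_EBP_SIZE)
        return REGION_SAVED_EBP;
    if (offset < PASSWORD_SIZE + SAVED_EBP_SIZE + RETURN_ADDR_SIZE)
        return REGION_RETURN_ADDR;
    return REGION_BEYOND_FRAME;
}

/* Assembles four bytes, given in memory order, into a little-endian value. */
uint32_t little_endian_u32(const unsigned char bytes[4])
{
    return (uint32_t)bytes[0]
         | ((uint32_t)bytes[1] << 8)
         | ((uint32_t)bytes[2] << 16)
         | ((uint32_t)bytes[3] << 24);
}

/* The crafted input as raw bytes: sixteen digits, then j, 0x10, *, ! */
const unsigned char crafted_input[20] = {
    '1', '2', '3', '4', '5', '6', '7', '8', '9', '0', '1', '2', /* buffer */
    '3', '4', '5', '6',                                   /* saved EBP */
    0x6A, 0x10, 0x2A, 0x21                                /* return address */
};
```

Under these assumptions, input bytes 0 to 11 fill the buffer, bytes 12 to 15 land on the saved EBP, and the last four bytes land on the return address, where a little-endian machine would read them as the value 0x212A106A.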
Failing to check return values is another source of security issues, this time related to deficient dynamic memory management. Since some programming languages, such as C, delegate that activity to programmers, memory management tends to be an error-prone task. The next listing presents a piece of code with a common vulnerability of this type. An array is allocated on the heap, but the return value of the malloc() function is not checked. If the memory could not be reserved, malloc() returns a null pointer, and using that pointer may cause the program to exhibit unpredictable behavior.
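For contrast with the unchecked pattern just described, a checked allocation might look like the sketch below. The helper make_filled_array is hypothetical and is not the article's original listing; it only illustrates where the test on the pointer returned by malloc() belongs, plus a guard against a byte count that would wrap around.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: allocates `count` ints and sets each to `value`.
 * Returns NULL, without touching memory, when the request cannot be met. */
int *make_filled_array(size_t count, int value)
{
    /* Reject empty requests and sizes whose byte count would wrap around. */
    if (count == 0 || count > SIZE_MAX / sizeof(int))
        return NULL;

    int *items = malloc(count * sizeof *items);
    if (items == NULL)          /* the check the unchecked pattern omits */
        return NULL;

    for (size_t i = 0; i < count; i++)
        items[i] = value;

    return items;               /* the caller releases it with free() */
}
```

A caller would test the result before use, for example `int *a = make_filled_array(4, 7); if (a == NULL) { /* report and recover */ }`, and call free(a) once the array is no longer needed.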
Strategies for Vulnerability Mitigation
The spectrum of vulnerability prevention techniques is wide, ranging from hardware modifications to source code reviews. Kuperman et al. (2005) mention that changes in compilers and operating systems are also part of this set of available defenses. A particularly widespread strategy is static code analysis. In this approach, tools scan the source code statically, that is, they search for potential vulnerabilities without running the code. According to Chatzieleftheriou and Katsaros (2011), those tools may add relevant costs to the software development process. For this reason, the authors suggest that four aspects be taken into account when planning the adoption of static code analysis tools: “(i) the programming language, (ii) the targeted defects, (iii) the analysis effectiveness, i.e. the portion of detected real defects, and (iv) the analysis efficiency that affects the needed computing resources for code scanning”.
Considering all the variables involved in implementing the aforementioned techniques, simpler but effective alternatives have also been proposed to ensure more secure source code. Madau (1999) presented a sample of a set of rules that, when followed by programmers, could make them better able to produce code less “prone to illness”. The complete list of guidelines suggested by the author is based on robust knowledge of the programming language and on the precise use of its commands. By applying those recommendations to the code shown in Listing 1 and Listing 2, the security issues could be worked around. In the first listing, all failures could be avoided through a very straightforward intervention: replacing the gets() function with fgets(). This alternative function reads at most n-1 characters, where n is the size allocated to the buffer. The second listing could be fixed by adding a pointer check after the memory allocation (line 2). That verification is shown in Listing 3 and should replace line 3 of the original code.
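The fgets() replacement can be sketched as follows. This is a minimal illustration under stated assumptions, not the article's original listing: the function name check_password_line, the FILE pointer parameter, and the policy of rejecting overlong lines are all hypothetical choices made for this example. The 12-byte buffer matches the eleven characters plus null terminator discussed earlier.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define PASSWORD_SIZE 12   /* eleven characters plus the null terminator */

/* Hypothetical helper: reads one line from `in` and compares it with
 * `expected`. Returns false on read failure, overlong input, or mismatch. */
bool check_password_line(FILE *in, const char *expected)
{
    char password[PASSWORD_SIZE];

    /* fgets() stores at most PASSWORD_SIZE - 1 characters plus '\0'. */
    if (fgets(password, sizeof password, in) == NULL)
        return false;           /* EOF or error: buffer contents unusable */

    size_t len = strcspn(password, "\n");
    if (password[len] == '\n') {
        password[len] = '\0';   /* strip the newline that fgets() kept */
    } else {
        /* No newline stored: the input ended or the line was too long. */
        int c = getc(in);
        if (c != '\n' && c != EOF) {
            while ((c = getc(in)) != '\n' && c != EOF)
                ;               /* discard the rest of the overlong line */
            return false;
        }
    }

    return strcmp(password, expected) == 0;
}
```

Called as `check_password_line(stdin, expected)`, the function never writes past the buffer, and because the result of fgets() is tested first, strcmp() only sees a properly terminated string. Discarding the remainder of an overlong line keeps the leftover characters from being read as a second attempt.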
The presence of security vulnerabilities in source code remains a critical issue for software quality. Although several prevention techniques based on computational resources have been developed, inserting them into a software development pipeline tends to have a significant impact. For this reason, less invasive approaches are an effective alternative that should not be disregarded. By employing good programming practices, such as the guidelines proposed by defensive programming, developers may be able to avoid the most common security errors. Moreover, when programmers make an effort to know the target programming language deeply and to become acquainted with the operating details of the underlying infrastructure, they can minimize security failures. The examples described in this article show that simple modifications to the original source code can yield significant quality improvements. Nevertheless, identifying which type of change should be applied depends on a solid technical background on the part of the programmers directly involved in the software development process.