A denial-of-service (DoS) attack is an attack from one attacker against one target. A distributed denial-of-service (DDoS) attack is an attack from two or more computers against a single target. DDoS attacks often include sustained, abnormally high network traffic on the network interface card of the attacked computer. Usage of other system resources, such as the processor and memory, will also be abnormally high. The goal of both is to overwhelm the target’s resources and prevent legitimate users from accessing services on the target computer.
Attackers often gain access to a target system with limited privileges (typically least privileges) when they first exploit a system. However, they use various privilege escalation techniques to gain more and more privileges. Most of the attacks in this chapter also use privilege escalation techniques for the same reason – to gain more and more access to a system or a network.
Spoofing occurs when one person or entity impersonates or masquerades as someone or something else. Two common spoofing attacks mentioned specifically in the CompTIA objectives are media access control (MAC) address spoofing and Internet Protocol (IP) address spoofing.
Host systems on a network have a media access control (MAC) address assigned to the network interface card (NIC). These are hard-coded onto the NIC. However, it’s possible to use software methods to associate a different MAC address with the NIC in a MAC spoofing attack. For example, in Chapter 3, we discussed a MAC flood attack where an attacker overwhelms a switch with spoofed MAC addresses. Flood guards prevent these types of attacks. Chapter 4 discussed how wireless attackers can bypass MAC address filtering by spoofing the MAC address of authorized systems.
In an IP spoofing attack, the attacker changes the source address so that it looks like the IP packet originated from a different source. This can allow an attacker to launch an attack from a single system, while it appears that the attack is coming from different IP addresses.
The SYN flood attack is a common attack used against servers on the Internet. These attacks are easy for attackers to launch, difficult to stop, and can cause significant problems. The SYN flood attack disrupts the TCP handshake process and can prevent legitimate clients from connecting. As a reminder, two systems normally start a TCP session by exchanging three packets in a TCP handshake: the client sends a SYN packet to the server first, the server responds with a SYN/ACK, and then the client completes the handshake by sending an ACK packet. After establishing the session, the two systems exchange data. However, in a SYN flood attack, the attacker never completes the handshake by sending the ACK packet. Instead, the attacker sends a barrage of SYN packets, leaving the server with multiple half-open connections. In some cases, these half-open connections can consume a server’s resources while it is waiting for the third packet, and this can actually crash the system. More often, though, the server limits the number of these half-open connections (for example, Linux systems support an iptables command that can set a threshold for SYN packets). Once the limit is reached, the server won’t accept any new connections, blocking connections from legitimate users. Attackers can launch SYN flood attacks from a single system in a DoS attack, and they will often spoof the IP address when doing so. Attackers can also coordinate an attack from multiple systems using a DDoS attack.
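The effect of half-open connections on a connection limit can be sketched with a toy model. This is only an illustration under stated assumptions: the SynBacklog class, its limit of three, and the client names are hypothetical, and no real packets are sent.

```python
from collections import deque


class SynBacklog:
    """Toy model of a server's queue of half-open TCP connections."""

    def __init__(self, limit):
        self.limit = limit          # hypothetical cap on half-open connections
        self.half_open = deque()    # clients that sent a SYN but no ACK yet

    def on_syn(self, client):
        """Handle a SYN: reserve a slot (and answer SYN/ACK), or refuse."""
        if len(self.half_open) >= self.limit:
            return False            # backlog full: the new connection is refused
        self.half_open.append(client)
        return True

    def on_ack(self, client):
        """Handle the final ACK: the handshake completes and the slot is freed."""
        if client in self.half_open:
            self.half_open.remove(client)
            return True
        return False


server = SynBacklog(limit=3)

# A legitimate client completes all three steps, so its slot is released.
first_client_ok = server.on_syn("client-a") and server.on_ack("client-a")

# The attacker sends SYNs from spoofed addresses and never sends the ACK.
for i in range(3):
    server.on_syn(f"spoofed-{i}")

# The backlog is now full, so a legitimate client is turned away.
legit_accepted = server.on_syn("client-b")
print(first_client_ok, legit_accepted)  # True False
```

Real servers also time out half-open entries, which this sketch leaves out for brevity.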
A man-in-the-middle (MITM) attack is a form of active interception or active eavesdropping. It uses a separate computer that accepts traffic from each party in a conversation and forwards the traffic between the two. The two computers are unaware of the MITM computer, and it can interrupt the traffic at will or insert malicious code. Address Resolution Protocol (ARP) poisoning is one way that an attacker can launch an MITM attack. Kerberos helps prevent man-in-the-middle attacks with mutual authentication. It doesn’t allow a malicious system to insert itself in the middle of the conversation without the knowledge of the other two systems.
ARP poisoning is an attack that misleads computers or switches about the actual MAC address of a system. The MAC address is the physical address, or hardware address, assigned to the NIC. ARP resolves the IP addresses of systems to their hardware addresses and stores the results in an area of memory known as the ARP cache. TCP/IP uses the IP address to get a packet to a destination network. Once the packet arrives on the destination network, it uses the MAC address to get it to the correct host. ARP uses two primary messages:
The ARP request broadcasts the IP address and essentially asks “who is this IP address?”
The ARP reply comes from the computer with the IP address in the ARP request, which responds with its MAC address. The computer that sent the ARP request caches the MAC address for the IP. In many operating systems, all computers that hear the ARP reply also cache the MAC address.
A vulnerability with ARP is that it is very trusting. It will believe any ARP reply packet. Attackers can easily create ARP reply packets with spoofed or bogus MAC addresses and poison the ARP cache on systems in the network. Two possible attacks from ARP poisoning are a man-in-the-middle attack and a DoS attack.
In a man-in-the-middle attack, an attacker can redirect network traffic and, in some cases, insert malicious code. Normally, traffic from the user to the Internet will go through the switch directly to the router. However, after poisoning the ARP cache of the victim, traffic is redirected to the attacker. For example, the victim’s ARP cache should be “192.168.1.1, 01-23-45-01-01-01”, but after poisoning the ARP cache, it becomes “192.168.1.1, 01-23-45-66-66”. The victim now sends all traffic destined for the router to the attacker. The attacker captures the data for analysis later. The attacker also uses another method, such as IP forwarding, to send the traffic to the router so that the victim is unaware of the attack.
An attacker can also use ARP poisoning in a DoS attack. For example, an attacker can send an ARP reply with a bogus MAC address for the default gateway. The default gateway is the IP address of a router connection that provides a path out of the network. If all the computers cache a bogus MAC address for the default gateway, none of them can reach it, and this stops all traffic out of the network.
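The trusting behavior that makes ARP poisoning possible can be sketched as a cache that accepts any reply. This is a simplified model, not a network tool. The Host class and the attacker MAC address are hypothetical. The gateway IP address and router MAC address are taken from the example above.

```python
class Host:
    """Toy host that, like ARP, trusts every reply it hears."""

    def __init__(self):
        self.arp_cache = {}   # maps IP address -> MAC address

    def on_arp_reply(self, ip, mac):
        # No authentication: the newest reply overwrites the cached entry.
        self.arp_cache[ip] = mac


GATEWAY_IP = "192.168.1.1"
ROUTER_MAC = "01-23-45-01-01-01"
ATTACKER_MAC = "aa-bb-cc-dd-ee-ff"   # hypothetical attacker NIC address

victim = Host()
victim.on_arp_reply(GATEWAY_IP, ROUTER_MAC)      # genuine reply from the router
victim.on_arp_reply(GATEWAY_IP, ATTACKER_MAC)    # forged reply from the attacker

# Frames meant for the gateway are now addressed to the attacker's NIC.
next_hop = victim.arp_cache[GATEWAY_IP]
print(next_hop == ATTACKER_MAC)  # True
```

If the forged address belongs to the attacker, the result is a man-in-the-middle position. If it belongs to no system at all, the result is the DoS case.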
DNS resolves host names to IP addresses. This eliminates the need for users to remember the IP addresses of web sites. DNS also provides reverse lookups. In a reverse lookup, a client sends an IP address to a DNS server with a request to resolve it to a name. Some applications use this as a rudimentary security mechanism to detect spoofing. For example, an attacker may try to spoof a computer’s identity by using a different name during a session. However, the TCP/IP packets in the session include the IP address of the masquerading system, and a reverse lookup shows the system’s actual name. If the name is different, it indicates suspicious activity. Reverse lookups are not 100 percent reliable because reverse lookup records are optional on DNS servers. However, they are useful when they are available.
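The reverse lookup check described above might look like the following sketch. To keep it self-contained, a dictionary named ptr_records (a hypothetical stand-in for a DNS server's reverse lookup records) replaces a live query. A real application could obtain the name with Python's socket.gethostbyaddr instead. The names and addresses are documentation-only examples.

```python
def name_matches_reverse_lookup(claimed_name, ip_address, ptr_records):
    """Compare the name a peer claims with the name its IP address resolves to.

    Returns True or False, or None when no reverse record exists.
    """
    actual_name = ptr_records.get(ip_address)
    if actual_name is None:
        return None   # reverse records are optional, so the check is inconclusive
    return actual_name.lower() == claimed_name.lower()


# Hypothetical reverse lookup records: IP address -> host name.
ptr_records = {"192.0.2.10": "mail.example.com"}

honest = name_matches_reverse_lookup("mail.example.com", "192.0.2.10", ptr_records)
spoofed = name_matches_reverse_lookup("bank.example.com", "192.0.2.10", ptr_records)
unknown = name_matches_reverse_lookup("host.example.com", "192.0.2.99", ptr_records)
print(honest, spoofed, unknown)  # True False None
```

Returning None for a missing record reflects the point that reverse lookups are useful only when the records are available.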
A DNS poisoning attack attempts to modify or corrupt DNS results. For example, a successful DNS poisoning attack can modify the IP address associated with google.com and replace it with the IP address of a malicious web site. Each time a user queries DNS for the IP address of google.com, the DNS server responds with the IP address of the malicious web site. There have been several successful DNS poisoning attacks over the years. Many current DNS servers use Domain Name System Security Extensions (DNSSEC) to protect the DNS records and prevent DNS poisoning attacks.
A pharming attack is another type of attack that manipulates the DNS name resolution process. It tries to corrupt either the DNS server or the DNS client. Just as a DNS poisoning attack can redirect users to different web sites, a successful pharming attack redirects a user to a different web site. Pharming attacks on the client computer modify the hosts file used on Windows systems. The file is in the C:\Windows\System32\drivers\etc\ folder and can include IP addresses along with host name mappings. By default, it doesn’t have anything other than comments on current Windows computers. However, a mapping might look like this:
127.0.0.1 localhost
13.207.21.200 google.com
The first entry maps the name localhost to the loopback IP address of 127.0.0.1. The second entry maps the name google.com to the IP address of bing.com (13.207.21.200). If a user enters google.com into the address bar of a browser, the browser will instead go to bing.com. Practical jokers might do this to a friend’s computer, and it isn’t malicious. However, if the IP address points to a malicious server, this might cause the system to download malware.
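Because a default hosts file contains only comments, one simple way to spot this kind of pharming is to flag every mapping other than localhost. The sketch below parses hosts-file text held in a string. The function names are hypothetical, and a real check would read the file from the folder mentioned above.

```python
def parse_hosts(text):
    """Return {host name: IP address} from hosts-file text, ignoring comments."""
    mappings = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()   # drop comments and blank lines
        if not line:
            continue
        ip, *names = line.split()
        for name in names:
            mappings[name.lower()] = ip
    return mappings


def suspicious_entries(mappings):
    """Flag every mapping that is not localhost pointing at a loopback address."""
    return {
        name: ip
        for name, ip in mappings.items()
        if not (name == "localhost" and ip in ("127.0.0.1", "::1"))
    }


sample = """
# Sample hosts file
127.0.0.1 localhost
13.207.21.200 google.com
"""

flagged = suspicious_entries(parse_hosts(sample))
print(flagged)  # {'google.com': '13.207.21.200'}
```

Some organizations add legitimate entries to the hosts file, so a flagged entry is a prompt for review rather than proof of an attack.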
A cyberattack in October 2016 effectively took down the Internet for millions of users in North America and Europe. Attackers infected many Internet-connected devices, such as video cameras, video recorders, printers, and baby monitors, with malware called Mirai. Mirai forces individual systems to become bots within large botnets. The attackers sent commands to millions of infected devices, directing them to repeatedly send queries to DNS servers. These queries overwhelmed the DNS servers and prevented regular users from accessing dozens of web sites. The attack clearly demonstrated that it is possible to seriously disrupt DNS services, causing Internet access problems for millions of people.
An amplification attack is a type of DDoS attack. It typically uses a method that significantly increases the amount of traffic sent to, or requested from, a victim.
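As an example, a smurf attack spoofs the source address of a directed broadcast ping packet to flood a victim with ping replies. The attacker sends the ping to a network’s broadcast address with the victim’s IP address as the source, so every system that answers the broadcast sends its reply to the victim.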
DNS amplification attacks send DNS requests to DNS servers while spoofing the IP address of the victim. Instead of just asking for a single record, these attacks tell the DNS servers to send as much zone data as possible, amplifying the data sent to the victim. Repeating this process from multiple attackers can overload the victim system.
An example of a Network Time Protocol (NTP) amplification attack uses the monlist command. When used normally, it sends a list of the last 600 hosts that connected to the NTP server. In an NTP amplification attack with monlist, the attacker spoofs the source IP address when sending the command. The NTP server then floods the victim with details of the last 600 systems that requested the time from the NTP server.
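The idea behind amplification can be expressed as a ratio between the size of the reply and the size of the spoofed request. The sketch below works through the arithmetic. All byte counts and rates are hypothetical values chosen for illustration, not measurements of DNS or NTP traffic.

```python
def amplification_factor(request_bytes, response_bytes):
    """Bytes delivered to the victim for each byte the attacker sends."""
    return response_bytes / request_bytes


# Hypothetical sizes for illustration only.
request_bytes = 60       # small spoofed request
response_bytes = 3000    # much larger reply sent to the victim

factor = amplification_factor(request_bytes, response_bytes)

attacker_mbps = 10                       # assumed attacker sending rate
victim_mbps = attacker_mbps * factor     # traffic arriving at the victim
print(factor, victim_mbps)  # 50.0 500.0
```

The larger the reply relative to the request, the less bandwidth the attacker needs to overload the victim.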
A brute force attack attempts to guess all possible character combinations. The two types of brute force attacks are online and offline. An online password attack attempts to discover a password from an online system. For example, an attacker can try to log on to an account by repeatedly guessing the username and password. Many tools (such as ncrack) are available that attackers can use to automate the process. Chapter 2 discusses account lockout policies used in Windows systems. They are effective against online brute force password attacks. An account lockout setting locks an account after the user enters the incorrect password a preset number of times. Individual services often have their own settings to prevent brute force attacks.
Offline password attacks attempt to discover passwords from a captured database or captured packet scan. For example, when attackers hack into a system or network causing a data breach, they can download entire databases. They then perform offline attacks to discover the passwords contained within the database. One of the first steps to thwart offline brute force attacks is to use complex passwords and to store the passwords in an encrypted or hashed format. Complex passwords include a mix of uppercase letters, lowercase letters, numbers, and special characters. Additionally, longer passwords are much more difficult to crack than shorter passwords.
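An account lockout setting can be sketched as a counter of failed attempts. The LoginService class, the threshold of three, and the account details below are hypothetical. The password is kept in plaintext only to keep the example short.

```python
class LoginService:
    """Toy logon service with an account lockout threshold."""

    def __init__(self, passwords, max_failures=3):
        self.passwords = passwords          # username -> password (plaintext for brevity)
        self.max_failures = max_failures    # hypothetical lockout threshold
        self.failures = {}                  # username -> consecutive failed attempts

    def log_on(self, username, password):
        if self.failures.get(username, 0) >= self.max_failures:
            return "locked"
        if self.passwords.get(username) == password:
            self.failures[username] = 0
            return "ok"
        self.failures[username] = self.failures.get(username, 0) + 1
        return "denied"


service = LoginService({"bart": "Ch@lkb0ard!"}, max_failures=3)

# Online brute force: after three wrong guesses the account is locked,
# so even the correct password is refused on the fourth attempt.
guesses = ["aaa", "bbb", "ccc", "Ch@lkb0ard!"]
results = [service.log_on("bart", guess) for guess in guesses]
print(results)  # ['denied', 'denied', 'denied', 'locked']
```

A production policy would also unlock the account after a set time or through an administrator, which this sketch omits.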
A dictionary attack is one of the original password attacks. It uses a dictionary of words and attempts every word in the dictionary to see if it works. A dictionary in this context is simply a list of words and character combinations. Dictionaries used in these attacks have evolved over time to reflect user behavior. Today, they include many of the common passwords that uneducated users configure for their accounts. These attacks are thwarted by using complex passwords. A complex password will not include words in a dictionary.
Most systems don’t store the actual password for an account. Instead, they store a hash of the password. Hash attacks attack the hash of a password instead of the password. A hash is simply a number created with a hashing algorithm such as Message Digest 5 (MD5) or Secure Hash Algorithm 3 (SHA-3). A system can use a hashing algorithm such as MD5 to create a hash of a password. As an example, if a user’s password is IC@nP@$$S3curity+, the system calculates the hash and stores it instead. In the example, the MD5 hash is 75c8ac11c86ca966b58166187589cc15. Later, a user authenticates with a username and password. The system then calculates the hash of the password that was entered and compares the calculated hash against the stored hash. If they match, it indicates the user entered the correct password.
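The store-and-compare process can be sketched with Python's hashlib module. This sketch swaps in SHA-256 rather than the MD5 used in the example above, and the function names are hypothetical.

```python
import hashlib


def hash_password(password):
    """Return the hexadecimal SHA-256 hash of a password."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


# At enrollment, the system stores only the hash, never the password.
stored_hash = hash_password("IC@nP@$$S3curity+")


def authenticate(entered_password):
    """Hash what the user entered and compare it with the stored hash."""
    return hash_password(entered_password) == stored_hash


correct = authenticate("IC@nP@$$S3curity+")
wrong = authenticate("password")
print(correct, wrong)  # True False
```

The same input always produces the same hash, which is what makes the comparison work and also what hash attacks take advantage of.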
Unfortunately, tools are available to discover many hashed passwords. For example, MD5 Online (http://www.md5online.org) allows you to enter a hash, and it gives you the text of the password. MD5 Online uses a database of hashed words from a dictionary. If the hash matches a database entry, the site returns the password. Passwords are rarely sent across the network in cleartext, because an attacker with a protocol analyzer can capture and view a password sent that way. To prevent this, a protocol can calculate the hash of the password on the user’s system and then send the hash across the network instead of the password. Unfortunately, if the hash is passed across the network in an unencrypted format, the attacker may be able to capture the hash and use it to log on to a system. Instead, most authentication protocols encrypt the password or the hash before sending it across the network.
In a pass the hash attack, the attacker discovers the hash of the user’s password and then uses it to log on to the system as the user. Any authentication protocol that passes the hash over the network in an unencrypted format is susceptible to this attack. However, it is most associated with Microsoft LAN Manager (LM) and NT LAN Manager (NTLM), two old security protocols used to authenticate Microsoft clients. Any system using LM or NTLM is susceptible to a pass the hash attack. The simple and recommended solution is to use NTLMv2 or Kerberos instead. NTLMv2 uses a number used once (nonce) on both the client and the authenticating server. The authentication process uses both the client nonce and the server nonce in a challenge/response process.
Unfortunately, many existing applications still use NTLM, so it can still be enabled on many Windows systems for backward compatibility. However, Microsoft recommends configuring clients to send only NTLMv2 responses and configuring authenticating servers to refuse any use of LM or NTLM. This is relatively easy to do via a Group Policy setting.
In a birthday attack, an attacker is able to create a password that produces the same hash as the user’s actual password. This is also known as a hash collision. A hash collision occurs when the hashing algorithm creates the same hash from different passwords. This is not desirable. Birthday attacks on hashes are thwarted by increasing the number of bits used in the hash to increase the number of possible hashes.
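The link between hash size and collisions can be demonstrated by deliberately shrinking a hash. The sketch below truncates SHA-256 to 16 bits (an artificially small size chosen for this illustration) and searches for two different passwords with the same truncated hash. The function names and candidate passwords are hypothetical.

```python
import hashlib


def tiny_hash(text, bits=16):
    """SHA-256 truncated to a few bits, which makes collisions easy to find."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits=16):
    """Hash candidate passwords until two different ones share a hash."""
    seen = {}                       # hash value -> first password that produced it
    attempt = 0
    while True:
        candidate = f"password{attempt}"
        value = tiny_hash(candidate, bits)
        if value in seen:
            return seen[value], candidate, attempt + 1
        seen[value] = candidate
        attempt += 1


first, second, tries = find_collision(bits=16)
print(first != second, tiny_hash(first) == tiny_hash(second))  # True True
```

With n bits, a collision is expected after roughly 2^(n/2) attempts, so each additional bit in the hash makes the search noticeably harder.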
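Rainbow table attacks are a type of attack that attempts to discover the password from the hash. A rainbow table is a huge database of precomputed hashes. It helps to look at how some password cracker applications discover passwords without a rainbow table. Assume that an attacker has the hash of a password. The application can use the following steps to crack it:
1. The application guesses a password.
2. The application hashes the guessed password.
3. The application compares the hash of the guessed password with the original password hash.
4. The application repeats steps 1 through 3 until the hashes match.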
From a computing perspective, the most time-consuming part of these steps is hashing the guessed password in step 2. However, by using rainbow tables, applications eliminate this step. Rainbow tables are huge databases of passwords and their calculated hashes. Some rainbow tables are as large as 160 GB in size, and they include hashes for every possible combination of characters up to eight characters in length. Larger rainbow tables are also available using more characters.
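The saving from precomputation can be sketched by comparing a guess-hash-compare loop with a table lookup. This is a simplification: it uses a plain dictionary of precomputed hashes rather than the chained structure of a true rainbow table, a four-word sample list, and SHA-256 as a stand-in algorithm. All names are hypothetical.

```python
import hashlib


def sha256_hex(text):
    """Hex SHA-256 hash, standing in for whatever algorithm the target uses."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


candidates = ["123456", "password", "letmein", "qwerty"]   # tiny sample word list


def crack_by_guessing(target_hash):
    """Without a table: guess, hash the guess, compare, repeat."""
    for guess in candidates:
        if sha256_hex(guess) == target_hash:    # hashing happens on every guess
            return guess
    return None


# Precomputed once, then reused for every captured hash.
lookup_table = {sha256_hex(word): word for word in candidates}


def crack_with_table(target_hash):
    """With a precomputed table, cracking is a single lookup with no hashing."""
    return lookup_table.get(target_hash)


captured = sha256_hex("letmein")
print(crack_by_guessing(captured), crack_with_table(captured))  # letmein letmein
```

The table costs storage and a one-time computation, which is the trade-off behind the very large file sizes mentioned above.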
Salting passwords is a common method of preventing rainbow table attacks, along with other password attacks such as dictionary attacks. A salt is a set of random data such as two additional characters. Password salting adds these additional characters to a password before hashing it. These additional characters add complexity to the password and also result in a different hash than the system would create using only the original password. This causes password attacks that compare hashes to fail.
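The effect of a salt on a precomputed table can be shown in a few lines. This sketch assumes a 16-byte random salt from os.urandom (longer than the two characters mentioned above) and SHA-256 as the hashing algorithm. The function and variable names are hypothetical.

```python
import hashlib
import os


def hash_with_salt(password, salt):
    """Hash the salt followed by the password."""
    return hashlib.sha256(salt + password.encode("utf-8")).hexdigest()


password = "letmein"
unsalted = hashlib.sha256(password.encode("utf-8")).hexdigest()

# A precomputed table built from unsalted hashes, as an attacker might hold.
precomputed = {unsalted: password}

salt = os.urandom(16)                       # random salt, stored beside the hash
salted = hash_with_salt(password, salt)

found_unsalted = unsalted in precomputed    # the unsalted hash is found at once
found_salted = salted in precomputed        # the salted hash is not in the table
still_verifies = hash_with_salt(password, salt) == salted
print(found_unsalted, found_salted, still_verifies)  # True False True
```

The system keeps the salt alongside the hash, so legitimate verification still works while a table built without the salt does not match.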
A replay attack is one where an attacker replays data that was already part of a communication session. In a replay attack, a third party attempts to impersonate a client that is involved in the original session. Replay attacks can occur on both wired and wireless networks. Many protocols use timestamps and sequence numbers to thwart replay attacks.
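How timestamps and sequence numbers thwart a replay can be sketched with a receiver that tracks what it has already accepted. The ReplayGuard class and the 30-second window are hypothetical. Real protocols also protect these fields cryptographically so that an attacker cannot simply alter them.

```python
class ReplayGuard:
    """Toy receiver that rejects stale or repeated messages."""

    def __init__(self, max_age_seconds=30):
        self.max_age_seconds = max_age_seconds   # hypothetical freshness window
        self.last_sequence = -1                  # highest sequence number accepted

    def accept(self, sequence, timestamp, now):
        if now - timestamp > self.max_age_seconds:
            return False          # too old: likely a captured message sent later
        if sequence <= self.last_sequence:
            return False          # already seen: a replayed message
        self.last_sequence = sequence
        return True


guard = ReplayGuard(max_age_seconds=30)

original = guard.accept(sequence=1, timestamp=1000, now=1001)
replayed = guard.accept(sequence=1, timestamp=1000, now=1005)   # same message again
stale = guard.accept(sequence=2, timestamp=1000, now=1100)      # captured, sent late
print(original, replayed, stale)  # True False False
```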
An attacker can launch a known plaintext attack if he has samples of both the plaintext and the ciphertext. As an example, if an attacker captures an encrypted message (the ciphertext) and knows the plaintext of the message, he can use both sets of data to discover the encryption and decryption methods. If successful, he can use the same decryption method on other ciphertext. A chosen plaintext attack is similar, but the attacker doesn’t have access to all the plaintext. For example, suppose the same two sentences appear at the end of every email an organization sends. If the entire message is encrypted, the attacker can try various methods to decrypt the chosen plaintext (the last two sentences included in every email). When he’s successful, he can use the same method to decrypt the entire message.
In a ciphertext only attack, the attacker doesn’t have any information on the plaintext. Known plaintext and chosen plaintext attacks are almost always successful if an attacker has the resources and time. However, ciphertext only attacks are typically only successful on weak encryption algorithms. They can be thwarted by not using legacy and deprecated encryption algorithms.
Typo squatting (also called URL hijacking) occurs when someone buys a domain name that is close to a legitimate domain name. People often do so for malicious purposes. Attackers might buy a similar domain for a variety of reasons.
Clickjacking tricks users into clicking something other than what they think they are clicking. For example, Bart is browsing Facebook, and he sees a comment labeled Chalkboard Sayings, so he clicks it. He’s taken to a page with a heading of ‘Human Test’ and directions to ‘Find the blue button to continue’. This looks like a Completely Automated Public Turing test to tell Computers and Humans Apart (CAPTCHA), but it isn’t. When he clicks the blue button, he’s actually clicking on a Facebook share button, causing him to share the original comment labeled Chalkboard Sayings with his Facebook friends.
While it’s rarely apparent to the user, most clickjacking attacks use HTML frames. A frame allows one web page to display another web page within an area defined as a frame or iframe. For example, the attack in the Facebook share example is thwarted by Facebook web developers adding code to their web pages that prevents the use of frames. Attackers continue to find new ways to launch clickjacking attacks. As they appear, web developers implement new standards to defeat them. Most methods focus on breaking or disabling frames. This ensures that attackers cannot display your web page within a frame on their web page.
Session hijacking takes advantage of session IDs stored in cookies. When a user logs on to a web site, the web site often returns a small text file (called a cookie) with a session ID. In many cases, this cookie is stored on the user’s system and remains active until the user logs off. If the user closes the session and returns to the web site, the web site reads the cookie and automatically logs the user on. This is convenient for the user but can be exploited by an attacker. In a session hijacking attack, the attacker utilizes the user’s session ID to impersonate the user. The web server doesn’t know the difference between the original user and the attacker because it is only identifying the user based on the session ID. Attackers can read cookies installed on systems through several methods, such as through cross-site scripting attacks. Once they have the session ID, they can insert it into the HTTP header and send it to the web site. If the web server uses this session ID to log the user on automatically, it gives the attacker access to the user’s account.
In a domain hijacking attack, an attacker changes the registration of a domain name without permission from the owner. Attackers often do so with social engineering techniques to gain unauthorized access to the domain owner’s email account. For example, Homer sets up a domain named homersimpson.com. He uses his Gmail account as the email address when he registers it, though he rarely checks his Gmail account anymore. Attackers watch his Facebook page and notice that he often adds simple comments like “Doh!”. Later, they try to log on to his Gmail account with a brute force attempt. They try the password of “Doh!Doh!” and get in. They then go to the domain name registrar and use the Forgot Password feature. It sends a link to Homer’s Gmail account to reset the password. After resetting the password at the domain name registrar site, the attackers change the domain ownership. They also delete all the emails tracking what they did. Later, Homer notices his web site is completely changed, and he no longer has access to it.
A man-in-the-browser attack uses a type of proxy Trojan horse that infects vulnerable web browsers. Successful man-in-the-browser attacks can capture browser session data. This includes keyloggers to capture keystrokes, along with all data sent to and from the web browser.
Operating systems use drivers to interact with hardware devices or software components. Occasionally, an application needs to support an old driver. For example, Windows 10 needed to be compatible with drivers used in Windows 8, but not all the drivers were compatible at first. Shimming provides the solution that makes it appear that the old drivers are compatible. A driver shim is additional code that can be run instead of the original driver. When an application attempts to call an old driver, the operating system intercepts the call and redirects it to run the shim code instead.
Refactoring code is the process of rewriting the internal processing of the code without changing its external behavior. It’s usually done to correct problems related to software design. Developers have a choice when a driver is no longer compatible. They can write a shim to provide compatibility, or they can completely rewrite the driver to refactor the relevant code. If the code is clunky, it’s appropriate to rewrite the driver.
Attackers with strong programming skills can use their knowledge to manipulate drivers by either creating shims or rewriting the internal code. If the attackers can fool the operating system into using a manipulated driver, they can cause it to run malicious code contained within the manipulated driver.
A zero-day vulnerability is a weakness or bug that is unknown to trusted sources, such as operating system and antivirus vendors. A zero-day attack exploits an undocumented vulnerability. Many times, the vendor isn’t aware of the issue. At some point, the vendor learns of the vulnerability and begins to write and test a patch to eliminate it. However, until the vendor releases the patch, the vulnerability is still a zero-day vulnerability. In most cases, a zero-day vulnerability is a new threat. However, there have been zero-day vulnerabilities that have existed for years. Both attackers and security experts are constantly looking for new threats, such as zero-day vulnerabilities. Attackers want to learn about them so that they can exploit them. Most security experts want to know about them so that they can help ensure that vendors patch them before they cause damage to users.
A memory leak is a bug in a computer application that causes the application to consume more and more memory the longer it runs. In extreme cases, the application can consume so much memory that the operating system crashes. Memory leaks are typically caused by an application that reserves memory for short-term use but never releases it.
An integer overflow attack attempts to use or create a numeric value that is too big for an application to handle. The result is that the application gives inaccurate results. As an example, if an application reserves 8 bits to store a number, it can store any value between 0 and 255. If the application attempts to multiply two values such as 95 x 59, the result is 5605. This number cannot be stored in the 8 bits, so it causes an integer overflow error. It’s good practice to double-check the size of buffers to ensure they can handle any data generated by the application. In some situations, an integer overflow error occurs if an application expects a positive number but receives a negative number instead. If the application doesn’t have adequate error- and exception-handling routines, this might cause a buffer overflow error.
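The 8-bit example can be worked through in code. Python integers do not overflow on their own, so this sketch emulates an unsigned 8-bit variable by masking with 0xFF. The function names are hypothetical.

```python
def store_in_8_bits(value):
    """Keep only the low 8 bits, as an unsigned 8-bit variable would."""
    return value & 0xFF


def checked_multiply_8_bits(a, b):
    """Multiply, but raise an error instead of silently wrapping."""
    result = a * b
    if not 0 <= result <= 255:
        raise OverflowError(f"{a} * {b} = {result} does not fit in 8 bits")
    return result


product = 95 * 59
wrapped = store_in_8_bits(product)
print(product, wrapped)  # 5605 229

try:
    checked_multiply_8_bits(95, 59)
    overflow_detected = False
except OverflowError:
    overflow_detected = True
print(overflow_detected)  # True
```

The silently wrapped value of 229 is the kind of inaccurate result described above, while the checked version shows how an error-handling routine can catch the problem instead.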
A buffer overflow occurs when an application receives more input, or different input, than it expects. The result is an error that exposes system memory that would otherwise be protected and inaccessible. Normally, an application will have access only to a specific area of memory, called a buffer. The buffer overflow allows access to memory locations beyond the application’s buffer, enabling an attacker to write malicious code into this area of memory.
The buffer overflow exposes a vulnerability, but it doesn’t necessarily cause damage by itself. However, once attackers discover the vulnerability, they exploit it and overwrite memory locations with their own code. If the attack uses the buffer overflow to crash the system or disrupt its services, it is a DoS attack. More often, the attacker’s goal is to insert malicious code in a memory location that the system will execute. It’s not easy for an attacker to know the exact memory location where the malicious code is stored, making it difficult to get the computer to execute it. However, an attacker can make educated guesses to get close. A popular method that makes guessing easier is with no operation (NOP) commands, written as a NOP slide or NOP sled. Many Intel processors use hexadecimal 90 (often written as x90) as a NOP command, so a string of x90 characters is a NOP sled. The attacker writes a long string of x90 instructions into memory, followed by malicious code. When a computer executes code from a memory location anywhere in the NOP slide, execution slides through the NOPs and the system executes the attacker’s malicious code. The malicious code varies. In some instances, the attackers write code to spread a worm through the web server’s network. In other cases, the code modifies the web application so that the web application tries to infect every user who visits the web site with other malware. The attacker’s possibilities are almost endless.
A buffer overflow attack includes several different elements, but they happen all at once. The attacker sends strings of data to the application. The first part of the string causes the buffer overflow. The next part of the string is a long string of NOPs followed by the attacker’s malicious code, stored in the attacked system’s memory. Last, the malicious code goes to work. In some cases, an attacker writes a malicious script to discover buffer overflow vulnerabilities.
Although error-handling routines and input validation go a long way to prevent buffer overflows, they don’t prevent them all. Attackers occasionally discover a bug allowing them to send a specific string of data to an application causing a buffer overflow. When vendors discover buffer overflow vulnerabilities, they are usually quick to release a patch or hotfix. From an administrator’s perspective, the solution is easy: Keep the systems up to date with current patches.
Programming languages such as C, C++, and Pascal commonly use pointers, which simply store a reference to something. Some languages such as Java call them references. As an example, imagine an application has multiple modules. When a new customer starts an order, the application invokes the CustomerData module. This module needs to populate the city and state in a form after the user enters a zip code. How does the module get the data array of zip codes? One way is to pass the entire array to the module when invoking it. However, this consumes a lot of memory. The second method is to pass a reference to the data array, which is the preferred method. This method uses a pointer dereference. Dereferencing is the process of using the pointer to access the data array. Imagine the pointer is named ptrZip, and the full data array is named arrZip. The value within ptrZip is arrZip, which references the array. What is this thing that the pointer points to? There isn’t a standard name, but some developers refer to it as a pointee.
A failed dereference operation can cause an application to crash. In some programming languages, it can subtly corrupt memory, which can be even worse than a crash. The subtle, random changes result in the application using incorrect data. This can often be difficult to troubleshoot and correct. The cause of a failed dereference operation is a pointer that references a nonexistent pointee.
Applications commonly use a Dynamic Link Library (DLL) or multiple DLLs. A DLL is a compiled set of code that an application can use without recreating the code. As an example, most programming languages include math-based DLLs. Instead of writing the code to discover the square root of a number, a developer can include the appropriate DLL and access the square root function within it. DLL injection is an attack that injects a DLL into a system’s memory and causes it to run. In a successful DLL injection attack, the attacker attaches to a running process, allocates memory within the running process, connects the malicious DLL within the allocated memory, and then executes functions within the DLL.
Compiled code has been optimized by an application (called a compiler) and converted into an executable file. The compiler checks the program for errors and provides a report of items developers might like to check. Some commonly used compiled programming languages are C, C++, Visual Basic, and Pascal.
Runtime code is code that is evaluated, interpreted, and executed when the code is run. As an example, HTML is the standard used to create web pages. It includes specific tags that are interpreted by the browser when it renders the web page. HTML-based web pages are interpreted at runtime.
Many languages use a cross between compiled and runtime code. For example, Python is an interpreted language widely used to create sophisticated web sites. However, when it is first run, the Python interpreter compiles it. The server will then use the compiled version each time it runs. If the system detects a change in the Python source code, it will recompile it.
One of the most important security steps that developers should take is to include input validation. Input validation is the practice of checking data for validity before using it. Input validation prevents an attacker from sending malicious code that an application will use by either sanitizing the input to remove malicious code or rejecting the input. Improper input handling or the lack of input validation is one of the most common security issues on web-based applications. It allows many different types of attacks, such as buffer overflow attacks, SQL injection, command injection, and XSS attacks.
Some common checks performed by input validation include:
Some fields such as a zip code use only numbers, whereas other fields such as state names use only letters. Other fields are a hybrid. For example, a phone number uses only numbers and dashes. Developers can configure input validation code to check for specific character types and even verify that characters are entered in the correct order.
These checks ensure that values are within expected boundaries or ranges. For example, if the maximum purchase for a product is 3, a range check verifies the quantity is 3 or less. The validation check identifies data outside the range as invalid and the application does not use it.
Some malicious attacks embed HTML code within the input as part of an attack. These can be blocked by preventing the user from entering HTML code, such as the < and > characters.
Some attacks, such as SQL injection attacks, use specific characters such as the dash (-), apostrophe (‘), and equal sign (=). Blocking these characters helps to prevent these attacks.
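The checks listed above can be sketched in a few lines of Python. This is a minimal illustration, not a complete validation library: the patterns, the purchase limit of 3, the blocked character set, and all function names are assumptions chosen to mirror the examples in these notes.

```python
import re

# Hypothetical rules for illustration only.
ZIP_PATTERN = re.compile(r"[0-9]{5}")                        # digits only
PHONE_PATTERN = re.compile(r"[0-9]{3}-[0-9]{3}-[0-9]{4}")    # digits and dashes, in order
MAX_QUANTITY = 3                                             # assumed purchase limit
BLOCKED_CHARS = set("<>'=")                                  # characters often used in attacks


def is_valid_zip(value: str) -> bool:
    """Character check: exactly five digits."""
    return ZIP_PATTERN.fullmatch(value) is not None


def is_valid_phone(value: str) -> bool:
    """Character and order check: 3 digits, dash, 3 digits, dash, 4 digits."""
    return PHONE_PATTERN.fullmatch(value) is not None


def is_valid_quantity(value: str) -> bool:
    """Range check: a whole number from 1 to MAX_QUANTITY."""
    if re.fullmatch(r"[0-9]{1,3}", value) is None:
        return False
    return 1 <= int(value) <= MAX_QUANTITY


def has_blocked_chars(value: str) -> bool:
    """Flag HTML brackets, apostrophes, equal signs, and the SQL comment marker."""
    return any(ch in BLOCKED_CHARS for ch in value) or "--" in value
```

An application would reject or sanitize any field that fails its check before using the value. Blocking characters outright is a blunt tool, because some legitimate names contain apostrophes, so real applications usually combine it with the other checks.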
It’s possible to perform input validation at the client and the server. The client-side execution indicates that the code runs on the client’s system, such as a user’s web browser. Server-side execution indicates that the code runs on the server, such as on a web server. Client-side input validation is quicker but is vulnerable to attacks. Server-side input validation takes longer but is secure because it ensures the application doesn’t receive invalid data. Many applications use both.
In client-side input validation, the validation code is usually included in the HTML page sent to users. Unfortunately, it’s possible to bypass client-side validation techniques. Many web browsers allow users to disable JavaScript in the web browser, which bypasses client-side validation. It’s also possible to use a web proxy to capture the data sent from the client in the HTTP POST command and modify it before forwarding it to the server. Server-side input validation checks the inputted values when they reach the server. This ensures that the user hasn’t bypassed the client-side checks. Using both client-side and server-side validation provides speed and security. The client-side validation prevents round trips to the server until the user has entered the correct data. The server-side validation is a final check before the server uses the data.
Other input validation techniques attempt to sanitize HTML code before sending it to a web browser. These methods are sometimes referred to as escaping the HTML code or encoding the HTML code. As an example, the greater-than symbol (>) can be encoded with the ASCII replacement characters (&gt;). Doing so, along with following specific guidelines related to not inserting untrusted data into web pages, helps prevent many web application attacks. Most languages include libraries that developers can use to sanitize the HTML code. As an example, the Open Web Application Security Project (OWASP) Enterprise Security API (ESAPI) is a free, open-source library available for many programming languages. It includes a rich set of security-based tools, including many used for input validation.
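As a small illustration of the encoding described above, Python's standard library provides html.escape. The sketch below uses it as a stand-in for a dedicated encoding library such as ESAPI. The function name encode_for_html and the sample comment are assumptions made for this example.

```python
import html


def encode_for_html(untrusted: str) -> str:
    """Replace HTML-significant characters with entity references."""
    return html.escape(untrusted, quote=True)


# Sample untrusted input, as it might arrive from a comment form.
comment = '<script>alert("hi")</script>'
print(encode_for_html(comment))
# prints: &lt;script&gt;alert(&quot;hi&quot;)&lt;/script&gt;
```

The browser displays the encoded text as literal characters instead of running it as a script.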
When two or more modules of an application, or two or more applications, attempt to access a resource at the same time, it can cause a conflict known as a race condition. Most application developers are aware of race conditions and include methods to avoid them when writing code. However, when new developers aren’t aware of race conditions, or they ignore them, a race condition can cause significant problems. An example is two people selecting the same seat on the same flight in an online ticketing application. Online ticketing applications for planes, concerts, and other events avoid this type of race condition. In some cases, they lock the selection before offering it to a customer. In other cases, they double-check for a conflict later in the process. Most database applications have internal concurrency control processes to prevent two entities from modifying a value at the same time. However, inexperienced web application developers often overlook race conditions.
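The locking approach mentioned above can be sketched with Python threads. This is a simplified illustration that assumes an in-memory seat map. The SeatMap class, the seat label, and the customer names are hypothetical, and a real ticketing system would rely on database concurrency controls instead.

```python
import threading


class SeatMap:
    """Hypothetical in-memory seat map used only for this sketch."""

    def __init__(self, seats):
        self._owner = {seat: None for seat in seats}
        self._lock = threading.Lock()

    def reserve(self, seat: str, customer: str) -> bool:
        # The check and the update run under one lock, so two threads
        # cannot both see the seat as free.
        with self._lock:
            if self._owner[seat] is None:
                self._owner[seat] = customer
                return True
            return False

    def owner(self, seat: str):
        return self._owner[seat]


def book_concurrently(seat_map: SeatMap, seat: str, customers):
    """Let every customer try to reserve the same seat at the same time."""
    results = {}

    def worker(name):
        results[name] = seat_map.reserve(seat, name)

    threads = [threading.Thread(target=worker, args=(c,)) for c in customers]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

No matter how the threads are scheduled, exactly one reservation succeeds. Without the lock, two threads could both pass the "is it free" check before either one records its reservation.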
Error-handling and exception-handling routines ensure that an application can handle an error gracefully. They catch errors and provide user-friendly feedback to the user. When an application doesn’t catch an error, it can cause the application to fail. In the worst-case scenario, improper error-handling techniques within an application can cause the operating system to crash. Using effective error- and exception-handling routines protects the integrity of the underlying operating system. Improper error handling can often give attackers information about an application. When an application doesn’t catch an error, it often provides debugging information that attackers can use against the application. In contrast, when an application catches the error, it can control what information it shows to the user. There are two important points about error reporting:
Detailed errors provide information that attackers can use against the system, so the errors shown to users should be general. Attackers can analyze the errors to determine details about the system. For example, if an application is unable to connect with a database, a detailed error can let the attacker know exactly what type of database the system is using. This indirectly lets the attacker know what types of commands the system will accept. Also, detailed errors confuse most other users.
Detailed information on the errors, which typically includes debugging information, should be logged. Logging this information makes it easier for developers to identify what caused the error and how to resolve it.
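The two points above (general errors for users, detailed errors in the log) can be sketched as follows. The function names, the logger name, the error text, and the in-memory log stream are all assumptions made so the example is self-contained.

```python
import io
import logging

# In-memory log target so the sketch is self-contained; a real application
# would normally log to a file or a log server.
log_stream = io.StringIO()
logger = logging.getLogger("app")  # hypothetical logger name
logger.addHandler(logging.StreamHandler(log_stream))

GENERIC_MESSAGE = "An error occurred. Please try again later."


def connect_to_database():
    # Stand-in for a real database call that fails with a detailed message.
    raise ConnectionError("cannot reach db01:5432 (PostgreSQL 9.6)")


def handle_request() -> str:
    try:
        connect_to_database()
        return "OK"
    except Exception:
        # Detailed information, including the traceback, goes to the log.
        logger.exception("database connection failed")
        # The user sees only a general message.
        return GENERIC_MESSAGE
```

The database type and host name appear only in the log, where developers can use them, and never in the text returned to the user.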
In general, sensitive data is often encrypted to prevent the unauthorized disclosure of data. If an application is accessing any sensitive data, developers need to ensure that this access doesn’t result in inadvertent data exposure. Applications need to decrypt data before processing it. When done processing the data, applications need to encrypt the data before storing it or transferring it. Additionally, applications need to ensure that all remnants of the data are flushed from memory.
Certificates are used for various purposes such as authenticating users and computers. They can also be used to authenticate and validate software code. As an example, developers can purchase a certificate and associate it with an application or code. The code signing process provides a digital signature for the code, and the certificate includes a hash of the code. This provides two benefits. First, the certificate identifies the author. Second, the hash verifies the code has not been modified. If malware changes the code, the hash no longer matches, alerting the user that the code has been modified.
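The hash comparison at the heart of this check can be sketched as below. This covers only the "has the code changed" step. Real code signing also verifies a digital signature with the publisher's certificate, which is not shown. SHA-256, the function names, and the sample bytes are assumptions for illustration.

```python
import hashlib
import hmac


def sha256_hex(code: bytes) -> str:
    """Return the SHA-256 hash of the code as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_unmodified(code: bytes, published_hash: str) -> bool:
    """Compare the hash of the received code with the published hash."""
    return hmac.compare_digest(sha256_hex(code), published_hash)


# Hypothetical sample data.
original = b"print('hello')\n"
published = sha256_hex(original)           # hash recorded when the code was signed
tampered = original + b"import os\n"       # code changed after signing
```

Here is_unmodified(original, published) is True, while is_unmodified(tampered, published) is False because any change to the bytes produces a different hash.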
Developers are encouraged to reuse code whenever possible. Code reuse saves time and helps prevent the introduction of new bugs, because reused code has been tested in both internal testing and actual use. However, when reusing code, developers should ensure that they are using all the code that they copy into another application. As an example, imagine a developer has created a module that has three purposes: create users, modify users, and authenticate users. While working on a new application, he realizes he needs a module that will authenticate users. If he simply copies the entire module into the new application, it creates dead code. Dead code is code that is never executed or used. In this example, the copied code to create and modify users isn’t used in the new application, so it is dead code. Logic errors can also create dead code.
Another popular method of code reuse is the use of third-party libraries. As an example, JavaScript is a rich, interpreted language used by many web applications. Netscape originally developed it, and it was ultimately standardized as an open-source language. Software development kits (SDKs) are like third-party libraries, but they are typically tied to a single vendor. For example, if you’re creating an Android app, you can use the Android SDK. It includes software tools that will help you create apps for Android-based devices.
Obfuscation attempts to make something unclear or difficult to understand. Code obfuscation (or code camouflage) attempts to make the code unreadable. It does things like rename variables, replace numbers with expressions, replace strings of characters with hexadecimal codes, and remove comments. It’s worth noting that most security experts reject security through obscurity as a reliable method of maintaining security. Similarly, code obfuscation might make the code difficult for most people to understand. However, it’s still possible for someone with skills to dissect the code.
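To make the techniques concrete, the sketch below shows a readable function next to a hand-obfuscated equivalent. Both functions, the discount rule, and the role string are invented for this example. Real obfuscators are automated tools and go much further.

```python
# Readable version.
def total_price(unit_price, quantity):
    # Apply a 10 percent discount to orders of five or more items.
    if quantity >= 5:
        return unit_price * quantity * 0.9
    return unit_price * quantity


# Hand-obfuscated equivalent: identifiers renamed, numbers written as
# expressions, and the explanatory comment removed.
def a(b, c):
    if c >= (0x0A >> 1):
        return b * c * (9 / (2 * 5))
    return b * c


# A string replaced with hexadecimal escape codes; it still equals "admin".
ROLE = "\x61\x64\x6d\x69\x6e"
```

Both functions return the same results, which shows why obfuscation only slows a reader down. Someone with the right skills can still work out what the code does.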
Many organizations that create applications also employ testers to verify the quality of the code. Testers use a variety of different methods to put the code through its paces. Ideally, they will detect problems with the code before it goes live. Some of the common methods of testing code include:
Static code analysis examines the code without executing it. Automated tools can analyze code and mark potential defects. Some tools work as the developer creates the code, similar to a spell checker. Other tools can examine the code once it is semi finalized.
Dynamic analysis checks the code as it is running. A common method is to use fuzzing. Fuzzing uses a computer program to send random data to an application. In some cases, the random data can crash the program or create unexpected results, indicating a vulnerability. Problems discovered during a dynamic analysis can be fixed before releasing the application.
Stress testing methods attempt to simulate a live environment and determine how effectively or efficiently an application operates under a load. As an example, a web application is susceptible to a DDoS attack. A stress test can simulate a DDoS attack and determine its impact on the web application.
A sandbox is an isolated area used for testing programs. Application developers can test applications in a sandbox, knowing that any changes they make will not affect anything outside the sandbox. Virtual machines (VMs) are often used for sandboxing.
Testing helps identify and remove bugs. However, it’s also important that the software does what it’s meant to do. Model verification is the process of ensuring that software meets specifications and fulfills its intended purpose.
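The fuzzing approach described under dynamic analysis above can be illustrated with a very small fuzzer. This is only a sketch: parse_record is a hypothetical function under test with deliberately fragile parsing, and real fuzzing tools generate inputs far more cleverly than uniform random strings.

```python
import random
import string


def parse_record(text: str):
    """Hypothetical parser under test. It assumes input looks like 'name:age'."""
    name, age = text.split(":")   # fails unless there is exactly one colon
    return name, int(age)         # fails when age is not a number


def fuzz(target, runs: int = 200, seed: int = 0):
    """Send random strings to target and collect the inputs that raise."""
    rng = random.Random(seed)     # fixed seed so failures can be reproduced
    alphabet = string.ascii_letters + string.digits + ":"
    failures = []
    for _ in range(runs):
        length = rng.randint(0, 8)
        data = "".join(rng.choice(alphabet) for _ in range(length))
        try:
            target(data)
        except Exception as exc:
            failures.append((data, type(exc).__name__))
    return failures
```

Calling fuzz(parse_record) returns the inputs that caused exceptions, giving developers concrete cases to fix before release.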
Software development life cycle (SDLC) models attempt to give structure to software development projects. Two popular models are waterfall and agile.
The waterfall model includes multiple stages going from top to bottom. Each stage feeds the next stage, so when you finish one stage, you move on to the next stage. When following the waterfall model strictly, you don’t go back to a stage after finishing it. There are multiple variations of the waterfall model, but they all use stages. However, the names of these stages vary from one model to another. Some typical stages used with the waterfall model include:
The developers work with the customer to understand the requirements. The output of this stage is a requirements document, which provides clear guidance on what the application will do.
Developers begin to design the software architecture in this stage. This is similar to creating the blueprints for a building. The design stage doesn’t include any detailed coding but instead focuses on the overall structure of the project.
Developers write the code at this stage based on the requirements and design.
The verification stage ensures the code meets the requirements.
The maintenance stage implements changes and updates as desired.
A challenge with the waterfall model is that it lacks flexibility. It is difficult to revise anything from previous stages. For example, if a customer realizes a change in the requirements is needed, it isn’t possible to implement this change until the maintenance stage. The agile model uses a set of principles that stress interaction, creating a working application, collaborating with the customer, and responding to change. Instead of strict phases, the agile model uses iterative cycles. Each cycle creates a working, if not complete, product. Testers verify the product works with the current features, and then developers move into the next cycle. The next cycle adds features, often as small, incremental changes from the previous cycle.
A key difference of the agile model compared with the waterfall model is that it emphasizes interaction between customers, developers, and testers during each cycle. In contrast, the waterfall model encourages interaction with customers during the requirements stage, but not during the design and implementation stages. The agile model can be very effective if the customer has a clear idea of the requirements. If not, the customer might ask for changes during each cycle, extending the project’s timeline.
DevOps combines the words development and operations, and it’s an agile-aligned software development methodology. Secure DevOps is a software development process that includes extensive communication between software developers and operations personnel. It also includes security considerations throughout the project. When applied to a software development project, it can allow developers to push out multiple updates a day in response to changing business needs. Some of the concepts included within a secure DevOps project are summarized in the following bullets:
Security automation uses automated tests to check code. When modifying code, it’s important to test it and ensure that the code doesn’t introduce software bugs or security flaws. It’s common to include a mirror image of the production environment and run automated tests on each update to ensure it is error free.
Continuous integration refers to the process of merging code changes into a central repository. Software is then built and tested from this central repository. The central repository includes a version control system, and the version control system typically supports rolling back code changes when they cause a problem.
Baselining refers to applying changes to the baseline code every day and building the code from these changes. For example, imagine five developers are working on different elements of the same project. Each of them has modified and verified some code on their computer. At the end of the day, each of these five developers uploads and commits their changes. Someone then builds the code with these changes and then automation techniques check the code. The benefit is that bugs are identified and corrected more quickly. In contrast, if all the developers applied their changes once a week, the bugs could multiply and be harder to correct.
Immutable systems cannot be changed. Within the context of secure DevOps, it’s possible to create and test systems in a controlled environment. Once they are created, they can be deployed to a production environment. As an example, it’s possible to create a secure image of a server for a specific purpose. This image can be deployed as an immutable system to ensure it stays secure.
Infrastructure as code refers to managing and provisioning data centers with code that defines virtual machines (VMs). Many VMs are created with scripts. Once the script is created, new VMs can be created just by running the script.
In software development, change management helps ensure that developers do not make unauthorized changes. The change management process allows several people to examine the change to ensure it won’t cause unintended consequences. Also, any change to the application becomes an added responsibility. In addition to preventing unauthorized changes and related problems, a change management process also provides an accounting structure to document changes. Once a change is authorized and implemented, the change is documented in a version control document.
Version control tracks the version of software as it is updated, including who made the update and when. Many advanced software development tools include sophisticated version control systems. Developers check out the code to work on it and check it back into the system when they’re done. The version control system can then document every single change made by the developer. Even better, this version control process typically allows developers to roll back changes to a previous version when necessary.
Provisioning and deprovisioning typically refer to user accounts. For user accounts, provisioning refers to creating an account and giving it appropriate privileges so that a user can use the newly created account to access various resources. Deprovisioning refers to removing access to resources and can be as simple as disabling the account. Within the context of secure application development and deployment concepts, these terms apply to an application. Provisioning an application refers to preparing and configuring the application to launch on different devices and to use different application services. Deprovisioning an app refers to removing it from a device.
Web servers most commonly host web sites accessible on the Internet, but they can also serve pages within an internal network. Organizations place web servers within a demilitarized zone (DMZ) to provide a layer of protection. The two primary applications used for web servers are:
Apache is the most popular web server used on the Internet. It’s free and can run on Unix, Linux, and Windows systems.
IIS is a Microsoft web server, and it’s included free with any Windows Server product.
Structured Query Language (SQL) is used to communicate with databases. SQL statements read, insert, update, and delete data in a database. Many web sites use SQL statements to interact with a database, providing users with dynamic content. A database is a structured set of data. It typically includes multiple tables, and each table holds multiple columns (attributes) and rows (tuples).
Normalization of a database refers to organizing the tables and columns to reduce redundant data and improve overall database performance. Although there are several normal forms, the first three are the most important.
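A database is in first normal form (1NF) if it meets the following three criteria: each row within a table is unique and identified with a primary key, related data is contained in a separate table, and none of the columns include repeating groups.
Second normal form (2NF) only applies to tables that have a composite primary key, where two or more columns make up the full primary key. A database is in 2NF if it meets the following criteria: it is in 1NF, and all non-primary key attributes are completely dependent on the full composite primary key.
Third normal form (3NF) helps eliminate unnecessary redundancies within a database. A database is in 3NF if it meets the following criteria: it is in 2NF, and all columns that aren’t primary keys are dependent only on the primary key.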
In a SQL injection attack, the attacker enters additional data into the web page form to generate different SQL statements. SQL query languages use a semicolon (;) to indicate the end of the SQL line and use two dashes (--) as an ignored comment. With this knowledge, the attacker could enter different information into the web form so that the resulting query looks like:
SELECT * FROM books WHERE author = 'Darril Gibson'; SELECT * FROM customers; --'
The first query retrieves data from the database as the web site expected. However, the semicolon signals the end of the first query, so the database will accept another command. The next query reads all the data in the customers table, which can give the attacker access to names, credit card data, and more. The last two dashes comment out the second single quote to prevent a SQL error.
If the application doesn’t include error-handling routines, the errors generated by improperly formatted SQL statements provide details about the type of database the application is using, such as Oracle, Microsoft SQL Server, or MySQL. Different databases format SQL statements slightly differently, but once the attacker learns the database brand, it’s a simple matter to format the SQL statements required by the brand. The attacker then follows with SQL statements to access the database, which may allow the attacker to read, modify, delete, and/or corrupt data.
Many SQL injection attacks use the phrase or '1' = '1' to create a true condition. For example, if an online database allows you to search a customers table looking for a specific record, it might expect you to enter a name. If you entered Homer Simpson, it would create a query like:
SELECT * FROM customers WHERE name = 'Homer Simpson';
This query will retrieve a single record for Homer Simpson. However, if the attacker enters ' or '1' = '1' -- instead of Homer Simpson, it will create a query like:
SELECT * FROM customers WHERE name = '' or '1' = '1' --'
The first clause will likely not return any records because the table is unlikely to have any records with the name field empty. However, because the number 1 always equals the number 1, the second clause always equates to True. The WHERE clause is therefore true for every row, so the SELECT statement retrieves all records from the customers table.
In many cases, a SQL injection attack starts by sending improperly formatted SQL statements to the system to generate errors. Proper error handling prevents the attacker from gaining information from these errors. Instead of showing the errors to the user, many web sites simply present a generic error web page that doesn’t provide any details. Input validation also provides strong protection against SQL injection attacks. Before using the data entered into a web form, the web application verifies that the data is valid. Additionally, database developers often use stored procedures with dynamic web pages. A stored procedure is a group of SQL statements that execute as a whole, similar to a mini program. A parameterized stored procedure accepts data as an input called a parameter. Instead of copying the user’s input directly into a SELECT statement, the input is passed to the stored procedure as a parameter. The stored procedure performs data validation, but it also handles the parameter (the inputted data) differently and prevents a SQL injection attack. Depending on how well the database server is locked down (or not), SQL injection attacks may allow the attacker to access the structure of the database, all the data, and even modify data.
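The difference between copying input into a query and passing it as a parameter can be demonstrated with Python's built-in sqlite3 module. SQLite does not support stored procedures, so this sketch substitutes a parameterized query, which applies the same idea of treating input as data. The table contents and function names are invented for the example.

```python
import sqlite3


def make_db():
    """Create a small in-memory customers table with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE customers (name TEXT, card TEXT)")
    conn.executemany(
        "INSERT INTO customers VALUES (?, ?)",
        [("Homer Simpson", "1111"), ("Marge Simpson", "2222")],
    )
    return conn


def find_unsafe(conn, name):
    # Vulnerable: user input is copied directly into the SQL text.
    query = "SELECT name FROM customers WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()


def find_safe(conn, name):
    # The ? placeholder passes the input as data, never as SQL.
    return conn.execute(
        "SELECT name FROM customers WHERE name = ?", (name,)
    ).fetchall()
```

With the input ' or '1' = '1' -- the unsafe function returns every row in the table, while the safe function returns no rows because no customer has that literal string as a name.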
Besides DLL injection and SQL injection, another type of injection attack is the command injection attack. In some cases, attackers can inject operating system commands into an application using web page forms or text boxes. Any web page that accepts input from users is a potential threat. Directory traversal is a specific type of command injection attack that attempts to access a file by including the full directory path or traversing the directory structure. For example, in Unix systems, the passwd file includes user logon information, and it is stored in the /etc directory with a full directory path of /etc/passwd. Attackers can use commands such as ../../etc/passwd or /etc/passwd to read the file. Similarly, they could use a remove directory command such as rm -rf to delete a directory, including all files and subdirectories. Input validation can prevent these types of attacks.
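One way to validate a user-supplied file path against directory traversal is to resolve it and confirm it still sits inside an allowed directory. The sketch below assumes a hypothetical base directory of /srv/app/files on a Unix-like system. The function names are also invented for illustration.

```python
import os

BASE_DIR = "/srv/app/files"  # hypothetical directory the application may serve


def resolve_inside(base_dir: str, user_path: str) -> str:
    """Return the resolved path for user_path, or raise if it escapes base_dir."""
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and follows symbolic links.
    target = os.path.realpath(os.path.join(base, user_path))
    if os.path.commonpath([base, target]) != base:
        raise ValueError("path escapes the allowed directory")
    return target


def is_allowed(user_path: str) -> bool:
    """True when user_path stays inside BASE_DIR."""
    try:
        resolve_inside(BASE_DIR, user_path)
        return True
    except ValueError:
        return False
```

A request for docs/readme.txt is allowed, while ../../etc/passwd and /etc/passwd are both rejected because the resolved path falls outside the base directory.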
Cross-site scripting (XSS) is another web application vulnerability that can be prevented with input validation techniques. Attackers embed malicious HTML or JavaScript code into a web site’s code. The code executes when the user visits the site. XSS attacks allow attackers to capture user information such as cookies. The primary protection against XSS attacks is at the web application with sophisticated input validation techniques. Additionally, OWASP strongly recommends the use of a security encoding library. When implemented, an encoding library will sanitize HTML code and prevent XSS attacks. It’s also important to educate users about the dangers of clicking links. Some XSS attacks send emails with malicious links within them. This kind of XSS attack fails if users do not click the link.
Cross-site request forgery (CSRF) is an attack where an attacker tricks a user into performing an action on a web site. The attacker creates a specially crafted HTML link, and the user performs the action without realizing it. As an example of how HTML links create action, consider the HTML link http://www.google.com/search?q=Success. If users click this link, it works just as if the user browsed to Google and entered Success as a search term. The ?q=Success part of the query causes the action. Many web sites use the same type of HTML queries to perform actions, which could include making purchases, transferring money, confirming password resets, and more. If a web site supports any action via an HTML link, an attack is possible. Web sites typically won’t allow these actions without users first logging on. However, if users have logged on before, authentication information is stored on their system either in a cookie or in the web browser’s cache. Some web sites automatically use this information to log users on as soon as they visit. In some cases, the CSRF attack allows the attacker to access the user’s password.
Users should be educated on the risks related to links from sources they don’t recognize. Developers need to be aware of CSRF attacks and the different methods used to protect against them. One method is to use dual authentication and force the user to manually enter credentials prior to performing actions. Another method is to expire the cookie after a short period, such as 10 minutes, preventing automatic logon for the user. Many programming languages and web frameworks support CSRF tokens. For example, Django, a popular Python web framework, requires a CSRF token in pages that include a form. This token is a large random number generated each time the form is displayed. When a user submits the form, the web page includes the token along with other form data. The web application then verifies that the token in the HTTP request is the same as the token included in the web form. The HTTP request might be something like:
getcertificatedgetahead.com/edit?action=set&key=email&value=you@home.com&token=1357924
The token is typically much longer. If the web site receives a query with an incorrect token, it typically raises a 403 Forbidden error. Attackers cannot guess the token, so they cannot craft a malicious link that will work against the site.
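The token check can be sketched with Python's standard library. This is not how Django or any particular framework implements it. The function names and the 32-byte token size are assumptions, and a real application would store the expected token in the user's session.

```python
import hmac
import secrets


def new_csrf_token() -> str:
    """Generate an unpredictable token to embed in a form."""
    return secrets.token_hex(32)  # 32 random bytes, 64 hex characters


def is_valid_token(session_token: str, submitted_token: str) -> bool:
    # compare_digest is designed to reduce timing differences that could
    # reveal how much of the token matched.
    return hmac.compare_digest(session_token, submitted_token)


def handle_form(session_token: str, submitted_token: str) -> int:
    """Return an HTTP status code: 200 when the token matches, else 403."""
    return 200 if is_valid_token(session_token, submitted_token) else 403
```

A forged request that guesses a token such as 1357924 receives a 403 status, while a form submitted with the token issued to that user's session is accepted.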
Within the context of cybersecurity, there are multiple references available that describe best practices and provide instructions on how to secure systems. Some of these are industry-standard frameworks, while others are platform- or vendor-specific guides. A framework is a structure used to provide a foundation. Cybersecurity frameworks typically use a structure of basic concepts, and they provide guidance to professionals on how to implement security in various systems. Some generic categories of frameworks are:
Regulatory frameworks are based on relevant laws and regulations. As an example, the Health Insurance Portability and Accountability Act (HIPAA) mandates specific protections of all health-related data. The Office of the National Coordinator for Health Information Technology (ONC) and the Office for Civil Rights (OCR) created the HIPAA Security Risk Assessment (SRA) Tool. This tool provides a framework that organizations can use to help ensure compliance with HIPAA.
A non-regulatory framework is not required by any law. Instead, it typically identifies common standards and best practices that organizations can follow. As an example, COBIT (Control Objectives for Information and Related Technologies) is a framework that many organizations use to ensure that business goals and IT security goals are linked together.
Some frameworks are used within a single country (and referred to as national frameworks), while others are used internationally. As an example, NIST created the Cybersecurity Framework, which focuses on cybersecurity activities and risks within the United States. In contrast, the International Organization for Standardization (ISO) and the International Electrotechnical Commission (IEC) create and publish international standards. For example, ISO/IEC 27002 provides a framework for IT security.
Some frameworks only apply to certain industries. As an example, organizations that handle credit cards typically comply with the Payment Card Industry Data Security Standard (PCI DSS). PCI DSS includes 12 requirements and over 200 sub-requirements that organizations follow to protect credit card data.
In addition to frameworks, you can also use various guides to increase security. These include benchmark or secure configuration guides, platform- or vendor-specific guides, and general-purpose guides. On the surface, this is quite simple. When configuring Linux systems, use a Linux guide. When configuring a Windows system, use a Windows guide. Additionally, when configuring a system for a specific role, use a guide for that role. As an example, a web server would need ports 80 and 443 open for HTTP and HTTPS, respectively. However, a database application server would not typically need these ports open, so they should be closed on a database application server. The individual guides for each of the roles provide this information.
These review notes were compiled from the Study Guide by Darril Gibson and are for study reference only.