Python Basics for Pentesters

The entirety of this guide was written by Martian Defense, LLC

Introduction to Python

Python is a high-level, interpreted, general-purpose, dynamically typed programming language. It is often used as a scripting language because of its forgiving syntax and compatibility with a wide variety of systems and libraries. In this section, we will review the basic building blocks of Python, from the interpreter and syntax to building simple functions, classes, and modules. These fundamentals will serve as your foundation for using Python in penetration testing.

Python is an interpreted language, which means that it is processed at runtime by the interpreter. You can use the Python interpreter in two ways:

  • Interactive mode

  • Script mode

Interactive mode

In interactive mode, you type Python code and the interpreter displays the result:

$ python
>>> print("Hello, Earth!")
Hello, Earth!
>>> 

Script mode

In script mode, you store code in a file and use the interpreter to execute the contents of the file:

$ python hello.py
Hello, Earth!

Syntax

Python was designed to be easy to understand and fun to use. Its syntax is clear and it has a distinct style guideline, PEP 8, which promotes readability.

Variables and Data Types

Python has several fundamental data types:

  • Numeric Types: int, float, complex

  • Boolean Type: bool

  • Text Type: str

  • Sequence Types: list, tuple, range

  • Mapping Type: dict

  • Set Types: set, frozenset

You can assign a value of any data type to any variable:
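As a short illustration of assignment, the sketch below binds values of several types to variables. All names and values are hypothetical and chosen only for the example.

```python
# Illustrative names and values only.
port = 443                      # int
timeout = 2.5                   # float
is_open = True                  # bool
hostname = "target.example"     # str
ports = [22, 80, 443]           # list

# A variable can later be rebound to a value of a different type.
port = "443"
print(type(port).__name__)
```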

Nonprimitive Data Types

  • List: Any data that is enclosed in square brackets ( [ ] ) and separated by commas is considered a list. In Python, the objects in a list are indexed: the first object has index 0, the next object has index 1, and so on.

  • Tuple: Any data that is enclosed in parentheses ( ( ) ) and separated by commas is considered a tuple. Tuples are immutable, meaning that data stored in a tuple cannot be modified at run time. The following is a tuple of IP addresses.

  • Dictionary: A collection of key-value pairs that is enclosed in curly brackets ( { } ) and separated by commas is a dictionary. The following dictionary has keys that are equal to interface names and values with the desired state of the interface.

  • Set: A collection of unique objects that is enclosed in curly brackets ( { } ) and separated by commas is considered a set.
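The four types described above can be sketched as follows. The device names, IP addresses, interface names, and VLAN numbers are hypothetical sample data.

```python
devices = ["NEXUS", "ASA", "CSR"]                    # list
dns_servers = ("8.8.8.8", "1.1.1.1")                 # tuple of IP addresses
interfaces = {"Gi0/1": "up", "Gi0/2": "shutdown"}    # dictionary: name -> state
vlans = {10, 20, 20, 30}                             # set: the duplicate 20 is dropped

print(devices[0])            # lists are indexed from 0
print(dns_servers[1])
print(interfaces["Gi0/2"])   # dictionaries are looked up by key
print(len(vlans))

# Tuples are immutable, so item assignment raises TypeError.
try:
    dns_servers[0] = "9.9.9.9"
except TypeError as error:
    print(f"Cannot modify a tuple: {error}")
```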

Primitive data types can be converted to other primitive types using built-in functions, assuming that the converted value is valid.
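A brief sketch of such conversions with the built-in int(), float(), str(), and bool() functions, including one invalid value. The sample values are arbitrary.

```python
port = int("8080")        # str -> int
ratio = float("0.75")     # str -> float
label = str(443)          # int -> str
flag = bool(0)            # int -> bool (0 is False, any other number is True)

print(port, ratio, label, flag)

# An invalid value raises ValueError instead of being converted.
try:
    int("http")
except ValueError as error:
    print(f"Conversion failed: {error}")
```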

Some nonprimitive data types can also be converted to other (similar) data types. For example, a list can be converted to a set or a tuple, but a flat list of single values cannot be converted to a dictionary, because a dictionary needs key-value pairs.
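The original listing is not available here, so the following is a reconstruction under assumptions: a hypothetical devices list with a duplicate "NEXUS" entry and both "ASA" and "asa", converted to a set.

```python
# Hypothetical sample data: "NEXUS" appears twice, "ASA" and "asa" differ in case.
devices = ["NEXUS", "ASA", "asa", "CSR", "NEXUS"]

unique_devices = set(devices)

# Sets are unordered, so the printed order may vary between runs.
print(unique_devices)
print(len(devices), len(unique_devices))
```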

As shown in the output, the devices list contains a duplicate entry of "NEXUS", but it was removed when the list was converted to the set data type. Note that a set compares items case-sensitively: 'ASA' and 'asa' are both still present in the set because they are different values.

It is possible to convert a list to a tuple, but the resulting tuple can no longer be modified. Here you can see an example:
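A minimal sketch with a hypothetical two-item list: once converted, the tuple has no append() method, so the attempt raises AttributeError.

```python
devices = ["NEXUS", "ASA"]
frozen_devices = tuple(devices)
print(frozen_devices)

# Tuples have no methods that change their contents.
try:
    frozen_devices.append("CSR")
except AttributeError as error:
    print(f"Cannot modify the tuple: {error}")
```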

It is not possible to convert one data type to another if the converted value is invalid. For example, an error will be raised if you attempt to convert a list to a dictionary. Here is an example of such an error:
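The sketch below, using hypothetical sample data, shows the error for a flat list of strings. It also shows the one case that does work: a list whose items are already (key, value) pairs.

```python
devices = ["NEXUS", "ASA"]

# Each item would have to be a (key, value) pair, so this raises ValueError.
try:
    dict(devices)
except ValueError as error:
    message = str(error)
    print(f"ValueError: {message}")

# A list of pairs converts cleanly.
pairs = [("NEXUS", "up"), ("ASA", "down")]
states = dict(pairs)
print(states)
```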

Each data type that was mentioned previously supports different built-in methods and attributes. To list the methods that can be used on a particular data type, create a variable of that type and pass it to the built-in dir() function. Here you see an example of methods that can be used on the string data type.

If you want to convert a string to all capital letters, you need to use the upper() method.

To learn how to use each method, you can use the help() function.
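The three steps above (dir(), upper(), and help()) can be sketched together. The hostname value is hypothetical, and names starting with an underscore are filtered out only to keep the listing short.

```python
hostname = "router1"

# dir() returns every attribute name; drop the internal ones for readability.
string_methods = [name for name in dir(hostname) if not name.startswith("_")]
print(string_methods)

# upper() returns a new string; the original is unchanged.
print(hostname.upper())

# help() prints the documentation for a single method.
help(str.upper)
```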

Now you have an understanding of the different data types that Python supports. You can expand that knowledge with the following slightly more advanced topic: nested nonprimitive data types.

Nested data structures are only applicable to nonprimitive types. Each nonprimitive type can contain the same or other nonprimitive data types as nested entries.

For example, a nested list can contain a dictionary as a nested item:

To access the value inside the nested list without looping through the list, you first need to identify its position in the list. Because lists are ordered, and the positions, starting from 0, are incremented from left to right by 1, the position for the nested dictionary will be 1.

Now, to refer to the position, you need to put an integer within the square brackets:

However, that will only give you the [{"state": "shutdown"}] item, because it is also a list, and has only one value that can be referenced with its positional number:

At this point, what remains is a dictionary. Now you can print the value of a key by appending the key name, in square brackets, after the two positional indexes:
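The walk-through above can be sketched as follows. The outer list and the interface name are assumptions; only the nested [{"state": "shutdown"}] item comes from the text.

```python
# Hypothetical nested list: position 0 is a string, position 1 is a list holding one dictionary.
interface = ["GigabitEthernet1", [{"state": "shutdown"}]]

inner_list = interface[1]          # the nested list
inner_dict = interface[1][0]       # the only item of that list: a dictionary
state = interface[1][0]["state"]   # the value stored under the "state" key

print(inner_list)
print(inner_dict)
print(state)
```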

Now, consider a nested dictionary. Dictionaries are accessed by key rather than by position, so a value cannot be referenced with a positional index. Instead, it is referenced directly by specifying the key name and enclosing it in square brackets. Here is an example:

You can obtain nested values that are stored under each root key. In this example, start with csr1kv1. To return the value stored in the key, you enclose the key’s name in square brackets and append it to the variable.

The returned value is another dictionary. To return the value of the nested key, you need to add its name after the name of the root key.

Now consider the second root key. As you can see, the value is a list, so you need to act accordingly. You need to obtain the value, pick the position within the list, and then use the key name to return the value.
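A sketch of both cases, assuming a hypothetical inventory: the csr1kv1 key is taken from the text, while csr1kv2 and the nested keys and values are invented for illustration.

```python
inventory = {
    "csr1kv1": {"mgmt_ip": "10.0.0.1"},          # value is a dictionary
    "csr1kv2": [{"mgmt_ip": "10.0.0.2"}],        # value is a list holding a dictionary
}

first_value = inventory["csr1kv1"]               # returns the nested dictionary
first_ip = inventory["csr1kv1"]["mgmt_ip"]       # root key, then nested key

second_ip = inventory["csr1kv2"][0]["mgmt_ip"]   # root key, list position, nested key

print(first_value)
print(first_ip)
print(second_ip)
```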

Understanding how to find the position in nested lists is crucial in day-to-day programming, especially when dealing with API calls.

Control Structures

Control structures determine the flow of your program. They include conditionals (if, elif, else) and loops (for, while).
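A small sketch combining a for loop, an if/elif/else chain, and a while loop. The port numbers and labels are arbitrary examples.

```python
ports = [21, 22, 80, 443]
labels = {}

# for loop with a conditional chain
for port in ports:
    if port == 22:
        labels[port] = "ssh"
    elif port in (80, 443):
        labels[port] = "web"
    else:
        labels[port] = "other"

print(labels)

# while loop: repeat until the condition becomes false
attempts = 0
while attempts < 3:
    attempts += 1
print(f"Stopped after {attempts} attempts")
```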

Functions

A function is a block of code that runs only when it is called. Functions make your application more modular and allow a high degree of code reuse. You can define functions using the def keyword:
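A minimal sketch of a function with default arguments. The build_url name and its parameters are hypothetical.

```python
def build_url(host, port=80, scheme="http"):
    """Return a URL string for the given host, port, and scheme."""
    return f"{scheme}://{host}:{port}"


# The function body runs only when the function is called.
print(build_url("10.0.0.5"))
print(build_url("10.0.0.5", port=8443, scheme="https"))
```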

Classes

Python is an object-oriented language and classes provide a means of bundling data and functionality together. Creating a new class creates a new type of object, allowing new instances of that type to be made:
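A minimal sketch of a class that bundles data (a host and its open ports) with functionality. The Target class and its methods are hypothetical.

```python
class Target:
    """Hold a host address together with the open ports found on it."""

    def __init__(self, host):
        self.host = host
        self.open_ports = []

    def add_port(self, port):
        # Avoid recording the same port twice.
        if port not in self.open_ports:
            self.open_ports.append(port)

    def summary(self):
        return f"{self.host}: {len(self.open_ports)} open port(s)"


# Each call to Target(...) creates a new, independent instance.
web_server = Target("10.0.0.5")
web_server.add_port(80)
web_server.add_port(443)
web_server.add_port(80)
print(web_server.summary())
```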

Modules

A module allows you to logically organize your Python code. Grouping related code into a module makes the code easier to understand and use:

You can then import this module in another script:
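The sketch below shows both halves in one runnable script. Normally you would save the module by hand as a .py file next to your script; here it is written to a temporary directory only so the example is self-contained. The module name netutils_demo and the banner function are hypothetical.

```python
import sys
import tempfile
from pathlib import Path

# Step 1: the module is just a file of Python code (written here for the demo).
module_dir = Path(tempfile.mkdtemp())
(module_dir / "netutils_demo.py").write_text(
    "def banner(host):\n"
    "    return f'Scanning {host}'\n"
)

# Step 2: make the directory importable, then import the module by file name.
sys.path.insert(0, str(module_dir))
import netutils_demo

print(netutils_demo.banner("10.0.0.5"))
```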

Network Programming

In penetration testing, network programming is a crucial skill. With Python, you can write scripts that can sniff network packets, perform network scans, and execute other related tasks. In this section, we will review the fundamentals of network programming in Python.

Socket Programming

A socket is one endpoint of a two-way communication link between two programs running on a network. Python's standard library includes the socket module, which exposes the common socket operations. Here's a basic server-client program example:

Server:

Client:
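A minimal sketch that runs both halves in one script over the loopback interface, assuming a TCP echo server in a background thread. Binding to port 0 asks the operating system for any free port; the message text is arbitrary.

```python
import socket
import threading


def run_server(server_socket):
    """Accept one client, echo back what it sends, then return."""
    connection, _address = server_socket.accept()
    with connection:
        data = connection.recv(1024)
        connection.sendall(b"Echo: " + data)


# Server side: create, bind, and listen.
server_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
server_socket.bind(("127.0.0.1", 0))
server_socket.listen(1)
host, port = server_socket.getsockname()

server_thread = threading.Thread(target=run_server, args=(server_socket,))
server_thread.start()

# Client side: connect, send, and read the reply.
with socket.create_connection((host, port), timeout=5) as client_socket:
    client_socket.sendall(b"Hello, Earth!")
    reply = client_socket.recv(1024)

server_thread.join()
server_socket.close()
print(reply.decode())
```

In practice the server and client would live in separate scripts, often on separate machines.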

TCP and UDP Connections

The two main transport protocols carried over the Internet Protocol (IP) are TCP (Transmission Control Protocol) and UDP (User Datagram Protocol). In Python, we can set up both types of connections with the socket module.

TCP Connection

TCP is a connection-oriented protocol that provides reliable, ordered delivery: received data is acknowledged and lost segments are retransmitted. Here's a basic TCP client-server program:

Server:

UDP Connection

UDP is not a connection-oriented protocol. Unlike TCP, it doesn't confirm whether the data reached the receiver or not. Here's a basic UDP client-server program:

Server:

Client:
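A minimal sketch of a UDP exchange over the loopback interface, with both sockets in one script. No connection is established; each datagram carries its destination address. The payload is arbitrary.

```python
import socket

# Server side: bind a datagram socket to any free local port.
server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
server_socket.bind(("127.0.0.1", 0))
server_socket.settimeout(5)
server_address = server_socket.getsockname()

# Client side: no connect() call, just send a datagram to the server address.
client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
client_socket.settimeout(5)
client_socket.sendto(b"ping", server_address)

# Server receives the datagram and answers the sender.
data, client_address = server_socket.recvfrom(1024)
server_socket.sendto(data.upper(), client_address)

# Client reads the reply.
reply, _ = client_socket.recvfrom(1024)

client_socket.close()
server_socket.close()
print(reply.decode())
```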

Network Scanning

Network scanning is the process of identifying active hosts (clients and servers) on a network and their open ports. In Python, we can use the socket module to perform this task. For instance, the following script checks for open TCP ports on a target host:
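A minimal sketch of a TCP connect scan, assuming a hypothetical scan_ports helper. connect_ex() returns 0 when the connection succeeds and an error code otherwise. Only scan hosts you are authorized to test.

```python
import socket


def scan_ports(host, ports, timeout=0.5):
    """Return the ports on host that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # 0 means the three-way handshake completed, so the port is open.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports


# Example: check a few common ports on the local machine.
print(scan_ports("127.0.0.1", [22, 80, 443]))
```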

Web Scraping

Web scraping is a technique to extract data from websites. It involves making HTTP requests to the URLs of specific websites and then parsing the response (HTML) to pull out the information you need. Python provides several libraries to simplify web scraping, including requests for making HTTP requests and BeautifulSoup for parsing HTML.

HTTP Requests

The first step in web scraping is to send an HTTP request to the URL of the webpage you want to access. When a browser sends a request to a server, it's basically asking that server to send it a webpage. In Python, the requests library is commonly used for making HTTP requests.

The following example demonstrates how to use requests to make a GET request:
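A minimal sketch, assuming the third-party requests package is installed. The fetch helper and the example URL are hypothetical; the call is left commented out because it needs network access.

```python
import requests


def fetch(url, timeout=10):
    """Send a GET request and return the status code and body text."""
    response = requests.get(url, timeout=timeout)
    return response.status_code, response.text


# Example usage (requires network access):
# status_code, body = fetch("https://example.com")
# print(status_code)
# print(body[:200])
```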

The get() function sends a GET request to the specified URL and returns a Response object. This object contains the server's response to your request. You can get the content of the response with response.text, and the HTTP status code with response.status_code.

HTML Parsing

Once you have accessed the HTML content of the webpage, you can use it to extract the data you need. This is known as parsing. Python has several libraries for parsing HTML, including BeautifulSoup and lxml.

BeautifulSoup is a Python library for parsing HTML and XML documents. It transforms a complex HTML document into a tree of Python objects, such as tags, navigable strings, or comments.

Here is an example of how to use BeautifulSoup to parse HTML content:
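A minimal sketch, assuming the third-party beautifulsoup4 package is installed. To keep it runnable offline, a static HTML string stands in for the response.text you would get from a real request.

```python
from bs4 import BeautifulSoup

# Static HTML standing in for response.text from a real HTTP request.
html = "<html><body><h1>Login Portal</h1><p>Welcome</p></body></html>"

soup = BeautifulSoup(html, "html.parser")

# find() returns the first matching tag, or None when there is no match.
heading = soup.find("h1")
print(heading.get_text() if heading else "no <h1> found")
```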

In this example, BeautifulSoup(response.text, 'html.parser') creates a BeautifulSoup object and specifies the parser. soup.find('h1') finds the first <h1> tag in the HTML.

Handling Cookies and Sessions

In some cases, you may need to maintain a session between multiple requests to the same website. For example, you might need to log in to a website and then access a specific page that requires authentication. The requests library provides a Session object to handle this.

A Session object allows you to persist certain parameters across requests. It also persists cookies across all requests made from the Session instance.

Here is an example of how to use a Session object to log in to a website and then access a protected page:
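A minimal sketch, assuming the requests package and a site with a simple form-based login. The URLs and the form field names (username, password) are hypothetical and will differ per site.

```python
import requests


def fetch_protected_page(login_url, protected_url, username, password):
    """Log in with a Session, then fetch a page that requires the login cookie."""
    with requests.Session() as session:
        # Cookies set by the login response are stored on the session.
        login = session.post(
            login_url,
            data={"username": username, "password": password},  # hypothetical field names
            timeout=10,
        )
        login.raise_for_status()

        # The same session sends those cookies automatically.
        page = session.get(protected_url, timeout=10)
        page.raise_for_status()
        return page.text


# Example usage (hypothetical URLs, requires network access):
# html = fetch_protected_page("https://example.com/login",
#                             "https://example.com/account", "tester", "secret")
```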

Working with Web APIs

Web APIs (Application Programming Interfaces) provide a way for applications to interact with each other. They expose parts of their service over the network, allowing other software to request data or perform actions.

You can interact with a Web API by sending HTTP requests, just like you do when you're scraping a webpage. The only difference is that, instead of getting HTML content in response, you get data in a machine-readable format, like JSON.

Here is an example of how to send a GET request to a Web API and parse the JSON response:
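A minimal sketch, assuming the requests package. The get_json helper, the endpoint URL, and the query parameter are hypothetical.

```python
import requests


def get_json(url, params=None, timeout=10):
    """Send a GET request and return the decoded JSON body."""
    response = requests.get(url, params=params, timeout=timeout)
    response.raise_for_status()      # raise on 4xx/5xx responses
    return response.json()           # parse the JSON body into Python objects


# Example usage (hypothetical endpoint, requires network access):
# data = get_json("https://api.example.com/hosts", params={"status": "up"})
# for host in data:
#     print(host)
```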

File Handling

In the process of penetration testing, you often need to read from files (like configurations, wordlists, etc.) or write results to files. Python has powerful built-in features for file handling, which include methods for creating, reading, updating, and deleting files.

Opening Files

In Python, you use the built-in open function to open a file. It is most often called with two arguments: the name of the file and the mode in which to open it (for example, 'r' to read, 'w' to write, or 'a' to append).

Reading Files

Once you have opened a file for reading, you can use the read, readline, or readlines method to read the file's content.

Note: Don't forget to close the file when you're done with it, or open it in a with statement, which closes it automatically.

Writing Files

To write to a file, you open the file in write ('w') or append ('a') mode, and then use the write method.

Keep in mind that opening a file in write mode will overwrite the existing content of the file. If you want to add to the existing content instead, open the file in append mode.
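The opening, reading, and writing steps above can be sketched together. The file name and contents are hypothetical, and a temporary directory is used so the example does not touch your own files.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "ports.txt")

# 'w' creates the file, or overwrites it if it already exists.
with open(path, "w") as f:
    f.write("22\n80\n")

# 'a' appends to the existing content.
with open(path, "a") as f:
    f.write("443\n")

# read() returns the whole file as one string.
with open(path, "r") as f:
    content = f.read()

# readlines() returns a list with one string per line.
with open(path, "r") as f:
    lines = [line.strip() for line in f.readlines()]

# Without a with statement, the file must be closed explicitly.
f = open(path, "r")
first_line = f.readline().strip()
f.close()

print(content)
print(lines)
print(first_line)
```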

Working with JSON Files

JSON (JavaScript Object Notation) is a popular data format that is often used for communication between a server and a client, or between different parts of a single application. Python includes the json module which allows you to read and write JSON data.
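A minimal sketch of writing and reading a JSON file. The data values and the file name are hypothetical, and a temporary directory keeps the example self-contained.

```python
import json
import os
import tempfile

data = {"host": "10.0.0.5", "open_ports": [22, 80]}
path = os.path.join(tempfile.mkdtemp(), "results.json")

# Serialize the dictionary to a file.
with open(path, "w") as f:
    json.dump(data, f, indent=2)

# Parse the file back into Python objects.
with open(path, "r") as f:
    loaded = json.load(f)

print(loaded["open_ports"])
```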

In the above example, json.dump(data, f) writes JSON data to a file, and json.load(f) reads JSON data from a file.

Error Handling

When working with files, errors can occur for many reasons, such as the file not existing, the user not having enough permissions, etc. It's important to handle these errors in your code to prevent your program from crashing.

Python provides the try/except statement to catch and handle exceptions. Here is an example:
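A minimal sketch, assuming a hypothetical wordlist path that does not exist on the system.

```python
wordlist_path = "/nonexistent/wordlist.txt"   # hypothetical path that does not exist

try:
    with open(wordlist_path, "r") as f:
        words = f.read().splitlines()
except FileNotFoundError as error:
    # Fall back to an empty list instead of crashing.
    words = []
    print(f"Wordlist not found: {error.filename}")
except PermissionError as error:
    words = []
    print(f"No permission to read: {error.filename}")

print(f"Loaded {len(words)} words")
```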

In this example, if opening the file fails because the file does not exist, Python raises a FileNotFoundError exception, which is then caught and handled by the except block.

Cryptography and Hashing

Cryptography is a fundamental part of cybersecurity and is essential for maintaining the confidentiality, integrity, and authenticity of data. In penetration testing, you will meet it through encoding and decoding, hashing, and password cracking.

Understanding Hashing

Hashing is a technique used to convert data of any size into a fixed-size value. The result of a hash function is called a hash value or simply a hash. Two different inputs can in principle produce the same hash (a collision), but a good cryptographic hash function makes collisions infeasible to find and ensures that changing even a single bit of input results in a significant change in the output.

Generating Hashes with Python

Python's hashlib module provides a variety of hashing algorithms including md5, sha1, sha256, and more. Here is an example of generating a hash with sha256:
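A minimal sketch with two arbitrary sample inputs that differ by one character, to show both the fixed output size and how much the digest changes.

```python
import hashlib

# hashlib works on bytes, so string input must be encoded first.
digest = hashlib.sha256("password123".encode("utf-8")).hexdigest()
other_digest = hashlib.sha256("password124".encode("utf-8")).hexdigest()

print(digest)
print(other_digest)
print(len(digest))   # 64 hex characters = 256 bits
```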

Working with Password Hashes

In many cases, especially during penetration testing, we come across hashed passwords. Python can be used to generate and compare password hashes. The bcrypt library is a powerful, flexible library for hashing passwords. Here is an example:

Cryptography

Cryptography involves encrypting and decrypting data. Encryption transforms data into an unreadable format using an encryption algorithm and an encryption key. Decryption transforms the data back into its original format using the same encryption algorithm and a decryption key.

Symmetric Encryption and Decryption

In symmetric encryption, the same key is used for both encryption and decryption. Python provides several libraries for symmetric encryption, including cryptography. Here's how to use it for AES (Advanced Encryption Standard) encryption and decryption:
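A minimal sketch, assuming the third-party cryptography package is installed. It uses the library's Fernet recipe, which is built on AES in CBC mode plus an HMAC for integrity, rather than calling AES primitives directly. The message is an arbitrary sample.

```python
from cryptography.fernet import Fernet, InvalidToken

# The same key encrypts and decrypts, so it must be kept secret.
key = Fernet.generate_key()
cipher = Fernet(key)

token = cipher.encrypt(b"scan results: port 22 open")
plaintext = cipher.decrypt(token)

print(token)
print(plaintext.decode())

# Decrypting with a different key fails the integrity check.
try:
    Fernet(Fernet.generate_key()).decrypt(token)
except InvalidToken:
    print("Decryption with the wrong key was rejected")
```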

Asymmetric Encryption and Decryption

In asymmetric encryption, also known as public key cryptography, two different keys are used for encryption and decryption. The cryptography library also supports asymmetric encryption:
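A minimal sketch, assuming the cryptography package and RSA with OAEP padding as the example scheme. The key size and message are illustrative choices.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Generate a key pair: the public key encrypts, the private key decrypts.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

oaep = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

message = b"meet at 10.0.0.5"
ciphertext = public_key.encrypt(message, oaep)
decrypted = private_key.decrypt(ciphertext, oaep)

print(len(ciphertext))   # 256 bytes for a 2048-bit key
print(decrypted.decode())
```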

Python Libraries for Penetration Testing

Scapy

Scapy is a powerful Python library for packet manipulation. It allows you to forge or decode packets of a wide range of protocols, send them over the wire, capture them, and match requests and replies.

Here's an example of how to create an ICMP Echo request (a "ping") with Scapy:

Impacket

Impacket is a library for working with network protocols, which is highly effective when it comes to creating packet-level tools or working with network services. Impacket supports protocols like IP, TCP, UDP, ICMP, IGMP, ARP, and protocols used by higher-level services like SMB, MSRPC, and others.

Here's an example of using Impacket to connect to an SMB service:

Requests

Requests is a library for making HTTP requests. It abstracts the complexities of making requests behind a beautiful, simple API, so that you can focus on interacting with services and consuming data in your application.

Here's an example of how to use requests to make a GET request:

BeautifulSoup

BeautifulSoup is a Python library for parsing HTML and XML documents. It's often used for web scraping, which is a method of extracting data from websites.

Here's an example of how to use BeautifulSoup to extract all links from a webpage:
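A minimal sketch, assuming the beautifulsoup4 package. A static HTML snippet stands in for a downloaded page, and the extract_links helper is hypothetical.

```python
from bs4 import BeautifulSoup


def extract_links(html):
    """Return the href value of every <a> tag that has one."""
    soup = BeautifulSoup(html, "html.parser")
    return [anchor["href"] for anchor in soup.find_all("a", href=True)]


sample = (
    '<a href="/login">Login</a>'
    '<a name="top">No link here</a>'
    '<a href="https://example.com/docs">Docs</a>'
)
print(extract_links(sample))
```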

Building a Testing Tool with Python

Basic Network Scanner

By combining the Python concepts and libraries we've discussed so far, you can create your own powerful penetration testing tools. In this section, we will develop a simple yet effective network scanner as an example.

Tool Overview

Our network scanner will perform two main tasks:

  1. Discover all the devices connected to the same network.

  2. Scan the open ports of a given device.

For this, we will mainly use the Scapy library for packet generation and manipulation.

Writing the Network Scanner

Post-Exploitation with Python

Post-exploitation refers to the phase where an attacker (or penetration tester) has already gained access to a system and might need to maintain that access, escalate privileges, gather more information, or cover their tracks. For this section, we will discuss how Python can be used for such tasks.

Maintaining Access

Once a penetration tester gains access to a system, maintaining that access is crucial. Python provides several methods to accomplish this, such as creating backdoors.

A backdoor is a script that allows an attacker to bypass normal authentication methods. Please note that creating or using a backdoor is illegal and unethical without proper authorization. The following example is for educational purposes only:

Privilege Escalation

Privilege escalation involves gaining elevated access to resources that are typically protected from an application or user. There are two types of privilege escalation: horizontal and vertical. Horizontal escalation involves taking over the access rights of another user at the same privilege level, while vertical escalation involves gaining higher privileges than the current user account has, such as root or administrator.

Here is a simple script that checks if the current user has root privileges:
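A minimal sketch, assuming a Unix-like system where root has the effective user ID 0. The is_root helper is hypothetical; on platforms without os.geteuid (such as Windows) it simply returns False.

```python
import os


def is_root():
    """Return True when the effective user ID is 0 (root) on Unix-like systems."""
    # os.geteuid is not available on Windows, so check for it first.
    return hasattr(os, "geteuid") and os.geteuid() == 0


if is_root():
    print("Running with root privileges")
else:
    print("Running as an unprivileged user")
```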

Information Gathering

Python is excellent for gathering more information from a compromised system. For instance, it can be used to list all directories and files, read specific files, fetch system and network information, and much more. Here's a simple script to fetch system information:
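A minimal sketch using only the standard library. The system_info helper and the choice of fields are assumptions; a real engagement would gather whatever the scope calls for.

```python
import platform
import socket


def system_info():
    """Collect basic facts about the host the script is running on."""
    return {
        "hostname": socket.gethostname(),
        "os": platform.system(),
        "release": platform.release(),
        "architecture": platform.machine(),
        "python": platform.python_version(),
    }


for field, value in system_info().items():
    print(f"{field}: {value}")
```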

Covering Tracks

Covering tracks is an important part of post-exploitation. It involves deleting or altering logs that can indicate a system intrusion.

Here's a simple script that deletes a log file:

Buffer Overflow Vulnerabilities with Python

A buffer overflow happens when a program or process tries to store more data in a buffer (temporary data storage area) than it was intended to hold. Python can be used to create scripts that exploit these vulnerabilities in controlled and ethical hacking scenarios. In this section, we will provide a basic understanding of how to use Python to exploit buffer overflow vulnerabilities.

Understanding Buffer Overflow

Buffers are areas of memory set aside to hold data, often while moving it from one section of a program to another, or between programs. Buffer overflows can often be triggered by malformed inputs; if one assumes all inputs will be smaller than a certain size and the buffer is created to accommodate that, an anomalous transaction that produces more data could cause it to overflow. This can cause the data to leak into other buffers, which can corrupt or overwrite the data they were holding.

Building a Buffer Overflow Exploit

Here's an example of how you might create a Python script to exploit a buffer overflow vulnerability. The script will generate a long string of 'A's and send it to the target process. If the process does not correctly handle input of this length, it may overflow its buffer, causing a crash or other unexpected behavior.
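A minimal sketch of that idea, assuming the target is a TCP service in a lab you control. The helper names, address, port, and payload length are hypothetical; the send is left as a commented example.

```python
import socket


def build_payload(length, filler=b"A"):
    """Return a byte string made of the filler byte repeated length times."""
    return filler * length


def send_payload(host, port, payload, timeout=5):
    """Open a TCP connection to the target and send the payload."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)


payload = build_payload(2000)
print(len(payload))

# Example usage (hypothetical lab target you are authorized to test):
# send_payload("192.0.2.30", 9999, payload)
```

Increasing the length step by step and noting where the service crashes gives a rough idea of the buffer size.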

Please note that all scripts in this guide are highly simplified. Real-world engagements involve more complex techniques and a deeper understanding of the target system and application.
