This is a follow-up to my previous article, How an HTTP/HTTPS Proxy Work, where I explained how a basic HTTP proxy works. In this article, I will explain how to implement a simple HTTPS proxy with interception in Python using asyncio and the ssl module. No third-party libraries are required.
For simplicity, we will not handle any errors or edge cases. This is just a basic example to illustrate the concept.
An echo HTTPS server
First, we are going to implement a proxy server that just prints the request it receives.
import asyncio
from contextlib import closing


def proxy_handler(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
    async def handle_connection():
        with closing(writer):
            # Read the initial request line
            request_line = await reader.readline()
            print(f"Request Line: {request_line.decode().strip()}")
            # The headers are terminated by an empty line
            headers = await reader.readuntil(b"\r\n\r\n")
            print(f"Headers: {headers.decode().strip()}")

    asyncio.create_task(handle_connection())


async def main():
    server = await asyncio.start_server(proxy_handler, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()


asyncio.run(main())
And a simple curl command to test it
curl --proxy http://127.0.0.1:8888 https://example.com
You should see the following output
Request Line: CONNECT example.com:443 HTTP/1.1
Headers: Host: example.com:443
User-Agent: curl/8.12.1
Proxy-Connection: Keep-Alive
And the curl command should return
curl: (56) Proxy CONNECT aborted
We need to handle the CONNECT method to establish a tunnel between the client and the server.
Parsing the request line
We are going to assume it’s a CONNECT request for simplicity and ignore the headers.
def proxy_handler(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
    async def handle_connection():
        with closing(writer):
            # Read the initial request line
            request_line = await reader.readline()
            print(f"Request Line: {request_line.decode().strip()}")
            headers = await reader.readuntil(b"\r\n\r\n")
            print(f"Headers: {headers.decode().strip()}")
            # We are extracting host and port from the request line CONNECT example.com:443 HTTP/1.1
            host, port = request_line.decode().split(" ")[1].split(":")
            port = int(port)
            # Connect to the remote server
            await connect_to_remote(reader, writer, host, port)

    asyncio.create_task(handle_connection())
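The inline parsing above can also be pulled out into a small helper, which makes it easier to test on its own. The following is only a sketch: the name parse_connect_target is hypothetical (it is not part of the article's code), and it assumes a well-formed request line with exactly three space-separated parts.

```python
def parse_connect_target(request_line: bytes) -> tuple[str, int]:
    """Return (host, port) from a request line such as
    b"CONNECT example.com:443 HTTP/1.1\r\n".

    Hypothetical helper, not part of the original proxy code.
    """
    method, target, _version = request_line.decode("ascii").strip().split(" ")
    if method.upper() != "CONNECT":
        raise ValueError(f"expected CONNECT, got {method!r}")
    # Split on the last colon so the port is separated correctly even if the
    # host itself contains colons (for example a bracketed IPv6 literal).
    host, _, port = target.rpartition(":")
    return host.strip("[]"), int(port)


if __name__ == "__main__":
    print(parse_connect_target(b"CONNECT example.com:443 HTTP/1.1\r\n"))
```

Raising ValueError for anything other than CONNECT is a design choice of this sketch; the article's code simply assumes the method is CONNECT.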
Connecting to the remote server
Now we need to implement the connect_to_remote function.
async def connect_to_remote(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_host: str,
    remote_port: int,
):
    remote_reader, remote_writer = await asyncio.open_connection(
        remote_host, remote_port
    )
    with closing(remote_writer):
        await acknowledge_connection(client_writer)
And we need to implement the acknowledge_connection function to send a 200 Connection Established response to the client.
async def acknowledge_connection(client_writer: asyncio.StreamWriter):
    response = "HTTP/1.1 200 Connection Established\r\n\r\n"
    client_writer.write(response.encode())
    await client_writer.drain()
Our curl command should now return
curl: (35) TLS connect error: error:00000000:lib(0)::reason(0)
This means we have established the tunnel, but we are not yet relaying the TLS handshake between the client and the server.
Forwarding data between client and server
We now introduce the forward_data and create_tunnel functions to forward data between the client and the server.
async def forward_data(src: asyncio.StreamReader, dest: asyncio.StreamWriter):
    while True:
        data = await src.read(4096)
        if not data:
            break
        dest.write(data)
        await dest.drain()


async def create_tunnel(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_reader: asyncio.StreamReader,
    remote_writer: asyncio.StreamWriter,
):
    await asyncio.gather(
        forward_data(client_reader, remote_writer),
        forward_data(remote_reader, client_writer),
    )
We now call create_tunnel in the connect_to_remote function.
async def connect_to_remote(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_host: str,
    remote_port: int,
):
    remote_reader, remote_writer = await asyncio.open_connection(
        remote_host, remote_port
    )
    with closing(remote_writer):
        await acknowledge_connection(client_writer)
        await create_tunnel(client_reader, client_writer, remote_reader, remote_writer)
Now our curl command should work and return the content of the example.com homepage.
Keep in mind that we are keeping the code minimal and we don’t properly handle errors and cleanup.
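As an illustration of what slightly more careful cleanup could look like, here is a sketch of a tunnel that stops as soon as one direction reaches end of stream and cancels the other direction. The name create_tunnel_with_cleanup is hypothetical, and stopping on the first EOF is an assumption of this sketch, not something the article's code does.

```python
import asyncio


async def forward_data(src: asyncio.StreamReader, dest: asyncio.StreamWriter):
    # Same loop as in the article: copy chunks until the source reaches EOF.
    while True:
        data = await src.read(4096)
        if not data:
            break
        dest.write(data)
        await dest.drain()


async def create_tunnel_with_cleanup(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_reader: asyncio.StreamReader,
    remote_writer: asyncio.StreamWriter,
):
    """Hypothetical variant of create_tunnel that ends when either side ends."""
    tasks = [
        asyncio.create_task(forward_data(client_reader, remote_writer)),
        asyncio.create_task(forward_data(remote_reader, client_writer)),
    ]
    # Wait until one direction finishes (EOF or error) ...
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # ... then cancel the other direction and wait for the cancellation to land.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # Re-raise any exception from the direction that finished first.
    for task in done:
        task.result()
```

With asyncio.gather alone, the tunnel only returns once both directions have finished, so a peer that never closes its side keeps the connection open.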
Intercepting the SSL/TLS traffic
At this point, we have a working HTTPS proxy that forwards data between the client and the server. However, we are not intercepting the SSL/TLS traffic yet: if we try to print the data in the forward_data function, we will see that it’s encrypted.
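To make the "it’s encrypted" observation concrete, the first bytes forwarded after the CONNECT exchange can be checked against the TLS record header layout: one content type byte (0x16 for handshake messages) followed by a two-byte protocol version whose major byte is 0x03. The helper below is a heuristic sketch only, and the name looks_like_tls_handshake is hypothetical.

```python
TLS_HANDSHAKE = 0x16  # TLS record content type for handshake messages


def looks_like_tls_handshake(data: bytes) -> bool:
    """Heuristic check that a chunk starts with a TLS handshake record.

    Hypothetical helper: it only inspects the content type byte and the
    major version byte, it does not parse the record.
    """
    return len(data) >= 3 and data[0] == TLS_HANDSHAKE and data[1] == 0x03


if __name__ == "__main__":
    # A made-up record header, followed by what plaintext HTTP would look like.
    print(looks_like_tls_handshake(bytes([0x16, 0x03, 0x01, 0x02, 0x00])))
    print(looks_like_tls_handshake(b"GET / HTTP/1.1\r\n"))
```

Calling it on the first chunk seen in forward_data is one way to confirm that the client is starting a TLS handshake through the tunnel rather than sending plaintext HTTP.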
Creating a self-signed certificate
To intercept the SSL/TLS traffic, we need to create a self-signed certificate. We can use the openssl command line tool to generate a self-signed certificate and a private key.
openssl req -x509 -newkey rsa:2048 -keyout key.pem -out cert.pem -days 365 -nodes -subj "/CN=localhost"
This will create a cert.pem file containing the self-signed certificate and a key.pem file containing the private key.
Connecting to the remote server with SSL
Now that we no longer forward data directly between the client and the server, we need to establish an SSL connection to the remote server. We can use the ssl module to create an SSL context and wrap the connection to the remote server.
We update the connect_to_remote function to use SSL when connecting to the remote server.
import ssl


async def connect_to_remote(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_host: str,
    remote_port: int,
):
    context = ssl.create_default_context()
    remote_reader, remote_writer = await asyncio.open_connection(
        remote_host, remote_port,
        ssl=context
    )
    with closing(remote_writer):
        await acknowledge_connection(client_writer)
        await create_tunnel(client_reader, client_writer, remote_reader, remote_writer)
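It may help to see what ssl.create_default_context() configures, since this is what makes the proxy verify the remote server's certificate and hostname on its side of the connection. The sketch below only inspects the context; describe_context is a hypothetical helper name.

```python
import ssl


def describe_context(context: ssl.SSLContext) -> dict:
    """Summarize the verification settings of an SSL context (hypothetical helper)."""
    return {
        "verify_mode": context.verify_mode.name,
        "check_hostname": context.check_hostname,
    }


if __name__ == "__main__":
    # The default client-side context requires a valid certificate chain
    # and checks that the certificate matches the requested hostname.
    print(describe_context(ssl.create_default_context()))
```

Because of these defaults, asyncio.open_connection fails during the handshake if the remote certificate cannot be validated.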
Connecting to the client with SSL
Once we have established the connection to the remote server, we need to establish an SSL connection with the client.
The CONNECT request at the beginning of the proxy connection is not encrypted, so we can read the request line and headers from the client to know which host the client is trying to connect to.
After that, we need to upgrade the connection to SSL and use our self-signed certificate to establish the SSL connection with the client.
Note that this requires Python 3.11 or higher, where StreamWriter.start_tls was added.
We create a new function, upgrade_to_tls, to upgrade the connection to SSL.
First we create an SSL context with our self-signed certificate.
Then we use the start_tls method of the StreamWriter to upgrade the connection to SSL.
async def upgrade_to_tls(client_writer: asyncio.StreamWriter, remote_host: str):
    """
    Upgrade the existing connection from client to use TLS with a self-signed certificate
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile="cert.pem", keyfile="key.pem")
    await client_writer.start_tls(context, server_hostname=remote_host)
We then call this function in the connect_to_remote function, right after acknowledging the connection to the client.
async def connect_to_remote(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_host: str,
    remote_port: int,
):
    context = ssl.create_default_context()
    remote_reader, remote_writer = await asyncio.open_connection(
        remote_host, remote_port,
        ssl=context
    )
    with closing(remote_writer):
        await acknowledge_connection(client_writer)
        # Upgrade to TLS the connection
        await upgrade_to_tls(client_writer, remote_host)
        await create_tunnel(client_reader, client_writer, remote_reader, remote_writer)
If you try the curl command again, you should see an error in your server:
ssl.SSLError: [SSL: TLSV1_ALERT_UNKNOWN_CA] tlsv1 alert unknown ca (_ssl.c:1029)
And your curl command should return
curl: (60) SSL certificate problem: self-signed certificate
More details here: https://curl.se/docs/sslcerts.html
curl failed to verify the legitimacy of the server and therefore could not
establish a secure connection to it. To learn more about this situation and
how to fix it, please visit the webpage mentioned above.
This is because curl does not trust our self-signed certificate. You can add the --insecure option to curl to ignore the certificate validation.
curl --insecure --proxy http://127.0.0.1:8888 https://example.com
Seeing the decrypted data
We can now print the data in the forward_data function and see the decrypted data.
async def forward_data(src: asyncio.StreamReader, dest: asyncio.StreamWriter):
    while True:
        data = await src.read(4096)
        if not data:
            break
        print(data)
        dest.write(data)
        await dest.drain()
If you try the curl command again, you should see the decrypted data in the proxy output.
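Raw bytes are hard to read, so one option is to summarize the start line and headers of each decrypted message instead of printing the whole chunk. The sketch below assumes the chunk begins with a complete HTTP/1.1 message head; summarize_http_head is a hypothetical name and it does not handle heads split across several reads.

```python
def summarize_http_head(data: bytes) -> tuple[str, dict[str, str]]:
    """Split a decrypted chunk into its start line and a header dict.

    Hypothetical helper: assumes `data` starts with a complete HTTP/1.1
    message head terminated by an empty line.
    """
    head, _, _body = data.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line = lines[0]
    headers: dict[str, str] = {}
    for line in lines[1:]:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalize them to lowercase.
        headers[name.strip().lower()] = value.strip()
    return start_line, headers


if __name__ == "__main__":
    sample = b"GET / HTTP/1.1\r\nHost: example.com\r\nAccept: */*\r\n\r\n"
    print(summarize_http_head(sample))
```

In forward_data, this could replace print(data) for the client-to-server direction, where each request from curl normally fits in a single chunk.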
Final code
Here is the final code of the HTTPS proxy with interception.
import asyncio
from contextlib import closing
import ssl


async def forward_data(src: asyncio.StreamReader, dest: asyncio.StreamWriter):
    while True:
        data = await src.read(4096)
        if not data:
            break
        print(data)
        dest.write(data)
        await dest.drain()


async def create_tunnel(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_reader: asyncio.StreamReader,
    remote_writer: asyncio.StreamWriter,
):
    await asyncio.gather(
        forward_data(client_reader, remote_writer),
        forward_data(remote_reader, client_writer),
    )


async def acknowledge_connection(writer: asyncio.StreamWriter):
    """
    Once connected we need to tell the client that the connection is established
    """
    response = "HTTP/1.1 200 Connection Established\r\n\r\n"
    writer.write(response.encode())
    await writer.drain()


async def connect_to_remote(
    client_reader: asyncio.StreamReader,
    client_writer: asyncio.StreamWriter,
    remote_host: str,
    remote_port: int,
):
    context = ssl.create_default_context()
    remote_reader, remote_writer = await asyncio.open_connection(
        remote_host, remote_port,
        ssl=context
    )
    with closing(remote_writer):
        await acknowledge_connection(client_writer)
        # Upgrade to TLS the connection
        await upgrade_to_tls(client_writer, remote_host)
        await create_tunnel(client_reader, client_writer, remote_reader, remote_writer)


async def upgrade_to_tls(client_writer: asyncio.StreamWriter, remote_host: str):
    """
    Upgrade the existing connection from client to use TLS with a self-signed certificate
    """
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    context.load_cert_chain(certfile="cert.pem", keyfile="key.pem")
    await client_writer.start_tls(context, server_hostname=remote_host)


def proxy_handler(reader: asyncio.StreamReader, writer: asyncio.StreamWriter):
    async def handle_connection():
        with closing(writer):
            # Read the initial request line
            request_line = await reader.readline()
            print(f"Request Line: {request_line.decode().strip()}")
            headers = await reader.readuntil(b"\r\n\r\n")
            print(f"Headers: {headers.decode().strip()}")
            # We are extracting host and port from the request line CONNECT example.com:443 HTTP/1.1
            host, port = request_line.decode().split(" ")[1].split(":")
            port = int(port)
            # Connect to the remote server
            await connect_to_remote(reader, writer, host, port)

    asyncio.create_task(handle_connection())


async def main():
    server = await asyncio.start_server(proxy_handler, "127.0.0.1", 8888)
    async with server:
        await server.serve_forever()


asyncio.run(main())
Conclusion
In this article, we have seen how to implement a simple HTTPS proxy with interception in Python using asyncio and the ssl module. We have covered how to parse the HTTP CONNECT method, establish a tunnel between the client and the server, and intercept the SSL/TLS traffic using a self-signed certificate.
To make this code production-ready, you would need to handle errors, edge cases, and cleanup properly. If you want to intercept traffic from a browser, you will probably also want the proxy’s certificates to be issued by a CA certificate that the browser trusts.