Getting started: Writing your own HTTP/1.1 client
h11 can be used to implement both HTTP/1.1 clients and servers. To give a flavor for how the API works, we’ll demonstrate a small client.
An HTTP interaction always starts with a client sending a request and,
optionally, some data (e.g., a POST body); the server then
responds with a response and, optionally, some data (e.g., the
requested document). Requests and responses have some data associated
with them: for requests, this is a method (e.g., GET), a target
(e.g., /index.html), and a collection of headers (e.g.,
User-agent: demo-client). For responses, it’s a status code
(e.g., 404 Not Found) and a collection of headers.
Of course, as far as the network is concerned, there’s no such thing as “requests” and “responses” – there’s just bytes being sent from one computer to another. Let’s see what this looks like, by fetching https://httpbin.org/xml:
In : import ssl, socket

In : ctx = ssl.create_default_context()

In : sock = ctx.wrap_socket(socket.create_connection(("httpbin.org", 443)),
...:                        server_hostname="httpbin.org")
...:

# Send request
In : sock.sendall(b"GET /xml HTTP/1.1\r\nhost: httpbin.org\r\n\r\n")

# Read response
In : response_data = sock.recv(1024)

# Let's see what we got!
In : print(response_data)
b'HTTP/1.1 200 OK\r\nAccess-Control-Allow-Credentials: true\r\nAccess-Control-Allow-Origin: *\r\nContent-Type: application/xml\r\nDate: Sat, 09 Nov 2019 22:09:04 GMT\r\nReferrer-Policy: no-referrer-when-downgrade\r\nServer: nginx\r\nX-Content-Type-Options: nosniff\r\nX-Frame-Options: DENY\r\nX-XSS-Protection: 1; mode=block\r\nContent-Length: 522\r\nConnection: keep-alive\r\n\r\n<?xml version=\'1.0\' encoding=\'us-ascii\'?>\n\n<!-- A SAMPLE set of slides -->\n\n<slideshow \n title="Sample Slide Show"\n date="Date of publication"\n author="Yours Truly"\n >\n\n <!-- TITLE SLIDE -->\n <slide type="all">\n <title>Wake up to WonderWidgets!</title>\n </slide>\n\n <!-- OVERVIEW -->\n <slide type="all">\n <title>Overview</title>\n <item>Why <em>WonderWidgets</em> are great</item>\n <item/>\n <item>Who <em>buys</em> WonderWidgets</item>\n </slide>\n\n</slideshow>'
If you try to reproduce these examples interactively, then you’ll
have the most luck if you paste them in all at once. Remember, we’re
talking to a remote server here: if you type them in one at a
time and you’re too slow, the server might give up on waiting
for you and close the connection. One way to recognize that this
has happened is if response_data comes back as an empty bytes
object (b""); later on, when we’re working with h11, this might
cause errors.
So that’s, uh, very convenient and readable. It’s a little more understandable if we print the bytes as text:
In : print(response_data.decode("ascii"))
HTTP/1.1 200 OK
Access-Control-Allow-Credentials: true
Access-Control-Allow-Origin: *
Content-Type: application/xml
Date: Sat, 09 Nov 2019 22:09:04 GMT
Referrer-Policy: no-referrer-when-downgrade
Server: nginx
X-Content-Type-Options: nosniff
X-Frame-Options: DENY
X-XSS-Protection: 1; mode=block
Content-Length: 522
Connection: keep-alive

<?xml version='1.0' encoding='us-ascii'?>

<!-- A SAMPLE set of slides -->

<slideshow
 title="Sample Slide Show"
 date="Date of publication"
 author="Yours Truly"
 >

 <!-- TITLE SLIDE -->
 <slide type="all">
 <title>Wake up to WonderWidgets!</title>
 </slide>

 <!-- OVERVIEW -->
 <slide type="all">
 <title>Overview</title>
 <item>Why <em>WonderWidgets</em> are great</item>
 <item/>
 <item>Who <em>buys</em> WonderWidgets</item>
 </slide>

</slideshow>
Here we can see the status code at the top (200, which is the code for “OK”), followed by the headers, followed by the data (a silly little XML document). But we can already see that working with bytes by hand like this is really cumbersome. What we need to do is to move up to a higher level of abstraction.
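To get a feel for what “working with bytes by hand” involves, here is a deliberately naive sketch that splits a complete response into its parts using only the standard library. The function name parse_response_by_hand and the sample bytes are made up for illustration. It assumes the whole response is already in memory and ignores chunked encoding, partial reads, and malformed input, which is exactly the bookkeeping a real client cannot ignore.

```python
def parse_response_by_hand(raw: bytes):
    """Naively split a *complete* HTTP/1.1 response into its parts.

    Illustration only: assumes the full response has arrived, that the
    body is not chunked, and that every header line is well formed.
    """
    # The head and the body are separated by one blank line.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.split(b"\r\n")
    # Status line looks like: HTTP/1.1 200 OK
    version, status_code, reason = status_line.split(b" ", 2)
    headers = []
    for line in header_lines:
        name, _, value = line.partition(b":")
        headers.append((name.strip().lower(), value.strip()))
    return int(status_code), reason, headers, body


# Hypothetical sample data, shaped like the response shown above.
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: application/xml\r\n"
    b"Content-Length: 6\r\n"
    b"\r\n"
    b"<doc/>"
)
status_code, reason, headers, body = parse_response_by_hand(sample)
print(status_code, reason, headers, body)
```

Even this toy version has to make decisions about whitespace, header case, and where the body starts, and it still cannot tell whether the body is complete.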
This is what h11 does. Instead of talking in bytes, it lets you talk
in high-level HTTP “events”. To see what this means, let’s repeat the
above exercise, but using h11. We start by making a TLS connection
like before, but now we’ll also import h11 and create an
h11.Connection object, telling it that we’re acting as a client:
In : import ssl, socket

In : import h11

In : ctx = ssl.create_default_context()

In : sock = ctx.wrap_socket(socket.create_connection(("httpbin.org", 443)),
....:                        server_hostname="httpbin.org")
....:

In : conn = h11.Connection(our_role=h11.CLIENT)
Next, to send an event to the server, there are three steps we have to
take. First, we create an object representing the event we want to
send; in this case, a Request:
In : request = h11.Request(method="GET",
....:                       target="/xml",
....:                       headers=[("Host", "httpbin.org")])
....:
Next, we pass this to our connection’s send()
method, which gives us back the bytes corresponding to this message:
In : bytes_to_send = conn.send(request)
And then we send these bytes across the network:
In : sock.sendall(bytes_to_send)
There’s nothing magical here – these are the same bytes that we sent up above:
In : bytes_to_send
Out: b'GET /xml HTTP/1.1\r\nhost: httpbin.org\r\n\r\n'
Why doesn’t h11 go ahead and send the bytes for you? Because it’s designed to be usable no matter what socket API you’re using – doesn’t matter if it’s synchronous like this, asynchronous, callback-based, whatever; if you can read and write bytes from the network, then you can use h11.
In this case, we’re not quite done yet: we have to send another
event to tell the other side that we’re finished, which we do by
sending an EndOfMessage event:
In : end_of_message_bytes_to_send = conn.send(h11.EndOfMessage())

In : sock.sendall(end_of_message_bytes_to_send)
Of course, it turns out that in this case, the HTTP/1.1 specification
tells us that any request that doesn’t contain either a
Content-Length or a Transfer-Encoding header automatically has a
0 length body, and h11 knows that, and h11 knows that the server knows
that, so it actually encoded the EndOfMessage event as the
empty byte string:
In : end_of_message_bytes_to_send
Out: b''
But there are other cases where it might not, depending on what
headers are set, what message is being responded to, the HTTP version
of the remote peer, and so on. So for consistency, h11 requires that
you always finish your messages by sending an explicit
EndOfMessage event; then it keeps track of the details of
what that actually means in any given situation, so that you don’t
have to.
Finally, we have to read the server’s reply. By now you can probably
guess how this is done, at least in the general outline: we read some
bytes from the network, then we hand them to the connection (using
Connection.receive_data()) and it converts them into events,
which we retrieve by calling Connection.next_event():
In : bytes_received = sock.recv(1024)

In : conn.receive_data(bytes_received)

In : conn.next_event()
Out: Response(status_code=200, headers=[(b'access-control-allow-credentials', b'true'), (b'access-control-allow-origin', b'*'), (b'content-type', b'application/xml'), (b'date', b'Sat, 09 Nov 2019 22:09:04 GMT'), (b'referrer-policy', b'no-referrer-when-downgrade'), (b'server', b'nginx'), (b'x-content-type-options', b'nosniff'), (b'x-frame-options', b'DENY'), (b'x-xss-protection', b'1; mode=block'), (b'content-length', b'522'), (b'connection', b'keep-alive')], http_version=b'1.1', reason=b'OK')

In : conn.next_event()