Building a request inspector

First pass: duct tape and string

I've been quite involved with distributed tracing and have spent a fair amount of time looking at the W3C recommendation for dealing with trace context at work lately. As a result of this, I have found myself wanting to inspect the headers on outbound requests to ensure that the framework and libraries we use handle tracing correctly.

But how do you do that? I couldn't find any services or command line utilities that did this (at least not simply), so I set out to build one myself. It's been a while since I did anything in Rust, and this sounded like a fun, little weekend project. Spoiler alert: It wasn't.

I wanted this to be a short and simple tutorial on how to build such a server in Rust, but things didn't go quite as I planned, and it took much more time and effort than I expected. Rather than hiding it and pretending it never happened, though, I'm going to take it and run with it. I'm sure that if I'm running into these issues, I'm not the only one.

Intended audience

This post is intended for people who have at least some experience with Rust, including familiarity with the type system and the borrow checker (at least as concepts). You should also have some passing knowledge of HTTP requests.

This post is not intended to be a thorough tutorial or a list of best practices, but rather to serve as a way of demonstrating how I work. Yes, the code here works and runs as expected, and clippy doesn't complain, but it's not good.

There are code samples below, and you can also check out the repo on GitLab. However, this should not be considered a final version, and there will likely be further updates to the repo later on. This is also not an in-depth analysis of the code, but a short tour of it. In short: it works, but it's far from perfect. In a lot of ways, writing this code felt a lot like how the Oatmeal describes projects coming together in his fantastic comic 'Erasers are Wonderful': full of twists, turns, and toilet fires, and in the end you have something that's good enough.

The goal

I set out to make a simple web server that:

  • would accept requests at any endpoint
  • would accept requests with any method
  • would respond with a JSON object containing data about the request's:
    • headers
    • method
    • path
    • query string

I also wanted to add the request body (if there was one) to the response, but it wasn't the most important issue. Other additional features, such as reading data from environment variables, command line options, logging, etc., could be added later.

How (or: 'show me the code')

When building a web server in Rust, there are a number of frameworks to choose from. I first went with Actix, but after not reading the docs, ended up working directly with Hyper because it was easier to create a function that would handle any request at any route with any method. Or at least it was covered in the initial tutorial.

In addition to Hyper, I'm also pulling in anyhow and serde_json for dealing with errors and working with JSON.

Below, we'll break the program up into functions and look at them one at a time.

Dependencies and imports

Let's get the boring (but very important) bits out of the way. Here's the dependencies section of the Cargo.toml file, as well as the program imports.

Dependencies:

[dependencies]
hyper = "0.13"
tokio = { version = "0.2", features = ["full"] }
serde_json = "1.0"
anyhow = "1.0"

Imports:

use anyhow::Result;
use hyper::service::{make_service_fn, service_fn};
use hyper::{Body, HeaderMap, Request, Response, Server};
use serde_json::json;
use std::collections::HashMap;
use std::convert::Infallible;

The main function

#[tokio::main]
async fn main() -> Result<(), hyper::error::Error> {
    let make_svc = make_service_fn(|_| async { Ok::<_, Infallible>(service_fn(handle_requests)) });

    let addr = ([127, 0, 0, 1], 8080).into();

    let server = Server::bind(&addr).serve(make_svc);

    println!("Server started. Listening on http://{}", addr);

    server.await
}

There are a few things happening here, but it's rather self-explanatory. We declare a handler and an address for the server, start the server with that handler and address, and wait for it to finish (which happens on termination).
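
The address is hard-coded here, and reading it from the environment was one of the features left for later. As a rough sketch of what that could look like using only the standard library: the `listen_addr` helper and the `DEFAULT_PORT` constant are names made up for this example, and so is the `PORT` variable mentioned below.

```rust
use std::net::{IpAddr, Ipv4Addr, SocketAddr};

/// Port used when nothing usable is configured; matches the hard-coded value in main.
const DEFAULT_PORT: u16 = 8080;

/// Hypothetical helper: builds the listen address from an optional port string.
/// Falls back to DEFAULT_PORT if the value is missing or is not a valid u16.
fn listen_addr(port: Option<&str>) -> SocketAddr {
    let port = port
        .and_then(|p| p.trim().parse::<u16>().ok())
        .unwrap_or(DEFAULT_PORT);
    SocketAddr::new(IpAddr::V4(Ipv4Addr::LOCALHOST), port)
}
```

In main, the address line could then become something like `let addr = listen_addr(std::env::var("PORT").ok().as_deref());`, assuming a `PORT` environment variable is the convention you want. Silently falling back on a bad value is a design choice; failing loudly would be just as defensible.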

The request handler

async fn handle_requests(req: Request<Body>) -> Result<Response<Body>> {
    let response_data = json!({
        "headers": to_string_map(req.headers()),
        "path": req.uri().path(),
        "queryString": req.uri().query(),
        "method": req.method().as_str(),
        "version": format!("{:?}", req.version()),
    }).to_string();

    println!("Received request: {:?}", response_data);

    Ok(Response::builder()
       .header("content-type", "application/json")
       .body(Body::from(response_data))?)
}

This is the meat of the program and really what it's all about: extracting data from the request and returning it to the caller. As this is a very rough proof of concept, I'm mapping the data into a completely arbitrary JSON structure rather than into a struct.

After mapping, I print the result of the mapping, and return the response with an appropriate content-type.

Serializing the HeaderMap

Serde takes care of serializing most of the data very well, but doesn't like the HeaderMap which contains the request's headers. HeaderMap is a multimap (a map structure that can associate multiple values with a single key) and as such, doesn't easily serialize to JSON.

To solve this, I decided to turn the HeaderMap into a HashMap<String, String>, simply creating a comma-separated string for headers that have multiple values. Not the most elegant or robust solution, but hey, it works.

Also, because the header_value.to_str function 'yields a &str slice if the HeaderValue only contains visible ASCII chars' (according to the docs), I fall back to "Non-ASCII header value" to handle cases where it contains non-ASCII characters. Again: it works.

fn to_string_map(headers: &HeaderMap) -> HashMap<String, String> {
    let mut map = HashMap::new();
    for (header_name, header_value) in headers.iter() {
        let k = header_name.as_str();
        let v = header_value
            .to_str()
            .unwrap_or("Non-ASCII header value")
            .into();

        match map.get_mut(k) {
            None => {
                map.insert(k.into(), v);
            }
            Some(old_val) => *old_val = format!("{}, {}", old_val, v),
        }
    }

    map
}
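
The get_mut/insert match in this function is effectively an upsert. One way to express that with the standard library is HashMap's entry API. Here is a minimal sketch of the same joining logic, written over plain (name, value) string pairs so that it runs without hyper; `join_repeated_headers` is a name made up for this example, not something from the repo.

```rust
use std::collections::HashMap;

/// Same idea as to_string_map, but over plain (name, value) pairs.
/// Repeated names are joined into one comma-separated string.
fn join_repeated_headers<'a, I>(headers: I) -> HashMap<String, String>
where
    I: IntoIterator<Item = (&'a str, &'a str)>,
{
    let mut map: HashMap<String, String> = HashMap::new();
    for (name, value) in headers {
        map.entry(name.to_string())
            // Key already present: append to the existing value in place.
            .and_modify(|existing| {
                existing.push_str(", ");
                existing.push_str(value);
            })
            // Key missing: insert the first value as-is.
            .or_insert_with(|| value.to_string());
    }
    map
}
```

The trade-off is that `entry` needs an owned key, so this allocates a String for every header, even repeated ones, whereas the get_mut version only allocates the key on first insert. For a request inspector that hardly matters.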

The unexpected challenges

So what made this so difficult? Why didn't it work out as I expected? Well, here are some of the issues I ran into:

The request body
As briefly mentioned up top, I originally wanted to also include the request body in the response. I spent too much time on trying to make this work before realizing that I should leave it out for now.

This turned out to be difficult because I couldn't easily parse the body as a String and include it in the output JSON. After a bit of thought, I also realized that I can't just assume that the body is JSON (or even a string), so it's more work than I expected.

Converting between different types
Related to the issues with the request body is conversion between different data types, and especially between types that are and aren't serializable by Serde. It felt like there was a lot of juggling types around just to please the compiler.

Manually serializing data types
While most of the data types can easily be represented as strings in this case, the header map needed some work. While it wasn't a very difficult exercise, it took more time than expected, especially because there wasn't an obvious way to perform an upsert-like action into a HashMap.

Lack of examples
This could be me or it could be the documentation, but I found it difficult to do what I wanted. I'd expected there to be more information on getting data from a request, but it's quite possible that I just didn't read far enough.

I'm ... rusty
It's been a while since I last worked with Rust, and the borrow-checker was stricter than I remember.

Working directly with Hyper?
I don't know whether this was much of an issue or not. It gave me quick and easy access to the endpoint setup I wanted, but it might have introduced other complications. That said, it looks as if Actix simply re-exports a lot of Hyper's data types, so I don't know how much of a difference that would have made.
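
Coming back to the request body issue: since the body can't be assumed to be JSON or even text, one option is to decide how to represent it only after looking at the raw bytes. The sketch below assumes the bytes have already been collected. In hyper 0.13, I believe `hyper::body::to_bytes(req.into_body()).await` does that, but treat that as an assumption and check the docs. The `BodyRepr` enum and `describe_body` function are made up for this example.

```rust
/// Hypothetical type: how the body could be represented in the JSON response.
#[derive(Debug, PartialEq)]
enum BodyRepr {
    /// No body was sent.
    Empty,
    /// The bytes were valid UTF-8, so they can be echoed as a string.
    Text(String),
    /// Anything else: report only the size rather than guessing an encoding.
    Binary { len: usize },
}

/// Classifies raw body bytes without assuming they are JSON or text.
fn describe_body(bytes: &[u8]) -> BodyRepr {
    if bytes.is_empty() {
        return BodyRepr::Empty;
    }
    match std::str::from_utf8(bytes) {
        Ok(text) => BodyRepr::Text(text.to_owned()),
        Err(_) => BodyRepr::Binary { len: bytes.len() },
    }
}
```

One thing to watch for when wiring something like this into the handler: `into_body` consumes the request, so the headers, path, and method would have to be extracted before that call.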

Wrapping up

Even if things didn't go exactly as planned, it was a fun, and at times very frustrating, little project. Having looked a little bit more at the Actix docs, I have found a few sections that make me think it could be quite suitable after all, so I'll probably rewrite the project some fourteen times in the coming week.

Next time I'll hopefully have something a bit more polished to show off.

Peace.



Thomas Heartman is a developer, writer, speaker, and one of those odd people who enjoy lifting heavy things and putting them back down again. Preferably with others. Doing his best to gain and share as much knowledge as possible.