Using SRV records for service discovery in Python

This post introduces SRV records, explains how they can be used for service discovery, and shows how you can easily use them in Python applications with a library called srv_hijacker.

Terminology

Let’s first briefly go over some terms I’ll be using:

Service Registry

A service registry is a database of all services and their currently running instances. Examples of open-source tools that can function as service registries: ZooKeeper, Consul.

Service Discovery

For one service to talk to another service, it needs an IP and a port. How does it get this?

In a traditional setup where apps run on physical machines, the IPs are relatively static, so they can be hardcoded in the application.

In modern setups, where multiple applications run on a set of machines that keep getting scaled up, scaled down, and recycled, it is unlikely that the IP and port of a given service will remain constant.

This is where service discovery comes in. When a service wants to talk to another service, it must ‘discover’ the service’s IP and port by querying a central location.
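
To make this concrete, here's a minimal sketch of that lookup, assuming an in-memory dict stands in for the central location. The service names, addresses, and the discover function are all made up for illustration; a real registry like Consul would be queried over the network.

```python
import random

# Hypothetical registry contents: service name -> running instances as (ip, port).
REGISTRY = {
    "billing": [("10.0.0.5", 8080), ("10.0.0.6", 8081)],
    "search": [("10.0.1.9", 9200)],
}


def discover(service_name):
    """Return the (ip, port) of one running instance of the named service."""
    instances = REGISTRY.get(service_name)
    if not instances:
        raise LookupError(f"no running instances of {service_name!r}")
    # Pick an instance at random so that calls spread across instances.
    return random.choice(instances)


ip, port = discover("search")
print(ip, port)
```

The caller only ever knows the name "search". Which instance it ends up talking to is the registry's business.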

DNS records

I’m assuming that you’re familiar with this, so won’t add details here. If you need a refresher, here are some resources:

Service Discovery via DNS

What does service discovery boil down to? Given a service name (well-known), retrieve an IP + port. Doesn’t this sound very similar to what DNS does?

google.com is a service we use every day, by its well-known name. The actual IP might keep changing, but the DNS server we use acts as a service registry and ensures that we always get IPs that are up-to-date.

The limitation of common DNS records is that they provide only an IP, not a port. The most commonly used records are A (or AAAA for IPv6), which maps a name to an IP, and CNAME, which aliases one name to another. Neither of these specifies a port. The port is always decided by the caller, because we use well-known protocols like HTTP and HTTPS that have default ports (80 and 443 respectively).

Since common DNS records don’t specify a port, their use-case for service discovery is limited.
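
Here's a small sketch of what "the port is decided by the caller" means in practice, using only the standard library. The effective_port helper and the DEFAULT_PORTS table are my own illustration, not part of any library: the port comes either from the URL itself or from the scheme's default, and never from DNS.

```python
from urllib.parse import urlsplit

# Default ports for the two schemes mentioned above.
DEFAULT_PORTS = {"http": 80, "https": 443}


def effective_port(url):
    """Return the port a client would connect to for this URL."""
    parts = urlsplit(url)
    # An explicit port in the URL wins; otherwise fall back to the scheme default.
    if parts.port is not None:
        return parts.port
    return DEFAULT_PORTS[parts.scheme]


print(effective_port("https://google.com/"))       # 443
print(effective_port("http://localhost:8500/v1"))  # 8500
```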

SRV records

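SRV is a relatively new DNS record type (standardized in RFC 2782) that fixes the limitation I mentioned above. Rather than an IP alone, an SRV record specifies a target hostname and a port, along with a priority and a weight for choosing between multiple targets.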

SRV records can be used for service discovery, as long as all clients recognize the records and use the specified host and port accordingly. Since SRV is a relatively new record type, not all clients honor these records. Browsers don’t, and most HTTP client libraries (like requests, Python’s most popular HTTP client) don’t either.
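
To show what a client has to do to honor an SRV record, here's a rough sketch. It assumes the record data arrives in the usual text form "priority weight port target" (the field order from RFC 2782). The SrvRecord, parse_srv and pick names are hypothetical, and the selection logic is a simplification of the RFC's algorithm: lowest priority first, then a weighted random choice.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SrvRecord:
    priority: int  # lower values are tried first
    weight: int    # relative share among records of equal priority
    port: int
    target: str    # hostname of the instance, still to be resolved to an IP


def parse_srv(rdata):
    """Parse SRV record data given as 'priority weight port target'."""
    priority, weight, port, target = rdata.split()
    return SrvRecord(int(priority), int(weight), int(port), target.rstrip("."))


def pick(records):
    """Choose one record: lowest priority wins, weights break ties randomly."""
    lowest = min(r.priority for r in records)
    candidates = [r for r in records if r.priority == lowest]
    weights = [r.weight for r in candidates]
    if sum(weights) == 0:
        # All weights are zero, so every candidate is equally likely.
        return random.choice(candidates)
    return random.choices(candidates, weights=weights, k=1)[0]


# Made-up answers for illustration.
answers = [
    parse_srv("10 5 8080 web-1.node.dc1.consul."),
    parse_srv("20 5 9090 web-2.node.dc1.consul."),
]
chosen = pick(answers)
print(chosen.target, chosen.port)  # web-1.node.dc1.consul 8080
```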

srv_hijacker can help here.

SRV hijacker

As mentioned above, requests doesn’t follow SRV records. Even if SRV records were returned by your DNS server, requests wouldn’t use them.

This is where srv_hijacker enters the picture. srv_hijacker monkey-patches urllib3 to ensure that all HTTP requests to certain URLs honor SRV records returned by a DNS server of your choice.

Speaking of monkey patching in Python, allow me to proudly present a picture I found on the interwebz:

[Image: monkeypatch]

The API for srv_hijacker is very simple:

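import srv_hijacker

srv_hijacker.hijack(
    # Hosts ending in "service.consul"; the dot is escaped so that it
    # matches a literal "." rather than any character.
    host_regex=r'service\.consul$',
    # DNS server to ask for SRV records. 8600 is the default port of
    # Consul's DNS interface.
    srv_dns_host='127.0.0.1',
    srv_dns_port=8600
)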

In English, this reads as:

For any HTTP request to a host that matches host_regex, query the DNS server identified by srv_dns_host and srv_dns_port to fetch an SRV record.

Use this SRV record to set the host and port on the request.
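
The two steps above can be sketched in a few lines. To be clear, this is not srv_hijacker's actual implementation, just an illustration of the idea: make_rewriter, rewrite and fake_srv_lookup are hypothetical names, the DNS query is replaced by a hard-coded table, and I'm assuming the regex is applied as a search rather than a full match.

```python
import re


def make_rewriter(host_regex, lookup_srv):
    """Build a function that swaps in the SRV host and port for matching hosts."""
    pattern = re.compile(host_regex)

    def rewrite(host, port):
        # Step 1: only hosts matching host_regex are looked up via SRV.
        if pattern.search(host):
            # Step 2: use the SRV answer in place of the original host and port.
            return lookup_srv(host)
        # Everything else goes through untouched.
        return host, port

    return rewrite


def fake_srv_lookup(host):
    # Hypothetical stand-in for an SRV query against srv_dns_host:srv_dns_port.
    return {"web.service.consul": ("10.0.0.5", 8080)}[host]


rewrite = make_rewriter(r"service\.consul$", fake_srv_lookup)
print(rewrite("web.service.consul", 80))  # ('10.0.0.5', 8080)
print(rewrite("google.com", 443))         # ('google.com', 443)
```

Requests to anything that doesn't match the regex behave exactly as before, which is what makes patching at this level reasonably safe.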

Consul is a common choice for a service registry that serves SRV records.