10 years of infra

Reflecting on 10 years building infra.

Oct 9
8:13 AM

Ten Years of Infrastructure



tl;dr: peek at the insight delivered by the new infra stack behind this website!



   Ten years ago, I was divesting from social media. However, I still wanted a platform to share info about myself. With a marginal understanding of HTML & CSS, I cobbled together the base format of this website that still exists today. The splash page and general site layout have hardly changed after a decade!


Building a website


When I started building my website, I did what most people probably do. I googled how to set up my own website. I decided to go with HostGator because they were the cheapest that met my goals of web & email hosting backed by a VPS (virtual private server) with CLI access. In a few hours, I had a functional website configured.



Oh how simple the internet is when you let someone else handle the specifics.


Then in ~2016, I joined Uber as an infrastructure engineer. I had access to so many Linux servers, I had no need to toy around on my small cloud VPS. I was actively encouraged to develop on servers with effectively limitless resources. My website stagnated a bit, and I really didn’t log in to my VPS for years. It probably had its distro updated multiple times, unbeknownst to me, as I was still getting my emails and my website was still humming along.

Shrinkflation


This year, I decided I wanted better insight into my web presence - who’s accessing it, from where, etc. I log in to my trusty old server and start installing the basic infra products I’ll rely on:

  • jail-shell@hostgator-1234:/home/jailshell#



"… I’m pretty sure I installed zsh. What the hell is jailshell? Sounds bad…"

    jail-shell@hostgator-1234:/home/jailshell# sudo apt-get install nginx

    apt-get install nginx

    E: Could not open lock file /var/lib/dpkg/lock-frontend - open (13: Permission denied)

    E: Unable to acquire the dpkg frontend lock (/var/lib/dpkg/lock-frontend), are you root?

"… yeah... that's bad…"

Okay, this is a blocker… Unbeknownst to me, HostGator had slowly replaced my VPS with something that still hosted my web & email services but removed any real server permissions I'd previously had. HostGator’s customer agreements likely shifted over the years, allowing the restrictions they’d imposed on my instance. I followed up to see how much an “unrestricted” VPS would cost to meet my needs. HostGator wanted more than $40/month! The entire experience left me wanting more. I shopped around and found Hostinger locking in unrestricted VPS leases for 3 years at $4 per month.

Moving to Greener Pastures


I had ~45 days of overlap between my Hostinger contract starting and my HostGator contract expiring. This was ample time to transfer my domain & DNS and configure minimal web and email hosting to get my email and website functional again. But I didn’t move hosting services to save a couple bucks a month. I want to build.

When I started working for Uber, the first tool I really helped build was a global CDN for autonomous vehicle map-making. It was a glorified web server, but it saved us tens of millions of dollars and thousands of man-hours. I wanted to replicate that production service, so I decided to build my website like a truly scalable CDN with the following requirements:

  • Proxy Web Server - Enables scalability towards website users.
  • Reverse Proxy Web Server - Enables scalability towards website servers.
  • Monitoring Stack - database and user interface providing business insight.
  • Email Server - I don't know - it sends and receives email.
  • Composable - I should be able to spin up a new server instance, pull some data from GitHub, and be able to serve web traffic.
  • Dynamic - Services and/or the website should automatically reflect changes in the source of truth, hosted in GitHub.
  • VPN - Administrative functions should primarily be carried out through some mechanism hardened against the open internet.
  • Extensible - we shouldn’t have to refactor any modules to support reasonable future requests.

Picking the Ingredients


    I chose most of these tools because of familiarity: I knew they'd fit my use case without any surprises.

    Proxy Web Server - HAProxy.
    - Routes traffic based on the destination IP, such as a public IP or a VPN IP.
    - Routes traffic based on the URL. For instance, the domain monitor.sethsmith.net currently routes differently than sethsmith.net.
    - Can configure a VIP (virtual IP) and route traffic across all servers subscribed to it. I’m not using this functionality currently.
    Reverse Proxy Web Server - Nginx.
    - Caches relevant downstream data. For example, the pictures on this site were probably served from the nginx cache, instead of from the actual filesystem.
    - Emits enriched web traffic logs and metrics ideal for business insight.
    - Routes web requests based on request heuristics. For instance, requests for my resume, a pdf, are routed differently than requests for photos or webpages.
    Monitoring Stack - Elastic.
    - Schemaless datastore, ideal for storing semi-structured logs and other relevant info. Nginx and HAProxy logs, along with other valuable data, are stored here for monitoring & alerting. Easy to work with and durable - define log files for Elasticsearch to ingest, and it just works indefinitely.
    - Single Pane of Glass - Kibana frontend natively integrates with Elasticsearch. It removes the burden of developing a frontend to expose aggregated metrics.
    Email Server - Docker Mailserver.
    - An industry standard containerized email server.
    - It was either this or MailCow, which is fun but not as mature.
    Composable - Docker.
    - Natively supports all services in this stack. Most of them are already onboarded to Docker.
    - Enables service mobility - if I need an extra mailserver, I can spin one up trivially with the same configs, on this VPS, or another.


    Dynamic - GitHub + Cron.
    - All configs are updated and stored in a private GitHub repo.
    - Cron automatically pulls down these files routinely.
    - Any change in the source of truth causes the affected service to restart with the updated config.
    Secure Access - WireGuard.
    - Secure VPN allows connection from any predefined endpoint.
    - Allows significant hardening of internet endpoints while maintaining secure privileged access.

    Extensible - Everything is off the shelf except the plumbing and the decorations.
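
To make the Dynamic item above concrete, here is a minimal sketch of the change-detection step, not the actual job running on this server. It assumes cron has already pulled the repo into one directory and that the live configs sit in another. The directory layout, the function names, and the file-to-service mapping are all hypothetical.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def changed_configs(pulled_dir: Path, live_dir: Path) -> List[str]:
    """Names of files in pulled_dir that are missing from, or differ in, live_dir."""
    changed = []
    for pulled in sorted(pulled_dir.iterdir()):
        if not pulled.is_file():
            continue
        live = live_dir / pulled.name
        if not live.exists() or file_digest(live) != file_digest(pulled):
            changed.append(pulled.name)
    return changed


def services_to_restart(changed: List[str], owners: Dict[str, str]) -> List[str]:
    """Map changed config files to the services that own them, without duplicates."""
    return sorted({owners[name] for name in changed if name in owners})
```

A cron-driven script could pull the repo, call `changed_configs`, copy the changed files into place, and restart only the services that `services_to_restart` returns, so an untouched service never bounces.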

    Putting It All Together



    This gets overly complex unless we separately assess the three use cases delivered by this stack:
    - Web Serving.
    - Mail Serving.
    - Monitoring.

    Web Serving


        HAProxy sits on the internet, bound to the IP address of this website. It routes traffic based on domain to either my website, sethsmith.net, or the monitor, exposed at monitor.sethsmith.net. Monitor traffic is routed through Nginx to Kibana. Website traffic (sethsmith.net) is routed through Nginx, where the URL is evaluated and the respective files are served by Nginx. If Nginx has the data cached, you’re served the cached data. If not, Nginx fetches the data for you and caches it for the next requester.
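
The two decisions in that flow, picking a backend by domain and serving from cache before fetching, can be modeled in a few lines. This is only a toy model of the behavior described above, not HAProxy or Nginx configuration. The backend labels, the `route` function, and the `ReadThroughCache` class are hypothetical names chosen for illustration; only the two hostnames come from this post.

```python
from typing import Callable, Dict, Optional

# Hostnames are from the post; the backend labels are made up for this sketch.
ROUTES: Dict[str, str] = {
    "sethsmith.net": "website",
    "monitor.sethsmith.net": "kibana",
}


def route(host_header: str) -> Optional[str]:
    """Pick a backend from the Host header, ignoring case and any :port suffix."""
    host = host_header.split(":", 1)[0].strip().lower()
    return ROUTES.get(host)


class ReadThroughCache:
    """Serve cached bodies when present; otherwise fetch once and remember the result."""

    def __init__(self, fetch: Callable[[str], bytes]) -> None:
        self._fetch = fetch
        self._store: Dict[str, bytes] = {}
        self.hits = 0
        self.misses = 0

    def get(self, path: str) -> bytes:
        if path in self._store:
            self.hits += 1
            return self._store[path]
        # Cache miss: fetch from the origin and keep it for the next requester.
        self.misses += 1
        body = self._fetch(path)
        self._store[path] = body
        return body
```

The first request for a path counts as a miss and triggers the fetch; every later request for the same path is a hit. A real cache would also need expiry and size limits, which this sketch leaves out.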


    Mail Serving


        A DNS record for mail.sethsmith.net resolves to my public IP. From there, docker-mailserver listens for and processes mail requests.


    Monitoring


        The monitor is primarily available through the VPN. The stack is also extensible to many other infrastructure control-plane needs relating to monitoring.
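
To give a feel for the kind of data behind "who’s accessing it, from where", the sketch below parses one web access log line into a structured document. It assumes Nginx's stock `combined` log format, which may not match the enriched format this stack actually emits, and it is not how Elasticsearch ingests the files here. The `parse_line` function and the sample line are made up for illustration.

```python
import re
from datetime import datetime
from typing import Any, Dict, Optional

# Nginx "combined" format:
# $remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent
# "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<remote_addr>\S+) - (?P<remote_user>\S+) \[(?P<time_local>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<body_bytes_sent>\d+) '
    r'"(?P<referer>[^"]*)" "(?P<user_agent>[^"]*)"'
)


def parse_line(line: str) -> Optional[Dict[str, Any]]:
    """Turn one combined-format log line into a dict, or None if it does not match."""
    match = COMBINED.match(line)
    if match is None:
        return None
    doc: Dict[str, Any] = match.groupdict()
    doc["status"] = int(doc["status"])
    doc["body_bytes_sent"] = int(doc["body_bytes_sent"])
    # Convert Nginx's local time, e.g. 09/Oct/2024:08:13:00 +0000, to ISO 8601.
    parsed = datetime.strptime(doc.pop("time_local"), "%d/%b/%Y:%H:%M:%S %z")
    doc["timestamp"] = parsed.isoformat()
    # Split "GET /path HTTP/1.1" into separate fields when it is well formed.
    parts = doc["request"].split()
    if len(parts) == 3:
        doc["method"], doc["path"], doc["protocol"] = parts
    return doc


# Hypothetical sample line; the address is from a documentation-only IP range.
SAMPLE = (
    '203.0.113.7 - - [09/Oct/2024:08:13:00 +0000] '
    '"GET /resume.pdf HTTP/1.1" 200 5120 "-" "curl/8.4.0"'
)
```

Calling `parse_line(SAMPLE)` yields a dict with typed `status` and `body_bytes_sent` fields plus separate `method` and `path` keys, which is roughly the shape a schemaless store can aggregate on without a predefined schema.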