Hosting

I am thinking about setting up a hosting platform. I want to host my own things. I want to offer free hosting and other online services to people in my community. I would like to sell certain kinds of hosting in support of my tech business. All of those make me want a reliable hosting platform, but only one of them generates cashflow, so it has to be fairly cheap

Most of the things I imagine hosting would be deployed as containers. I want to control the machine where they run but I am not worried about high isolation or protection against hostile local processes. This is a community resource that will be used cooperatively

I would not like to pay E-Corp to host “compute” in an expensive cloud system that promises to send my kernel memory to the NSA in three availability zones. I also don’t need a whole computer, at least not of the class likely to be installed in places with good network access. Nor do I need a whole storage array, but would like to use part of one. So the plan is still a VM, just not the “compute” kind

I am willing to pay a smaller company for VM hosting, with the intent of finding a more human scale. Not a giant datacenter destroying a whole county, but a few racks in an office building with good telecom access

I am open to other ideas, but medium-sized “bare metal” VMs are my current thinking. Perhaps something like SSD Nodes. For $1080 I can have 3 years of this VM: 12 Intel cores, 96 GB RAM, 2 TB storage, and 48 TB/month outbound with 4ms ping times. It currently costs more than $1080 to get 96 GB of physical RAM so that seems like an okay deal
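Back-of-the-envelope on that listing, just to sanity-check the "okay deal" claim:

```shell
# quick arithmetic on the SSD Nodes listing above (prices from the listing, not a quote)
total_usd=1080
months=$((3 * 12))                          # 3-year term
echo "monthly: \$$((total_usd / months))"   # $30/month for 12 cores / 96 GB / 2 TB
```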

SSD Nodes is on my list specifically because they support nested emulation. I’ve been using that setup on both local[1] and hosted servers for a few years and I like it. It lets me design the whole system, including networking and resource isolation, into a portable container[2]. Being able to move to a new underlying host without hassle is important to me for both technical and personal reasons

Here’s the part I haven’t designed yet: Availability

I would like to use two of these VMs[3] to provide high(er) availability than the version where server updates mean downtime

To the extent that it’s plausible, my goal is for these two hosts to be hot-hot. They are not going to masquerade behind a common network address, but ideally they would both be listed in most DNS records (at least when they are both up) and both be able to service most requests
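A sketch of what that could look like in a zone file, with both hosts published as plain DNS round robin. The names and addresses here are placeholders (documentation ranges); a short TTL keeps the window small for pulling a dead host out of the record:

```
; both hosts answer for the same names; clients pick either address
www   300  IN  A  192.0.2.10     ; host-a
www   300  IN  A  203.0.113.20   ; host-b
mail  300  IN  A  192.0.2.10
mail  300  IN  A  203.0.113.20
```

This gets rough load spreading and survives one host being down, at the cost of some clients briefly retrying against the dead address until the TTL expires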

For anything with low-contention disk usage (like web or email) it’s probably sufficient to have a distributed filesystem. Something like GlusterFS[4] can sync files transparently and can be mounted directly into containers, which makes it easy to manage disk space. Even things like sqlite should mostly work, so long as the filesystem provides a working locking mechanism. Some things, like memcached, can cluster on their own. Some things would require actual coordination; PgSQL replication requires setup but is pretty smooth from a client perspective, even without IP-layer rerouting
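For the GlusterFS case, the 2-host setup is pleasantly small. A hypothetical sketch, assuming hosts named host-a and host-b with a brick directory already prepared on each (these commands need a real Gluster install and peered hosts, so this is illustration, not a runnable script):

```shell
# from host-a: add the second node to the trusted pool
gluster peer probe host-b

# create a replica-2 volume, one brick per host
# (gluster warns about split-brain risk on replica 2; an arbiter brick is the usual mitigation)
gluster volume create shared replica 2 \
    host-a:/srv/gluster/shared host-b:/srv/gluster/shared

gluster volume start shared

# each host mounts the volume locally; containers then bind-mount from /mnt/shared
mount -t glusterfs localhost:/shared /mnt/shared
```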

I’d take suggestions on how to structure this thing, and how to build the tools and procedures to run it. None of this is a new, unsolved problem that requires unique software; it’s more a question of how to realize this particular project. I’m trying to build something that can work on a community-level budget, and that generally meets my own proclivities[5], so I’m not necessarily looking for a prefab solution

  1. I think it’s handy to have a flexible abstraction layer for real hardware. I can pick whether there’s a VM dedicated to running a piece of hardware via PCI passthrough, or whether the hypervisor deals with drivers and enumeration and firmware and encryption and presents a simple virtio device to a dumb guest ↩︎
  2. As opposed to making API calls to someone’s cloud service. I’m not a fan of designs that rely on proprietary for-profit service calls to set basic configuration ↩︎
  3. Ideally on a different hosting platform for resiliency against provider failures (hence the desire for self-contained portability), but at least in a different datacenter ↩︎
  4. Or maybe CephFS? What’s the hotness these days? I have used Gluster, and for a 2-host deployment I like that it doesn’t need a separate metadata server, but development is slow these days. If I were building my own storage array I would do Lustre backed by ZFS, but that’s not this project ↩︎
  5. For example, I’m very partial to a design that completely describes all the non-user-data portions of the system as human-readable version-managed configuration. Ideally one that can start right from “connect to VNC” to self-deploy on any host ↩︎