Constraining Shapes

I’m thinking of a VM map that looks something like this, duplicated at each hosting provider:

zCloud

Overall hypervisor, installed on the “bare metal” I rent; all other VMs are internal to this hypervisor

Does not host any client-facing or VM-facing services. Forwards (or drops) almost all inbound traffic to zFront, except for its own SSH service. Allows most outbound connections

zNet

Provides basic infrastructure to other VMs inside zCloud. Mostly exists to keep services off of zCloud proper, so the hypervisor config and complexity are as low as possible

Does not require any other VMs for operation. May use other VMs for maintenance tasks. Does not connect to zStore

zFront

Inbound connection termination, including TLS, for all client-facing traffic

Server-side endpoints default to local services but will proxy to the remote peer when local services are down. zFront is also the recommended endpoint for inter-service connections, to take advantage of failover and other routing functions

Does not require any other VMs for operation. May use other VMs for maintenance tasks. Does not connect to zStore

  • Load balancing & rate limiting (haproxy)
  • Certificate management[2] (haproxy)
  • Internal availability monitoring & failover
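
As a rough sketch of the "default to local, proxy to the peer when local is down" rule, here is just the decision logic in Python. This is not haproxy config (there the same idea is usually expressed by marking the peer's server line as `backup`), and the `Backend` class and its field names are made up for illustration:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Backend:
    name: str      # hypothetical label, e.g. "local" or "peer"
    healthy: bool  # result of the most recent availability check


def choose_backend(local: Backend, peer: Backend) -> Optional[Backend]:
    """Prefer the local service; use the remote peer only when local is down."""
    if local.healthy:
        return local
    if peer.healthy:
        return peer
    # Neither side is up; the caller would have to return an error.
    return None


if __name__ == "__main__":
    chosen = choose_backend(Backend("local", False), Backend("peer", True))
    print(chosen.name if chosen else "no backend available")
```

The peer is only ever a fallback here, never a load-sharing partner, which matches the wording above but is still my reading of it.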

zStore

Distributed storage provider for all user data, replicated to the remote zStore peer

Does not require any other VMs for operation but will interact with its peer. May use other VMs for maintenance tasks. No services exposed outside the local and peer networks

  • Distributed FS (cephfs)
  • Distributed block device (ceph rbd)
  • Other ceph-RADOS-backed storage

zPod

Primary container host; backend for most proxied connections

Potentially more than one VM, if splitting makes admin easier, but all sharing the same host-level config. I’m aware there are many orchestration systems available, but that feels like unwelcome abstraction for a system where a single reboot/kernel panic on zCloud would bring down all nodes and networks.[3] And where I will be deploying containers from people who

Depends on zStore and zFront at runtime. May interact with other VMs for maintenance tasks, or from within containers

One host-level service:

And lots of containers. Some for admin support, like a backup server. Some for my own community projects, like free web and email hosting, or the 4114.us online services. Some for other people’s own containers, to support community projects that need more complex hosting than the simple systems I maintain

And ideally some non-free hosting for businesses and other entities that can afford it, to generate a little cashflow for this system

All container volumes are backed with zStore mounts. Local data is mostly limited to compose.yaml files; everything else should be either ephemeral (container images) or stored in a ceph volume
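
One way to keep that rule honest would be a small check over each compose file. The sketch below works on an already-parsed compose document (a plain dict) and flags any short-syntax volume whose source is not under the zStore mount. The mount point `/mnt/zstore` and the function name are assumptions for illustration, and the check is purely lexical (it does not resolve `..` or symlinks):

```python
from pathlib import PurePosixPath

# Hypothetical path where the cephfs volume is mounted on the container host.
ZSTORE_MOUNT = PurePosixPath("/mnt/zstore")


def volumes_outside_zstore(compose: dict) -> list:
    """Return "service: source" for every volume not sourced under ZSTORE_MOUNT.

    Only the short "source:target[:mode]" string syntax is handled. Named
    volumes, relative paths, and anonymous volumes are all reported, since
    they would normally end up on local storage.
    """
    offenders = []
    for service, spec in compose.get("services", {}).items():
        for volume in spec.get("volumes", []):
            source = volume.split(":", 1)[0]
            if not PurePosixPath(source).is_relative_to(ZSTORE_MOUNT):
                offenders.append(f"{service}: {source}")
    return offenders


if __name__ == "__main__":
    example = {
        "services": {
            "web": {"volumes": ["/mnt/zstore/web/html:/usr/share/nginx/html:ro"]},
            "db": {"volumes": ["./data:/var/lib/data"]},
        }
    }
    print(volumes_outside_zstore(example))
```

Here the `web` service passes and the `db` service is flagged for its relative `./data` source.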

zMah

Container host for things that will not use ceph storage. These services are expected to provide their own replication with their remote peer (if needed), and to use raw local storage

Does not require any other local VMs for operation but will interact with its peer. May use other VMs for maintenance tasks, or from within containers

Many of its containers would have no services exposed outside the local and peer networks, but that is not a technical requirement
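
Pulling the "requires" statements for each VM into one place, the map can be written down as data and checked. This is only a sketch of the constraints as I read them above (every VM needs the zCloud hypervisor; zPod also needs zStore and zFront; zNet and zFront must not connect to zStore). The names `REQUIRES`, `NO_ZSTORE`, `boot_order` and `zstore_violations` are mine:

```python
from graphlib import TopologicalSorter

# Runtime requirements: each VM maps to the set of VMs it needs to operate.
# Maintenance-only and peer interactions are deliberately left out.
REQUIRES = {
    "zCloud": set(),
    "zNet": {"zCloud"},
    "zFront": {"zCloud"},
    "zStore": {"zCloud"},
    "zPod": {"zCloud", "zStore", "zFront"},
    "zMah": {"zCloud"},
}

# VMs that the descriptions say do not connect to zStore.
NO_ZSTORE = {"zNet", "zFront"}


def boot_order(requires: dict) -> list:
    """Return a start order in which every VM comes after the VMs it requires.

    Raises graphlib.CycleError if the requirements are circular.
    """
    return list(TopologicalSorter(requires).static_order())


def zstore_violations(requires: dict) -> list:
    """Return the VMs that require zStore despite being barred from it."""
    return sorted(vm for vm in NO_ZSTORE if "zStore" in requires.get(vm, set()))


if __name__ == "__main__":
    print(boot_order(REQUIRES))
    print(zstore_violations(REQUIRES))
```

The useful property is that zPod is the only VM with runtime requirements beyond the hypervisor, so everything else can come up in any order once zCloud is running.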

Runs one host-level service:

And containers like:

Common Services

All VMs will run a few services for VM-local use including:

Storage Encryption

SSH needs a secret host key. VMs connected to zStore will have credentials for the ceph cluster. Maintenance tools like cron are likely to need credentials or other secrets. These and other secrets in the VM’s local filesystem will be protected at rest by zCloud’s dm-crypt[7]

For user data (e.g. container volumes) cephfs provides directory-level encryption via the fscrypt interface, allowing containers to enforce their own encryption policies. But I expect the ceph block devices will also use dm-crypt, to provide belt-and-suspenders protection, and to make the config uniform across all storage stripes

  1. It’s useful to have site-local records and caching at each peer ↩︎
  2. Using ACME http challenges for credential-less certificate signing, so that we hold very few secrets in this world-facing VM ↩︎
  3. Or for a system where I will be deploying containers that are not well productized, or that are part of someone’s attempt to learn about containerized hosting. I want it to be simple enough for non-professional users (and easy for me to help with) ↩︎
  4. Configured to prefer the zNet DNS service but transparently fall back to public DNS for all services and containers on the VM ↩︎
  5. It feels like you shouldn’t need NTP on nested VMs, but it’s actually a tricky problem to synchronize VM clocks even with kvm-clock tick sources. NTP is a lightweight fix ↩︎
  6. Which holds a small local log file but mostly forwards to the zNet logger (when available) ↩︎
  7. zCloud itself must leave the SSH host keys accessible so that an administrator can connect to unlock storage. But it will separately protect all other sensitive data via an encrypted mount point for config, and delay starting any services that require it ↩︎