Constraining Shapes

I’m thinking of a VM map that looks something like this, duplicated at each hosting provider:

zCloud

Overall hypervisor, installed on the “bare metal” I rent; all other VMs are internal to this hypervisor

Does not host any client-facing or VM-facing services. Forwards (or drops) almost all inbound traffic to zFront, except for its own SSH service. Allows most outbound connections
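
As a rough sketch of that forwarding policy — not the actual config — here is what it might look like in nftables. The interface names ("ext0", "br0") and the zFront address (10.0.0.2) are placeholder assumptions:

```
# nftables sketch on zCloud; interfaces and addresses are placeholders
table ip nat {
  chain prerouting {
    type nat hook prerouting priority dstnat;
    iif "ext0" tcp dport 22 accept     # keep the hypervisor's own SSH local
    iif "ext0" dnat to 10.0.0.2        # everything else inbound -> zFront
  }
}
table inet filter {
  chain forward {
    type filter hook forward priority filter; policy drop;
    ct state established,related accept
    iif "ext0" ip daddr 10.0.0.2 accept   # forwarded inbound, zFront only
    iif "br0" oif "ext0" accept           # most outbound from guests
  }
}
```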

zNet

Provides basic infrastructure to other VMs inside zCloud. Mostly exists to keep services off of zCloud proper, so the hypervisor config and complexity are as low as possible

Does not require any other VMs for operation. May use other VMs for maintenance tasks. Does not connect to zStore

zFront

Inbound connection termination, including TLS, for all client-facing traffic

Server-side endpoints default to local services but will proxy to the remote peer when local services are down. zFront is also the recommended endpoint for inter-service connections, to take advantage of failover and other routing functions

Does not require any other VMs for operation. May use other VMs for maintenance tasks. Does not connect to zStore

  • Load balancing & rate limiting (haproxy)
  • Certificate management² (haproxy)
  • Internal availability monitoring & failover
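
A hedged sketch of the failover piece in haproxy terms — the backend name, addresses, and health-check path are placeholder assumptions, not the real config:

```
# haproxy.cfg fragment on zFront (sketch; names and addresses are placeholders)
frontend https-in
    bind :443 ssl crt /etc/haproxy/certs/
    default_backend app

backend app
    option httpchk GET /healthz
    # Prefer the local service; "backup" means the remote peer is only
    # used once every non-backup server has failed its health checks
    server local 10.0.1.4:8080 check
    server peer  peer.example.net:443 ssl verify required ca-file /etc/ssl/certs/ca-certificates.crt check backup
```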

zStore

Distributed storage provider for all user data, replicated to the remote zStore peer

Does not require any other VMs for operation but will interact with its peer. May use other VMs for maintenance tasks. No services exposed outside the local and peer networks

  • Distributed FS (cephfs)
  • Distributed block device (ceph RBD)
  • Other ceph-RADOS-backed storage
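
For illustration, client-side use of the first two interfaces might look like the following shell sketch, assuming standard ceph tooling; the pool, image, and monitor address are placeholders:

```
# RBD: create and map a block device (placeholder pool/image names)
rbd create --size 10G vms/zpod-scratch
rbd map vms/zpod-scratch                  # appears as /dev/rbdN

# CephFS: kernel mount of the shared filesystem (placeholder monitor address)
mount -t ceph 10.0.1.5:6789:/ /mnt/cephfs \
    -o name=admin,secretfile=/etc/ceph/admin.secret
```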

zPod

Primary container host; backend for most proxied connections

Backs all container volumes with zStore mounts. Local data is mostly limited to compose.yaml files; everything else should be either ephemeral (container images) or stored in a ceph volume
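
Concretely, a zPod compose.yaml might bind container volumes to paths under the host’s CephFS mount — the mount point, paths, and service here are placeholder assumptions:

```yaml
# compose.yaml sketch on zPod; /mnt/cephfs is assumed to be the zStore mount
services:
  web:
    image: nginx:stable          # image cache stays ephemeral, per the rule above
    volumes:
      - /mnt/cephfs/volumes/web/html:/usr/share/nginx/html:ro
```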

Depends on zStore and zFront at runtime

Provides lots of important services via containers (like the primary nginx proxy, a backup server, etc.) but only one host-level service:

zCluster

Container host for things that will not use ceph storage. These services are expected to provide their own replication/clustering (if needed) with their remote peer, and to use raw local storage

Does not require any other local VMs for operation but will interact with its peer. May use other VMs for maintenance tasks. No services exposed outside the local and peer networks

Common Services

All VMs will run a few services for VM-local use, including:

  • Caching DNS resolver³
  • NTP client⁴
  • Local log collector⁵

Storage Encryption

SSH needs a secret host key. VMs connected to zStore will have credentials for the ceph cluster. Maintenance tools like cron are likely to need credentials or other secrets. These and other sensitive config in the VM’s local filesystem will be protected at rest by zCloud’s dm-crypt⁶, applied to the entire VM image

For user data (e.g. container volumes) cephfs provides directory-level encryption, allowing containers to enforce their own encryption policies either at the container mount layer or internally. But I expect ceph itself will sit on dm-crypt block devices, to provide belt-and-suspenders protection
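
As an ops-level sketch of both layers — the device names and paths are placeholders, and the CephFS side assumes the fscrypt-based directory encryption available in recent ceph releases:

```
# Layer 1: dm-crypt under local VM storage (and under ceph OSD devices,
# e.g. via `ceph-volume ... --dmcrypt`)
cryptsetup luksFormat /dev/vdb
cryptsetup open /dev/vdb cryptdata

# Layer 2: directory-level encryption inside CephFS via fscrypt
fscrypt setup /mnt/cephfs
fscrypt encrypt /mnt/cephfs/volumes/secure
```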

  1. It’s useful to have site-local records and caching at each peer ↩︎
  2. Using ACME http challenges for credential-less certificate signing, so that we hold very few secrets in this world-facing VM ↩︎
  3. Configured to prefer the zNet DNS service but transparently fallback to public DNS for all services and containers on the VM ↩︎
  4. It feels like you shouldn’t need NTP on nested VMs, but it’s actually a tricky problem to synchronize VM clocks even with kvm-clock tick sources. NTP is a lightweight fix ↩︎
  5. Which holds a small local log file but mostly forwards to the zNet logger (when available) ↩︎
  6. zCloud itself must leave the base OS and host SSH keys accessible before user interaction. But it will separately protect all other sensitive data via an encrypted post-unlock mount ↩︎