Deploy AETHER on Google Cloud

Cloud Run · Cloud SQL · Memorystore · GCS · CloudAMQP

1. Prerequisites

Accounts, tooling, and access you need before provisioning anything.

Install the gcloud CLI and enable APIs

Install the gcloud CLI, authenticate, set the active project and region, then enable the APIs AETHER needs.

Authenticate, set project/region, enable APIs
gcloud auth login
gcloud config set project <PROJECT_ID>
gcloud config set run/region africa-south1
gcloud services enable run.googleapis.com sqladmin.googleapis.com \
  redis.googleapis.com vpcaccess.googleapis.com artifactregistry.googleapis.com \
  secretmanager.googleapis.com
Note
Commands here are bash (Cloud Shell is bash). On Windows PowerShell, rewrite the loops and replace the line-continuation backslash (\) with a backtick (`). AWS is the run-validated reference path; GCP follows the same AETHER architecture.

Authenticate and pull the AETHER images from GHCR

AETHER images are at ghcr.io/rizaanlakay/afroai. Authenticate Docker with a GitHub PAT (classic) that has read:packages, then pull the four component images. You re-tag and push them to Artifact Registry in Phase 3.

1 — Authenticate to GHCR
echo "<GITHUB_PAT>" | docker login ghcr.io -u <GITHUB_USERNAME> --password-stdin
2 — Pull all AETHER images
TAG=1.0.4
for r in agent-web agent-service agent-api agent-worker; do
  docker pull ghcr.io/rizaanlakay/afroai/$r:$TAG
done
Note
Images are at ghcr.io/rizaanlakay/afroai/{service}:<tag> — use the latest release tag. You never build from source.

Choose your sizing — production vs. low-cost evaluation

Cloud Run scales to zero, so the Cloud Run services of an idle test deployment cost almost nothing; Cloud SQL and Memorystore bill for as long as they are running. For evaluation use --min-instances=0, a shared-core Cloud SQL tier (db-g1-small), and a small Memorystore instance (1 GB, BASIC tier). Production targets ~1,000 concurrent conversations: --min-instances=3 for agent-web/agent-service, a dedicated-core Cloud SQL instance with 4+ vCPUs (for example db-custom-4-16384), and a STANDARD_HA Memorystore instance. Phase 7 has stop/start scripts.

Warning
Cloud Run is stateless — SignalR requires the Redis backplane (built into AETHER), WebSocket support, --session-affinity, and a long request timeout (--timeout=3600). Image generation holds a request 70–120s, so the long timeout matters.

2. Provision data & infrastructure

Create the managed PostgreSQL, Redis, object storage, message broker, and secrets.

Create Cloud SQL for PostgreSQL with pgvector

Create a PostgreSQL 16 instance. AETHER stores the schema and the RAG embeddings (3072-dim, pgvector) here. pgvector must be enabled via the cloudsql.enable_pgvector flag at creation time. The admin user is the DB admin — the app's web-user login is created in Phase 5.

Create the instance (db-g1-small for test, db-custom-4-16384 for production)
gcloud sql instances create afroai-pg \
  --database-version=POSTGRES_16 --tier=db-g1-small --region=africa-south1 \
  --database-flags=cloudsql.enable_pgvector=on
gcloud sql users create afroai_admin --instance=afroai-pg --password='<STRONG_PASSWORD>'
gcloud sql databases create AfroAI --instance=afroai-pg
Danger
Without cloudsql.enable_pgvector=on the initializer fails on CREATE EXTENSION vector. Set it at creation (changing it later requires a restart). Admin username: letters/digits/underscore only.

Create Memorystore for Redis

Redis is the distributed cache (afroai: prefix) and the SignalR backplane (afroai-signalr: prefix). Connect via the instance's private IP.

Create the Redis instance (BASIC 1GB for test, STANDARD_HA for production)
gcloud redis instances create afroai-redis --size=1 --region=africa-south1 --tier=BASIC
Note
Connection string: <private-ip>:6379,abortConnect=false (Memorystore Basic has TLS off by default). Cloud Run needs a Serverless VPC Access connector to reach the private IP; Phase 4 refers to it as afroai-vpc. Memorystore cannot be stopped, so delete it to zero its cost when idle (Phase 7).
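Phase 4 passes --vpc-connector=afroai-vpc, so that connector has to exist before the first deploy. The following is a minimal sketch, not a validated procedure: it assumes the default VPC network and a free 10.8.0.0/28 range (both are placeholders, pick values that fit your network), and the function names are illustrative.

```shell
# Build the Redis connection string in the form shown in the note above.
redis_conn() {
  printf '%s:6379,abortConnect=false\n' "$1"
}

# Create the Serverless VPC Access connector that Cloud Run uses to reach
# the Memorystore private IP.
# Assumptions: the "default" network and an unused 10.8.0.0/28 range.
create_vpc_connector() {
  gcloud services enable vpcaccess.googleapis.com
  gcloud compute networks vpc-access connectors create afroai-vpc \
    --region=africa-south1 --network=default --range=10.8.0.0/28
}

# Look up the private IP of afroai-redis and print its connection string.
memorystore_conn() {
  local ip
  ip=$(gcloud redis instances describe afroai-redis \
         --region=africa-south1 --format='value(host)') || return 1
  redis_conn "$ip"
}
```

Run create_vpc_connector once, then memorystore_conn to get the value for Redis__ConnectionString.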

Create a GCS bucket + HMAC key (S3 interop)

GCS exposes an S3-compatible XML API, so AETHER's S3/MinIO SDK works against it with HMAC credentials — no MinIO container needed. AETHER uses a single bucket (Minio:BucketName); create it (globally-unique name) and an HMAC key for a service account that can read/write it.

Create the bucket and an HMAC key
gcloud storage buckets create gs://afroai-artifacts --location=africa-south1
gcloud storage hmac create <SERVICE_ACCOUNT_EMAIL>   # prints the access key + secret
Note
Set Minio__Endpoint=storage.googleapis.com, Minio__Secure=true, Minio__BucketName=afroai-artifacts, and the HMAC access/secret as Minio__AccessKey/Minio__SecretKey. If you hit a signing error, set Minio__Region to the bucket location (africa-south1).
Warning
Like S3, GCS bucket names are global — if afroai-artifacts is taken, pick a unique name and use it for Minio__BucketName too. The service account behind the HMAC key needs object read/write on that bucket (e.g. roles/storage.objectAdmin).
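One way to grant that access is a bucket-level IAM binding, which keeps the permission scoped to the AETHER bucket instead of the whole project. This is a sketch; grant_bucket_access is an illustrative name and the bucket defaults to the name used above.

```shell
# Grant a service account object read/write on a single bucket.
# usage: grant_bucket_access <SERVICE_ACCOUNT_EMAIL> [BUCKET_NAME]
grant_bucket_access() {
  local sa_email="$1" bucket="${2:-afroai-artifacts}"
  gcloud storage buckets add-iam-policy-binding "gs://${bucket}" \
    --member="serviceAccount:${sa_email}" \
    --role=roles/storage.objectAdmin
}
```

Call it with the same service account email you passed to gcloud storage hmac create.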

Provision RabbitMQ (CloudAMQP recommended)

GCP has no managed RabbitMQ. Use CloudAMQP (free Little Lemur plan) or run RabbitMQ on GKE/Compute Engine. Supply the amqps://<user>:<pass>@<host>/<vhost> URL to ConnectionStrings__queue.

CloudAMQP (free, recommended)
1. Sign up at cloudamqp.com, create a 'Little Lemur (Free)' instance
2. Copy its AMQP URL (amqps://user:pass@host/vhost)
3. Store it as the afroai-queue secret in the next step
Warning
Do not substitute Cloud Pub/Sub. AETHER's worker uses the MassTransit RabbitMQ transport; switching would require rebuilding the binaries. AETHER configures the bus with cfg.Host(new Uri(url)), so the CloudAMQP URL with a /vhost works as-is.

Store secrets in Secret Manager

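AETHER reads these sensitive values; store each in Secret Manager and reference it via --set-secrets in Phase 4. The env-var column shows the .NET config key with each : replaced by __ (for example, Services:OrchestratorApiKey becomes Services__OrchestratorApiKey).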

Secret                  | Env var (Cloud Run)                  | What it is
afroai-db-connection    | ConnectionStrings__DefaultConnection | Postgres conn string incl. the web-user password
afroai-queue            | ConnectionStrings__queue             | CloudAMQP amqps:// URL
afroai-openai-key       | KernelMemory__AI__OpenAI__ApiKey     | Your OpenAI API key
afroai-orchestrator-key | Services__OrchestratorApiKey         | Internal API key (you generate it)
afroai-mcp-key          | Mcp__CredentialEncryptionKey         | AES-256 key, base64 of exactly 32 bytes
afroai-minio-access     | Minio__AccessKey                     | GCS HMAC access key
afroai-minio-secret     | Minio__SecretKey                     | GCS HMAC secret

Set KernelMemory__Services__Postgres__ConnectionString to the same value as afroai-db-connection.

Generate + store the two internal keys (32-byte base64)
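# tr strips the trailing newline so the stored value is exactly the key
openssl rand -base64 32 | tr -d '\n' | gcloud secrets create afroai-orchestrator-key --data-file=-
openssl rand -base64 32 | tr -d '\n' | gcloud secrets create afroai-mcp-key --data-file=-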
# then the rest:
printf '%s' 'sk-...' | gcloud secrets create afroai-openai-key --data-file=-
printf '%s' 'amqps://user:pass@host/vhost' | gcloud secrets create afroai-queue --data-file=-
Danger
Services:OrchestratorApiKey and Mcp:CredentialEncryptionKey must be byte-for-byte identical across agent-web, agent-service, and agent-api. The MCP key must decode to exactly 32 bytes.
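Because a malformed MCP key only surfaces at runtime, it can help to check the length before storing it. A small sketch, assuming openssl and a base64 tool that accepts --decode are on the PATH; new_key and is_32_byte_key are illustrative names.

```shell
# Generate base64 of 32 random bytes, without a trailing newline.
new_key() {
  openssl rand -base64 32 | tr -d '\n'
}

# Succeed only if the argument is valid base64 that decodes to exactly 32 bytes.
is_32_byte_key() {
  local n
  # Reject input that is not valid base64 at all.
  printf '%s' "$1" | base64 --decode >/dev/null 2>&1 || return 1
  # Count the decoded bytes (strip the padding some wc versions add).
  n=$(printf '%s' "$1" | base64 --decode | wc -c | tr -d '[:space:]')
  [ "$n" = "32" ]
}
```

For example, `key=$(new_key); is_32_byte_key "$key" && echo ok` confirms a freshly generated key before you pipe it into gcloud secrets create.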
Note
Endpoints, model names, region, and bucket name are not secrets — pass them as --set-env-vars in Phase 4.
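Two steps are easy to miss here: creating the remaining secrets from the table, and letting the Cloud Run runtime identity read them. The sketch below assumes the services run as the Compute Engine default service account, which is the Cloud Run default unless you pass --service-account; substitute your own account if you use one. The function names are illustrative.

```shell
# Create a secret from a literal value without adding a trailing newline.
# usage: put_secret <SECRET_NAME> <VALUE>
put_secret() {
  local name="$1" value="$2"
  printf '%s' "$value" | gcloud secrets create "$name" --data-file=-
}

# Allow the Cloud Run runtime identity to read every AETHER secret.
# Assumption: services run as the Compute Engine default service account.
grant_secret_access() {
  local project number member s
  project=$(gcloud config get-value project)
  number=$(gcloud projects describe "$project" --format='value(projectNumber)')
  member="serviceAccount:${number}-compute@developer.gserviceaccount.com"
  for s in afroai-db-connection afroai-queue afroai-openai-key \
           afroai-orchestrator-key afroai-mcp-key \
           afroai-minio-access afroai-minio-secret; do
    gcloud secrets add-iam-policy-binding "$s" \
      --member="$member" --role=roles/secretmanager.secretAccessor
  done
}
```

Typical use is `put_secret afroai-minio-access '<HMAC_ACCESS_KEY>'` and `put_secret afroai-minio-secret '<HMAC_SECRET>'`, followed by a single grant_secret_access once all seven secrets exist.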

3. Push images to Artifact Registry

Mirror the AETHER images into your private Artifact Registry repository.

Create an Artifact Registry repository

Create one Docker-format repository in africa-south1.

Create the repository
gcloud artifacts repositories create afroai --repository-format=docker --location=africa-south1

Re-tag the GHCR images and push

Authenticate Docker to Artifact Registry, then re-tag each image pulled in Phase 1 and push.

Configure Docker and push
TAG=1.0.4
gcloud auth configure-docker africa-south1-docker.pkg.dev
PROJECT=$(gcloud config get-value project)
for r in agent-web agent-service agent-api agent-worker; do
  docker tag  ghcr.io/rizaanlakay/afroai/$r:$TAG \
              africa-south1-docker.pkg.dev/$PROJECT/afroai/$r:$TAG
  docker push africa-south1-docker.pkg.dev/$PROJECT/afroai/$r:$TAG
done

4. Deploy the services

Run the AETHER components on Cloud Run.

Deploy each component to Cloud Run

Deploy one service per component. agent-web (and agent-api if you expose the public REST API) are public; agent-service and agent-worker use --ingress=internal. Cloud Run provides HTTPS automatically. Connect Cloud SQL via --add-cloudsql-instances and Memorystore via a Serverless VPC connector (--vpc-connector).

agent-web (public, long timeout, sticky)
PROJECT=$(gcloud config get-value project)
gcloud run deploy afroai-agent-web \
  --image=africa-south1-docker.pkg.dev/$PROJECT/afroai/agent-web:1.0.4 \
  --region=africa-south1 --allow-unauthenticated \
  --min-instances=0 --max-instances=20 --concurrency=100 \
  --timeout=3600 --session-affinity \
  --vpc-connector=afroai-vpc --add-cloudsql-instances=$PROJECT:africa-south1:afroai-pg
agent-service + agent-worker (internal)
PROJECT=$(gcloud config get-value project)
for r in agent-service agent-worker; do
  gcloud run deploy afroai-$r \
    --image=africa-south1-docker.pkg.dev/$PROJECT/afroai/$r:1.0.4 \
    --region=africa-south1 --ingress=internal --no-allow-unauthenticated \
    --min-instances=0 --max-instances=20 --timeout=3600 \
    --vpc-connector=afroai-vpc --add-cloudsql-instances=$PROJECT:africa-south1:afroai-pg
done
Danger
If you add a Cloud Run HTTP health-check probe, point it at / — not /health (AETHER only maps /health in Development). The default TCP startup probe is fine.
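The commands above cover agent-web, agent-service, and agent-worker. If you also expose the public REST API, agent-api can be deployed along the same lines. This is a sketch that mirrors the agent-web flags; the --max-instances value is an assumption, and deploy_agent_api is an illustrative name.

```shell
# Deploy agent-api as a public service.
# To keep it private instead, swap --allow-unauthenticated for
# --ingress=internal --no-allow-unauthenticated.
# usage: deploy_agent_api [TAG]
deploy_agent_api() {
  local project tag="${1:-1.0.4}"
  project=$(gcloud config get-value project)
  gcloud run deploy afroai-agent-api \
    --image="africa-south1-docker.pkg.dev/${project}/afroai/agent-api:${tag}" \
    --region=africa-south1 --allow-unauthenticated \
    --min-instances=0 --max-instances=20 \
    --add-cloudsql-instances="${project}:africa-south1:afroai-pg"
}
```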

Wire configuration and secrets

Pass plain config via --set-env-vars and secrets via --set-secrets. Services__AgentService is the internal agent-service *.run.app URL.

agent-web env + secrets
PROJECT=$(gcloud config get-value project)
# The leading ^@^ makes @ the list delimiter, because the Redis value itself contains a comma (see: gcloud topic escaping).
gcloud run services update afroai-agent-web --region=africa-south1 \
  --set-env-vars="^@^Redis__ConnectionString=<MEMORYSTORE_IP>:6379,abortConnect=false@Minio__Endpoint=storage.googleapis.com@Minio__Secure=true@Minio__BucketName=afroai-artifacts@Minio__Region=africa-south1@Services__AgentService=https://afroai-agent-service-<hash>.a.run.app@KernelMemory__AI__OpenAI__TextModel=gpt-4o-mini@KernelMemory__AI__OpenAI__EmbeddingModel=text-embedding-3-large" \
  --set-secrets="ConnectionStrings__DefaultConnection=afroai-db-connection:latest,KernelMemory__Services__Postgres__ConnectionString=afroai-db-connection:latest,ConnectionStrings__queue=afroai-queue:latest,KernelMemory__AI__OpenAI__ApiKey=afroai-openai-key:latest,Services__OrchestratorApiKey=afroai-orchestrator-key:latest,Mcp__CredentialEncryptionKey=afroai-mcp-key:latest,Minio__AccessKey=afroai-minio-access:latest,Minio__SecretKey=afroai-minio-secret:latest"
Warning
Per-component env: agent-service = db + queue + openai + orchestrator-key + mcp-key (no Redis/Minio). agent-api = db + orchestrator-key + Services__AgentService. agent-worker = db + queue + openai + minio, and sets DOTNET_ENVIRONMENT=Production (Worker host, not ASP.NET). Its image (built from Dockerfile.worker) bundles Python for the code/image sandbox.
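As an illustration of the per-component list, the agent-service wiring might look like the sketch below. It assumes agent-service reads the same config keys as agent-web, including the KernelMemory Postgres connection string; adjust the list if your release differs. join_commas and wire_agent_service are illustrative names.

```shell
# Join arguments with commas, the list format --set-secrets expects.
join_commas() {
  local IFS=,
  printf '%s\n' "$*"
}

# agent-service: db + queue + openai + orchestrator-key + mcp-key.
wire_agent_service() {
  local secrets
  secrets=$(join_commas \
    "ConnectionStrings__DefaultConnection=afroai-db-connection:latest" \
    "KernelMemory__Services__Postgres__ConnectionString=afroai-db-connection:latest" \
    "ConnectionStrings__queue=afroai-queue:latest" \
    "KernelMemory__AI__OpenAI__ApiKey=afroai-openai-key:latest" \
    "Services__OrchestratorApiKey=afroai-orchestrator-key:latest" \
    "Mcp__CredentialEncryptionKey=afroai-mcp-key:latest")
  gcloud run services update afroai-agent-service --region=africa-south1 \
    --set-secrets="$secrets"
}
```

The same pattern applies to agent-api and agent-worker with their own shorter or longer lists.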
Note
The db-connection secret uses the Cloud SQL unix socket: Host=/cloudsql/<PROJECT>:africa-south1:afroai-pg;Database=AfroAI;Username=web-user;Password=<pwd>;SslMode=Disable.
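To avoid typos in that string, it can be assembled from the project ID and password and piped straight into Secret Manager. A sketch with illustrative function names; it assumes a password without semicolons or quotes, which would need escaping in a connection string.

```shell
# Assemble the connection string for the Cloud SQL unix socket.
# usage: db_conn <PROJECT_ID> <WEB_USER_PASSWORD>
db_conn() {
  local project="$1" password="$2"
  printf 'Host=/cloudsql/%s:africa-south1:afroai-pg;Database=AfroAI;Username=web-user;Password=%s;SslMode=Disable' \
    "$project" "$password"
}

# Store it as the afroai-db-connection secret for the active project.
# usage: store_db_conn <WEB_USER_PASSWORD>
store_db_conn() {
  db_conn "$(gcloud config get-value project)" "$1" \
    | gcloud secrets create afroai-db-connection --data-file=-
}
```

Use the same password you give the web-user role in Phase 5.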

5. Initialize the AfroAI database

Create the schema, seed reference data, and the app login role.

Apply the schema + seed data

Run the Cloud SQL Auth Proxy locally, then apply the bundled schema and seed (they ship with the installer and reflect a known-good database — the EF migration chain has drifted from the model, so the schema dump is the reliable path).

Start the proxy, apply schema + seed (download both from /Database)
PROJECT=$(gcloud config get-value project)
cloud-sql-proxy "$PROJECT:africa-south1:afroai-pg" &
sleep 5   # give the proxy a moment to start listening on 127.0.0.1:5432
export PGPASSWORD='<AFROAI_ADMIN_PASSWORD>'
psql "host=127.0.0.1 port=5432 dbname=AfroAI user=afroai_admin" -v ON_ERROR_STOP=1 -f afroai-schema.sql
psql "host=127.0.0.1 port=5432 dbname=AfroAI user=afroai_admin" -v ON_ERROR_STOP=1 -f afroai-seed.sql
Create the app login role, then GRANT (run as afroai_admin on AfroAI)
CREATE ROLE "web-user" LOGIN PASSWORD '<WEB_USER_PWD>';
GRANT USAGE, CREATE ON SCHEMA public TO "web-user";
GRANT ALL ON ALL TABLES IN SCHEMA public TO "web-user";
GRANT ALL ON ALL SEQUENCES IN SCHEMA public TO "web-user";
ALTER DEFAULT PRIVILEGES FOR ROLE afroai_admin IN SCHEMA public GRANT ALL ON TABLES TO "web-user";
ALTER DEFAULT PRIVILEGES FOR ROLE afroai_admin IN SCHEMA public GRANT ALL ON SEQUENCES TO "web-user";
Danger
The app connects as web-user, but tables are owned by afroai_admin — without these GRANTs every page fails with "permission denied for table …". The seed includes the agent-creation reference data and an initial admin user — change that user's password before going live.
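A quick check after the GRANTs can catch both failure modes described in this guide: a missing pgvector extension and tables that web-user cannot read. The sketch below assumes the Cloud SQL Auth Proxy is still running and PGPASSWORD is still exported; verify_db is an illustrative name.

```shell
# Report the pgvector version and count tables web-user cannot SELECT from.
verify_db() {
  psql "host=127.0.0.1 port=5432 dbname=AfroAI user=afroai_admin" \
    -v ON_ERROR_STOP=1 -At <<'SQL'
-- pgvector version (no row means the extension is missing)
SELECT 'pgvector ' || extversion FROM pg_extension WHERE extname = 'vector';
-- tables in public that web-user still cannot read (expect 0)
SELECT 'unreadable_tables ' || count(*)
FROM information_schema.tables
WHERE table_schema = 'public'
  AND table_type = 'BASE TABLE'
  AND NOT has_table_privilege('web-user'::name,
        format('%I.%I', table_schema, table_name), 'SELECT');
SQL
}
```

A healthy database prints one pgvector line and an unreadable_tables count of 0.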

6. Scale & harden

Autoscaling, connection pooling, shared keys, backups, and observability.

Tune Cloud Run autoscaling

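Set min/max instances and concurrency per service. Configure shared Data Protection keys (persisted to Redis) before agent-web can run more than one instance at a time. That includes autoscaling above one instance under load, not only --min-instances above 1, because each Cloud Run instance otherwise generates its own key ring, and cookies or antiforgery tokens issued by one instance fail on another.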

Production autoscale for agent-web / agent-service
gcloud run services update afroai-agent-web --region=africa-south1 --min-instances=3 --max-instances=20 --concurrency=100
gcloud run services update afroai-agent-service --region=africa-south1 --min-instances=3 --max-instances=20 --concurrency=80
Warning
Cloud Run instances do not share ASP.NET Core Data Protection keys. Persist the key ring to Redis before scaling agent-web past one warm instance, or logins/antiforgery fail intermittently.

Connection pooling + backups

Many Cloud Run instances multiply EF Core connections. Raise max_connections and/or front Cloud SQL with PgBouncer. Enable automated backups + PITR.

Raise max_connections (keep the pgvector flag!) and enable PITR
gcloud sql instances patch afroai-pg --database-flags=cloudsql.enable_pgvector=on,max_connections=500
gcloud sql instances patch afroai-pg --backup-start-time=02:00 --retained-backups-count=14 --enable-point-in-time-recovery
Tip
Always re-include cloudsql.enable_pgvector=on when patching flags — --database-flags replaces the whole set, so omitting it would disable pgvector. The KernelMemory RAG tables (km- prefix) grow with ingested knowledge.
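The arithmetic behind the pooling advice is worth making explicit. Npgsql's default Maximum Pool Size is 100 per connection string per process, so three services capped at 20 instances each could in the worst case open 6,000 connections against a max_connections of 500. A small sketch for sizing the pool instead; the reserve of 20 connections for admin sessions and the proxy is an assumption, and pool_size is an illustrative name.

```shell
# Largest per-instance pool size that keeps the whole fleet under max_connections.
# usage: pool_size <MAX_CONNECTIONS> <RESERVED> <TOTAL_INSTANCES>
pool_size() {
  echo $(( ($1 - $2) / $3 ))
}

pool_size 500 20 60   # prints 8
```

With that result you would append ;Maximum Pool Size=8 to the connection string stored in afroai-db-connection. Add agent-api's instance cap to the total if you deploy it.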

Observability

AETHER emits OpenTelemetry (via Aspire ServiceDefaults). Forward it to Cloud Trace / Cloud Monitoring via the OTLP exporter or the OpenTelemetry Collector.
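If AETHER keeps the stock Aspire ServiceDefaults behavior, the OTLP exporter is switched on by the standard OTEL_EXPORTER_OTLP_ENDPOINT environment variable. That is an assumption about the shipped images, not something this guide confirms, and the collector URL you pass in is yours to supply. set_otlp_endpoint is an illustrative name.

```shell
# Point every AETHER service at an OTLP endpoint (for example an
# OpenTelemetry Collector you run yourself).
# --update-env-vars changes one variable and leaves the others in place.
# usage: set_otlp_endpoint <OTLP_ENDPOINT_URL>
set_otlp_endpoint() {
  local endpoint="$1" s
  for s in agent-web agent-service agent-api agent-worker; do
    gcloud run services update "afroai-$s" --region=africa-south1 \
      --update-env-vars="OTEL_EXPORTER_OTLP_ENDPOINT=${endpoint}"
  done
}
```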

7. Operations & cost control

Watch the logs, and stop/start the stack so you are not billed while idle.

Monitor logs and service status

Read each service's logs and check its status. Cloud Run logs also stream to Cloud Logging.

Tail logs for one service (repeat per component)
gcloud beta run services logs tail afroai-agent-worker --region=africa-south1
Find errors across all services
gcloud logging read 'resource.type=cloud_run_revision AND severity>=ERROR' --limit=50 --freshness=15m
Service status
for s in agent-web agent-service agent-api agent-worker; do
  gcloud run services describe afroai-$s --region=africa-south1 --format='value(status.url, status.conditions[0].status)'
done
Tip
A resource that 404s only in production is almost always a Linux case-sensitivity issue (the container filesystem is case-sensitive; Windows dev is not). The error page hides the real exception — Cloud Logging has the stack trace.
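When the cross-service query is too noisy, narrowing it to one service usually gets to the stack trace faster. A sketch using the standard Cloud Run log label service_name; the function names are illustrative.

```shell
# Build a Cloud Logging filter for errors from one AETHER component.
# usage: error_filter <COMPONENT>   e.g. error_filter agent-web
error_filter() {
  printf 'resource.type="cloud_run_revision" AND resource.labels.service_name="afroai-%s" AND severity>=ERROR' "$1"
}

# Read recent errors for that component (default window: 15 minutes).
# usage: service_errors <COMPONENT> [FRESHNESS]
service_errors() {
  gcloud logging read "$(error_filter "$1")" \
    --limit=20 --freshness="${2:-15m}"
}
```

For example, `service_errors agent-web 1h` shows the last hour of errors from agent-web only.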

Shut everything down (stop idle billing)

Cloud Run already scales to zero, but pin every service to --min-instances=0 so no warm instance bills, then stop Cloud SQL.

Min-instances 0 on all services + stop Cloud SQL
for s in agent-web agent-service agent-api agent-worker; do
  gcloud run services update afroai-$s --region=africa-south1 --min-instances=0
done
gcloud sql instances patch afroai-pg --activation-policy=NEVER
Warning
Memorystore cannot be stopped — delete it to zero its cost (and recreate on startup). With --min-instances=0, Cloud Run only bills while actually serving requests.

Start everything back up

Start Cloud SQL first, then restore each service's min-instances (or just leave them at 0 — Cloud Run starts on the first request).

Start Cloud SQL, restore min-instances (1 shown; production uses 3 for agent-web and agent-service)
gcloud sql instances patch afroai-pg --activation-policy=ALWAYS
for s in agent-service agent-web agent-worker agent-api; do
  gcloud run services update afroai-$s --region=africa-south1 --min-instances=1
done
Note
If you deleted Memorystore on shutdown, recreate it (Phase 2) and refresh the Redis env before serving traffic. Cloud SQL takes a minute or two to come back to RUNNABLE.
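The two waits in that note can be scripted. The sketch below polls Cloud SQL until it reports RUNNABLE and then points agent-web at the recreated Redis instance; a new Memorystore instance generally gets a new private IP, so the old env value goes stale. The function names, the 30-attempt limit, and the POLL_SECONDS variable are illustrative.

```shell
# Poll Cloud SQL until it is RUNNABLE; give up after 30 attempts.
wait_for_sql() {
  local tries=0 state
  while [ "$tries" -lt 30 ]; do
    state=$(gcloud sql instances describe afroai-pg --format='value(state)') || state=""
    [ "$state" = "RUNNABLE" ] && return 0
    tries=$((tries + 1))
    sleep "${POLL_SECONDS:-10}"
  done
  return 1
}

# Point agent-web at the current private IP of afroai-redis.
# ^@^ sets @ as the list delimiter so the comma inside the value survives.
refresh_redis_env() {
  local ip
  ip=$(gcloud redis instances describe afroai-redis \
         --region=africa-south1 --format='value(host)') || return 1
  gcloud run services update afroai-agent-web --region=africa-south1 \
    --update-env-vars="^@^Redis__ConnectionString=${ip}:6379,abortConnect=false"
}
```

Run `wait_for_sql && refresh_redis_env` after the start-up commands and before sending traffic.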