Image repositories

ECR

New accounts use a single shared account (636728427214) to host container images in ECR. Repositories are typically named <service>/<app> (for example, active10/frontend).

Notes:

  • Image signing metadata is created and stored alongside images by the application pipeline, but ECS does not currently validate signatures.
  • Images are scanned by both Snyk (as part of the application pipeline) and Amazon Inspector (when pushed to ECR).

Cross-account access is handled via ECR repository policies:

  • Workload accounts other than dev: read-only pull
  • Dev workload account: write/push
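
As an illustration of the access split above, here is a minimal Python sketch that builds an ECR repository policy document. It is not the policy actually deployed: the account IDs are placeholders, the function names are hypothetical, and it assumes the dev account needs pull as well as push. The action names are the standard ECR pull and push actions.

```python
import json

# Standard ECR actions needed to pull an image.
PULL_ACTIONS = [
    "ecr:BatchCheckLayerAvailability",
    "ecr:BatchGetImage",
    "ecr:GetDownloadUrlForLayer",
]

# Additional ECR actions needed to push an image.
PUSH_ACTIONS = [
    "ecr:CompleteLayerUpload",
    "ecr:InitiateLayerUpload",
    "ecr:PutImage",
    "ecr:UploadLayerPart",
]


def account_root(account_id):
    """Return the IAM root principal ARN for an AWS account."""
    return f"arn:aws:iam::{account_id}:root"


def build_repository_policy(pull_accounts, push_accounts):
    """Build a repository policy: pull-only for some accounts, push and pull for others."""
    statements = []
    if pull_accounts:
        statements.append({
            "Sid": "AllowPull",
            "Effect": "Allow",
            "Principal": {"AWS": [account_root(a) for a in pull_accounts]},
            "Action": PULL_ACTIONS,
        })
    if push_accounts:
        statements.append({
            "Sid": "AllowPushPull",
            "Effect": "Allow",
            "Principal": {"AWS": [account_root(a) for a in push_accounts]},
            # Assumption: accounts that push also need to pull.
            "Action": PULL_ACTIONS + PUSH_ACTIONS,
        })
    return {"Version": "2012-10-17", "Statement": statements}


if __name__ == "__main__":
    # Placeholder account IDs, not real workload accounts.
    policy = build_repository_policy(
        pull_accounts=["111111111111"],
        push_accounts=["222222222222"],
    )
    print(json.dumps(policy, indent=2))
```

The resulting JSON has the shape a repository policy expects. Keeping the pull and push action lists separate makes it easy to see that only the dev account can write images.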

Terraform state

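State is currently stored per account: Terraform creates an S3 bucket for the backend in each account. A DynamoDB table for state locking can optionally be created as well, but this is no longer necessary with newer versions of Terraform, where the S3 backend can lock state natively (the use_lockfile setting, introduced in Terraform 1.10).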

One potential improvement would be to centralise backend bucket creation via StackSets/CloudFormation.

Control Tower logging

CloudTrail and Config are enabled through Control Tower, so logs should already land in the centralised log archive account.

Secrets

Most secrets are created as placeholders in Terraform, with values set manually afterwards. This keeps the secrets visible in the infrastructure code, but because the values live outside Terraform, our infrastructure is not fully reproducible.

Tagging

We use Terraform default_tags so resources get consistent tags automatically. In addition to the standard UKHSA tags, we add tags pointing back to the relevant Terraform code repo, which makes it easier to find where a resource was deployed from and why.

KMS

Default AWS-managed keys are used for most services. This keeps costs and operational overhead low, but it may not be sufficient for workloads with tighter compliance requirements.

Alerting and monitoring

A standard set of alarms is deployed to services. These alarms are defined in the devops-terraform-standard-alarms repo. The deployed alarms rely on a Lambda, which a StackSet adds automatically to each account and which forwards alarms to a Microsoft Teams channel. Refer to the devops-phe-alarms-lambdas repo for more detail on how this works.
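
To give a feel for what a forwarding Lambda like this does, here is a minimal Python sketch of the message-formatting step. It is not the code in devops-phe-alarms-lambdas. It assumes the Lambda is triggered by an SNS notification from a CloudWatch alarm and that the Teams webhook accepts a simple text payload. The function name is hypothetical, and the HTTP call to Teams is deliberately left out.

```python
import json


def alarm_to_teams_payload(sns_event):
    """Turn an SNS-delivered CloudWatch alarm notification into a Teams message body.

    Assumption: the event is the standard SNS-to-Lambda shape, where the alarm
    details arrive as a JSON string in Records[0].Sns.Message.
    """
    message = json.loads(sns_event["Records"][0]["Sns"]["Message"])
    text = (
        f"**{message['AlarmName']}** is now {message['NewStateValue']} "
        f"in account {message['AWSAccountId']} ({message['Region']}). "
        f"Reason: {message['NewStateReason']}"
    )
    # Assumption: a plain {"text": ...} body; the real webhook may expect a richer card.
    return {"text": text}


if __name__ == "__main__":
    # Hand-written sample event with placeholder values.
    sample_event = {
        "Records": [{
            "Sns": {
                "Message": json.dumps({
                    "AlarmName": "example-service-high-cpu",
                    "NewStateValue": "ALARM",
                    "NewStateReason": "Threshold crossed",
                    "AWSAccountId": "111111111111",
                    "Region": "EU (London)",
                })
            }
        }]
    }
    print(json.dumps(alarm_to_teams_payload(sample_event), indent=2))
```

Keeping the formatting separate from the network call makes this step easy to test without a real webhook.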

ECS standards

For ECS services, start with the smallest CPU/memory configuration that will run, then resize based on real traffic and telemetry. This keeps costs low and avoids over-provisioning.

Logging and log retention

Logging is inconsistent across services. Most apps emit default framework logs; structured logging is rare, which makes the logs harder to search and filter in CloudWatch.
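
For services written in Python, structured logging can be as small as a custom formatter. The sketch below is one possible approach, not an agreed standard: the class name, function name, and field names are illustrative. It writes one JSON object per line to stdout, and CloudWatch Logs Insights can discover fields in JSON log events automatically.

```python
import json
import logging
import sys


class JsonFormatter(logging.Formatter):
    """Format each log record as a single-line JSON object."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
        }
        # Include the traceback when the record carries exception info.
        if record.exc_info:
            entry["exception"] = self.formatException(record.exc_info)
        return json.dumps(entry)


def configure_logging(level=logging.INFO):
    """Replace the root logger's handlers with a single JSON handler on stdout."""
    handler = logging.StreamHandler(sys.stdout)
    handler.setFormatter(JsonFormatter())
    root = logging.getLogger()
    root.handlers = [handler]
    root.setLevel(level)


if __name__ == "__main__":
    configure_logging()
    logging.getLogger("example").info("service started on port %d", 8080)
```

With logs in this shape, fields such as level and logger can be used directly in Logs Insights filters instead of matching on free text.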