GitHub Runner S3 Cache: Turbocharge Your Self-Hosted Workflows on AWS
Published on May 20, 2025 | Written by Andreas
Caching is a great way to speed up your GitHub Actions workflows. However, with self-hosted runners on AWS, accessing the default cache provided by GitHub can be slow and expensive due to limited network bandwidth and traffic costs. That’s why we have implemented a cache layer for GitHub Actions on top of Amazon S3. Available for HyperEnv 2.14.0 and later.
actions/cache speeds up workflows by caching dependencies
The `actions/cache` action minimizes the time spent downloading dependencies by caching them between runs. The following code snippet shows how to cache a file named `demo.txt`. At the beginning of the job `cache-demo`, the runner tries to download the cache file with the key `demo`. If the cache key `demo` does not yet exist, the runner uploads the file `demo.txt` under the cache key `demo` when the job finishes.
```yaml
name: Demo
on: push
jobs:
  cache-demo:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Cache demo.txt
        id: cache-demo
        uses: actions/cache@v4
        with:
          path: demo.txt
          key: demo
```
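Building on this, the step's `id` lets later steps react to a cache hit. As a sketch (this follow-up step is not part of the original example), you could skip regenerating `demo.txt` whenever it was restored from the cache:

```yaml
      # Only create demo.txt if it was not restored from the cache.
      # steps.cache-demo.outputs.cache-hit is 'true' on an exact key match.
      - name: Generate demo.txt
        if: steps.cache-demo.outputs.cache-hit != 'true'
        run: echo "expensive build output" > demo.txt
```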
More importantly, actions that prepare environments, such as `setup-java` or `setup-node`, rely on caching to speed up dependency downloads.
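For example, `setup-node` enables dependency caching with a single input (the package manager and the use of `npm ci` here are illustrative assumptions):

```yaml
      - uses: actions/setup-node@v4
        with:
          node-version: 20
          cache: npm  # caches the npm download cache, keyed on package-lock.json
      - run: npm ci
```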
The problem with GitHub’s default cache for self-hosted runners on AWS
By default, GitHub uses Azure Blob Storage under the hood to persist cached files, which makes sense as the GitHub-hosted runners are running on Azure as well. However, when using self-hosted runners on AWS, uploading and downloading cache files to Azure Blob Storage comes with increased latency, limited bandwidth, and additional AWS costs for outgoing traffic. That’s why we developed a cache layer for GitHub Actions that stores cache files on S3 for HyperEnv.
GitHub Actions Cache on S3 by HyperEnv
The following diagram illustrates the architecture of the HyperEnv cache layer for GitHub Actions.
- EC2 instances running HyperEnv and the self-hosted runners.
- An S3 bucket storing the cache files.
- A Lambda function validating the JWT signed by GitHub and assuming an IAM role for restricted access to the S3 bucket.
An S3 lifecycle policy ensures that cached files and incomplete multipart uploads are deleted after a configurable number of days.
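For illustration, such a lifecycle configuration could look roughly like the following (a sketch only; the rule ID and the 7-day retention are assumptions, and HyperEnv sets this up for you based on the configured retention period):

```json
{
  "Rules": [
    {
      "ID": "expire-cache-files",
      "Status": "Enabled",
      "Filter": {},
      "Expiration": { "Days": 7 },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 7 }
    }
  ]
}
```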
GitHub caching comes with restrictions on which jobs can access which cache files.¹ To ensure that only authorized jobs can access cache files, the Lambda function validates the JWT signed by GitHub and assumes an IAM role for restricted access to the S3 bucket. For example, a job running for a branch can only access cache files uploaded by jobs for the same or the default branch.
Of course, HyperEnv also supports advanced cache key matching by defining so-called `restore-keys`.²
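A typical `restore-keys` setup falls back from an exact lockfile hash to the most recent cache with a matching key prefix (the path and key scheme here are illustrative):

```yaml
      - uses: actions/cache@v4
        with:
          path: ~/.npm
          # Exact key: changes whenever the lockfile changes.
          key: npm-${{ runner.os }}-${{ hashFiles('package-lock.json') }}
          # Fallback: restore the newest cache with this prefix on a miss.
          restore-keys: |
            npm-${{ runner.os }}-
```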
Additional benefits when using S3 as GitHub Actions cache
Besides the benefits of improved speed and reduced network traffic costs, using S3 as the caching layer for GitHub Actions comes with more benefits: unlimited storage and extended retention.
The GitHub cache provides a maximum of 10 GB of storage. Also, GitHub deletes cached files after 7 days of inactivity. With our S3-based caching solution, you can store as much data as you want and keep it for as long as you need it (configurable retention period of 1 to 365 days).
Summary
Self-hosted GitHub runners on AWS benefit from using Amazon S3 as the storage layer for caching. We are pleased to announce S3 caching for HyperEnv with our latest version 2.14.0, which will improve the speed of your workflows and provide unlimited storage for `actions/cache`.