Debug
If things go wrong, here is how to debug a stuck or failed job.
First, go to GitHub and open the job that got stuck or failed.
From the URL of the job, you need to extract the run ID and job ID.
In the following example URL, the run ID is 16275548818 and the job ID is 45953603345.
https://github.com/example/repo/actions/runs/16275548818/job/45953603345
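If you debug jobs regularly, extracting the two IDs can be scripted. The following is a minimal sketch in Python, assuming the URL always follows the .../actions/runs/<RUN_ID>/job/<JOB_ID> layout shown above. The names JobRef and parse_job_url are hypothetical helpers for illustration and not part of any tool.

```python
import re
from typing import NamedTuple


class JobRef(NamedTuple):
    """Run ID and job ID taken from a GitHub Actions job URL."""
    run_id: str
    job_id: str


# Matches the path layout shown above: .../actions/runs/<RUN_ID>/job/<JOB_ID>
_JOB_URL = re.compile(r"/actions/runs/(?P<run_id>\d+)/job/(?P<job_id>\d+)")


def parse_job_url(url: str) -> JobRef:
    """Extract the run ID and job ID, or raise ValueError if the URL does not match."""
    match = _JOB_URL.search(url)
    if match is None:
        raise ValueError(f"not a GitHub Actions job URL: {url!r}")
    return JobRef(match["run_id"], match["job_id"])


if __name__ == "__main__":
    ref = parse_job_url(
        "https://github.com/example/repo/actions/runs/16275548818/job/45953603345"
    )
    print("run ID:", ref.run_id)
    print("job ID:", ref.job_id)
```

The IDs are kept as strings because they are only used as identifiers, never for arithmetic.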
Second, check the runner orchestrator.
- Open the AWS Management Console and head to the Step Functions service.
- Look for a state machine named RunnerOrchestrator-* and open its details.
- In the list of executions, search for the execution named <JOB_ID>-<RUN_ID> and open it.
- In the outputs of the LaunchSpot or LaunchOnDemand state, you will find the ID of the launched EC2 instance. Note down the instance ID.
- If both the LaunchSpot and LaunchOnDemand states failed, follow the “Log group” link to open the log group of the Lambda function and search for error messages.
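The execution lookup described above can also be done programmatically. The following Python sketch assumes that the execution name is exactly <JOB_ID>-<RUN_ID> with no extra prefix or suffix, and that sfn is a boto3 Step Functions client, for example boto3.client("stepfunctions"). The function names and the state machine ARN parameter are hypothetical; look up the ARN of your RunnerOrchestrator-* state machine in the console.

```python
from typing import Any, Optional


def execution_name(job_id: str, run_id: str) -> str:
    """Build the execution name in the <JOB_ID>-<RUN_ID> form (job ID first)."""
    return f"{job_id}-{run_id}"


def find_execution_arn(
    sfn: Any, state_machine_arn: str, job_id: str, run_id: str
) -> Optional[str]:
    """Return the ARN of the matching execution, or None if none is found.

    Assumption: `sfn` behaves like a boto3 Step Functions client.
    """
    wanted = execution_name(job_id, run_id)
    # list_executions is paginated, so walk through all pages.
    paginator = sfn.get_paginator("list_executions")
    for page in paginator.paginate(stateMachineArn=state_machine_arn):
        for execution in page["executions"]:
            if execution["name"] == wanted:
                return execution["executionArn"]
    return None
```

With the ARN in hand, the console steps above still apply for inspecting the LaunchSpot and LaunchOnDemand states. The sketch deliberately does not parse the state output, because its exact structure is not documented here.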
Third, go to the CloudWatch service in the AWS Management Console and jump to Logs Insights.
- Select the log group named hyperenv-RunnerLogGroup-* and a suitable timespan.
- Use the following query to search the logs. Make sure to replace <INSTANCE_ID> with the instance ID from the state machine execution.
fields @timestamp, @message, @logStream, @log
| sort @timestamp desc
| filter @logStream like '<INSTANCE_ID>'
- Check the logs for errors yourself or export them and send them to us.
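To avoid typos when replacing the placeholder, the query above can be generated from the instance ID. This is a minimal Python sketch; build_runner_log_query is a hypothetical helper, and the validation relies on EC2 instance IDs being "i-" followed by 8 or 17 hexadecimal characters.

```python
import re

# EC2 instance IDs: "i-" followed by 8 (older) or 17 (current) hex characters.
_INSTANCE_ID = re.compile(r"i-(?:[0-9a-f]{8}|[0-9a-f]{17})")

# Same query as above, with the placeholder turned into a format field.
QUERY_TEMPLATE = (
    "fields @timestamp, @message, @logStream, @log\n"
    "| sort @timestamp desc\n"
    "| filter @logStream like '{instance_id}'"
)


def build_runner_log_query(instance_id: str) -> str:
    """Return the Logs Insights query for the given EC2 instance ID."""
    if _INSTANCE_ID.fullmatch(instance_id) is None:
        raise ValueError(f"not an EC2 instance ID: {instance_id!r}")
    return QUERY_TEMPLATE.format(instance_id=instance_id)


if __name__ == "__main__":
    # Example instance ID for illustration only.
    print(build_runner_log_query("i-0123456789abcdef0"))
```

Paste the printed query into Logs Insights. It could also be passed as queryString to the start_query operation of a boto3 CloudWatch Logs client, together with the name of your hyperenv-RunnerLogGroup-* log group and a start and end time.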