GitHub Availability Report: September 2024

In September, we experienced three incidents that resulted in reduced performance of GitHub services.

September 16 21:11 UTC (lasting 57 minutes)

On September 16, 2024, between 21:11 UTC and 22:08 UTC, the GitHub Actions and GitHub Pages services were degraded. Customers deploying Pages from a source branch experienced delayed runs. We determined that the root cause was a misconfiguration in the service that manages runner connections, which led to CPU throttling and degraded performance in that service. Actions jobs experienced average delays of 23 minutes, with some jobs delayed by as long as 45 minutes. Over the course of the incident, 17% of runs were delayed by more than five minutes. At peak, as many as 80% of runs were delayed by more than five minutes.

We mitigated the incident by redirecting runner connections away from the misconfigured nodes, starting at 21:16 UTC. In addition to addressing the configuration issue this incident uncovered, we have improved our overall monitoring to reduce the risk of a similar recurrence and to shorten our time to automated detection and mitigation of this type of issue in the future.
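To make the delay figures in this incident concrete, here is a minimal sketch of how an average delay, a maximum delay, and the share of runs delayed by more than five minutes could be derived from per-run queue delays. The function name `delay_summary`, the five-minute default threshold parameter, and the sample values are illustrative assumptions, not GitHub's actual tooling or data.

```python
from statistics import mean


def delay_summary(delays_min, threshold_min=5.0):
    """Summarize per-run queue delays, given in minutes.

    Hypothetical helper for illustration only; it does not reflect
    how GitHub computes its incident metrics.
    """
    if not delays_min:
        raise ValueError("no delays to summarize")
    # Runs whose delay strictly exceeds the threshold count as "delayed".
    over = [d for d in delays_min if d > threshold_min]
    return {
        "mean_delay_min": mean(delays_min),
        "max_delay_min": max(delays_min),
        "share_over_threshold": len(over) / len(delays_min),
    }


# Made-up sample: six runs, two of which waited longer than five minutes.
sample = [0.5, 1.0, 2.0, 7.0, 45.0, 0.2]
summary = delay_summary(sample)
print(summary)
```

Computing the same summary over a short sliding window instead of the whole incident is one way a peak figure, such as the 80% quoted above, could differ from the incident-wide figure.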

September 24 08:20 UTC (lasting 44 minutes)

On September 24, 2024, between 08:20 UTC and 09:04 UTC, the GitHub Codespaces service experienced a network connectivity interruption that led to an error rate of approximately 25% for the duration of the outage. We traced the interruption to Source Network Address Translation (SNAT) port exhaustion following a deployment, which caused individual codespaces to lose their connection to the service. To mitigate the impact, we increased port allocations to provide enough buffer for the additional outbound connections that occur shortly after a deployment. We will scale up our outbound connectivity in the near future and add improved network capacity monitoring to prevent future degradation.
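As a rough illustration of the kind of network capacity monitoring described here, the sketch below compares the number of active outbound flows with the number of allocated SNAT ports and flags when headroom runs low. The function names, the 80% alert ratio, and the port counts are assumptions chosen for the example; they are not values from the incident.

```python
def snat_port_utilization(active_flows, allocated_ports):
    """Return the fraction of allocated SNAT ports currently in use.

    Simplified model: each concurrent outbound flow is assumed to
    consume one SNAT port.
    """
    if allocated_ports <= 0:
        raise ValueError("allocated_ports must be positive")
    if active_flows < 0:
        raise ValueError("active_flows must not be negative")
    return active_flows / allocated_ports


def needs_more_ports(active_flows, allocated_ports, alert_ratio=0.8):
    """Flag when utilization reaches the (assumed) alert ratio."""
    return snat_port_utilization(active_flows, allocated_ports) >= alert_ratio


# Hypothetical numbers: a burst of connections right after a deployment.
print(needs_more_ports(active_flows=900, allocated_ports=1024))  # high utilization
print(needs_more_ports(active_flows=100, allocated_ports=1024))  # ample headroom
```

Alerting before utilization reaches 100% leaves time to raise the allocation before new outbound connections start to fail.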

September 30 10:43 UTC (lasting 43 minutes)

On September 30, 2024, between 10:43 UTC and 11:26 UTC, GitHub Codespaces customers in the Central India region were unable to create new codespaces. Resumes were not impacted, and customers in other regions were not affected. We traced the cause to storage capacity limitations in the region and mitigated it by temporarily redirecting creation requests to other regions. We then added storage capacity to the region and routed traffic back to it. We also identified a bug that left some of the available capacity unused, artificially limiting capacity and prematurely halting creations in the region. We have since fixed this bug so that available capacity scales as expected according to our capacity planning projections.


Follow our status page for real-time updates on status changes and post-incident summaries. To learn more about what we’re working on, check out the GitHub Engineering Blog.

Written by

Jakub Oleksy