mirror of https://github.com/VSadov/Satori.git synced 2025-06-09 09:34:49 +09:00
Satori/docs/infra/buildtriage.md

Build Triage Rotation

The responsibility of this role is triaging our rolling / official builds, filing issues to track broken tests, and submitting changes to dotnet/runtime to work around issues (for example, disabling a test or reverting a PR that broke the build).

In some cases this will require working with the core-eng team when the issues are in Helix / Azure / Arcade. The person on rotation will also attend CI Council with the infra manager to provide updates on our reliability status.

Build breaks directly impact developer productivity, and the need to fix them promptly can span time zones. Hence it is the collective responsibility of the Scout pool to investigate such breaks.

This role works on a rotation basis. There are six people in the role and each rotation lasts one calendar month.

Prerequisites

Please make sure that you are part of the following groups before you start the rotation:

Unfortunately, the members of the Teams channel need to be listed individually. Ping @ViktorHofer if you need access.

Tracking Build Failures

All CI failures can be tracked through the CI Council dashboards, i.e. Public and Internal: one for public builds (rolling and PR builds) and one for internal builds.

In addition to the dashboards, official build failure notifications are sent to the internal runtime infrastructure email alias.

For each of these mail notifications, a matching issue should exist (in the dotnet/runtime repository, in dotnet/core-eng, or in dotnet/arcade). The person triaging build failures should reply to the email with a link to the issue to let everyone know it has been triaged. This ensures that we follow up on infrastructure issues immediately. If the cause of a build failure isn't trivial to identify, consider looping in dnceng.

Tests are not run during the internal builds. Publishing and signing steps are run only during internal builds. Rolling builds run tests for the full matrix.

When filing a new issue, try to include a runfo search, which makes it easy to isolate repeated instances of that failure.

Contact @chcosta if you have trouble accessing the dashboards. Contact @Chrisboh if you don't have the calendar invite for the CI Council meeting. Contact @jaredpar if you have trouble with the runfo site or utility.

Ongoing Issues

All issues causing builds to fail should be marked with the blocking-clean-ci label. Any issue causing build breaks in the official build should be marked with the blocking-clean-official label. Consistent labeling makes these issues easy to find and track.
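
As a rough illustration of how these labels can be used for tracking, the sketch below builds GitHub REST API URLs that list open issues carrying each label. The label names come from the text above; the helper name `open_issues_url` and the default repository argument are assumptions made for this example, not part of any existing tooling.

```python
from urllib.parse import urlencode

# Label names are taken from this document; everything else is illustrative.
CI_LABEL = "blocking-clean-ci"
OFFICIAL_LABEL = "blocking-clean-official"


def open_issues_url(label, repo="dotnet/runtime"):
    """Build a GitHub REST API URL that lists open issues carrying `label`.

    `open_issues_url` is a hypothetical helper written for this sketch.
    """
    query = urlencode({"labels": label, "state": "open", "per_page": 100})
    return f"https://api.github.com/repos/{repo}/issues?{query}"


if __name__ == "__main__":
    # Print one query URL per tracking label.
    for label in (CI_LABEL, OFFICIAL_LABEL):
        print(open_issues_url(label))
```

The resulting URLs can be fetched with any HTTP client. Note that this GitHub endpoint also returns pull requests that carry the label, so a consumer may want to filter those out.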

The main meta-bug linking to currently tracked issues is here.

Some helpful resources

Build Rotation for upcoming months

Month Alias
September 2020 @directhex
October 2020 @jkoritzinsky
November 2020 @aik-jahoda
December 2020 @akoeplinger
January 2021 @hoyosjs
February 2021 @anipik
March 2021 @directhex
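
For anyone who wants to check the schedule programmatically, here is a minimal sketch that encodes the table above as a lookup keyed by year and month. The data is copied from the table; the names `ROTATION` and `on_rotation` are hypothetical and exist only for this example.

```python
from datetime import date

# Rotation schedule copied from the table above, keyed by (year, month).
ROTATION = {
    (2020, 9): "@directhex",
    (2020, 10): "@jkoritzinsky",
    (2020, 11): "@aik-jahoda",
    (2020, 12): "@akoeplinger",
    (2021, 1): "@hoyosjs",
    (2021, 2): "@anipik",
    (2021, 3): "@directhex",
}


def on_rotation(day):
    """Return the alias on build triage for the month containing `day`.

    Returns None when the month is not covered by the schedule.
    """
    return ROTATION.get((day.year, day.month))


if __name__ == "__main__":
    print(on_rotation(date(2020, 12, 15)))
    # The schedule cycles through six distinct people, as stated earlier.
    print(len(set(ROTATION.values())))
```

Because each rotation lasts one calendar month, a (year, month) key is enough; the day of the month is ignored.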