Testing at Microsoft

The other day I got into a discussion with a colleague (also ex-Microsoft) about software testing (or QA, if you like) at Microsoft. Here is a short description of my experiences. Microsoft is a big place, and I left a couple of years ago, so your experiences may differ from mine.

Bugs are tracked in a bug tracking system similar to JIRA. Key fields (mandatory for every bug / feature request / whatever) are listed below; a rough code sketch of such a record follows the list:

  • Priority
  • Severity
  • Repro Steps
  • Product Version (e.g. Office 10, Office 11, Office 12)
  • Milestone (e.g. M1, M2, M3, Beta, RTM)
  • Resolution (Fixed, Not Repro, Won’t Fix, By Design, …)
  • Status (Open, Resolved, Closed)
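
As a rough illustration only (the field names and values below are my own guesses in Python, not the actual schema of Microsoft's internal tracker), a bug record with these fields might look like this:

    # Illustrative sketch of a bug record; field names and values are guesses,
    # not the real schema of Microsoft's internal bug tracker.
    from dataclasses import dataclass
    from enum import Enum

    class Status(Enum):
        OPEN = "Open"
        RESOLVED = "Resolved"
        CLOSED = "Closed"

    class Resolution(Enum):
        NONE = "None"
        FIXED = "Fixed"
        NOT_REPRO = "Not Repro"
        WONT_FIX = "Won't Fix"
        BY_DESIGN = "By Design"

    @dataclass
    class Bug:
        title: str
        priority: int          # e.g. 1 (highest) through 4
        severity: int          # e.g. 1 (crash, data loss) through 4 (cosmetic)
        repro_steps: str
        product_version: str   # e.g. "Office 12"
        milestone: str         # e.g. "M2", "Beta", "RTM"
        status: Status = Status.OPEN
        resolution: Resolution = Resolution.NONE
        assigned_to: str = ""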

Typical workflow (a minimal code sketch follows the list):

  • Tester identifies a problem.
  • Tester creates a bug, filling out the relevant information, and assigns it to the right developer (or lead; see Triage below).
  • If applicable, the bug is triaged and reassigned to the correct developer.
  • Developer either fixes the problem or identifies the reason why it should not be fixed, and assigns the bug back to the tester as Resolved.
  • Tester tests the fix according to their team’s policy.
  • If the fix is acceptable, it is Closed. Otherwise it is reactivated (Status = Open) and assigned back to the developer.
  • If, after closing, the bug comes back due to a code regression, it is reopened and assigned back to the developer. This happens even if the regression occurs long after the original fix.
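
Putting the steps above into code form, here is a minimal sketch of the life cycle, reusing the Bug sketch from earlier; the function names are mine and not part of any real tracker's API:

    # Minimal sketch of the bug life cycle described above; the function
    # names are illustrative and not part of any real tracking system.

    def resolve(bug: Bug, resolution: Resolution, tester: str) -> None:
        """Developer resolves the bug and assigns it back to the tester."""
        bug.status = Status.RESOLVED
        bug.resolution = resolution
        bug.assigned_to = tester

    def close(bug: Bug) -> None:
        """Tester accepts the fix (or the resolution) and closes the bug."""
        bug.status = Status.CLOSED

    def reactivate(bug: Bug, developer: str) -> None:
        """Fix rejected, or a later regression brought the bug back."""
        bug.status = Status.OPEN
        bug.resolution = Resolution.NONE
        bug.assigned_to = developer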

Many teams follow a “waterfall”-style process in which the development for a product release is divided in advance into milestones, with feature development assigned to each milestone. Typically, team leadership (program management aka PM, development, and QA) agrees on release criteria: the conditions under which the product will be released. Sample categories (a rough checklist sketch follows the list):

  • Certain key experiences work as defined.
  • Automated testing for certain key experiences.
  • The number of bugs returned by a certain query is zero (for example, no open P1-P3 bugs).
  • Code coverage is at least X%.
  • The top K bugs from the previous release have been fixed.
  • Performance and scalability for key experiences meet certain requirements (e.g. no page takes longer than K seconds to load).
  • Security review completed.
  • Globalization review completed.
  • Etc.
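
Criteria like these lend themselves to being checked mechanically against the bug database and test results. As a hedged sketch (the thresholds below are placeholders, not anyone's actual release bar), part of such a checklist might be encoded roughly like this:

    # Rough sketch of checking a few release criteria; the thresholds
    # (priority <= 3, 70% coverage, 3-second page loads) are placeholders.

    def release_criteria_met(bugs, coverage_pct, page_load_seconds):
        checks = [
            # No open P1-P3 bugs matching the agreed query.
            all(not (b.status == Status.OPEN and b.priority <= 3) for b in bugs),
            # Code coverage is at least X% (here X = 70).
            coverage_pct >= 70.0,
            # No page takes longer than K seconds to load (here K = 3).
            max(page_load_seconds) <= 3.0,
        ]
        return all(checks)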

A release manager (who could be PM/Dev/QA) is often responsible for tracking this checklist as development proceeds. It is also common to have milestone exit criteria, essentially a less restrictive version of the release criteria. The purpose of milestone exit criteria in the waterfall methodology is to make sure that the product is in a close-to-shippable state even during development.

Based on the release or milestone exit criteria, it is possible to define a bug query that tracks all issues that, per agreement, must be fixed before the end of the milestone or release (a sketch of such a query appears after the list of triage actions below). This query must return zero records before moving on to the next milestone or release; otherwise the release is delayed. As the end of the milestone approaches, the team begins to monitor this query closely. The purpose of triage meetings is to review incoming bugs and determine whether they meet the bar for the release. Triage meetings always include a PM, dev, and QA representative. Actions taken in triage may include:

  • Resolving bugs as Won’t Fix or By Design based on review,
  • Raising or lowering the priority of the bug,
  • Assigning bugs to the correct developer,
  • Changing the Product Version or Milestone for the bug, thereby moving the bug out of the current release,
  • Reassigning to the tester for additional information,
  • Assigning to the developer with instructions to investigate the cost and risk of a fix.
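
To give a feel for the query mentioned above (this is a guess at its shape in Python terms, not the query language of the actual tracker), the “must fix before exit” filter might look like:

    # Guess at the shape of the "must fix for this milestone" query; the cut
    # lines (priority <= 2, milestone == "Beta") are placeholders.

    def must_fix(bugs, milestone="Beta"):
        """Bugs that must reach zero before the milestone or release exits."""
        return [
            b for b in bugs
            if b.status != Status.CLOSED
            and b.milestone == milestone
            and b.priority <= 2
        ]

    # The milestone cannot exit until this list is empty:
    # remaining = must_fix(all_bugs)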

Near the end of a release, triage may also review proposed code changes before accepting them into the release branch, summoning the developer or tester involved to explain. If triage is not in effect, then QA just assigns the bug to the developer responsible for the area. The release manager periodically sends out a glidepath for the current milestone or release. This shows the current progress towards the zero-bug goal, and may identify specific bugs that are the target for the next period (a rough sketch of the calculation appears below). If insufficient progress towards the goal is being made, corrective action is taken, including:

  • Cutting features
  • Reducing scope
  • Throwing developers in “bug jail”: any developer with more than X bugs with Pri/Sev of Y is forbidden from working on new features until their bug count is reduced.

As developers complete their feature work for a release, they take bugs from other developers to help with the glidepath.
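
As a rough sketch of the kind of arithmetic behind a glidepath report and a bug-jail check (the linear ramp-down and the five-bug threshold are placeholders, not an actual policy):

    # Rough sketch of a glidepath target and a "bug jail" check; the linear
    # ramp-down and the five-bug threshold are placeholders.
    from collections import Counter

    def glidepath_target(start_count, days_total, days_elapsed):
        """Expected open-bug count if burning down linearly to zero."""
        remaining = max(0.0, 1.0 - days_elapsed / days_total)
        return start_count * remaining

    def bug_jail(bugs, max_bugs=5):
        """Developers with more open bugs than the agreed threshold."""
        counts = Counter(b.assigned_to for b in bugs if b.status == Status.OPEN)
        return {dev: n for dev, n in counts.items() if n > max_bugs}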

The team identifies a date prior to the release called ZBB (Zero Bug Bounce). This is the day when the bug query is supposed to come up empty. Typically, the day after ZBB, new bugs come in or certain resolved bugs are reactivated. These bugs are tracked in real time, and no changes other than those related to these bugs are accepted. Then final release testing proceeds until the code is released (after signoff from PM, Dev, and Test). In the old days this was called “RTM”: Release to Manufacturing.

Author: natebrix

Follow me on Twitter at @natebrix.

2 thoughts on “Testing at Microsoft”

  1. Hi Nathan,

    Your post is interesting. While I was working basic IT in college, my users came across a lot of bugs in Office products on Mac OS X. As a computer engineering major, I knew that to a certain extent products’ bugs won’t all be fixed in time for release, so there is a certain level of functionality that is deemed acceptable for release.

    In your experience, where do you think the hold-up on bug fixes is? Is it because developers don’t deem them high enough priority, or because the designers don’t have time for them?

    Just wondering because I’m applying for SQA or software engineering jobs right now.

    1. Part of the “hold up” in fixing bugs is a balancing act between building new features and fixing a shipped bug.

      Fixing a shipped bug is considered in light of its severity (crasher? data loss? workaround? merely annoying?) and impact (everyone? some people? very few people?).

      A shipped bug has shipped, so it is no longer considered a bug proper. Now it’s a feature. (Not a joke.) Which means that changing (fixing) the feature is a new product request. Those product requests get prioritized against others, taking into account difficulty and business value.

      Often, new features win out as having more business value than correcting a defect (misfeature), which is why a known bug can linger on release after release after release. Even worse is when some known bug becomes relied upon by part of the user community; then fixing the (mis)feature becomes a balancing act.
