I’ve recently been making use of Microsoft’s “GitHub Copilot coding agent” on GitHub. This is a new feature that allows you to assign issues in a repository directly to Copilot. Copilot then works independently on your behalf to generate pull requests and code changes. I’m on a project right now that has access to this feature, and a lot of what we’ve been doing is experimenting to figure out what works. This article shares my takeaways from some of that experimentation.

Setup Files

It’s important to make use of two of the files Microsoft expects you to provide: copilot-setup-steps.yml and copilot-instructions.md. These names must match exactly because they are special file names that Copilot looks for as part of its process.

copilot-setup-steps.yml is where developers provide all of the instructions Copilot needs to set up your development environment in a container. It can use GitHub Actions as well as custom setup commands. Here is an example, which should be added in .github/workflows.

name: "Copilot Setup Steps"

# Documentation:
# https://docs.github.com/en/enterprise-cloud@latest/copilot/customizing-copilot/customizing-the-development-environment-for-copilot-coding-agent#preinstalling-tools-or-dependencies-in-copilots-environment
on: workflow_dispatch

jobs:
  # The job MUST be called `copilot-setup-steps` or it will not be picked up by Copilot.
  copilot-setup-steps:
    runs-on: ubuntu-latest

    # Set the permissions to the lowest permissions possible needed for your steps.
    # Copilot will be given its own token for its operations.
    permissions:
      # If you want to clone the repository as part of your setup steps, for example to install dependencies, you'll need the `contents: read` permission. If you don't clone the repository in your setup steps, Copilot will do this for you automatically after the steps complete.
      contents: read

    # You can define any steps you want, and they will run before the agent starts.
    # If you do not check out your code, Copilot will do this for you.
    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up Node.js
        uses: actions/setup-node@v4
        with:
          node-version-file: '.tool-versions'

      - name: Install dependencies
        run: npm install

      - name: Start server
        run: |
          npm start &

This workflow does just what its step names say: it checks out the code, sets up Node.js, installs dependencies, and starts the development server in the background.

The second file is copilot-instructions.md, which lets you give Copilot custom instructions to keep in context when it works on your issue. Take a moment to read the example below, and think about other instructions you might want in a file like this, such as “follow TypeScript best practices” or “run npm test before committing”.

# Copilot Contribution Guidelines for Word Add-In

This repository contains the codebase for an MS Word Add-In. It is primarily built using TypeScript and React, with integration into the Microsoft Office ecosystem. Please follow these guidelines when contributing:

## Code Standards

### Required Before Each Commit

-   Run `npm run lint-diff:fix` to ensure proper code formatting and adherence to linting rules only for changed files.
-   Run `npm test` to verify all unit tests pass before committing changes.

## Repository Structure

-   `add-in/`: root directory
    -   `src/`: Core source code for the add-in, including components, schemas, and utilities.
        -   `taskpane/`: Contains the main UI components and logic for the Word add-in.
            -   `__tests__/`: Unit tests for components and actions.
        -   `schemas/`: Zod schemas for validating data structures.
        -   `utils/`: Helper functions and utilities and tests.
    -   `assets/`: Static assets such as icons and images.
-   `docs/`: Documentation for deployment, setup, and usage.
-   `.github/`: GitHub-specific configuration files, including workflows and contribution guidelines.

## Key Guidelines

1. Follow TypeScript best practices and idiomatic patterns.
2. Maintain existing code structure and organization.
3. Use React hooks and functional components where appropriate.
4. Write unit tests for new functionality using Jest and React Testing Library.
5. Document public APIs and complex logic. Suggest changes to the `../docs/` folder when appropriate.
6. Ensure compatibility across Mac, Windows, and Online versions of Microsoft Word.

## Additional Notes

-   Use Fluent UI v9 components for consistent styling and functionality.
-   Refer to README.md for setup instructions and debugging tips.
-   For deployment, follow the steps outlined in the docs/deployment.md

Once these instructions are included in the .github folder of the project, Copilot has a context document it can use every time it does any work autonomously. The next question to answer is: what is suitable work for Copilot to do?

Copilot’s Job

Part of my team’s normal sprint process is determining whether a ticket is a good candidate to be worked on by Copilot, so we’ve added a label to the project: “Good for Copilot”. During a poker session or refinement, after we’ve determined complexity and resolved any questions that were raised, we decide whether or not to apply the “Good for Copilot” label. I tend to think of Copilot, and all AI tools, as Junior Developers.

Years ago, when I was first starting out in development, I asked the front-end lead how he determined whether an issue would be good for me to work on. He told me that he based it on how many places in the app a given issue was expected to touch. He’d give me issues he expected to be contained within one or a few files, but I had to build up to the widespread code changes. So that’s the metric I started with for Copilot: how many code changes am I expecting an issue to require? I reevaluate after each ticket that Copilot finishes. Sometimes, though, I like to let Copilot have a crack at a ticket just to see what it will do. It might get me, as they say, 80% of the way there, by which I mean the ticket is mostly done and just needs a human touch.

In order for Copilot to get an issue 80% of the way there, the issue needs to be defined to a level commensurate with how broad, complicated, or wide-reaching the finished product is meant to be. For simple issues, the Acceptance Criteria might be, “Change error text color to #FF0000”. Copilot is good at things like finding where error text is, finding the associated style, and updating the color property of that style to the supplied value, in this case the hex value for pure red. As the issue gets more complex, edge cases should be handled much more explicitly in the issue description. You can save yourself a lot of time by providing implementation suggestions in the ticket before assigning it to Copilot. Don’t be afraid to tell Copilot specific modules to look at, cases you want coverage for, or any of the other little details you might give a coworker along their implementation path. Copilot does not ask questions when there is ambiguity. It just makes changes and counts on someone to double-check its work. This raises another important part of working with Copilot: the human developer looking over its shoulder.
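As one way to make the advice about issue detail concrete, here is a minimal TypeScript sketch of an issue body that spells out acceptance criteria, places to look, and edge cases. The `CopilotIssue` shape, the `renderIssueBody` helper, and the file path in the example are all hypothetical, made up for illustration; they are not part of GitHub or Copilot.

```typescript
// Hypothetical shape for a Copilot-ready issue; the field names are illustrative only.
interface CopilotIssue {
  summary: string;
  acceptanceCriteria: string[];
  filesToLookAt: string[];
  edgeCases: string[];
}

/** Render the issue as markdown, skipping any section that has no items. */
function renderIssueBody(issue: CopilotIssue): string {
  const section = (title: string, items: string[]): string[] =>
    items.length === 0 ? [] : [`## ${title}`, ...items.map((item) => `- ${item}`), ""];

  return [
    issue.summary,
    "",
    ...section("Acceptance Criteria", issue.acceptanceCriteria),
    ...section("Where to Look", issue.filesToLookAt),
    ...section("Edge Cases to Cover", issue.edgeCases),
  ]
    .join("\n")
    .trimEnd();
}

// Example: the simple color change from above, with an assumed (made-up) location hint.
const body = renderIssueBody({
  summary: "Error text should be pure red.",
  acceptanceCriteria: ["Change error text color to #FF0000"],
  filesToLookAt: ["add-in/src/taskpane/ (assumed location of the error styles)"],
  edgeCases: [],
});
console.log(body);
```

For a simple ticket like this one the edge-case section is empty and gets dropped. For broader tickets, that section is where most of the extra detail would go.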

The You in the Machine

Copilot does work on behalf of a person, which means that a person needs to take responsibility for issues that arise as a result. Copilot is just as capable of doing wacky stuff as any other AI tool. The human involved in this workflow has several checkboxes they should tick before merging a PR generated by Copilot. First and foremost, the human assignee should pull the code down and manually QA the code changes. The human assignee should also review the code, and any suggestions for code changes should be added to the PR as comments. Copilot will start work as soon as it gets feedback, so it’s better to batch comments in a review than to submit single comments.

The easiest way for Copilot to make code changes is for the human assignee to be as explicit and direct as possible in their feedback. Telling Copilot to put code on a specific line, or delete a particular test, is a much more straightforward change than saying something like, “This reads funky” or something like that. You may feel that if you’re going to tell Copilot to make specific changes on specific lines, then why not just do it yourself? Good question.

Be the Pilot you want to see in the World

Don’t be afraid to make changes to a Copilot PR yourself. It can be faster than going back and forth with comments. Unlike a human developer, who develops skills and retains memories of best practices, Copilot doesn’t benefit from you asking it to make the change (and it may cost you money). Copilot can make the same mistake on every PR, and you can point it out every time, and it will not matter. This is especially true because, a lot of the time, the human assignee only knows which line to change because they did some extensive debugging to figure it out. At that point, pointing out the small change to Copilot may be slower than making it yourself.

Here’s an example of what I think is a good use case for Copilot:

@copilot we have this comment clarifying the expected behavior

"Q: Do we want to enforce a particular format for the input?
A:inputs that should be recognized are 1 2 3 4 5 6 7 8 9 10 11 12 30 : am pm

Q: Do we want to allow inputs not in the dropdown (5:05 AM)?
A: no, only allow hour or half hour inputs
Q: How much should the user have to type before a match is found?
A: typing any of the numbers above should find an input"

I'd like to try replacing the underlying timepicker being used in the CustomTimepicker with a dropdown, which I think will provide the functionality we want.

I told Copilot to make this change because I thought it would solve the problem, and because I thought it would be faster and easier for Copilot to do it than for me to do it. The approach didn’t work, but that wasn’t Copilot’s fault. This leads me to what I see as the best use case for Copilot: you have a concrete idea of the changes you want to make or how you want to tackle a problem, the work is uncomplicated but time-consuming, and you would rather spend your time on something else.
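To show how concrete the requirements in that comment are, here is a minimal TypeScript sketch of the matching behavior it describes: only hour and half-hour options exist, and typing a number finds matching options. This is not the code from our project or what Copilot produced. The function names are hypothetical, and the matching rule (prefix match, plus an exact match on the minutes or AM/PM part) is my own assumption about how to read the Q&A.

```typescript
/** Build the 48 dropdown options: every hour and half hour, e.g. "12:00 AM", "12:30 AM", "1:00 AM". */
function buildTimeOptions(): string[] {
  const options: string[] = [];
  for (const period of ["AM", "PM"]) {
    for (const hour of [12, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]) {
      for (const minutes of ["00", "30"]) {
        options.push(`${hour}:${minutes} ${period}`);
      }
    }
  }
  return options;
}

/**
 * Return the options that match what the user has typed so far.
 * Assumed rule: an option matches if its text starts with the input,
 * or if the input is exactly its minutes ("30") or its period ("am" / "pm").
 */
function matchTimeOptions(typed: string, options: string[] = buildTimeOptions()): string[] {
  const query = typed.trim().toLowerCase().replace(/\s+/g, " ");
  if (query === "") return options;

  return options.filter((option) => {
    const text = option.toLowerCase();
    // "1:30 am" splits into ["1", "30", "am"]; the hour is covered by the prefix check.
    const [, minutes, period] = text.split(/[: ]/);
    return text.startsWith(query) || minutes === query || period === query;
  });
}

console.log(matchTimeOptions("5:30 p")); // a valid half-hour input narrows to one option
console.log(matchTimeOptions("5:05")); // not in the dropdown, so nothing matches
```

Because the only valid values are the generated options, an input like 5:05 AM can never be selected, which covers the “no inputs outside the dropdown” answer without extra validation code.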

Opening the Black Box

Sometimes having Copilot do anything complicated can lead to a lengthy back and forth. GitHub helpfully exposes the session logs any time Copilot is dispatched to perform work, and it can be worth looking at those sessions to see how it is approaching the problem. This can give you insight into how to communicate with Copilot more effectively. It could be that Copilot is completely misunderstanding the issue because of some word choices.

Bye Bye Co-Birdie

Once the back and forth is done, and the human assignee has QA’d the code, reviewed it themselves, and is willing to stand by the changes, it’s time to move the PR out of WIP/draft status, and to assign other human reviewers. I find it much harder to review code I’ve gone back and forth with Copilot on than to review code someone else wrote. It’s important to get a second pair of literal human eyeballs looking at the code changes Copilot cooked up.

When it is time to assign other human reviewers to a PR that Copilot generated, I suggest unassigning Copilot from the PR. I’ve had cases where Copilot made changes to the PR based on reviewer feedback, which was not the expectation. The expectation is that human reviewers provide PR review, and the human assigned to the issue evaluates what code changes need to be incorporated, and which can be resolved or require more discussion.

At the point where other human reviewers become involved, this becomes a lot more like standard code review: approve, merge, deploy, QA, and whatever else may be in a team’s workflow. The main difference is that you weren’t pairing with a person. Copilot doesn’t care if you tell it, “Nice Job!” or something like that. However, I think we have an opportunity to practice a culture we want to foster by encouraging our teammates to treat, not just talk to, Copilot like a person. It can build your communication muscles by getting in those extra reps, and it leaves a trail of what you thought and how you felt about what Copilot did or said.

TLDR: Add Setup Steps and instructions. Be as direct about what you want as possible. Be diligent in reviewing. Get feedback from your team about the work you delegated. Be kind and supportive in communication. I’m phrasing this intentionally to point out that these are the exact tips I would suggest for working with any new developer on your team, especially a junior. Prepare things for Copilot, and you’ll be prepared for anyone.


GitHub Copilot: 5 Great Tips for Better Outcomes