Autonomous Visual Studio Code Desktop Automation using Computer Use Agent & PyAutoGUI

srikantan

Microsoft

Jul 20, 2025

This article outlines an approach to autonomous code generation using the Computer Use Agent (CUA) model and PyAutoGUI. It automates VS Code desktop interactions with GitHub Copilot in Agent mode.

Project Overview

The system replicates a developer's workflow by autonomously launching VS Code, configuring the environment, and generating code via GitHub Copilot Agent mode. This automation is ideal for scenarios where:

The codebase is not hosted on GitHub, so GitHub agents on github.com are not applicable.
GitHub Codespaces cannot access the code, which rules out VS Code in the browser.
Desktop automation is required: the solution must work with the VS Code desktop client on a local computer.

Key Innovation: CUA Model + PyAutoGUI Integration

The project demonstrates the synergy between the CUA model and PyAutoGUI:

PyAutoGUI: Executes desktop actions (launching apps, typing, clicking).
CUA Model: Analyzes screenshots and determines next actions.
Deterministic Outcomes: Together, the two enable autonomous detection and correction at each step.
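The division of labor above can be pictured as an observe, decide, act loop. The following is only a minimal sketch under stated assumptions, not the code from the sample repo: `run_until`, `capture_png`, and the `"done"` sentinel are hypothetical names, and the `decide` callable stands in for whatever call you make to the CUA model.

```python
import io
from typing import Callable


def capture_png() -> bytes:
    """Grab the full screen with PyAutoGUI and return it as PNG bytes."""
    import pyautogui  # imported lazily so the loop below works without a display

    buffer = io.BytesIO()
    pyautogui.screenshot().save(buffer, format="PNG")
    return buffer.getvalue()


def run_until(
    capture: Callable[[], bytes],
    decide: Callable[[bytes], str],
    act: Callable[[str], None],
    done_action: str = "done",  # hypothetical sentinel, not a CUA model value
    max_steps: int = 20,
) -> int:
    """Observe, decide, act. Returns the number of actions executed."""
    for step in range(max_steps):
        screenshot = capture()       # observe: current visual state
        action = decide(screenshot)  # decide: stand-in for the CUA model call
        if action == done_action:
            return step
        act(action)                  # act: e.g. a PyAutoGUI click or keystroke
    raise TimeoutError(f"no '{done_action}' decision after {max_steps} steps")
```

Passing the three roles in as callables keeps the loop itself independent of any particular model endpoint, and makes it easy to exercise with stubs before pointing it at a real desktop.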

Complete Autonomous Workflow

Phase 1: Environment Setup

Launch the VS Code desktop application.
Navigate to the project workspace directory.
Maximize the VS Code window.
Open the integrated PowerShell terminal.
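As a rough illustration of Phase 1, the steps can be written down as data and replayed through PyAutoGUI's `press`, `write`, and `hotkey` functions. This is a sketch assuming Windows and default VS Code keybindings (Ctrl+K Ctrl+O for Open Folder, Ctrl+` for the terminal, Win+Up to maximize); `setup_steps`, `execute`, and the fixed wait times are hypothetical and would need tuning on a real machine.

```python
import time


def setup_steps(workspace: str) -> list[tuple]:
    """Phase 1 as an ordered list of (function name, *args) steps."""
    return [
        ("press", "win"),                # open the Start menu
        ("write", "Visual Studio Code"),
        ("press", "enter"),
        ("sleep", 5),                    # assumed startup delay
        ("hotkey", "ctrl", "k"),         # Ctrl+K Ctrl+O: Open Folder
        ("hotkey", "ctrl", "o"),
        ("write", workspace),
        ("press", "enter"),              # the dialog may need a further confirmation
        ("sleep", 3),                    # assumed workspace load delay
        ("hotkey", "win", "up"),         # maximize the active window
        ("hotkey", "ctrl", "`"),         # toggle the integrated terminal
    ]


def execute(steps, gui, sleep=time.sleep) -> None:
    """Replay steps against `gui`, which is expected to be the pyautogui module."""
    for name, *args in steps:
        if name == "sleep":
            sleep(*args)
        else:
            getattr(gui, name)(*args)
```

On a real desktop this would be invoked as `import pyautogui; execute(setup_steps(r"C:\path\to\workspace"), pyautogui)`. Fixed sleeps are the weakest part of the sketch, which is exactly where screenshot-based confirmation from the CUA model helps.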

Phase 2: Intelligent Package Management

Read requirements.txt.
Execute pip install -r requirements.txt.
Monitor progress:
- Capture screenshots.
- CUA analyzes terminal output.
- Detects "in_progress" vs "complete" states.
- PyAutoGUI waits for CUA confirmation.
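The monitoring steps above amount to a polling loop with a timeout. Here is a minimal sketch, assuming a hypothetical `classify_screen` callable that captures a screenshot, sends it to the CUA model, and returns either "in_progress" or "complete"; the 5-second interval and 10-minute timeout are illustrative defaults, not values from the sample.

```python
import time
from typing import Callable


def wait_for_install(
    classify_screen: Callable[[], str],
    poll_seconds: float = 5.0,       # assumed polling interval
    timeout_seconds: float = 600.0,  # assumed upper bound for pip install
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll until the model reports "complete". Returns False on timeout."""
    deadline = clock() + timeout_seconds
    while clock() < deadline:
        if classify_screen() == "complete":
            return True
        sleep(poll_seconds)  # still "in_progress": wait, then look again
    return False
```

Returning False on timeout, rather than waiting forever, gives the calling script a chance to retry or stop instead of typing into a terminal that is still busy.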

Phase 3: GitHub Copilot Integration

Open the GitHub Copilot chat panel (Ctrl+Shift+I).
Switch to Agent mode.
Submit the developer prompt.
Monitor status:
- Capture screenshots every 3 seconds.
- CUA checks the "Keep" button status.
- Accept the generated code when ready.
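Phase 3 could be sketched along the following lines. The article does not spell out how Agent mode is selected or how the "Keep" button is clicked, so those are passed in as hypothetical callables (`switch_to_agent_mode`, `keep_button_state`, `accept`); only the Ctrl+Shift+I shortcut and the 3-second interval come from the steps above. `gui` is expected to be the `pyautogui` module.

```python
import time
from typing import Callable

POLL_SECONDS = 3  # screenshot interval stated in the workflow


def run_copilot(
    gui,
    prompt: str,
    switch_to_agent_mode: Callable[[], None],  # hypothetical hook
    keep_button_state: Callable[[], str],      # hypothetical: screenshot + CUA call
    accept: Callable[[], None],                # hypothetical: click "Keep"
    max_polls: int = 200,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Submit a prompt and accept the result once "Keep" is enabled."""
    gui.hotkey("ctrl", "shift", "i")  # open the Copilot chat panel
    switch_to_agent_mode()
    gui.write(prompt)                 # single-line prompt; "\n" would press Enter
    gui.press("enter")
    for _ in range(max_polls):
        if keep_button_state() == "enabled":
            accept()
            return True
        sleep(POLL_SECONDS)           # generation still running
    return False
```

With the defaults, the loop gives up after roughly ten minutes (200 polls at 3 seconds each), which is an arbitrary bound chosen for the sketch.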

CUA Model Decision Points

Package Installation Monitoring

{
  "installation_status": "complete" | "in_progress"
}

In Progress: Detects progress bars, downloads.
Complete: Identifies empty prompt ready for input.
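Because the script branches on this value, it is worth validating the model's reply before trusting it. The following is a small sketch, assuming the reply arrives as a JSON string in the shape shown above; `parse_installation_status` is a hypothetical helper name.

```python
import json

VALID_STATUSES = {"complete", "in_progress"}


def parse_installation_status(reply: str) -> str:
    """Return the status from the model's JSON reply, or raise ValueError."""
    data = json.loads(reply)  # malformed JSON raises a ValueError subclass
    if not isinstance(data, dict):
        raise ValueError(f"expected a JSON object, got: {reply!r}")
    status = data.get("installation_status")
    if status not in VALID_STATUSES:
        raise ValueError(f"unexpected installation_status: {status!r}")
    return status
```

Treating anything outside the two expected values as an error keeps a malformed reply from being mistaken for "complete".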

Code Generation Monitoring

{
  "button": "enabled" | "disabled"
}

Disabled: Code generation ongoing.
Enabled: Ready for acceptance.
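One way to keep the two button states from leaking through the script as raw strings is to map the reply onto an enum. This is a sketch only; `KeepButton` and `parse_keep_button` are hypothetical names, and the reply is assumed to be a JSON string in the shape shown above.

```python
import json
from enum import Enum


class KeepButton(Enum):
    ENABLED = "enabled"    # generation finished, ready for acceptance
    DISABLED = "disabled"  # generation still ongoing


def parse_keep_button(reply: str) -> KeepButton:
    """Map the model's JSON reply onto a KeepButton member."""
    data = json.loads(reply)
    # Enum lookup by value raises ValueError for anything unexpected.
    return KeepButton(data["button"])
```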

Automation Flow in Action

The following is the sequence of steps automated in the flow:

Launch VS Code → Open project workspace.
Open terminal → Execute pip install.
CUA monitors → Detects completion.
Open Copilot → Activate Agent mode.
Submit prompt → Begin code generation.
Monitor "Keep" button → Detect enabled state.
Accept code → Automation complete.
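The sequence above can be wired together as an ordered list of named steps that stops at the first failure. This is a minimal sketch, not the orchestration used in the sample; `run_flow` is a hypothetical name, and each step is assumed to be a callable that returns True on success.

```python
from typing import Callable

Step = tuple[str, Callable[[], bool]]


def run_flow(steps: list[Step]) -> list[str]:
    """Run steps in order and return the names of those that completed."""
    completed: list[str] = []
    for name, step in steps:
        if not step():
            # Stop immediately so later steps never act on a bad screen state.
            raise RuntimeError(f"step failed: {name} (completed: {completed})")
        completed.append(name)
    return completed
```

Naming each step makes the failure message point at the exact stage, for example "Monitor Keep button", where the desktop state diverged from what the script expected.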

Real-World Applications

Development Scenarios

Clone repositories.
Create branches.
Generate code.
Run tests and validations.

Regression Testing Use Cases

Routine code generation.
Test case creation.
Documentation updates.
Refactoring tasks.

Note: Requires a dedicated computer to run VS Code desktop. Use responsibly with safeguards and monitoring.

Refer to the GitHub Repo here. Refer to the video of this sample in action below:

Limitations:
With PyAutoGUI, the application being automated has to be in the foreground. PyAutoGUI cannot single out a particular application from among the others on the desktop and monitor its visual state. There are other options to consider for that, which I will cover in a later article.
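Given this limitation, a script can at least check that VS Code is the foreground window before sending keystrokes. The sketch below assumes Windows, where PyAutoGUI exposes PyGetWindow's `getActiveWindow()`, and assumes the VS Code window title contains "Visual Studio Code"; both helper names are hypothetical.

```python
from typing import Optional


def is_target_in_foreground(
    active_title: Optional[str],
    expected: str = "Visual Studio Code",  # assumed substring of the window title
) -> bool:
    """True if the active window's title contains the expected text."""
    return bool(active_title) and expected.lower() in active_title.lower()


def active_window_title() -> Optional[str]:
    """Title of the foreground window, or None if there is none."""
    import pyautogui  # lazy import: needs a desktop session

    window = pyautogui.getActiveWindow()
    return window.title if window else None
```

A guard such as `if not is_target_in_foreground(active_window_title()): ...` before each action only detects the problem; it does not lift the foreground requirement itself.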

Updated Jul 20, 2025

Version 1.0


Azure AI Foundry Blog
