Contents
Recently I have been thinking more and more about AI and I wondered if it was possible to use AI to automate boring administrative tasks in vSphere.
First I though to use some Generative AI to provide instruction on how to perform the task, but I had no way to actually get something to perform them for me.
Then today I discovered than you can use a tool called browser-use, combined with a model to perform tasks within a browser.
They way this works is you setup the browser-use system with your model of choice and API key. Then you simply provide a prompt such as “login to this URL with user X and password Y” It then launches a browser session and scans the page then injects your form details.
This is all basic stuff though, so I tried something more complex. I used used these tools to login to the vSphere client and interact with my virtual machines but I think if you get the prompts right, you could even get vMotion to work and far more complex things including advanced configuration.
Before I run through how to set this up, take a look at the quick demo which logs into the vSphere client, deals with an SSL error, navigated around a bunch of errors in the client and then powers on a VM. The prompt for this is in the instructions below.
Demo
Setting up the environment
Install Python
- Download and Install Python: Download Python | Python.org
Install Browser Use Web UI
- Download or clone from: GitHub – browser-use/web-ui: Run AI Agent in your browser
- Extract and change into the folder directory
- Follow the instructions for Option 1: Local Installation
- If you’re using PowerShell, when instructed to run:
source .venv/bin/activate
you will need to actually run:.venv/bin/activate
Launch
- From the same directory you have been working in, run:
python webui.py --ip 127.0.0.1 --port 7788
- Open a web browser to:
http://127.0.0.1:7788
Configure Browser Use Web UI
- Open the LLM Configuration and select your model. I am using the DeepSeek Reasoner Model.
- Locate your model’s API keys
- This will vary depending on the model you are using.
- For DeepSeek, Navigate to DeepSeek and then API platform.
- Create a new account
- Top up up the account with $2 to get started
- Next go to API keys and generate a new API key
- Add your API key to Web UI (Under LLM Configuration)
- Enter the Base URL for the model. For DeepSeek, this is:
https://api.deepseek.com/v1
- Under Agent Settings, it is recommended to disable “Use Vision” due to compatibility issues with DeepSeek
Performing vSphere Tasks
Now onto the good bit!
- Go to Run Agent and enter your prompt.
- Be sure to provide the URL, user and password plus what you want it to do.
Task Description
1. Open the vSphere client
2. Wait for the licensing banner to load
3. Close the licensing error banner with the x on the top right of the browser
4. Power on the VM using the power on icon
Additional Information
The vSphere client URL is: https://vcf-m01-vc01.lab.local/ui/
The vSphere client username is: administrator@vsphere.local
The vSphere client password is: Password12345!
The VM name is: vcf-m01-nsx01a
What’s interesting here is in the console window you can see what the model is trying to do in plain English, then it tries that.
I also though it was interesting how it tried a few ways to bypass the SSL error until it finally found a good way to do it. Also sometimes it does the wrong thing (opens the wrong menu for example) but it will then close that menu and try again, a little like how a human would make a mistake and correct it.
Share your prompts!
If you manage to get something interesting working such as live vMotion, advanced configurations or even a host upgrade using this method, please let me know.
I’d be happy to put together a library of prompts and credit you for your work.
A note on using LLMs
I highly recommend doing your own research before using DeepSeek’s LLMs, especially for work purposes outside of China.
I would also avoid this use of LLMs to orchestrate tasks in your production environment due to the obvious risks associated with it.