Creating a Custom Environment#
Creating a custom MiniWoB++ environment is as simple as creating a new task HTML page, and then specifying the URL to the HTML file when registering the environment.
The following tutorial illustrates how to create a custom environment with the standard observation space and action space. For more advanced needs (customizing the spaces, creating a package, etc.), see this Gymnasium tutorial.
Creating the Task HTML Page#
To properly interface with the MiniWoB++ library, the HTML file must include
core.css
and
core.js
.
These files are also available from the Python package (under html/core/
).
Here is an example HTML file (custom.html
):
<!DOCTYPE html>
<html>
<head>
<title>Custom Task</title>
<!-- The following CSS and JS are required. -->
<link rel="stylesheet" type="text/css" href="core.css">
<script src="core.js"></script>
<script>
// genProblem() will be called at the start of each episode.
var genProblem = function() {
// core.randi(a, b) (from core.js) samples an integer between a and b (inclusive).
let targetAmount = core.randi(2, 5), clickedAmount = 0;
// Set the task utterance. This will become the "utterance" in the observation.
document.getElementById("query").textContent =
`Click Hello ${targetAmount} times, then click Submit.`;
// Create the task UI.
let hello = document.createElement("button");
let submit = document.createElement("button");
let amount = document.createElement("span");
document.getElementById("area").replaceChildren(hello, submit, amount);
hello.textContent = "Hello";
hello.addEventListener("click", () => {
clickedAmount++;
amount.textContent = clickedAmount;
});
submit.textContent = "Submit";
submit.addEventListener("click", () => {
if (clickedAmount === targetAmount) {
// core.endEpisode(reward, timeProportional) ends the episode and sets
// the reward (which can be read by Selenium). If timeProportional is
// true, then the reward is scaled by the amount of time left.
core.endEpisode(1.0, true);
} else {
core.endEpisode(-1.0, false);
}
});
}
window.onload = function() {
// core.startEpisode() prepares the web page for an episode by showing the "START"
// screen (it does not actually start the episode until "START" is clicked).
// core.startEpisode() will also be automatically called at the end of each episode.
core.startEpisode();
}
</script>
</head>
<body>
<!-- The following 3 divs are required. -->
<div id="wrap">
<div id="query"></div>
<div id="area"></div>
</div>
</body>
</html>
Registering the Environment#
The generic environment class MiniWoBEnvironment
from miniwob.environment
can be used as the entry point for the environment.
The following keyword arguments should be specified:
subdomain
: This should match the HTML filename.base_url
: This should point to where the HTML is; the URL{base_url}/{subdomain}.html
should point to the task page.field_extractor
: This should be a function that takes the utterance string (that appears in the yellow box) and returns a list of fields as key-value tuples. One option is to not produce any field by specifyinglambda x: []
.
Here is example code (custom_registry.py
):
import pathlib
from gymnasium.envs.registration import register
from miniwob.fields import create_regex_field_extractor
register(
id='miniwob/custom-v0',
entry_point='miniwob.environment:MiniWoBEnvironment',
kwargs={
'subdomain': 'custom',
# Assuming that this file is in the same directory as `custom.html`:
'base_url': 'file://{}/'.format(pathlib.Path(__file__).parent),
# This helper method will produce a function that takes a string and returns
# [('amount', ___)], where ___ is from the capturing group in the regex.
'field_extractor': create_regex_field_extractor(
r'Click Hello (\d+) times, then click Submit\.', ['amount'],
)
},
)
Using the Environment#
The following code is adapted from the example in the Basic Usage page:
import time
import gymnasium
from miniwob.action import ActionTypes
from miniwob.fields import field_lookup
# Import `custom_registry.py` above to register the task.
import custom_registry
gymnasium.register_envs(custom_registry)
# Create an environment.
env = gymnasium.make('miniwob/custom-v0', render_mode='human')
# Wrap the code in try-finally to ensure proper cleanup.
try:
# Start a new episode.
observation, info = env.reset()
# Find the relevant HTML elements.
for hello_button in observation["dom_elements"]:
if hello_button["text"] == "Hello":
break
for submit_button in observation["dom_elements"]:
if submit_button["text"] == "Submit":
break
amount = int(field_lookup(observation['fields'], 'amount'))
for _ in range(amount):
# Click Hello.
action = env.unwrapped.create_action(ActionTypes.CLICK_ELEMENT, ref=hello_button["ref"])
observation, reward, terminated, truncated, info = env.step(action)
time.sleep(0.5)
# Click Submit.
action = env.unwrapped.create_action(ActionTypes.CLICK_ELEMENT, ref=submit_button["ref"])
observation, reward, terminated, truncated, info = env.step(action)
# Check if the action was correct.
assert reward >= 0
assert terminated is True
time.sleep(0.5)
finally:
env.close()