Automation API
The automation object is available in the browser.webfuseSession namespace in every content script of an extension. To call it from a background script or a popup, send a message to a content script and call the automation API method from there.
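The relay from a background script or popup to a content script could be sketched as follows. The message shape (`kind`, `method`, `args`), the allow-list, and the helper name `dispatchAutomationMessage` are assumptions made for illustration and are not part of the Webfuse API. Only the method names are taken from this page.

```typescript
// Hypothetical message format sent from a background script or popup.
type AutomationMessage = {
  kind: 'automation';
  method: string;   // e.g. 'left_click'
  args?: unknown[]; // positional arguments for the method
};

// Structural stand-in for browser.webfuseSession.automation.
type AutomationLike = Record<string, (...args: unknown[]) => Promise<unknown>>;

// A subset of the methods documented on this page that may be invoked remotely.
const ALLOWED_METHODS = new Set([
  'mouse_move', 'scroll', 'left_click', 'middle_click', 'right_click',
  'type', 'key_press', 'setSelection', 'getSelection', 'wait',
  'take_dom_snapshot',
]);

// Content-script side: call the requested automation method.
// Returns undefined for messages that are not addressed to this handler.
function dispatchAutomationMessage(
  automation: AutomationLike,
  message: unknown,
): Promise<unknown> | undefined {
  if (typeof message !== 'object' || message === null) return undefined;
  const msg = message as Partial<AutomationMessage>;
  if (msg.kind !== 'automation' || typeof msg.method !== 'string') return undefined;

  const method = ALLOWED_METHODS.has(msg.method) ? automation[msg.method] : undefined;
  if (typeof method !== 'function') {
    return Promise.reject(new Error(`Unsupported automation method: ${msg.method}`));
  }
  return method.call(automation, ...(msg.args ?? []));
}
```

In a content script, this could be registered with `browser.runtime.onMessage.addListener((message) => dispatchAutomationMessage(browser.webfuseSession.automation, message))`. A background script could then send `{ kind: 'automation', method: 'left_click', args: ['#submit'] }` via `browser.tabs.sendMessage(tabId, ...)`, assuming the standard WebExtension messaging APIs behave as usual in this environment.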
Targeting
By default, automation actions work on the element below the virtual mouse pointer. A target can optionally be specified in different ways: via an element reference, a CSS selector (selector: string), or absolute point coordinates ([x: number, y: number]).

type Target = Element | string | [number, number]; // Element reference, CSS selector, or point coordinates

mouse_move()
Move the virtual mouse pointer.

browser.webfuseSession.automation.mouse_move(
  target: Target,
  persistent: boolean = false
): Promise<void>

Parameters
target
- Mouse pointer target.
persistent
- Whether to keep the mouse pointer on screen (hides after some time by default).
Returns
A promise that resolves once the mouse was moved.
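A minimal sketch of moving the pointer through several targets in sequence. The helper name `tracePath`, the `MouseAutomation` interface, and the `#submit` selector are hypothetical. Element references are left out of `Target` here so the sketch also type-checks outside a browser.

```typescript
// Target as documented above, minus Element references.
type Target = string | [number, number];

// The subset of the automation API used in this sketch.
interface MouseAutomation {
  mouse_move(target: Target, persistent?: boolean): Promise<void>;
}

// Move the pointer along a list of waypoints, one after another.
async function tracePath(automation: MouseAutomation, waypoints: Target[]): Promise<void> {
  for (const waypoint of waypoints) {
    // persistent = true keeps the pointer on screen between moves.
    await automation.mouse_move(waypoint, true);
  }
}
```

In a content script, this would be called as `await tracePath(browser.webfuseSession.automation, [[100, 250], '#submit'])`.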
scroll()
Scrolls the deepest scrollable element under the target by the given amount in the given direction.

browser.webfuseSession.automation.scroll(
  target: Target,
  direction: 'vertical' | 'horizontal',
  amount: number
): Promise<void>

Parameters
target
- Scroll(able) target.
direction
- The direction to scroll.
amount
- The number of pixels to scroll.
Returns
A promise that resolves once scrolling has ended.
Example

await browser.webfuseSession.automation.scroll('#scrollable', 'vertical', 100);

left_click()
Perform a left (primary) mouse button click.

browser.webfuseSession.automation.left_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>

Parameters
target
- Click target.
[moveMouse]
- Whether to move the virtual mouse pointer to the center of the target before the click.
Returns
A promise that resolves once the click was performed.
Example

await browser.webfuseSession.automation.left_click([100, 250]);

middle_click()
Perform a middle (wheel) mouse button click.

browser.webfuseSession.automation.middle_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>

Parameters
target
- Click target.
[moveMouse]
- Whether to move the virtual mouse pointer to the center of the target before the click.
Returns
A promise that resolves once the click was performed.
right_click()
Perform a right (secondary) mouse button click.

browser.webfuseSession.automation.right_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>

Parameters
target
- Click target.
[moveMouse]
- Whether to move the virtual mouse pointer to the center of the target before the click.
Returns
A promise that resolves once the click was performed.
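Since left_click(), middle_click(), and right_click() share the same signature, a caller can select the button at runtime. The following is a sketch under that observation. The `click` helper, the `Button` type, and the `#menu` selector are hypothetical, and the helper defaults `moveMouse` to true, unlike the API itself.

```typescript
// Target as documented above, minus Element references.
type Target = string | [number, number];

type ClickFn = (target: Target, moveMouse?: boolean) => Promise<void>;

// The subset of the automation API used in this sketch.
interface ClickAutomation {
  left_click: ClickFn;
  middle_click: ClickFn;
  right_click: ClickFn;
}

type Button = 'left' | 'middle' | 'right';

// Dispatch to the click method that matches the requested button.
// Note: moveMouse defaults to true here so the pointer visibly travels first.
function click(
  automation: ClickAutomation,
  button: Button,
  target: Target,
  moveMouse = true,
): Promise<void> {
  switch (button) {
    case 'left':
      return automation.left_click(target, moveMouse);
    case 'middle':
      return automation.middle_click(target, moveMouse);
    case 'right':
      return automation.right_click(target, moveMouse);
  }
}
```

For example, `await click(browser.webfuseSession.automation, 'right', '#menu')` would move the pointer to the element and then perform a right click.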
type()
Type text into an element. Typing is natural, i.e., as if a human were pressing a sequence of keys.

browser.webfuseSession.automation.type(
  target: Target,
  text: string,
  moveMouse: boolean = false,
  overwrite: boolean = false,
  timePerChar: number = 100
): Promise<void>

Parameters
target
- Typing target.
text
- Text to type.
[moveMouse]
- Whether to move the virtual mouse pointer to the center of the target before typing.
[overwrite]
- Whether to overwrite the current value of the target element.
[timePerChar]
- Average time to type a character in milliseconds.
Returns
A promise that resolves once the text was typed.
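Because timePerChar is an average, the total typing time can only be estimated. A rough sketch follows. The helpers `estimateTypingMs` and `replaceFieldValue` and the `#email` selector are hypothetical, and the estimate assumes the total is simply the character count multiplied by timePerChar.

```typescript
// Target as documented above, minus Element references.
type Target = string | [number, number];

// The subset of the automation API used in this sketch.
interface TypingAutomation {
  type(
    target: Target,
    text: string,
    moveMouse?: boolean,
    overwrite?: boolean,
    timePerChar?: number,
  ): Promise<void>;
}

// Approximate duration of a type() call in milliseconds (assumption: length x average).
function estimateTypingMs(text: string, timePerChar = 100): number {
  return text.length * timePerChar;
}

// Replace the current value of a field, moving the pointer to it first.
function replaceFieldValue(
  automation: TypingAutomation,
  target: Target,
  text: string,
  timePerChar = 100,
): Promise<void> {
  // moveMouse = true, overwrite = true
  return automation.type(target, text, true, true, timePerChar);
}
```

For example, `await replaceFieldValue(browser.webfuseSession.automation, '#email', 'jane@example.com', 60)` would be expected to take roughly `estimateTypingMs('jane@example.com', 60)` milliseconds.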
key_press()
Press a key on an element.

browser.webfuseSession.automation.key_press(
  target: Target,
  key: "a" | "b" | ... | "Y" | "Z",
  options?: {
    altKey?: boolean;
    ctrlKey?: boolean;
    metaKey?: boolean;
    shiftKey?: boolean;
  }
): Promise<void>

Parameters
target
- Key press target.
key
- Key to press.
[options]
- Booleans to hold down a modifier key during the press: altKey, ctrlKey, metaKey, or shiftKey.
Returns
A promise that resolves once the key was pressed.
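The options object maps naturally onto shortcut notation. The sketch below converts a string such as `Ctrl+Shift+Z` into a key and modifier flags. The shortcut string format and the helpers `parseShortcut` and `pressShortcut` are assumptions for illustration. The `key` parameter is typed as a plain string because the key union is abbreviated in the signature above.

```typescript
// Target as documented above, minus Element references.
type Target = string | [number, number];

type Modifiers = {
  altKey?: boolean;
  ctrlKey?: boolean;
  metaKey?: boolean;
  shiftKey?: boolean;
};

// The subset of the automation API used in this sketch.
interface KeyAutomation {
  key_press(target: Target, key: string, options?: Modifiers): Promise<void>;
}

// Split "Ctrl+Shift+Z" into the key ("Z") and the modifier flags.
function parseShortcut(shortcut: string): { key: string; options: Modifiers } {
  const parts = shortcut.split('+').map((part) => part.trim());
  const key = parts[parts.length - 1] ?? '';
  const options: Modifiers = {};
  for (const name of parts.slice(0, -1)) {
    switch (name.toLowerCase()) {
      case 'alt': options.altKey = true; break;
      case 'ctrl': options.ctrlKey = true; break;
      case 'meta': options.metaKey = true; break;
      case 'shift': options.shiftKey = true; break;
      default: throw new Error(`Unknown modifier: ${name}`);
    }
  }
  return { key, options };
}

// Press the parsed shortcut on the given target.
function pressShortcut(automation: KeyAutomation, target: Target, shortcut: string): Promise<void> {
  const { key, options } = parseShortcut(shortcut);
  return automation.key_press(target, key, options);
}
```

For example, `await pressShortcut(browser.webfuseSession.automation, '#editor', 'Ctrl+A')` presses "A" with ctrlKey set.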
setSelection()
browser.webfuseSession.automation.setSelection(
  target: Target,
  text: string,
  occurrence: number = 0
): Promise<void>

Parameters
target
- Text content selection target.
text
- Text to select (empty text also removes any existing selection).
[occurrence]
- The zero-based index of the occurrence if the text repeats in the target (e.g., 1 for the second occurrence).
Returns
A promise that resolves once the selection was applied.
getSelection()
browser.webfuseSession.automation.getSelection(): Promise<string>

Returns
A promise that resolves with the currently selected text (or an empty string if nothing is selected).
wait()
browser.webfuseSession.automation.wait(ms: number): Promise<void>

Parameters
ms
- The number of milliseconds to wait.
Returns
A promise that resolves once the given time has passed.
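One common use of wait() is polling for a condition between automation steps. The following is a minimal sketch. The helper `pollUntil` and its default interval and attempt count are hypothetical choices, not values taken from the API.

```typescript
// The subset of the automation API used in this sketch.
interface WaitAutomation {
  wait(ms: number): Promise<void>;
}

// Re-check a condition up to maxAttempts times, waiting intervalMs between checks.
// Resolves to true as soon as the condition holds, or false if it never does.
async function pollUntil(
  automation: WaitAutomation,
  condition: () => boolean | Promise<boolean>,
  intervalMs = 250,
  maxAttempts = 20,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (await condition()) return true;
    // Do not wait after the final failed check.
    if (attempt < maxAttempts - 1) await automation.wait(intervalMs);
  }
  return false;
}
```

For example, `await pollUntil(browser.webfuseSession.automation, () => document.querySelector('#results') !== null)` would check for an element roughly every 250 ms for up to about five seconds.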
take_dom_snapshot()
Serialize the DOM for various processing purposes, such as for LLM input.

browser.webfuseSession.automation.take_dom_snapshot(options?: {
  rootSelector?: string;
  crossframe?: boolean;
  crossshadow?: boolean;
  revealMaskedElements?: boolean;
  modifier?: "downsample" | {
    name: string;
    params?: unknown[];
  };
}): Promise<void>

Parameters
[options]
- DOM snapshot options:
  - [rootSelector]: Selector of the element to designate as the snapshot root (documentElement by default).
  - [crossframe] (Webfuse Exclusive): Whether to include iframe subtrees (false by default).
  - [crossshadow] (Webfuse Exclusive): Whether to include shadow DOM subtrees (true by default).
  - [revealMaskedElements]: Whether to include masked elements (false by default).
  - [modifier]: Snapshot modifier (identity by default).
    - name: Modifier name.
    - [params]: Modifier parameter record.
Returns
A promise that resolves to the snapshot.
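A sketch of combining the snapshot options. The helper `snapshotForLLM`, the `DOCUMENTED_DEFAULTS` constant, and the `#app` selector are hypothetical. The return type is widened to `unknown` here because the concrete snapshot type is not stated on this page.

```typescript
// Options as listed in the signature above.
type SnapshotOptions = {
  rootSelector?: string;
  crossframe?: boolean;
  crossshadow?: boolean;
  revealMaskedElements?: boolean;
  modifier?: 'downsample' | { name: string; params?: unknown[] };
};

// The subset of the automation API used in this sketch.
interface SnapshotAutomation {
  take_dom_snapshot(options?: SnapshotOptions): Promise<unknown>;
}

// The defaults listed above, written out explicitly for readability.
const DOCUMENTED_DEFAULTS: SnapshotOptions = {
  crossframe: false,
  crossshadow: true,
  revealMaskedElements: false,
};

// Snapshot one subtree, including iframe content, downsampled for LLM input.
function snapshotForLLM(automation: SnapshotAutomation, rootSelector: string): Promise<unknown> {
  const options: SnapshotOptions = {
    ...DOCUMENTED_DEFAULTS,
    rootSelector,
    crossframe: true,
    modifier: 'downsample',
  };
  return automation.take_dom_snapshot(options);
}
```

For example, `const snapshot = await snapshotForLLM(browser.webfuseSession.automation, '#app')`.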
Modifiers
accessibility-tree
Translate the DOM to an accessibility tree representation.

const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'accessibility-tree'
    }
  })

Example
Input DOM:

<form role="form" aria-describedby="recipe-hint" aria-labelledby="recipe-form-title">
  <div role="group" aria-labelledby="checkbox-group">
    <h3 id="checkbox-group">Recipe Preferences</h3>
    <label for="notifications" aria-describedby="notifications-description">
      <input type="checkbox" id="notifications" name="notifications" aria-label="Enable recipe update notifications">
      Receive recipe updates
    </label>
    <p id="notifications-description">I would like to receive updates.</p>
  </div>
  <button type="button" onclick="PASTA.showRecipes()" aria-controls="recipe-results" aria-label="Show pasta recipes for selected type" role="button" aria-live="assertive">
    Show Recipes
  </button>
</form>

Resulting snapshot:

{
  "role": "RootWebArea",
  "source": "html",
  "children": [
    {
      "name": "Recipe Preferences",
      "properties": {
        "level": 3
      },
      "role": "heading",
      "source": "#checkbox-group"
    },
    {
      "children": [
        {
          "name": "Enable recipe update notifications",
          "properties": {
            "aria-label": "Enable recipe update notifications"
          },
          "role": "checkbox",
          "source": "#notifications",
          "states": {
            "checked": false
          }
        }
      ],
      "properties": {
        "aria-describedby": "notifications-description"
      },
      "role": "generic",
      "source": "html > body > section > form > div > label",
      "description": "I would like to receive updates."
    }
  ]
}

downsampled (recommended)
Reduce the overall DOM to below about 32K (2^15 = 32,768) estimated LLM input tokens.
The resulting DOM can be considered a low-resolution variant that retains the majority of inherent UI features.

const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    rootSelector: '#app',
    modifier: 'downsample',
  })

D2Snap
Applies the D2Snap algorithm to the DOM. This reduces its size while retaining a majority of UI features. The algorithm was developed to mitigate the token size disadvantage of DOM snapshots.

const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'D2Snap',
      params: {
        hierarchyRatio: 0.4, textRatio: 0.6, attributeRatio: 0.8,
        // or
        // k: 0.4, l: 0.6, m: 0.8
        options: {
          assignUniqueIDs: false,
          keepUnknownElements: false,
          skipMarkdownTranslation: false,
        }
      }
    }
  })

[modifier.params.options]
- [assignUniqueIDs]: Whether to add a unique data attribute data-uid to every element in the DOM in order to allow identification of equivalent elements across the original and the downsampled DOM. For example, <button class="btn btn-primary" data-uid="27">Click here!</button>.
- [keepUnknownElements]: Whether to keep unknown (custom) elements in the downsampled DOM.
- [skipMarkdownTranslation]: Whether to skip the translation of HTML content to Markdown.
Example
Input DOM:

<section class="container" tabindex="3" required="true" type="example">
  <div class="mx-auto" data-topic="products" required="false">
    <h1>Our Pizza</h1>
    <div>
      <div class="shadow-lg">
        <h2>Margherita</h2>
        <p>
          A simple classic: mozzarella, tomatoes and basil.
          An everyday choice!
        </p>
        <button type="button">Add</button>
      </div>
      <div class="shadow-lg">
        <h2>Capricciosa</h2>
        <p>
          A rich taste: mozzarella, ham, mushrooms, artichokes, and olives.
          A true favourite!
        </p>
        <button type="button">Add</button>
      </div>
    </div>
  </div>
</section>

Resulting snapshot:

<!-- k = .4, l = .6, m = .8 -->
<section>
  # Our Pizza
  <div>
    ## Margherita
    A simple classic:
    <button>Add</button>
    ## Capricciosa
    A rich taste:
    <button>Add</button>
  </div>
</section>

AdaptiveD2Snap
Applies the AdaptiveD2Snap algorithm to the DOM. This is an adaptive version of the D2Snap algorithm that does not require explicit ratio parameters.

const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'AdaptiveD2Snap',
      params: {
        maxTokens: 32768,
        maxIterations: 3,
        options: {
          assignUniqueIDs: false,
          skipMarkdownTranslation: false,
        }
      }
    }
  })

take_gui_snapshot()
Serialize the GUI for various processing purposes, such as for LLM input.
The serialized GUI corresponds to a screenshot. Hence, this is an alias of webfuseSession.takeScreenshot().

browser.webfuseSession.automation.take_gui_snapshot(): Promise<ImageBitmap>