Automation API
The automation object lives in the browser.webfuseSession namespace and is available in every content script of an extension. To use it from a background script or a popup, send a message to a content script and call the automation method there.
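A minimal sketch of such a relay, assuming the standard WebExtension messaging calls (browser.runtime.onMessage in the content script, browser.tabs.sendMessage in the background script or popup). The message shape and the names AutomationRequest, isAutomationRequest and handleAutomationRequest are invented for illustration and are not part of the Automation API.

```typescript
// Hypothetical message shape; not defined by the Automation API.
interface AutomationRequest {
  kind: 'automation';
  method: string;
  args: unknown[];
}

// Narrow an arbitrary incoming message to an AutomationRequest.
function isAutomationRequest(msg: unknown): msg is AutomationRequest {
  if (typeof msg !== 'object' || msg === null) return false;
  const m = msg as Record<string, unknown>;
  return m.kind === 'automation' && typeof m.method === 'string' && Array.isArray(m.args);
}

// Look up the requested method on the automation object and call it.
// The automation object is passed in so the function can be exercised without a browser.
async function handleAutomationRequest(
  automation: Record<string, unknown>,
  msg: unknown,
): Promise<unknown> {
  if (!isAutomationRequest(msg)) return undefined; // not meant for this handler
  const fn = automation[msg.method];
  if (typeof fn !== 'function') {
    throw new Error(`Unknown automation method: ${msg.method}`);
  }
  return fn.apply(automation, msg.args);
}

// Content script (sketch):
//   browser.runtime.onMessage.addListener((msg) =>
//     handleAutomationRequest(browser.webfuseSession.automation, msg));
//
// Background script or popup (sketch), with tabId identifying the target tab:
//   await browser.tabs.sendMessage(tabId, {
//     kind: 'automation', method: 'left_click', args: ['#submit'],
//   });
```

Messages are serialized in transit, so a relay like this can only carry CSS selector or coordinate targets, not Element references.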
Targeting
By default, automation actions work on the element below the virtual mouse pointer. A target can optionally be specified in one of three ways: as an Element reference, as a CSS selector (selector: string), or as absolute point coordinates ([x: number, y: number]).
type Target = Element | string | [number, number]; // Element reference, CSS selector, or point coordinates
mouse_move()
Move the virtual mouse pointer.
browser.webfuseSession.automation.mouse_move(
  target: Target,
  persistent: boolean = false
): Promise<void>
Parameters
target
- Mouse pointer target.
persistent
- Whether to keep the mouse pointer on screen (hides after some time by default).
Returns
A promise that resolves once the mouse pointer has been moved.
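As a rough usage sketch, the helper below moves the pointer through a list of targets and keeps it visible. The names PointerTarget, PointerAutomation and hoverPath are hypothetical, and the automation object is passed in as a parameter so the code can also run outside a browser.

```typescript
// Element references are left out of this type so the sketch also runs outside a browser.
type PointerTarget = string | [number, number];

// Only the method used here; mirrors the mouse_move() signature above.
interface PointerAutomation {
  mouse_move(target: PointerTarget, persistent?: boolean): Promise<void>;
}

// Move the virtual pointer to each target in order and keep it on screen.
// Returns the number of moves performed.
async function hoverPath(
  automation: PointerAutomation,
  targets: PointerTarget[],
): Promise<number> {
  let moves = 0;
  for (const target of targets) {
    await automation.mouse_move(target, true); // persistent = true
    moves += 1;
  }
  return moves;
}

// In a content script:
//   await hoverPath(browser.webfuseSession.automation, ['#menu', [100, 250]]);
```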
scroll()
Scrolls the deepest scrollable element under the target by the given amount in the given direction.
browser.webfuseSession.automation.scroll(
  target: Target,
  direction: 'vertical' | 'horizontal',
  amount: number
): Promise<void>
Parameters
target
- Scroll(able) target.
direction
- The direction to scroll.
amount
- The number of pixels to scroll.
Returns
A promise that resolves once scrolling has ended.
Example
// Arguments follow the signature above: target, direction, amount.
await browser.webfuseSession.automation.scroll('#scrollable', 'vertical', 100);
left_click()
Perform a left (primary) mouse button click.
browser.webfuseSession.automation.left_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>
Parameters
target
- Click target.
[moveMouse]
- Whether to optionally move the virtual mouse pointer to the target center before the click.
Returns
A promise that resolves once the click has been performed.
Example
await browser.webfuseSession.automation.left_click([100, 250]);
middle_click()
Perform a middle (wheel) mouse button click.
browser.webfuseSession.automation.middle_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>
Parameters
target
- Click target.
[moveMouse]
- Whether to optionally move the virtual mouse pointer to the target center before the click.
Returns
A promise that resolves once the click has been performed.
right_click()
Perform a right (secondary) mouse button click.
browser.webfuseSession.automation.right_click(
  target: Target,
  moveMouse: boolean = false
): Promise<void>
Parameters
target
- Click target.
[moveMouse]
- Whether to optionally move the virtual mouse pointer to the target center before the click.
Returns
A promise that resolves once the click has been performed.
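Since the three click methods share one signature, a caller can pick the method from a button name. The following is a small sketch of that idea; ClickTarget, MouseButton, ClickAutomation and click are hypothetical names, and the automation object is injected so the helper can be tried without a browser.

```typescript
// Element references are left out of this type so the sketch also runs outside a browser.
type ClickTarget = string | [number, number];
type MouseButton = 'left' | 'middle' | 'right';

// Mirrors the three click signatures documented above.
interface ClickAutomation {
  left_click(target: ClickTarget, moveMouse?: boolean): Promise<void>;
  middle_click(target: ClickTarget, moveMouse?: boolean): Promise<void>;
  right_click(target: ClickTarget, moveMouse?: boolean): Promise<void>;
}

// Dispatch to the click method that matches the requested button.
// moveMouse defaults to true here so the pointer visibly travels to the target first.
function click(
  automation: ClickAutomation,
  button: MouseButton,
  target: ClickTarget,
  moveMouse = true,
): Promise<void> {
  switch (button) {
    case 'left':
      return automation.left_click(target, moveMouse);
    case 'middle':
      return automation.middle_click(target, moveMouse);
    case 'right':
      return automation.right_click(target, moveMouse);
  }
}

// In a content script:
//   await click(browser.webfuseSession.automation, 'right', '#context-menu-anchor');
```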
type()
browser.webfuseSession.automation.type(
  target: Target,
  text: string,
  moveMouse: boolean = false,
  overwrite: boolean = false,
  timePerChar: number = 100
): Promise<void>
Type text into an element. Typing is natural, i.e. as if a human presses a sequence of keys.
Parameters
target
- Typing target.
text
- Text to type.
[moveMouse]
- Whether to first move the virtual mouse pointer to the center of the target.
[overwrite]
- Whether to overwrite the current value of the target element.
[timePerChar]
- Average time to type a character in milliseconds.
Returns
A promise that resolves once the text has been typed.
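The sketch below replaces the value of a field and returns a rough duration estimate derived from timePerChar. Because timePerChar is documented as an average, the estimate is only approximate. The names TypeAutomation, estimateTypingMs and replaceFieldValue are hypothetical.

```typescript
// Only the method used here; mirrors the type() signature above (selector targets only).
interface TypeAutomation {
  type(
    target: string,
    text: string,
    moveMouse?: boolean,
    overwrite?: boolean,
    timePerChar?: number,
  ): Promise<void>;
}

// Rough typing time in milliseconds: characters (UTF-16 code units) times the average per-character time.
function estimateTypingMs(text: string, timePerChar = 100): number {
  return text.length * timePerChar;
}

// Move the pointer to the field, overwrite its current value, and report the estimated duration.
async function replaceFieldValue(
  automation: TypeAutomation,
  selector: string,
  text: string,
  timePerChar = 50,
): Promise<number> {
  await automation.type(selector, text, true, true, timePerChar);
  return estimateTypingMs(text, timePerChar);
}

// In a content script:
//   await replaceFieldValue(browser.webfuseSession.automation, '#email', 'jane@example.com');
```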
key_press()
browser.webfuseSession.automation.key_press(
  target: Target,
  key: "a" | "b" | ... | "Y" | "Z",
  options?: {
    altKey?: boolean;
    ctrlKey?: boolean;
    metaKey?: boolean;
    shiftKey?: boolean;
  }
): Promise<void>
Press a key on an element.
Parameters
target
- Key press target.
key
- Key to press.
[options]
- Booleans that hold down a modifier key during the press: alt (altKey), ctrl (ctrlKey), meta (metaKey), or shift (shiftKey).
Returns
A promise that resolves once the key has been pressed.
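One way to build the options object is to parse a shortcut string such as "Ctrl+Shift+k". The sketch below does this under the assumption that the last segment is the key and every earlier segment is a modifier. parseShortcut and KeyPressOptions are hypothetical names, and the key is typed as a plain string because the full key union is not reproduced here.

```typescript
// Same fields as the options parameter of key_press().
interface KeyPressOptions {
  altKey?: boolean;
  ctrlKey?: boolean;
  metaKey?: boolean;
  shiftKey?: boolean;
}

// Maps a modifier name in the shortcut string to its option flag.
const MODIFIER_FLAGS = new Map<string, keyof KeyPressOptions>([
  ['alt', 'altKey'],
  ['ctrl', 'ctrlKey'],
  ['meta', 'metaKey'],
  ['shift', 'shiftKey'],
]);

// Split "Ctrl+Shift+k" into the key ("k") and the modifier flags to hold down.
function parseShortcut(shortcut: string): { key: string; options: KeyPressOptions } {
  const parts = shortcut.split('+').map((part) => part.trim());
  const key = parts.pop() ?? '';
  if (key === '') throw new Error('Shortcut has no key');
  const options: KeyPressOptions = {};
  for (const part of parts) {
    const flag = MODIFIER_FLAGS.get(part.toLowerCase());
    if (flag === undefined) throw new Error(`Unknown modifier: ${part}`);
    options[flag] = true;
  }
  return { key, options };
}

// In a content script (the cast is needed because key_press() expects its own key union):
//   const { key, options } = parseShortcut('Ctrl+Shift+k');
//   await browser.webfuseSession.automation.key_press('#editor', key as any, options);
```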
wait()
browser.webfuseSession.automation.wait(ms: number): Promise<void>
Parameters
ms
- The number of milliseconds to wait.
Returns
A promise that resolves once the given time has passed.
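A common use of wait() is to pause between checks until the page reaches some state. The helper below is a sketch of that polling pattern; waitUntil and WaitAutomation are hypothetical names, and the condition is any synchronous check supplied by the caller.

```typescript
// Only the method used here; mirrors the wait() signature above.
interface WaitAutomation {
  wait(ms: number): Promise<void>;
}

// Re-check a condition, waiting intervalMs between checks, for at most maxAttempts waits.
// Resolves to true as soon as the condition holds, otherwise to the result of a final check.
async function waitUntil(
  automation: WaitAutomation,
  condition: () => boolean,
  intervalMs = 200,
  maxAttempts = 25,
): Promise<boolean> {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    if (condition()) return true;
    await automation.wait(intervalMs);
  }
  return condition();
}

// In a content script:
//   const ready = await waitUntil(
//     browser.webfuseSession.automation,
//     () => document.querySelector('#results') !== null,
//   );
```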
take_dom_snapshot()
browser.webfuseSession.automation.take_dom_snapshot(options?: {
  rootSelector?: string;
  crossframe?: boolean;
  revealMaskedElements?: boolean;
  modifier?: "downsample" | {
    name: string;
    params?: unknown[];
  };
}): Promise<void>
Serialize the DOM for various processing purposes, such as for LLM input.
Parameters
[options]
- DOM snapshot options:
  - [rootSelector] Selector of the element to designate as the snapshot root (documentElement by default).
  - [crossframe] Whether to include iframe subtrees (false by default).
  - [revealMaskedElements] Whether to include masked elements (false by default).
  - [modifier] Snapshot modifier (identity by default).
    - name Modifier name.
    - [params] Modifier parameter record.
Returns
A promise that resolves to the snapshot.
Modifiers
downsample
Reduce the overall DOM to below about 32K (2^15) estimated LLM input tokens.
The resulting DOM can be considered a low-resolution variant that retains the majority of inherent UI features.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    rootSelector: '#app',
    modifier: 'downsample',
  })
D2Snap
Applies the D2Snap algorithm to the DOM. This reduces its size while retaining a majority of UI features. The algorithm was developed to mitigate the prevalent DOM token size disadvantage.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'D2Snap',
      params: {
        hierarchyRatio: 0.4,
        textRatio: 0.6,
        attributeRatio: 0.8,
        // or
        // k: 0.4, l: 0.6, m: 0.8
        options: {
          assignUniqueIDs: false,
          keepUnknownElements: false,
          skipMarkdownTranslation: false,
        }
      }
    }
  })
[modifier.params.options]
- [assignUniqueIDs] Whether to add a unique data attribute data-uid to every element in the DOM in order to allow identification of equivalent elements across the original and the downsampled DOM. For example, <button class="btn btn-primary" data-uid="27">Click here!</button>.
- [keepUnknownElements] Whether to keep unknown (custom) elements in the downsampled DOM.
- [skipMarkdownTranslation] Whether to skip content HTML to Markdown translation.
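With assignUniqueIDs enabled, an element picked from the downsampled snapshot can be addressed in the page through its data-uid value. The sketch below builds a CSS attribute selector for that purpose; it assumes the attribute is also present on the live element, as the option description suggests. uidSelector is a hypothetical helper name.

```typescript
// Build a CSS attribute selector for a data-uid value, e.g. 27 -> [data-uid="27"].
// Double quotes and backslashes are escaped so the value stays inside the quoted string.
function uidSelector(uid: string | number): string {
  const escaped = String(uid).replace(/["\\]/g, '\\$&');
  return `[data-uid="${escaped}"]`;
}

// In a content script, after a snapshot was taken with assignUniqueIDs: true
// and some later step (for example an LLM) picked the element with data-uid 27:
//   await browser.webfuseSession.automation.left_click(uidSelector(27));
```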
AdaptiveD2Snap
Applies the AdaptiveD2Snap algorithm to the DOM. This is an adaptive version of the D2Snap algorithm that does not require explicit parameters.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'AdaptiveD2Snap',
      params: {
        maxTokens: 32768,
        maxIterations: 3,
        options: {
          assignUniqueIDs: false,
          skipMarkdownTranslation: false,
        }
      }
    }
  })
take_gui_snapshot()
Serialize the GUI for various processing purposes, such as for LLM input.
browser.webfuseSession.automation.take_gui_snapshot(): Promise<ImageBitmap>
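The returned ImageBitmap exposes width and height, so a caller that needs a smaller image (for instance to keep LLM input small) can compute target dimensions before drawing it to a canvas. The following is a sketch under that assumption. fitWithin and the 1024 pixel limit are illustrative choices, and the commented part relies on the standard OffscreenCanvas API rather than anything specific to the Automation API.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale a size down (never up) so that its longer side is at most maxSide,
// preserving the aspect ratio.
function fitWithin(size: Size, maxSide: number): Size {
  const scale = Math.min(1, maxSide / Math.max(size.width, size.height));
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

// In a content script (sketch):
//   const bitmap = await browser.webfuseSession.automation.take_gui_snapshot();
//   const { width, height } = fitWithin(bitmap, 1024);
//   const canvas = new OffscreenCanvas(width, height);
//   canvas.getContext('2d')!.drawImage(bitmap, 0, 0, width, height);
//   const blob = await canvas.convertToBlob({ type: 'image/png' });
```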