Automation API
The automation object is available in the browser.webfuseSession namespace in every content script of an extension. To call it from a background script or a popup, send a message to the content script and have the content script call the Automation API method.
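The relay from a background script or popup can be sketched as follows. This is a minimal sketch, not the definitive wiring: the message shape `AutomationMessage`, the interface `AutomationLike`, and the helper `dispatchAutomationMessage` are hypothetical names, and the commented wiring assumes the standard WebExtension messaging calls (`browser.runtime.onMessage`, `browser.tabs.sendMessage`) are available to the extension.

```typescript
type Target = string | [number, number];

// Hypothetical subset of the automation object used by this sketch.
interface AutomationLike {
  left_click(target?: Target): Promise<void>;
  type(text: string, target?: Target): Promise<void>;
}

// Hypothetical message shape sent by a background script or popup.
type AutomationMessage =
  | { action: "left_click"; target?: Target }
  | { action: "type"; text: string; target?: Target };

// Runs in the content script: maps an incoming message to an automation call.
function dispatchAutomationMessage(
  automation: AutomationLike,
  message: AutomationMessage,
): Promise<void> {
  switch (message.action) {
    case "left_click":
      return automation.left_click(message.target);
    case "type":
      return automation.type(message.text, message.target);
  }
}

// Content script wiring (assumes WebExtension messaging is available):
//   browser.runtime.onMessage.addListener((message) =>
//     dispatchAutomationMessage(browser.webfuseSession.automation, message));
//
// Background script or popup (tabId is a placeholder):
//   await browser.tabs.sendMessage(tabId, { action: "left_click", target: "#submit" });
```

Keeping the dispatcher separate from the listener makes it possible to exercise the mapping with a stand-in automation object.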
Targeting
By default, automation actions work on the element below the virtual mouse pointer. A target can optionally be specified either as a CSS selector (selector: string) or as absolute point coordinates ([x: number, y: number]).
type Target = string | [number, number]; // CSS selector or point coordinate
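The two target forms can be told apart at runtime with a small type guard. The helpers `isPointTarget` and `describeTarget` below are hypothetical and only illustrate how the `Target` union narrows; they are not part of the Automation API.

```typescript
type Target = string | [number, number]; // CSS selector or point coordinate

// Hypothetical helper: true when the target is an [x, y] point.
function isPointTarget(target: Target): target is [number, number] {
  return Array.isArray(target);
}

// Hypothetical helper: human-readable description, e.g. for logging.
function describeTarget(target?: Target): string {
  if (target === undefined) return "element under the virtual mouse pointer";
  return isPointTarget(target)
    ? `point (${target[0]}, ${target[1]})`
    : `selector "${target}"`;
}

const bySelector: Target = "#login-button";
const byPoint: Target = [100, 250];
```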
mouse_move()
Move the virtual mouse pointer. The pointer can be used as an implicit target specifier for subsequent actions.
browser.webfuseSession.automation.mouse_move(target: Target): Promise<void>
Parameters
target - Mouse pointer target.
Returns
A promise that resolves once the mouse has been moved.
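The implicit targeting described above can be sketched as a move followed by a click without a target. The interface `PointerAutomation` and the helper `moveThenClick` are hypothetical names; the automation object is passed in as a parameter so the sketch stays self-contained.

```typescript
type Target = string | [number, number];

// Hypothetical subset of the automation object used by this sketch.
interface PointerAutomation {
  mouse_move(target: Target): Promise<void>;
  left_click(target?: Target): Promise<void>;
}

// Move the pointer first, then click without a target so that the click
// falls on the element under the virtual mouse pointer.
async function moveThenClick(
  automation: PointerAutomation,
  target: Target,
): Promise<void> {
  await automation.mouse_move(target);
  await automation.left_click();
}

// In a content script:
//   await moveThenClick(browser.webfuseSession.automation, "#menu");
```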
scroll()
Scrolls the deepest scrollable element under the current position of the virtual mouse pointer by the given amount in the given direction. If a target is provided, the element it designates is scrolled instead.
browser.webfuseSession.automation.scroll(direction: 'vertical' | 'horizontal', amount: number, target?: Target): Promise<void>
Parameters
direction - The direction to scroll.
amount - The number of pixels to scroll.
[target] - Scroll(able) target.
Returns
A promise that resolves once scrolling has ended.
Example
await browser.webfuseSession.automation.scroll('vertical', 100, '#scrollable');
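Longer distances can be covered in several smaller scrolls with a pause in between. The following is a minimal sketch under stated assumptions: `ScrollAutomation`, `scrollInSteps`, the step size, and the 200 ms default pause are hypothetical choices, not values prescribed by the API.

```typescript
type Target = string | [number, number];

// Hypothetical subset of the automation object used by this sketch.
interface ScrollAutomation {
  scroll(
    direction: "vertical" | "horizontal",
    amount: number,
    target?: Target,
  ): Promise<void>;
  wait(ms: number): Promise<void>;
}

// Scroll vertically by totalPx in steps of at most stepPx,
// waiting pauseMs after every step.
async function scrollInSteps(
  automation: ScrollAutomation,
  totalPx: number,
  stepPx: number,
  target?: Target,
  pauseMs = 200,
): Promise<void> {
  if (stepPx <= 0) throw new Error("stepPx must be positive");
  let remaining = totalPx;
  while (remaining > 0) {
    const amount = Math.min(stepPx, remaining);
    await automation.scroll("vertical", amount, target);
    await automation.wait(pauseMs);
    remaining -= amount;
  }
}

// In a content script:
//   await scrollInSteps(browser.webfuseSession.automation, 250, 100, "#scrollable");
```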
left_click()
Perform a left (primary) mouse button click.
browser.webfuseSession.automation.left_click(target?: Target): Promise<void>
Parameters
[target] - Click target.
Returns
A promise that resolves once the click has been performed.
Example
await browser.webfuseSession.automation.left_click([100, 250]);
middle_click()
Perform a middle (wheel) mouse button click.
browser.webfuseSession.automation.middle_click(target?: Target): Promise<void>
Parameters
[target] - Click target.
Returns
A promise that resolves once the click has been performed.
right_click()
Perform a right (secondary) mouse button click.
browser.webfuseSession.automation.right_click(target?: Target): Promise<void>
Parameters
[target] - Click target.
Returns
A promise that resolves once the click has been performed.
type()
browser.webfuseSession.automation.type(text: string, target?: Target): Promise<void>
Type text into an element. Typing is natural, i.e. as if a human pressed a sequence of keys.
Parameters
text - Text to type.
[target] - Typing target.
Returns
A promise that resolves once the text has been typed.
key_press()
browser.webfuseSession.automation.key_press(key: "a" | "b" | ... | "Y" | "Z", options?: { altKey?: boolean; ctrlKey?: boolean; metaKey?: boolean; shiftKey?: boolean }, target?: Target): Promise<void>
Press a key on an element.
Parameters
key - Key to press.
[options] - Booleans that hold down a modifier key during the press: alt, ctrl, meta, or shift (altKey, ctrlKey, metaKey, shiftKey).
[target] - Key press target.
Returns
A promise that resolves once the key has been pressed.
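A key press with a modifier can be combined with type(), for example to replace the content of a text field. This is a sketch under assumptions: `KeyboardAutomation` and `replaceText` are hypothetical names, the key is typed as a plain string because the full key union is elided above, and it is assumed (not confirmed by this page) that Ctrl+A selects the field's content so that the typed text replaces it.

```typescript
type Target = string | [number, number];

interface KeyModifiers {
  altKey?: boolean;
  ctrlKey?: boolean;
  metaKey?: boolean;
  shiftKey?: boolean;
}

// Hypothetical subset of the automation object used by this sketch.
interface KeyboardAutomation {
  // key is a plain string here because the documented key union is elided.
  key_press(key: string, options?: KeyModifiers, target?: Target): Promise<void>;
  type(text: string, target?: Target): Promise<void>;
}

// Assumption: Ctrl+A selects the existing content of the target field,
// so the subsequent typing replaces it.
async function replaceText(
  automation: KeyboardAutomation,
  target: Target,
  text: string,
): Promise<void> {
  await automation.key_press("a", { ctrlKey: true }, target);
  await automation.type(text, target);
}

// In a content script:
//   await replaceText(browser.webfuseSession.automation, "#name", "Ada");
```

On platforms where the select-all shortcut uses the meta key, `metaKey: true` would take the place of `ctrlKey: true`.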
wait()
browser.webfuseSession.automation.wait(ms: number): Promise<void>
Parameters
ms - The number of milliseconds to wait.
Returns
A promise that resolves once the given time has passed.
take_dom_snapshot()
browser.webfuseSession.automation.take_dom_snapshot(options?: { rootSelector?: string; crossframe?: boolean; revealMaskedElements?: boolean; modifier?: "downsample" | { name: string; params?: Record<string, unknown> } }): Promise<void>
Serialize the DOM for various processing purposes, such as for LLM input.
Parameters
[options] - DOM snapshot options:
- [rootSelector] - Selector of the element to designate as the snapshot root (documentElement by default).
- [crossframe] - Whether to include iframe subtrees (false by default).
- [revealMaskedElements] - Whether to include masked elements (false by default).
- [modifier] - Snapshot modifier (identity by default).
  - name - Modifier name.
  - [params] - Modifier parameter record.
Returns
A promise that resolves to the snapshot.
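The options above can be combined in a single call. The sketch below is illustrative only: `DomSnapshotOptions`, `SnapshotAutomation`, and `snapshotSubtree` are hypothetical names, `params` is typed as a record to match the modifier examples, and the snapshot's concrete type is left as `unknown` because it is not spelled out here.

```typescript
// Mirrors the documented options; params is typed as a record (assumption).
interface DomSnapshotOptions {
  rootSelector?: string;
  crossframe?: boolean;
  revealMaskedElements?: boolean;
  modifier?: "downsample" | { name: string; params?: Record<string, unknown> };
}

// Hypothetical subset of the automation object used by this sketch.
interface SnapshotAutomation {
  // The snapshot type is not stated in this reference, so it stays unknown.
  take_dom_snapshot(options?: DomSnapshotOptions): Promise<unknown>;
}

// Snapshot one subtree including iframes, keep masked elements hidden,
// and apply the built-in downsample modifier.
function snapshotSubtree(
  automation: SnapshotAutomation,
  rootSelector: string,
): Promise<unknown> {
  return automation.take_dom_snapshot({
    rootSelector,
    crossframe: true,
    revealMaskedElements: false,
    modifier: "downsample",
  });
}

// In a content script:
//   const snapshot = await snapshotSubtree(browser.webfuseSession.automation, "#app");
```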
Modifiers
downsample
Reduce the overall DOM to below 2^13 (8192) estimated LLM input tokens.
The resulting DOM can be considered a low-resolution variant that retains the majority of inherent UI features.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    rootSelector: '#app',
    modifier: 'downsample',
  });
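To get a feel for the 2^13 token budget, a rough size check can be sketched as below. This is not how the modifier estimates tokens: the 4-characters-per-token ratio is a common rule of thumb used here as an assumption, the snapshot is assumed to serialize to a string, and `estimateTokens` and `needsDownsampling` are hypothetical helpers.

```typescript
// 2^13 estimated LLM input tokens, the budget named for the downsample modifier.
const DOWNSAMPLE_TOKEN_BUDGET = 2 ** 13; // 8192

// Assumption: roughly 4 characters per token; real tokenizers differ.
const ASSUMED_CHARS_PER_TOKEN = 4;

// Rough token estimate for a serialized DOM string.
function estimateTokens(serializedDom: string): number {
  return Math.ceil(serializedDom.length / ASSUMED_CHARS_PER_TOKEN);
}

// True when the estimate is not yet below the budget.
function needsDownsampling(serializedDom: string): boolean {
  return estimateTokens(serializedDom) >= DOWNSAMPLE_TOKEN_BUDGET;
}
```

Under this heuristic, the budget corresponds to roughly 32,768 characters of serialized DOM.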
D2Snap
Applies the D2Snap algorithm to the DOM. This reduces its size while retaining a majority of UI features. The algorithm was developed to mitigate the main disadvantage of DOM snapshots as LLM input: their large token size.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'D2Snap',
      params: {
        hierarchyRatio: 0.4, textRatio: 0.6, attributeRatio: 0.8
        // or
        // k: 0.4, l: 0.6, m: 0.8
      }
    }
  });
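The two parameter spellings shown in the example can be normalized before building the modifier. This sketch assumes, based only on the order in the example, that k, l, and m correspond to hierarchyRatio, textRatio, and attributeRatio respectively; `toLongParams` and `d2snapModifier` are hypothetical helpers.

```typescript
interface D2SnapLongParams {
  hierarchyRatio: number;
  textRatio: number;
  attributeRatio: number;
}

interface D2SnapShortParams {
  k: number;
  l: number;
  m: number;
}

// Assumption: k, l, m map to hierarchyRatio, textRatio, attributeRatio in that order.
function toLongParams(
  params: D2SnapLongParams | D2SnapShortParams,
): D2SnapLongParams {
  if ("k" in params) {
    return {
      hierarchyRatio: params.k,
      textRatio: params.l,
      attributeRatio: params.m,
    };
  }
  return params;
}

// Build the modifier option for take_dom_snapshot() from either spelling.
function d2snapModifier(params: D2SnapLongParams | D2SnapShortParams) {
  return { name: "D2Snap", params: toLongParams(params) };
}

// In a content script:
//   await browser.webfuseSession.automation.take_dom_snapshot({
//     modifier: d2snapModifier({ k: 0.4, l: 0.6, m: 0.8 }),
//   });
```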
AdaptiveD2Snap
Applies the AdaptiveD2Snap algorithm to the DOM. This is an adaptive version of the D2Snap algorithm that does not require explicit ratio parameters.
const domSnapshot = await browser.webfuseSession
  .automation
  .take_dom_snapshot({
    modifier: {
      name: 'AdaptiveD2Snap',
      params: {
        maxTokens: 32768,
        maxIterations: 3
      }
    }
  });
take_gui_snapshot()
Serialize the GUI for various processing purposes, such as for LLM input.
browser.webfuseSession.automation.take_gui_snapshot(): Promise<ImageBitmap>
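A DOM snapshot and a GUI snapshot can be captured together when both are needed as LLM input. The following is a minimal sketch: `BitmapLike` is a structural stand-in for `ImageBitmap` so the code type-checks without DOM typings, the DOM snapshot type is left as `unknown`, and `StateAutomation`, `PageState`, and `capturePageState` are hypothetical names.

```typescript
// Structural stand-in for ImageBitmap (which exposes width and height).
interface BitmapLike {
  width: number;
  height: number;
}

// Hypothetical subset of the automation object used by this sketch.
interface StateAutomation {
  take_dom_snapshot(): Promise<unknown>;
  take_gui_snapshot(): Promise<BitmapLike>;
}

interface PageState {
  dom: unknown;
  gui: BitmapLike;
}

// Capture both snapshots one after the other and return them together.
async function capturePageState(
  automation: StateAutomation,
): Promise<PageState> {
  const dom = await automation.take_dom_snapshot();
  const gui = await automation.take_gui_snapshot();
  return { dom, gui };
}

// In a content script:
//   const state = await capturePageState(browser.webfuseSession.automation);
//   console.log(state.gui.width, state.gui.height);
```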