Snapshots

A snapshot is a metaphor for the serialization of a web page’s current state. A snapshot can be used as LLM context: given a snapshot, an LLM can analyze the respective web page, answer questions about it, and even suggest helpful interactions. Web page snapshots represent a

A GUI snapshot corresponds to a screenshot. It captures the web page visually, just as humans primarily perceive it. Research suggests that GUI snapshots alone are a weak form of LLM input for web agents. Nevertheless, Webfuse’s Automation API provides a method to take a GUI snapshot:

const screenshot = await browser.webfuseSession.automation.take_gui_snapshot();

A DOM snapshot captures a web page at the code level. It serializes the page as HTML, which plays to the strong code interpretation abilities of LLMs. Webfuse provides cutting-edge DOM snapshot capabilities that boost the success of LLM-based web agents. In its simplest form, a DOM snapshot can be taken as follows:

const html = await browser.webfuseSession.automation.take_dom_snapshot();
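
To illustrate how a DOM snapshot can serve as LLM context, here is a minimal sketch that wraps the snapshot HTML and a question into a single prompt string. The helper `buildSnapshotPrompt`, its `maxChars` parameter, and the prompt wording are hypothetical and not part of Webfuse. The sketch also assumes that `take_dom_snapshot()` resolves to an HTML string, as the variable name above suggests.

```javascript
// Hypothetical helper (not part of the Webfuse API): combines a DOM snapshot
// and a user question into one prompt string for an LLM.
function buildSnapshotPrompt(html, question, maxChars = 20000) {
  // Naive character budget; real limits depend on the model's tokenizer.
  const truncated = html.length > maxChars;
  const context = truncated ? html.slice(0, maxChars) : html;

  const lines = ["You are given the serialized DOM of a web page."];
  if (truncated) {
    lines.push("Note: the snapshot was truncated to fit the context budget.");
  }
  lines.push("<snapshot>", context, "</snapshot>", `Question: ${question}`);
  return lines.join("\n");
}

// Usage where `browser` is available (assumes the snapshot is an HTML string):
// const html = await browser.webfuseSession.automation.take_dom_snapshot();
// const prompt = buildSnapshotPrompt(html, "What is the main call to action?");
```

Cutting the snapshot at a fixed character count is the simplest possible budget strategy and may split the HTML mid-element. It is shown only to make the context limit explicit.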