
ZenRows Scraping Browser: How to Build an AI Browser Agent

A hands-on guide to using ZenRows Scraping Browser for AI-driven web navigation, extraction, proxy automation, and concurrent remote browsing with Puppeteer.

In this guide, we’ll go over the basics of using ZenRows Scraping Browser. As we go, we’ll build a minimal AI agent that can navigate the web using a remote browser in real time. Familiarity with JavaScript is helpful but not required.

By the time you’re finished, you’ll be able to do the following.

  • Create a JavaScript project
  • Connect to a remote browser using the ZenRows browser SDK
  • Use basic browser tools with Puppeteer
  • Connect an AI agent to your remote browser instance

What is ZenRows Scraping Browser?

ZenRows Scraping Browser is a remote browser API. It allows users to open a headless browser instance using Playwright or Puppeteer. Their browser comes with automated proxy management, CAPTCHA avoidance and concurrent browsing to make scaling easy.

Let’s look at these features in detail.

  • Remote browsing: Rather than running a browser on a local machine, users and AI agents can run browsers in the cloud for stable and reliable connections. If the power goes out at your office, your web data infrastructure keeps running.
  • Proxy management: Scraping Browser comes with built-in proxy integration. Proxy connections are selected and rotated by ZenRows automatically. This helps provide stable access.
  • CAPTCHA avoidance: Scraping Browser does not solve CAPTCHAs by default; it avoids them. This can lead to some difficulties, but ZenRows does offer a third-party CAPTCHA-solving integration.
  • Concurrency: This is perhaps the biggest one. Even on their lowest tier plan, Developer, users can run up to 20 browsers concurrently. At their highest tier, Business 500, Scraping Browser supports up to 150 browsers running concurrently.
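Since every plan caps how many browsers can run at once, it helps to throttle your own sessions to stay under that limit. Below is a minimal, hypothetical sketch of a concurrency limiter in plain JavaScript; the function name and shape are our own, and in practice each task would open and close one Scraping Browser session.

```javascript
// Run async tasks with at most `limit` in flight at once.
// Each task here would wrap one remote browser session.
async function runWithLimit(tasks, limit) {
  const results = new Array(tasks.length);
  let next = 0;

  async function worker() {
    while (next < tasks.length) {
      const i = next++; // claim the next task index
      results[i] = await tasks[i]();
    }
  }

  const workers = Array.from(
    { length: Math.min(limit, tasks.length) },
    () => worker()
  );
  await Promise.all(workers);
  return results;
}
```

With a Developer plan, for example, you’d call `runWithLimit(tasks, 20)` so no more than 20 sessions run at once.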

When a team needs browser automation, the ZenRows Scraping Browser provides them with automated remote browsing — built on top of strong proxy network infrastructure.

Getting started

To get started, you’ll need a ZenRows account and an OpenAI account. Also, make sure you have Node.js installed — we need it to run our JavaScript project. Once you’ve got your credentials and Node.js, we can create the basic setup for our project.

Once you’ve got a ZenRows account, navigate to their Scraping Browser page to retrieve your API key. You can also copy the WebSocket connection URL directly if you prefer.

You’ll also need an OpenAI developer account. You can create an account at their API page. Once you’ve got an account, you can go to their API keys page to manage your credentials.

OpenAI API keys page
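Hard-coding keys works for a quick demo, but it’s safer to load them from environment variables. A small helper might look like this; the variable names `OPENAI_API_KEY` and `ZENROWS_API_KEY` are just examples, not anything required by either service.

```javascript
// Read an API key from the environment, failing loudly if it's missing.
function requireEnv(name) {
  const value = process.env[name];
  if (!value) {
    throw new Error(`Missing required environment variable: ${name}`);
  }
  return value;
}

// Usage, assuming the variables are set in your shell:
// const OPENAI_KEY = requireEnv("OPENAI_API_KEY");
// const ZENROWS_KEY = requireEnv("ZENROWS_API_KEY");
```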

Create a new project folder.

mkdir zenrows-demo

Navigate to the new folder.

cd zenrows-demo

Initialize the project using NPM. The -y flag is optional. Here, we’re just using it to skip the interactive setup.

npm init -y

Add our required dependencies. We’re using OpenAI, Puppeteer and the ZenRows SDK.

npm install openai puppeteer-core @zenrows/browser-sdk

Building a remote browser agent

Now, everything’s ready for us to start building. In these next few sections, we’ll go through the code required to build our remote browser agent. We’ll start with a function for running the browser. Then, we’ll create some tools. When we’re done with this, we’ll write a basic runtime for our AI agent.

Running the browser

ScrapingBrowser lets us create a new instance of Scraping Browser. We connect using Puppeteer and a WebSocket connection. WebSockets give client-side software a persistent, two-way connection to the server.

We open a new page in the browser and return the page. If Scraping Browser runs into an error, we return the error message.

//function to use Scraping Browser
async function run(fn) {
  const ws = new ScrapingBrowser({ apiKey: ZENROWS_KEY }).getConnectURL();

  const browser = await puppeteer.connect({
    browserWSEndpoint: ws,
    defaultViewport: null,
    ignoreHTTPSErrors: true
  });

  const page = await browser.newPage();
  let out;

  try {
    out = await fn(page);
  } catch (err) {
    out = `ERROR: ${err.message}`;
  }

  await browser.close();
  return out;
}

Creating tools

This is where the AI agent magic comes from. tools is a plain JavaScript object with three tools: goto, extract and click. Each tool is a field within the tools object, and each comes with a description so our AI agent can understand what the tool does.

//browser tools for the agent to call
const tools = {
  goto: {
    description: "Navigate to a URL and return the page HTML.",
    parameters: {
      type: "object",
      properties: { url: { type: "string" } },
      required: ["url"]
    },
    func: async ({ url }) =>
      run(async (p) => {
        await p.goto(url, { waitUntil: "networkidle0" });
        return await p.content();
      })
  },

  extract: {
    description: "Extract visible text from a CSS selector.",
    parameters: {
      type: "object",
      properties: { selector: { type: "string" } },
      required: ["selector"]
    },
    func: async ({ selector }) =>
      run(async (p) => {
        try {
          await p.waitForSelector(selector, { timeout: 3000 });
          return await p.$eval(selector, (el) => el.textContent.trim());
        } catch {
          return `ERROR: selector '${selector}' not found.`;
        }
      })
  },

  click: {
    description: "Click a selector and return the updated HTML.",
    parameters: {
      type: "object",
      properties: { selector: { type: "string" } },
      required: ["selector"]
    },
    func: async ({ selector }) =>
      run(async (p) => {
        try {
          await p.waitForSelector(selector, { timeout: 3000 });
          await p.click(selector);
          await new Promise((r) => setTimeout(r, 1000)); //waitForTimeout was removed in newer Puppeteer versions
          return await p.content();
        } catch {
          return `ERROR: cannot click selector '${selector}'.`;
        }
      })
  }
};
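One thing worth guarding against: the model generates tool arguments as free-form JSON, so a required field can occasionally be missing. A tiny validator (our own addition, not part of any SDK) can check the parsed arguments against a tool’s required list before the tool runs:

```javascript
// Check parsed arguments against a tool's `required` list.
// Returns an error string in the same style as the tools above, or null if OK.
function validateArgs(tool, args) {
  const missing = (tool.parameters.required || []).filter(
    (key) => !(key in args)
  );
  return missing.length
    ? `ERROR: missing required argument(s): ${missing.join(", ")}`
    : null;
}
```

In the runtime, you could call this right after JSON.parse and, if it returns a string, send that back as the tool result so the model can retry with corrected arguments.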

The runtime

The code block below holds our actual runtime. We give the agent a system message so it knows it’s a web scraping agent. The next message is our user prompt: Look for the latest news on Google News and give me a summary. A while loop keeps the agent running while it browses, and a simple if statement glues the agent together with the tools. Finally, when the task is complete, we log the AI agent’s output to the console.

//this is our actual runtime
(async () => {
  let messages = [
    { role: "system", content: "You are a web scraping agent with full browser control." },
    { role: "user", content: "Look for the latest news on Google News and give me a summary." }
  ];

  while (true) {
    const response = await client.chat.completions.create({
      model: "gpt-5-mini",
      messages,
      tools: Object.entries(tools).map(([name, def]) => ({
        type: "function",
        function: {
          name,
          description: def.description,
          parameters: def.parameters
        }
      })),
      tool_choice: "auto"
    });

    const msg = response.choices[0].message;

    //glue code for the agent to access tools
    if (msg.tool_calls) {
      messages.push(msg);

      //answer every tool call, or the next API request will be rejected
      for (const call of msg.tool_calls) {
        const tool = tools[call.function.name];
        const args = JSON.parse(call.function.arguments);
        const result = await tool.func(args);

        messages.push({
          role: "tool",
          tool_call_id: call.id,
          content: result
        });
      }

      continue;
    }

    console.log("\n=== Agent Output ===\n");
    console.log(msg.content);
    break;
  }
})();
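One caveat with while (true): a confused model can keep calling tools indefinitely, burning tokens and browser sessions. A bounded loop is an easy safeguard. MAX_STEPS below is our own knob, not anything from the OpenAI API, so tune it to your tasks.

```javascript
// Run an agent step function at most MAX_STEPS times.
// `step` should return true once the agent has produced a final answer.
const MAX_STEPS = 15; // assumption: adjust to how many actions your tasks need

async function boundedLoop(step) {
  for (let i = 0; i < MAX_STEPS; i++) {
    const done = await step(i);
    if (done) return { finished: true, steps: i + 1 };
  }
  return { finished: false, steps: MAX_STEPS };
}
```

In the runtime above, each iteration of the while loop would become one step, with a message that has no tool_calls counting as done.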

Full agent code

Our full code is laid out below for you to copy and paste. Remember to substitute the API keys with your own. When the agent’s finished, it prints your results. In this case, our AI agent is prompted to look at Google News. Feel free to adjust the prompt to experiment with other sites as well.

const OpenAI = require("openai");
const puppeteer = require("puppeteer-core");
const { ScrapingBrowser } = require("@zenrows/browser-sdk");

//api keys
const OPENAI_KEY = "your-openai-api-key";
const ZENROWS_KEY = "your-zenrows-api-key";

const client = new OpenAI({ apiKey: OPENAI_KEY });

//function to use Scraping Browser
async function run(fn) {
  const ws = new ScrapingBrowser({ apiKey: ZENROWS_KEY }).getConnectURL();

  const browser = await puppeteer.connect({
    browserWSEndpoint: ws,
    defaultViewport: null,
    ignoreHTTPSErrors: true
  });

  const page = await browser.newPage();
  let out;

  try {
    out = await fn(page);
  } catch (err) {
    out = `ERROR: ${err.message}`;
  }

  await browser.close();
  return out;
}

//browser tools for the agent to call
const tools = {
  goto: {
    description: "Navigate to a URL and return the page HTML.",
    parameters: {
      type: "object",
      properties: { url: { type: "string" } },
      required: ["url"]
    },
    func: async ({ url }) =>
      run(async (p) => {
        await p.goto(url, { waitUntil: "networkidle0" });
        return await p.content();
      })
  },

  extract: {
    description: "Extract visible text from a CSS selector.",
    parameters: {
      type: "object",
      properties: { selector: { type: "string" } },
      required: ["selector"]
    },
    func: async ({ selector }) =>
      run(async (p) => {
        try {
          await p.waitForSelector(selector, { timeout: 3000 });
          return await p.$eval(selector, (el) => el.textContent.trim());
        } catch {
          return `ERROR: selector '${selector}' not found.`;
        }
      })
  },

  click: {
    description: "Click a selector and return the updated HTML.",
    parameters: {
      type: "object",
      properties: { selector: { type: "string" } },
      required: ["selector"]
    },
    func: async ({ selector }) =>
      run(async (p) => {
        try {
          await p.waitForSelector(selector, { timeout: 3000 });
          await p.click(selector);
          await new Promise((r) => setTimeout(r, 1000)); //waitForTimeout was removed in newer Puppeteer versions
          return await p.content();
        } catch {
          return `ERROR: cannot click selector '${selector}'.`;
        }
      })
  }
};

//this is our actual runtime
(async () => {
  let messages = [
    { role: "system", content: "You are a web scraping agent with full browser control." },
    { role: "user", content: "Look for the latest news on Google News and give me a summary." }
  ];

  while (true) {
    const response = await client.chat.completions.create({
      model: "gpt-5-mini",
      messages,
      tools: Object.entries(tools).map(([name, def]) => ({
        type: "function",
        function: {
          name,
          description: def.description,
          parameters: def.parameters
        }
      })),
      tool_choice: "auto"
    });

    const msg = response.choices[0].message;

    //glue code for the agent to access tools
    if (msg.tool_calls) {
      messages.push(msg);

      //answer every tool call, or the next API request will be rejected
      for (const call of msg.tool_calls) {
        const tool = tools[call.function.name];
        const args = JSON.parse(call.function.arguments);
        const result = await tool.func(args);

        messages.push({
          role: "tool",
          tool_call_id: call.id,
          content: result
        });
      }

      continue;
    }

    console.log("\n=== Agent Output ===\n");
    console.log(msg.content);
    break;
  }
})();

Agent output

Below is our agent’s output. As you can see, it read through Google News and summarized the content. It also cited the specific sources for each story, along with the dates the stories were published.

Google News search output

Dealing with CAPTCHAs

Scraping Browser does not come with CAPTCHA solving. Some teams may run into issues with this. If you’re experiencing CAPTCHAs frequently, they do offer a third-party integration using 2captcha.

Go to the integrations page and add your CAPTCHA solver. When you’re finished, click the “Save” button.

CAPTCHA solver integration

Conclusion

The ZenRows Scraping Browser gives teams an automated browser they can run on cloud infrastructure. Tools are straightforward to write and easy to pass into the AI agent. The browser does run into issues when dealing with CAPTCHAs, but its third-party integration makes it possible for developers to overcome this barrier.

Building AI agents is simple once the agent is connected to its tools. From there, behavior can be tuned and debugged by adjusting the agent’s prompts.