Help Needed with Automating Pagination in Custom Node

Hello Gumloop Community,

I’m building an automation flow in Gumloop to scrape doctor profiles from a paginated listing page, but I’m running into issues with pagination in my custom “Page Navigator” node.

:small_blue_diamond: My Setup:

• I have a working automation flow that:

  1. Extracts all profile links from a listing page.

  2. Scrapes each profile individually and extracts details (name, specialty, address, phone, email).

  3. Saves the extracted data into Google Sheets.

• The problem is the listing page URL does not change when navigating pages (AJAX-based pagination).

• I added a “Page Navigator” custom node to click the “Next Page” button and navigate through all pages.

:mag: Issues I’m Facing:

:one: Pagination is not working correctly.

• My Page Navigator executes, but it either doesn’t move beyond Page 1 or keeps extracting the same page.

:two: Gumloop expects a list but receives a single URL input.

• The input node provides a single listing page URL, but the Page Navigator expects a list.

:three: Potential bot detection issues (reCAPTCHA appearing randomly).

• Sometimes, instead of loading new profiles, the page returns a Google reCAPTCHA in the extracted content.
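Issues 2 and 3 can at least be guarded against in plain Python before the data reaches the scraping step. The following is only a minimal sketch: the helper names `as_url_list` and `looks_like_captcha` and the marker strings are assumptions for illustration, not part of Gumloop's API, and the check detects a reCAPTCHA page rather than preventing one.

```python
from typing import Iterable, List, Union

# Assumed marker strings: text that typically shows up when a page
# serves a reCAPTCHA challenge instead of the expected listing.
CAPTCHA_MARKERS = ("recaptcha", "not a robot")


def as_url_list(value: Union[str, Iterable[str]]) -> List[str]:
    """Wrap a single URL in a list; pass lists through with blanks removed."""
    if isinstance(value, str):
        value = [value]
    return [u.strip() for u in value if u and u.strip()]


def looks_like_captcha(content: str) -> bool:
    """Heuristic: True if the extracted content contains a known marker."""
    lowered = content.lower()
    return any(marker in lowered for marker in CAPTCHA_MARKERS)
```

A flow could call `looks_like_captcha` on each extracted page and pause or retry later instead of saving the challenge page as if it were profile data.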

:rocket: What I’ve Tried So Far:

:white_check_mark: Converting the input URL into a list using a List Operations node.

:white_check_mark: Modifying the Page Navigator script to:

• Click “Next Page” only if visible.

• Wait for new content to load before proceeding.

• Stop pagination when no more pages exist.

:white_check_mark: Checking the output logs to verify if new data loads after clicking “Next”.

:white_check_mark: Trying to detect AJAX-loaded content before moving to the next step.
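For reference, the logic described in this list can be sketched independently of any browser library. The sketch below assumes two hypothetical callables supplied by the node: `get_links()`, which returns the profile links currently on the page, and `click_next()`, which clicks “Next Page” and returns `False` when the button is not visible. Since the URL never changes, it compares a hash of the links before and after each click to decide whether a new page has really loaded.

```python
import hashlib
import time
from typing import Callable, List, Optional


def fingerprint(links: List[str]) -> str:
    """Hash the sorted links so two page states can be compared cheaply."""
    joined = "\n".join(sorted(links))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


def wait_for_change(get_links: Callable[[], List[str]], previous: str,
                    timeout: float = 10.0,
                    interval: float = 0.5) -> Optional[List[str]]:
    """Poll until the listing differs from the previous page, or give up."""
    deadline = time.monotonic() + timeout
    while True:
        links = get_links()
        if links and fingerprint(links) != previous:
            return links
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)


def collect_all_links(get_links: Callable[[], List[str]],
                      click_next: Callable[[], bool],
                      max_pages: int = 200,
                      timeout: float = 10.0,
                      interval: float = 0.5) -> List[str]:
    """Walk the pages, collecting each profile link once, in order."""
    seen: List[str] = []
    links = get_links()
    for _ in range(max_pages):
        for link in links:
            if link not in seen:
                seen.append(link)
        if not click_next():   # no visible "Next Page" button: last page
            break
        links = wait_for_change(get_links, fingerprint(links),
                                timeout, interval)
        if links is None:      # click had no effect: stop, don't re-scrape
            break
    return seen
```

The `max_pages` cap and the timeout are there so that a click that silently fails ends the run instead of extracting Page 1 over and over.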

:hammer_and_wrench: What I Need Help With:

:one: How can I ensure that my Page Navigator correctly moves through pages and loads new results?

:two: What’s the best way to detect if a new page has loaded (instead of extracting the same data repeatedly)?

:three: How can I prevent reCAPTCHA from blocking my automation?

:four: Should I handle AJAX-based pagination differently in Gumloop?

Any insights, suggestions, or best practices would be greatly appreciated! :rocket: Thanks in advance for your help.

Link to my workbook: https://www.gumloop.com/pipeline?workbook_id=daLprHVz33GRxZjzM67HJR&run_id=9GLm4gh5eTR7NY4AYNJghJ

This will do the job (I think)

Unfortunately, the website I am trying to extract data from does not change its URL when moving to a new page (it uses AJAX for pagination), hence the issue. :sob:

Hey @Manishdwivedi - This is going to be tricky. The custom node in your flow labelled “Page Navigator” - what is that doing exactly?

If automating this end-to-end is not imperative, you can try the Gumloop Chrome Extension:
Doc: https://docs.gumloop.com/nodes/browser_extension/browser_extension_input
Tutorial: https://www.loom.com/share/6b343be195ba4a55a66ce26894b303f9

The extension takes the screen you’re on as input and performs the set action on it (e.g. scraping or a screenshot). That means you can navigate through the pages manually and run the flow via the extension on each page.