Issue with Web Scraping Product Hunt Visit Button URLs

I’m trying to build a flow that:

  1. Scrapes Product Hunt posts
  2. Gets both the Product Hunt URL and the actual website URL for each product
  3. Writes them to Google Sheets

The challenge is getting the actual website URL. The Visit button has this HTML structure:

<a href="https://teamble.com/?ref=producthunt#" target="_blank" data-test="visit-website-button" data-sentry-component="Button">Visit website</a>

I’ve tried:

  1. AI Web Browsing Agent:

    • Successfully clicks the Visit button
    • But can’t capture the URL from the new tab
  2. Web Agent Scraper (in batch mode):

    • Using data-test="visit-website-button"
    • Using data-sentry-component="Button"
    • Getting the same URL (teamble.com) for all entries

PS: I also tried different settings in the Web Agent Scraper, to no avail.

The main issue seems to be capturing the URL that opens in the new tab after clicking the Visit button with the AI Web Browsing Agent (it’s so close I can feel it). Any suggestions on how to handle this?

Run link: https://www.gumloop.com/pipeline?run_id=fz8oPmsdo4TMRAd9YFQCcB&workbook_id=b1uPrHiGX4v87AimKxScRJ

https://www.gumloop.com/pipeline?run_id=Cqd52fxc4hWPNEMpfMwSjp&workbook_id=b1uPrHiGX4v87AimKxScRJ

Workbook link: https://www.gumloop.com/pipeline?workbook_id=b1uPrHiGX4v87AimKxScRJ&run_id=fz8oPmsdo4TMRAd9YFQCcB

Hey @Valentin_Gallone! If you’re reporting an issue with a flow or an error in a run, please include the run link and make sure it’s shareable so we can take a look.

  1. Find your run link on the history page. Format: https://www.gumloop.com/pipeline?run_id={your_run_id}&workbook_id={workbook_id}

  2. Make it shareable by clicking "Share" → "Anyone with the link can view" in the top-left corner of the flow screen.

  3. Provide details about the issue. More context helps us troubleshoot faster.

You can find your run history here: https://www.gumloop.com/history

Hey @Valentin_Gallone - You're super close. You can use the Web Agent Scraper node with the Scrape Source action and pass its output to an Extract Data node to find the URL.

The AI Web Browsing Agent node is highly experimental and consumes a lot of credits, so this would be a much more efficient option for you.
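
For anyone curious what that extraction step amounts to outside Gumloop, here is a minimal Python sketch. It is not how the Scrape Source or Extract Data nodes are implemented; it only illustrates the idea that the target URL already sits in the href attribute of the page source, so no click or new tab is needed. The names VisitLinkParser and extract_visit_links are made up for this example, and it assumes the button keeps the data-test="visit-website-button" attribute shown in the original post.

```python
from html.parser import HTMLParser


class VisitLinkParser(HTMLParser):
    """Collects href values from <a> tags marked as the Visit website button."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        # Assumption: the button is identified by this data-test value,
        # as in the HTML quoted in the original post.
        if attr_map.get("data-test") == "visit-website-button":
            href = attr_map.get("href")
            if href:
                self.links.append(href)


def extract_visit_links(page_html):
    """Return every Visit-button href found in the given HTML string."""
    parser = VisitLinkParser()
    parser.feed(page_html)
    parser.close()
    return parser.links


# The anchor tag from the original post, used here as sample page source.
sample = (
    '<a href="https://teamble.com/?ref=producthunt#" target="_blank" '
    'data-test="visit-website-button" data-sentry-component="Button">'
    "Visit website</a>"
)
print(extract_visit_links(sample))
```

Run against the source of each individual product page, this should give one URL per page, since the link is read from that page's own HTML rather than from a tab opened by a click.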

Example run links:

  1. https://www.gumloop.com/pipeline?workbook_id=2p2KaoN6AXjNhFFQHYdzJr&run_id=3XShQ3vcrxzkCdwZMKv36a

  2. https://www.gumloop.com/pipeline?workbook_id=2p2KaoN6AXjNhFFQHYdzJr&run_id=3BXqF3tLPb3TskhX9N66Hj

Let me know if this works for you.

Thank you so much, you da man!
