I’m trying to build a flow that:
- Scrapes Product Hunt posts
- Gets both the Product Hunt URL and the actual website URL for each product
- Writes them to Google Sheets
The challenge is getting the actual website URL. The Visit button has this HTML structure:
<a href="https://teamble.com/?ref=producthunt#" target="_blank" data-test="visit-website-button" data-sentry-component="Button">Visit website</a>
I’ve tried:
- AI Web Browsing Agent:
  - Successfully clicks the Visit button
  - But can’t capture the URL from the new tab
- Web Agent Scraper (in batch mode):
  - Using data-test="visit-website-button"
  - Using data-sentry-component="Button"
  - Getting same URL (teamble.com) for all entries
PS: I also tried different settings with the Web Agent Scraper, to no avail.
The main issue seems to be capturing the URL that opens in the new tab after clicking the Visit button with the AI Web Browsing Agent (it’s so close I can feel it). Any suggestions on how to handle this?
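To make the goal concrete, here is a minimal sketch of what I'm after, outside of Gumloop: since the website URL is already in the button's href, it could be read straight from the page HTML without clicking anything. This assumes the HTML of each product page is available as a string and that the data-test attribute is the same on every page (I've only checked the one above). The names extract_website_url and strip_ref are just illustrative helpers, not anything from Gumloop or Product Hunt.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


class VisitLinkParser(HTMLParser):
    """Remembers the href of the first <a data-test="visit-website-button">."""

    def __init__(self):
        super().__init__()
        self.website_url = None

    def handle_starttag(self, tag, attrs):
        # Only look at anchors, and stop once a match has been found.
        if tag != "a" or self.website_url is not None:
            return
        attr_map = dict(attrs)
        if attr_map.get("data-test") == "visit-website-button":
            self.website_url = attr_map.get("href")


def extract_website_url(page_html):
    """Return the Visit button's href, or None if the button is not present."""
    parser = VisitLinkParser()
    parser.feed(page_html)
    return parser.website_url


def strip_ref(url):
    """Drop the ?ref=... tracking parameter and any trailing fragment."""
    parts = urlsplit(url)
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != "ref"]
    return urlunsplit((parts.scheme, parts.netloc, parts.path, urlencode(query), ""))


# Example using the button HTML from this post.
sample = (
    '<a href="https://teamble.com/?ref=producthunt#" target="_blank" '
    'data-test="visit-website-button" data-sentry-component="Button">'
    "Visit website</a>"
)
raw_url = extract_website_url(sample)
print(raw_url)
print(strip_ref(raw_url))
```

The idea would be to run this once per product page, so each row in the sheet gets the href from its own page rather than the same one every time.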
Run links:
- https://www.gumloop.com/pipeline?run_id=fz8oPmsdo4TMRAd9YFQCcB&workbook_id=b1uPrHiGX4v87AimKxScRJ
- https://www.gumloop.com/pipeline?run_id=Cqd52fxc4hWPNEMpfMwSjp&workbook_id=b1uPrHiGX4v87AimKxScRJ
Workbook link: https://www.gumloop.com/pipeline?workbook_id=b1uPrHiGX4v87AimKxScRJ&run_id=fz8oPmsdo4TMRAd9YFQCcB