Product Hunt yearly leaderboard to Google Sheets

I’m trying to scrape the Product Hunt yearly leaderboard (2023) and save the data to Google Sheets.

My current flow uses Browser Extension Input but keeps timing out. Here’s my flow link: https://www.gumloop.com/pipeline?run_id=2oCFiGm5Wgxt6N3YBNp46B&workbook_id=tizj2qn9JMdg5k3jM4usks

I need help to build a flow:

  1. Either with the Browser Extension approach or with the Website Scraper node
  2. Ensuring proper connection to Extract Data node
  3. Maintaining the current Google Sheets output format

Would appreciate any guidance on getting this flow working; or if you have a flow that is already working, please link below.

Hey, have you tried using the “Website Scraper” node with “advanced” mode toggled? It should be more reliable than the Browser Extension approach (for publicly available websites) for getting the content in the first place. You can then connect this output to the “Extract Data” node.

Hi @Arslan - thanks for taking the time to respond.
Yes, I tried that, but it didn’t work for some reason: Scraper → Extract Data node → spreadsheet, but the output was blank.

Hey @learngumloop, no worries, will try to look into this now. I’m new to the flow building as well, so might take some time, but Gumloop is a powerful tool so shouldn’t be too hard. Will update as soon as I have something working


@learngumloop this is working on my end: https://www.gumloop.com/pipeline?workbook_id=3P7BHKkbj3YFJ3SNx6Bbie


@Arslan - Thank you. It’s working; however, it seems to be limited to 18–20 rows max. What can we do to simulate the functionality of Page Down or Show More and extract the full list (not just the top 10 or 20)?

That’s strange @learngumloop, I was able to extract 68 entries with this flow, might be due to the model not being deterministic? How many do you need?

@Arslan - I am just learning the Gumloop tool, so I want to simulate the entire page load of Product Hunt through to the end of the list, as PH is a good example of a directory listing. (I am aiming to learn the extraction methods here from a completeness perspective, so the job finishes with the full intended dataset.)

hope that explains my learning objective!

Gotcha @learngumloop. I’m trying to scroll down manually, but it seems almost like an infinite (or very, very large) load, so I’m not sure you’d be able to get all the listings here just due to the sheer size of the page and the scrolling required. Even if you were to use actions, it would probably time out given how long it takes to scroll to “the end”.

I believe it’s all the launches for that year on that page if you keep scrolling (40,000+ launches). That would probably time out any agent/web scraper.


@Arslan - I was trying to compare this feature with other tools like RoboMotion or RTILA, which have a NextPage or LoadMore button-click function along with a wait clock; however, those tools are quite hard to work with, so I was hoping Gumloop would be of some help here.

That said, I consider the original help request to be Solved for sure. Thanks for the collaboration.
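For anyone landing on this thread later: the NextPage/LoadMore-with-a-wait-clock pattern mentioned above can be sketched in plain Python. This is a minimal, hypothetical sketch, not Gumloop or RTILA code: `fetch_page` is a stand-in stub for whatever actually retrieves the next batch (a scraper action, a paginated API call, etc.), and the page size, dataset size, and function names are all assumptions for illustration.

```python
import time

PAGE_SIZE = 20
TOTAL_ITEMS = 68  # stub dataset size, mirroring the 68 entries seen earlier in the thread

def fetch_page(cursor):
    """Stub fetch: returns (batch, next_cursor), with next_cursor = None
    when the list is exhausted. A real version would click 'Load More'
    or call a paginated endpoint here."""
    batch = list(range(cursor, min(cursor + PAGE_SIZE, TOTAL_ITEMS)))
    next_cursor = cursor + PAGE_SIZE if cursor + PAGE_SIZE < TOTAL_ITEMS else None
    return batch, next_cursor

def scrape_all(max_pages=50, wait_seconds=0.0):
    """'Load More' loop: keep fetching until there is no next page or a
    hard page cap is hit, sleeping between fetches (the 'wait clock')."""
    items, cursor, pages = [], 0, 0
    while cursor is not None and pages < max_pages:
        batch, cursor = fetch_page(cursor)
        items.extend(batch)
        pages += 1
        time.sleep(wait_seconds)  # give the page time to render / be polite
    return items

print(len(scrape_all()))
```

The `max_pages` cap is the important design choice for a page like this one: with 40,000+ launches behind an effectively infinite scroll, an uncapped loop is exactly what times out, so you bound the run and accept a partial dataset.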

This topic was automatically closed 60 minutes after the last reply. New replies are no longer allowed.