Website Scraper with Screenshot? + questions

Hello guys,

I am on my first day trying Gumloop and I like it a lot, but I have some questions and maybe some experts can help :slight_smile:

I see that when scraping a website, it sometimes translates the content into English (from Spanish), which makes the analysis less robust for my case.

  • Is it possible to avoid this translation and force the original language for the scraped content?
  • Is there a node or a way to: 1. Access the website / 2. Remove cookies or pop-ups / 3. Take a screenshot / 4. Send that screenshot to an AI to analyze the content / 5. Extract data?
    Maybe this could be done with a custom node?
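For anyone curious what those five steps could look like outside Gumloop, here is a rough sketch using Playwright plus a vision model. The selector list, locale, and helper name are assumptions for illustration only, not anything Gumloop exposes:

```python
# Hypothetical sketch of the five steps with Playwright. Steps 4-5 (vision
# model + extraction) are left as a comment since they depend on your AI API.
COOKIE_BUTTONS = [
    "#onetrust-accept-btn-handler",  # common OneTrust cookie banner (guess)
    "button:has-text('Aceptar')",    # generic Spanish "accept" button (guess)
]

def screenshot_page(url: str, out_path: str) -> str:
    """1. Open the page, 2. dismiss pop-ups best-effort, 3. save a full-page screenshot."""
    # Playwright is an optional dependency here: pip install playwright
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch()
        page = browser.new_page(locale="es-ES")  # ask for the original language
        page.goto(url, wait_until="networkidle")
        for selector in COOKIE_BUTTONS:
            try:
                page.click(selector, timeout=2000)  # close cookie/pop-up banners
            except Exception:
                pass  # banner not present; keep going
        page.screenshot(path=out_path, full_page=True)
        browser.close()
    return out_path

# Steps 4-5: pass out_path (or a hosted link to it) to an image-analysis model
# and parse the structured data out of the model's reply.
```

The built-in Web Agent Scraper node covers the same ground without code, so treat this only as a reference for what a custom node would need to do.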

Thanks in advance and have a nice day!
Javi

Hey @Javi_U! If you’re reporting an issue with a flow or an error in a run, please include the run link and make sure it’s shareable so we can take a look.

  1. Find your run link on the history page. Format: https://www.gumloop.com/pipeline?run_id={{your_run_id}}&workbook_id={{workbook_id}}

  2. Make it shareable by clicking “Share” → ‘Anyone with the link can view’ in the top-left corner of the flow screen.
    GIF guide

  3. Provide details about the issue—more context helps us troubleshoot faster.

You can find your run history here: https://www.gumloop.com/history

Hey @Javi_U –

  1. Can you share an example of this? You can find the run link on the history page or through the Previous Runs tab on the canvas.

Just a heads-up — the scraper sees what you’d see in incognito mode on the site. So if the default view is in English, it’ll scrape the English content. Maybe double-check if there’s a language filter enabled somewhere on the site for you?

  2. You can try using the Web Agent Scraper node for that — it has a Screenshot Page action built in. A custom node could also work well here.

https://www.gumloop.com/pipeline?run_id=fVKevRPqx4z9b86jmHTA9C&workbook_id=3VPQLqBYjXt5KuWWg4STEf

The website is https://www.iberdrola.es/luz/plan-online-tres-periodos (default in Spanish), and a big part of the scraped web content is in English, making some extracted data wrong.

Thank you!

The actual error for the sheet writing step can be fixed by adding Google Drive credentials here: Gumloop | Settings

On the workflow I noticed that the final step in the Web Agent Scraper node is Screenshot Full Page, which outputs the image link. To accurately read the image, you’ll have to use an Analyze Image node, enable Use Link, and pass the link in the input field.

Let me know if that works for you. If not, please share the run link after making these edits, and grant access either by adding my email (wasay@gumloop.com) or by setting the workbook to “anyone with the link can view” from the share button on the run link.

Ok, I will try that.

Did you see why the Web Scraper (no image mode) is returning part of the web content in English?
https://www.gumloop.com/pipeline?run_id=fVKevRPqx4z9b86jmHTA9C&workbook_id=3VPQLqBYjXt5KuWWg4STEf (this link, not the one I sent through the Help Ticket, which used the Web Agent Scraper)

Thanks for your help!

Hey! This is a limitation of the node right now; it uses a proxy, which might be causing this.
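For context on why a proxy can change the language: many sites choose the served language from the request’s Accept-Language header (and sometimes the IP’s region), so a proxy in an English-speaking locale can be handed English content. A plain HTTP fetch lets you pin that header explicitly — this is just an illustration of the mechanism, not a setting the scraper node exposes:

```python
# Minimal sketch: pin the requested language on a plain HTTP fetch via the
# Accept-Language header. Illustration only; Gumloop's scraper node does not
# expose this header.
import urllib.request

def build_request(url: str, language: str) -> urllib.request.Request:
    """Build a GET request that asks the server for a specific language."""
    return urllib.request.Request(url, headers={"Accept-Language": language})

req = build_request("https://www.iberdrola.es/luz/plan-online-tres-periodos", "es-ES")
# urllib stores header names capitalized, so look it up as "Accept-language"
print(req.get_header("Accept-language"))  # → es-ES
```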

This topic was automatically closed 3 days after the last reply. New replies are no longer allowed.