Hi,
I am trying to build an automation that scrapes Trustpilot reviews for a brand I enter into an Interface. I would insert https://www.trustpilot.com/review/brand.com.
The automation should go through all of the review pages (pagination). The output should go into a Google Sheet with predefined columns (number of stars, content of the review…).
There are several issues I have come across. I made a separate subflow just to produce URLs with ?page=x, and I believe that part works. However, I cannot figure out how to make the automation work without manually inserting the URL into the Combine page node; the URL entered in the Interface (https://www.trustpilot.com/review/brand.com) should be enough.
The issue with the main flow, which feeds from the output of the subflow (the ?page=x URLs), is that it writes the same results to the Google Sheet for every URL. Basically, it fills the final sheet with the same content x times.
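For reference, the pagination step described above can be sketched in plain Python. This is only an illustration of the URL-building logic, not the Gumloop subflow itself; the page count passed in is an assumption (in the real flow it would come from the scraped page).

```python
# Sketch: build paginated Trustpilot review URLs from a single brand URL.
# Assumption: page 1 is the bare URL and later pages use ?page=x, which is
# how Trustpilot paginates review listings.

def paginated_urls(base_url: str, num_pages: int) -> list[str]:
    """Return the base URL plus its ?page=x variants (page 1 has no suffix)."""
    urls = [base_url]
    for page in range(2, num_pages + 1):
        urls.append(f"{base_url}?page={page}")
    return urls

print(paginated_urls("https://www.trustpilot.com/review/brand.com", 3))
```

The point is that only the brand URL needs to be supplied; everything else is derived from it, which is what the Interface input should be able to drive.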
Hey @Sas! If you’re reporting an issue with a flow or an error in a run, please include the run link and make sure it’s shareable so we can take a look.
Find your run link on the history page. Format: https://www.gumloop.com/pipeline?run_id={{your_run_id}}&workbook_id={{workbook_id}}
Make it shareable by clicking "Share" → "Anyone with the link can view" in the top-left corner of the flow screen.
Provide details about the issue; more context helps us troubleshoot faster.
Hey @Sas – I believe you're super close and have set up the flow properly. Is this the main issue you're facing:
The issue with the main flow, which feeds from the output of the subflow (the ?page=x URLs), is that it writes the same results to the Google Sheet for every URL. Basically, it fills the final sheet with the same content x times.
Can you share the run link for this so I can view the inputs/outputs, please? You can find the run link on the https://www.gumloop.com/history page or through the Previous Runs tab on the canvas.
Thanks! If you delete the Extract Data node and add it back, that should resolve the issue. The key is to disable the Extract List option so you don't get a list-of-lists output.
If you do want to extract a list from the Extract Data node, the best approach is to create a subflow with the Website Scraper, Extract Data, and Google Sheets nodes, then wrap that subflow in an Error Shield. This will make the flow more robust overall.
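To illustrate why a list-of-lists output trips up the sheet-writing step, here is a minimal Python sketch (the review values are made-up placeholders, not real Trustpilot data): each page's extraction produces a whole list, so iterating over the combined output walks over pages rather than individual reviews.

```python
# Sketch: a list-of-lists vs. a flat list when writing rows to a sheet.
# The tuples stand in for (stars, review content) columns; values are invented.

per_page = [
    [("5 stars", "Great service"), ("1 star", "Slow delivery")],  # page 1
    [("4 stars", "Good overall")],                                # page 2
]

# List-of-lists: each "row" is actually an entire page of reviews,
# so a row-by-row writer sees 2 items instead of 3 reviews.
print(len(per_page))

# Flattened: one entry per review, mapping cleanly onto the sheet columns.
flat = [review for page in per_page for review in page]
print(len(flat))
```

Disabling Extract List (or flattening inside a subflow, as suggested above) is what gets you the one-review-per-row shape the Google Sheet expects.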