Not extracting the actual social media link that is on the webpage

In some cases it is adding social media links that do not actually exist on the webpage. For some reason it is not extracting the actual social media links that are on the webpage.

Here is one of the URLs I am testing Gumloop with:
mrkcleaning.com

https://www.gumloop.com/pipeline?workbook_id=urNSsusSa212BS5zwy2E1f

Hey @ksandersbbb - Could you please share the run link from the https://www.gumloop.com/history page so I can take a look at the run with that URL?

https://www.gumloop.com/pipeline?run_id=UyKBogyxHeoxB7PAcFTDMy&workbook_id=urNSsusSa212BS5zwy2E1f

Thank you! In this case the scraped content did not include the links to the social media handles. You can verify that by viewing the output of the scraped content.

This means the AI likely made up the answer here. Try enabling ‘advanced scraping’ in the website scraper node and run the flow again. Let me know if that works for you.
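If you want to double-check this outside of Gumloop, here is a minimal Python sketch that compares the links the AI returned against the scraped text. It assumes you have copied both into your own script; `find_unsupported_links` is a hypothetical helper name, not a Gumloop feature, and the sample URLs are made up for illustration.

```python
def find_unsupported_links(scraped_text: str, ai_links: list[str]) -> list[str]:
    """Return the AI-provided links that never appear in the scraped text."""
    haystack = scraped_text.lower()
    unsupported = []
    for link in ai_links:
        # Normalize: drop the scheme, a leading "www." and any trailing slash,
        # so trivial formatting differences do not cause a false alarm.
        needle = link.strip().lower()
        for prefix in ("https://", "http://"):
            if needle.startswith(prefix):
                needle = needle[len(prefix):]
        if needle.startswith("www."):
            needle = needle[4:]
        needle = needle.rstrip("/")
        # An empty or missing needle counts as unsupported.
        if not needle or needle not in haystack:
            unsupported.append(link)
    return unsupported


# Hypothetical sample data, for illustration only.
scraped = "Follow us: https://www.facebook.com/examplepage/ Call 555-0100"
answers = ["https://facebook.com/examplepage", "https://twitter.com/made_up"]
unsupported = find_unsupported_links(scraped, answers)
print(unsupported)
```

Any link this prints was not present in the scraped content, so the AI most likely invented it.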

I made that change and checked the box for advanced scraping. Now it is not finding any social media links.

What do you think I should do?

Can you share the latest run link please?

I tried several different AIs.

https://www.gumloop.com/pipeline?run_id=LDAgfVWzbKhuJfkLSWrgfZ&workbook_id=urNSsusSa212BS5zwy2E1f

https://www.gumloop.com/pipeline?run_id=KbAMPKFqLSBNtJmdud6kdx&workbook_id=urNSsusSa212BS5zwy2E1f

https://www.gumloop.com/pipeline?run_id=ZCuBCoUEyefNvxG4ZnUfFf&workbook_id=urNSsusSa212BS5zwy2E1f

https://www.gumloop.com/pipeline?run_id=8ufgzfrBWvfeP3nkaHBxxu&workbook_id=urNSsusSa212BS5zwy2E1f

https://www.gumloop.com/pipeline?run_id=Jcp3mCrLeN74jZ3VGMdoy9&workbook_id=urNSsusSa212BS5zwy2E1f

I see. For some reason the scraped content is not picking up the social media URLs. However, if you replace the website scraper node with the Web Agent Scraper node and set the action to ‘Get All URLs’, the flow should work as expected.

Here’s an example: https://www.gumloop.com/pipeline?workbook_id=who7ogLiwfsjvgiFWtWbM3&run_id=PTc7ptZYy5ScmoZgVqsY8F
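Once ‘Get All URLs’ returns a list of links, the remaining step is just picking out the social media ones. Here is a rough Python sketch of that filtering, assuming the URLs are available as a plain list of strings. The domain list and the `filter_social_links` name are my own assumptions for illustration, not how Gumloop does it internally.

```python
from urllib.parse import urlparse

# Assumed set of social media domains; extend as needed.
SOCIAL_DOMAINS = {
    "facebook.com", "instagram.com", "twitter.com", "x.com",
    "linkedin.com", "youtube.com", "tiktok.com", "pinterest.com",
}


def filter_social_links(urls: list[str]) -> list[str]:
    """Keep only URLs whose host is a known social domain, without duplicates."""
    seen = set()
    result = []
    for url in urls:
        host = (urlparse(url.strip()).hostname or "").lower()
        if host.startswith("www."):
            host = host[4:]
        # Match the domain itself or any subdomain (e.g. m.facebook.com),
        # but not lookalikes such as notfacebook.com.
        is_social = any(host == d or host.endswith("." + d) for d in SOCIAL_DOMAINS)
        if is_social and url not in seen:
            seen.add(url)
            result.append(url)
    return result


# Hypothetical sample list, for illustration only.
all_urls = [
    "https://mrkcleaning.com/contact",
    "https://www.facebook.com/examplepage",
    "https://www.facebook.com/examplepage",
    "https://notfacebook.com/x",
    "https://m.instagram.com/example",
]
social_links = filter_social_links(all_urls)
print(social_links)
```

Because this works only on URLs that were actually found on the page, it cannot return a link that does not exist there.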
