CrawlWeb - Crawl only certain parts of a website

ojuarez Member Posts: 5 Contributor II
edited June 2019 in Help
Hello.

I am crawling a website with the "CrawlWeb" operator and it seems to be working. The problem is that I want to crawl only one area of the site.

I am using the following values.

url: http://www.autopredios.com
crawling rules: .+vans.+


I only want to crawl: http://www.autopredios.com/vehicles


Do I have to do it by changing the URL or by changing the crawling rules? I tried changing the URL, but that doesn't seem to exclude the other directories.



Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    You definitely have to change the URL. If the crawler still goes to other parts of the website, you also have to tune your crawling rules.

    Best regards,
    Marius
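
To illustrate the point about tuning the crawling rules, here is a minimal Python sketch that compares the rule from the question with a stricter one. It assumes the crawling rules are regular expressions matched against the whole URL, as the pattern `.+vans.+` suggests. The stricter pattern and the sample paths below (other than the two URLs from the question) are hypothetical, and RapidMiner is Java-based, so its regex dialect may differ slightly from Python's `re`.

```python
import re

# Rule from the question: matches any URL that contains "vans" somewhere inside it.
ORIGINAL_RULE = re.compile(r".+vans.+")

# Hypothetical stricter rule: the URL must lie under the /vehicles section.
VEHICLES_RULE = re.compile(r"https?://www\.autopredios\.com/vehicles([/?#].*)?")


def matches(rule, url):
    """Return True if the rule matches the entire URL."""
    return rule.fullmatch(url) is not None


# Sample URLs; every path except /vehicles is made up for illustration.
sample_urls = [
    "http://www.autopredios.com/vehicles",
    "http://www.autopredios.com/vehicles/vans/123",
    "http://www.autopredios.com/dealers/vans-for-sale",
    "http://www.autopredios.com/contact",
]

for url in sample_urls:
    print(url, matches(ORIGINAL_RULE, url), matches(VEHICLES_RULE, url))
```

Under these assumptions, the original rule lets through a URL outside `/vehicles` as long as it contains "vans", and rejects the `/vehicles` start page itself, while the stricter rule accepts only URLs inside that section.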