[Solved] Crawling rules
I'm trying to crawl a hotel booking site, and I want to crawl the reviews. For example, this URL: http://www.tripadvisor.nl/Hotel_Review-g188590-d2333086-Reviews-EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#REVIEWS
I use Crawl Web as an operator, but I don't get any output.
Can anybody tell me what I'm doing wrong?
Thanks, Arno
Answers
Your rule for storing misses the -or10. Use copy and paste next time.
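A quick way to see this kind of mismatch is to test the rule's regular expression against the URLs outside the crawler. The sketch below is only an illustration: the rule from the original process is not shown in this thread, so both patterns are hypothetical, and the second-page URL is an assumed shape with the -or10 segment placed after "Reviews". It also assumes the rule has to match the whole URL.

```python
import re

# Hypothetical rule patterns; the actual rule from the process is not shown.
RULE_WITHOUT_OFFSET = r".+Hotel_Review-g188590-d2333086-Reviews-EasyHotel_Amsterdam.+"
RULE_WITH_OFFSET = r".+Hotel_Review-g188590-d2333086-Reviews(-or\d+)?-EasyHotel_Amsterdam.+"

first_page = (
    "http://www.tripadvisor.nl/Hotel_Review-g188590-d2333086-Reviews-"
    "EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html"
)
# Assumed shape of a follow-up page URL, including the -or10 offset segment.
second_page = (
    "http://www.tripadvisor.nl/Hotel_Review-g188590-d2333086-Reviews-or10-"
    "EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html"
)

def matches(rule, url):
    """Return True when the rule matches the entire URL."""
    return re.fullmatch(rule, url) is not None

for name, rule in [("without -or", RULE_WITHOUT_OFFSET), ("with -or", RULE_WITH_OFFSET)]:
    print(name, matches(rule, first_page), matches(rule, second_page))
```

Under these assumptions, the pattern without the optional offset group accepts only the first page, while the one with `(-or\d+)?` accepts both.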
Arno
You helped me a great deal with crawling this URL: http://www.tripadvisor.nl/Hotel_Review-g188590-d2333086-Reviews-EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#REVIEWS
Now I have created an XPath to retrieve the reviews. The XPath works in Google Docs but not in RapidMiner. The reason is that I have to crawl the following URL:
http://www.tripadvisor.nl/ShowUserReviews-g188590-d2333086-r155685828-EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#
Both URLs lead to the same reviews. I would like to use RapidMiner to follow the pages. The only thing that changes when going to the next page is the review number, for example -r155685828. The URL of the next page is the same except for the r number, which has changed to r162587896.
My process is:
Can you help me once again?
Thanks, Arno
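The URL pattern described above, where only the r number differs between pages, can be written as a single regular expression. This is a minimal sketch, not the process from this thread: the names `REVIEW_URL` and `review_id` are made up for illustration, and the pattern assumes the rest of the URL stays exactly as in the example.

```python
import re

# Hypothetical pattern: everything is fixed except the digits after "-r".
REVIEW_URL = re.compile(
    r"http://www\.tripadvisor\.nl/ShowUserReviews-g188590-d2333086-"
    r"r(\d+)-EasyHotel_Amsterdam-.+"
)

def review_id(url):
    """Return the review number from a matching URL, or None if it does not match."""
    match = REVIEW_URL.fullmatch(url)
    return match.group(1) if match else None

urls = [
    "http://www.tripadvisor.nl/ShowUserReviews-g188590-d2333086-r155685828-"
    "EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#",
    "http://www.tripadvisor.nl/ShowUserReviews-g188590-d2333086-r162587896-"
    "EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#",
    "http://www.tripadvisor.nl/Hotel_Review-g188590-d2333086-Reviews-"
    "EasyHotel_Amsterdam-Amsterdam_North_Holland_Province.html#REVIEWS",
]

for url in urls:
    print(review_id(url))
```

The same expression, with `r\d+` in place of a fixed review number, is the kind of pattern a follow or store rule would need so that every review page matches, not just the first one.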