Comparing texts: is Cross Distances the right approach?
Hello community,
We have a project in which we want to compare the learning content of our university course (script) with different Udemy courses.
Reference set: We have read in our professor's script and a book for the lecture as PDF documents and used the Text Processing Extension to generate a list of words, which serves as our reference.
We now want to compare this reference list with the word lists from the Udemy course descriptions to check which course has the smallest distance to our reference. In other words, we want to rank the list of courses.
We have already built 3 processes including data, which I have attached to this post. Unfortunately, we don't know whether the Cross Distances approach is the right one at the moment.
Best regards
Best Answer
Telcontar120 (Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member; Posts: 1,635)
This seems similar to the earlier question. I am not sure there is a single right answer here. I think Cross Distances is one suitable approach. You could also look at clustering or LDA topic extraction to supplement it.
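To make the distance-based ranking idea concrete, here is a minimal sketch in plain Python, outside RapidMiner. It is not the Cross Distances operator's implementation. It assumes raw term counts and cosine distance as the measure, and the course names, texts, and function names (`tokenize`, `cosine_distance`, `rank_courses`) are hypothetical, made up only for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two term-frequency Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0  # an empty document shares nothing with any other
    return 1.0 - dot / (norm_a * norm_b)


def rank_courses(reference_text, courses):
    """Return (course name, distance) pairs sorted from closest to farthest."""
    ref = Counter(tokenize(reference_text))
    scored = [
        (name, cosine_distance(ref, Counter(tokenize(description))))
        for name, description in courses.items()
    ]
    return sorted(scored, key=lambda pair: pair[1])


# Hypothetical data, invented purely for illustration.
reference = "regression classification clustering model evaluation"
courses = {
    "Course A": "learn regression and classification with model evaluation",
    "Course B": "watercolor painting for beginners",
}

ranking = rank_courses(reference, courses)
for name, dist in ranking:
    print(f"{name}: {dist:.3f}")
```

The course with the smallest distance comes first in the ranking. In practice you would probably want stop-word removal and TF-IDF weighting before computing distances, since very common words can otherwise dominate the raw counts.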