"Rocchio Algorithm"

dali Member, Posts: 6, Contributor II
edited May 2019 in Help
Hi,

Is there an implementation of the Rocchio algorithm in RapidMiner? If not, how could I turn k-Nearest-Neighbor into a Rocchio classifier by calculating the average word vector for each class and using only these averages for classification?

Thanks in advance.

Answers

  • dali Member, Posts: 6, Contributor II
    Hello again,

    It's a pity that there is no Rocchio in RapidMiner. I'm now trying to build my own, but I'm already having problems getting the mean of all word vectors of a class.

    Is there a function that averages all given word vectors so that I get one centroid vector? I can't find it.

    Thanks for any help.
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 2,531, Unicorn
    Hi,
    this is possible if you somehow misuse K-Medoids. See the following process for details:

    <parameter key="repository_entry" value="//Samples/data/Iris"/>

    Unfortunately, this won't work in the current version because of a bug: the nominal distance measure also uses the numerical attributes. This will be fixed in the update coming at the end of next week.

    Greetings,
    Sebastian
  • dali Member, Posts: 6, Contributor II
    Thanks for the reply. I'm really looking forward to trying it at the end of the week. I'll let you know whether it worked.
  • dali Member, Posts: 6, Contributor II
    Well, it looked like a good idea to "misuse K-Medoids", but it looks like it would take hours to calculate; I stopped it after half an hour. I think the problem is that RM is trying to find the classes itself, whereas using the given classes might speed up the whole process.

    Isn't there another operator that just calculates the mean of some word vectors? There must be something that averages all given vectors and returns the mean vector. I just can't find it.

    Thanks a lot for any advice.
  • IngoRM Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, Community Manager, RMResearcher, Member, University Professor, Posts: 1,751, RM Founder
    Hi,

    I have just uploaded a process which calculates the average values of all attributes grouped by class and uses the resulting prototypes as input for the k-NN learner. You may need a recent RapidMiner version, since this process makes use of a relatively new feature of the "Aggregate" operator, namely directly aggregating a set of attributes with the same default function. Otherwise you would have to define the aggregations for all attributes manually, which is of course not really feasible for word vectors...

    The description of the process on myExperiment can be found at

    http://www.myexperiment.org/workflows/1917.html

    You can download the process directly from myExperiment within RapidMiner (which I strongly recommend) by using the Community Extension of RapidMiner. Just install the extension and activate the "MyExperiment Browser" view. Then you can easily search for processes and download them. The process is called "Rocchio".

    Cheers,
    Ingo
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member, Posts: 2,531, Unicorn
    Hi,
    let me mention that this is possible only with version 5.1.002 or later, which was released a week ago.

    Some problems become outdated really fast...

    Greetings,
    Sebastian
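
For readers who want to see the idea discussed in this thread outside of RapidMiner, here is a minimal sketch in plain Python. It is not the process from myExperiment and not a RapidMiner API. It only illustrates the approach described above: average the word vectors of each class into one centroid, then assign a new vector to the class of its nearest centroid. The function names (`class_centroids`, `cosine_similarity`, `classify`), the toy data, and the choice of cosine similarity as the distance measure are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def class_centroids(vectors, labels):
    """Average the vectors of each class into one centroid per class."""
    sums = {}
    counts = defaultdict(int)
    for vec, label in zip(vectors, labels):
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        for i, value in enumerate(vec):
            sums[label][i] += value
        counts[label] += 1
    return {
        label: [s / counts[label] for s in total]
        for label, total in sums.items()
    }


def cosine_similarity(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify(vec, centroids):
    """Return the label whose centroid is most similar to vec."""
    return max(centroids, key=lambda label: cosine_similarity(vec, centroids[label]))


# Hypothetical toy word vectors (e.g. term counts) with known classes.
vectors = [
    [2.0, 0.0, 1.0],
    [4.0, 0.0, 1.0],
    [0.0, 3.0, 1.0],
    [0.0, 1.0, 1.0],
]
labels = ["sports", "sports", "politics", "politics"]

centroids = class_centroids(vectors, labels)
prediction = classify([1.0, 0.0, 0.0], centroids)
```

Because only one prototype per class is kept, classification compares each new vector against as many centroids as there are classes, which is why this is much cheaper than clustering the data first.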