How to delete rows based on a list of values

Ritika, Member, Posts: 11, Newbie
Hi! I have two datasets: the first is a large set with a list of names and the info associated with each name, and the second is a smaller set containing only names. I want to delete the rows in the first set whose names are not included in the second dataset. I know this is possible with the Filter Examples operator, but I do not want to enter the filters manually (there are more than 100). Is there an operator that could read one file and delete the matching rows in another file?

Answers

  • lionelderkrikor, Moderator, RapidMiner Certified Analyst, Member, Posts: 1,195, Unicorn
    Hi @Ritika,

    The attached file contains an example process that performs your task using the Set Minus operator.
    You can adapt it to your use case.

    Hope this helps,

    Regards,

    Lionel


  • Ritika, Member, Posts: 11, Newbie

    Hello Lionel,

    I am getting the same malformed error. Sorry about this. Could you send the code?

  • lionelderkrikor, Moderator, RapidMiner Certified Analyst, Member, Posts: 1,195, Unicorn
    @Ritika,

    Yes, sure :

    <?xml version="1.0" encoding="utf-8"?> <process version="9.9.002">
    Regards,

    Lionel

  • Ritika, Member, Posts: 11, Newbie
    Hi Lionel,

    Sorry for the late response, but yes, this worked! Is there also a way to keep instances if the table merely contains those values? I believe this process works only when the table contains those exact values. In other words, say I wanted to keep the name Mike and there are instances of Mike Anderson and Mike Brown; I would want to keep both of them regardless of the last name -- I'm just looking for values that contain Mike.
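
To make the difference between exact and "contains" matching concrete, here is a minimal sketch in plain Python. It is not the RapidMiner process from this thread and does not reproduce the Set Minus operator; it only illustrates the two matching rules. The sample rows, the `name` column, and the helper names `keep_exact` and `keep_containing` are all hypothetical.

```python
# Hypothetical sample data standing in for the two datasets in the question.
rows = [
    {"name": "Mike Anderson", "city": "Boston"},
    {"name": "Mike Brown", "city": "Denver"},
    {"name": "Sara Lee", "city": "Austin"},
    {"name": "Mike", "city": "Miami"},
]
keep_names = ["Mike"]  # the smaller, names-only dataset


def keep_exact(rows, names, column="name"):
    """Keep rows whose value in `column` equals one of `names` exactly."""
    allowed = set(names)
    return [row for row in rows if row[column] in allowed]


def keep_containing(rows, names, column="name"):
    """Keep rows whose value in `column` contains any of `names` as a substring."""
    return [row for row in rows if any(n in row[column] for n in names)]


exact = keep_exact(rows, keep_names)
partial = keep_containing(rows, keep_names)

print([r["name"] for r in exact])    # ['Mike']
print([r["name"] for r in partial])  # ['Mike Anderson', 'Mike Brown', 'Mike']
```

Note that the substring test here is case-sensitive and would also keep a value such as "Mikey"; if that is too loose, the match could instead be made on whole words.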