Dictionary Approach: Avoid counting words multiple times

Felice Member Posts: 3 Newbie
edited November 2019 in Help
Hi, I have a problem with the dictionary approach in text mining. The dictionary contains the words digit and digital acceleration. My process counts each occurrence of digital acceleration twice: once as digit and once as digit acceleration.
Can you recommend an operator that ensures only the occurrence of digital acceleration is counted, so that in the end I get just one occurrence?

Thanks for helping!

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,314 RM Data Scientist
    Hi,
    which operator did you use for this? Could you post the process XML?

    Best,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Felice Member Posts: 3 Newbie
    Hi Martin, thanks for your reply! You can find the XML attached.
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,314 RM Data Scientist
    Can you just switch to binary occurrences? Then it only counts whether a word occurs, not how often.

    Cheers,
    Martin
    - Head of Data Science Services at RapidMiner -
    Dortmund, Germany
  • Felice Member Posts: 3 Newbie
    Hi Martin,
    thanks for your reply. But I need the number of occurrences, not only whether a word occurs. I just want to avoid longer word combinations like digit accel also being counted as digit.

    Thanks for helping!
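
The behaviour asked for above (count how often each dictionary entry occurs, but let a longer entry such as digit acceler take precedence over the shorter digit it contains) can be sketched outside RapidMiner as longest-match-first counting. This is only an illustration in plain Python, not a RapidMiner operator: the function name `count_dictionary_terms`, the sample text, and the stemmed dictionary entries are all made up for the example, and it assumes the text has already been lowercased and stemmed the same way as the dictionary.

```python
import re
from collections import Counter


def count_dictionary_terms(text, dictionary):
    """Count dictionary entries in text; longer entries win over shorter ones.

    Hypothetical helper for illustration. Assumes `text` and `dictionary`
    have already been stemmed and normalised in the same way.
    """
    # Put longer entries first so the regex alternation tries
    # "digit acceler" before "digit" at every position.
    terms = sorted(dictionary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(?:" + "|".join(re.escape(t) for t in terms) + r")\b"
    )
    # finditer returns non-overlapping matches, so text consumed by a
    # longer entry is not counted again for the shorter one.
    return Counter(m.group(0) for m in pattern.finditer(text.lower()))


# Made-up, already-stemmed sample text and dictionary.
sample_text = "digit acceler help digit team and digit acceler again"
sample_dictionary = ["digit", "digit acceler"]
counts = count_dictionary_terms(sample_text, sample_dictionary)
print(counts)
```

In this sample the two occurrences of digit acceler are counted only under the longer entry, and digit is counted once, for the standalone occurrence. Whether the same effect is best achieved in a RapidMiner process depends on the process XML, which is not shown in this thread.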