"[SOLVED] Empty Word List"

beedaan Member Posts: 4 Contributor I
edited June 2019 in Help
Hi All,

I am counting the occurrences of words in a .txt document. The text document contains abstracts of other documents, along with each document's title. The general format of the file is as follows:




...

This continues for roughly 36,00 documents. The total size of the document is 46MB. I am expecting to get a word list of word occurrences as a result. What I actually get is an empty word list. Here is my attached process:





<macros/>






<operator activated="true" class="text:process_documents" compatibility="5.2.004" expanded="true" height="94" name="Process Documents" width="90" x="447" y="75">


























I used this YouTube video as a guide: https://www.youtube.com/watch?feature=endscreen&NR=1&v=EjD2M4r4mBM

Please let me know what I am doing wrong. Thanks.

Answers

  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Heya,

    it might be helpful if you check the option "create word vector" in the Process Documents operator :)
    Additionally, you are reading only one document, but your pruning settings are configured to ignore words which appear in fewer than two documents. With a single document, every word falls below that threshold, so the word list comes out empty. For testing I suggest disabling pruning.

    Happy mining,
    Marius
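
To illustrate the pruning point above, here is a minimal Python sketch. It is not RapidMiner's implementation; `word_list` and `min_docs` are hypothetical names, and the simple lowercase tokenizer is an assumption. It only shows why a "must appear in at least two documents" rule leaves nothing behind when the input is a single document.

```python
from collections import Counter
import re


def word_list(documents, min_docs=1):
    """Count word occurrences, keeping only words that occur in at
    least `min_docs` documents (a simple stand-in for pruning)."""
    total = Counter()     # total occurrences across all documents
    doc_freq = Counter()  # number of documents containing each word
    for text in documents:
        # Assumed tokenizer: lowercase runs of ASCII letters.
        tokens = re.findall(r"[a-z]+", text.lower())
        total.update(tokens)
        doc_freq.update(set(tokens))
    return {w: n for w, n in total.items() if doc_freq[w] >= min_docs}


# The whole file read as one document, as in the original process.
single = ["text mining counts words in text"]

print(word_list(single, min_docs=2))  # pruning on: {}
print(word_list(single, min_docs=1))  # pruning off: counts for every word
```

With pruning effectively disabled (`min_docs=1`), the same single document yields a count for each word, which matches the advice to turn pruning off while testing.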
  • beedaan Member Posts: 4 Contributor I
    Thanks for the help. This worked for me. I have a question, though: I got it to work first by creating a word vector, and then again by not creating a word vector. In both cases my results still included a word list. What does "create word vector" actually do?
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    It should prevent the creation of the word vector if disabled. However, I have never disabled the option, because I see no reason why I would not create a wordlist.

    After changing options, it is generally a good idea to hit "enter" or click somewhere on the process pane to make sure that the changes are actually submitted. Maybe the options were not applied when you hit the run button (yes, this needs improvement :-\ )

    Best, Marius
  • beedaan Member Posts: 4 Contributor I
    Thanks for the response. I'm tinkering around with some of the text association features. I am having issues with the program crashing. I can tell you what I am doing to get these crashes if you are interested.
  • MariusHelf RapidMiner Certified Expert, Member Posts: 1,869 Unicorn
    Of course we are interested in that, but please open a new thread for it. If you get a dialog with "Submit Bug", you can also just click that button and describe everything in the dialog that pops up. That way the bug is submitted directly into our bug tracking system and won't get lost in the depths of the forum. Additionally, the bug report will contain some valuable information about the program state at the moment of the crash, which will greatly help us to fix it.
  • beedaan Member Posts: 4 Contributor I
    Awesome! Thanks for your reply.