Process Documents multiple times to get TF-IDF and TO in one output file

websiteguy Member Posts: 24 Maven
edited November 2018 in Help
Hi, this is my first post, so hello all.

OK, I sorted that out using Multiply, but I still need the term frequency per document rather than the total occurrences.
So if the word "cheap" appears in both documents, I need the number of occurrences in document A and the number of occurrences in document B, NOT the combined total of occurrences across both documents.

Can anyone help me out with this? Cheers,





Best Answer

  • websiteguy Member Posts: 24 Maven
    Solution Accepted
    Thanks, that's a good idea. I had not thought of that.

Answers

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn
    How is your process set up? If you use the Term Occurrences option, it will tell you how many times a word appears in each document.

    (This can be handy with large datasets that keep growing when you have limited memory, because you can store the term occurrences for each document and then calculate the TF-IDF in batches as needed.)
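
To illustrate the idea outside RapidMiner, here is a minimal Python sketch that counts term occurrences per document, keeps those counts, and derives TF-IDF from them later. Everything in it is an assumption made for illustration: the names `term_occurrences` and `tf_idf`, the whitespace tokenisation, the two sample documents, and the textbook formula (occurrences divided by document length, times the natural log of N over document frequency). RapidMiner's own TF-IDF weighting and normalisation may differ.

```python
import math
from collections import Counter


def term_occurrences(text):
    """Count how often each lowercased, whitespace-separated token appears in ONE document."""
    return Counter(text.lower().split())


def tf_idf(stored_counts):
    """Compute TF-IDF for every document from stored per-document counts.

    Assumed formula: TF = occurrences / tokens in the document,
    IDF = ln(number of documents / number of documents containing the term).
    """
    n_docs = len(stored_counts)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for counts in stored_counts.values():
        doc_freq.update(counts.keys())

    scores = {}
    for name, counts in stored_counts.items():
        total = sum(counts.values())
        scores[name] = {
            term: (occ / total) * math.log(n_docs / doc_freq[term])
            for term, occ in counts.items()
        }
    return scores


# Hypothetical sample documents, keyed by document name.
documents = {
    "A": "cheap flights cheap hotels",
    "B": "cheap car hire",
}

# Per-document counts; these are what you would store and reuse.
stored = {name: term_occurrences(text) for name, text in documents.items()}
print(stored["A"]["cheap"])  # 2 (document A only)
print(stored["B"]["cheap"])  # 1 (document B only)

# TF-IDF is derived later from the stored counts, without re-reading the text.
scores = tf_idf(stored)
```

Because only the counts are kept, new documents can be added to `stored` as they arrive and `tf_idf` rerun when the scores are actually needed.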