RFM - nth selection process to create a test sample in RapidMiner. Can someone assist?

cwoo Member Posts: 10 Contributor II
edited November 2018 in Help

Given a scored RFM master file, I would like to extract an nth-selection test sample. E.g., if the nth selection is 10, then the sample will consist of every 10th record and should create a statistically similar test sample.

A 400,000-record file will result in a test file of 40,000 examples.

Colin

Best Answers

  • earmijo Member Posts: 270 Unicorn
    Solution Accepted

    I don't claim efficiency or beauty but the code below ought to work.

    <macros/>

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    Solution Accepted

    You are probably aware of this, but there is also a Sample operator. It doesn't take exactly every nth record, but it does have parameters for randomly taking either an absolute number of records or a percentage, and if you set the random seed then the results will be reproducible. For most purposes a random sample is sufficient (and may even be preferable) compared to a sample based on a heuristic such as "every nth record."

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
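
For readers outside RapidMiner, the idea of a reproducible random sample can be sketched in plain Python. This is only an illustration of the concept, not how the Sample operator is implemented; the function name `random_sample`, the `master` list, and the seed value are all made up for the example.

```python
import random


def random_sample(records, size, seed):
    """Draw `size` records at random, without replacement.

    A private generator seeded with `seed` makes the draw repeatable.
    """
    rng = random.Random(seed)
    return rng.sample(records, size)


# Hypothetical scored master file: 100 customers with made-up scores.
master = [{"customer": i, "rfm_score": 100 + i} for i in range(1, 101)]

first_draw = random_sample(master, 10, seed=2018)
second_draw = random_sample(master, 10, seed=2018)

# Same seed, same input, same sample.
print(first_draw == second_draw)
```

Changing the seed gives a different but equally valid sample, which is the main practical difference from a fixed "every nth record" rule.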

Answers

  • cwoo Member Posts: 10 Contributor II

    Thank you very much.

    Quite simple: use Generate ID, then generate an attribute with the modulus function, then filter for all examples where the mod is 0.

    Excellent

    Colin
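
The three steps described in this reply (add an id, take the id modulo n, keep the rows where the result is 0) can be sketched in plain Python as below. It is a rough illustration only, assuming ids start at 1; the function name `every_nth` and the `master` data are hypothetical and not part of any RapidMiner process.

```python
def every_nth(records, n):
    """Return every n-th record, numbering records from 1."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    # Step 1: attach a sequential id (1, 2, 3, ...) to each record.
    with_ids = enumerate(records, start=1)
    # Steps 2 and 3: compute id mod n and keep rows where it equals 0.
    return [row for row_id, row in with_ids if row_id % n == 0]


# Hypothetical scored master file: 100 customers with made-up scores.
master = [{"customer": i, "rfm_score": 100 + i} for i in range(1, 101)]

sample = every_nth(master, 10)
print([row["customer"] for row in sample])
```

With n = 10, a 400,000-record input yields 40,000 records, matching the sizes in the original question.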

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn

    Hi,

    You can make it more efficient with Filter Examples' option to use an expression right away. With that you can save the overhead of Generate Attributes and adding a new column. You simply enter there an expression that evaluates to true or false, where you can use the mod function on the id as in the example above.

    Greetings,

    Sebastian

  • bhupendra_patil Administrator, Employee, Member Posts: 168 RM Data Scientist
  • cwoo Member Posts: 10 Contributor II

    thanks for refining it
