RFM - nth selection process to create a test sample in RapidMiner. Can someone assist?
Given a scored RFM master file, I would like to extract an nth-selection test sample. E.g., if the nth selection is 10, then the sample will consist of every 10th record and should give a statistically similar test sample.
A 400,000-record file will result in a test file of 40,000 examples.
Colin
Best Answers
earmijo Member Posts: 270 Unicorn
I don't claim efficiency or beauty but the code below ought to work.
<macros/>
Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
You are probably aware of this, but there is also a "Sample" operator. It doesn't take exactly every nth record, but it does have parameters for randomly taking either an absolute number of records or a percentage, and if you set the random seed then the results will be reproducible. For most purposes a random sample is sufficient (and may even be preferable) compared to a sample based on a heuristic such as "every nth record."
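To illustrate the idea of a reproducible random sample outside RapidMiner, here is a minimal Python sketch. It is not the Sample operator itself; the function name `random_sample`, the seed value, and the stand-in master list are all assumptions made up for the example.

```python
import random

def random_sample(records, k, seed):
    """Draw k records at random without replacement.

    A dedicated, seeded generator makes the draw repeatable:
    the same records, k, and seed always give the same sample.
    """
    rng = random.Random(seed)
    return rng.sample(records, k)

# Hypothetical stand-in for a 400,000-row scored master file.
master = list(range(400_000))

first = random_sample(master, 40_000, seed=1992)
second = random_sample(master, 40_000, seed=1992)

print(len(first), first == second)  # 40000 True
```

Changing the seed gives a different, but equally valid, 10% sample.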
Answers
Thank you very much.
Quite simple: use Generate ID, then compute the modulus of the ID, then keep only the examples where mod is 0.
Excellent
Colin
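For readers who want to see the ID, modulus, and filter steps spelled out, here is a rough Python sketch of the same three steps. It only mirrors the logic, not the RapidMiner operators; the helper names (`add_ids`, `add_mod`, `keep_mod_zero`, `nth_sample`) and the `rfm_score` field are hypothetical, and IDs are assumed to start at 1.

```python
def add_ids(records):
    """Step 1: attach a 1-based id to every record."""
    return [{"id": i, **rec} for i, rec in enumerate(records, start=1)]

def add_mod(rows, n):
    """Step 2: add a helper column holding id modulo n."""
    return [{**row, "mod": row["id"] % n} for row in rows]

def keep_mod_zero(rows):
    """Step 3: keep only the rows whose helper column is 0."""
    return [row for row in rows if row["mod"] == 0]

def nth_sample(records, n):
    """Run the three steps in order to get every nth record."""
    return keep_mod_zero(add_mod(add_ids(records), n))

# Hypothetical stand-in for a 400,000-row scored master file.
master = [{"rfm_score": s} for s in range(400_000)]

sample = nth_sample(master, 10)
print(len(sample))                       # 40000
print(sample[0]["id"], sample[1]["id"])  # 10 20
```

With n = 10 the sample holds records 10, 20, 30, and so on, which matches the 400,000 to 40,000 reduction in the question.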
Hi,
You can make it even more efficient with the Filter Examples option to use an expression right away. With that you can save the overhead of Generate Attribute and adding a new column. You simply enter an expression there that evaluates to true or false, where you can use the mod function on the id as in the example above.
Greetings,
Sebastian
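The same shortcut can be sketched in Python: instead of storing the modulus in an extra column, the true/false test is evaluated while filtering. This is only an analogy to the expression-based filter, with a hypothetical function name `every_nth` and a 1-based position standing in for the id.

```python
def every_nth(records, n):
    """Return a generator over every nth record.

    The modulus test is applied directly while filtering,
    so no helper column is created or stored.
    """
    if n < 1:
        raise ValueError("n must be a positive integer")
    return (rec for pos, rec in enumerate(records, start=1) if pos % n == 0)

# Small made-up input: the values 1..25, so value equals position.
sample = list(every_nth(range(1, 26), 10))
print(sample)  # [10, 20]
```

Because the result is a generator, records are examined one at a time and nothing beyond the selected records needs to be kept in memory.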
Excellent suggestion @land
Here is an implementation of that suggestion:
https://github.com/patilbhupendra/Sample_RapidMiner_Processes/blob/master/get%20every%2010th%20Record.rmp
Thanks for refining it.