"to ask about data sampling"

m_r_nour Member Posts: 35 Guru
edited May 2019 in Help
Hi all

I have an unbalanced dataset: one class has 500 times more examples than the other classes.

I want to resample so that every class has the same number of examples.
How can I do that?
I tried the sampling operators, but all of them just resample and preserve the ratio of examples between the classes.

Thank you for your consideration and time in advance

Regards,
REZA

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    which RapidMiner version do you use?

    Greetings,
    Sebastian
  • m_r_nour Member Posts: 35 Guru

    Version 4.6.


    To clarify, I want to repeat this balanced sampling several times and average the performance results to estimate the overall performance of this method.


    thanks

    Regards,
    REZA
  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    I think there are several possibilities you could use:
    If you are going to use a learner that supports example weights, you could use EqualLabelWeighting. This does not sample the examples; instead it equalizes the total weight assigned to each label. That might be even better, because no examples are lost at all.
    Another possibility would be to split the example set into one subset per label value and sample each subset down to the same size. After this, all subsets have to be merged, and voilà: you have a balanced example set.
    If this becomes unwieldy because you have too many label values, you might use the ValueIterator together with an IOStorer and an IORetriever...
    OK, that seems rather complex. Here's how it would work:

    <operator name="ExampleSetGenerator" class="ExampleSetGenerator" breakpoints="after">
















    <operator name="ExampleSetMerge" class="ExampleSetMerge">















    Hope this helps you understand what I'm suggesting.

    Greetings,
    Sebastian
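The two ideas in the answer above (equal label weights, and split / sample / merge) can also be sketched outside RapidMiner. The following is a minimal illustration in plain Python, not the RapidMiner process itself: the function names `equal_label_weights`, `balanced_sample`, and `average_over_resamples` are hypothetical, the weights are normalized so each label sums to 1.0 (an assumption, not necessarily what EqualLabelWeighting does internally), and `evaluate` is a placeholder for whatever training and performance measurement you run on each balanced sample.

```python
import random
from collections import defaultdict


def equal_label_weights(labels):
    """Weight each example so that every label's weights sum to 1.0.

    Nothing is discarded; rare labels simply get larger per-example weights.
    """
    counts = defaultdict(int)
    for label in labels:
        counts[label] += 1
    return [1.0 / counts[label] for label in labels]


def balanced_sample(rows, label_of, rng):
    """Split rows by label, downsample each group to the smallest
    group's size, then merge the groups back into one shuffled list."""
    groups = defaultdict(list)
    for row in rows:
        groups[label_of(row)].append(row)
    size = min(len(group) for group in groups.values())
    merged = []
    for group in groups.values():
        merged.extend(rng.sample(group, size))  # sampling without replacement
    rng.shuffle(merged)
    return merged


def average_over_resamples(rows, label_of, evaluate, runs=10, seed=0):
    """Draw several balanced samples and average the score that the
    caller-supplied `evaluate` function returns for each of them."""
    rng = random.Random(seed)
    scores = [evaluate(balanced_sample(rows, label_of, rng)) for _ in range(runs)]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Toy data with a 500:1 class ratio, as in the question.
    rows = [(i, "major") for i in range(5000)] + [(i, "minor") for i in range(10)]
    sample = balanced_sample(rows, lambda r: r[1], random.Random(42))
    print(len(sample))  # 10 examples per class, 20 in total
```

Because each run draws a different random subset of the majority class, averaging over several runs reduces the dependence of the result on any single draw, which is the repeated balanced sampling asked for earlier in the thread.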
  • m_r_nour Member Posts: 35 Guru
    thanks