count function for nominal values

lghansse Member Posts: 18 Contributor II
edited October 2019 in Help

Hi,

I have a list of 25 nominal attributes that I would like to aggregate into a single attribute counting how many of the 25 have a valid (that is, non-missing) value, but I'm at a loss as to how to do it easily. I've looked at the Aggregate, Generate Aggregation and Generate Attributes operators: the aggregation functions seem useful only for integers, and Generate Attributes does not have a count function (at least, not one that I've found). I've included an example below for clarity.

att1     att2     att3     att4     att5
valuex   valuey   missing  valuez   missing
missing  valuex   missing  valuey   missing

-> So the new attribute should have value 3 for example 1, and 2 for example 2.

Does anyone have experience with this?

Best Answer

  • Edin_Klapic Moderator, Employee, RMResearcher, Member Posts: 299 RM Data Scientist
    Solution Accepted

    Hi @lghansse,

    The Generate Aggregation operator has an aggregation function called count.

    It works as expected - I get the value 3 for example 1, and 2 for example 2.

    See screenshot below for details.

    Best regards,

    Edin

    image.png


Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi Lise,

    If you have Python installed on your computer, you can use the "Execute Python" operator (available for download and installation via the Marketplace) to perform this task; it takes only one line of code.

    Below is the process, using your fictitious example set.

    The calculated "count_valid_values" attribute is in the last column.

    Here is the process:

    <operator activated="true" class="process" compatibility="8.0.001" expanded="true" name="Process">

    Your fictitious example set is in the attached file.

    I hope this will be helpful.

    Regards,

    Lionel

  • lghansse Member Posts: 18 Contributor II

    Thank you. I had tried it before, but made the mistake of ticking off the "ignore missings" checkbox, because I assumed it would not count if an attribute had missing values (which would defeat my purpose).

    Thanks for the help!

    Lise

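The process XML from the "Execute Python" answer above is not preserved in this thread, so the following is only a minimal sketch of what such a one-line count might look like, not the original script. It assumes the operator passes the example set to an `rm_main` function as a pandas DataFrame and that missing nominal values arrive as `None`/`NaN`; the small `example` frame is made up to mirror the two examples in the question.

```python
import pandas as pd


def rm_main(data: pd.DataFrame) -> pd.DataFrame:
    # Count, per example (row), how many attributes hold a non-missing value.
    # Assumption: missing nominal values are represented as None/NaN.
    data["count_valid_values"] = data.notna().sum(axis=1)
    return data


# Hypothetical data mirroring the two examples from the question.
example = pd.DataFrame({
    "att1": ["valuex", None],
    "att2": ["valuey", "valuex"],
    "att3": [None, None],
    "att4": ["valuez", "valuey"],
    "att5": [None, None],
})

result = rm_main(example.copy())
print(result)
```

With this data, `count_valid_values` comes out as 3 for the first example and 2 for the second, matching the result expected in the question. The right-hand side is evaluated before the new column is assigned, so the count column does not count itself.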