Deployment of models with dummy encoded attributes

sectyn Member Posts: 25 Maven
I have a categorical attribute that I dummy encode, which generates a few extra columns. Suppose my score data does not contain some of those columns. During scoring, a message says that the columns which are in the train data but not in the score data will be filled in using mean or mode values. Where can I specify whether the mean or the mode is used? And if I want neither the mean nor the mode, but just 0, how can I do that?

Best Answer

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,421 RM Data Scientist
    Solution Accepted
    Hi,
    What you want to use is the preprocessing model produced by Nominal to Numerical. If you apply this model to the score data, you get exactly the same columns as in training.

    Best,
    Martin
    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
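
The idea behind the answer above can be illustrated outside RapidMiner with a minimal Python sketch. It is not RapidMiner's implementation: the function names `fit_dummy_encoder` and `apply_dummy_encoder`, the sample data, and the `attribute = value` column naming are all assumptions made for illustration. The point it shows is that the category list is learned once from the training data and then reused for scoring, so a category missing from the score data produces a column of 0s instead of a mean or mode fill.

```python
def fit_dummy_encoder(rows, attribute):
    """Learn the category list for one nominal attribute from training rows."""
    return sorted({row[attribute] for row in rows})


def apply_dummy_encoder(rows, attribute, categories):
    """Encode rows using the categories learned in training.

    Every learned category becomes a column. A category that does not occur
    in a row (or anywhere in the scoring set) is encoded as 0.
    """
    encoded = []
    for row in rows:
        # Keep all other attributes unchanged.
        out = {k: v for k, v in row.items() if k != attribute}
        for category in categories:
            out[f"{attribute} = {category}"] = 1 if row[attribute] == category else 0
        encoded.append(out)
    return encoded


# Hypothetical data: the scoring set only ever contains "red".
train = [
    {"color": "red", "size": 1.0},
    {"color": "green", "size": 2.0},
    {"color": "blue", "size": 3.0},
]
score = [
    {"color": "red", "size": 1.5},
    {"color": "red", "size": 2.5},
]

categories = fit_dummy_encoder(train, "color")  # learned from training only
train_encoded = apply_dummy_encoder(train, "color", categories)
score_encoded = apply_dummy_encoder(score, "color", categories)

print(list(score_encoded[0].keys()))
```

Because the scoring rows are encoded with the stored category list and not re-encoded from scratch, the scored table has the same columns as the training table, and the columns for "green" and "blue" contain only 0s.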
