How to divide ZIP codes into ranges for a cluster analysis?

a_trunk | Member | Posts: 4 | Contributor I
edited December 2018 in Help

Hello,

Sorry for my simple question, but I have not worked with RapidMiner for long and I need it for my studies. I have a simple case that I cannot solve correctly: I have a dataset of 100,000 ZIP codes and customer numbers, and I want to analyse the best-selling areas in my country. So I decided to use cluster analysis. ZIP codes in Germany run from 00001 to 99999, and I want to build clusters by range, for example 00001 to 00500 and 70000 to 75000.

My question: how can I tell RapidMiner to build the clusters by these ranges?

Many thanks for your help.

Answers

  • lionelderkrikor | Moderator, RapidMiner Certified Analyst, Member | Posts: 1,195 | Unicorn

    Hi @a_trunk,

    You can try using the Split Data operator to create partitions of your data, as in this process:

    <macros/>

    <parameter key="generator_type" value="comma_separated_text"/>

    <connect from_op="Create ExampleSet" from_port="output" to_op="Split Data" to_port="example set"/>
    <connect from_op="Split Data" from_port="partition 1" to_port="result 1"/>
    <connect from_op="Split Data" from_port="partition 2" to_port="result 2"/>
    <connect from_op="Split Data" from_port="partition 3" to_port="result 3"/>

    I hope it helps,

    Regards,

    Lionel
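
For readers who want to see the range-based grouping from the question spelled out, here is a minimal Python sketch outside RapidMiner. It is not the process above. It assumes ZIP codes are stored as five-character strings, and the range labels, the `ZIP_RANGES` table, and both function names are hypothetical choices made for illustration.

```python
# Hypothetical range definitions: (label, first ZIP, last ZIP), bounds inclusive.
# The two ranges mirror the examples given in the question.
ZIP_RANGES = [
    ("range_00001_00500", 1, 500),
    ("range_70000_75000", 70000, 75000),
]


def assign_range(zip_code, ranges=ZIP_RANGES, default="other"):
    """Return the label of the first range containing zip_code, else default."""
    value = int(zip_code)  # "00042" becomes 42; ordering is unaffected
    for label, low, high in ranges:
        if low <= value <= high:
            return label
    return default


def partition_by_range(rows):
    """Group (zip_code, customers) rows into one list per range label."""
    partitions = {}
    for zip_code, customers in rows:
        label = assign_range(zip_code)
        partitions.setdefault(label, []).append((zip_code, customers))
    return partitions
```

Calling `partition_by_range` on a list of `(zip_code, customers)` pairs returns a dictionary with one entry per range that actually occurs, plus `"other"` for codes outside every defined range.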

  • Telcontar120 | Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member | Posts: 1,635 | Unicorn

    You might also want to create a new attribute (using Generate Attributes) that corresponds to a higher-level grouping of postal codes. Using the prefix function, you can create aggregated groups at the 1-digit level, the 2-digit level, and so on. These groups can then be passed to the clustering algorithm instead of the raw ZIP code. The problem with the raw ZIP code is that RapidMiner has no idea it encodes a hierarchical relationship; it just interprets it as a set of distinct nominal values.

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
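
The prefix grouping described in this answer can be illustrated outside RapidMiner with a short Python sketch. This is only an approximation of the idea, not the Generate Attributes expression itself. It assumes each row is a `(zip_code, customers)` pair; the function names and the sample rows are made up for illustration.

```python
from collections import defaultdict


def zip_prefix(zip_code, digits):
    """Return the first `digits` characters of a five-digit ZIP code."""
    # zfill restores leading zeros that are lost if the code was stored as a number
    return str(zip_code).zfill(5)[:digits]


def customers_by_prefix(rows, digits=2):
    """Sum customer counts per ZIP prefix of the given length."""
    totals = defaultdict(int)
    for zip_code, customers in rows:
        totals[zip_prefix(zip_code, digits)] += customers
    return dict(totals)


# Made-up sample rows, not taken from the thread
rows = [("01067", 12), ("01099", 8), ("70173", 30), ("72070", 5), ("10115", 20)]

print(customers_by_prefix(rows, digits=1))
print(customers_by_prefix(rows, digits=2))
```

Keeping the ZIP code as text matters here: a numeric column would turn `01067` into `1067`, and the prefix would then describe a different region.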