compare csv files by ID column

giorogo Member Posts: 13 Contributor I
edited November 2018 in Help

Hello, I am new to this excellent program and need help with the following task. I have two CSV files with two common columns (ID and emotion). I would like to build a process that compares the two files and produces two documents: one listing all the IDs with the same emotion in both files, and another listing the IDs with different emotions. For example, if ID 001 has the emotion felicity in file A and sadness in file B, it goes into the file with different emotions. Could you tell me, step by step, how to do this? Thank you.


Best Answer

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi again @giorogo,

    Here is the complete process for what you want to do, based on @mschmitz's idea:

    <connect from_op="Read CSV" from_port="output" to_op="Set Role" to_port="example set input"/>
    <connect from_op="Set Role" from_port="example set output" to_op="Join" to_port="left"/>
    <connect from_op="Read CSV (2)" from_port="output" to_op="Set Role (2)" to_port="example set input"/>
    <connect from_op="Set Role (2)" from_port="example set output" to_op="Join" to_port="right"/>
    <connect from_op="Join" from_port="join" to_op="Filter Examples" to_port="example set input"/>
    <connect from_op="Filter Examples" from_port="example set output" to_port="result 1"/>
    <connect from_op="Filter Examples" from_port="unmatched example set" to_port="result 2"/>

    I hope it helps,

    Regards,

    Lionel
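For readers who want to check the logic outside RapidMiner, the flow wired up above (read both files, key them on the ID, join, then filter on whether the two emotions agree) can be sketched in plain Python. This is only an illustrative sketch, not part of the posted process: the function names `read_emotions` and `join_and_split`, the lowercase column names `id` and `emotion`, and the sample rows are all assumptions made for the example.

```python
import csv
import io

# Made-up sample data; the column names "id" and "emotion" are assumed.
FILE_A = "id,emotion\n001,felicity\n002,anger\n003,fear\n"
FILE_B = "id,emotion\n001,sadness\n002,anger\n004,joy\n"


def read_emotions(text):
    """Parse CSV text into a dict mapping each id to its emotion."""
    return {row["id"]: row["emotion"] for row in csv.DictReader(io.StringIO(text))}


def join_and_split(left, right):
    """Inner-join two {id: emotion} dicts on id, then split rows by agreement."""
    same, different = [], []
    for key in sorted(left.keys() & right.keys()):  # ids present in both files
        row = (key, left[key], right[key])
        if left[key] == right[key]:
            same.append(row)
        else:
            different.append(row)
    return same, different


same, different = join_and_split(read_emotions(FILE_A), read_emotions(FILE_B))
print(same)       # [('002', 'anger', 'anger')]
print(different)  # [('001', 'felicity', 'sadness')]
```

In this sketch, IDs that appear in only one file (003 and 004 in the sample data) are dropped, as in an inner join, so they end up in neither result.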

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist

    Hi,

    I think what you want is a Join operator first, with "id" as the key attribute on both sides.

    The result is a table like this:

    id annotation emotion [.... other attributes]

    Afterwards, use a Filter Examples operator to split the table into two parts: the rows where annotation = emotion, and the rest.

    Best,

    Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • giorogo Member Posts: 13 Contributor I

    First of all, thanks for the reply. Unfortunately, the problem is now in Filter Examples: is the configuration in the images correct?

    1.png 14.2K
    2.png 33.2K
    uno.png 56.9K
  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn

    Hi @giorogo,

    You have to put two Set Role operators after your two Read CSV operators and set your id attribute as id in the parameter panel.

    Here is a screenshot of the process:

    Compare_csv_files.png

    Regards,

    Lionel

  • giorogo Member Posts: 13 Contributor I

    I've done that, but I get this error (see image).

    1.png 55K
  • giorogo Member Posts: 13 Contributor I

    Thank you very much for your help! Problem solved! You are very kind.
