create FP-Growth graph

TobiasNehrig Member Posts: 41 Guru
edited December 2018 in Help

Hi Experts,

I have a question about creating a graph from the results of the FP-Growth operator without using the Create Association Rules operator. Is there a way to visualize the FP-Growth results as a graph?









<operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Crawler" width="90" x="45" y="34">

<operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Crawler Spon" width="90" x="45" y="34">


http://www.spiegel.de"/>














<operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (8)" width="90" x="246" y="34"/>





<operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (7)" width="90" x="648" y="34"/>

















<operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (9)" width="90" x="380" y="34"/>



















<operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="124" name="Prepare Data" width="90" x="246" y="34">








































<operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="fp Growth" width="90" x="514" y="34">






















<operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (4)" width="90" x="849" y="34"/>













<operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Co-occurrence" width="90" x="514" y="34">




















<connect from_port=" 1 " to_op="Process Documents from Data" to_port="example set"/>

















best regards

Tobias


Best Answer

  • TobiasNehrig Member Posts: 41 Guru
    Solution Accepted

    Hi,

    I've found a solution for creating a co-occurrence graph, based on the approach of @bhupendra_patil. After writing the FP-Growth result to an XML file, I had to read the XML file twice and create a new ExampleSet.
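The idea of turning frequent item sets into a co-occurrence graph can be sketched outside RapidMiner as follows. This is only a minimal illustration in Python, not the process from this thread: the item names, the support values and the helper names `cooccurrence_edges` and `adjacency` are all made up for the example. It keeps the 2-item sets and treats each one as a weighted, undirected edge.

```python
from collections import defaultdict

# Hypothetical frequent item sets as (items, support) pairs.
# In practice these would come from the exported FP-Growth result.
frequent_itemsets = [
    (("berlin",), 0.40),
    (("wahl",), 0.35),
    (("wahl", "berlin"), 0.20),
    (("wahl", "euro"), 0.12),
    (("berlin", "euro"), 0.10),
]


def cooccurrence_edges(itemsets):
    """Keep only the 2-item sets and return them as weighted, undirected edges."""
    edges = {}
    for items, support in itemsets:
        if len(items) == 2:
            a, b = sorted(items)  # normalise order so (a, b) and (b, a) are one edge
            edges[(a, b)] = support
    return edges


def adjacency(edges):
    """Build a symmetric adjacency mapping: node -> {neighbour: weight}."""
    adj = defaultdict(dict)
    for (a, b), weight in edges.items():
        adj[a][b] = weight
        adj[b][a] = weight
    return dict(adj)


edges = cooccurrence_edges(frequent_itemsets)
graph = adjacency(edges)
for (a, b), weight in sorted(edges.items()):
    print(f"{a} -- {b}  support={weight}")
```

An edge list of this shape (two item columns plus a weight) is roughly what the two reads of the XML file and the join are meant to produce as an ExampleSet.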

    Bildschirmfoto vom 2018-08-08 16-56-11.png

















    <参数键= value =“use_default_namespace歧视e"/>
















    <参数键= value =“use_default_namespace歧视e"/>









































    <connect from_op="Rename Word1" from_port="example set output" to_op="Join (2)" to_port="right"/>







    Tobias


Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn

    Not that I know of, but I would be interested if any other community members know a way to do this!

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
  • TobiasNehrig Member Posts: 41 Guru

    Hi @Telcontar120,

    I found the post "Writing Association Rules to Exampleset or file" from @bhupendra_patil and tried to implement it in my process. But writing the FP-Growth result to an XML file nearly exhausts my RAM (32 GB) and creates an 8 GB file. The Read XML operator mentioned there then uses up the remaining RAM and the process terminates.
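For a file of that size, one general alternative is to stream the XML instead of loading it as a whole. The sketch below uses Python's standard `xml.etree.ElementTree.iterparse`; it is not a RapidMiner feature, and the element and attribute names (`itemset`, `item`, `frequency`) are assumptions for illustration, since the real layout of the written file is not shown in this thread.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical layout; the file written by RapidMiner will use different names.
SAMPLE = b"""<itemsets>
  <itemset frequency="20"><item>berlin</item><item>wahl</item></itemset>
  <itemset frequency="12"><item>euro</item></itemset>
  <itemset frequency="9"><item>berlin</item><item>euro</item></itemset>
</itemsets>"""


def stream_pairs(source):
    """Yield (item_a, item_b, frequency) for every 2-item set, one element at a time."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == "itemset":
            items = [child.text for child in elem.findall("item")]
            if len(items) == 2:
                yield items[0], items[1], int(elem.get("frequency"))
            elem.clear()  # drop the children of the element that was just processed


pairs = list(stream_pairs(io.BytesIO(SAMPLE)))
print(pairs)
```

With a real file, `source` would be a file path or an open binary file rather than the in-memory sample used here.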









    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Crawler" width="90" x="45" y="34">

    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Crawler Spon" width="90" x="45" y="34">


    http://www.spiegel.de"/>














    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (8)" width="90" x="246" y="34"/>





    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (7)" width="90" x="648" y="34"/>

















    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (9)" width="90" x="380" y="34"/>















    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory" width="90" x="179" y="34"/>
    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Prepare Data" width="90" x="313" y="34">









































    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (2)" width="90" x="45" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (5)" width="90" x="179" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (11)" width="90" x="313" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (12)" width="90" x="447" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (13)" width="90" x="581" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (14)" width="90" x="715" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (15)" width="90" x="849" y="34">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (16)" width="90" x="45" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (9)" width="90" x="179" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (17)" width="90" x="313" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (18)" width="90" x="447" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (19)" width="90" x="581" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (3)" width="90" x="715" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (21)" width="90" x="849" y="136">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (20)" width="90" x="45" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (4)" width="90" x="179" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (8)" width="90" x="313" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (10)" width="90" x="447" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (by Content)" width="90" x="581" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (6)" width="90" x="715" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (7)" width="90" x="849" y="238">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (22)" width="90" x="45" y="340">

    <operator activated="true" class="text:filter_tokens_by_content" compatibility="8.1.000" expanded="true" height="68" name="Filter Tokens (23)" width="90" x="849" y="340">





































    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (4)" width="90" x="581" y="34"/>























    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="124" name="fp Growth" width="90" x="447" y="34">


























































    <connect from_op="Read XML" from_port="output" to_port="result 4"/>









  • JeffChowaniec Employee, Member Posts: 13 RM Data Scientist

    I'm curious which version of RM Studio you are using. Versions 8.1 and below have the old versions of FP-Growth and frequent item sets. You might have to update to 8.2 to get a performance bump.

  • TobiasNehrig Member Posts: 41 Guru

    Hi @JeffChowaniec,

    I'm using RapidMiner 8.2.001.

  • JeffChowaniec Employee, Member Posts: 13 RM Data Scientist

    I tried running your process and found that the web crawl runs for 25+ minutes. I wasn't able to finish the process because I need my machine for other tasks. I have a 32 GB machine and could see it being taxed pretty hard at some points. Have you tried it with a data set that is a fraction of what you are trying to query? The idea is to make sure that even a small data set runs without using up the available memory before we dedicate a run time of more than an hour to this.

  • TobiasNehrig Member Posts: 41 Guru
    Hi @JeffChowaniec,
    I haven't tried crawling fewer pages because I crawled once and stored the result in the repository. That file is too large to upload here. Instead, here is a repository file taken after the Numerical to Binominal operator, which serves as the input data for FP-Growth.
  • TobiasNehrig Member Posts: 41 Guru

    Hi,

    I think I've found the cause of my memory problem. I had to reduce the FP-Growth parameter max items per itemset from 0 to 2. Now I'm struggling to fill the ExampleSet from the XML file as described in "Writing Association Rules to Exampleset or file". In that example, the data import wizard automatically fills in the "current value" column in step 4. That doesn't happen in my approach and I don't know why.
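Why the size limit matters so much can be seen from a simple count. The sketch below is plain Python, not RapidMiner, and assumes (as this thread implies) that a setting of 0 means no size limit. It computes an upper bound on the number of possible item sets; the minimum support prunes most of them in practice, so these are worst-case figures. The function name `max_itemsets` is made up for the example.

```python
from math import comb


def max_itemsets(n_items, max_size=None):
    """Upper bound on the number of non-empty item sets over n_items items.

    max_size=None stands for "no size limit".
    """
    limit = n_items if max_size is None else min(max_size, n_items)
    return sum(comb(n_items, k) for k in range(1, limit + 1))


for n in (10, 20, 40):
    print(n, max_itemsets(n, 2), max_itemsets(n))
```

With a limit of 2 the bound grows quadratically with the number of items, whereas without a limit it grows as 2^n - 1. For a co-occurrence graph only the 2-item sets are needed anyway.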












    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="124" name="FP-Growth Sub" width="90" x="514" y="34">




    <operator activated="true" class="free_memory" compatibility="8.2.001" expanded="true" height="82" name="Free Memory (3)" width="90" x="45" y="34"/>




























    <operator activated="true" class="subprocess" compatibility="8.2.001" expanded="true" height="82" name="Create Graph" width="90" x="782" y="187">




















    <参数键= value =“use_default_namespace歧视e"/>



































    <connect from_op="Read XML" from_port="output" to_op="Rename" to_port="example set input"/>







    <operator activated="true" class="create_association_rules" compatibility="8.2.001" expanded="true" height="82" name="Create Association Rules" width="90" x="782" y="34">
















  • TobiasNehrig Member Posts: 41 Guru

    Hi,

    It's me again.

    I'm trying to work out how to add the item names in @bhupendra_patil's approach, "Writing-Association-Rules-to-Exampleset-or-file". With the FP-Growth used in that approach, the process runs and I see all columns more or less filled, but if I use the new FP-Growth instead, the item names are not shown. Does anyone have an idea how this could be achieved?


























    <operator activated="true" class="fp_growth" compatibility="8.2.001" expanded="true" height="82" name="FPGrowth" width="90" x="581" y="136">





    <operator activated="true" class="create_association_rules" compatibility="8.2.001" expanded="true" height="82" name="Create Association Rules" width="90" x="782" y="238"/>





















    <参数键= value =“use_default_namespace歧视e"/>
















































    <参数键= value =“use_default_namespace歧视e"/>































    <operator activated="true" class="create_association_rules" compatibility="8.2.001" expanded="true" height="82" name="Create Association Rules (2)" width="90" x="782" y="442"/>












    <connect from_op="Read XML" from_port="output" to_op="Rename" to_port="example set input"/>











    If I use this approach in my process, then I see all the numerical values but no item names.

    best regards

    Tobias
