"How to create an association matrix instead of the rules?"

eldenoso Member Posts: 65 Contributor I
edited June 2019 in Help

Hello all,

The example set I have contains the transitions of customers between different hotels over four years. I already did the basket analysis, but I am not satisfied with the result due to its lack of visualization.

What I want to achieve is a kind of association matrix, for example:

Product A was bought again 80 times.
Product B was bought again 100 times.
20 Customers who bought Product B also bought Product A.

The matrix (as proportions) would look like this:

      A     B
A     1     0.2
B     0.25  1

So an asymmetric matrix is created, which could then be visualised as an xy-scatter plot with different circle sizes.
The problem is that I don't know how to get to this matrix. My starting point would be to pivot and aggregate the data so that it ends up in matrix format.
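
As a sanity check on the numbers above, here is a minimal Python sketch of the calculation. It assumes that the entry in row X, column Y is the number of customers who bought both products divided by the number who bought Y, which is the only reading that reproduces 0.2 and 0.25 from the counts given. The function and variable names are made up for illustration.

```python
# Purchase counts from the example above.
buyers = {"A": 80, "B": 100}
# Customers who bought both products, keyed by an unordered pair.
both = {frozenset(("A", "B")): 20}


def association_matrix(buyers, both):
    """Entry [row][col] = customers who bought both / customers who bought col.

    The column-wise normalisation is an assumption inferred from the example.
    """
    products = sorted(buyers)
    matrix = {}
    for row in products:
        matrix[row] = {}
        for col in products:
            if row == col:
                matrix[row][col] = 1.0
            else:
                shared = both.get(frozenset((row, col)), 0)
                matrix[row][col] = shared / buyers[col]
    return matrix


matrix = association_matrix(buyers, both)
for row, cols in matrix.items():
    print(row, [round(value, 2) for value in cols.values()])
```

Because each column is divided by a different total, the matrix is asymmetric even though the shared count is the same in both directions.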

Thank you:)

Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Did you use the Association Rules to ExampleSet operator?

    I was just faced with creating a similar type of matrix this morning but I haven't solved it yet.

  • eldenoso Member Posts: 65 Contributor I

    Thank you Thomas,

    Yes, I already used it, but are association rules really suitable for that? Isn't it just a matter of aggregating or counting?

    Best

    Philipp

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn

    Off the top of my head this morning I don't know how the matrix would look for more products than your example.

    For your example, there is an operator in the Statistics extension that does exactly this, so you can loop it to produce one for each product.

    See the rather rushed example below.







    <operator activated="true" class="process" compatibility="7.4.000" expanded="true" name="Process">
    <operator activated="true" class="retrieve" compatibility="7.4.000" expanded="true" height="68" name="Load Transactions" width="90" x="112" y="187">
    <operator activated="true" class="aggregate" compatibility="6.0.006" expanded="true" height="82" name="Aggregate" width="90" x="112" y="336">
    <operator activated="true" class="pivot" compatibility="7.4.000" expanded="true" height="82" name="Pivot" width="90" x="246" y="336">
    <operator activated="true" class="rename_by_replacing" compatibility="7.4.000" expanded="true" height="82" name="Rename by Replacing" width="90" x="380" y="336">
    <operator activated="true" class="replace_missing_values" compatibility="7.4.000" expanded="true" height="103" name="Replace Missing Values" width="90" x="112" y="442">
    <operator activated="true" class="numerical_to_binominal" compatibility="6.0.003" expanded="true" height="82" name="Numerical to Binominal" width="90" x="246" y="442"/>
    <operator activated="true" class="set_role" compatibility="7.4.000" expanded="true" height="82" name="Set Role" width="90" x="380" y="442">
    <operator activated="true" class="rmx_stat:cross_table" compatibility="1.3.004" expanded="true" height="82" name="Extract Cross Table" width="90" x="514" y="340">


















    MARKET BASKET ANALYSIS
    Model associations between products by determining sets of items frequently purchased together and building association rules to derive recommendations.
    Step 1: Load transaction data containing a transaction id, a product id and a quantifier. The data denotes how many times a certain product has been purchased as part of a transaction.
    Step 2: Edit, transform & load (ETL) - Aggregate transaction data to account for multiple occurrences of the same product in a transaction. Pivot the data so that each transaction is represented by a row. Transform purchase amounts to binary "product purchased yes/no" indicators.
    Step 3: Using FP-Growth, determine frequent item sets. A frequent item set denotes that the items (products) in the set have been purchased together frequently, i.e. in a certain ratio of transactions. This ratio is given by the support of the item set.
    Step 4: Create association rules which can be used for product recommendations depending on the confidences of the rules.
    Outputs: association rules, frequent item set
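
For readers without the extension installed, Steps 1 and 2 can be roughed out in plain Python. This is only a sketch of the aggregate, pivot, replace-missing and binarise sequence described above, not a reproduction of the RapidMiner operators. The transaction rows and the helper name `to_basket_table` are invented for illustration.

```python
from collections import defaultdict

# Hypothetical transaction rows: (transaction id, product id, quantity).
transactions = [
    (1, "A", 2), (1, "B", 1), (1, "A", 1),
    (2, "B", 3),
    (3, "A", 1), (3, "C", 4),
]


def to_basket_table(rows):
    """Aggregate, pivot and binarise transaction rows."""
    # Aggregate: sum quantities per (transaction, product) pair.
    totals = defaultdict(int)
    for tid, product, qty in rows:
        totals[(tid, product)] += qty

    products = sorted({product for _, product in totals})
    tids = sorted({tid for tid, _ in totals})

    # Pivot: one row per transaction, one column per product.
    # Missing combinations count as 0, then everything becomes a yes/no flag.
    table = {
        tid: {product: totals.get((tid, product), 0) > 0 for product in products}
        for tid in tids
    }
    return products, table


products, table = to_basket_table(transactions)
for tid, row in table.items():
    print(tid, row)
```

The resulting yes/no table has the shape that frequent item set mining or a cross table expects as input.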



  • eldenoso Member Posts: 65 Contributor I

    Thank you Edward,

    but as you said, the operator only works for two attributes, and thus two products.

    I found a website that explains exactly what I want to achieve, in Excel. Would this also be possible in RapidMiner somehow?

    https://help.xlstat.com/customer/en/portal/articles/2062425-how-can-associations-rules-help-for-market-basket-analysis?b_id=9283

    The photo shows this "influence matrix".

    Thank you:)

    assoc4.jpg 248.8K
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist

    Sure,

    it's just a Pivot and a Replace Missing Values operator.

    ~Martin

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
  • eldenoso Member Posts: 65 Contributor I

    It's not that easy, though?

    My example set contains four attributes (one per year), and the examples hold the product each customer bought in each year. Since time doesn't play a role in the association matrix, I have now removed duplicates within each row (customer). So if a customer bought products A, B, C, A over the four years, the row is reduced to A, B, C, because the only information that matters is that these products were bought together.

    Now something like counting every combination has to happen, and that is where I'm stuck, because the normal Association Rules to ExampleSet operator doesn't let me convert the result to a matrix.
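
The counting step described here can be sketched outside RapidMiner in a few lines of Python. The customer rows below are invented, and the normalisation (shared customers divided by the customers who bought the column product) is an assumption carried over from the two-product example at the top of the thread.

```python
from collections import Counter
from itertools import permutations

# Hypothetical data: one row per customer, one entry per year.
customers = [
    ["A", "B", "C", "A"],
    ["A", "A", "B", "B"],
    ["C", "C", "C", "C"],
    ["B", "B", "B", "B"],
]


def cooccurrence(rows):
    """Count customers per product and per ordered product pair."""
    singles = Counter()
    pairs = Counter()
    for row in rows:
        items = set(row)  # remove duplicates within a customer
        singles.update(items)
        pairs.update(permutations(sorted(items), 2))
    return singles, pairs


singles, pairs = cooccurrence(customers)
products = sorted(singles)

# Entry [row][col]: share of col's customers who also bought row.
matrix = {
    row: {
        col: 1.0 if row == col else pairs[(row, col)] / singles[col]
        for col in products
    }
    for row in products
}

for row in products:
    print(row, [round(matrix[row][col], 2) for col in products])
```

Each customer contributes at most once to every pair, so repeat purchases of the same product do not inflate the counts.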

    Thank you:)

  • JEdward RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 578 Unicorn

    Do you mean something like this, then?







    <operator activated="true" class="process" compatibility="6.0.002" expanded="true" name="Process">
    <operator activated="true" class="retrieve" compatibility="7.4.000" expanded="true" height="68" name="Iris" width="90" x="45" y="120">
    <operator activated="true" class="discretize_by_frequency" compatibility="7.1.001" expanded="true" height="103" name="Discretize by Frequency" width="90" x="179" y="120">
    <operator activated="true" class="nominal_to_binominal" compatibility="7.1.001" expanded="true" height="103" name="Nominal to Binominal" width="90" x="313" y="120">
    <parameter key="use_underscore_in_name" value="true"/>
    <operator activated="true" class="fp_growth" compatibility="7.4.000" expanded="true" height="82" name="FPGrowth" width="90" x="447" y="120">
    <operator activated="true" class="create_association_rules" compatibility="7.4.000" expanded="true" height="82" name="Create Association Rules" width="90" x="581" y="136">
    <operator activated="true" class="converters:rules_2_example_set" compatibility="0.2.000" expanded="true" height="82" name="Association Rules to ExampleSet" width="90" x="715" y="136"/>
    <operator activated="true" class="select_attributes" compatibility="7.4.000" expanded="true" height="82" name="Select Attributes" width="90" x="849" y="34">
    <operator activated="true" class="pivot" compatibility="7.4.000" expanded="true" height="82" name="Pivot" width="90" x="782" y="238">
    <operator activated="true" class="rename_by_replacing" compatibility="7.4.000" expanded="true" height="82" name="Rename by Replacing" width="90" x="916" y="391">
    <operator activated="true" class="replace_missing_values" compatibility="7.4.000" expanded="true" height="103" name="Replace Missing Values" width="90" x="950" y="289">
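
The operator parameters of this process did not survive, so the following Python sketch only illustrates the general idea of its last steps: take rules in table form, pivot premises against conclusions, and fill the gaps. It assumes the confidence is used as the cell value. The rule values and the helper name `pivot_rules` are hypothetical.

```python
# Hypothetical rules as (premise, conclusion, confidence); values are made up.
rules = [
    ("A", "B", 0.25),
    ("B", "A", 0.20),
    ("A", "C", 0.10),
]


def pivot_rules(rules, fill=0.0):
    """Pivot rule rows into a premise-by-conclusion matrix."""
    premises = sorted({premise for premise, _, _ in rules})
    conclusions = sorted({conclusion for _, conclusion, _ in rules})
    lookup = {(premise, conclusion): conf for premise, conclusion, conf in rules}
    # Cells without a rule get the fill value (the "replace missing" step).
    return {
        premise: {
            conclusion: lookup.get((premise, conclusion), fill)
            for conclusion in conclusions
        }
        for premise in premises
    }


matrix = pivot_rules(rules)
for premise, row in matrix.items():
    print(premise, row)
```

Restricting the input to rules with a single item on each side would give a matrix with one product per row and per column.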



















  • eldenoso Member Posts: 65 Contributor I

    Thank you Edward, that's the kind of influence matrix I was looking for. But is it also possible to have just single attributes as both the columns and the examples?

    Best
    Philipp
