"Generate Products" operator should generate only unique combinations of the attributes
yzan
Currently, when we pass a set {"att1", "att2", "att3"} as "attribute1" and {"att1", "att2", "att3"} as "attribute2" to the "Generate Products" operator, we get the product of "att1" and "att2" twice: once as "(att1) * (att2)" and once as "(att2) * (att1)". Since multiplication (for all numerical data types that RapidMiner supports) is commutative, this creates unnecessary redundancy.
Proposal: Generate only unique combinations of the attributes.
Workaround: Use "Remove Correlated Attributes" after "Generate Products".
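To make the difference concrete, here is a minimal Python sketch of the two behaviours. It is only an illustration, not RapidMiner code: the function names `all_products` and `unique_products` are hypothetical, and I am assuming that a product of an attribute with itself (e.g. "(att1) * (att1)") should still be kept.

```python
from itertools import product


def all_products(first, second):
    """Every ordered pair, which is the behaviour described above."""
    return [f"({a}) * ({b})" for a, b in product(first, second)]


def unique_products(first, second):
    """One product per unordered pair, since a * b equals b * a.

    Assumption: squares such as (att1) * (att1) are kept.
    """
    seen = set()
    result = []
    for a, b in product(first, second):
        key = frozenset((a, b))  # {a, b} is the same key as {b, a}
        if key not in seen:
            seen.add(key)
            result.append(f"({a}) * ({b})")
    return result


attrs = ["att1", "att2", "att3"]
current = all_products(attrs, attrs)
proposed = unique_products(attrs, attrs)
print(len(current), len(proposed))
```

With three attributes on each side, the first function yields nine products and the second yields six, because "(att2) * (att1)", "(att3) * (att1)" and "(att3) * (att2)" are dropped as duplicates.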
Comments
Wow...been using RapidMiner Studio for six years and never saw this operator before. Of course, happy to open this up for voting.
Hello @yzan,
Might I suggest a slight change? What you wrote here makes total sense, and I believe it should be the default: it shouldn't be expected to generate a cross join.
However, I have more often been in situations where I did not need to remove correlated attributes than the other way round, so I think the way the products are generated should be configurable. (Besides, that would reduce the impact on existing models.)