"RM 4.3 Feature Generation problem"
I have installed the new RM 4.3 EE version, but I'm having problems with the Feature Generation not recognizing generated features when they are used in later steps.
This example adds one to the first attribute (successfully) to generate a new "plusone" attribute. Then, it tries to use "plusone" in a subsequent step, but RM returns an error that "plusone" doesn't exist.
I then tried splitting the computation up across two FeatureGeneration nodes, but it yields the same error:
<parameter key="attributes_upper_bound" value="1.0"/>
This appears to be a regression from 4.2, as I had a working process in RM 4.2 that now fails. Also note that I had to use the prefix notation to get the computation to work, not the infix notation that is supposed to be in RM 4.3. Either my installation is messed up (perhaps from an incomplete uninstall?), or there's a bug in RM 4.3.
<parameter key="attributes_upper_bound" value="1.0"/>
Thanks,
Keith
Answers
You are right: we had to remove the feature that let freshly created attributes be re-used directly within the FeatureGeneration operator, since it unfortunately caused bugs in other settings. Since it was quite unpredictable (even for us, who always work on predictions) in which cases everything worked and in which cases it did not, we decided to remove this feature.
This was also supported by the decision to create a new operator "AttributeConstruction" which should now be preferred for the construction of new attributes instead of the FeatureGeneration operator. This new operator "AttributeConstruction" also is the one which supports infix formulas and nicer constants (just "1" instead of "const[1]()") as well as many new functions - including a very nice if-function, e.g.
if (attribute1 > 5, sin(attribute2), cos(attribute3))
which will create a new attribute with the value sin(attribute2) if the value of attribute1 is larger than 5 and cos(attribute3) otherwise. Even nominal values are supported in this if-statement, e.g.
if (attribute1 == "dog", attribute2 + attribute3, 42)
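To make the semantics of the two if-expressions above concrete, here is a minimal Python sketch. It is not RapidMiner code, only an emulation of what the text describes, applied to one example (row) at a time. The function names and the sample rows are made up for illustration.

```python
import math

def numeric_rule(row):
    # Emulates: if (attribute1 > 5, sin(attribute2), cos(attribute3))
    if row["attribute1"] > 5:
        return math.sin(row["attribute2"])
    return math.cos(row["attribute3"])

def nominal_rule(row):
    # Emulates: if (attribute1 == "dog", attribute2 + attribute3, 42)
    if row["attribute1"] == "dog":
        return row["attribute2"] + row["attribute3"]
    return 42

# Hypothetical sample data, chosen only to exercise both branches.
numeric_rows = [
    {"attribute1": 7.0, "attribute2": 0.0, "attribute3": 0.0},
    {"attribute1": 3.0, "attribute2": 0.0, "attribute3": 0.0},
]
nominal_rows = [
    {"attribute1": "dog", "attribute2": 1.5, "attribute3": 2.5},
    {"attribute1": "cat", "attribute2": 1.5, "attribute3": 2.5},
]

numeric_results = [numeric_rule(r) for r in numeric_rows]
nominal_results = [nominal_rule(r) for r in nominal_rows]
print(numeric_results)
print(nominal_results)
```

In both cases the condition is evaluated per example, and the new attribute takes the second argument when it holds and the third otherwise.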
So why did we keep the old FeatureGeneration operator at all (and why is it not even marked as deprecated)? The reason is simple: it is faster on really large datasets, so we decided to keep it, but we unfortunately had to remove the re-use-just-created-attributes functionality.
Hope that clarifies things about feature construction a bit.
Cheers,
Ingo
Yes, that's a PITA, sorry about that. If you have not yet rewritten all of your processes, we could also try to include the old functionality of the FeatureGeneration operator under a new name, e.g. "FeatureGenerationDeprecated", and deliver this with the next EE update. Then it is simply a matter of replacing all "FeatureGeneration" with "FeatureGenerationDeprecated", which might be easier. Of course, the deprecated operator will be removed in some future version, but it would give you some more time to update your processes. Please let me know if this would be useful; then I would ask one of our developers to add this "new" (old) operator.
Cheers,
Ingo
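If the renamed operator does get shipped, the replacement described above could be scripted roughly as follows. This is only a sketch under two assumptions: that "FeatureGenerationDeprecated" ends up being the actual name (it is only a proposal in this thread), and that operators appear in the process XML with an attribute of the form class="FeatureGeneration". Check your own process files before relying on either.

```python
import re

# Assumption: the operator type is stored as class="FeatureGeneration".
# The closing quote is part of the pattern, so an already renamed
# class="FeatureGenerationDeprecated" is not matched a second time.
OLD_CLASS = re.compile(r'class="FeatureGeneration"')
NEW_CLASS = 'class="FeatureGenerationDeprecated"'  # proposed name, not final

def rename_operator(process_xml: str) -> str:
    """Return the process XML with the operator class renamed."""
    return OLD_CLASS.sub(NEW_CLASS, process_xml)

# Hypothetical one-line sample; real process files contain much more.
sample = '<operator name="FeatureGeneration" class="FeatureGeneration">'
renamed = rename_operator(sample)
print(renamed)
```

Only the class attribute is changed, so operator names chosen by the user stay as they are, and running the function twice gives the same result as running it once.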