"Split operator - not all split columns are available in subsequent operators"
Hello,
The Split operator gives me only the first three columns for further use, even if the operator created more. In the result view I see all split columns (more than three), but I cannot choose them in another operator (only the first three are visible).
Here is a simple table to try it with:
bla split
asdf 2345x2134
dsaf 2345x2345x345x456x356x3546
sadf 2435x2345
Answers
My quick test process worked fine; I could select attributes up to "split_6" in further operators. Can you provide the XML of your process that does not work?
Regards,
Marco
1. RapidMiner 5.3 is old. Like really old. We cannot provide help for that here anymore. Please consider using RapidMiner Studio 7.0 instead.
2. You are using the Split operator after "Read Excel". The problem is that the output of Read Excel depends on actually reading the Excel file at runtime. Until then, we don't know what the result will be. Therefore the Split operator creates a dummy output to show an example of what it could look like.
To use actual data, load it into the repository first, then access it with a "Retrieve" operator. That way, you have full metadata available and the split operator preview will be correct.
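To illustrate why the number of split columns cannot be known before the data is read, here is a minimal Python sketch (not RapidMiner code) using the sample table from the question. The function name `split_column` and the padding with `None` are assumptions made for the illustration; only the "x" separator and the `split_N` naming come from this thread.

```python
# Sample rows from the question: (bla, split)
rows = [
    ("asdf", "2345x2134"),
    ("dsaf", "2345x2345x345x456x356x3546"),
    ("sadf", "2435x2345"),
]

def split_column(rows, sep="x"):
    """Split the second field of every row on `sep` (hypothetical helper)."""
    parts = [value.split(sep) for _, value in rows]
    # The width is only known after looking at every row of the actual data.
    width = max(len(p) for p in parts)
    names = [f"split_{i}" for i in range(1, width + 1)]
    # Pad shorter rows with None so that all rows have the same width.
    table = [p + [None] * (width - len(p)) for p in parts]
    return names, table

names, table = split_column(rows)
print(names)
```

With this data the widest row has six parts, so the columns go up to split_6. A different file could yield more or fewer columns, which is why only a placeholder preview is available before the file has been read.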
Regards,
Marco
But I still can't see the split columns higher than 3 in the operators Select Attributes, Rename, and Remove Duplicates (subset).
Yes, that is expected due to the "can't know beforehand" problem. You can still change those parameters manually if you know that you will end up with 6 splits, for example.
But the easiest solution is to read the data into your repository, then only use the data from the repository in your process. That way you have the actual information available during construction time.
Regards,
Marco
Your local repository sits on your file system, so the data cannot be too big for it.
How to set it manually depends on the parameter. For example, for "Remove Duplicates" you can select 'subset', then enter the name, such as "split_6", in the upper right text field and press +
Regards,
Marco