Join (Deprecated)

Synopsis

This Operator joins two ExampleSets using one or more Attributes of the input ExampleSets as key attributes.

Description

This Operator joins two ExampleSets using one or more Attributes of the input ExampleSets as key attributes.

Identical values of the key attributes indicate matching Examples. An Attribute with the id role is selected as key by default, but an arbitrary set of one or more Attributes can be chosen as key. Four types of joins are possible: inner, left, right and outer join. All of these join types are explained in the parameters section.

Differentiation

Append

The Append Operator merges the Examples of the input ExampleSets into the resulting ExampleSet. Therefore all input ExampleSets need to have the same structure (number of Attributes, Attribute names and value types).

Generate Products

The Cartesian Product Operator builds a cartesian product of the input ExampleSets, i.e. every Example from the left ExampleSet is joined with each Example of the right ExampleSet.

Union

The Union Operator combines both input ExampleSets in such a way that all Attributes and Examples are part of the resulting union ExampleSet.

Superset

The Superset Operator expects two ExampleSets as input and adds the Attributes of the first ExampleSet to the second ExampleSet and vice versa. Both resulting ExampleSets are delivered as output of the Superset Operator.

Input

left

The left input port expects an ExampleSet. This ExampleSet will be used as the left ExampleSet for the join.

right

The right input port expects an ExampleSet. This ExampleSet will be used as the right ExampleSet for the join.

Output

join

The output port delivers the joined ExampleSet.

Parameters

Remove double attributes

This parameter indicates if double Attributes should be removed or renamed. Double Attributes are those Attributes that are present in both ExampleSets. If this parameter is checked, only the Attribute from the left ExampleSet is kept for each double Attribute and the one from the right ExampleSet is discarded. If this parameter is unchecked, the double Attributes from the right ExampleSet are renamed. The key attributes will always be taken from the left ExampleSet. Please note that this check for double Attributes is only applied to regular Attributes. Special Attributes of the right ExampleSet which do not exist in the left ExampleSet are simply added. If they already exist, they are skipped.
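The handling of double Attributes described above can be sketched in a few lines of Python. This is only an illustration of the rule, not the Operator's implementation: the function name `resolve_double_attributes` and the `_right` suffix are hypothetical (the source does not state how renamed Attributes are named), the key attributes are assumed to carry the same name on both sides, and special Attributes are ignored.

```python
def resolve_double_attributes(left_attrs, right_attrs, key_attrs,
                              remove_double=True, suffix="_right"):
    """Return the regular Attribute names of the joined ExampleSet.

    `suffix` is a made-up naming scheme used only for this sketch.
    """
    result = list(left_attrs)          # every left Attribute is kept
    for name in right_attrs:
        if name in key_attrs:
            continue                   # key attributes always come from the left side
        if name in left_attrs:         # a "double" Attribute
            if remove_double:
                continue               # checked: discard the right-hand copy
            result.append(name + suffix)   # unchecked: keep it under a new name
        else:
            result.append(name)        # only present on the right: just add it
    return result


left_attrs = ["id", "age", "city"]
right_attrs = ["id", "age", "income"]
print(resolve_double_attributes(left_attrs, right_attrs, {"id"}, remove_double=True))
print(resolve_double_attributes(left_attrs, right_attrs, {"id"}, remove_double=False))
```

With the parameter checked, the right-hand `age` is dropped; with it unchecked, it survives under the hypothetical name `age_right`.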

Join type

This parameter specifies which join should be performed. You can easily understand these joins by studying the tutorial Process. Four types of joins are supported:

  • inner: The resulting ExampleSet will contain only those Examples where the key attributes of both input ExampleSets match, i.e. have the same value.
  • left: This is also called left outer join. The resulting ExampleSet will contain all Examples from the left ExampleSet. If no matching Example is found in the right ExampleSet, the Attributes coming from the right ExampleSet are filled with missing values. Missing values or null values are shown as '?' in RapidMiner. The left join always contains the results of the inner join; in addition, it can contain Examples that have no matching Examples in the right ExampleSet.
  • right: This is also called right outer join. The resulting ExampleSet will contain all Examples from the right ExampleSet. If no matching Example is found in the left ExampleSet, the Attributes coming from the left ExampleSet are filled with missing values. Missing values or null values are shown as '?' in RapidMiner. The right join always contains the results of the inner join; in addition, it can contain Examples that have no matching Examples in the left ExampleSet.
  • outer: This is also called full outer join. This type of join combines the results of the left and the right join. All Examples from both ExampleSets will be part of the resulting ExampleSet, whether or not a matching key attribute value exists in the other ExampleSet. If no matching key attribute value is found, the Attributes coming from the other ExampleSet are filled with missing values. Missing values or null values are shown as '?' in RapidMiner. The outer join always contains the results of the inner join; in addition, it can contain Examples that have no matching Examples in the other ExampleSet.
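To make the four join types concrete, here is a minimal Python sketch that mimics them on lists of dictionaries. It is not RapidMiner code: the function `join_examples` is hypothetical, `None` stands in for the missing value shown as '?', both inputs are assumed to share a single key attribute with the same name, and the non-key Attribute names are assumed not to overlap. The row order of the real Operator may differ.

```python
def join_examples(left, right, key, how="inner"):
    """Join two lists of dicts on `key`; `how` is inner, left, right or outer."""
    left_cols = list(left[0]) if left else [key]
    right_cols = [c for c in (right[0] if right else {}) if c != key]

    # Index the right Examples by key value so lookups are cheap.
    right_index = {}
    for row in right:
        right_index.setdefault(row[key], []).append(row)

    result = []
    matched_keys = set()
    for lrow in left:
        matches = right_index.get(lrow[key], [])
        if matches:
            # Matching Examples: this part is the inner join.
            matched_keys.add(lrow[key])
            for rrow in matches:
                result.append({**lrow, **{c: rrow[c] for c in right_cols}})
        elif how in ("left", "outer"):
            # Left Example without a partner: right Attributes are missing.
            result.append({**lrow, **{c: None for c in right_cols}})

    if how in ("right", "outer"):
        # Right Examples without a partner: left Attributes are missing.
        for rrow in right:
            if rrow[key] not in matched_keys:
                row = {c: None for c in left_cols}
                row[key] = rrow[key]
                row.update({c: rrow[c] for c in right_cols})
                result.append(row)
    return result


left = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
right = [{"id": 2, "score": 10}, {"id": 3, "score": 20}]
for how in ("inner", "left", "right", "outer"):
    print(how, join_examples(left, right, "id", how))
```

In this toy data only `id` 2 appears on both sides, so the inner join has one Example, the left and right joins have two each, and the outer join has three.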

Use id attribute as key

This parameter indicates if the Attribute with the id role should be used as the key attribute. This option is checked by default. If unchecked, then you have to specify the key attributes for the left and right ExampleSets. Identical values of the key attributes indicate matching Examples.

Key attributes

This parameter is available when the parameter use id attribute as key is unchecked. This parameter specifies the Attribute(s) which are used as the key attributes. Identical values of the key attributes indicate matching Examples. For each key attribute from the left ExampleSet a corresponding key attribute from the right ExampleSet has to be chosen. Choosing appropriate key attributes is critical for obtaining the desired results.
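As an illustration of pairing key attributes, the following Python sketch matches Examples on several key attributes whose names differ between the two sides. The function `matching_pairs` and the Attribute names are hypothetical and only serve the example; two Examples match when all paired key values are identical.

```python
def matching_pairs(left, right, key_pairs):
    """Return (left_index, right_index) for every pair of matching Examples.

    `key_pairs` lists (left_attribute, right_attribute) tuples.
    """
    left_keys = [l for l, _ in key_pairs]
    right_keys = [r for _, r in key_pairs]

    # Index the right Examples by their combined key values.
    index = {}
    for j, row in enumerate(right):
        index.setdefault(tuple(row[a] for a in right_keys), []).append(j)

    pairs = []
    for i, row in enumerate(left):
        combined = tuple(row[a] for a in left_keys)
        for j in index.get(combined, []):
            pairs.append((i, j))
    return pairs


left = [{"first": "Ann", "last": "Lee"}, {"first": "Bob", "last": "Ray"}]
right = [{"fname": "Bob", "lname": "Ray"}, {"fname": "Ann", "lname": "Ray"}]
print(matching_pairs(left, right, [("first", "fname"), ("last", "lname")]))
```

Using only `last`/`lname` as the key would also pair Bob Ray with Ann Ray, which shows why the choice of key attributes changes the result.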

Keep both join attributes

If checked, both Attributes of a join pair will be kept. Usually this is unnecessary since both Attributes are identical. It may be useful to keep both Attributes if there are missing values on one side.