You are viewing the RapidMiner Developers documentation for version 9.3.
API changes in RapidMiner 7.3
RapidMiner 7.3 brings two changes that affect the development of extensions. First, a central API for the creation of data sets (ExampleSet instances) was introduced. Second, the ExampleSet interface was extended with a method that allows freeing unused data.
These changes only affect you if your extension includes operators that generate new data sets or defines its own ExampleSet implementations (e.g., custom views).
Generating data sets
RapidMiner 7.3 adds the ExampleSets class, which provides a set of static methods for building new data sets. These methods replace direct instantiation of ExampleTable implementations such as MemoryExampleTable. In particular, all public constructors of the MemoryExampleTable class have been deprecated.
The new API provides methods to create data sets from both columnar and row-oriented data:
    import com.rapidminer.example.Attribute;
    import com.rapidminer.example.Attributes;
    import com.rapidminer.example.ExampleSet;
    import com.rapidminer.example.table.AttributeFactory;
    import com.rapidminer.example.table.BinominalMapping;
    import com.rapidminer.example.table.DataRow;
    import com.rapidminer.example.table.DataRowFactory;
    import com.rapidminer.example.table.NominalMapping;
    import com.rapidminer.example.utils.ExampleSetBuilder;
    import com.rapidminer.example.utils.ExampleSets;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.OperatorDescription;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.tools.Ontology;

    import java.time.Duration;
    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.List;

    /** Create example set using column fillers */
    Attribute topTen = AttributeFactory.createAttribute("Top Ten Numbers", Ontology.INTEGER);
    Attribute coinFlip = AttributeFactory.createAttribute("Coin Flip", Ontology.BINOMINAL);

    NominalMapping coin = new BinominalMapping();
    int heads = coin.mapString("Heads");
    int tails = coin.mapString("Tails");
    coinFlip.setMapping(coin);

    ExampleSet numbers = ExampleSets.from(topTen, coinFlip)
            .withRole(topTen, Attributes.ID_NAME)
            .withBlankSize(10)
            .withColumnFiller(topTen, i -> i + 1)
            .withColumnFiller(coinFlip, i -> Math.random() < 0.5 ? heads : tails)
            .build();

    /** Create example set from double matrix */
    ExampleSetBuilder builder = ExampleSets.from(
            AttributeFactory.createAttribute(Ontology.REAL),
            AttributeFactory.createAttribute(Ontology.REAL),
            AttributeFactory.createAttribute(Ontology.REAL));
    builder.withExpectedSize(10);

    double[][] rawData = new double[10][3];
    for (double[] row : rawData) {
        builder.addRow(row);
    }
    ExampleSet matrix = builder.build();

    /** Create example set from custom DataRows */
    Attribute nominalAttribute = AttributeFactory.createAttribute("Nominal", Ontology.NOMINAL);
    Attribute numericalAttribute = AttributeFactory.createAttribute("Numerical", Ontology.REAL);
    Attribute dateTimeAttribute = AttributeFactory.createAttribute("DateTime", Ontology.DATE_TIME);

    List<Attribute> attributes = new ArrayList<>();
    attributes.add(nominalAttribute);
    attributes.add(numericalAttribute);
    attributes.add(dateTimeAttribute);

    // named rowBuilder so that it does not clash with the builder declared above
    ExampleSetBuilder rowBuilder = ExampleSets.from(attributes).withExpectedSize(2);
    DataRowFactory dataRowFactory = new DataRowFactory(DataRowFactory.TYPE_DOUBLE_ARRAY, '.');

    DataRow dataRow = dataRowFactory.create(attributes.size());
    // this is important, for nominal attributes the value to set in the data row is the index of the mapped string!
    dataRow.set(nominalAttribute, nominalAttribute.getMapping().mapString("Hello"));
    dataRow.set(numericalAttribute, 1.0);
    dataRow.set(dateTimeAttribute, Instant.now().toEpochMilli());
    rowBuilder.addDataRow(dataRow);

    dataRow = dataRowFactory.create(attributes.size());
    // see comment above, index of the mapped string!
    dataRow.set(nominalAttribute, nominalAttribute.getMapping().mapString("World"));
    dataRow.set(numericalAttribute, 42.0);
    dataRow.set(dateTimeAttribute, Instant.now().plus(Duration.ofDays(1)).toEpochMilli());
    rowBuilder.addDataRow(dataRow);

    ExampleSet exampleSet = rowBuilder.build();
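The comments in the row-oriented example stress that a nominal value is stored as the index of its mapped string. The following is a minimal, self-contained sketch of that idea using only the Java standard library. SimpleNominalMapping and NominalMappingSketch are hypothetical stand-ins written for illustration; they are not RapidMiner classes, and the real NominalMapping may behave differently in detail.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical, simplified stand-in for a nominal mapping (not the RapidMiner class). */
class SimpleNominalMapping {
    private final Map<String, Integer> indexByValue = new HashMap<>();
    private final List<String> valueByIndex = new ArrayList<>();

    /** Returns the index of the given string, registering the string first if it is new. */
    int mapString(String value) {
        Integer index = indexByValue.get(value);
        if (index == null) {
            index = valueByIndex.size();
            indexByValue.put(value, index);
            valueByIndex.add(value);
        }
        return index;
    }

    /** Returns the string that was registered under the given index. */
    String mapIndex(int index) {
        return valueByIndex.get(index);
    }
}

class NominalMappingSketch {
    public static void main(String[] args) {
        SimpleNominalMapping mapping = new SimpleNominalMapping();

        // A row of a purely numeric table: the nominal cell holds an index, not text.
        double[] row = new double[2];
        row[0] = mapping.mapString("Hello"); // first new value is assigned index 0
        row[1] = 1.0;                        // ordinary numerical value

        System.out.println(row[0]);                         // prints 0.0
        System.out.println(mapping.mapIndex((int) row[0])); // prints Hello
        System.out.println(mapping.mapString("World"));     // prints 1
        System.out.println(mapping.mapString("Hello"));     // prints 0 (already known)
    }
}
```

In this sketch, writing the raw string into the row is not even possible because the row is a double array, which is why the index has to be looked up first.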
Freeing unused resources
The ExampleSet interface has been extended with the cleanup() method. RapidMiner invokes this method at certain points during process execution, e.g., between operators. Note that the default implementation does nothing.
    /**
     * Frees unused resources, if supported by the implementation. Does nothing by default.
     *
     * Should only be used on freshly {@link #clone}ed {@link ExampleSet}s to ensure that the
     * cleaned up resources are not requested afterwards.
     *
     * @since 7.3
     */
    public default void cleanup() {
        // does nothing by default
    }
When implementing custom example sets that manage their own resources, please use this method to free unused data such as temporary attributes.
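As a rough illustration of freeing temporary attributes, the sketch below models a data set that owns its column storage and drops every column that no attribute refers to any more. MiniExampleSet, ColumnBackedSet, and CleanupSketch are hypothetical names invented for this example and use only the Java standard library; how a real ExampleSet implementation tracks and releases its resources is an assumption here, not something taken from the RapidMiner code base.

```java
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical miniature of an example set interface with a no-op default cleanup. */
interface MiniExampleSet {
    default void cleanup() {
        // does nothing by default
    }
}

/** Hypothetical data set that owns its column storage. */
class ColumnBackedSet implements MiniExampleSet {
    private final Map<String, double[]> storage = new LinkedHashMap<>();
    private final Set<String> attributes = new LinkedHashSet<>();

    /** Allocates a column and makes it visible as an attribute. */
    void addAttribute(String name, double[] values) {
        storage.put(name, values);
        attributes.add(name);
    }

    /** Hides the attribute from the data set but leaves its column allocated. */
    void removeAttribute(String name) {
        attributes.remove(name);
    }

    /** Number of columns that currently occupy memory. */
    int allocatedColumns() {
        return storage.size();
    }

    @Override
    public void cleanup() {
        // Drop every stored column that no attribute refers to any more.
        storage.keySet().retainAll(attributes);
    }
}

class CleanupSketch {
    public static void main(String[] args) {
        ColumnBackedSet set = new ColumnBackedSet();
        set.addAttribute("price", new double[] {1.0, 2.0});
        set.addAttribute("tmp_score", new double[] {0.1, 0.9});

        set.removeAttribute("tmp_score");
        System.out.println(set.allocatedColumns()); // prints 2: the column is still allocated

        set.cleanup();
        System.out.println(set.allocatedColumns()); // prints 1: the unused column was freed
    }
}
```

The sketch also hints at why the Javadoc restricts cleanup() to freshly cloned example sets: once a column has been dropped, nothing that still expects it can read it again.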
If you do not manage your own resources but implement a custom ExampleSet that acts as a view on top of another data set, please delegate the call accordingly.
For instance, most of RapidMiner's view implementations reference a single parent example set. Thus, their implementation of cleanup() boils down to:
    @Override
    public void cleanup() {
        parent.cleanup();
    }
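To see why the delegation matters, the following self-contained sketch stacks views on top of a root data set and checks whether a cleanup request reaches the root. CleanableSet, RootSet, DelegatingView, ForgetfulView, and DelegationSketch are hypothetical names made up for this illustration; only the pattern of forwarding cleanup() to the parent is taken from the text above.

```java
/** Hypothetical miniature of an example set interface with a no-op default cleanup. */
interface CleanableSet {
    default void cleanup() {
        // does nothing by default
    }
}

/** Hypothetical root data set that counts how often it was asked to clean up. */
class RootSet implements CleanableSet {
    int cleanupCalls = 0;

    @Override
    public void cleanup() {
        cleanupCalls++;
    }
}

/** Hypothetical view that owns no data and forwards cleanup to its parent. */
class DelegatingView implements CleanableSet {
    private final CleanableSet parent;

    DelegatingView(CleanableSet parent) {
        this.parent = parent;
    }

    @Override
    public void cleanup() {
        parent.cleanup();
    }
}

/** Hypothetical view that does not override cleanup and so inherits the no-op default. */
class ForgetfulView implements CleanableSet {
    private final CleanableSet parent;

    ForgetfulView(CleanableSet parent) {
        this.parent = parent;
    }
}

class DelegationSketch {
    public static void main(String[] args) {
        RootSet reached = new RootSet();
        // Two stacked views: the request travels through both and arrives at the root.
        new DelegatingView(new DelegatingView(reached)).cleanup();
        System.out.println(reached.cleanupCalls); // prints 1

        RootSet unreached = new RootSet();
        // The forgetful view swallows the request, so the root never frees anything.
        new DelegatingView(new ForgetfulView(unreached)).cleanup();
        System.out.println(unreached.cleanupCalls); // prints 0
    }
}
```

A single view in the chain that keeps the default implementation is enough to stop the request, which is the reason every custom view should forward the call.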