You are viewing the RapidMiner Developers documentation for version 9.7.

API changes in RapidMiner 7.3

RapidMiner 7.3 brings two changes that affect the development of extensions. First, a central API for the creation of data sets (ExampleSet instances) was introduced. Second, the ExampleSet interface was extended with a method that allows freeing unused data.

These changes only affect you if your extension includes operators that generate new data sets or defines its own ExampleSets (e.g., custom views).

Generating data sets

RapidMiner 7.3 adds the ExampleSets class, which provides a set of static methods to build new data sets. These methods replace direct instantiation of ExampleTable implementations such as MemoryExampleTable. In particular, all public constructors of the MemoryExampleTable class have been deprecated.

The new API provides methods to create data sets from both columnar and row-oriented data:

    import com.rapidminer.example.Attribute;
    import com.rapidminer.example.Attributes;
    import com.rapidminer.example.ExampleSet;
    import com.rapidminer.example.table.AttributeFactory;
    import com.rapidminer.example.table.BinominalMapping;
    import com.rapidminer.example.table.DataRow;
    import com.rapidminer.example.table.DataRowFactory;
    import com.rapidminer.example.table.NominalMapping;
    import com.rapidminer.example.utils.ExampleSetBuilder;
    import com.rapidminer.example.utils.ExampleSets;
    import com.rapidminer.operator.Operator;
    import com.rapidminer.operator.OperatorDescription;
    import com.rapidminer.operator.OperatorException;
    import com.rapidminer.tools.Ontology;

    import java.time.Duration;
    import java.time.Instant;
    import java.util.ArrayList;
    import java.util.List;

    /** Create example set using column fillers */
    Attribute topTen = AttributeFactory.createAttribute("Top Ten Numbers", Ontology.INTEGER);
    Attribute coinFlip = AttributeFactory.createAttribute("Coin Flip", Ontology.BINOMINAL);
    NominalMapping coin = new BinominalMapping();
    int heads = coin.mapString("Heads");
    int tails = coin.mapString("Tails");
    coinFlip.setMapping(coin);

    ExampleSet numbers = ExampleSets.from(topTen, coinFlip)
            .withRole(topTen, Attributes.ID_NAME)
            .withBlankSize(10)
            .withColumnFiller(topTen, i -> i + 1)
            .withColumnFiller(coinFlip, i -> Math.random() < 0.5 ? heads : tails)
            .build();

    /** Create example set from double matrix */
    ExampleSetBuilder matrixBuilder = ExampleSets.from(
            AttributeFactory.createAttribute(Ontology.REAL),
            AttributeFactory.createAttribute(Ontology.REAL),
            AttributeFactory.createAttribute(Ontology.REAL));
    matrixBuilder.withExpectedSize(10);

    double[][] rawData = new double[10][3];
    for (double[] row : rawData) {
        matrixBuilder.addRow(row);
    }
    ExampleSet matrix = matrixBuilder.build();

    /** Create example set from custom DataRows */
    Attribute nominalAttribute = AttributeFactory.createAttribute("Nominal", Ontology.NOMINAL);
    Attribute numericalAttribute = AttributeFactory.createAttribute("Numerical", Ontology.REAL);
    Attribute dateTimeAttribute = AttributeFactory.createAttribute("DateTime", Ontology.DATE_TIME);

    List<Attribute> attributes = new ArrayList<>();
    attributes.add(nominalAttribute);
    attributes.add(numericalAttribute);
    attributes.add(dateTimeAttribute);

    ExampleSetBuilder rowBuilder = ExampleSets.from(attributes).withExpectedSize(2);
    DataRowFactory dataRowFactory = new DataRowFactory(DataRowFactory.TYPE_DOUBLE_ARRAY, '.');

    DataRow dataRow = dataRowFactory.create(attributes.size());
    // Important: for nominal attributes, the value to set in the data row
    // is the index of the mapped string!
    dataRow.set(nominalAttribute, nominalAttribute.getMapping().mapString("Hello"));
    dataRow.set(numericalAttribute, 1.0);
    dataRow.set(dateTimeAttribute, Instant.now().toEpochMilli());
    rowBuilder.addDataRow(dataRow);

    dataRow = dataRowFactory.create(attributes.size());
    // See comment above: index of the mapped string!
    dataRow.set(nominalAttribute, nominalAttribute.getMapping().mapString("World"));
    dataRow.set(numericalAttribute, 42.0);
    dataRow.set(dateTimeAttribute, Instant.now().plus(Duration.ofDays(1)).toEpochMilli());
    rowBuilder.addDataRow(dataRow);

    ExampleSet exampleSet = rowBuilder.build();
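The comment about nominal attributes in the last example is easy to overlook, so here is a small sketch of the idea in isolation. It does not use the RapidMiner classes: ToyMapping is a hypothetical stand-in written only for this illustration, and its behavior (indices handed out in insertion order, starting at 0) is an assumption of the sketch, not a statement about NominalMapping. The point is only that a row of doubles stores the index of a nominal value, and that the mapping is needed to turn the index back into text.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy illustration only; none of these types are part of the RapidMiner API. */
final class NominalIndexSketch {

    /** Hypothetical minimal mapping: each distinct string gets the next free index. */
    static final class ToyMapping {
        private final Map<String, Integer> indexByValue = new HashMap<>();
        private final List<String> valueByIndex = new ArrayList<>();

        /** Returns the index of the string, registering it first if it is unknown. */
        int mapString(String value) {
            Integer known = indexByValue.get(value);
            if (known != null) {
                return known;
            }
            int index = valueByIndex.size();
            indexByValue.put(value, index);
            valueByIndex.add(value);
            return index;
        }

        /** Returns the string that was registered under the given index. */
        String mapIndex(int index) {
            return valueByIndex.get(index);
        }
    }

    public static void main(String[] args) {
        ToyMapping mapping = new ToyMapping();

        // A row holds only doubles: column 0 is nominal, column 1 is numerical.
        double[] row = new double[2];
        row[0] = mapping.mapString("Hello"); // stores the index, not the text
        row[1] = 1.0;

        System.out.println(row[0]);                         // prints 0.0
        System.out.println(mapping.mapIndex((int) row[0])); // prints Hello
        System.out.println(mapping.mapString("World"));     // prints 1
        System.out.println(mapping.mapString("Hello"));     // prints 0
    }
}
```

Writing the raw text, or an index that was never registered in the attribute's mapping, into the row would leave a value that cannot be resolved back to a string, which is why the examples above always go through mapString.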

Freeing unused resources

The ExampleSet interface has been extended by the cleanup() method. RapidMiner invokes this method at certain points of the process execution, e.g., in between operators. Note that the default implementation does nothing.

    /**
     * Frees unused resources, if supported by the implementation. Does nothing by default.
     *
     * Should only be used on freshly {@link #clone}ed {@link ExampleSet}s to ensure that the
     * cleaned up resources are not requested afterwards.
     *
     * @since 7.3
     */
    public default void cleanup() {
        // does nothing by default
    }

When implementing custom example sets that manage their own resources, please use this method to free unused data such as temporary attributes.
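As a rough sketch of what freeing temporary attributes can look like, the following self-contained example models a data set that owns its column storage. Everything in it is hypothetical: DataSetLike stands in for the relevant part of ExampleSet, and OwningDataSet, addColumn and storedColumnCount are names invented for this illustration. A real implementation would depend on how your example set stores its data.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;

/** Toy illustration only; none of these types are part of the RapidMiner API. */
final class CleanupSketch {

    /** Stand-in for the cleanup() part of ExampleSet. */
    interface DataSetLike {
        default void cleanup() {
            // does nothing by default
        }
    }

    /** Hypothetical data set that owns its column storage. */
    static final class OwningDataSet implements DataSetLike {
        private final Map<String, double[]> columns = new LinkedHashMap<>();
        private final Set<String> regular = new HashSet<>();

        /** Stores a column; temporary columns are not registered as regular. */
        void addColumn(String name, double[] data, boolean temporary) {
            columns.put(name, data);
            if (!temporary) {
                regular.add(name);
            }
        }

        int storedColumnCount() {
            return columns.size();
        }

        @Override
        public void cleanup() {
            // Drop every stored column that is not a regular column,
            // so its backing array becomes eligible for garbage collection.
            columns.keySet().retainAll(regular);
        }
    }

    public static void main(String[] args) {
        OwningDataSet data = new OwningDataSet();
        data.addColumn("age", new double[] {31, 47}, false);
        data.addColumn("tmp_normalized_age", new double[] {0.0, 1.0}, true);

        System.out.println(data.storedColumnCount()); // prints 2
        data.cleanup();
        System.out.println(data.storedColumnCount()); // prints 1
    }
}
```

The sketch frees data that nothing refers to any longer, which matches the Javadoc's advice to call cleanup() only where the freed resources will not be requested afterwards.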

If you do not manage your own resources but implement a custom ExampleSet that acts as a view on top of another data set, please delegate the call accordingly.

For instance, most of RapidMiner's view implementations reference a single parent example set. Thus, their implementation of cleanup() boils down to:

    @Override
    public void cleanup() {
        parent.cleanup();
    }