You are viewing the RapidMiner Studio documentation for version 9.7.

Using the Sensor Link extension

The Sensor Link extension connects RapidMiner to the OSIsoft PI System, allowing easy extraction of operational data and updating of data points using RapidMiner processes.

Sensor Link utilizes the PI Web API and is compatible with API versions 2017 R2 and newer.

Please note that OSIsoft requires a PI System Access (PSA) license for using programmatic APIs such as the PI Web API. In particular, the availability of the API, e.g., as part of a PI Vision installation, does not guarantee a PSA license is available.

Install the Sensor Link extension

To install the extension, go to the Extensions menu, open the RapidMiner Marketplace (Updates and Extensions), and search for Sensor Link. For more detail, see Adding extensions.

Connect to the PI Web API

Sensor Link makes use of RapidMiner's connection framework. This lets you manage connections centrally and reuse them across operators. The extension supports both HTTP Basic Authentication (username and password) and Windows Authentication using the current Windows user (Kerberos, NTLM). You can create a new connection from the Connections menu:

For any connection you will need to specify the PI Web API endpoint and the default root path to use. The root path should be the name of a data server. It can be overridden in the operator parameters if required.

HTTP Basic Authentication

The following connection is an example using HTTP Basic Authentication. It connects RapidMiner to OSIsoft's public test system:
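For readers who want to see what HTTP Basic Authentication amounts to on the wire, the following is a minimal Python sketch. It is not how Sensor Link is implemented. The endpoint URL, user name, and password are hypothetical placeholders, and the request is only constructed, never sent.

```python
import base64
import urllib.request


def build_basic_auth_request(endpoint: str, username: str, password: str) -> urllib.request.Request:
    """Return an unsent GET request carrying an HTTP Basic Authorization header."""
    # Basic authentication is the base64 encoding of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8")).decode("ascii")
    request = urllib.request.Request(endpoint, method="GET")
    request.add_header("Authorization", f"Basic {token}")
    return request


# Hypothetical endpoint and credentials, for illustration only.
req = build_basic_auth_request("https://pi.example.com/piwebapi", "user", "secret")
print(req.get_header("Authorization"))
```

Because the credentials are only encoded, not encrypted, this scheme should be used over HTTPS, which is where the SSL settings of the connection become relevant.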

Windows Authentication

To connect to an instance that requires Windows Authentication (Kerberos), select Windows SSO (Kerberos/NTLM) as the authentication method:

When using this authentication method, Sensor Link will use the current Windows user for authentication.

SSL settings and troubleshooting

By default, Sensor Link only trusts secured connections if RapidMiner recognizes the certificate used by the endpoint, and if the certificate was issued for the hostname of the endpoint.

If a connection to an internal system fails with an SSL error, this is most likely due to one of these two requirements not being met. In this case, we strongly recommend adding the certificate to RapidMiner Studio first. For more details, see Trust a self-signed SSL certificate.

Alternatively, you can configure the connection to trust any self-signed certificate (less secure). Furthermore, the setting Verify host can be deactivated to trust certificates with mismatching host names.
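The two checks described in this section (a trusted certificate and a matching hostname) have direct counterparts in most TLS libraries. The sketch below uses Python's standard `ssl` module as an analogy only. The function name and its two parameters are hypothetical and merely mirror the connection settings; they say nothing about how Sensor Link implements them.

```python
import ssl


def make_context(trust_any_certificate: bool = False, verify_host: bool = True) -> ssl.SSLContext:
    """Build a TLS context that mirrors the two connection settings."""
    # The default context verifies both the certificate chain and the hostname.
    ctx = ssl.create_default_context()
    if trust_any_certificate or not verify_host:
        # Hostname checking must be switched off before verification can be relaxed.
        ctx.check_hostname = False
    if trust_any_certificate:
        # Accept any certificate (less secure).
        ctx.verify_mode = ssl.CERT_NONE
    return ctx


strict = make_context()
no_host_check = make_context(verify_host=False)
trust_all = make_context(trust_any_certificate=True)
print(strict.check_hostname, no_host_check.check_hostname, trust_all.verify_mode)
```

The preferred fix from the text, adding the certificate to the trust store, would correspond to keeping the strict context and calling `ctx.load_verify_locations(cafile=...)` with the certificate file.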

Examples

You can find the operators Compressed Data, Current Values, Sample Data, Calculate Data, and Publish Data by searching for PI in the Operators panel:

For all operators you can specify the connection either by connecting the input port or by selecting the connection in the corresponding parameter (only visible if no input is connected). For most operators the only other mandatory parameter is the first data item or an expression (performance equation). By default, the PI Web API will answer with data for the last 24 hours.

Calculate Data

The following example queries 5 minute averages for the data points BA:CONC.1 and BA:TEMP.1 (specified under additional data items). The start and end time parameters use the relative expressions Y and T for yesterday and today respectively. For an overview of supported time strings for start and end times, please refer to the Web API documentation.
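To make the relative expressions more concrete, here is a small Python sketch that resolves `Y`, `T`, and `*` against a given clock time. It assumes that `T` and `Y` resolve to midnight at the start of today and yesterday, and that `*` means the current time; the function name is hypothetical, and the full grammar of PI time strings is considerably richer than these three cases.

```python
from datetime import datetime, timedelta


def resolve_relative_time(expression: str, now: datetime) -> datetime:
    """Resolve a small subset of relative time expressions (assumed semantics)."""
    midnight = now.replace(hour=0, minute=0, second=0, microsecond=0)
    if expression == "*":
        return now  # current time
    if expression == "T":
        return midnight  # start of today
    if expression == "Y":
        return midnight - timedelta(days=1)  # start of yesterday
    raise ValueError(f"unsupported expression in this sketch: {expression!r}")


now = datetime(2020, 6, 2, 14, 30)  # made-up reference time
start = resolve_relative_time("Y", now)
end = resolve_relative_time("T", now)
intervals = (end - start) / timedelta(minutes=5)
print(start, end, intervals)
```

Under these assumptions, a query from `Y` to `T` covers one full day, that is, 288 five-minute intervals.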

This query results in a data table similar to this:

Please note that Calculate Data only provides limited support for non-numeric data points. Only the calculation methods count and percent good can be used in combination with such points.

Sample Data

This example samples the data points BA:CONC.1, BA:TEMP.1, and BA:ACTIVE.1 (specified under additional data items) at 10 minute intervals. Samples are taken by interpolating between the nearest two recorded values. This time, we use absolute start and end times:

Sample Data supports both numeric and non-numeric data points. The operator will look up the correct type automatically and map it to the corresponding RapidMiner attribute types (real, integer, and polynominal).

This query results in the following data table:
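To illustrate the interpolation step described in this section, the following Python sketch samples a numeric series at fixed intervals by interpolating linearly between the two nearest recorded values. The function names and the recorded values are made up for illustration, and linear interpolation is an assumption: the PI System may treat step-wise or non-numeric points differently.

```python
from bisect import bisect_left
from datetime import datetime, timedelta


def interpolate_at(recorded, when):
    """Linearly interpolate a value at `when`; `recorded` is sorted (time, value) pairs."""
    times = [t for t, _ in recorded]
    i = bisect_left(times, when)
    if i < len(times) and times[i] == when:
        return recorded[i][1]  # exact hit on a recorded value
    if i == 0 or i == len(times):
        return None  # outside the recorded range
    (t0, v0), (t1, v1) = recorded[i - 1], recorded[i]
    fraction = (when - t0) / (t1 - t0)
    return v0 + fraction * (v1 - v0)


def sample(recorded, start, end, step):
    """Sample the series from `start` to `end` (inclusive) every `step`."""
    rows = []
    current = start
    while current <= end:
        rows.append((current, interpolate_at(recorded, current)))
        current += step
    return rows


base = datetime(2020, 6, 1)
recorded = [  # made-up recorded values
    (base, 10.0),
    (base + timedelta(minutes=20), 14.0),
    (base + timedelta(minutes=40), 12.0),
]
rows = sample(recorded, base, base + timedelta(minutes=40), timedelta(minutes=10))
for timestamp, value in rows:
    print(timestamp, value)
```

The resulting rows are equidistant by construction, which is the main difference to the raw recorded data.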

Compressed Data

The Compressed Data operator can be used to retrieve raw recorded data. The PI Data Archive might compress the recorded data over time, e.g., by removing data points that have little importance for interpolation. Hence the name of the operator. Its interface is similar to the other operators:

However, its output differs in that the timestamps are not necessarily equidistant. Furthermore, when retrieving data for multiple data points, it is not guaranteed that the same timestamps are returned for the different points. The operator handles this by performing an outer join on the timestamps and leaving missing cells empty (displayed as '?'):
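The outer join on timestamps can be pictured with the following Python sketch. The function name and the sample values are made up for illustration. Missing cells are represented as `None` here, which corresponds to the '?' shown in RapidMiner.

```python
def outer_join_on_timestamp(series_by_name):
    """Join several {timestamp: value} series; missing cells become None."""
    names = list(series_by_name)
    # Collect every timestamp that occurs in at least one series.
    all_timestamps = sorted(set().union(*(s.keys() for s in series_by_name.values())))
    return [
        {"timestamp": ts, **{name: series_by_name[name].get(ts) for name in names}}
        for ts in all_timestamps
    ]


# Made-up recorded values with partially mismatching timestamps.
series = {
    "BA:CONC.1": {"2020-06-01T00:00:00": 1.2, "2020-06-01T00:07:00": 1.4},
    "BA:TEMP.1": {"2020-06-01T00:00:00": 20.5, "2020-06-01T00:05:00": 21.0},
}
table = outer_join_on_timestamp(series)
for row in table:
    print(row)
```

Only the first timestamp occurs in both series, so the other two rows each contain one empty cell.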

Current Values

The Current Values operator is similar to the Compressed Data operator in that it returns raw recorded data. But as its name suggests, it only returns the most recent value for each data point:

However, when querying multiple data points at the same time, we might still end up with multiple rows due to mismatching timestamps:

Time filtering with performance equations

The connector does not implement the Time Filtered function known from PI Data Link. However, it is possible to calculate the amount of time a performance equation evaluates to true using the equivalent expressions within the performance equation itself.

The example above evaluates the expression TimeLt('BA:TEMP.1', '*-1h', '*', 10) once every full hour. The expression itself computes the amount of time the recorded temperature BA:TEMP.1 was below 10 degrees in the last hour (in seconds):
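As a rough illustration of what such an expression computes, the Python sketch below adds up the seconds during which a series was below a threshold within a time window. It assumes that each recorded value is held until the next one is recorded (step-wise behavior). The PI System may instead interpolate between values to find the crossing points, so its results can differ. The function name and the data are made up.

```python
from datetime import datetime, timedelta


def seconds_below(recorded, start, end, threshold):
    """Seconds within [start, end] during which the held value was below `threshold`."""
    total = 0.0
    for index, (timestamp, value) in enumerate(recorded):
        # Each value is assumed to stay valid until the next recorded value.
        if index + 1 < len(recorded):
            next_timestamp = recorded[index + 1][0]
        else:
            next_timestamp = end
        segment_start = max(timestamp, start)
        segment_end = min(next_timestamp, end)
        if value < threshold and segment_end > segment_start:
            total += (segment_end - segment_start).total_seconds()
    return total


end = datetime(2020, 6, 1, 12, 0)
start = end - timedelta(hours=1)  # plays the role of '*-1h'
recorded = [  # made-up temperature readings, sorted by time
    (datetime(2020, 6, 1, 10, 50), 12.0),
    (datetime(2020, 6, 1, 11, 20), 9.0),
    (datetime(2020, 6, 1, 11, 45), 11.0),
]
below = seconds_below(recorded, start, end, 10)
print(below)
```

In this made-up series the value is below 10 only between 11:20 and 11:45, which amounts to 25 minutes or 1500 seconds.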

Filter expressions

All data retrieval operators support filter expressions. Let's revisit the example of the Sample Data operator. If we only want to retrieve data for rows where BA:ACTIVE.1 indicates an active state, we can do so by using the filter 'BA:ACTIVE.1' = "Active":

Please note that after filtering, the returned data is no longer guaranteed to be equidistant. For example, there is a 20-minute gap between rows 2 and 3 because we dropped an inactive row in between (see the original table in the Sample Data example):
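The effect on the spacing can be reproduced with a few lines of Python. The rows below are made up and only mimic the situation described here: four samples at 10-minute intervals, one of which is inactive. The helper names are hypothetical.

```python
from datetime import datetime, timedelta

base = datetime(2020, 6, 1)
rows = [  # made-up (timestamp, BA:ACTIVE.1) samples at 10-minute intervals
    (base, "Active"),
    (base + timedelta(minutes=10), "Active"),
    (base + timedelta(minutes=20), "Inactive"),
    (base + timedelta(minutes=30), "Active"),
]


def keep_active(rows):
    """Keep only rows whose state equals "Active", like the filter expression."""
    return [row for row in rows if row[1] == "Active"]


def gaps(rows):
    """Time differences between consecutive rows."""
    return [later[0] - earlier[0] for earlier, later in zip(rows, rows[1:])]


filtered = keep_active(rows)
print(gaps(rows))      # three gaps of 10 minutes each
print(gaps(filtered))  # a 10-minute gap followed by a 20-minute gap
```

If downstream operators rely on equidistant rows, the gaps have to be handled explicitly after filtering.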