Search Solr (Data)

Synopsis

This operator searches for Solr entries and generates an example set.

Description

To connect to a Solr server, you have to specify a Solr connection. This comprises the URL of a Solr server and an optional user/password combination for authentication. Typically, the Solr server URL ends with the string '/solr'.
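As a rough illustration of what the connection URL is used for, the sketch below assembles a search endpoint from a server URL ending in '/solr' and a collection name. The host, the collection name `products`, and the helper `select_url` are hypothetical. `/select` is Solr's standard search handler path, but which requests the operator issues internally is an assumption.

```python
from urllib.parse import quote

def select_url(server_url: str, collection: str) -> str:
    """Build the URL of the standard /select handler for a collection."""
    base = server_url.rstrip("/")  # tolerate a trailing slash after '/solr'
    return f"{base}/{quote(collection, safe='')}/select"

# Hypothetical local server and collection name.
print(select_url("http://localhost:8983/solr/", "products"))
```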

The next step is to select a collection on the server. A collection can be imagined as a table. It is composed of several columns, which are called Solr fields. A Solr field has a type (e.g. number) and a key (the name of the column). Each entry in Solr can be imagined as a row and contains values for the respective fields.

A RapidMiner example set has a very similar structure and can also be imagined as a table. Therefore, every Solr entry is added as a row in RapidMiner, and the fields of the Solr collection are used as RapidMiner attributes.
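The mapping from Solr entries to table rows can be pictured with a small sketch. The documents and the helper `docs_to_table` are made up for illustration, and representing a missing field as `None` is an assumption, not a statement about how the operator fills missing values.

```python
def docs_to_table(docs):
    """Turn a list of Solr documents (dicts) into column names and rows."""
    columns = []
    for doc in docs:
        for key in doc:
            if key not in columns:
                columns.append(key)  # keep first-seen order of field names
    # One row per document; fields a document lacks become None (assumption).
    rows = [[doc.get(col) for col in columns] for doc in docs]
    return columns, rows

# Hypothetical search result with two entries.
docs = [
    {"id": "1", "name": "John", "popularity": 6},
    {"id": "2", "name": "Jane"},
]
columns, rows = docs_to_table(docs)
print(columns)
print(rows)
```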

To search Solr, you have to specify a query string. You can add filters to refine your query. For example, to exclude all items whose field "popularity" has the value "6", use "!popularity:6". The range of entries to retrieve is set by the offset and limit parameters. You can specify which field is used to sort the retrieved entries. It is also possible to enable faceting. Faceted search breaks up search results into multiple categories. Use the categorical facets and date facets parameters to specify the Solr fields for faceting.
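For readers who know Solr's HTTP interface, the settings described above correspond roughly to standard Solr request parameters (`q`, `fq`, `start`, `rows`, `sort`, `facet`, `facet.field`). The sketch below only builds such a parameter list. The helper `build_params` and the field names `price` and `cat` are hypothetical, and the exact request the operator sends is an assumption.

```python
from urllib.parse import urlencode

def build_params(query, filters=(), offset=0, rows=10,
                 sort_field=None, descending=False, facet_fields=()):
    """Assemble standard Solr search parameters as (name, value) pairs."""
    params = [("q", query), ("start", offset), ("rows", rows), ("wt", "json")]
    params += [("fq", f) for f in filters]  # one fq parameter per filter
    if sort_field:
        order = "desc" if descending else "asc"
        params.append(("sort", f"{sort_field} {order}"))
    if facet_fields:
        params.append(("facet", "true"))
        params += [("facet.field", f) for f in facet_fields]
    return params

# Match everything, drop entries with popularity 6, sort by a hypothetical field.
params = build_params("*:*", filters=["!popularity:6"], rows=5,
                      sort_field="price", facet_fields=["cat"])
print(urlencode(params))
```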

If a Solr field supports multiple elements, the related values are provided as a JSON array.
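Such a value can be turned back into a list with any JSON parser. A minimal sketch, assuming the cell content shown here (the value itself is invented):

```python
import json

# Hypothetical cell value of a multi-valued Solr field, as a JSON array string.
cell = '["electronics", "hard drive"]'

values = json.loads(cell)  # parse the JSON array into a Python list
print(values)
print(len(values))
```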

Input

connection

This input port expects a Connection object, if any. See the parameter connection entry for more information.

Output

output

This port provides the main search result. It consists of an example set.

facets

This port is used to provide results of the faceted search. An example set is provided and contains the field name, the value which was found, and the number of occurrences.
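The shape of this example set can be sketched from Solr's JSON response, where field facets arrive by default as a flat list of alternating values and counts. The helper `facet_rows` and the sample data are hypothetical, and whether the operator builds its output exactly this way is an assumption.

```python
def facet_rows(facet_fields):
    """Flatten Solr's facet_fields section into (field, value, count) rows."""
    rows = []
    for field, flat in facet_fields.items():
        # Solr's default JSON layout: [value1, count1, value2, count2, ...]
        for value, count in zip(flat[0::2], flat[1::2]):
            rows.append((field, value, count))
    return rows

# Hypothetical excerpt of response["facet_counts"]["facet_fields"].
sample = {"cat": ["electronics", 12, "memory", 3]}
rows = facet_rows(sample)
print(rows)
```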

connection

This output port delivers the Connection object from the input port. If the input port is not connected, this port delivers nothing.

Parameters

Connection source

This parameter indicates how the connection should be specified. It gives you two options, predefined and repository. The parameter is not visible if the connection input port is connected.

Connection entry

This parameter is only available when the connection source parameter is set to repository. This parameter is used to specify a repository location that represents a connection entry. The connection can also be provided using the connection input port.

Connection

This parameter is only available when the connection source parameter is set to predefined. The connection details of the Solr connection have to be specified. If you have already configured a Solr connection, you can select it from the drop-down list. If you have not configured a Solr connection yet, select the icon to the right of the drop-down list and create a new Solr connection in the Manage connections dialog. The Solr server URL is required. Additionally, you can specify a username/password combination for authentication.

Collection

Provide the name of the Solr collection from which the data is retrieved.

Query

The term to search for.

Filter query

A filter that refines the query without influencing the relevancy score, which determines the default sort order. For example, if the field name has to contain John but must not contain Doe, you can use 'name:John -name:Doe'.
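A filter string like the one above can be composed from required and excluded terms. The helper `filter_clause` is hypothetical and a minimal sketch only: it does not escape special characters or quote terms containing spaces, which real values may need.

```python
def filter_clause(field, must=(), must_not=()):
    """Compose a filter such as 'name:John -name:Doe' for a single field."""
    parts = [f"{field}:{term}" for term in must]
    parts += [f"-{field}:{term}" for term in must_not]  # '-' excludes a term
    return " ".join(parts)

fq = filter_clause("name", must=["John"], must_not=["Doe"])
print(fq)  # name:John -name:Doe
```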

Offset

The first document index to fetch.

Limit

The maximum number of results.
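Offset and limit together select a window of the result list, so a large result can be fetched in pieces by advancing the offset. The generator `page_windows` below is a hypothetical helper that only computes those windows; it does not contact Solr, and the total of 25 results is invented.

```python
def page_windows(total, limit, offset=0):
    """Yield (offset, limit) pairs that cover `total` results in chunks."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    start = offset
    while start < total:
        # The last window may be smaller than `limit`.
        yield start, min(limit, total - start)
        start += limit

print(list(page_windows(total=25, limit=10)))  # [(0, 10), (10, 10), (20, 5)]
```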

Sort

Specifies whether the search results are sorted.

Sort field

The Solr field which is used for sorting.

Sort order

The sorting order of results.

Specifies whether faceted search is used.

Categorical facets

The facets to use for faceted search.

Date facets

The date facets to use for faceted search. A single date facet consists of the field name, a start date, an end date, and a gap.
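The four parts of a date facet resemble Solr's range faceting parameters. The sketch below expresses one date facet with the per-field form `f.<field>.facet.range.*`. The helper `date_facet_params`, the field name `manufacturedate_dt`, and the dates are hypothetical, and it is an assumption that the operator uses range faceting rather than another Solr mechanism.

```python
def date_facet_params(field, start, end, gap):
    """Express one date facet as Solr range-faceting parameters."""
    prefix = f"f.{field}.facet.range"
    return [
        ("facet", "true"),
        ("facet.range", field),
        (f"{prefix}.start", start),  # lower bound of the first bucket
        (f"{prefix}.end", end),      # upper bound of the last bucket
        (f"{prefix}.gap", gap),      # bucket size in Solr date math
    ]

# Hypothetical field, bucketed by month over one year.
params = date_facet_params(
    "manufacturedate_dt",
    "2020-01-01T00:00:00Z",
    "2021-01-01T00:00:00Z",
    "+1MONTH",
)
for name, value in params:
    print(name, "=", value)
```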

Include generated fields

Specifies whether automatically generated fields are included in the search results. These fields can be SolrCloud fields or can be based on dynamic Solr fields.