Search Solr (Data)
Synopsis
This operator searches for Solr entries and generates an example set.
Description
To connect to a Solr server, you have to specify a Solr connection. This comprises the URL of a Solr server and an optional user/password combination for authentication. Typically, the Solr server URL ends with the string '/solr'.
The next step is to select a collection on the server. A collection can be imagined as a table. It is composed of several columns, which are called Solr fields. A Solr field has a type (e.g. number) and a key (the name of the column). Each entry in Solr can be imagined as a row and contains values for the respective fields.
A RapidMiner example set has a very similar structure: it can also be imagined as a table. Therefore, every Solr entry is added as a row to the RapidMiner example set, and the fields of the Solr collection are used as RapidMiner attributes.
To search Solr, you have to specify a query string. You can add filters to refine your query. For example, to exclude items whose field "popularity" has the value "6", use "!popularity:6". The range of entries to retrieve is set by the offset and limit parameters. You can specify which field is used to sort the retrieved entries. It is also possible to enable faceting. Faceted search breaks up search results into multiple categories. Use the categorical facets and date facets parameters to specify the Solr fields for faceting.
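The following Python sketch illustrates how the settings described above could correspond to a request against Solr's standard select handler. It is not the operator's actual implementation. The request parameters q, fq, start, rows, sort, facet, and facet.field are standard Solr query parameters, but their mapping to the operator parameters is an assumption, and the server URL, the collection name "products", and the field names are hypothetical.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, filter_query=None,
                     offset=0, limit=10, sort_field=None, sort_order="asc",
                     facet_fields=()):
    """Build a Solr select URL (assumed mapping of operator parameters)."""
    params = [("q", query), ("start", offset), ("rows", limit)]
    if filter_query:
        # fq restricts the result set without affecting the relevancy score.
        params.append(("fq", filter_query))
    if sort_field:
        params.append(("sort", f"{sort_field} {sort_order}"))
    if facet_fields:
        params.append(("facet", "true"))
        params.extend(("facet.field", name) for name in facet_fields)
    return f"{base_url.rstrip('/')}/{collection}/select?{urlencode(params)}"


# Hypothetical server, collection, and field names.
url = build_select_url(
    "http://localhost:8983/solr", "products", "*:*",
    filter_query="!popularity:6", offset=0, limit=20,
    sort_field="price", sort_order="desc", facet_fields=["category"],
)
print(url)
```

The filter "!popularity:6" from the example above is passed as fq, so it narrows the result set while the query itself still determines relevancy.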
If a Solr field supports multiple elements, the related values are provided as a JSON array.
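If you process the example set outside RapidMiner, such a value can be decoded with any JSON parser. Below is a minimal Python sketch, assuming the cell is available as a string. The helper parse_multivalued and the sample values are hypothetical.

```python
import json


def parse_multivalued(cell):
    """Return the elements of a JSON-array cell, or the cell itself as a one-element list."""
    try:
        parsed = json.loads(cell)
    except (TypeError, ValueError):
        # Not valid JSON (or not a string): treat it as a single plain value.
        return [cell]
    return parsed if isinstance(parsed, list) else [cell]


# Hypothetical cell contents of a multi-valued field and a single-valued field.
print(parse_multivalued('["electronics", "hard drive"]'))
print(parse_multivalued("electronics"))
```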
Input
connection
This input port expects a Connection object, if any. See the parameter connection entry for more information.
Output
output
This port provides the main search result. It consists of an example set.
facets
This port is used to provide results of the faceted search. An example set is provided and contains the field name, the value which was found, and the number of occurrences.
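As an illustration of this table structure, the Python sketch below flattens the facet section of a Solr JSON response into rows of field name, value, and count. It assumes Solr's default response layout, in which each facet field maps to a flat list of alternating values and counts. The sample response is made up, and the operator's internal processing may differ.

```python
def facet_rows(response):
    """Flatten Solr facet_fields into (field, value, count) rows."""
    rows = []
    fields = response.get("facet_counts", {}).get("facet_fields", {})
    for field, flat in fields.items():
        # The flat list alternates: value, count, value, count, ...
        for value, count in zip(flat[0::2], flat[1::2]):
            rows.append((field, value, count))
    return rows


# Made-up response fragment for a hypothetical field "category".
sample = {
    "facet_counts": {
        "facet_fields": {"category": ["electronics", 14, "memory", 3]}
    }
}
for row in facet_rows(sample):
    print(row)
```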
connection
This output port delivers the Connection object from the input port. If the input port is not connected, the port delivers nothing.
Parameters
Connection source
This parameter indicates how the connection should be specified. It gives you two options: predefined and repository. The parameter is not visible if the connection input port is connected.
Connection entry
This parameter is only available when the connection source parameter is set to repository. This parameter is used to specify a repository location that represents a connection entry. The connection can also be provided using the connection input port.
Connection
This parameter is only available when the connection source parameter is set to predefined. The connection details of the Solr connection have to be specified. If you have already configured a Solr connection, you can select it from the drop-down list. If you have not configured a Solr connection yet, select the icon to the right of the drop-down list and create a new Solr connection in the Manage connections dialog. The Solr server URL is required. Additionally, you can specify a username/password combination for authentication.
Collection
The name of the Solr collection from which the data is retrieved.
Query
The term to search for.
Filter query
A filter that refines your query without influencing the relevancy score, which determines the default sort order. For example, if the field name has to contain John but must not contain Doe, you can use 'name:John -name:Doe'.
Offset
The first document index to fetch.
Limit
The maximum number of results.
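Offset and limit together select a window of the matching documents. The Python sketch below shows that windowing on a plain list, along with the offsets needed to walk through a result set page by page. It assumes that the offset is zero-based, as Solr's start parameter is. Both helper functions are hypothetical.

```python
def window(documents, offset, limit):
    """Return at most `limit` documents, starting at zero-based index `offset`."""
    return documents[offset:offset + limit]


def page_offsets(total, limit):
    """Return the offsets needed to fetch `total` documents in pages of `limit`."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    return list(range(0, total, limit))


docs = [f"doc{i}" for i in range(25)]
print(window(docs, offset=20, limit=10))  # only five documents remain
print(page_offsets(total=25, limit=10))
```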
Sort
Specifies whether the search results are sorted.
Sort field
The Solr field which is used for sorting.
Sort order
The sorting order of results.
Faceted search
Specifies whether faceted search is used.
Categorical facets
The facets to use for faceted search.
Date facets
The date facets to use for faceted search. A single date facet consists of the field name, a start date, an end date, and a gap.
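To illustrate the four components, the Python sketch below expresses one date facet as Solr range-faceting request parameters. The facet.range parameter and its per-field start, end, and gap variants are standard Solr parameters, but whether the operator uses exactly these is an assumption. The field name, the dates, and the gap are hypothetical.

```python
from urllib.parse import urlencode


def date_facet_params(field, start, end, gap):
    """Express one date facet as Solr range-faceting parameters (assumed mapping)."""
    prefix = f"f.{field}.facet.range"
    return [
        ("facet", "true"),
        ("facet.range", field),
        (f"{prefix}.start", start),
        (f"{prefix}.end", end),
        (f"{prefix}.gap", gap),  # Solr date math, e.g. +1MONTH
    ]


# Hypothetical field and date range: one bucket per month of 2020.
params = date_facet_params(
    "manufacturedate_dt",
    "2020-01-01T00:00:00Z",
    "2021-01-01T00:00:00Z",
    "+1MONTH",
)
query_string = urlencode(params)
print(query_string)
```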
Include generated fields
Specifies whether automatically generated fields are included in the search results. These fields can be SolrCloud fields or can be based on dynamic Solr fields.