Execute Python
Synopsis
Executes a Python script.
Description
Before using this operator, you may need to specify the path to your Python installation under Settings -> Preferences (on Mac OS, choose RapidMiner Studio -> Preferences). In the settings panel that appears, select the Python Scripting tab. Your Python installation must include the pandas module, since example sets are converted to pandas DataFrames. By unchecking the use default python checkbox, you can configure an individual Python binary for this operator instead of using the global settings.
This operator executes either the script provided through the script file port or parameter, or the script specified in the script parameter. The arguments of the script correspond to the input ports, where example sets are converted to pandas DataFrames. Analogously, the values returned by the script are delivered at the output ports of the operator, where DataFrames are converted to example sets.
The operator supports conda (anaconda) virtual environments and virtualenvwrapper virtual environments; you can also select a Python binary by specifying its full file system path. For more information on how to select the required Python, see the Parameters section of this help page. Note that you may need to configure the extension. To do so, go to Settings -> Preferences (on Mac OS, choose RapidMiner Studio -> Preferences), select the Python Scripting tab in the settings panel that appears, and edit the settings there if required.
Using conda: if you installed the conda Python distribution to a non-default location, you may need to add the installation directory and some subdirectories to the global settings of the Python Scripting Extension. To do so, go to Settings -> Preferences (on Mac OS, choose RapidMiner Studio -> Preferences) and select the Python Scripting tab in the settings panel that appears. Add the installation directory of your conda installation to the list of search paths. On Windows you also need to add the conda_install_dir\Scripts subdirectory, and on Linux and Mac OS the conda_install_dir/bin subdirectory.
Accessing macros: you can access and modify the macros defined in RapidMiner from the Python code. You can reference a macro by enclosing its name in %{} marks. Before the Python code is interpreted, these references are substituted with the actual macro values. For more fine-grained control over macros, set the use macros parameter. For more information, see the parameter description below.
The console output of Python is shown in the Log View (View -> Show View -> Log).
Input
script file
A file containing a Python script to be executed. The file has to comply with the script parameter rules. This port is optional; a file can also be provided through the script file parameter.
input
The Script operator can have multiple inputs. An input must be either an example set, a file object, a connection object, or a Python object which was generated by an 'Execute Python' operator.
Output
output
The Script operator can have multiple outputs. An output can be either an example set, a file object or a Python object generated by this operator.
Parameters
Script
The Python script to execute. Define a method named 'rm_main' with as many arguments as there are connected input ports, or alternatively with a *args argument to accept a dynamic number of arguments. The return values of the method 'rm_main' are delivered to the connected output ports. If the method returns a tuple, the individual entries of the tuple are delivered to the output ports. Entries of type pandas.DataFrame are converted to example sets, files are converted to File Objects, and other Python objects are serialized so that they can be used by other 'Execute Python' operators or stored in the repository. Serialized Python objects have to be smaller than 2 GB.
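The rules above can be sketched as follows. This is a minimal illustration, not the operator's default script: the column names and sample frames are made up, and the calls at the bottom only stand in for the operator invoking rm_main, which happens automatically inside RapidMiner.

```python
import pandas as pd


def rm_main(data, lookup):
    # Two arguments, so this variant assumes exactly two connected input ports,
    # both delivering example sets (received here as pandas DataFrames).
    joined = data.merge(lookup, on="id", how="left")
    summary = pd.DataFrame({"rows": [len(joined)]})
    # Returning a tuple: the first entry is delivered to the first output port,
    # the second entry to the second output port.
    return joined, summary


def rm_main_dynamic(*args):
    # Alternative signature using *args for a dynamic number of inputs.
    # It is named rm_main_dynamic only so both variants fit in one file;
    # inside the operator the method has to be called rm_main.
    frames = [a for a in args if isinstance(a, pd.DataFrame)]
    return pd.concat(frames, ignore_index=True)


# Stand-in for the operator calling the script (hypothetical sample data).
data = pd.DataFrame({"id": [1, 2, 3], "value": [10.0, 20.0, 30.0]})
lookup = pd.DataFrame({"id": [1, 2], "group": ["a", "b"]})
joined, summary = rm_main(data, lookup)
stacked = rm_main_dynamic(data, data)
```

Using fixed arguments keeps the script readable when the number of ports is known; the *args form is the more forgiving choice when ports may be added or removed later.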
If you pass an example set to your script through an input port, the meta data of the example set (types and roles) is available in the script. You can access it by reading the attribute rm_metadata of the associated pandas.DataFrame, in our example data. data.rm_metadata is a dictionary that maps attribute names to a tuple of attribute type and attribute role.
You can influence the meta data of an example set that you return as a pandas.DataFrame by setting the attribute rm_metadata. If you don't specify attribute types in this dictionary, they will be determined from the data types in Python. You can specify your own roles or use the standard roles of RapidMiner, such as 'label'.
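A minimal sketch of reading and setting rm_metadata is shown below. The type names ('real', 'nominal'), the 'prediction' role, the use of None for a regular attribute, and the column names are assumptions for illustration; only the dictionary shape (attribute name to a tuple of type and role) and the 'label' role come from the description above. Outside RapidMiner a DataFrame has no rm_metadata attribute, so the sample attaches one by hand.

```python
import warnings

import pandas as pd


def attach_metadata(frame, metadata):
    # Hypothetical helper: pandas may emit a UserWarning when a new
    # list-like attribute is set on a DataFrame, so it is silenced here.
    with warnings.catch_warnings():
        warnings.simplefilter("ignore", UserWarning)
        frame.rm_metadata = metadata


def rm_main(data):
    # Read the incoming meta data: attribute name -> (type, role).
    incoming = getattr(data, "rm_metadata", {})

    result = data.copy()
    result["score"] = result["value"] * 2

    # Carry the incoming entries over and describe the new column.
    outgoing = dict(incoming)
    outgoing["score"] = ("real", "prediction")  # assumed type and role names
    attach_metadata(result, outgoing)
    return result


# Stand-in for the operator supplying an example set with meta data.
data = pd.DataFrame({"value": [1.0, 2.0], "target": ["yes", "no"]})
attach_metadata(data, {"value": ("real", None), "target": ("nominal", "label")})
result = rm_main(data)
label_columns = [name for name, (_, role) in result.rm_metadata.items() if role == "label"]
```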
For more information about the meta data handling in a Python operator check the tutorial process 'Meta data handling' below.
If a script file is provided through either the script file port or the script file parameter (the port takes precedence), that script will be used instead of the value of this parameter.
Script file
A file containing a Python script to be executed. The file has to comply with the script parameter rules. This parameter is optional.
Use default python
Use the Python binary or environment defined in the RapidMiner Studio global settings. The global settings can be accessed from Settings -> Preferences (on Mac OS, choose RapidMiner Studio -> Preferences). In the settings panel that appears, select the Python Scripting tab. Here you can define the defaults.
Package manager
This parameter is only available if use default python is set to false. It specifies the package manager used by the operator. Currently Conda/Anaconda/Miniconda and Virtualenvwrapper are supported; alternatively, you can specify the full path to your preferred Python binary.
Conda environment
This parameter is only available if use default python is set to false and package manager is set to conda (anaconda). It specifies the conda virtual environment used by this operator.
Venvw environment
This parameter is only available if use default python is set to false and package manager is set to virtualenvwrapper. It specifies the virtualenvwrapper virtual environment used by this operator.
Python binary
This parameter is only available if use default python is set to false and package manager is set to specific python binaries. It specifies the path to the Python binary used by this operator.
Use macros
Use an additional named parameter macros for the rm_main method (note that you will need to modify the script and add the parameter manually). This way, all macro values are passed as an additional parameter of the rm_main method, and you can access them via the macros dictionary. Each dictionary value will be a Python string. You can also modify values of the dictionary or add new elements. The changes will be reflected in RapidMiner after the execution of the operator.
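A minimal sketch of a script written for this parameter follows. It assumes that no input ports are connected, so macros is the only argument, and the macro names 'threshold' and 'status' are hypothetical. The call at the bottom only imitates what the operator is described as doing; inside RapidMiner the dictionary is supplied for you.

```python
def rm_main(macros):
    # Assumes "use macros" is enabled and no input ports are connected,
    # so the only argument is the macros dictionary.
    threshold = float(macros["threshold"])    # macro values arrive as strings
    macros["threshold"] = str(threshold * 2)  # modify an existing macro
    macros["status"] = "done"                 # add a new macro
    # Nothing is returned, so no output port receives a value.


# Stand-in for the operator passing the macros as a named parameter.
macros = {"threshold": "0.25"}
rm_main(macros=macros)
```

Values are written back as strings here to match the string values the dictionary is described as holding.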