You are viewing the RapidMiner Python documentation for version 9.9.
Execute Python code
Here are the basic features of the extension. Make sure to explore the tutorial processes provided with the Execute Python operator as well. Other operators (Python Learner and Python Transformer) are presented on the custom operators page.
How things work
It's important to understand how data is transferred between RapidMiner operators and the operators provided by the Python Scripting extension, that is, what happens when you connect a port of any RapidMiner operator to one of the Python operators (Execute Python, Python Learner, and Python Transformer).
When you pass data to one of the Python operators, RapidMiner ExampleSets are automatically transformed into Pandas DataFrames. The Pandas DataFrames returned by your rm_main function (see the next chapter on how to structure your code) are automatically converted back to RapidMiner ExampleSets by the Python Scripting extension. Metadata propagation and automatic data type conversion are also in place in both directions.
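As a minimal sketch of this round trip, the function below receives a DataFrame and returns a modified copy. The column names a and b and the derived column ratio are hypothetical and serve only as an illustration. The last two lines are a local check you could run outside RapidMiner.

```python
import pandas as pd


def rm_main(data):
    # `data` is the pandas DataFrame created from the connected ExampleSet.
    result = data.copy()
    # Hypothetical columns "a" and "b"; replace them with your own attributes.
    result["ratio"] = result["a"] / result["b"]
    # The returned DataFrame is converted back into an ExampleSet.
    return result


# Local check outside RapidMiner, using made-up sample data.
sample = pd.DataFrame({"a": [1.0, 4.0], "b": [2.0, 8.0]})
out = rm_main(sample)
```

Working on a copy keeps the incoming DataFrame unchanged, which makes the function easier to test on its own.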
How to structure your code
To execute your Python code successfully inside RapidMiner, you need to declare an rm_main function as the main entry point. The number and order of the input parameters and returned values of your rm_main function correspond to the input and output ports of the Execute Python operator.
You have to follow this convention regardless of whether you are using our inline editor or embedding a Python script or notebook file.
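The port mapping can be sketched with a function that takes two inputs and returns two outputs. Under the convention above, the first parameter would correspond to the first input port and the first returned value to the first output port. The parameter names and the columns customer_id and amount are hypothetical.

```python
import pandas as pd


def rm_main(customers, orders):
    # First input port -> `customers`, second input port -> `orders`.
    merged = customers.merge(orders, on="customer_id", how="inner")
    # Total order amount per customer (hypothetical "amount" column).
    totals = merged.groupby("customer_id", as_index=False)["amount"].sum()
    # First returned value -> first output port, second -> second output port.
    return merged, totals
```

Returning a tuple keeps the order of the outputs explicit, so connecting the output ports in the process stays predictable.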
Running scripts
You can execute your Python code either by editing it inline in our basic script editor (it provides basic syntax highlighting but lacks the more powerful features of a Python IDE), or by specifying a script file in the Execute Python operator's script file parameter. If your script is stored in a location accessible over the internet (such as GitHub), you can also read the script file directly from there with the help of the Open File operator.
You can also store your script file in your RapidMiner project or repository.
As a convenience feature, if you drag and drop a .py or .ipynb file from your project or repository to the canvas, the correct operators will be automatically created for you.
Running notebooks
You can also execute ipynb notebooks with the help of Execute Python. In this case, use the script file parameter of the operator to locate your notebook. The same considerations on how to structure code apply to notebooks as to Python scripts.
If you have tagged your notebook cells, we offer selective, tag-based execution. One way to do this is to click the Show Preview... button on the Execute Python operator (once you have added your notebook to the script file parameter) and pick which cells to exclude from the execution. Alternatively, you can specify which cells to execute by providing a regular expression in the notebook cell tag filter parameter.
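To get a feeling for how a regular expression can select tagged cells, the sketch below filters the cells of a notebook that has been loaded as a dictionary. It is not the extension's implementation. The helper select_cells is hypothetical, and both matching the whole tag (fullmatch) and skipping untagged cells are assumptions, since the exact matching rules are not described here. It relies only on the common notebook convention that cell tags are stored under metadata.tags.

```python
import re


def select_cells(notebook, tag_pattern):
    """Return the cells that have at least one tag matching tag_pattern."""
    regex = re.compile(tag_pattern)
    selected = []
    for cell in notebook["cells"]:
        # Tags are optional, so fall back to an empty list.
        tags = cell.get("metadata", {}).get("tags", [])
        # Assumption: the whole tag has to match the expression.
        if any(regex.fullmatch(tag) for tag in tags):
            selected.append(cell)
    return selected


# Made-up notebook content for a local check.
notebook = {
    "cells": [
        {"cell_type": "code", "metadata": {"tags": ["prep"]}, "source": "x = 1"},
        {"cell_type": "code", "metadata": {"tags": ["train"]}, "source": "y = 2"},
        {"cell_type": "code", "metadata": {"tags": ["plot"]}, "source": "z = 3"},
        {"cell_type": "code", "metadata": {}, "source": "w = 4"},
    ]
}
chosen = select_cells(notebook, r"prep|train")
```

Trying a pattern against your own tags this way can help you settle on the expression before entering it in the operator.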
Using RapidMiner macros
Macros added inline to the Python code with the %{myMacro} syntax are parsed before the script is executed, both for an inline script and for one provided as a script file. Unsurprisingly, such code then only runs inside RapidMiner and produces a syntax error anywhere else.
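The following sketch illustrates why such a script is not valid Python on its own, and what a textual replacement before execution could look like. The helper substitute_macros is a hypothetical stand-in written for this illustration. It is not part of the extension, and the real replacement rules may differ.

```python
import re

# Matches the %{name} macro syntax.
MACRO_PATTERN = re.compile(r"%\{(\w+)\}")


def substitute_macros(source, macros):
    """Replace every %{name} with the matching macro value (illustration only)."""
    return MACRO_PATTERN.sub(lambda m: str(macros[m.group(1)]), source)


def is_valid_python(source):
    """Check whether the source compiles without a syntax error."""
    try:
        compile(source, "<script>", "exec")
        return True
    except SyntaxError:
        return False


# A script line that uses an inline macro.
raw_script = "threshold = %{myMacro}\n"
resolved_script = substitute_macros(raw_script, {"myMacro": 0.5})
```

Here is_valid_python(raw_script) is false, while the resolved script compiles, which mirrors the behavior described above.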
Another, more Pythonic way to tackle this is to check the enable macros parameter on your Execute Python operator. Next, add an extra parameter to your rm_main function, through which macros will be accessible during execution. This allows you not only to read macro values, but also to define new ones or overwrite the values of existing macros.
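A minimal sketch of this approach follows. It assumes that the macros arrive as a dictionary-like object in the last parameter and that macro values are strings. Neither detail is confirmed above. The macro names scale_factor and row_count and the column value are hypothetical.

```python
import pandas as pd


def rm_main(data, macros):
    # Assumption: `macros` behaves like a dict and holds string values.
    factor = float(macros["scale_factor"])
    result = data.copy()
    # Hypothetical "value" column scaled by the macro value.
    result["value"] = result["value"] * factor
    # Define a new macro (or overwrite an existing one) for later operators.
    macros["row_count"] = str(len(result))
    return result
```

Because the function only uses plain Python objects, it can also be called outside RapidMiner with an ordinary dictionary, which avoids the syntax error of the inline %{myMacro} form.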