[New Extension] Projects
MartinLiebig
Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor | Posts: 3,368 | RM Data Scientist
Hi everyone!
I've just released a new extension called "Projects". This extension adds two new entries to your repository actions:
This allows you to create a standard setup for a project. It automatically creates a folder structure, adds standard processes with defined documentation, and so on. The structure for the project looks like this:
Some of the processes already have an implementation. The 04-Learning one looks like this:
I hope that these templates make your life easier and make you more efficient. I appreciate any feedback on the templates and, of course, on further enhancements!
Best,
Martin
- Head of Data Science Services at RapidMiner -
Dortmund, Germany
Comments
That's an interesting extension and I was trying to adapt our current project to use it, but found that it's not really obvious how to use it as intended. Is there any documentation explaining the typical workflow with this extension?
For example:
- How to change the version number? (I tried setting a "version" macro with Set Macro in the "! Main" process, but results are still written to version 1)
- How to store data from one step to reuse it in the next step (If I try to run steps individually, I see they are expecting data from the previous step to be there in the repository, but I could not find in the default generated processes anything that stores outputs to the repository)
- How to specify the learners to use when training? (I found the answer: edit the "Used Learners" repository entry, but it would be useful to have it documented)
- How to compare the different trained models to decide which one to use? (Is there a single view showing the performance of all models, or should we individually open each "%{version}/learner/performance" repository entry to review it?)
- Once we have decided which model should be used in production, how do we "promote" the model to become the production model? (For example, is there a predefined variable/macro to reference the selected model in the repository?)
An example of a full solution might also be helpful, but eventually we would need documentation so that the extension can be used by non-expert users (for example, to allow other users to maintain, re-train, or improve a model after we have completed initial development). If we end up using the extension, we could also contribute some documentation, but if you have answers to the questions above, or existing documentation or examples to get started, that would be helpful.
Thanks
Thank you for your detailed feedback. I would build it in, but this extension will be a bit outdated once the new features we show at Wisdom next week are available. Those will make all of our lives way easier.
Best,
Martin