RapidMiner Board - Chart Server
I am sure we are all involved in optimising RapidMiner models, saving partial results and models to disk, and using other tools to preview and chart the results while model development ticks away in the background. Basically, we need live charts and the ability to send data and performance measures to them. We could use RapidMiner Server apps to achieve some of this functionality, but it can be done much more simply and cheaply, and interactively if needed.
I suggest creating a RapidMiner Board, similar in functionality to TensorFlow's TensorBoard. The board would run as a web server (either launched locally, similar to what is already done with H2O, for example, or linked to an existing web server via URL). RapidMiner would also need an extension to connect the RM instance to the board, and to offer "callback" operators that send data and performance measurements to the board at runtime, as well as "charting" operators that request live plot updates of the accumulated data. This would allow watching the performance of models developed iteratively in loops, in cross-validation, and in optimisation grids, which normally take a long time to execute and give virtually no feedback on what is going on. The charts could be interactive, so that you could select different visualisations of the data, inspect data points, etc.
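To make the idea a bit more concrete, here is a minimal sketch of the board side in Python, using only the standard library. Everything in it is an assumption for illustration: the `/log` and `/data` endpoints, the JSON fields `metric`, `step` and `value`, and the names `BoardHandler` and `start_board` are made up, and nothing here is an existing RapidMiner API. A "callback" operator would POST one point at a time, and a chart page would poll the accumulated data.

```python
import json
import threading
from collections import defaultdict
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# In-memory store: metric name -> list of (step, value) points.
# Hypothetical layout; a real board would probably persist this.
METRICS = defaultdict(list)
LOCK = threading.Lock()


class BoardHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        # A "callback" operator would POST JSON like
        # {"metric": "accuracy", "step": 3, "value": 0.91} to /log.
        if self.path != "/log":
            self.send_error(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        try:
            point = json.loads(self.rfile.read(length))
            name = str(point["metric"])
            step = int(point["step"])
            value = float(point["value"])
        except (ValueError, KeyError, TypeError):
            self.send_error(400, "expected JSON with metric, step, value")
            return
        with LOCK:
            METRICS[name].append((step, value))
        self._send_json({"ok": True})

    def do_GET(self):
        # A chart page would poll /data and redraw from the full series.
        if self.path != "/data":
            self.send_error(404)
            return
        with LOCK:
            snapshot = {name: list(points) for name, points in METRICS.items()}
        self._send_json(snapshot)

    def _send_json(self, payload):
        body = json.dumps(payload).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the console quiet while a process is running


def start_board(port=0):
    """Start the board on localhost in a background thread (port 0 = any free port)."""
    server = ThreadingHTTPServer(("127.0.0.1", port), BoardHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling `start_board(8765)` would serve the board on localhost port 8765 (an arbitrary choice). The charting itself would live in the page that polls `/data`, which is where the JavaScript libraries come in.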
The charts can be as sophisticated, dynamic and beautiful as those built with D3.js, but there are many other HTML5 options out there as well.
Optionally, if we were to replicate the full TensorBoard functionality, it may be possible to store the data in a folder accessible by the server, so that you can plot the saved data after the run has finished, or compare the charts from one run against those of another. It may even be possible to save model checkpoints for later selection and loading (e.g. to select the best), and if so, similarly to TensorBoard, we could offer visualisation of the checkpointed models.
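As a rough illustration of the folder idea, the sketch below keeps one sub-folder per run with a `metrics.jsonl` file in it, which is loosely how a log directory could be organised. The file name, the record fields and the functions `append_point`, `load_runs` and `best_run` are all assumptions made for this example, not an existing format.

```python
import json
from pathlib import Path


def append_point(log_dir, run, metric, step, value):
    """Append one measurement to <log_dir>/<run>/metrics.jsonl (hypothetical layout)."""
    run_dir = Path(log_dir) / run
    run_dir.mkdir(parents=True, exist_ok=True)
    record = {"metric": metric, "step": step, "value": value}
    with open(run_dir / "metrics.jsonl", "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")


def load_runs(log_dir, metric):
    """Return {run_name: [(step, value), ...]} for one metric across all saved runs."""
    series = {}
    for path in sorted(Path(log_dir).glob("*/metrics.jsonl")):
        points = []
        with open(path, encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue  # tolerate blank lines
                record = json.loads(line)
                if record["metric"] == metric:
                    points.append((record["step"], record["value"]))
        series[path.parent.name] = sorted(points)
    return series


def best_run(series, higher_is_better=True):
    """Pick the run whose last recorded value is best, or None if nothing was logged."""
    finals = {run: points[-1][1] for run, points in series.items() if points}
    if not finals:
        return None
    pick = max if higher_is_better else min
    return pick(finals, key=finals.get)
```

With this layout, comparing two runs is just a matter of plotting the series returned by `load_runs` on the same axes, and `best_run` hints at how a checkpoint could be chosen once the run has finished.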
Comments
Thanks, @jacobcybulski. You're over my head here, but I'm passing this along to product mgmt, who will certainly understand what you're getting at.
Scott
@jacobcybulski
If I understand correctly, it might be possible to mock up, after a fashion, how this could work by using the Reporting Extension set to HTML output.
With a cleverly written HTML template (with a bunch of JS) you can output fixed report elements to the HTML and have some clever JavaScript read a log file as it is updated on disk. So you would need the template to trigger regular page refreshes for updated HTML content, and also regular re-reads of the log file(s).
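The log-reading half of that could work roughly as below. It is sketched in Python rather than JavaScript just to show the logic: remember how far you have read, and on each refresh pick up only the complete lines added since then. The function name `read_new_lines` and the offset-based approach are assumptions for illustration, not something the Reporting Extension provides.

```python
import os


def read_new_lines(path, offset=0):
    """Return (lines, new_offset): complete lines appended to path since offset."""
    if not os.path.exists(path):
        return [], offset
    if os.path.getsize(path) < offset:
        offset = 0  # file was truncated or replaced, so start over
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    end = chunk.rfind(b"\n")
    if end == -1:
        return [], offset  # nothing but a partial line so far
    # Keep only text up to the last newline; a half-written final line
    # is left for the next poll.
    lines = chunk[:end].decode("utf-8").split("\n")
    return lines, offset + end + 1
```

Each refresh would call `read_new_lines(path, offset)` with the offset returned by the previous call and append the new points to the chart, so the whole file is not re-parsed every time.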
@JEdward, yes, the Reporting Extension has some of the elements, in particular its ability to add to the report in a loop. The main difference, of course, is that what should be saved on the server side is the data / performance measures rather than the images or text forming the report. In this way, the charts can be created live (e.g. from the logs) using the facilities available in JavaScript, e.g. d3.js or plotly.js, or something quite different such as live geo-located visualisations using leaflet.js, or very futuristic and dynamic, game-like 3D visualisations, e.g. using three.js or babylon.js.
So in a sense, this new functionality may not only replicate the current behaviour of TensorBoard for RapidMiner (visualisation of models and performance indicators), but be a general-purpose interactive data visualisation server (interactive as all of the above JavaScript libraries offer interactivity).