[HowTo] Create Box Plots to Check Regressions

MartinLiebig (Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor; RM Data Scientist), Posts: 3,381
Hey guys!

This is not a question but a how-to. I frequently use box plots to assess the quality of regression models.
I discretize the prediction and then use a box plot to compare each bin with the real value. It looks like this:

Here we see a lot. Most importantly, this model is flat at the beginning and the end, and there is a bit of a correlation in the center. I prefer these plots over ordinary True-vs-Predicted scatter plots, because a few outliers in a scatter plot can distract you.
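For readers who are not on RapidMiner, the same idea can be sketched in plain Python. This is only a minimal illustration, not the attached process: the equal-width binning, the function names and the synthetic data are all assumptions made for the example. It bins the predictions and computes, per bin, the five numbers a box plot would draw for the true values.

```python
import random
import statistics


def five_number_summary(values):
    """Return (min, q1, median, q3, max), the numbers a box plot draws."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return (min(values), q1, median, q3, max(values))


def boxplot_stats_by_prediction_bin(y_true, y_pred, n_bins=5):
    """Discretize predictions into equal-width bins (an assumption; other
    binning schemes work too) and summarize the true values in each bin."""
    lo, hi = min(y_pred), max(y_pred)
    width = (hi - lo) / n_bins or 1.0  # fall back if all predictions are equal
    bins = {}
    for truth, pred in zip(y_true, y_pred):
        # Clamp so that the largest prediction lands in the last bin.
        index = min(int((pred - lo) / width), n_bins - 1)
        bins.setdefault(index, []).append(truth)

    result = []
    for index in sorted(bins):  # numeric bin index keeps the bins in order
        values = bins[index]
        if len(values) < 2:
            continue  # too few points to compute quartiles
        edges = (lo + index * width, lo + (index + 1) * width)
        result.append((edges, len(values), five_number_summary(values)))
    return result


# Synthetic data purely for demonstration: noisy predictions of a uniform target.
random.seed(0)
demo_true = [random.uniform(0, 100) for _ in range(500)]
demo_pred = [t + random.gauss(0, 10) for t in demo_true]

for (low, high), count, (mn, q1, med, q3, mx) in boxplot_stats_by_prediction_bin(
    demo_true, demo_pred
):
    print(
        f"pred in [{low:7.1f}, {high:7.1f})  n={count:3d}  "
        f"true: min={mn:6.1f} q1={q1:6.1f} median={med:6.1f} q3={q3:6.1f} max={mx:6.1f}"
    )
```

If the median of the true values rises steadily from bin to bin, the model tracks the target; bins whose boxes sit at the same level correspond to the "flat" regions described above.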

Attached is an example process that generates such a plot. It needs a bit of preprocessing.

<parameter key="block_type" value="attribute_block"/>

Filter on date

<connect from_op="Windowing" from_port="windowed example set" to_op="Filter Examples" to_port="example set input"/>

This gets the binning into the right order
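How the process achieves this ordering is not recoverable from the post. As a hedged illustration of why the step matters: if bin labels are stored as text, a plain sort puts them in lexicographic rather than numeric order, and the boxes appear shuffled along the axis. The label format and the `lower_bound` helper below are hypothetical and chosen only for the example.

```python
def lower_bound(label):
    """Parse the lower edge from a hypothetical label such as '[5.0 - 20.0]'."""
    return float(label.strip("[]").split(" - ")[0])


# Hypothetical bin labels, stored as strings.
labels = ["[100.0 - 150.0]", "[20.0 - 50.0]", "[50.0 - 100.0]", "[5.0 - 20.0]"]

text_order = sorted(labels)                      # lexicographic: '[100...' comes first
numeric_order = sorted(labels, key=lower_bound)  # ordered by the bin's lower edge

print("text order:   ", text_order)
print("numeric order:", numeric_order)
```

Sorting by the parsed lower edge (or keeping a numeric bin index next to the label) keeps the boxes in ascending order of the predicted value.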
Use Boxplot.

Volume Column: gas price / euro (times 1000) + 1 (horizon)
Group By Column: prediction(gas price / euro (times 1000) + 1 (horizon))

- Head of Data Science Services at RapidMiner -
Dortmund, Germany