"Logistic Regression predicts too high probabilities"

confuzio Member Posts: 4 Contributor I
edited May 2019 in Help
I am currently predicting creditor defaults using, among other methods, Kernel Logistic Regression (KLR). Here's my problem: KLR predicts probabilities ("confidences") that are considerably higher than those given by a quite accurate Generalized Additive Model, which leads to much worse average performance (% deviance explained).
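
For readers unfamiliar with the metric, here is a minimal sketch of how "% deviance explained" can be computed for a binary default label, assuming the usual definition 1 - D_model / D_null with the binomial deviance and a null model that predicts the overall default rate. The function names are hypothetical and the numbers are toy values, not the data from this thread.

```python
import math

def binomial_deviance(y, p, eps=1e-12):
    """Binomial deviance: -2 * log-likelihood of labels y under probabilities p."""
    total = 0.0
    for yi, pi in zip(y, p):
        pi = min(max(pi, eps), 1.0 - eps)  # keep log() finite
        total += yi * math.log(pi) + (1 - yi) * math.log(1.0 - pi)
    return -2.0 * total

def deviance_explained(y, p):
    """Fraction of the null deviance removed by the model (can be negative)."""
    base_rate = sum(y) / len(y)  # null model: constant default rate
    null_dev = binomial_deviance(y, [base_rate] * len(y))
    return 1.0 - binomial_deviance(y, p) / null_dev

# Toy example: the same ranking, once well calibrated and once inflated.
y = [0, 0, 0, 1]
calibrated = [0.1, 0.1, 0.1, 0.9]
inflated = [0.4, 0.4, 0.4, 0.97]
print(deviance_explained(y, calibrated), deviance_explained(y, inflated))
```

Because the deviance penalizes miscalibration, probabilities that are systematically too high lower this score even when the ranking of cases is unchanged.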

Comparing the logits of the predicted probabilities, it could be that just a constant is missing, but I'm not sure whether that is where the problem comes from.
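
One way to check the "missing constant" idea is sketched below, under the assumption that both models' predicted probabilities are available as plain lists for the same examples. If the logit differences have a clearly non-zero mean and a small spread, the two models differ mainly by an additive constant on the logit scale (an intercept shift). The function names and the toy numbers are hypothetical.

```python
import math
import statistics

def logit(p, eps=1e-12):
    """Log-odds of a probability, clipped away from 0 and 1."""
    p = min(max(p, eps), 1.0 - eps)
    return math.log(p / (1.0 - p))

def sigmoid(z):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-z))

def logit_offset(p_klr, p_ref):
    """Mean and spread of logit(p_klr) - logit(p_ref) over paired examples."""
    diffs = [logit(a) - logit(b) for a, b in zip(p_klr, p_ref)]
    return statistics.mean(diffs), statistics.pstdev(diffs)

def shift_probabilities(p, offset):
    """Subtract a constant on the logit scale and map back to probabilities."""
    return [sigmoid(logit(x) - offset) for x in p]

# Toy data: reference probabilities, and a copy inflated by a constant logit shift.
p_ref = [0.05, 0.2, 0.5]
p_klr = [sigmoid(logit(x) + 1.5) for x in p_ref]

offset, spread = logit_offset(p_klr, p_ref)
print(offset, spread)
print(shift_probabilities(p_klr, offset))
```

A large spread would instead suggest that the disagreement is not just a constant, for example a different slope or a genuinely different fit.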

Below is my simple (non-)working example. I would really appreciate your help!

<macros/>
<operator activated="true" class="process" compatibility="5.1.001" expanded="true" name="Process">
<operator activated="true" class="read_csv" compatibility="5.1.001" expanded="true" height="60" name="Read CSV" width="90" x="45" y="120">
<operator activated="true" class="logistic_regression" compatibility="5.1.001" expanded="true" height="94" name="Logistic Regression" width="90" x="246" y="120">
<operator activated="true" class="read_csv" compatibility="5.1.001" expanded="true" height="60" name="Read CSV (2)" width="90" x="246" y="255">
<operator activated="true" class="apply_model" compatibility="5.1.001" expanded="true" height="76" name="Apply Model" width="90" x="380" y="165">
<operator activated="true" class="performance_classification" compatibility="5.1.001" expanded="true" height="76" name="Performance" width="90" x="514" y="165">
<operator activated="true" class="write_csv" compatibility="5.1.001" expanded="true" height="60" name="Write CSV" width="90" x="648" y="210">
<operator activated="true" class="log" compatibility="5.1.001" expanded="true" height="76" name="Log" width="90" x="648" y="75">

Answers

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    First of all: if you want us to be able to draw conclusions from your process, you will have to create a process that does not depend on .csv files stored on your local hard disk.

    If a Kernel Logistic Regression model performs worse than another modeling technique, this is not necessarily a bug. Maybe it just doesn't work well on your data?
    If you have reasons to believe otherwise, please explain them in more detail. You can be assured that we will listen carefully.

    Greetings,
    Sebastian