Why does log-Transformation give 3% better accuracy with LibSVM?

Fred12 Member Posts: 344 Unicorn
edited December 2019 in Help

hi,
I tried a log10 transformation on my right-skewed dataset, and trained / tested it again with LibSVM. The results staggered me, as it is quite a difficult dataset: accuracy was 2.5-3 percentage points better than with the untransformed dataset (from 84-85% to 87.6%). I also standardized my datasets beforehand...

How can that be? I mean, an SVM does not make any distributional assumptions like a GLM does, or does it?

It would just correspond to a different kernel function, right? I used an RBF kernel, so it would be an RBF kernel with ||log(x)-log(x*)|| in place of ||x-x*|| in the exponent of the kernel function, right?
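
Written out, the idea in the question looks like this. It is only a sketch under two assumptions: the log is applied element-wise to strictly positive features, and no further rescaling happens after it (standardizing after the log would add a per-feature scale factor). The symbol \gamma is the usual RBF width parameter, and K_{\log} is just a label introduced here for the kernel on the transformed data.

```latex
K(x, x^*) = \exp\!\left(-\gamma \,\lVert x - x^* \rVert^2\right)

K_{\log}(x, x^*) = \exp\!\left(-\gamma \,\lVert \log x - \log x^* \rVert^2\right)
                 = \exp\!\left(-\gamma \sum_i \left(\log \frac{x_i}{x_i^*}\right)^{2}\right)
```

So after the transformation the kernel would depend on the ratios x_i / x_i^* rather than on the absolute differences x_i - x_i^*.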

Best Answer

  • marcin_blachnik Member Posts: 61 Guru
    Solution Accepted

    Well

    SVMs don't make any distributional assumptions, but I guess you use the RBF kernel. The RBF kernel has a fixed gamma. When the attributes are skewed, gamma has a different influence on low values of the data than on high values; when you take the log, you make gamma have an equal influence over the entire space.
    Good preprocessing is very important for an SVM model.
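
A small numeric sketch of the point about gamma, using plain NumPy rather than LibSVM itself. The helper `rbf_kernel`, the value `gamma = 0.5`, and the four sample values are all made up for illustration. It compares two pairs of points that differ by the same factor of 2, one pair in the low range and one in the high range of a skewed feature.

```python
import numpy as np

def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) for two 1-D feature vectors."""
    diff = np.asarray(x, dtype=float) - np.asarray(y, dtype=float)
    return float(np.exp(-gamma * np.dot(diff, diff)))

gamma = 0.5  # hypothetical fixed value, as in a single SVM fit

# Two pairs that each differ by a factor of 2 (illustrative values only).
low_pair = (np.array([1.0]), np.array([2.0]))
high_pair = (np.array([100.0]), np.array([200.0]))

# Raw scale: the same gamma sees the low pair as similar and the
# high pair as completely dissimilar.
k_low_raw = rbf_kernel(low_pair[0], low_pair[1], gamma)
k_high_raw = rbf_kernel(high_pair[0], high_pair[1], gamma)

# Log10 scale: both pairs are log10(2) apart, so gamma treats them alike.
k_low_log = rbf_kernel(np.log10(low_pair[0]), np.log10(low_pair[1]), gamma)
k_high_log = rbf_kernel(np.log10(high_pair[0]), np.log10(high_pair[1]), gamma)

print("raw:", k_low_raw, k_high_raw)
print("log:", k_low_log, k_high_log)
```

On the raw scale the two kernel values differ enormously, while on the log scale they agree up to rounding. Standardizing the raw feature would rescale both distances by the same constant, so it would not remove this imbalance by itself.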

    Best

    Marcin


Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Great question! I don't know the answer, but I will do an incantation for @IngoMierswa and see if he can answer that!

  • Fred12 Member Posts: 344 Unicorn

    hi,

    did you get any answers yet?

  • Pekka_Jounela Member, University Professor Posts: 4 University Professor

    Hi, I guess that's because taking a log10 transformation reduces variance.

    Bests

    Pekka

  • Fred12 Member Posts: 344 Unicorn

    Thanks, that sounds like a nice and logical explanation :)
