Out of memory while running SVM

Aj · Member · Posts: 23 · Maven
edited June 2019 in Help
Hello,

I am running an SVM on data with more than 28,000 rows and about 300 attributes. I have 1 GB of RAM and around 4 GB of swap on my Fedora 17 Linux box. I also tried the process on another Fedora system with 1 GB of RAM and 20 GB of swap, with the same result.
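
For a sense of scale, here is a rough back-of-envelope sketch of the sizes involved. It assumes every value is stored as an 8-byte double and, as a worst case, that the learner materializes a full rows-by-rows kernel matrix. Both are assumptions for illustration, not a statement of what RapidMiner's SVM actually allocates (SVM implementations commonly use a bounded kernel cache instead). The helper name `dense_matrix_bytes` is hypothetical.

```python
def dense_matrix_bytes(rows, cols, bytes_per_value=8):
    """Size in bytes of a dense rows x cols matrix (assumes 8-byte doubles)."""
    return rows * cols * bytes_per_value

ROWS = 28_000        # lower bound from the post ("more than 28,000 rows")
ATTRIBUTES = 300     # approximate attribute count from the post

# The data table itself, held densely in memory.
data_bytes = dense_matrix_bytes(ROWS, ATTRIBUTES)

# Worst-case assumption: a full ROWS x ROWS kernel matrix.
kernel_bytes = dense_matrix_bytes(ROWS, ROWS)

MIB = 1024 ** 2
GIB = 1024 ** 3
print(f"data table:         {data_bytes / MIB:.1f} MiB")
print(f"full kernel matrix: {kernel_bytes / GIB:.2f} GiB")
```

Under these assumptions the data table is small compared with 1 GB of RAM, while a full kernel matrix would not fit in it. That suggests the learner's working memory, rather than the raw data, is the more likely pressure point.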

I am getting the following out-of-memory error:

"This process would need more than the maximum amount of available memory. You can either leave the process as it is and use a computer with more memory, reduce the amount of data by one of the sampling operators, optimize the process by using other learning or preprocessing schemes, or directly work on database systems, eg. by using the cached database example source operators."

I am thinking of using Radoop and have written to them requesting help. In the meantime, is there any way to solve the problem on the current hardware without reducing the data? The data is highly irregular, so k-NN models play an important role in out-of-sample prediction.

I am particularly interested in the last phrase of the above message, about cached database example source operators. I have no idea what that means. Could anyone please point out which element this is in RapidMiner and give an example of how to use it?

Before running SVM, I had to run Random Forest on the data, but encountered the same out of memory issue.

Any suggestions, hints, lessons learned, tutorials, or documentation on how to solve this problem and deal with large amounts of data would be very helpful, as I have so far worked only with relatively small amounts of data, using RapidMiner on my PC.

Thanks,
Ajay

Answers

  • zwitter689 · Member · Posts: 5 · Contributor II
    I am having the same problem running FP-Growth. Any help is appreciated.
  • mahtab3000 · Member · Posts: 4 · Contributor I

    Hi,

    I have the same problem:

    "This process would need more than the maximum amount of available memory. You can either leave the process as it is and use a computer with more memory, reduce the amount of data by one of the sampling operators, optimize the process by using other learning or preprocessing schemes."

    Does anyone have any ideas?

    I really appreciate any help you can provide.
