prediction modeling for text analysis

lambamanika07 Member Posts: 24 Maven
edited December 2018 in Help

I am trying to build a predictive model for text resources. I chose 272 resources for training and 116 for testing. However, only 190 of the training resources and 80 of the test resources were modeled, and the accuracy, precision and recall values were reported only for those. I want to get these results for all of the data. Please help.

Best Answers

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @lambamanika07,

    I don't fully understand what you want to do or what exactly you have done so far.

    Are both your training and test datasets labeled?

    Based on the information given, I suggest performing a cross validation with your 272 training resources to build a model. This gives you the performance (accuracy, recall, precision) of your model based on your 272 training resources.

    Then apply this model to your 116 (labeled?) test resources with a Performance operator, so that you can measure the performance of the model on "unseen" data. The process looks like this:

    text_training_test_data.png
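As a rough illustration of the workflow described above (cross-validate on the training set, then score a held-out test set), here is a minimal plain-Python sketch. It is not the RapidMiner process from the screenshot: the tiny Naive Bayes classifier, the function names and the toy documents are all assumptions made up for the example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Fit a tiny multinomial Naive Bayes model on whitespace-tokenized text."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in zip(docs, labels):
        tokens = text.lower().split()
        word_counts[label].update(tokens)
        vocab.update(tokens)
    return class_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-probability (Laplace smoothing)."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        total_words = sum(word_counts[label].values())
        score = math.log(n_docs / total_docs)
        for tok in text.lower().split():
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def accuracy(model, docs, labels):
    hits = sum(predict_nb(model, d) == y for d, y in zip(docs, labels))
    return hits / len(labels)

def cross_validate(docs, labels, k):
    """Mean accuracy over k folds; fold i holds out every k-th example starting at i."""
    scores = []
    for fold in range(k):
        train_idx = [i for i in range(len(docs)) if i % k != fold]
        test_idx = [i for i in range(len(docs)) if i % k == fold]
        model = train_nb([docs[i] for i in train_idx], [labels[i] for i in train_idx])
        scores.append(accuracy(model, [docs[i] for i in test_idx], [labels[i] for i in test_idx]))
    return sum(scores) / k

# Toy data, invented for illustration only (stands in for the 272 / 116 resources).
train_docs = ["good great film", "great acting good plot",
              "bad boring film", "boring plot bad acting"] * 3
train_labels = ["pos", "pos", "neg", "neg"] * 3
test_docs = ["good plot", "boring film"]
test_labels = ["pos", "neg"]

# Step 1: estimate performance with cross validation on the training set only.
cv_accuracy = cross_validate(train_docs, train_labels, k=3)

# Step 2: train on the full training set and score the held-out test set.
final_model = train_nb(train_docs, train_labels)
holdout_accuracy = accuracy(final_model, test_docs, test_labels)
print(cv_accuracy, holdout_accuracy)
```

The point of keeping the two steps separate is that the test documents never influence training, so the second number reflects performance on data the model has not seen.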

    or

    you can perform a cross validation with all 388 resources (272 training + 116 test) to build a better model. This gives you the performance (accuracy, recall, precision) of your model based on your 388 "training" resources, and you can then apply this model to future unseen data. The process looks like this:

    text_training_test_data_2.png
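For reference, the three measures mentioned in this answer can be computed from true and predicted labels as in the sketch below. The function name `binary_metrics`, the choice of "pos" as the positive class and the label lists are hypothetical, picked only to make the arithmetic visible.

```python
def binary_metrics(y_true, y_pred, positive):
    """Accuracy, precision and recall for one class treated as 'positive'."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives
    correct = sum(t == p for t, p in pairs)
    return {
        "accuracy": correct / len(pairs),
        # Guard against division by zero when a class is never predicted or never present.
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }

# Made-up labels for illustration only.
y_true = ["pos", "pos", "neg", "neg", "pos"]
y_pred = ["pos", "neg", "neg", "pos", "pos"]
metrics = binary_metrics(y_true, y_pred, positive="pos")
print(metrics)
```

With these five examples there are 2 true positives, 1 false positive and 1 false negative, so accuracy is 3/5 and precision and recall are both 2/3.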

    For a better answer, could you please share your process and your data source?

    Regards,

    Lionel

  • lionelderkrikor Moderator, RapidMiner Certified Analyst, Member Posts: 1,195 Unicorn
    Solution Accepted

    Hi @lambamanika07 again,

    To complete my answer, the Cross Validation sub-process looks like this:

    text_training_test_data_3.png
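One property of cross validation that is relevant to the original question: every example lands in the test portion of exactly one fold, so the reported performance covers all of the data. A small sketch of how the folds could be built, assuming 388 examples and 10 folds (the fold count and the function name `k_fold_indices` are assumptions, not taken from the screenshot):

```python
def k_fold_indices(n_examples, k):
    """Return a list of (train_indices, test_indices) pairs, one per fold."""
    folds = []
    for fold in range(k):
        # Fold `fold` tests every k-th example, starting at index `fold`.
        test_idx = list(range(fold, n_examples, k))
        held_out = set(test_idx)
        train_idx = [i for i in range(n_examples) if i not in held_out]
        folds.append((train_idx, test_idx))
    return folds

folds = k_fold_indices(388, 10)

# Collect every index that was used for testing across all folds.
tested = sorted(i for _, test_idx in folds for i in test_idx)
print(len(folds), len(tested))
```

Here `tested` contains each index from 0 to 387 exactly once, while each fold's model is trained only on the examples outside its own test portion.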

    Regards,

    Lionel


Answers

  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    Are you using Cross Validation? Post your process using the < / > option.

  • lambamanika07 Member Posts: 24 Maven
  • Thomas_Ott RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,761 Unicorn

    @lambamanika07 I would not build a text classification model the way you've shown. I would do it the way @lionelderkrikor shows. Also, if the LinearSVM doesn't give good results, I would try Naive Bayes and/or Deep Learning. You could even use a Stacking or Voting operator.
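To give a feel for the voting idea mentioned above, here is a minimal majority-vote sketch in plain Python. It only illustrates the general technique, not how RapidMiner's Voting operator is implemented, and the three prediction lists are invented for the example.

```python
from collections import Counter

def majority_vote(predictions_per_model):
    """Combine label lists (one per model, equal length) by per-example majority."""
    voted = []
    for votes in zip(*predictions_per_model):
        # most_common(1) returns the label with the highest vote count.
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted

# Hypothetical predictions from three different classifiers on four documents.
svm_pred = ["pos", "neg", "pos", "neg"]
nb_pred = ["pos", "pos", "pos", "neg"]
dl_pred = ["neg", "neg", "pos", "pos"]

combined = majority_vote([svm_pred, nb_pred, dl_pred])
print(combined)
```

With an odd number of models and two classes there is always a clear winner; with more classes or an even number of models, a tie-breaking rule would have to be chosen.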
