Features affecting Bottom Line (Revenue)

msacs09 Member Posts: 55 Contributor II
edited December 2018 in Help
Experts,

Can you please help me work out feature weights, i.e. the contributing factors that affect revenue? We would like to understand why some instances of revenue are low and some are high, and what the differentiator is. Please see the sample data. I want to see which features are affecting the revenue percentages.

Can you please help me with how to approach this? I have a lot of nominal attributes; should I convert everything to numerical, etc.? Can you point me to a sample process, please?

As always, thank you for your valuable advice and time.

Answers

  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    If you are trying to understand the univariate relationships between Revenue and the other attributes one at a time, you should look at the Weighting operators. Weight by Correlation is good for numerical attributes, and Weight by Information Gain or Weight by Chi Square is good for nominal attributes.
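To make the idea of a univariate weight concrete, here is a minimal Python sketch. It is not RapidMiner's implementation. The toy data (`discount`, `segment`, `revenue`) is invented for illustration, the numerical weight is simply the absolute Pearson correlation, and the nominal weight is a chi-square statistic computed after cutting Revenue into low/high at its median (that cut is my own assumption).

```python
from collections import Counter
from math import sqrt
from statistics import median


def pearson_weight(xs, ys):
    """Absolute Pearson correlation between a numerical attribute and the label."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return abs(cov / (spread_x * spread_y))


def chi_square_weight(categories, revenue):
    """Chi-square statistic between a nominal attribute and low/high revenue.

    Assumption: revenue is binned at its median into "low" and "high".
    """
    cut = median(revenue)
    labels = ["high" if r > cut else "low" for r in revenue]
    n = len(labels)
    observed = Counter(zip(categories, labels))
    category_totals = Counter(categories)
    label_totals = Counter(labels)
    chi2 = 0.0
    for category, category_total in category_totals.items():
        for label, label_total in label_totals.items():
            expected = category_total * label_total / n
            chi2 += (observed[(category, label)] - expected) ** 2 / expected
    return chi2


# Hypothetical toy data, not the poster's dataset.
discount = [0.05, 0.10, 0.15, 0.20, 0.25, 0.30]
segment = ["med", "med", "global", "global", "local", "local"]
revenue = [0.020, 0.019, 0.018, 0.016, 0.012, 0.011]

print("discount weight:", round(pearson_weight(discount, revenue), 3))
print("segment weight:", round(chi_square_weight(segment, revenue), 3))
```

A larger value means a stronger one-at-a-time association with Revenue. The two weights are on different scales, so compare numerical attributes with each other and nominal attributes with each other.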

    These will only show you individual relationships. Your question may actually be about what combinations of factors are most associated with Revenue. If that is the case and you are interested in exploring multivariate relationships, then that is basically a supervised machine learning problem. In that case, you probably want to build a simple predictive model to start, using a highly interpretable algorithm. I suggest a simple Decision Tree model so you can get a sense of what combinations of factors are associated with different levels of Revenue.
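As a rough illustration of what a tree learner does with nominal attributes and a numerical label, here is a small Python sketch. It is a simplified stand-in, not the RapidMiner Decision Tree operator: it makes one branch per nominal value and picks the attribute that most reduces the squared error of the label. The rows, the `min_rows_to_split` and `max_depth` parameters, and all values are hypothetical; the attribute names only echo the ones that appear later in this thread.

```python
from statistics import mean


def sse(values):
    """Sum of squared errors around the mean."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values)


def best_split(rows, attributes, label):
    """Return the nominal attribute with the lowest total error, or None."""
    best_attr, best_err = None, sse([r[label] for r in rows])
    for attr in attributes:
        groups = {}
        for r in rows:
            groups.setdefault(r[attr], []).append(r[label])
        if len(groups) < 2:
            continue  # a single value cannot separate anything
        err = sum(sse(g) for g in groups.values())
        if err < best_err:
            best_attr, best_err = attr, err
    return best_attr


def build_tree(rows, attributes, label, min_rows_to_split=4, depth=0, max_depth=3):
    """Grow a tree of nested dicts; leaves hold the mean label and row count."""
    leaf = {"prediction": mean(r[label] for r in rows), "count": len(rows)}
    if depth >= max_depth or len(rows) < min_rows_to_split:
        return leaf
    attr = best_split(rows, attributes, label)
    if attr is None:
        return leaf
    subsets = {}
    for r in rows:
        subsets.setdefault(r[attr], []).append(r)
    remaining = [a for a in attributes if a != attr]
    return {
        "split": attr,
        "children": {
            value: build_tree(subset, remaining, label,
                              min_rows_to_split, depth + 1, max_depth)
            for value, subset in subsets.items()
        },
    }


def show(node, indent=""):
    """Print the tree in a layout similar to a textual tree description."""
    if "split" not in node:
        return
    for value, child in node["children"].items():
        head = f"{indent}{node['split']} = {value}"
        if "split" in child:
            print(head)
            show(child, indent + "|   ")
        else:
            print(f"{head}: {child['prediction']:.3f} {{count={child['count']}}}")


# Hypothetical toy rows, not the poster's dataset.
rows = [
    {"segment": "global", "sector": "AD", "revenue": 0.018},
    {"segment": "global", "sector": "ES", "revenue": 0.018},
    {"segment": "local", "sector": "AD", "revenue": 0.016},
    {"segment": "local", "sector": "AD", "revenue": 0.016},
    {"segment": "local", "sector": "ES", "revenue": 0.011},
    {"segment": "local", "sector": "ES", "revenue": 0.011},
    {"segment": "med", "sector": "AD", "revenue": 0.020},
    {"segment": "med", "sector": "ES", "revenue": 0.020},
]

tree = build_tree(rows, ["segment", "sector"], "revenue")
show(tree)
```

Each path from the root to a leaf is one combination of factors, and the leaf value is the average Revenue of the rows that match it. That is what makes a tree a readable first model for this kind of question.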

    In both cases, the tutorial processes contained in RapidMiner will be useful for understanding the basic setup and use of these operators.

    Brian T.
    Lindon Ventures
    Data Science Consulting from Certified RapidMiner Experts
  • msacs09 Member Posts: 55 Contributor II
    @Telcontar120 Thank you sir. Your understanding is exactly right. I need to "explore multivariate relationships affecting Revenue". Can I kindly request a sample or similar process that I can learn from, please?
  • msacs09 Member Posts: 55 Contributor II
    @Telcontar120 Thank you sir. Is there any sample process around exploring multivariate relationships, please?
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    edited December 2018
    Here's a simple cross-validation with a DT for a numerical dataset. You'll need to substitute your own dataset of course and make sure Revenue is set as the label.
    <?xml version="1.0" encoding="UTF-8"?><process version="9.1.000-BETA2">
    Builds a model on the current training data set (90 % of the data by default, 10 times).
    Make sure that you only put numerical attributes into a linear regression!
    Applies the model built from the training data set on the current test set (10 % by default).
    The Performance operator calculates performance indicators and sends them to the operator result.
    A cross validation including a linear regression.

  • msacs09 Member Posts: 55 Contributor II
    edited November 2018
    @Telcontar120 Thank you very much sir. Can you suggest the best way to represent this via a chart? Which charts in RapidMiner would help us interpret the output below for the business folks? Does the sample below say that med has the highest margin, since the count is 10? Basically I want to extract the decision tree model and present it in a meaningful way.

    RegressionTree

    segment = global: 0.018 {count=4}
    segment = local
    |   Sector = AD: 0.016 {count=3}
    |   Sector = ES: 0.011 {count=2}
    segment = med: 0.020 {count=10}
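One way to make this output presentable is to flatten it into a table of rules. The Python sketch below assumes the usual reading of this kind of tree text: the number after the colon is the value the leaf predicts (the average label of the training rows that reach it), and `count` is how many training rows reached it. Under that reading, med has the highest predicted value because 0.020 is the largest leaf value, not because its count is 10; the count only tells you how much data supports that leaf. The parser and its names are my own illustration, written against the five lines above only.

```python
import re

# The tree description from the post, one node per line.
TREE_TEXT = """\
segment = global: 0.018 {count=4}
segment = local
|   Sector = AD: 0.016 {count=3}
|   Sector = ES: 0.011 {count=2}
segment = med: 0.020 {count=10}
"""

# A leaf line looks like "<condition>: <value> {count=<n>}".
LEAF = re.compile(
    r"^(?P<cond>.+?): (?P<value>-?\d+(?:\.\d+)?) \{count=(?P<count>\d+)\}$"
)


def parse_rules(text):
    """Turn the indented tree text into one flat rule per leaf."""
    rules = []
    path = []  # conditions from the root down to the current line
    for line in text.splitlines():
        if not line.strip():
            continue
        depth = line.count("|")  # one "|" per nesting level
        body = line.replace("|", "").strip()
        del path[depth:]  # drop conditions from deeper or sibling branches
        match = LEAF.match(body)
        if match:
            rules.append({
                "rule": " and ".join(path + [match["cond"]]),
                "prediction": float(match["value"]),
                "count": int(match["count"]),
            })
        else:
            path.append(body)  # inner node: remember its condition
    return rules


rules = parse_rules(TREE_TEXT)
for r in sorted(rules, key=lambda r: r["prediction"], reverse=True):
    print(f"{r['rule']:<35} {r['prediction']:.3f}  (n={r['count']})")
```

A table like this can go straight into a bar chart with one bar per rule, bar height equal to the predicted value, and the count shown as a label so the audience can see which leaves rest on very few rows.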
  • Telcontar120 Moderator, RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 1,635 Unicorn
    The sample process I provided earlier in this thread is suitable for exploring and showing multivariate relationships via a decision tree. You could also swap the learner and do something similar with a linear regression or GLM.
