Generating Synthetic Data or Simulated Data

omar_a_karim, Member, Posts: 2, Contributor I
edited December 2018 in Help

I am new to RapidMiner but not new to data science. Synthetic data has its uses in developing data science solutions. I am looking for the best RapidMiner approach to simulate booking events, such as airline bookings. As an example, consider a single flight: each day, a certain number of passengers book or cancel. If the flight leaves on, say, 3/1/2019, bookings could start coming in about 60 days prior, say on 1/1/2019, and continue through the days leading up to the flight. So I have 60 booking days and one flight. In principle this is easy to simulate, even in Excel.

Imagine now that I have a hundred flights and a 60-day booking window. With a page of Python/Pandas I can quickly create this synthetic data, with different booking characteristics for each flight depending on flight date, origin, and destination, among other factors.

How should I conceptually get started with this in RapidMiner Studio? I can assure you I have rummaged through the nodes named "Generate" but I did not see an obvious and simple way to go about this. I am sure I must have missed something. This is where RapidMiner experts like you, dear Reader, can be very helpful. I am looking for some guidance, not a full solution. Many thanks.

Best Answer

  • SGolbert, RapidMiner Certified Analyst, Member, Posts: 344, Unicorn
    Solution Accepted

    Hi Omar,

    There is a very simple way that's right at hand: you can use the same Python scripts that you have been using. You just need to install the Python Scripting extension.

    I avoid repeating myself or reinventing the wheel as much as I can, so I think that in your case it is also the "expert" solution.

    Regards,

    Sebastian
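
    For readers who want a starting point, here is a minimal sketch of what such a script might look like. It is not the original poster's code. The function names, column names, and every default value (uniform daily bookings, a 1% daily cancellation chance, 180 seats, one departure per day) are assumptions made up for illustration. The `rm_main` wrapper follows the entry-point convention that the Execute Python operator is understood to use: a function returning a pandas DataFrame, here with no input ports connected.

```python
import random
from datetime import date, timedelta


def simulate_flight(flight_id, departure, window_days=60,
                    max_daily_bookings=5, cancel_prob=0.01,
                    capacity=180, rng=None):
    """Simulate daily bookings and cancellations for one flight.

    All parameter names and default values are illustrative assumptions.
    """
    rng = rng or random.Random()
    rows, booked = [], 0
    for days_out in range(window_days, 0, -1):
        # Each passenger already booked cancels independently today.
        cancelled = sum(rng.random() < cancel_prob for _ in range(booked))
        booked -= cancelled
        # New bookings are uniform on [0, max_daily_bookings],
        # capped by the seats still free.
        new = min(rng.randint(0, max_daily_bookings), capacity - booked)
        booked += new
        rows.append({
            "flight_id": flight_id,
            "booking_date": departure - timedelta(days=days_out),
            "days_to_departure": days_out,
            "new_bookings": new,
            "cancellations": cancelled,
            "booked_total": booked,
        })
    return rows


def simulate_flights(n_flights=100, first_departure=date(2019, 3, 1), seed=42):
    """Simulate several flights, one departing per day, with varying demand."""
    rng = random.Random(seed)
    rows = []
    for i in range(n_flights):
        rows.extend(simulate_flight(
            flight_id=f"FL{i:03d}",
            departure=first_departure + timedelta(days=i),
            max_daily_bookings=rng.randint(3, 8),  # per-flight demand level
            rng=rng,
        ))
    return rows


def rm_main():
    # pandas is imported here so the generator above can be tested without it.
    import pandas as pd
    return pd.DataFrame(simulate_flights())
```

    Keeping the generator separate from `rm_main` means the same functions can be run and checked in plain Python before being pasted into the operator. Per-flight characteristics such as origin and destination could be added as extra parameters and columns in the same way as `max_daily_bookings`.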

Answers

  • omar_a_karim, Member, Posts: 2, Contributor I

    Thanks Sebastian. Of course that makes a lot of sense - using the encapsulated Python. I am able to do this, yes. I will explore the capabilities of the scripting extensions some more as well.
