Split Document by Content

andk Member Posts: 21 Maven
edited November 2019 in Help
Hi,

As I am a newbie to RM, I have a question regarding the "Split Document by Content" operator. I have to supply an input and an output folder, so I don't really get why the software asks me to connect the ports. In general, is there a description of the port shortcuts, i.e. thr, op, doc, etc.? Am I right to suppose that I need to purchase "How to extend RM 5.0" for this? The name implies that it is mainly directed at developers, but I assume it is a further development of the former "rm tutorial...." and so also contains detailed descriptions of the different operators and processes.

Thanks for the help! Best regards,

André

Answers

  • haddock Member Posts: 849 Maven
    Bonjour André,

    No need to buy the Extenders manual, but it is probably a good idea to check out page 33 of the normal manual (available here: http://rapid-i.com/content/view/26/84/) for info about what the graphics mean. As for whether you have to connect your operators, it depends on the context; don't be alarmed by warning messages. If in doubt, start the process to find out! RM is pretty forgiving.

    Good weekend!

  • andk Member Posts: 21 Maven
    thank you, ok i really should have read the manual with more caution!;)

    best regards,

    andré
  • andk Member Posts: 21 Maven
    hi,

    i am sorry, but i couldn't find a solution to my problem in the manual. i am trying to split a collection of XML documents by an XPath query with the "split file by content" operator. for now this is all i want to do. i supply the input folder (on my hard disk) in the "texts" property and the output folder in "output", and of course i also define the XPath at which each document in the collection should be split. nevertheless, the operator still asks me to connect ports. i can't understand why this is the case, as all this process should do is split my files and store them at the defined location. the description of the operator says: input: through 1, output: through 1 .... what does that mean? what is this thr port for? i couldn't find this information in the manual. it would be very nice if someone could help me with this.

    best regards,

    andré
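
For readers unfamiliar with the task described in this post, here is a minimal sketch of the same idea outside RapidMiner: read every XML file in an input folder, select elements with an XPath expression, and write each match to its own file in an output folder. This is not how the RapidMiner operator is implemented; the function name `split_xml_by_xpath`, the output file naming, and the sample element name `article` are assumptions made for illustration. Python's `xml.etree.ElementTree` supports only a limited XPath subset.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def split_xml_by_xpath(input_dir, output_dir, xpath):
    """Write each element matching `xpath` to its own file (hypothetical helper).

    `xpath` must be relative (for example ".//article"), because
    ElementTree rejects absolute paths when searching from an element.
    """
    input_dir, output_dir = Path(input_dir), Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for source in sorted(input_dir.glob("*.xml")):
        root = ET.parse(source).getroot()
        for index, element in enumerate(root.findall(xpath)):
            # Assumed naming scheme: <original name>_<match index>.xml
            target = output_dir / f"{source.stem}_{index}.xml"
            ET.ElementTree(element).write(
                target, encoding="utf-8", xml_declaration=True
            )
            written.append(target)
    return written


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        texts, output = Path(tmp) / "texts", Path(tmp) / "output"
        texts.mkdir()
        (texts / "sample.xml").write_text(
            "<collection><article>a</article><article>b</article></collection>",
            encoding="utf-8",
        )
        for path in split_xml_by_xpath(texts, output, ".//article"):
            print(path.name)
```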

  • land RapidMiner Certified Analyst, RapidMiner Certified Expert, Member Posts: 2,531 Unicorn
    Hi,
    you don't need to connect the ports at all; they won't change the data anyway. They are just a help for defining the order of execution. Without any connection, no order is defined, so even though you want to process the files that result from the split, it might happen that the split runs only after the attempt to process them.
    There's a button to show the actual order if you don't want to use the through ports.
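
The ordering point can be illustrated with a small sketch, assuming each connection is treated as a "must run before" constraint. This is not RapidMiner's scheduling code; the function `execution_order` and the operator names `split_files` and `process_files` are hypothetical, and the example uses Python's standard `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def execution_order(operators, connections):
    """Return one valid run order for the given operators.

    `connections` is a list of (upstream, downstream) pairs, standing in
    for through-port connections: upstream must run before downstream.
    """
    sorter = TopologicalSorter({name: set() for name in operators})
    for upstream, downstream in connections:
        sorter.add(downstream, upstream)
    return list(sorter.static_order())


operators = ["process_files", "split_files"]

# With a connection, the split is guaranteed to run first.
connected = execution_order(operators, [("split_files", "process_files")])
print(connected)

# Without connections, every ordering is valid, so nothing guarantees
# that the split happens before the processing step.
unconnected = execution_order(operators, [])
print(unconnected)
```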


    Greetings,
    Sebastian