"A question about process documents from files"

platanas20 Member Posts: 22 Maven
edited May 2019 in Help
Hello to everyone,

I'm new to RapidMiner. I created a process that saves articles from a site as txt files (I use the "Crawl Web" operator).
After that I use the "Process Documents from Files" operator to read the files.
Inside that operator I use the "Extract Information" operator (XPath).
I get the comments successfully, and I want to ask whether it is possible to write only the comments to a document (for example a .txt file)?
I'm sorry for my English!


MY CODE:

[Process XML only partially preserved; surviving fragments:]

…http://nba.sport24.gr/category/nba_news/?locale=el_gr"/>
<parameter key="max_depth" value="1"/>
…@class='body']/h:p/text()"/>
<connect from_op="Process Documents from Files" from_port="example set" to_op="Write as Text" to_port="input 1"/>

Answers

  • Miguel_B_scher Member Posts: 9 Contributor II
    Hello platanas.
    Since I don't have your files, let me check that I understand:
    your problem is that you want to save only the extracted comments, named "com", and you don't know how to do that, right?
    You can use the "Select Attributes" operator to keep only the comments that you extracted with your XPath expression. Just enter the name of the attribute where your texts are stored. In your process you named it "com".
    The result of this operator will contain only the extracted comments, which you can then save to a database or a text file using a write operator.

    Hope this helps.

    Greetings
    Miguel

  • platanas20 Member Posts: 22 Maven

    Hello Miguel,

    Thank you very much. This is exactly what I want to do. I use the "Select Attributes" operator and choose the attribute 'com'. After that I use the "Write Excel" operator, but the resulting Excel file still contains all the attributes.

    Greetings
    platanas
  • Marco_Boeck Administrator, Moderator, Employee, Member, University Professor Posts: 1,984 RM Engineering
    Hi,

    please post your process XML so we can see what's wrong. Select Attributes followed by Write Excel works fine for me here ;)

    Regards,
    Marco
  • platanas20 Member Posts: 22 Maven
    Here is my XML code:

    [Process XML only partially preserved; surviving fragments:]

    …http://nba.sport24.gr/category/nba_news/?locale=el_gr"/>
    <parameter key="max_depth" value="1"/>
    …@class='body']/h:p/text()"/>
    <connect from_op="Process Documents from Files" from_port="example set" to_op="Select Attributes (2)" to_port="example set input"/>
  • colo Member Posts: 236 Maven
    Hi platanas,

    try enabling the "include special attributes" option of the "Select Attributes" operator. Because you let "Process Documents from Files" append some metadata, the example set contains special attributes, and these are not filtered out by default.

    Best regards
    Matthias

    P.S. Please consider using the CODE-Tags when posting longer parts of code to improve readability and keep postings shorter.
  • platanas20 Member Posts: 22 Maven

    Yes, now it works fine!
    Thank you very much
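
For reference, the fix discussed above can be sketched as the following fragment of process XML. This is only an illustration, not the poster's actual process: the operator name "Select Attributes (2)" and the attribute name "com" are taken from the thread, while the parameter keys are assumed to be the usual ones for the Select Attributes operator. Version and layout attributes (compatibility, x, y, and so on) are left out and would normally be written by RapidMiner itself.

```xml
<!-- Sketch only: keep just the extracted comments attribute "com". -->
<operator activated="true" class="select_attributes" name="Select Attributes (2)">
  <!-- Filter on a single attribute by name -->
  <parameter key="attribute_filter_type" value="single"/>
  <parameter key="attribute" value="com"/>
  <!-- Also apply the filter to special attributes (e.g. appended metadata),
       which otherwise pass through unfiltered -->
  <parameter key="include_special_attributes" value="true"/>
</operator>
```

With "include special attributes" enabled, the metadata attributes appended while reading the documents are filtered as well, so a following write operator (Write Excel in the thread) should receive only "com".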