Does the "Extract Sentiment" operator work with French words?

EL75 Member Posts: 43 Contributor II
Hi,
Could someone tell me whether VADER or WordNet handles French when one of them is selected in the "Extract Sentiment" operator?
- WordNet exists for French (WOLF): http://pauillac.inria.fr/~sagot/index.html#wolf
- VADER has also been ported: https://github.com/thomas7lieues/vader_FR

But what about RapidMiner's own operator? I have found no way to configure it for another language, and nothing about it in the help window either.
If the standard RapidMiner operator doesn't work for French, is there a way to connect RapidMiner to the French projects mentioned above?
Thanks.

Best Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Solution Accepted
    Hi,

    There is something odd with the escaping of / and similar characters. Please try this process, and adapt the path in Read CSV so that it points to your downloaded copy of: https://raw.githubusercontent.com/thomas7lieues/vader_FR/master/vaderSentiment_fr/fr_lexicon.txt
    Best,
    Martin

    [Process XML not preserved. The only surviving fragment, which appears twice, is:]
    <parameter key="encoding" value="SYSTEM"/>

    - Sr. Director Data Solutions, Altair RapidMiner -
    Dortmund, Germany
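
For readers who want to inspect the lexicon outside RapidMiner, here is a minimal Python sketch of the same loading step. It assumes that fr_lexicon.txt is tab-separated with the word in the first column and its score in the second, as in the original VADER lexicon, and that the escaping trouble comes from quote and slash characters inside tokens. The function name `load_lexicon` and the sample rows are hypothetical, not taken from the repository.

```python
import csv
import io


def load_lexicon(lines):
    """Parse VADER-style lexicon lines into a {token: score} dict.

    Assumption: columns are tab-separated, token first, score second.
    """
    lexicon = {}
    # QUOTE_NONE keeps characters such as " literal instead of treating
    # them as quoting, so tokens like :/ or "nul" survive unchanged.
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) < 2:
            continue  # skip blank or malformed lines
        try:
            lexicon[row[0]] = float(row[1])
        except ValueError:
            continue  # skip rows whose score is not numeric
    return lexicon


# Made-up sample rows for illustration, not copied from fr_lexicon.txt.
sample = io.StringIO(
    'super\t2.9\t0.7\t[3, 3, 2]\n'
    ':/\t-1.2\t0.6\t[-1, -1, -2]\n'
    '"nul"\t-2.0\t0.5\t[-2, -2, -2]\n'
)
lexicon = load_lexicon(sample)
print(lexicon)
```

For the real file, the same function can be fed an open file handle, for example `open(path, encoding="utf-8", newline="")`, where `path` points to the downloaded copy.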
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Solution Accepted
    Hi @EL75,
    I will connect with you via email.
    Best,
    Martin

Answers

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi,

    This operator is actually just a wrapper around models created with the dictionary-based sentiment operator. You can easily use the dictionary-based sentiment operator to do this yourself.

    Best,
    Martin
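
To make "dictionary-based" concrete, here is a rough Python sketch of the general idea: each word found in the dictionary contributes its weight, and the weights are summed. This is an illustration under simplifying assumptions (plain summation, no negation or intensifier handling), not RapidMiner's actual implementation; `score_text` and the toy weights are hypothetical.

```python
import re


def score_text(text, lexicon):
    """Sum the lexicon weights of the tokens in text (0.0 if none match)."""
    # \w+ matches Unicode word characters, so accented letters are kept.
    tokens = re.findall(r"\w+", text.lower())
    return sum((lexicon.get(tok, 0.0) for tok in tokens), 0.0)


# Toy dictionary with made-up weights.
toy_lexicon = {"super": 2.9, "bien": 1.5, "nul": -2.0}

print(score_text("Ce film est super bien", toy_lexicon))  # positive total
print(score_text("Ce film est nul", toy_lexicon))         # negative total
```

Swapping in a French dictionary only changes the `lexicon` argument, which is why a French word list is enough to get French scoring.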
  • EL75 Member Posts: 43 Contributor II
    Hello mschmitz,
    Thanks for your answer. How can I set up the "dictionary-based sentiment operator" so that it uses the French versions of VADER or WordNet mentioned above?
    Best regards
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi @EL75,
    Did you check the tutorial process?

    BR,
    Martin
  • EL75 Member Posts: 43 Contributor II
    If you mean this one, yes. Tell me if I'm wrong.
    If so, how does this process let me access one of these resources?
    - WordNet exists for French (WOLF): http://pauillac.inria.fr/~sagot/index.html#wolf
    - VADER has also been ported: https://github.com/thomas7lieues/vader_FR
    Best regards

  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
  • EL75 Member Posts: 43 Contributor II
    Thanks for your answer!
    The WOLF project is the French translation of WordNet; it would probably be a good idea to add it too.
    RapidMiner's popularity will increase within the French community :)

    - WordNet exists for French (WOLF): http://pauillac.inria.fr/~sagot/index.html#wolf
    - VADER has also been ported: https://github.com/thomas7lieues/vader_FR
  • EL75 Member Posts: 43 Contributor II
    Martin,
    I am trying to copy/paste the XML code ("a full training process looks like this") into RapidMiner, but nothing happens.
    Could you help?

    [Pasted process XML not preserved. Surviving fragments:]
    <parameter key="encoding" value="SYSTEM"/>
    https://raw.githubusercontent.com/thomas7lieues/vader_FR/master/vaderSentiment_fr/fr_lexicon.txt
    https://github.com/cjhutto/vaderSentiment
  • EL75 Member Posts: 43 Contributor II
    edited December 2020
    Thanks a lot! It works fine.
    May I ask you a few additional questions, in order to fine-tune the process?

    1- Working with an example set
    As I have an example set containing reviews, I added a "Data to Documents" operator before the "Loop Collection" operator (I haven't seen an operator like "Apply Model (Documents)" dedicated to example sets). Then I put all my text processing operators inside the loop, and it looks fine. Is this the right way?

    2- Using emojis
    I've seen in the VADER repository that there are two other files that could be helpful (I have a lot of emoticons in my reviews):
    Is there a way to integrate them into this process?


    3- Understanding the columns in the dictionary

    - att1 is the word from the dictionary
    - att2 seems to be the polarity value
    - att3: is it the weight?
    - att4: how are these values used?
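
A note on the four columns, hedged: the original English VADER lexicon documents its columns as token, mean sentiment rating, standard deviation, and the list of raw human ratings. Assuming the French file keeps that layout (not verified here), att3 would be the spread of the ratings rather than a weight, and att4 the individual ratings behind att2. The Python sketch below checks that relationship on a made-up line; `parse_entry` is a hypothetical helper.

```python
import ast
import statistics


def parse_entry(line):
    """Split one lexicon line into (token, mean, std, raw_ratings).

    Assumed layout: token <TAB> mean <TAB> std <TAB> [r1, r2, ...]
    """
    token, mean, std, ratings = line.rstrip("\n").split("\t")
    return token, float(mean), float(std), ast.literal_eval(ratings)


# Made-up line in the assumed layout, not copied from fr_lexicon.txt.
line = "bon\t1.9\t0.7\t[2, 1, 2, 3, 2, 2, 1, 3, 2, 1]"
token, mean, std, ratings = parse_entry(line)

# If the assumption holds, att2 and att3 can be recomputed from att4.
recomputed_mean = statistics.mean(ratings)
recomputed_std = statistics.pstdev(ratings)  # population standard deviation
print(token, mean, recomputed_mean)
print(token, std, recomputed_std)
```

Under this reading, only att1 and att2 matter for scoring; att3 and att4 document how each score was obtained.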

    4- Using polarity_scores_max
    https://github.com/thomas7lieues/vader_FR
    This web page says that polarity_scores_max can be used: how is that possible in this process?
    # Note : You can use polarity_scores_max instead of polarity_scores. polarity_scores_max uses fuzzywuzzy to get the most similar words with your inputs. For example "connar" won't be detected with polarity_scores but with polarity_scores_max
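
The note above describes fuzzy lookup: a misspelled word is mapped to the most similar dictionary entry before scoring. Here is a minimal Python sketch of that idea. It deliberately swaps fuzzywuzzy for the standard library's `difflib`, so it is not the vader_FR implementation; `fuzzy_score`, the 0.8 cutoff, and the toy weights are all assumptions.

```python
import difflib


def fuzzy_score(word, lexicon, cutoff=0.8):
    """Return the weight of word, falling back to its closest lexicon entry.

    cutoff is the minimum similarity ratio (0..1) for a fallback match.
    """
    if word in lexicon:
        return lexicon[word]
    matches = difflib.get_close_matches(word, list(lexicon), n=1, cutoff=cutoff)
    return lexicon[matches[0]] if matches else 0.0


# Toy dictionary with made-up weights.
toy_lexicon = {"connard": -3.0, "content": 2.0}

print(fuzzy_score("connar", toy_lexicon))   # falls back to "connard"
print(fuzzy_score("bonjour", toy_lexicon))  # no close entry
```

A lower cutoff catches more misspellings but also produces more false matches, so the threshold is worth tuning on real reviews.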

    5- Building my own dictionary
    If I want to add sentiment words and weights specific to the domain I'm working on, what would be the best approach?
    Just adding new lines to the dictionary file?

    I really enjoy using this dictionary on my data set :)
    All the best,
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi @EL75,
    Yes, you can just append the dictionaries to create one large dictionary.

    Best,
    Martin
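
Appending dictionaries can be sketched in Python as a simple merge. One detail to decide is what happens when the same word appears in two files; the sketch below assumes that later dictionaries override earlier ones, so domain-specific weights win. `merge_lexicons` and all entries are hypothetical.

```python
def merge_lexicons(*lexicons):
    """Combine several {token: weight} dicts; later ones override earlier ones."""
    merged = {}
    for lex in lexicons:
        merged.update(lex)
    return merged


# Made-up entries standing in for the base, emoticon, and domain word lists.
base = {"super": 2.9, "nul": -2.0}
emoticons = {":)": 2.0}
domain = {"addictif": -1.5, "super": 1.0}  # overrides the base weight

combined = merge_lexicons(base, emoticons, domain)
print(combined)
```

When concatenating the text files directly instead, duplicate words deserve a check, since how the consuming operator resolves them is not stated in this thread.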
  • EL75 Member Posts: 43 Contributor II
    Hi Martin,
    Something strange: the process works fine on its own. But when I copy/paste it into a bigger process with other operators (I did this to compare results), I get an error message about tokenization, although the "Loop Collection" subprocess contains the tokenization step. I'm 100% sure that all connections are correct. I even tried something absurd that seems to reveal a bug: in the process that works fine, I imported other operators (the ones that trigger the error), then moved them to the trash (so that I was back to the process that worked), and then the process crashed...
    Below: the process containing, at the bottom, the "Vader FR" part (deactivated)


    The "Vader FR" process (works fine alone):


    Thanks for your help
    Best
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi @EL75,
    I would love to help, but I am very busy and this is somewhat complex, so I cannot dive deep into it.

    Is this something commercial or is this an academic project? If this is a commercial request, we may move this over and assign resources to it. Otherwise, maybe @lionelderkrikor or someone else can help?

    Cheers,
    Martin
  • EL75 Member Posts: 43 Contributor II
    edited December 2020
    Hi Martin,
    Of course not, this is not commercial; it is for research :) (I work on the health aspects and impacts of digital practices, using reviews from parents and children collected from app stores, Twitter, blogs, etc.)
    But as I'm working on a French dataset, that would be very useful.

    May I also ask you:
    1- WORD2VEC
    I've read your article "Synonym Detection with Word2Vec" and tried to implement the process, but I obtained strange results. Does this operator work with every language (e.g. French, of course)?

    2- TOPIC EXTRACTION
    As I'm trying to extract topics from the data set, I've read and adapted your excellent article dealing with Amazon reviews, thinking that this process could fit part of my needs. It is really inspiring! I wonder if there are any other ways to visualize the results, such as a dendrogram?

    Best,
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi @EL75,
    Maybe you could explain a bit more what you are trying to accomplish from a "business" perspective, so we can map this to a DS method?

    ~Martin
  • MartinLiebig Administrator, Moderator, Employee, RapidMiner Certified Analyst, RapidMiner Certified Expert, University Professor Posts: 3,404 RM Data Scientist
    Hi @EL75,
    I added the French operator a minute ago. It will not be publicly available for a bit (since we usually wait until we have more new things to release). Please let me know if you need a preview build.

    Best,
    Martin
  • EL75 Member Posts: 43 Contributor II
    Hi Martin,
    Thanks for doing this. I would indeed appreciate receiving a preview build.
    I wish you a happy new year!
    Best,