Split text tokens where words have been concatenated
I have text tokens like:
stylesexploration
expressionresearch
technologypractice
curriculaimprovisationsurvey
where the punctuation and/or spaces are missing in the original text.
Besides keeping a replacement list (e.g. replace "expressionresearch" with the two tokens "expression" and "research"), is there a smarter way to handle this situation?
Answers
Then you might handle this with Generate Attributes, using functions such as contains or find.
~Martin
Dortmund, Germany
https://github.com/fxsjy/jieba
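
The library linked above targets Chinese text and, as far as I understand it, segments with the help of a word dictionary. The same dictionary-driven idea can be sketched for the English tokens from the question. This is only a minimal illustration, not that library's API: the function name split_concatenated and the tiny VOCAB set are hypothetical, and in practice the vocabulary would come from your own word list. It splits a token into the fewest known words and returns None when no full split exists.

```python
from typing import List, Optional, Set


def split_concatenated(token: str, vocabulary: Set[str]) -> Optional[List[str]]:
    """Split token into the fewest vocabulary words, or return None."""
    n = len(token)
    # best[i] is the shortest word list that spells token[:i], or None.
    best: List[Optional[List[str]]] = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(end):
            prefix = best[start]
            word = token[start:end]
            # Skip if token[:start] cannot be split or word is unknown.
            if prefix is None or word not in vocabulary:
                continue
            candidate = prefix + [word]
            current = best[end]
            if current is None or len(candidate) < len(current):
                best[end] = candidate
    return best[n]


# Hypothetical vocabulary, just large enough for the tokens in the question.
VOCAB = {
    "styles", "exploration", "expression", "research",
    "technology", "practice", "curricula", "improvisation", "survey",
}

print(split_concatenated("expressionresearch", VOCAB))
# ['expression', 'research']
print(split_concatenated("curriculaimprovisationsurvey", VOCAB))
# ['curricula', 'improvisation', 'survey']
```

Preferring the fewest words is a simple tie-breaker; with a large real vocabulary, word frequencies would probably be a better way to choose between competing splits.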