"PCA operator taking too much time"

waqaskhan343 · May 2018

Hello, I am performing sentiment analysis on text data consisting of 1700 tweets. After all the preprocessing, I want to visualize the data using PCA to check the relationship between the different classes. After generating TF-IDF, I am using the PCA operator with 2 components and fixed number variance, but it is taking a very long time, approximately 2 to 3 hours. I even put a Normalize operator before PCA, but that did not help.

Telcontar120 · May 2018

Did you apply any pruning when you generated your word vector? If not, then you probably have thousands of attributes, many of which have extremely low values, and that is why PCA is taking so long! You should definitely prune your wordlist first, since tokens that have only a handful of occurrences are not going to be meaningful, but they are causing a lot of computational effort on the part of the PCA operator.
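To illustrate the pruning idea outside RapidMiner, here is a minimal Python sketch using only the standard library. It is not the RapidMiner operator; the function name `prune_vocabulary`, the thresholds `min_docs` and `max_fraction`, and the sample tweets are all hypothetical and chosen only for illustration. It drops tokens that appear in too few documents (or in nearly all of them), which shrinks the number of attributes that would later be handed to PCA.

```python
from collections import Counter


def prune_vocabulary(tokenized_docs, min_docs=2, max_fraction=0.9):
    """Keep tokens found in at least `min_docs` documents and in at most
    `max_fraction` of all documents (hypothetical thresholds)."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter()
    for tokens in tokenized_docs:
        # Count each token once per document (document frequency).
        doc_freq.update(set(tokens))
    return sorted(
        tok for tok, df in doc_freq.items()
        if df >= min_docs and df / n_docs <= max_fraction
    )


# Tiny made-up corpus standing in for the 1700 tweets.
tweets = [
    "great phone love the camera",
    "terrible battery hate the phone",
    "love the battery great screen",
    "the camera is terrible",
]
tokenized = [t.split() for t in tweets]

full_vocab = sorted({tok for doc in tokenized for tok in doc})
pruned_vocab = prune_vocabulary(tokenized, min_docs=2, max_fraction=0.9)

print(len(full_vocab), "attributes before pruning")
print(len(pruned_vocab), "attributes after pruning:", pruned_vocab)
```

On a real tweet corpus the reduction is usually far larger than in this toy case, because most tokens occur only once or twice. Fewer attributes means a much smaller covariance matrix for PCA to decompose.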

Thomas_Ott · May 2018

What @Telcontar120 said. Work on your wordlist first before you put it into PCA. Even just 50 attributes could chew up runtime if you don't have a large-memory computer.

Altair RapidMiner Community