Google Search To Tag Cloud Viewer With Stemming
Posted by admin on May 15 2012 04:54:53pm
This flow is similar to the "Tag Cloud Viewer" flow.

This one presents a web UI for entering a search query, sends the query to Google, and retrieves a specified number of documents from the results for further processing.

This flow also adds the components necessary to perform stemming. This flow will display a tag cloud for a URL that points to a text document. For PDF, the text is extracted from the PDF file and a tag cloud is created for the words. For HTML, the HTML tags are removed and then a tag cloud is created for the words.
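
The HTML case can be illustrated with a minimal Python sketch. This is not the flow's own component code; the names `TextExtractor`, `html_to_text`, and `word_counts` are hypothetical, and the sketch assumes that script and style content should be dropped along with the tags.

```python
import re
from collections import Counter
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects text content and ignores tags (hypothetical helper)."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0  # depth inside <script>/<style> elements

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        # Keep only text that is not inside a script or style element.
        if not self._skip:
            self.parts.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with the tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)


def word_counts(text):
    """Count lowercase alphabetic words; these counts would size the tags."""
    return Counter(re.findall(r"[a-z]+", text.lower()))
```

For example, `word_counts(html_to_text(page))` yields the word frequencies that a tag cloud would be drawn from, where `page` is an HTML string fetched by some other means.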

This flow applies stemming to the words to create a dictionary of the terms used, then reverse-stems each stem to the shortest word with the same mapping for the tag cloud visualization. So "run", "runs", and "running" would all map to the stem "run", and the word used in the tag cloud visualization would be "run". If the word "run" did not occur, then the shortest word that exists in the dictionary would be used instead, in this case "runs".
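
The stem-then-reverse-stem step can be sketched in Python as follows. This is an illustration under stated assumptions, not the flow's actual components: `toy_stem` is a deliberately crude suffix stripper standing in for whatever stemmer the flow uses, and `stemmed_counts` is a hypothetical name.

```python
from collections import Counter


def toy_stem(word):
    """Very rough suffix stripper; a stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            word = word[: -len(suffix)]
            break
    # Collapse a doubled final consonant, e.g. "runn" -> "run".
    if len(word) >= 4 and word[-1] == word[-2] and word[-1] not in "aeiou":
        word = word[:-1]
    return word


def stemmed_counts(words):
    """Count words by stem, labelling each stem with its shortest seen word."""
    counts = Counter()
    shortest = {}  # stem -> shortest original word that maps to it
    for w in words:
        w = w.lower()
        stem = toy_stem(w)
        counts[stem] += 1
        best = shortest.get(stem)
        # Prefer the shorter word; break ties alphabetically.
        if best is None or (len(w), w) < (len(best), best):
            shortest[stem] = w
    return {shortest[s]: n for s, n in counts.items()}
```

With this sketch, `stemmed_counts(["run", "runs", "running"])` labels the combined count with "run", while `stemmed_counts(["runs", "running"])` falls back to "runs" because "run" itself never appeared.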
Flow URI:
Location URI: