Ontology Based Document Data Analysis


Ambreen Zafar, Muhammad Awais and Muhammad Ahmad Aftab


Vol. 18  No. 11  pp. 42-48


A vast amount of data is being generated at a rapid pace over the internet through blogs, online forums, emails, and similar channels. The huge volume and complex semantics of this unstructured data create a need for effective management to support efficient retrieval. Users often find it difficult to choose the right keywords to retrieve relevant search results. Every natural language also contains polysemous words, i.e., words whose meaning varies with context. Additional relations among words, such as the super-subordinate relation (hypernym/hyponym) and the part-whole relation (meronym/holonym), can also be incorporated to capture the semantics of a user’s query. Document clustering combined with an ontology gives users a way to overcome the difficulties of traditional keyword-based search, with the aim of reducing search time and improving the retrieval of relevant documents. This research proposes a semantic document clustering technique that applies the K-means clustering algorithm to a concept weight matrix computed using a modified TF-IDF approach. The weights are calculated specifically for the features and their relations extracted from the WordNet ontology. The silhouette coefficient is used as a measure of cluster purity.


Clustering, Ontology, WordNet, Concept Weight, TF-IDF
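
The pipeline outlined in the abstract (map words to ontology concepts, weight the concepts, cluster with K-means, score with the silhouette coefficient) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `CONCEPTS` dictionary is a hypothetical hand-written stand-in for a WordNet lookup, the weighting is plain TF-IDF rather than the paper's modified variant, and the toy documents and all function names are invented for the example.

```python
import math
from collections import Counter

# Hypothetical word-to-concept map standing in for a WordNet lookup:
# synonyms and hyponyms collapse onto one shared concept label.
CONCEPTS = {
    "car": "vehicle", "automobile": "vehicle", "truck": "vehicle",
    "wheel": "vehicle_part", "engine": "vehicle_part",
    "dog": "animal", "cat": "animal", "puppy": "animal",
    "paw": "animal_part", "tail": "animal_part",
}

def concept_weight_matrix(docs):
    """Plain TF-IDF over concepts (an assumption; not the paper's modified weighting)."""
    bags = [Counter(CONCEPTS[w] for w in doc if w in CONCEPTS) for doc in docs]
    vocab = sorted({c for bag in bags for c in bag})
    n = len(docs)
    df = {c: sum(1 for bag in bags if c in bag) for c in vocab}
    matrix = []
    for bag in bags:
        total = sum(bag.values()) or 1  # avoid division by zero for empty docs
        matrix.append([(bag[c] / total) * (math.log(n / df[c]) + 1.0) for c in vocab])
    return vocab, matrix

def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(rows, k, iters=20):
    """K-means with deterministic farthest-first initialisation."""
    centroids = [rows[0]]
    while len(centroids) < k:
        centroids.append(max(rows, key=lambda r: min(dist(r, c) for c in centroids)))
    labels = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: each row joins its nearest centroid.
        labels = [min(range(k), key=lambda j: dist(r, centroids[j])) for r in rows]
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def silhouette(rows, labels):
    """Mean silhouette coefficient; singleton clusters score 0."""
    scores = []
    for i, r in enumerate(rows):
        by_cluster = {}
        for j, other in enumerate(rows):
            if j != i:
                by_cluster.setdefault(labels[j], []).append(dist(r, other))
        own = by_cluster.pop(labels[i], None)
        if not own or not by_cluster:
            scores.append(0.0)
            continue
        a = sum(own) / len(own)                                  # mean intra-cluster distance
        b = min(sum(d) / len(d) for d in by_cluster.values())    # nearest other cluster
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Toy corpus (invented): two vehicle documents, two animal documents.
docs = [
    ["car", "engine", "wheel"],
    ["automobile", "truck", "engine"],
    ["dog", "paw", "tail"],
    ["puppy", "cat", "tail"],
]
vocab, matrix = concept_weight_matrix(docs)
labels = kmeans(matrix, k=2)
score = silhouette(matrix, labels)
print(vocab, labels, round(score, 3))
```

Because "car", "automobile", and "truck" all map to the same concept, the first two documents share features even though they have only one literal word in common, which is the effect the ontology-based weighting is meant to achieve. A real setup would replace `CONCEPTS` with lookups against WordNet and handle polysemy, which this sketch ignores.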