Theses and Dissertations at Montana State University (MSU)

Permanent URI for this collection: https://scholarworks.montana.edu/handle/1/733

Search Results

Now showing 1 - 3 of 3
  • MAXPLANAR : a graphical software package for testing maximal planar subgraph algorithms
    (Montana State University - Bozeman, College of Engineering, 1996) Zhao, Kedan
  • Visualization of groundwater pollution using computer graphics
    (Montana State University - Bozeman, College of Engineering, 1990) Palakovich, James S.
  • Apriori approach to graph-based clustering of text documents
    (Montana State University - Bozeman, College of Engineering, 2008) Hossain, Mahmud Shahriar; Chairperson, Graduate Committee: Rafal A. Angryk
    This thesis introduces a new document clustering technique based on frequent senses. The developed system, GDClust (Graph-Based Document Clustering) [1], works with frequent senses rather than the frequent keywords used in traditional text mining techniques. GDClust represents text documents as hierarchical document-graphs and uses an Apriori paradigm to find frequent subgraphs, which reflect frequent senses. The discovered frequent subgraphs are then used to generate accurate sense-based document clusters. We propose a novel multilevel Gaussian minimum support strategy for candidate subgraph generation. We also introduce a second novel mechanism, Subgraph-Extension mining, which reduces the number of candidates and the overhead imposed by traditional Apriori-based candidate generation. GDClust uses an English-language thesaurus (WordNet [2]) to construct document-graphs and applies graph-based data mining techniques for sense discovery and clustering. The system is automated and requires minimal human interaction for clustering.
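
    The Apriori paradigm mentioned in the abstract can be illustrated with a minimal sketch. This is not GDClust's implementation: it assumes each document-graph is reduced to a plain set of labeled edges, treats a "subgraph" as any edge set (connectivity is not checked), and uses a single fixed support threshold rather than the multilevel Gaussian strategy or Subgraph-Extension mining. The function name, the toy edges, and the threshold are all hypothetical.

```python
from itertools import combinations


def frequent_edge_sets(doc_graphs, min_support):
    """Level-wise (Apriori-style) search for edge sets that occur in at
    least `min_support` document-graphs.

    Simplifying assumption: a document-graph is a frozenset of edges and a
    candidate "subgraph" is any set of edges; connectivity is ignored.
    """
    def support(candidate):
        # Number of document-graphs that contain every edge of the candidate.
        return sum(1 for g in doc_graphs if candidate <= g)

    all_edges = set().union(*doc_graphs)
    # Level 1: frequent single edges.
    current = {frozenset([e]) for e in all_edges
               if support(frozenset([e])) >= min_support}

    frequent = {}
    k = 1
    while current:
        for c in current:
            frequent[c] = support(c)
        # Join step: merge pairs of frequent k-sets into (k+1)-sets.
        candidates = {a | b for a in current for b in current
                      if len(a | b) == k + 1}
        # Prune step: every k-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in current
                             for s in combinations(c, k))}
        current = {c for c in candidates if support(c) >= min_support}
        k += 1
    return frequent


# Hypothetical toy data: three documents as sets of (parent, child) edges.
docs = [
    frozenset({("animal", "dog"), ("animal", "cat"), ("entity", "animal")}),
    frozenset({("animal", "dog"), ("entity", "animal")}),
    frozenset({("animal", "cat"), ("entity", "animal"), ("entity", "plant")}),
]
result = frequent_edge_sets(docs, min_support=2)
for edge_set, count in sorted(result.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(edge_set), count)
```

    The prune step is what makes the search Apriori-like: a candidate is counted against the documents only if all of its smaller subsets were already found to be frequent.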