Knowledge Discovery Using Neural Networks

Date

2003-12

Authors

Doddameti, Sandesh

Abstract

A vital type of knowledge that can be acquired from the vast amounts of data generated in today's world is hidden trends. These trends highlight the generalities that exist in the data and can be expressed as rules or correlations. Such trends, which are specific to the application, represent a type of knowledge discovery. The acquired knowledge is extremely helpful in understanding the domain that the data describes. In this thesis, a process for discovering trends in datasets using neural networks is presented. The process consists of five phases: data preparation, training, pruning and re-training, clustering, and extraction. In the data preparation phase, the data is encoded into binary vectors. In the training phase, a supervised learning method is used to train the neural network, which learns the correlations that exist in the dataset. During training, inconsistent patterns are removed via a filtering process. In the pruning and re-training phase, unnecessary connections and neurons are pruned and the network is re-trained. The clustering phase superimposes a layer of adaptive clustering neural network on the hidden layer of the network. The purpose of the superimposed layer is to create generalized regions of activation for the hidden-layer neuron activation values and to identify a representative value for each region. The extraction phase uses the trained network, with the superimposed layer, to discover the trends in the dataset. The process provides several control parameters, such as frequency, radius, and activation level, to achieve flexibility and stringency for the extracted trends. Predicted trends are discovered during this phase using all combinations of the input patterns.
Finally, the applicability and robustness of the process are demonstrated by applying it to real-world datasets: demographics and crime; dietary factors and plasma retinol and beta-carotene concentrations; system measurements and CPU usage; body measurements and body fat percentage; and pollution and mortality. The process was used to predict trends from acquired knowledge in the demographics-crime dataset.
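
As an illustration of the data preparation phase described above, the following is a minimal sketch of encoding a record into a binary vector. It assumes a one-hot encoding over discretized intervals. The abstract does not state the encoding scheme used in the thesis, so the function names, field names, and cut points here are hypothetical.

```python
from bisect import bisect_right


def encode_value(value, cut_points):
    """One-hot encode a numeric value by the interval it falls into.

    `cut_points` is a sorted list of interval boundaries (an assumption
    for this sketch); n cut points define n + 1 intervals.
    """
    index = bisect_right(cut_points, value)
    vector = [0] * (len(cut_points) + 1)
    vector[index] = 1
    return vector


def encode_record(record, cut_points_by_field):
    """Concatenate the one-hot codes of each field into one binary vector."""
    bits = []
    for field, cut_points in cut_points_by_field.items():
        bits.extend(encode_value(record[field], cut_points))
    return bits


# Hypothetical fields and boundaries, chosen only for illustration.
cuts = {"age": [30, 60], "income": [20000]}
encoded = encode_record({"age": 45, "income": 15000}, cuts)
print(encoded)
```

Each field contributes one active bit, so the resulting vector can be fed directly to the input layer of a network with one input neuron per interval.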

Keywords

neural networks, data mining, supervised learning, machine learning, computer science

Citation

Doddameti, S. (2003). Knowledge discovery using neural networks (Unpublished thesis). Texas State University-San Marcos, San Marcos, Texas.
