dc.contributor.advisor | Ngu, Anne H.H. | |
dc.contributor.author | Phillips, Clark Raymond | |
dc.date.accessioned | 2015-06-26T18:02:51Z | |
dc.date.available | 2015-06-26T18:02:51Z | |
dc.date.issued | 2015-05 | |
dc.identifier.citation | Phillips, C. R. (2015). Employing an efficient and scalable implementation of the Cost Sensitive Alternating Decision Tree algorithm to efficiently link person records (Unpublished thesis). Texas State University, San Marcos, Texas. | |
dc.identifier.uri | https://digital.library.txstate.edu/handle/10877/5576 | |
dc.description.abstract | When collecting person records for census, identifying individuals accurately is paramount. Over time, people change their phone numbers, their addresses, even their names. Without a universal identifier such as a social security number or a fingerprint, it is difficult to know whether two distinct person records represent the same individual. The Cost Sensitive Alternating Decision Tree (CSADT) algorithm (a supervised learning algorithm) is employed as a Record Linkage solution to the problem of resolving whether two person records are the same individual. A person record consists of several attributes such as a name, a phone number, an address, etc. The number of person-record-pairs grows exponentially as the number of records increases. In order to accommodate this exponential growth, a scalable implementation of the CSADT algorithm was employed. A thorough investigation and evaluation are presented demonstrating the effectiveness of this implementation of the CSADT algorithm on linking person records. | |
dc.format | Text | |
dc.format.extent | 99 pages | |
dc.format.medium | 1 file (.pdf) | |
dc.language.iso | en | |
dc.subject | Decision trees | |
dc.subject | Machine learning | |
dc.subject | Alternating decision tree | |
dc.subject.lcsh | Computer science--Mathematics | en_US |
dc.subject.lcsh | Combinatorial analysis | en_US |
dc.title | Employing an Efficient and Scalable Implementation of the Cost Sensitive Alternating Decision Tree Algorithm to Efficiently Link Person Records | |
txstate.documenttype | Thesis | |
dc.contributor.committeeMember | Gao, Byron J. | |
dc.contributor.committeeMember | Lu, Yijuan | |
thesis.degree.department | Computer Science | en_US |
thesis.degree.discipline | Computer Science | en_US |
thesis.degree.grantor | Texas State University | en_US |
thesis.degree.level | Masters | en_US |
thesis.degree.name | Master of Science | en_US |
dc.description.department | Computer Science | |