Discovering Similarity Across Heterogeneous Features: A Case Study of Clinico-Genomic Analysis

Vandana P. Janeja; Josephine M. Namayanja; Yelena Yesha; Anuja Kench; Vasundhara Misal

doi:10.4018/IJDWM.2020100104

Journal article

Peer reviewed

International Journal of Data Warehousing and Mining, Vol. 16(4), pp. 63-83

2020-10-01

DOI: https://doi.org/10.4018/IJDWM.2020100104

Computer Science

Computer Science, Software Engineering

Science & Technology

Technology

Abstract

Datasets that mix continuous and categorical attributes pose challenges for data clustering. Traditional clustering techniques like k-means clustering work well when applied to small homogeneous datasets. However, as the data size becomes large, it becomes increasingly difficult to find meaningful and well-formed clusters. In this paper, the authors propose an approach that utilizes a combined similarity function, which looks at similarity across numeric and categorical features, and employs this function in a clustering algorithm to identify similarity between data objects. The findings indicate that the proposed approach handles heterogeneous data better by forming well-separated clusters.

Metrics

29 Record Views

1 Times Cited - Web of Science

InCites Highlights

These are selected metrics from the InCites Benchmarking & Analytics tool that relate to this output.

Collaboration types: Domestic collaboration
Citation topics: 4 Electrical Engineering, Electronics & Computer Science; 4.61 Artificial Intelligence & Machine Learning; 4.61.869 Clustering
Web of Science research areas: Computer Science, Software Engineering
ESI research areas: Computer Science

Details

Title: Discovering Similarity Across Heterogeneous Features: A Case Study of Clinico-Genomic Analysis
Creators: Vandana P. Janeja - University of Maryland, Baltimore County
Josephine M. Namayanja - University of Massachusetts Boston
Yelena Yesha - University of Maryland, Baltimore County
Anuja Kench - University of Maryland, Baltimore County
Vasundhara Misal - University of Maryland, Baltimore County
Publication Details: International Journal of Data Warehousing and Mining, Vol. 16(4), pp. 63-83
Publisher: IGI Global
Number of pages: 21
Academic Unit: College of A&S; A&S - Computer Science
Language: English
Resource Type: Journal article
Record Identifier: 991031757332802976