Abstract
In classification problems that involve multiple data sources, the data distributions may vary from one source to another. Knowing which dimensions differ between the source and target data is therefore important for reducing the distance between domains and enabling accurate knowledge transfer. Here, we present a novel method that identifies variant and invariant genes between source and target datasets and integrates these results to reduce the difference between the two distributions while simultaneously optimizing the size and classification error of the selected gene subset. In particular, we formulate gene subset selection as a bilevel multi-objective programming problem and solve it with a particle swarm optimization algorithm from evolutionary computation.