During Summer 2007, SEASR team member Xavier Llorà, a Research Scientist with the Automated Learning Group (NCSA) and the Illinois Genetic Algorithms Lab, received two Bronze Humies at GECCO 2007 (the Genetic and Evolutionary Computation Conference). Dr. Llorà also received two Best Paper awards at international conferences within a month of one another, a signal accomplishment.
First, the Humies: Dr. Llorà and NCSA faculty fellow Rohit Bhargava (Bioengineering and Beckman, UIUC), with the support of students Rohith Reddy (Bioengineering and Beckman, UIUC) and Brian Matesic (Bioengineering, UIUC), received a Bronze Humie for “Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infrared Spectroscopic Imaging.” The team used a novel genetics-based machine learning technique (NAX) to diagnose prostate cancer. Their innovative data handling and analysis strategies demonstrate fast learning and accurate classification that scales well with parallelization. For the first time, an automated discovery method has performed as accurately in predicting prostate cancer as human experts.
Along with co-authors Jaume Bacardit (ASAP research group, School of Computer Science and IT, U. Nottingham), Michael Stout (ASAP research group, School of Computer Science and IT, U. Nottingham), Jonathan D. Hirst (School of Chemistry, U. Nottingham), Kumara Sastry (Illinois Genetic Algorithms Laboratory, UIUC), and Natalio Krasnogor (ASAP research group, School of Computer Science and IT, U. Nottingham), Dr. Llorà was also awarded a Bronze Humie for “Automated Alphabet Reduction Method with Evolutionary Algorithms for Protein Structure Prediction.” The paper demonstrates that certain automated procedures can reduce the amino acid alphabet used for protein structure prediction from twenty letters to just three with no significant loss of accuracy. This discovery has the potential to enable a faster and easier learning process and to generate more compact and human-readable classifiers.
The Humies are awarded annually to recognize human-competitive results produced by genetic and evolutionary computation. Each Bronze award carries a $1000 prize.
Next, the Best Papers. The first, “Toward Billion Bit Optimization via Parallel Estimation of Distribution Algorithm,” won in the Estimation of Distribution Algorithms track at GECCO 2007 (the Genetic and Evolutionary Computation Conference) in London, England this July. Dr. Llorà co-authored it with NCSA Research Scientist David E. Goldberg (who also serves as Jerry S. Dobrovolny Distinguished Professor in Entrepreneurial Engineering and Director of the Illinois Genetic Algorithms Laboratory, UIUC) and Kumara Sastry (Illinois Genetic Algorithms Laboratory, UIUC). The second, “Delineating Topic and Discussant Transitions in Online Collaborative Environments,” won Best Paper for the overall conference at ICEIS 2007 (International Conference on Enterprise Information Systems) in Funchal, Madeira, Portugal this June. Noriko Imafuji Yasui, a postdoctoral fellow at the Illinois Genetic Algorithms Laboratory; NCSA Research Scientist and Professor David E. Goldberg; and marketing researchers Yuichi Washida and Hiroshi Tamura co-authored the paper.
“Toward Billion Bit Optimization via Parallel Estimation of Distribution Algorithm” takes on a major open problem in the field of genetic algorithms. These are search procedures based on the mechanics of natural selection and genetics that, since the mid-1980s, have been used increasingly to find answers to important scientific problems. Until now, genetic algorithms have been criticized as slow and suitable only for optimizing problems with a few variables. Experts have believed genetic algorithms could not scale to help solve larger, more complex problems. However, Dr. Llorà and his co-authors show that genetic algorithms, by exploiting a number of memory and computational efficiencies, can be scaled to provide principled solutions to boundedly difficult, large-scale problems with millions to billions of binary variables. Moreover, they showed that their fully parallelized, highly efficient compact genetic algorithm could do so on a class of additively separable problems even with additive noise, whereas local search methods failed in the presence of just a modest amount of noise.
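For readers unfamiliar with the technique, the following is a minimal, single-threaded sketch of a textbook compact genetic algorithm, not the parallel implementation from the paper. It assumes OneMax (count the ones) as a simple stand-in for an additively separable problem; the function names, the 32-bit problem size, and the virtual population size of 100 are illustrative choices only. The memory saving is visible in the data structure: instead of storing a population, the algorithm keeps one counter per bit.

```python
import random


def onemax(bits):
    """Toy additively separable fitness: the number of ones (an assumption)."""
    return sum(bits)


def compact_ga(n_bits=32, pop_size=100, max_steps=200_000, seed=0):
    """Textbook compact GA sketch; all parameter values are illustrative."""
    rng = random.Random(seed)
    # One integer counter per bit; counts[i] / pop_size is the probability
    # that bit i is sampled as 1. No explicit population is stored.
    counts = [pop_size // 2] * n_bits

    def sample():
        return [1 if rng.random() * pop_size < c else 0 for c in counts]

    for _ in range(max_steps):
        a, b = sample(), sample()
        winner, loser = (a, b) if onemax(a) >= onemax(b) else (b, a)
        # Shift each counter one step toward the winner where the two differ.
        for i in range(n_bits):
            if winner[i] != loser[i]:
                counts[i] += 1 if winner[i] == 1 else -1
        # Stop once every bit position has fully converged to 0 or 1.
        if all(c in (0, pop_size) for c in counts):
            break

    return [1 if 2 * c >= pop_size else 0 for c in counts]


if __name__ == "__main__":
    best = compact_ga()
    print(onemax(best), "of", len(best), "bits set")
```

Because the only state is the counter vector, memory grows with the number of variables rather than with the population size, which is one reason this family of algorithms is attractive for very large problems.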
“Delineating Topic and Discussant Transitions in Online Collaborative Environments” details a new algorithmic method for analyzing discussion dynamics and social networking in online collaborative environments (in this case, focus group discussions for product conceptualization), a relatively new and important domain of social and consumer communications research. The team developed an algorithm named KEE (Key Elements Extraction), which applies the HITS (Hyperlink-Induced Topic Search) algorithm (Kleinberg, 1999) in a way its original design did not anticipate: using it for text mining rather than for dividing web pages into hubs and authorities. The KEE algorithm assumes a mutually reinforcing relationship between participants and terms, defining significant participants as those who use many significant terms and significant terms as those used by many significant participants.
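The mutually reinforcing definition above can be illustrated with a small HITS-style iteration. This is only a sketch of the general idea, not the published KEE algorithm: the function name `kee_style_scores`, the use of unweighted term sets (ignoring how often a term is used), the normalization, and the toy discussion data are all assumptions made for illustration.

```python
import math


def _normalize(scores):
    """Scale scores to unit Euclidean length so they stay bounded."""
    norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
    return {k: v / norm for k, v in scores.items()}


def kee_style_scores(usage, iterations=50):
    """HITS-style sketch: usage maps each participant to the set of terms used."""
    participants = sorted(usage)
    terms = sorted({t for used in usage.values() for t in used})
    p_score = {p: 1.0 for p in participants}
    t_score = {t: 1.0 for t in terms}
    for _ in range(iterations):
        # A term is significant if significant participants use it.
        t_score = _normalize(
            {t: sum(p_score[p] for p in participants if t in usage[p])
             for t in terms}
        )
        # A participant is significant if they use significant terms.
        p_score = _normalize(
            {p: sum(t_score[t] for t in usage[p]) for p in participants}
        )
    return p_score, t_score


if __name__ == "__main__":
    # Hypothetical focus-group data, invented for this example.
    discussion = {
        "ann": {"battery", "screen", "price"},
        "bob": {"battery", "screen"},
        "cy": {"color"},
    }
    people, words = kee_style_scores(discussion)
    print(sorted(people, key=people.get, reverse=True))
    print(sorted(words, key=words.get, reverse=True))
```

In this toy data, a participant who shares vocabulary with others and the terms they share reinforce each other's scores, while an isolated participant and their unique term fade toward zero.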
Employing real discussion data, the team determined that the KEE algorithm provided a better understanding and depiction of participants’ ideas than the traditional TF-IDF (term frequency-inverse document frequency) method. Moreover, since key terms were associated with key persons in the discussion, the terms themselves already conveyed the potential knowledge sought. These results, in which discussion dynamics analysis and social network analysis produced significant knowledge essential to decision support, demonstrate the KEE algorithm’s effectiveness for network- and text-based communication analysis. The KEE algorithm research is associated with the Illinois Genetic Algorithms Laboratory’s DISCUS project, which targets innovation support through network-based communication, using two other chance discovery methods, KeyGraph (Ohsawa, Benson, & Yachida, 1998) and influence diffusion models (IDM) (Matsumura, Ohsawa, & Ishizuka, 2002). Next, the team plans to use the KEE algorithm for knowledge discovery in weblogs and web forums.
SEASR is indeed fortunate to have such a gifted researcher and collaborator on our team!
Bacardit, Jaume, Michael Stout, Jonathan D. Hirst, Kumara Sastry, Xavier Llorà, and Natalio Krasnogor, “Automated Alphabet Reduction Method with Evolutionary Algorithms for Protein Structure Prediction,” in Proceedings of the 2007 GECCO Conference Companion on Genetic and Evolutionary Computation, London, England, United Kingdom, ACM, 2007.
Goldberg, David E., Kumara Sastry, and Xavier Llorà, “Towards Billion Bit Optimization via Efficient Genetic Algorithms,” in Proceedings of the 2007 GECCO Conference Companion on Genetic and Evolutionary Computation, London, England, United Kingdom, ACM, 2007. Also published in Complexity, 12 (3), 27-29.
Kleinberg, Jon M. (1999), “Hubs, Authorities, and Communities,” ACM Computing Surveys, 31 (4), No. 5.
Llorà, Xavier, Rohith Reddy, Brian Matesic, and Rohit Bhargava, “Towards Better than Human Capability in Diagnosing Prostate Cancer Using Infrared Spectroscopic Imaging,” in Proceedings of the 2007 GECCO Conference Companion on Genetic and Evolutionary Computation, London, England, United Kingdom, ACM, 2007.
Matsumura, Naohiro, Yukio Ohsawa, and Mitsuru Ishizuka (2002), “Automatic Indexing for Extracting Asserted Keywords from a Document,” New Generation Computing, 21(1), 37-48.
Ohsawa, Yukio, Nels E. Benson, and Masahiko Yachida (1998), “KeyGraph: Automatic Indexing by Co-Occurrence Graph Based on Building Construction Metaphor,” ADL, 12-18.
Yasui, Noriko Imafuji, Xavier Llorà, David E. Goldberg, Yuichi Washida, and Hiroshi Tamura, “Delineating Topic and Discussant Transitions in Online Collaborative Environments,” in Proceedings of the Tenth International Conference on Enterprise Information Systems, Funchal, Madeira, Portugal, ICEIS Press, 2007.