From: NCSA Access Summer 2007 Vol. 20 No. 2 (By Trish Barker) Released: 05/31/07

URBANA, IL – The Andrew W. Mellon Foundation has awarded $1.2 million to the National Center for Supercomputing Applications (NCSA) and the Graduate School of Library and Information Science (GSLIS) at the University of Illinois at Urbana-Champaign. The grant will support the development of an environment for drawing knowledge from humanities data.

The project will address what principal investigator Michael Welge, leader of NCSA’s Data Intensive Technologies and Applications Division, calls the “80 percent problem”: 80 percent of the information needed for business and research is unstructured, meaning it’s not in easily searchable databases (think of email, text documents, and even images, audio, and video); 80 percent of the required information is “open source,” meaning it’s not proprietary or top secret; and people are spending 80 percent of their time hunting for the information they need and just 20 percent actually using it.

“There are trillions and trillions of bytes of data available, but the collections are dispersed and finding the relevant material is time consuming,” Welge says. “Someone who wants to research 19th century novels or the work of Cervantes has a wealth of information available to them, but without tools to help them they’ll spend a long time searching that haystack for their particular needle.”

The NCSA/GSLIS team will build on NCSA’s successful D2K software – which helps draw insight from structured data in a variety of research and business domains – and IBM’s Unstructured Information Management Architecture to develop a Software Environment for the Advancement of Scholarly Research (SEASR). SEASR (pronounced “Caesar”) will provide the needed bridges from unstructured data, to structured data, to knowledge. The software will help scholars find the data they need, extract the most relevant information, and analyze what is found to generate fresh insights.
“Leveraging the power of information technology for these processes will advance humanities research by increasing the quantity of evidence that researchers can explore and the variety of questions they are able to ask,” says John Unsworth, GSLIS dean and a co-principal investigator for the SEASR project.

“This project will have a broad impact on both the humanities and the social sciences because of the staggering growth in the amount of information that exists in a digital format,” said Vernon Burton, director of the Illinois Center for Computing in Humanities, Arts, and Social Science. “It is of utmost importance to have automated tools for extracting useful knowledge from vast multi-modal datasets.”

SEASR’s developers plan to make the software easy to use and modular, so that components created to address particular questions can be re-used by other researchers. “The SEASR team will accelerate the development of tools and algorithms for supporting humanities computing, allowing humanities scholars to be able to focus on their research,” says co-principal investigator Loretta Auvil, NCSA.

While the SEASR team will initially focus on the humanities, other disciplines in the sciences, engineering, and even national defense have similar needs to manage, analyze, and extract meaning from unstructured and structured data, and future efforts could extend SEASR to serve other communities.

About NCSA: NCSA™ (National Center for Supercomputing Applications) is a unique state-federal partnership to develop and deploy national-scale cyberinfrastructure that advances science and engineering. Located at the University of Illinois at Urbana-Champaign, NCSA is one of the leading National Science Foundation-supported supercomputing centers. Additional support comes from the state of Illinois, the University of Illinois, private sector partners, and other federal agencies. For more information, see http://www.ncsa.uiuc.edu/.
About GSLIS: Consistently ranked as one of the top three library and information science programs in the United States, the Graduate School of Library and Information Science, founded in 1893 at the Armour Institute in Chicago, maintains a reputation for excellence and quality.
From: The University of Illinois at Urbana-Champaign News Bureau (By Andrea Lynn, Humanities Editor) Released: 5/9/07

CHAMPAIGN, Ill. – The University of Illinois, home to one of the world’s biggest libraries, the nation’s top-ranked library and information school, a nascent Center for Computing in Humanities, Arts, and Social Sciences, a supercomputing center and key scholars, is poised to become a leader in the effort to “digitize the humanities.”

The effort involves designing and constructing research environments in which humanities scholars can use high-performance computing tools in shared digital networks to conduct research across broad swaths of literature.

In the last year, John Unsworth, the dean of Illinois’ Graduate School of Library and Information Science, has secured two major technology grants from the Mellon Foundation to lead multi-institutional projects in the digital humanities.

He also chaired the national commission that produced the recently released report, “Cyberinfrastructure for the Humanities and Social Sciences,” on behalf of the American Council of Learned Societies. In mid-April, Unsworth presented highlights of the report at a meeting of national digital centers and their sponsors in Washington, D.C.

Since becoming dean four years ago, Unsworth also has published two books on digital humanities, taught courses on humanities computing, and won the 2005 Richard W. 
Lyman Award from the National Humanities Center.

Why do scholars in the humanities need new digital technologies?

“Coordinating and optimizing the symbiosis between the computer’s mania for detail and the human’s sense of the gestalt becomes more important every day, as more and more of the cultural record becomes digital, and yet our instruments for exploring that digital cultural record remain the blunt instruments of searching and browsing,” Unsworth said.

In January, to that end, the Mellon Foundation announced that Illinois would receive a two-year $1 million grant for a text-mining collaboration called “Metadata Offer New Knowledge” (MONK).

Unsworth serves as the Illinois lead for MONK’s international and multi-institutional research team that includes participants from five other universities and the National Center for Supercomputing Applications, based at Illinois.

MONK brings together and extends two previous research projects: the Nora Project, a multi-institutional Mellon-funded endeavor for which Unsworth served as project director, and WordHoard, directed by Martin Mueller at Northwestern.

Nora and WordHoard applied similar techniques to analyze and explore digital humanities collections – 18th- and 19th-century British and American literature in Nora, and earlier texts, including Shakespeare, Chaucer and early Greek epic literature, in WordHoard. Merging Nora and WordHoard in MONK will create “an inclusive and comprehensive text-mining and text-analysis tool-kit of software for scholars in the humanities,” Unsworth said.

MONK is “an unusually large collaboration for humanities computing that brings together some of the best and the brightest in the field across North America.”

In March, Michael Welge, of NCSA, won a $1.2 million grant from the Mellon Foundation, for an infrastructure project, with Unsworth serving as one of the co-principal investigators. 
SEASR, or Software Environment for the Advancement of Scholarly Research, begins in June.

According to the project’s online report, SEASR seeks to deliver “a means of addressing the challenges of transforming information into knowledge by constructing the software bridges that are required to move from the unstructured and semi-structured data world to the structured data world.”

The aim is to make content collections more useful by integrating two research and development frameworks – NCSA’s Data-to-Knowledge (D2K) and IBM’s Unstructured Information Management Architecture – into an easy-to-use analytical platform that researchers in any discipline, but particularly the humanities, can readily learn and adapt for their own scholarly research.

Other key people in SEASR are Loretta Auvil, NCSA and U. of I., co-principal investigator; Duane Searsmith, U. of I., technical lead; Tara Bazler, Indiana University, usability evaluator; and Tim Cole, U. of I., community adviser.

According to Unsworth, SEASR links with the MONK project and “has the potential to bring MONK to bear on existing, real-world digital library collections.”

Unsworth also is co-principal investigator, with the U. of I. Library’s Beth Sandore, of a $2.6 million project, the ECHO DEPository, a digital preservation research and development project at Illinois in partnership with the Online Computer Library Center and funded by the Library of Congress.

Project partners include NCSA and WILL-AM-FM in Urbana, Ill., two other universities and state libraries in five states.

Unsworth’s interest in digital humanities preceded his move to Illinois. From 1993 to 2003 he served as the first director of the Institute for Advanced Technology in the Humanities and as a professor in the English department at the University of Virginia. 
Prior to that, he taught at North Carolina State University.

What first got a scholar of contemporary American fiction so interested in the uses of computers in the humanities?

“My interest in computers is directly traceable to procrastination,” Unsworth said.

“Specifically, while word-processing my dissertation in the late 1980s, I discovered that writing macros, or simple programs, to sort and format my bibliography, or reprogramming the splash screen in WordStar, was a great way to avoid writing chapters, or worse, to avoid revising them.”

More seriously, he said his involvement with computing became a “sustained,” rather than a “fugitive” engagement, when it “met up with my interest in publishing and scholarly communication in 1990.”

At North Carolina State, Unsworth and some junior faculty colleagues wanted to start a journal on postmodernism, but the school couldn’t cover the printing and mailing costs, “so the director of the library suggested that we visit the people in campus computing and explore a new software package called ‘Listserv,’ which is how we ended up publishing the first peer-reviewed electronic journal in the humanities, by e-mail, three years before the advent of the Web.”

Unsworth said that while there is a great deal of academic activity in advancing digital humanities development, the movement is in its infancy and barriers exist.

Funding is one problem, he said, since large-scale projects can be costly. 
“But that problem is, happily, being mitigated,” Unsworth said, “as private and government foundations are beginning to coordinate their grant-making, partly in response to the ACLS Cyberinfrastructure report.”

Another problem is the academic reward system.

“Although the field of digital humanities is respectable with deans, provosts and funding agencies, it is often still regarded with suspicion at the department level as somehow less than scholarly.” That conclusion is supported by “The Book as the Gold Standard for Tenure and Promotion in the Humanistic Disciplines,” a study put together by Leigh Estabrook, a former dean of the library school. Funding for the study was provided by the Mellon Foundation.

Unsworth said that even at Illinois, one of the most wired and digitally active campuses in the world, “junior level faculty in the humanities who have interesting ideas and good skills for mounting digital humanities projects hold off until they are tenured.”

“That’s too bad – and it should underline the need for department heads and senior faculty members to make digital humanities safe for junior faculty.”