Posted by David Tcheng on Jun 25 2008 04:46:31pm
The SEASR and NEMA (Networked Environment for Music Analysis) teams have transformed a dynamic music classification explorer developed by IMIRSEL (The International Music Information Retrieval Systems Evaluation Laboratory) into a SEASR application that can be reused in whole or part by music researchers everywhere.
Innovations in digital technologies have changed the ways we create, access, analyze, share, and consume information. But to realize their full potential, we need to re-evaluate digital information technologies to consider whether their methods are hold-outs from the age of print and, if so, what improved means we can devise. IMIRSEL’s Son of Blinkie (SoB) [1, 2], a dynamic classification explorer for musical digital library users and researchers, offers such an advance to the way in which we access and analyze music.
In print collections and their digital descendants, information is retrieved through metadata, or descriptive labels, imposed upon it by librarians, editors, and domain experts. This metadata is used to generate tables of contents, subject indexes, and other searchable formats. Once determined, such labels and their associated epistemologies tend to become fixed and accepted as fact; they present a closed system of established knowledge rather than a virtual landscape that encourages exploration and enables discovery.
In developing Son of Blinkie (affectionately named after the earlier, simpler “Blinkie Thing”), the researchers at IMIRSEL have sought to bring leading machine learning methods to bear on the problem of how to make better use of the now digital nature of music collections. They have developed a means for searching music automatically, using its features of composition rather than imposed metadata as a guide. Not only does this automated method improve the speed and accuracy of information retrieval, but it promises to enrich our understanding of music and its classification.
Faced with a collection of music, we often accept that the labels imposed by past listeners are accurate and informative. But listeners may hold conflicting opinions about a piece, and the piece itself may defy reductive labeling. By analyzing a piece using its own compositional features, machine learning can help us to understand whether a given piece is representative of a genre or mood as a whole or only of certain compositional tendencies within it, tendencies that may change over time, by performer, or even by performance. What’s more, Son of Blinkie (SoB) improves on earlier attempts to automate digital music collection retrieval and analysis.
Consider the traditional train-test approach to building, evaluating, and using machine-generated audio-based classifications (e.g., genre, mood, artist) for Music Digital Libraries (MDL). It’s useful in some contexts, but has two serious shortcomings. First, the classifications are monic (i.e., only one class label per piece). This monicity ignores the fact that most music comprises a mix of moods and/or genres. Second, the classifications are static (i.e., a single label fixed for the whole song) even though pieces evolve through several moods and/or genre mixes over their play time. The SoB system offers a new method of digital music exploration, engineered to overcome these train-test shortcomings and better capture the dynamic nature of music. SoB provides users with the capacity for highly configurable real-time classification, visualization, and audition.
Another important advancement made with SoB is that the application operates within SEASR’s service-oriented architecture, taking the form of a series of reusable, open-source components managed by and executed as a shareable workflow from SEASR’s community hub. Not only can users run SoB against their own data sets, with SEASR’s assistance in accepting different input formats stored on different platforms, but they can also reuse and revise components and workflows to build their own music research applications.
SoB works by extracting a stream of features from audio tracks and applying a set of pre-trained classification models to short windows (10 sec.) of these features to generate posterior probability distributions in real time. The display of the classification probabilities is synchronized with the audio playback, so users can dynamically explore the effects and interactions of the many parameters involved in automatic music classification. SoB permits users to select an arbitrary number of classification models from the system’s ever-growing model library. Currently SoB’s model library comprises two classification “task” collections: mood and genre classifiers.
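The windowed classification described above can be illustrated with a small Python sketch. This is not SoB's actual code: the feature vectors, the toy "models", the softmax normalization, and all names below (`classify_windows`, `toy_mood_model`, `toy_genre_model`) are assumptions made up for illustration. It only shows the general idea of cutting a feature stream into fixed-size windows and letting several models each emit a probability distribution per window.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical types: one feature vector per analysis frame, and a "model"
# that maps a summarized window to unnormalized per-class scores.
Features = Sequence[float]
Model = Callable[[Features], Dict[str, float]]


def softmax(scores: Dict[str, float]) -> Dict[str, float]:
    """Turn raw scores into a probability distribution over class labels."""
    top = max(scores.values())
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def mean_features(frames: Sequence[Features]) -> List[float]:
    """Summarize a window by averaging each feature dimension."""
    return [sum(column) / len(frames) for column in zip(*frames)]


def classify_windows(
    frames: Sequence[Features],
    frames_per_window: int,
    models: Dict[str, Model],
) -> List[Dict[str, Dict[str, float]]]:
    """Apply every model to each consecutive, non-overlapping window."""
    timeline = []
    last_start = len(frames) - frames_per_window
    for start in range(0, last_start + 1, frames_per_window):
        summary = mean_features(frames[start:start + frames_per_window])
        timeline.append(
            {name: softmax(model(summary)) for name, model in models.items()}
        )
    return timeline


# Toy stand-ins for pre-trained models (assumptions, not real classifiers).
def toy_mood_model(x: Features) -> Dict[str, float]:
    return {"calm": 1.0 - x[0], "energetic": x[0]}


def toy_genre_model(x: Features) -> Dict[str, float]:
    return {"jazz": x[1], "rock": x[0], "classical": 1.0 - x[1]}


# A made-up feature stream: a quiet passage followed by a loud one.
frames = [[0.1, 0.2]] * 10 + [[0.9, 0.8]] * 10
models = {"mood": toy_mood_model, "genre": toy_genre_model}
timeline = classify_windows(frames, frames_per_window=10, models=models)

for index, window in enumerate(timeline):
    print(index, window["mood"], window["genre"])
```

Because each window yields a full distribution per model rather than a single label, the output can change from window to window and can express a mixture of classes, which is the contrast with monic, static labeling drawn earlier.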
Check out Son of Blinkie and load a song.
We show a user simultaneously exploring the different real-time behaviors of mood classification models and genre classification models. Each model is making different predictions on this particular 5-second slice of the incoming, never-before-heard song. The user can visualize the models’ prediction probability distributions, which can help the user better appreciate the potential “mixture” of moods present. The user can also listen to the synchronized audio to better understand the strengths and weaknesses of each model.
The flow shows how data moves through the Son of Blinkie system, as it operates within SEASR (specifically, the semantic, web-driven dataflow execution environment portion of SEASR, which we have named Meandre). Each component represents one step in processing the data. The components run (and so process data) in the order established by the flow: from receiving the song filename and model filenames from the web application, to loading the audio and model data into memory, to extracting a variety of features from the song, to applying the model to the extracted features, to returning the predicted results to the SEASR community hub (a web application) for visualization. Every time a different song is selected, the web application executes this same flow.
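As a rough illustration of the ordering described above, the following Python sketch chains a few stand-in components into a flow. It does not use Meandre's API or SEASR's real components; every function name, the dictionary-based context, and the dummy data are hypothetical and serve only to show components running in a fixed order, each passing its output to the next.

```python
from typing import Any, Callable, Dict, List

# Hypothetical convention: each component takes the accumulated context
# and returns a new context with its own output added.
Context = Dict[str, Any]
Component = Callable[[Context], Context]


def load_audio(ctx: Context) -> Context:
    # Stand-in for decoding the file named in ctx["song_filename"].
    return {**ctx, "samples": [0.0, 0.5, -0.5, 1.0]}


def load_models(ctx: Context) -> Context:
    # Stand-in for reading model files; every dummy model returns a flat guess.
    models = {
        name: (lambda features: {"label_a": 0.5, "label_b": 0.5})
        for name in ctx["model_filenames"]
    }
    return {**ctx, "models": models}


def extract_features(ctx: Context) -> Context:
    samples = ctx["samples"]
    energy = sum(s * s for s in samples) / len(samples)
    return {**ctx, "features": {"energy": energy}}


def apply_models(ctx: Context) -> Context:
    predictions = {
        name: model(ctx["features"]) for name, model in ctx["models"].items()
    }
    return {**ctx, "predictions": predictions}


def build_response(ctx: Context) -> Context:
    # What would be handed back to the web application for visualization.
    response = {"song": ctx["song_filename"], "predictions": ctx["predictions"]}
    return {**ctx, "response": response}


def run_flow(components: List[Component], request: Context) -> Context:
    """Run the components in the order given, threading the context through."""
    ctx = dict(request)
    for component in components:
        ctx = component(ctx)
    return ctx


flow = [load_audio, load_models, extract_features, apply_models, build_response]
request = {
    "song_filename": "song.mp3",
    "model_filenames": ["mood.model", "genre.model"],
}
result = run_flow(flow, request)
print(result["response"])
```

Running the same `flow` list against a new `request` mirrors the point that the web application executes the same flow every time a different song is selected.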
# Funded by The Andrew W. Mellon Foundation and the National Science Foundation (Grant No. NSF IIS-0327371). Thanks to M. C. Jones and the SEASR team for their technical assistance.
# IMIRSEL is directed by Dr. J. Stephen Downie, Graduate School of Library and Information Science (GSLIS), UIUC. His Co-PIs on the Son of Blinkie system are Kris West, School of Computing Science, University of East Anglia, and Xiao Hu, GSLIS, UIUC.
# Downie, J.S., Ehmann, A.F., and Tcheng, D. 2005: Real-time genre classification for music digital libraries. JCDL’05, 337.
# NEMA Website: http://nema.lis.uiuc.edu.