

You know, when people get together and talk about stuff.

Library Text Mining

Rob Sanderson

Using the TeraGrid [1] and the SRB DataGrid [2], we have sufficient
computational and storage facilities to run processing tasks that are
normally prohibitively expensive. By integrating text and data mining
tools [3][4] within the Cheshire3 [5] information architecture, we can
parse the natural language present in 20 million MARC records (the
University of California’s MELVYL collection) and extract information to
provide to search/retrieve applications. In this talk, we’ll discuss
the results of applying new techniques to ‘old’ data.

Standards, Reusability, and the Mating Habits of Learning Content

Robby Robson

Digital libraries are supposed to foster reuse of digital content, but it is hard to combine content from different sources. We are building prototype software that (1) converts different types of courseware to an XML interchange format based on OpenDocument and other specs/standards, (2) enables the content to be disaggregated, recombined, re-styled and endowed with SCORM reporting behaviors, and (3) realizes instructional design through the use of SCORM (or IMS) Simple Sequencing. I will demo and discuss the software, and am happy to talk about the bigger picture of reusability in educational digital libraries and standards if given a longer slot.

Anatomy of aDORe

Ryan Chute

The aDORe Archive is a write-once/read-many storage approach for Digital Objects and their constituent datastreams. First, XML-based representations of multiple Digital Objects are concatenated into a single, valid XML file named an XMLtape. Second, ARC files, as introduced by the Internet Archive, are used to contain the constituent datastreams of the Digital Objects. The software was developed by the LANL Digital Library Research & Prototyping Team and is available under the GNU LGPL license.
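The first step, packing several XML representations into one XML file, can be pictured with a small sketch. This is only an illustration of the general idea and not the actual XMLtape format: the `tape` and `record` element names, the `id` attribute, and the `build_tape` function are all hypothetical, and the check at the end confirms well-formedness rather than validity against any schema.

```python
import xml.etree.ElementTree as ET


def build_tape(records):
    """Wrap several serialized XML documents in one root element.

    `records` maps an identifier to an XML string. The element names
    used here ("tape", "record") are made up for illustration and are
    NOT the real XMLtape vocabulary.
    """
    tape = ET.Element("tape")
    for identifier, xml_text in records.items():
        # Each object gets its own wrapper so it can be found by id later.
        record = ET.SubElement(tape, "record", {"id": identifier})
        record.append(ET.fromstring(xml_text))
    return ET.tostring(tape, encoding="unicode")


# Two toy "Digital Object" representations (hypothetical content).
records = {
    "obj-1": "<object><title>First</title></object>",
    "obj-2": "<object><title>Second</title></object>",
}

tape_xml = build_tape(records)

# Re-parsing succeeds only if the combined file is well-formed XML.
parsed = ET.fromstring(tape_xml)
second_title = parsed.find("record[@id='obj-2']/object/title").text
```

Wrapping each object in its own element keeps the combined file parseable as a single document while still letting individual objects be looked up by identifier.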

Ryan Chute
Los Alamos National Laboratory, R

Quality Metrics

Aaron Krowne

This talk will discuss the core development activities of the “Quality
Metrics” project at Emory’s Woodruff Library. This project is being
conducted under an IMLS grant to research requirements for and build
a working prototype digital library search system.

What is new in this project is that it truly generalizes and
integrates explicit and latent quality indicators, which allow
users to ascertain the fitness of digital library resources. Most
search engine components have only one indicator: content-query

Connecting Everything with unAPI and OPA

Dan Chudnov

unAPI is a simple-to-use, simple-to-implement API for web sites that allows rich object access and can be easily layered over existing services like Atom, OpenSearch, OAI-PMH, or SRU. OPA is a general-purpose identifier resolver that wraps API calls to heavily-used but incompatible web services like those from Amazon, Flickr, and PubMed.

Together they will do the same thing we do every code4libcon – try to take over the world!
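To give a feel for how small an unAPI-style interface can be, here is a minimal sketch of building request URLs. It assumes the `id` and `format` query parameters as they appear in the published unAPI specification, which may differ from the draft discussed in this talk; the base URL, the identifier, and the `unapi_url` helper are hypothetical.

```python
from urllib.parse import urlencode


def unapi_url(base, identifier=None, fmt=None):
    """Build a URL for one of three unAPI-style requests.

    - no arguments:        list the formats the service supports
    - identifier only:     list the formats available for that object
    - identifier + format: fetch the object in that format

    Parameter names ("id", "format") are assumed from the published
    unAPI spec; `base` is a made-up endpoint.
    """
    if fmt is not None and identifier is None:
        raise ValueError("a format request needs an identifier")
    params = {}
    if identifier is not None:
        params["id"] = identifier
    if fmt is not None:
        params["format"] = fmt
    return f"{base}?{urlencode(params)}" if params else base


BASE = "http://example.org/unapi"  # hypothetical endpoint

list_formats = unapi_url(BASE)
object_formats = unapi_url(BASE, "oai:example.org:1")
fetch_object = unapi_url(BASE, "oai:example.org:1", "mods")
```

Because every request is a plain GET with at most two parameters, a site could answer these URLs by delegating to a service it already runs, such as an OAI-PMH or SRU endpoint.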

[Update 2006-02-28: Slides are he

What Blog Applications Can Teach Us About Library Software Architecture

Casey Bisson

The number of programmers in the library world is growing and our individual efforts have shown great promise, but they exist largely as a spectacle that few libraries can enjoy. We need better means to aggregate our efforts and share solutions that can be employed by libraries without programming staff.

Looking outside libraries, we see some interesting examples in the blog world. The blog world is growing with new bloggers every day, but the most interesting aspect is how many people with limited technical skills are using (maintaining and configuring) blog applications like WordPress or Movable Type, and how quickly the contributions of the many plugin and theme developers are implemented on those blogs. What lessons can we learn from this, and how might a library application built from those lessons work? Are some software architectures better than others at leveraging the network effects of the growing number of developers in our community?

Voting on Code4Lib 2006 Presentation Proposals

Vote for the Code4lib 2006 presentations!

Please log in to participate in voting!

You may choose up to 11 proposals.

Voting closes on January 9th at 11PM EST.

The 11 proposals with the most votes win. In case of a tie, we will have a "run off" election tomorrow, January 10th, from 5PM to 11PM EST.

I will be deleting all votes cast before 5PM EST unless you specifically tell me that you have to vote early. So, be sure to tell me. Seriously. Send an email to

Happy voting!

code4lib card

If you have a blog please consider adding a playing card to your site:


You can use the following HTML:

<a href="">
<img src="" />
</a>

Feel free to use the images directly from if you don't want to go through the effort of grabbing them. This will allow us to secretly discover your site by trawling our logs.


Registration is Open

Registration for Code4lib 2006 is OPEN. Register early for a discount. Don't hesitate, or wait, or be late... register today!

Code4lib 2006 Call For Proposals

Call for proposals - Code4lib 2006

We are now accepting proposals for prepared talks for Code4lib 2006. Code4lib 2006 is a loosely structured conference for library technologists to commune, gather/create/share ideas and software, be inspired, and forge collaborations. It is also an outgrowth of the Access HackFest, wrapped into a conference-ish format. It is *the* event for technologists building digital libraries and digital information systems, tools, and software.

At least six time slots will be available for prepared talks. We will choose from among the proposals based on diversity of topics, usefulness, wow factor, and potential impact.

