Code4Lib 2010 Schedule
The schedule for the 2010 Code4Lib Conference in Asheville, NC.
Monday, February 22 -- Pre-Conferences
Pre-Conference day overview:
- 08:00-09:00 - Registration / coffee
- 09:00-12:00 - Morning Sessions
- 12:00-13:30 - Lunch (on your own)
- 13:30-16:30 - Afternoon Sessions
Full Day Pre-Conferences:
- OCLC Web Services and Lightning Talk Demos - Roy Tennant, Karen Coombs, and Alice Sneary
Full-day, thorough coverage of a suite of APIs and the essentials of the underlying technologies (e.g., SRU, CQL, Atom, OpenSearch) to get you going right away. Handouts outlining the essential information about each service covered will be distributed. There will also be time for you to show off what you've done to mash up library data in the past (not limited to OCLC services) in a 5-10 minute presentation to help inspire ideas.
- Koha - Brendan Gallagher and Ian Walls
Working on Koha bugs and enhancements, discussing best practices to solve common workflow and technical issues, developing helper scripts for data migration, connection to external systems, etc.
- Serials Solutions - Andrew Nagy and Harry Kaplanian
Connecting Serials Solutions Fed Search, Link resolver or Summon to almost anything (such as OCLC, ILS systems, Drupal, whatever). Additionally, participants will have full access to the Summon API - providing search access to over 500 million academic documents. Serials Solutions staff will be on hand to answer questions and demonstrate various technologies.
Morning Pre-Conferences (09:00 - 12:00):
- Solr White Belt - Bess Sadler
The journey of Solr mastery begins with installation. We will then proceed to data types, indexing, querying, and inner harmony. You will leave this session with enough information to start running a Solr service with your own data.
- Solr Black Belt - Erik Hatcher and Naomi Dushay
Amaze your friends with your ability to combine boolean and weighted searching. Confound your enemies with your mastery of the secrets of dismax. Leave slow queries in the dust as you performance-tune Solr within an inch of its life.
- Hacker 101/102 - Dan Chudnov
Are you an accidental hacker? Recent MLS grad who got handed all the web scripting duties because "those young people know all that stuff now"? Slid over from reference, tech services, or archives into "doing the tech stuff" because nobody else would? Humanities major who found a good job in the library but never wrote a lick of code before? Sole hacker in a sea of non-technical folks who never has anyone to look to for mentoring? Learned a lot on your own, but you still sometimes "just get stuck" at the same places in your code? If these descriptions sound like you, and you're coming to code4lib to build a better foundation of programming know-how for yourself, or you've already got some basics down and need to fill in some big gaps, this is the preconference for you. We'll cover some basic tenets of how code works and work through some fun, informative examples together. We'll also leave plenty of time for questions and will gear the session to what the people who attend want to learn. Bring a laptop - everyone will write some new code in this session.
- Web Services & Widgets - Godmar Back and Annette Bailey
Afternoon Pre-Conferences (13:30 - 16:30):
- Blacklight - Naomi Dushay, Jessie Keck, and Bess Sadler
Apply your Solr skills to running Blacklight as a front end for your library catalog, institutional repository, or anything you can index into Solr. We'll cover installation, source control with git, local modifications, test-driven development, and writing object-specific behaviors. You'll leave this workshop ready to revolutionize discovery at your library. Solr white belts or black belts are welcome.
- Evergreen - Bill Erickson
Update on development, plans for future development, clearing paths for community involvement, documentation project, what's up with acquisitions, general Q & A, and more. As code4lib heads South again, Evergreen developers, users, and advocates will be there to meet it. The discussion will be a mix of general and technical.
- Hacker 201/202 - Dan Chudnov
Some talk descriptions were abbreviated for length. Complete descriptions of these talks can be found on the talk submissions page.
Tuesday, February 23
- 08:00-09:00 - Registration / Breakfast
- 09:00-09:15 - Welcome / Orientation / Housekeeping
- 09:15-10:00 - Keynote #1: Cathy Marshall [Video] [Page]
- 10:00-10:20 - Cloud4Lib - Jeremy Frumkin and Terry Reese [Video] [Page]
Major library vendors are creating proprietary platforms for libraries. We will propose that the code4lib community pursue the cloud4lib, an open digital library platform based on open source software and open services. This platform would provide common service layers for libraries, not only via code, but also by allowing libraries to easily utilize tools and systems through cloud services. Instead of a variety of competing cloud services and proprietary platforms, cloud4lib will attempt to be a unifying force that will allow libraries to be consumers of the services built on top of it as well as allow developers / researchers / code4lib'ers to hack, extend, and enhance the platform as it matures.
- 10:20-10:40 - Break
- 10:40-11:00 - The Linked Library Data Cloud: Stop talking and start doing - Ross Singer [Video] [Page]
A year later, how far has Linked Library Data come? With the emergence of large, centralized sources (id.loc.gov/authorities/, viaf.org, among others), entry to the Linked Data cloud might be easier than you think. This presentation will describe various projects out in the wild that can bridge the gap between our legacy data and the semantic web, incremental steps we can take in modeling our data, and why linked data matters, and will demonstrate how small template changes can contribute to the Linked Data cloud.
- 11:00-11:20 - Do It Yourself Cloud Computing with Apache and R - Harrison Dekker [Video] [Page]
R is a popular, powerful, and extensible open source statistical analysis application. rApache, software developed at Vanderbilt University, allows web developers to leverage the data analysis and visualization capabilities of R in real time through simple Apache server requests. This presentation will provide an overview of both R and rApache and will explore how these tools might be used to develop applications for the library community.
- 11:20-11:40 - Public Datasets in the Cloud - Rosalyn Metz and Michael B. Klein [Video] [Page]
When most people think about cloud computing (if they think about it at all), it usually takes one of two forms: Infrastructure Services, such as Amazon EC2 and GoGrid, which provide raw, elastic computing capacity in the form of virtual servers, and Platform Services, such as Google App Engine and Heroku, which provide preconfigured application stacks and specialized deployment tools. Several providers, however, offer access to large public datasets that would be impractical for most organizations to download and work with locally. From a 67-gigabyte dump of DBpedia's structured information store to the 180-gigabyte snapshot of astronomical data from the Sloan Digital Sky Survey, and from chemistry and biology to economic and geographic data, these datasets are available instantly and backed by enough pay-as-you-go server capacity to make good use of them. We will present an overview of currently available datasets, describe what it takes to create and use snapshots of the data, and explore how the library community might push some of its own large stores of data and metadata into the cloud.
- 11:40-12:00 - 7 Ways to Enhance Library Interfaces with OCLC Web Services - Karen A. Coombs [Video] [Page]
- 12:00-13:00 - Lunch (provided)
- 13:00-13:20 - Taking Control of Library Metadata and Websites Using the eXtensible Catalog - Jennifer Bowen [Video] [Page]
The eXtensible Catalog Project has developed four open-source software toolkits that enable libraries to build and share their own web- and metadata-focused applications on top of a service-oriented architecture that incorporates Solr in Drupal, a robust metadata management platform, and OAI-PMH and NCIP-compatible tools that interact with legacy library systems in real-time. This presentation will showcase XC's metadata processing services, the metadata "navigator" and the Drupal user interface platform. The presentation will also describe how libraries and their developers can get started using and contributing to the XC code.
- 13:20-13:40 - Matching Dirty Data – Yet Another Wheel - Anjanette Young and Jeff Sherwood [Video] [Page]
This talk demonstrates one method of matching sets of MARC records that lack common unique identifiers and might contain slight differences in the matching fields. It will cover basic usage of several Python tools. No large stack traces, just the comfort of pure Python and basic computational algorithms in a step-by-step presentation on dealing with an old library task: matching dirty data. While much literature exists on matching/merging duplicate bibliographic records, most of it does not specify how to accomplish the task; it only reports on the efficiency of the tools used, often within a larger system such as an ILS.
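As a rough illustration of the kind of matching described above (not the presenters' code), a minimal sketch using only Python's standard library might normalize titles and compare them with `difflib.SequenceMatcher`. The record dictionaries, the `"title"` key, and the 0.9 threshold are assumptions made for this example; real MARC records would first need their fields extracted.

```python
import difflib
import re


def normalize(text):
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    text = re.sub(r"[^\w\s]", " ", text.lower())
    return " ".join(text.split())


def similarity(a, b):
    """Return a ratio in [0, 1] for two strings after normalization."""
    return difflib.SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def match_records(left, right, threshold=0.9):
    """Pair each left record with its best-scoring right record.

    Records are plain dicts with a hypothetical "title" key; pairs that
    score below the (assumed) threshold are dropped.
    """
    matches = []
    for rec in left:
        best = max(
            right,
            key=lambda cand: similarity(rec["title"], cand["title"]),
            default=None,
        )
        if best is not None and similarity(rec["title"], best["title"]) >= threshold:
            matches.append((rec, best))
    return matches
```

Comparing every record against every other record is quadratic, so a larger data set would likely need some form of blocking (for example, comparing only records that share a publication year) before scoring.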
- 13:40-14:00 - HIVE: A New Tool for Working With Vocabularies - Ryan Scherle and Jose Aguera [Video] [Page]
HIVE is a toolkit that assists users in selecting vocabulary and ontology terms to annotate digital content. HIVE combines the ease of folksonomies with the rigor of traditional vocabularies. By combining semantic web standards with text mining techniques, HIVE will improve the effectiveness of subject metadata generation, allowing users to search and browse terms from a variety of vocabularies and ontologies. Documents can be submitted to HIVE to automatically generate suggested vocabulary terms. Your system can interact with common vocabularies such as LCSH and MeSH via the central HIVE server, or you can install a local copy of HIVE with your own custom set of vocabularies. This talk will give an overview of the current features of HIVE and describe how to build tools that use the HIVE services.
- 14:00-14:20 - Metadata Editing – A Truly Extensible Solution - David Kennedy and David Chandek-Stark [Video] [Page]
We set out in the Trident project to create a metadata tool that scales. In doing so we have conceived of the metadata application profile, a profile which provides instructions for software on how to edit metadata. We have built a set of web services and some web-based tools for editing metadata. The metadata application profile allows these tools to extend across different metadata schemes, and allows for different rules to be established for editing items of different collections. Some features of the tools include integration with authority lists, auto-complete fields, validation and clean integration of batch editing with Excel. I know, I know, Excel, but in the right hands, this is a powerful tool for cleanup and batch editing. In this talk, we want to introduce the concepts of the metadata application profile, and gather feedback on its merits, as well as demonstrate some of the tools we have developed and how they work together to manage the metadata in our Fedora repository.
- 14:20-14:40 - Break
- 14:40-15:50 - Lightning Talks 1
- 15:50-17:00 - Breakout Sessions 1
- 17:00-17:15 - Daily Wrap Up (include breakout reports?)
Wednesday, February 24
- 08:00-09:00 - Breakfast
- 09:00-09:15 - Housekeeping, Intros
- 09:15-09:35 - Iterative Development Done Simply - Emily Lynema [Video] [Page]
With a small IT unit and a wide array of projects to support, requests for development from business stakeholders in the library can quickly spiral out of control. To help make sense of the chaos, increase the transparency of the IT "black box," and shorten the time lag between requirements definition and functional releases, we have implemented a modified Agile/Scrum methodology within the development group in the IT department at NCSU Libraries. This presentation will provide a brief overview of the Agile methodology as an introduction to our simplified approach to iteratively handling multiple projects across a small team. This iterative approach allows us to regularly re-evaluate requested enhancements against institutional priorities and more accurately estimate timelines for specific units of functionality. The presentation will highlight how we approach each development cycle (from planning to estimating to re-aligning) as well as some of the actual tools and techniques we use to manage work (like JIRA and GreenHopper). It will identify some challenges faced in applying an established development methodology to a small team of multi-tasking developers, the outcomes we've seen, and the areas we'd like to continue improving. These types of iterative planning/development techniques could be adapted by even a single developer to help manage a chaotic workplace.
- 09:35-09:55 - Vampires vs. Werewolves: Ending the War Between Developers and Sysadmins with Puppet - Bess Sadler [Video] [Page]
Developers need to be able to write software and deploy it, and often require cutting-edge software tools and system libraries. Sysadmins are charged with maintaining stability in the production environment, and so are often resistant to rapid upgrade cycles. This has traditionally pitted us against each other, but it doesn't have to be that way. Using tools like Puppet for maintaining and testing server configuration, Nagios for monitoring, and Hudson for continuous code integration, UVA has brokered a peace that has given us the ability to maintain a stable production environment with a rapid upgrade cycle. I'll discuss the individual tools, our server configuration, and the social engineering that got us here.
- 09:55-10:15 - I Am Not Your Mother: Write Your Test Code - Naomi Dushay, Willy Mene, and Jessie Keck [Video] [Page]
Is it worth slowing down your code development to write tests? Won't it take you a long time to learn how to write tests? Won't it take longer if you have to write tests AND develop new features and fix bugs? Isn't it hard to write test code? To maintain test code? We will address these questions as we talk about how test code is crucial for our software. By way of illustration, we will show how it has played a vital role in making Blacklight a true community collaboration, as well as how it has positively impacted coding projects in the Stanford Libraries.
- 10:15-10:35 - Break
- 10:35-10:55 - Media, Blacklight, and Viewers Like You (pdf, 2.61MB) - Chris Beer [Video] [Page]
There are many shared problems (and solutions) for libraries and archives in the interest of helping the user. There are also many "new" developments in the archives world that the library communities have been working on for ages, including item-level cataloging, metadata standards, and asset management. Even with these similarities, media archives have additional issues that are less relevant to libraries: the choice of video players, large file sizes, proprietary file formats, challenges of time-based media, etc. In developing a web presence, many archives, including the WGBH Media Library and Archives, have created custom digital library applications to expose material online. In 2008, we began a prototyping phase for developing scholarly interfaces by creating a custom-written PHP front-end to our Fedora repository. In late 2009, we finally saw the (black)light, and after some initial experimentation, decided to build a new, public website to support our IMLS-funded Vietnam: A Television History archive (as well as existing legacy content). In this session, we will share our experience of and challenges with customizing Blacklight as an archival interface, including work in rights management, how we integrated existing Ruby on Rails user-generated content plugins, and the development of media components to support a rich user experience.
- 10:55-11:15 - Becoming Truly Innovative: Migrating from Millennium to Koha - Ian Walls [Video] [Page]
On Sept. 1st, 2009, the NYU Health Sciences Libraries made the unprecedented move from their Millennium ILS to Koha. The migration was done over the course of 3 months, without assistance from either Innovative Interfaces, Inc. or any Koha vendor. The in-house script, written in Perl and XSLT, can be used with any Millennium installation, regardless of which modules have been purchased, and can be adapted to work for migration to systems other than Koha. Helper scripts were also developed to capture the current circulation state (checkouts, holds, and fines) and do minor data cleanup. This presentation will cover the planning and scheduling of the migration, as well as an overview of the code that was written for it. Opportunities for systems integration and development made newly available by having an open source platform will also be discussed.
- 11:15-12:00 - Ask Anything! – Facilitated by Dan Chudnov [Video] [Page]
a.k.a. "Human Search Engine". A chance for you to ask a roomful of code4libbers anything that's on your mind: questions seeking answers (short or long), requests for things (hardware, software, skills, or help), or offers of things. We'll keep the pace fast, and the answers faster. Come with questions and line up at the start of the session and we'll go through as many as we can; sometimes we'll stop at finding the right person or people to answer a query and it'll be up to you to find each other after the session. First time at code4libcon! (Thanks to Ka-Ping Yee for the inspiration/explanation, reused here in part.)
- 12:00-13:00 - Lunch (provided)
- 13:00-13:20 - A Better Advanced Search - Naomi Dushay and Jessie Keck [Video] [Page]
Even though we'd love to get basic searches working so well that advanced search wouldn't be necessary, there will always be a small set of users who want it, and there will always be some library searching needs that basic searching can't serve. Our user interface designer was dissatisfied with many aspects of advanced search as currently available in most library discovery software; the form she designed was excellent but challenging to implement (see http://searchworks.stanford.edu/advanced). We'll share details of how we implemented Advanced Search in Blacklight.
- 13:20-13:40 - Drupal 7: A more powerful platform for building library applications - Cary Gordon, The Cherry Hill Company [Video] [Page]
The release of Drupal 7 brings with it a big increase in utility for this already very useful and well-accepted content management framework. Specifically, the addition of fields in core, the inclusion of RDFa, the use of the PHP_db abstraction layer, and the promotion of files to first class objects facilitate the development of richer applications directly in Drupal without the need to integrate external products.
- Kill the Search Button - Michael Poltorak Nielsen and Jørn Thøgersen
We demo three concepts that eliminate the search button:
1. Instant search. Why wait for tiresome page reloads when searching? Instant search updates the search result on every key-press. We will show how we integrated this feature into our own library search system with minimal changes to the existing setup.
2. Index lookup. Ever dreamed of your own inline instant index lookup? We demo an instant index lookup feature that requires no search button and no page refreshes - and without ever leaving the search field.
3. Slide your data. Sliders are an alternative way to fit search results to the user's search context. Examples are sliders that move search results priorities between title and subject and between books by an author and books about the author.
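To make the index lookup idea above a little more concrete, here is a minimal server-side sketch of a prefix lookup over a sorted term list, the kind of call a search field could make on each key-press. It is an illustration only, not the presenters' implementation; the function names, the in-memory list, and the default limit of five suggestions are assumptions.

```python
import bisect


def build_index(terms):
    """Return the terms lowercased, de-duplicated, and sorted for binary search."""
    return sorted({term.lower() for term in terms})


def lookup(index, prefix, limit=5):
    """Return up to `limit` index entries that start with `prefix`."""
    prefix = prefix.lower()
    # Binary search finds the first entry that is >= the prefix.
    start = bisect.bisect_left(index, prefix)
    results = []
    for term in index[start:start + limit]:
        if not term.startswith(prefix):
            break
        results.append(term)
    return results
```

Because the list is sorted, every entry sharing a prefix sits in one contiguous run, so each key-press costs a binary search plus a short scan rather than a pass over the whole index.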
- 13:40-14:00 - Enhancing Discoverability With Virtual Shelf Browse (3.65 MB ppt) - Andreas Orphanides, Cory Lown, and Emily Lynema [Video] [Page]
With collections turning digital, and libraries transforming into collaborative spaces, the physical shelf is disappearing. NCSU Libraries has implemented a virtual shelf browse tool, re-creating the benefits of physical browsing in an online environment and enabling users to explore digital and physical materials side by side. We hope that this is a first step towards enabling patrons familiar with Amazon and Netflix recommendations to "find more" in the library. We will provide an overview of the architecture of the front-end application, which uses Syndetics cover images to provide a "cover flow" view and allows the entire "shelf" to be browsed dynamically. We will describe what we learned while wrangling multiple jQuery plugins, manipulating an ever-growing (and ever-slower) DOM, and dealing with unpredictable response times of third-party services. The front-end application is supported by a web service that provides access to a shelf-ordered index of our catalog. We will discuss our strategy for extracting data from the catalog, processing it, and storing it to create a queryable shelf order index.
- 14:00-14:20 - How to Implement A Virtual Bookshelf With Solr - Naomi Dushay and Jessie Keck [Video] [Page]
Browsing bookshelves has long been a useful research technique as well as an activity many users enjoy. As larger and larger portions of our physical library materials migrate to offsite storage, having a browse-able virtual shelf organized by call number is a much-desired feature. I will talk about how we implemented nearby-on-shelf in Blacklight at Stanford, using Solr and SolrMarc.
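One ingredient of any call-number browse is a sort key that orders items the way they sit on the shelf, since plain string sorting puts "QA76" before "QA9". The sketch below is a deliberately simplified illustration for LC-style call numbers and is not the Stanford implementation; the regular expression, the function name, and the choice to compare everything after the class number as plain text are assumptions.

```python
import re

# Class letters (1-3), an optional decimal class number, then the remainder.
_CALL_NUMBER = re.compile(r"^\s*([A-Za-z]{1,3})\s*(\d+(?:\.\d+)?)?\s*(.*)$")


def shelf_key(call_number):
    """Return a tuple that sorts LC-style call numbers in rough shelf order."""
    match = _CALL_NUMBER.match(call_number)
    if match is None:
        # Unparseable values sort first, ordered by their raw text.
        return ("", 0.0, call_number.upper())
    letters, number, rest = match.groups()
    return (
        letters.upper(),
        float(number) if number else 0.0,  # class numbers compare numerically
        " ".join(rest.upper().split()),    # cutters and dates compared as text
    )


shelf = sorted(
    ["QA76.9 .D3", "QA9 .B6", "Q175 .K95", "QA76.73 .P98"],
    key=shelf_key,
)
```

A key like this could be computed at index time and stored in a sortable field, so that "nearby on shelf" becomes a range query on either side of the current item's key. Cutter numbers are treated as plain text here, which is only an approximation of true shelf order.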
- 14:20-14:40 - Break
- 14:40-15:50 - Lightning Talks 2
- 15:50-17:00 - Breakout Sessions 2 - Sign up on the wiki
- 17:00-17:15 - Daily Wrap Up (include breakout reports?)
Thursday, February 25
- 08:00-09:00 - Breakfast
- 09:00-09:15 - Housekeeping
- 09:15-10:00 - Keynote #2: catfish, cthulhu, code, clouds and Levenshtein distance - Paul Jones [Video] [Page]
- 10:00-10:15 - Break
- 10:15-11:00 - Lightning Talks 3
- 11:00-11:20 - You Either Surf or You Fight: Integrating Library Services With Google Wave - Sean Hannan [Page]
So Google Wave is a new shiny web toy, but did you know that it's also a great platform for collaboration and research? (I bet you did.) ...And what platform for collaboration and research would be complete without some library tools to aid and abet that process? I will talk about how to take your library web services and integrate them with Google Wave to create bots that users can interact with to get at your resources as part of their social and collaborative work.
- 11:20-11:40 - library/mobile: Developing a Mobile Catalog - Kim Griggs [Video] [Page]
The increased use of mobile devices provides an untapped resource for delivering library resources to patrons. The mobile catalog is the next step for libraries in providing universal access to resources and information. This talk will share Oregon State University (OSU) Libraries' experience creating a custom mobile catalog. The discussion will first make the case for mobile catalogs, discuss the context of mobile search, and give an overview of vendor and custom mobile catalogs. The second half of the talk will look under the hood of OSU Libraries' custom mobile catalog to provide implementation strategies and discuss tools, techniques, requirements, and guidelines for creating an optimal mobile catalog experience that offers services that support time critical and location sensitive activities.
- 11:40-12:00 - Mobile Web App Design: Getting Started (8.5 MB ppt) - Michael Doran [Video] [Page]
Creating or adapting library web applications for mobile devices such as the iPhone, Android, and Palm Pre is not hard, but it does require learning some new tools, new techniques, and new approaches. From the Tao of mobile web app design to using mobile device SDKs for their emulators, this presentation will give you a jump-start on mobile cross-platform design, development, and testing. And all illustrated with a real-world mobile library web application.
- 12:00-12:15 - Wrap-Up