Erik Hatcher, LucidWorks, erik.hatcher AT lucidworks.com
Solr is continually improving. Solr 4 was recently released, bringing dramatic changes in the underlying Lucene library and Solr-level features. It's tough for us all to keep up with the various versions and capabilities.
This talk will blaze through the highlights of new features and improvements in Solr 4 (and up). Topics will include SolrCloud, direct spell checking, the surround query parser, and much more, with a focus on the features library coders really need to know about.
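To make a couple of those features concrete, here is a minimal Ruby sketch of querying Solr 4 with the surround query parser while requesting spelling suggestions. It assumes a local Solr at localhost:8983 with a core named collection1 and a spellcheck component configured in the request handler; all of those names are assumptions, not details from the talk.

    require 'net/http'
    require 'uri'
    require 'json'

    params = {
      # Surround query parser: "apache" within 3 positions of "solr", in order.
      'defType'      => 'surround',
      'q'            => '3w(apache, solr)',
      # Ask the spellcheck component (e.g. one backed by DirectSolrSpellChecker,
      # which needs no sidecar spelling index) for suggestions.
      'spellcheck'   => 'true',
      'spellcheck.q' => 'apacke solrr',
      'wt'           => 'json'
    }
    uri = URI('http://localhost:8983/solr/collection1/select')
    uri.query = URI.encode_www_form(params)
    response = JSON.parse(Net::HTTP.get(uri))
    puts response['response']['numFound']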
Linked Open Communism: Better discovery through data dis- and re- aggregation
Corey A Harper, New York University, corey dot harper at nyu dot edu
Current library search interfaces focus on books, journals and articles but offer little access to related entities, such as people, places, and events. These entities are generally only represented as attributes of other metadata records. Linked data can power interfaces that surface these entities as first-class resources, integrating them into results alongside library materials.
This presentation will describe research into such an interface for exploring a particular subject area: the history of the Communist Party & labor movements in the US. A triple store was seeded by 1,600 EAD records from NYU's Tamiment Library and Wagner Labor Archives. Based on access points in the finding aids, the store was further populated with data from various sources, including MARC, id.loc, VIAF, and dbpedia. Identifiers are being assigned for a wide array of typed entities, and triples can then be re-assembled into new entity "records". These new records will be loaded into a discovery interface that will allow typical keyword searching across all contained entities, show links between entities, and include faceting on entity types.
It is hoped that this prototype will be a model for a new kind of interface to library, archive & museum metadata targeted to particular subject domains, and could inform the development of a similar dis- and re- aggregation approach for entire library collections.
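As a rough sketch of what that dis- and re-aggregation can look like in code, the following Ruby fragment (using the linkeddata gem) pulls triples about one person into a graph and flattens them into a new entity "record". The entity URI and the single dbpedia source are stand-ins for the project's actual MARC, id.loc, and VIAF inputs.

    require 'linkeddata'  # RDF.rb plus readers/writers for common RDF formats

    # Hypothetical entity from the subject domain; any URI with published
    # triples would do.
    person = RDF::URI('http://dbpedia.org/resource/Elizabeth_Gurley_Flynn')

    graph = RDF::Graph.new
    sources = ['http://dbpedia.org/data/Elizabeth_Gurley_Flynn.ttl']
    sources.each do |url|
      RDF::Graph.load(url).each_statement { |stmt| graph << stmt }
    end

    # Re-aggregate: collapse every triple about the entity into a simple
    # predicate => values "record" that a discovery index could consume.
    record = Hash.new { |h, k| h[k] = [] }
    graph.query([person, nil, nil]) do |stmt|
      record[stmt.predicate.to_s] << stmt.object.to_s
    end
    record.each { |predicate, values| puts "#{predicate}: #{values.join('; ')}" }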
Jessie Keck, Stanford University, jkeck at stanford dot edu
This is where Watir (pronounced "water") comes in. Watir can be used with popular Ruby testing frameworks like RSpec and Capybara. This talk will show how to use the combination of these tools to write RSpec tests that use Watir to spin up an application in a variety of browsers, navigate the application, and make assertions about the page using Capybara.
Tests using Watir are written in Ruby, but they don't necessarily need to test a Ruby application. You can test any application that you can point a browser at, so there is a wide variety of potential uses for tests written with Watir.
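A minimal sketch of the pattern (the application URL and element ids are hypothetical; assumes the watir-webdriver and rspec gems):

    require 'rspec'
    require 'watir-webdriver'

    describe 'Catalog search' do
      before(:all) { @browser = Watir::Browser.new :firefox }
      after(:all)  { @browser.close }

      it 'finds results for a known query' do
        @browser.goto 'http://localhost:3000/catalog'    # hypothetical app
        @browser.text_field(id: 'q').set 'annual report'
        @browser.button(id: 'search').click
        expect(@browser.div(id: 'documents').text).to include('annual report')
      end
    end

Swapping :firefox for :chrome (or any other supported browser) re-runs the identical spec elsewhere, which is what makes browser-driving tests like these so portable.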
Last year Declan Fleming presented ALL TEH METADATAS and reviewed our UC San Diego Library Digital Asset Management system and RDF data model. You may be shocked to hear that all that metadata wasn't quite enough to handle increasingly complex digital library and research data in an elegant way. Our ad-hoc, 8-year-old data model has been extended in inconsistent ways, and our librarians and developers have not always been perfectly in sync in understanding how it has evolved over time.
In this presentation we'll review our process of locking a team of librarians and developers in a room to figure out a new data model, from domain definition through building and testing an OWL ontology. We'll also cover the challenges we ran into, including the review of existing controlled vocabularies and ontologies (or the lack thereof) and the decisions made to cover the gaps. Finally, we'll discuss how we engaged the digital library community for feedback and what we have to do next. We all know that Things Fall Apart; this is our attempt at Doing Better This Time.
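To give a flavor of the ontology-building step, here is a small hedged sketch using RDF.rb (the rdf and rdf-turtle gems); the namespace, class, and property below are invented for illustration and are not the actual UC San Diego model:

    require 'rdf'
    require 'rdf/turtle'

    DAMS = RDF::Vocabulary.new('http://library.example.org/dams#')  # hypothetical

    graph = RDF::Graph.new
    graph << [DAMS.Collection, RDF.type,         RDF::OWL.Class]
    graph << [DAMS.Collection, RDF::RDFS.label,  'Collection']
    graph << [DAMS.hasMember,  RDF.type,         RDF::OWL.ObjectProperty]
    graph << [DAMS.hasMember,  RDF::RDFS.domain, DAMS.Collection]

    puts graph.dump(:ttl, prefixes: { dams: DAMS.to_uri })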
Richard Wolf, University of Illinois at Chicago, firstname.lastname@example.org
Mobile is the new hotness ... and you can't be one of the cool kids unless you've got your own mobile app ... but the road to mobility is daunting. I'll argue that it's actually easier than it seems ... and that the simplest way to mobility is to bring your data to the party, create a REST API around the data, tell developers about your API, and then let the magic happen. To make my argument concrete, I'll show (lord help me!) how to go from an interesting REST API to a fun iOS tool for librarians and the general public in twenty minutes.
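To make "create a REST API around the data" slightly less abstract, here is a hedged Ruby sketch using Sinatra; the dataset and routes are invented for illustration, not taken from the talk.

    require 'sinatra'
    require 'json'

    # Stand-in data; in practice this would front a real collection.
    BOOKS = [
      { id: 1, title: 'Moby-Dick',   author: 'Melville' },
      { id: 2, title: 'Middlemarch', author: 'Eliot' }
    ]

    get '/api/books' do
      content_type :json
      BOOKS.to_json
    end

    get '/api/books/:id' do
      content_type :json
      book = BOOKS.find { |b| b[:id] == params[:id].to_i }
      halt 404, { error: 'not found' }.to_json unless book
      book.to_json
    end

Any mobile client, iOS included, can then simply issue GET /api/books/1 and parse the JSON that comes back.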
Bess Sadler, Stanford University Library, email@example.com
The difference between an open source software project that gains new adopters and new contributing community members (which is to say, a project that goes on existing for any length of time) and a project that doesn't is often not a question of superior design or technology. More often, it is a question of whether the project's advocates can convince institutional leaders AND front-line developers that the project is stable and trustworthy. What are successful strategies for attracting development partners? I'll try to answer that and talk about what we could do as a community to make collaboration easier.
Shawn Averkamp, University of Iowa, shawn-averkamp at uiowa.edu
Matthew Butler, University of Iowa, matthew-butler at uiowa.edu
After a low-tech experiment in crowdsourced transcription grew into a surprisingly successful library initiative and demanded new commitments to user engagement, we found ourselves looking for a more efficient and user-friendly solution. We customized CHNM’s Scripto community transcription tool and various other Omeka plugins to develop a new site: DIYHistory.
We often receive questions about the technical side of both platforms, usually (to our dismay) from libraries who already assume they don't have the IT resources to pursue their own crowdsourcing initiatives. But we found that the software makes up only half of the recipe for success. Do you have compelling content? A long-term commitment to engaging with your users? Are you ready to promote your project far and wide? If so, then deploying a crowdsourcing initiative may be easier than you think.
Our very small development team, which consisted of a healthy mix of technologists and other stakeholders, worked closely and collaboratively on all aspects of the site. We’ll talk about customizing open-source software--how we scaled up functionality and scaled back design to improve user experience and production-level workflows--and how that process served to gently introduce collaborative software practices, such as using Git for version control, into a small but agile organization ready to grow. Finally, we'll share our transcription starter kit of forked Scripto and Omeka code and associated documentation for those interested in doing it themselves.
Hands off! Best Practices and Top Ten Lists for Code Handoffs
Naomi Dushay, Stanford University Library, firstname.lastname@example.org
Transitions in who is the primary developer on an actively developed code base can be a source of frustration for everyone involved. We've tried to minimize that pain as much as possible through agile methods like test-driven development, continuous integration, and modular design. Has optimizing for developer happiness brought us happiness? What's worked, what hasn't, and what's worth adopting? How do you keep your project in a state where you can easily hand it off?
Adam Wead, Rock and Roll Hall of Fame and Museum, email@example.com
At the Library and Archives of the Rock and Roll Hall of Fame, we use available tools such as Archivists' Toolkit to create EAD finding aids of our collections. However, managing digital content created from these materials and the born-digital content that is also part of these collections represents a significant challenge. In my presentation, I will discuss how we solve the problem of our hybrid collections by using Hydra as a digital asset manager and Blacklight as a unified presentation and discovery interface for all our materials.
Our strategy centers around indexing EAD XML into Solr as multiple documents: one for each collection, and one for every series, sub-series, and item contained within a collection. For discovery, this strategy gives us item-level searching of archival collections alongside our traditional library content. For digital collections, we use the same technique to represent a finding aid in Hydra as a set of linked objects using RDF. New digital items are then linked to these parent objects at the collection and series level. Once this is done, the items can be exported back out to the Blacklight Solr index, and the digital content appears alongside the rest of the items in the collection.
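A hedged sketch of that multi-document indexing idea (the file name, field names, and local Solr URL are assumptions; requires the nokogiri gem):

    require 'nokogiri'
    require 'json'
    require 'net/http'
    require 'uri'

    ead = Nokogiri::XML(File.read('findingaid.xml'))
    ead.remove_namespaces!  # simplifies the XPath below

    collection_id = ead.at_xpath('//eadid').text.strip
    docs = [{ id: collection_id, level: 'collection',
              title: ead.at_xpath('//archdesc/did/unittitle').text }]

    # One Solr document per component: series, sub-series, item, etc.
    ead.xpath('//*[starts-with(name(), "c0")]').each_with_index do |c, i|
      title = c.at_xpath('./did/unittitle')
      docs << { id: "#{collection_id}-#{i}",
                level: c['level'] || 'item',
                collection: collection_id,  # ties the component to its parent
                title: title ? title.text : nil }
    end

    uri = URI('http://localhost:8983/solr/update/json?commit=true')
    http = Net::HTTP.new(uri.host, uri.port)
    http.post(uri.request_uri, docs.to_json, 'Content-Type' => 'application/json')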
Citation search in Solr and second-order operators
Roman Chyla, Astrophysics Data System, roman.chyla AT (cfa.harvard.edu|gmail.com)
Citation search is basically about connections (Is a paper read by a friend of mine more important than others? Get me papers read by somebody who cites many papers, or who is cited by many.), but the implementation of citation search turns out to be surprisingly useful in many other areas.
I will show the 'guts' of the new citation search for astrophysics; it is generic and can be applied recursively to any Lucene query. Some people would call it a second-order operation, because it works with the results of a previous (search) function. The talk will cover the technical details of the special query class and its collectors, how to add a new search operator, and how to influence relevance scores. Then you can type along with me: friends_of(friends_of(cited_for(keyword:"black holes") AND keyword:"red dwarf"))
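To see why such operators compose, here is a toy Ruby model of second-order search. The citation and readership data are invented, and the real implementation lives inside Lucene as a custom query class with its own collectors, not in application code like this.

    # Toy data: papers each paper cites, and papers each reader has read.
    CITES = { 'A' => %w[B C], 'B' => %w[C], 'D' => %w[A] }
    READS = { 'alice' => %w[A B], 'bob' => %w[B D], 'carol' => %w[C] }

    # Each operator consumes the *results* of a previous query and returns
    # a new result set, so operators nest arbitrarily.
    def cited_for(ids)   # papers that cite anything in the result set
      CITES.select { |_, refs| (refs & ids).any? }.keys
    end

    def friends_of(ids)  # papers read by anyone who read a paper in the set
      readers = READS.select { |_, read| (read & ids).any? }.keys
      readers.flat_map { |r| READS[r] }.uniq
    end

    # Nested, in the shape of the query above ('&' standing in for AND):
    p friends_of(friends_of(cited_for(%w[C]) & %w[A]))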