Letting In the Light: Using Solr as an External Search Component

  • Jay Luker, IT Specialist, ADS, jluker@cfa.harvard.edu
  • Benoit Thiell, Software Developer, ADS, bthiell@cfa.harvard.edu

Code4Lib 2011, Tuesday 8 February, 14:30 - 14:50

It’s well established that Solr provides an excellent foundation for building a faceted search engine. But what if your application’s foundation has already been constructed? How do you add Solr as a federated, fulltext search component to an existing system that already provides a full set of well-crafted scoring and ranking mechanisms?

This talk will describe a work-in-progress project at the Smithsonian/NASA Astrophysics Data System to migrate its aging search platform to Invenio, an open-source institutional repository and digital library system originally developed at CERN, while at the same time incorporating Solr as an external component for both faceting and fulltext search.

In this presentation we’ll start with a short introduction to Invenio and then move on to the good stuff: an in-depth exploration of our use of Solr. We’ll explain the challenges that we faced, what we learned about some particular Solr internals, interesting paths we chose not to follow, and the solutions we finally developed, including the creation of custom Solr request handlers and query parser classes.

This presentation will be quite technical and will show a measure of horrible Java code. Benoit will probably run away during that part.