DPLA: Board Governance Committee Executive Session: November 12, 2014

planet code4lib - Sat, 2014-11-08 18:46

The DPLA Board of Directors’ Governance Committee will hold an executive session conference call on Wednesday, November 12, 2014. The purpose of the call is to discuss a slate of Board nominees and to come up with next steps.

Pursuant to DPLA’s open meetings guidelines, this call will be closed to the public, as the Board Governance subcommittee will deliberate and consider matters of personnel. The agenda is available below.

For more information about DPLA’s open calls, including a schedule of upcoming calls, click here.

  • Discussion of nominees
  • Discussion of timeline and next steps


Dan Scott: How discovery layers have closed off access to library resources, and other tales from LITA Forum 2014

planet code4lib - Sat, 2014-11-08 16:41

At the LITA Forum yesterday, I accused (presentation) most discovery layers of not solving the discoverability problems of libraries, but instead exacerbating them by launching us headlong into a closed, unlinkable world. Coincidentally, Lorcan Dempsey's opening keynote contained a subtle criticism of discovery layers. I wasn't that subtle.

Here's why I believe commercial discovery layers are not "of the web": check out their robots.txt files. If you're not familiar with robots.txt files, these are what search engines and other well-behaved automated crawlers of web resources use to determine whether they are allowed to visit and index the content of pages on a site. Here's what the robots.txt files look like for a few of the best-known discovery layers:

User-agent: *
Disallow: /

That effectively says "Go away, machines; your kind isn't wanted in these parts." And that, in turn, closes off access to your library's resources for search engines and other aggregators of content, and is completely counter to the overarching desire to evolve to a linked open data world.
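You can verify what that policy means to a crawler with Python's standard-library robots.txt parser. This is just a sketch of the check; the hostname and record path are hypothetical, and a real crawler would fetch the file from the site's /robots.txt:

```python
import urllib.robotparser

# Parse the blanket-deny policy quoted above.
rp = urllib.robotparser.RobotFileParser()
rp.parse([
    "User-agent: *",
    "Disallow: /",
])

# Any well-behaved crawler asking about any page gets the same answer: no.
allowed = rp.can_fetch("Googlebot", "https://discovery.example.edu/record/123")
print(allowed)  # False for every user agent and every path
```

Every record, every search result, every page: invisible to any crawler that honors the file.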

During the question period, Marshall Breeding challenged my assertion as being unfair to what are meant to be merely indexes of library content. I responded that most libraries have replaced their catalogues with discovery layers, closing off open access to what have traditionally been their core resources, and he rather quickly acquiesced that that was indeed a problem.

(By the way, a possible solution might be to simply offer two different URL patterns, something like /library/* for library-owned resources to which access should be granted, and /licensed/* for resources to which open access to the metadata is problematic due to licensing issues, and which robots can therefore be restricted from accessing.)
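That split could be expressed in a robots.txt along these lines (the /library/ and /licensed/ prefixes are the hypothetical patterns from the paragraph above; anything not disallowed is crawlable by default):

```
# Hypothetical two-pattern policy: open metadata stays open,
# licensed metadata stays fenced off.
User-agent: *
Disallow: /licensed/
```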

Compared to commercial discovery layers on my very handwavy usability vs. discoverability plot, general search engines rank pretty high on both axes; they're the ready-at-hand tool in browser address bars. And they grok schema.org, so if we can improve our discoverability by publishing data, maybe we get a discoverability win for our users.

But even if we don't (SEO is a black art at best, and maybe the general search engines won't find the right mix of signals that makes them decide to boost the relevancy of our resources for specific users in specific locations at specific times) we get access to that structured data across systems in an extremely reusable way. With sitemaps, we can build our own specialized search engines (Solr or ElasticSearch or Google Custom Search Engine or whatever) that represent specific use cases. Our more sophisticated users can piece together data to, for example, build dynamic lists of collections, using a common, well-documented vocabulary and tools rather than having to dip into the arcane world of library standards (Z39.50 and MARC21).
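The sitemap half of that is cheap to build. A minimal sketch, assuming you can pull a list of record URLs out of your system (the URLs here are hypothetical):

```python
import xml.etree.ElementTree as ET

def build_sitemap(urls):
    """Render a minimal sitemap.xml for a list of record URLs."""
    ns = "http://www.sitemaps.org/schemas/sitemap/0.9"
    urlset = ET.Element("urlset", xmlns=ns)
    for u in urls:
        url_el = ET.SubElement(urlset, "url")
        ET.SubElement(url_el, "loc").text = u
    return ET.tostring(urlset, encoding="unicode")

print(build_sitemap([
    "https://catalogue.example.edu/record/1",
    "https://catalogue.example.edu/record/2",
]))
```

Point Solr, ElasticSearch, or a custom search engine at the resulting file and you have a purpose-built index without touching Z39.50 or MARC21.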

So why not iterate our way towards the linked open data future by building on what we already have now? As Karen Coyle wrote in a much more elegant fashion, the transition looks roughly like:

  • Stored data -> transform/template -> human readable HTML page
  • Stored data -> transform/template (tweaked) -> machine & human readable HTML page

That is, by simply tweaking the same mechanism you already use to generate a human readable HTML page from the data you have stored in a database or flat files or what have you, you can embed machine readable structured data as well.
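A minimal sketch of that tweak, assuming a stored record and a string template standing in for whatever templating layer your system actually uses (the record, field names, and markup are all hypothetical):

```python
import json
from string import Template

# A stored record, as it might come out of a database or a MARC dump.
record = {"title": "The Open Web", "author": "A. Librarian", "isbn": "9780000000000"}

# The same template that renders the human-readable page, tweaked to also
# emit machine-readable schema.org JSON-LD alongside it.
page = Template("""<h1>$title</h1><p>by $author</p>
<script type="application/ld+json">$jsonld</script>""")

jsonld = json.dumps({
    "@context": "https://schema.org",
    "@type": "Book",
    "name": record["title"],
    "author": record["author"],
    "isbn": record["isbn"],
})

print(page.substitute(title=record["title"], author=record["author"], jsonld=jsonld))
```

One template, one pass over the same stored data, and the page serves humans and machines at once.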

That is, in fact, exactly the approach I took with Evergreen, VuFind, and Koha. And they now expose structured data and generate sitemaps out of the box using the same old MARC21 data. Evergreen even exposes information about libraries (locations, contact information, hours of operation) so that you can connect its holdings to specific locations.

And what about all of our resources outside of the catalogue? Research guides, fonds descriptions, institutional repositories, publications... I've been lucky enough to be working with Camilla McKay and Karen Coyle on applying the same process to the Bryn Mawr Classical Review. At this stage, we're exposing basic entities (Reviews and People) largely as literals, but we're laying the groundwork for future iterations where we link them up to external entities. And all of this is built on a Tcl + SGML infrastructure.

So why schema.org? It has the advantage of being a de facto generalized vocabulary that can be understood and parsed across many different domains, from car dealerships to streaming audio services to libraries, and it can be relatively simply embedded into existing HTML as long as you can modify the templating layer of your system.

And schema.org offers much more than just static structured data; schema.org Actions are surfacing in applications like Gmail as a way of providing directly actionable links--and there's no reason we shouldn't embrace that approach to expose "SearchAction", "ReadAction", "WatchAction", "ListenAction", "ViewAction"--and "OrderAction" (Request), "BorrowAction" (Borrow or Renew), "Place on Reserve", and other common actions as a standardized API that exists well beyond libraries (see Hydra for a developing approach to this problem).

I want to thank Richard Wallis for inviting me to co-present with him; it was a great experience, and I really enjoy meeting and sharing with others who are putting linked data theory into practice.

District Dispatch: Network Neutrality update

planet code4lib - Sat, 2014-11-08 00:48

FCC headquarters in Washington, D.C.

About eight months and 4 million comments after the Federal Communications Commission (FCC) launched the Notice of Proposed Rulemaking on preserving and protecting the Open Internet, a decision could come as soon as the December 11th FCC open meeting.

The American Library Association (ALA) and a host of higher education and library colleagues have been engaged in this conversation throughout, and yesterday added more detail to what we propose as an “internet-reasonable” standard for network neutrality protections.

We developed the “internet reasonable” approach to provide much stronger protections to preserve the openness of the internet than the FCC’s original proposal (“commercially reasonable”). The commercial standard obviously isn’t a good fit for non-commercial entities like libraries and other educational institutions. Most public interest groups shared our deep concern that this approach would fail to preserve the open platform that has defined the internet from its inception. (And, to its credit, it sounds like the FCC is seriously considering alternative approaches.)

So, we asked: What would preserve an internet originally built to serve research and learning and now a vital engine of innovation, education, engagement and empowerment? How could the FCC best support our principles of network neutrality, including prohibiting blocking content and paid prioritization, protecting against unreasonable discrimination and providing transparency? Fundamentally, once an internet subscriber pays for service, the subscriber should be able to access the content of their choosing and communicate without interference from the internet service provider (ISP).

We envision an “internet reasonable” standard that would establish a mix of bright line rules, rebuttable presumptions (where the burden is on the internet service provider to demonstrate how any action in opposition to the presumption would be in the public interest) and some areas of discretion for the FCC to consider as the market changes.

Adopting an “internet reasonable” standard is a strong, enforceable policy approach to protecting the openness of the internet, regardless of whether the FCC adopts a legal approach under Title II reclassification, under Section 706, or some combination thereof. As ALA and others stated in our original comments in this proceeding, Title II and Section 706 could each provide a viable legal path for protecting the openness of the internet.

The filing also touches on recent news that the FCC is strongly considering a “hybrid” approach to its new rules. The Verizon v. FCC decision in January 2014 opened the door to this train of thought by suggesting that the service provided by ISPs to edge providers could be regulated differently than the service provided by ISPs to consumers. Mozilla and Tim Wu/Tejas Narechania (of Columbia Law School) have proposed that the service provided to edge providers should be regulated under Title II, allowing the FCC to regulate ISPs’ relationship with consumers under a different regulatory regime.

We’re not quite sure how this hybrid would work—particularly for organizations like libraries that are both consumers and creators. The prospect of applying two different legal regimes over different components of internet access seems confusing and impractical, at the least. We need more details and plan to meet with FCC staff and commissioners to better understand how this might work in practice.

But I’m willing to hear them out. In a world where the “art of the possible” seems harder and harder to achieve, I don’t want to miss a path forward that might achieve our goal of developing strong, legally enforceable rules that keep the internet open and free for creation, free speech and inquiry, research and learning for all.

Unfortunately, with so many perspectives and so much at stake, it’s likely that whatever path the FCC takes, a legal challenge will follow (as has been true twice before). We may reach a stopping point before the end of the year, but not an end to this complex and critically important issue. As always, stay tuned.

The post Network Neutrality update appeared first on District Dispatch.

CrossRef: CrossRef Workshops and Annual Meeting Live Stream Details

planet code4lib - Fri, 2014-11-07 20:59

We will be recording and streaming most of the sessions of next week's 2014 CrossRef Workshops and Annual Meeting.

Links to the live stream have been posted below and on our annual meeting information page.

2014 CrossRef Workshops - Agenda

LIVE STREAM - will start on Tuesday, Nov 11 at 10:00 GMT (London, UK)

Check the World Clock for correct local time.

The CrossRef Workshops, covering technical, operational and workflow issues, will be held on Tuesday, 11 November, from 8:30 AM (registration) to 4:15 PM.

2014 CrossRef Annual Meeting - Agenda

LIVE STREAM - Wednesday, Nov 12 at 10:00 GMT (London, UK)

Check the World Clock for correct local time.

The CrossRef Annual Meeting will be held on Wednesday, November 12, from 8:30 AM to 6:15 PM. It will include CrossRef updates and compelling industry speakers, and will conclude with a cocktail reception.

This year's program will include:

Data That's Fast, Data That Lasts

Laurie Goodman, Editor-in-Chief of GigaScience, will discuss the importance of citing data and making it persistently available. She has stories too, showing the risks and rewards of rapid data sharing--even before peer review and publication. Her talk will be entitled Ways and Needs to Promote Rapid Data Sharing.

Peer Review: New Methods and Models
Recently, a raft of innovative peer review models has been floated and is charting new waters. Our speakers will each address a different system--different from the traditional peer review of the past, and different from each other.

The roster includes:

Our closing keynote speaker will be Richard Jefferson, from Cambia. His talk will focus on impact and innovation and will be entitled Innovation Cartography: Creating impact from scholarly research requires maps not metrics.

OCLC Dev Network: ETag Change in WMS Acquisitions API

planet code4lib - Fri, 2014-11-07 19:30

We wanted to provide you with some more information about the ETag change in the WMS Acquisitions API we posted about for the November 9th release.


District Dispatch: Thinking About Rural

planet code4lib - Fri, 2014-11-07 17:43

White House Rural Council Convening With NTCA                             (photo by NTCA)

Rural has been on my mind of late. In part because of having traveled recently to a conference in the Midwest and looking out the plane window over the patchwork fields and thinking about how remote some of the farms are and wondering whether the families have a fiber connection, or dial-up or satellite internet—or none at all and then wondering how far it was to the town I could see in the distance and then wondering if the town had a library and what the connection speed was like at the library. Then I wondered what kinds of services the library would be providing and whether it had robust Wi-Fi for the kids who come in after school. Then I wondered how much the library was paying for the connection. And, of course I wondered if the library had access to a fiber connection or whether it too was limited in the speeds it could receive.

Airplane musings aside, I really have rural on my mind because of the ongoing efforts at the Federal Communications Commission (FCC) to get those libraries and those families and their communities connected to the kind of speeds I (in theory, anyway) have access to back on the ground in D.C. The importance of what we’re trying to accomplish through the E-rate proceeding was made ever more clear to me last week. I was fortunate enough to be invited to attend an event hosted by the White House Rural Council for members of NTCA—The Rural Broadband Association. The event focused on the association’s Smart Rural Community initiative and specifically on the 2014 award winners in that initiative. While I am still learning the details of everything NTCA members do for their communities, what I have gained thus far is a further appreciation for the difference strong, committed, and collaborative leadership can make in building a successful community broadband solution. The White House event provided a forum for awardees to highlight the impact their smart rural community has on the opportunities and quality of life for the residents of those communities.

For example, in the presentation by Nancy J. White, chief executive officer of North Central Telephone Cooperative (Lafayette, Tenn.), we heard about the work her company has undertaken to improve access to state-of-the-art healthcare for her rural community. Keith Gabbard, general manager of Peoples Rural Telephone Cooperative (McKee, Ky.) described a program to provide virtual classes to students during extreme winter weather when schools close and also a partnership with the public library to provide digital literacy training which is especially important as this community has the highest unemployment rate in the state. We also heard from Brian Thomason, general manager and chief executive officer of Blue Valley Tele-Communications, Inc. (Home, Kan.) who spoke eloquently about the role of high-capacity broadband in spurring economic development and allowing rural America to flourish.

Libraries support smart rural communities

Anecdotes from libraries in rural America echo the experiences of the NTCA members I spoke with at the event. A library in Mississippi that helped a family with a special needs child connect to classes that allowed him to graduate from school; libraries in Georgia that helped over-the-road truckers complete an online certification course so they can maintain their license and their livelihood; a library in Maine where a self-employed video editor uploads work for his clients across the country because his home connection is too slow; a library in Alaska that connected a parent to a medical specialist so she could complete six weeks of training to take care of her child with diabetes; a library that provides Wi-Fi for a mother to Skype regularly with her son stationed in Afghanistan; or a library that streamed a granddaughter’s graduation in Germany for her grandmother. These examples should be commonplace and could be if there were more communities where the lack of access to affordable high-capacity broadband was not an issue.

The well-connected library is the quintessential community access point for completing Education, jumpstarting Employment and Entrepreneurship, fostering individual Empowerment, and encouraging community Engagement. High-capacity broadband is the crucial foundation on which The E’s of Libraries™ are built. The NTCA members that were recognized for their role in developing smart rural communities provide important opportunities for libraries (and other anchor institutions), but it is the difference in opportunity for residents, which we all work to ensure, that was highlighted that day.

As Shirley Bloomfield, CEO of NTCA, said in her remarks at the event, it is the storytelling that should be celebrated. I would add, especially in D.C. where the policy making is so often divorced from the potential impact it could have if done right. Right when it comes to broadband and rural libraries means having options for affordable high-capacity broadband so that more libraries can be part of the stories I heard from the Smart Rural Community award winners.

The post Thinking About Rural appeared first on District Dispatch.

M. Ryan Hess: New Thoughts on Digital Publishing Services

planet code4lib - Fri, 2014-11-07 16:32

Back in early 2011, I gave an overview of the library as a disruptive publishing platform. Three years is a long time in “disruptive agent” years. So where do we stand today?

First of all, the publishing industry has not fallen yet…but the great disruption goes on.

A friend of mine was recently describing his rodent control neighbor, a charmingly opaque Eastern European gentleman whose central point about controlling rats can be summed up in a single pronouncement: “Fighting rats is F@#%ing 24×7 War!”

I’m seeing value in this statement for the effort to liberate information. As I’m learning in my contact with faculty and other librarians, the rat warrens run deep into our institutions. So invasive are their labyrinths that they threaten the very financial underpinnings of our information services.

Luckily, we are not passive observers in this state of affairs. We are active participants in creating something new. We have tools at our disposal to fill in the rat holes with a digital foundation that will ensure a long, fruitful future of open access publishing that will empower our users in ways traditional publishing could never do.

New Openings

I’m seeing a number of openings libraries are beginning to exploit that build on the “library as publishing platform” model I wrote about earlier. Namely, librarians are often becoming central hubs for a variety of digital services that include:

  • digital humanities and academic computing support
  • digital project consultant services for everything from how to migrate online content to advice on metadata to search engine optimization (SEO) and usability
  • helping faculty navigate scholarly communications issues from copyright to developing readership and recognition
  • and, of course, providing the place on campus for online publishing

Taken together, all of these emerging services suggest a fairly promising future for librarians interested in transforming the profession into something more in line with current and future trajectories for information.

Ready to enlist as a disruptive agent yet?

Over the next few posts, I’ll explore each of the above and how my library is building new services or augmenting older services to meet these emerging digital publishing needs.

First up, that thing that goes by the very vague and unhelpful term of digital humanities…

Ground Zero for Digital Humanities

At my Library, we have not rolled out a formal digital humanities support program…yet.

Nonetheless, we receive regular, unsolicited inquiries about platforms like Omeka and Digital Commons from faculty interested in creating exhibits and online course projects. To meet the demand so far, we’ve rolled out hosted Omeka services, but what people really want is full-blown Omeka with plugins like Neatline and others that the hosted version does not support.

Clearly, this organic demand suggests a far more robust DH service is required. As I write, we’ve deployed a faculty survey based loosely on one created by Rose Fortier at Marquette University. With this, we hope not only to build awareness of our digital collections and services (spoiler: early results have 60% of faculty being unaware of our institutional repository, for example…24×7 war indeed!), but also to learn what services, like digital humanities support, would interest faculty.

Based on our experience, my guess is that digital humanities support services will generate healthy interest. If this is the case, then we will probably roll out self-hosted Omeka plus Neatline and GeoServer, along with trainings and baseline technical support, sometime in 2015. The one hitch that will need to be overcome will be multi-site capability, which will enable us to install Omeka once and then launch as many separate sites as are required with a single click of a button. That particular feature does not exist yet, but the forthcoming Omeka 3/Omeka-S is expected to provide it, greatly enhancing the practicality of launching an Omeka service for any library.

Meanwhile, as I recently presented at the 2014 Digital Commons Great Lakes User Group, we are also continuing to provide a measure of digital humanities support on our Digital Commons institutional repository. While not as sexy as Neatline, we are posting student-generated Map of the Month from the Geography Department, for example, in PDF format.

The recent enhanced, zoomable image viewer available in Digital Commons may also help in this regard.

We’ve also seen a few faculty interested in using Digital Commons for student projects, particularly around courses focused on digital publishing issues.

But, of course, as non-librarian content creators enter the collection-building business, they come ill-prepared for overcoming the kinds of problems library professionals excel at solving. And so, this is where I’d like to turn to next: the library as a digital project consultant service.

Library of Congress: The Signal: The Value of the NDSR: Residents and Mentors Weigh In

planet code4lib - Fri, 2014-11-07 14:49

The following is a guest post by Vicky Steeves, National Digital Stewardship Resident at the American Museum of Natural History in New York City. This is the first in a series of posts by the residents from NDSR class of 2014-2015.

I wanted to take this opportunity, as the first 2014-2015 resident to post on The Signal, to discuss how valuable the National Digital Stewardship Residency program is. Among many things, it has given me the opportunity to work at the American Museum of Natural History in New York, surveying scientific research data and recommending preservation strategies. Nowhere else could I have gotten this opportunity. In this post I will look at the value of NDSR, showing that the NDSR is an innovative and important program for furthering the field of library and information science.

Current 2014-2015 NDSR-NY Cohort (left to right): Vicky Steeves, Peggy Griesinger, Karl Blumenthal, Shira Peltzman, and Julia Kim. Photo by Alan Barnett.

The National Digital Stewardship Residency participants (hosts and residents) have demonstrated how this residency fulfills the need for emerging professionals to be placed in important institutions. Here, residents’ skills have the space to expand. This allows for the growth of the field in two ways: residents contribute to the growing body of research in digital preservation and gain skills which they can use throughout their careers as they continue to advance the field. For host institutions, the ability to bring in additional, knowledgeable staff at little or no cost is transformative.

When evaluating the NDSR program, it’s important to look at both simple numbers and testimonials. In terms of the quantitative, 100% of  the residents from the 2013-2014 team in Washington DC have found relevant positions upon completion of the residency. (See previous posts on that subject, parts one and two.) I sought out this first class of residents, and asked them how important they feel NDSR has been for them:

Vicky Steeves: Why did you apply to the NDSR program?

Margo Padilla, (Strategic Programs Manager at Metropolitan New York Library Council): “It seemed like a great way to meet and collaborate with people doing critical work in the field. I was also excited about all the projects and knew that even though I was the resident at only one location, I would learn a lot from the other residents and projects.”

Molly Schwartz, (Fulbright Scholar, Aalto University and the National Library of Finland): “As a new graduate I knew that I needed more hands-on experience and I wasn’t sure exactly what type of institution would be the right professional fit for me. NDSR seemed like a great option for many reasons: I would get more experience, come out of it with a completed project, I would learn what it is like to work at a small non-profit institution (ARL), and I would have the freedom to dive into digital information research full-time, both working on my own project and attending conferences and meetings where I could collaborate with others in the field.”

Julia Blase, (Project Manager, Field Book Project, Smithsonian Libraries): “I was very interested in working on the digital side of libraries and archives after graduate school, but knew that it could be difficult to find entry-level positions in the field, particularly those that would provide practical, complex experience in multiple aspects of the field and train me for the next step in my career. NDSR seemed to offer that chance.”

Vicky Steeves: Why do you think it’s important (or not) for the library science field to have programs like this?

Margo Padilla:  “I think programs like this are important because it helps new graduates grow into the field, discover their niche, and contribute to a larger body of research. Recent graduates lend a fresh perspective to work already being done. It is also a chance for them to learn, make mistakes, and test what works and what doesn’t.”

Molly Schwartz: “The digital information field, especially from the information steward perspective, is at a point where we need to retain and cultivate professionals who have the desire to work in a fast-paced environment and have the skill sets to get employed elsewhere. It is crucial that we provide opportunities for these types of people to develop within the field and get exposed to all the cool work they can do, work that will have real impact, if we are to tackle the challenges facing the profession.”

Julia Blase: “It is very difficult, in my experience and in the experiences of my friends, for a young professional to make the jump from an entry-level or paraprofessional position to a mid-level position, which may begin to incorporate more complex projects, strategic planning, and perhaps even the management of a project, program or other staff members. Programs like the Residency offer that in-between path, supporting and training their graduates so that they are prepared and qualified for that first mid-level position after the program, advancing both the individual careers and also, by providing motivated and prepared staff, the quality of the profession as a whole.”

Heidi Elaine Dowding, (Ph.D. Research Fellow at the Royal Dutch Academy of Arts and Sciences, Huygens ING Institute): “I think paid fellowships like this are really important, especially for students who can’t afford to accept unpaid internships to forward their career. They even the playing field in some ways, and help build really strong networks of practitioners.”

These testimonials demonstrate how impactful the NDSR curriculum is to professional development and career opportunities for postgraduates. The current resident at the Museum of Modern Art in NYC, Peggy Griesinger, remarked, “I applied to NDSR because I wanted the opportunity to contribute to how cultural heritage institutions are developing long-term digital preservation practices.”  The ability to “test drive” a career and preferred setting (public institution, private, non-profit, etc.) while accumulating and refining skills in digital preservation is an invaluable part of the program. Residents also had the opportunity to network and establish relationships with mentors who have invaluable experience in the field, which often led to gainful employment.

Additionally, having diverse institutions buy into this program affirms the value of NDSR. While these institutions are getting a resident at little or no cost to them, it takes a lot of trust to give an incubating project to an outside professional, especially one fresh from their master’s degree. In this way, NDSR takes an important step toward building public trust in digital archives. I reached out to a few mentors from the 2013-2014 Washington D.C. host institutions to get their take on the value of the NDSR program.

Vicky Steeves: How useful was the program for you and your institution in hindsight? Are you using the results from the project that your resident worked on?

Shalimar White, (Manager of the Image Collections and Fieldwork Archives at Dumbarton Oaks Research Library and Collection): “One of the benefits of the NDSR program was the ability to bring in someone like Heidi [Dowding] who could evaluate a complex organization like Dumbarton Oaks from an external perspective. Heidi’s report was delivered to our new Manager of Information Technology. As recommended in the report, the IT Manager is currently developing DO’s general technical infrastructure and building out the operations of the new IT department. In the future, when the IT Manager is able to turn her attention to more strategic planning, she has indicated that the report will be a helpful guide for developing the systems and operational procedures necessary for long-term digital asset management at Dumbarton Oaks. We expect that Heidi’s work will continue to be useful and valuable in the long-term.”

Vickie Allen, (Director of the Media Library at the Public Broadcasting Service): “Having a skilled NDSR fellow at our organization for an extended period of time was critical in getting the necessary focus, interest and leadership support for our efforts to launch a complex digitization initiative. As a direct result of the quality and scope of our resident’s work, we were allocated internal funds during the final month of the residency to begin digitization. The completed project plan and associated documentation were invaluable in filling critical knowledge gaps, allowing us to move forward quickly and confidently with our digitization initiative. We plan to use these guidelines long into the future as we continue our digitization efforts, as well as translate findings into strengthening digital media management policy for our born digital content.”

Christie Moffatt, (Manager of the Digital Manuscripts Program at the National Library of Medicine): “The NDSR program was a valuable experience for the National Library of Medicine, both in terms of project accomplishments with the addition of a new thematic Web archive collection, and our participation in the NDSR community. Maureen [McCormick Harlow] shared her experiences wrestling with the technical and intellectual challenges of scoping out and creating a new collection with NLM staff involved in Web collecting, which enabled us all to learn together and apply lessons learned throughout the duration of the project. The collection Maureen developed, “Disorders of the Developing and Aging Brain: Autism and Alzheimer’s on the Web,” serves as a model for thematic Web collecting at the Library, and the workflows that she helped to develop are now being implemented in our current Ebola Outbreak web collecting initiative announced earlier this month. NLM’s web collecting efforts have and will continue to benefit from this experience.”

These host institutions have not only used their resident’s work, but will continue to use their project deliverables, recommendations and associated documentation as digital initiatives are further developed. In this way, residents are contributing to the future developments at their host institutions. This ability to impact the present and future of host institutions is what makes NDSR such an advantage. As one of the newest members of the NDSR program, I can say that the opportunities granted to me have been phenomenal. As a resident, you truly have endless possibilities in this program.


LITA: IA & UX Meet Library Technology

planet code4lib - Fri, 2014-11-07 13:00

The class I enjoy the most this semester at Indiana University is Information Architecture. It is a class where theory and practical application are blended so that we can create something tangible, but also understand the approaches – my favorite kind!

By one widely cited definition, Information Architecture (IA) “focuses on organizing, structuring, and labeling content in an effective and sustainable way.” While the class doesn’t necessarily focus on Library Science, since it is offered through the Information Science courses, this concept may sound familiar to those working in a library.

In the class, we have chosen a small website we believe could benefit from restructuring. Some students chose public library websites, and others websites from the private sector. Regardless of each website’s purpose, the process of restructuring is the same. The emphasis is placed on usability and user experience (UX), which the ALA Reference and User Services Association defines as “employing user research and user-centered design methods to holistically craft the structure, context, modes of interaction, and aesthetic and emotional aspects of an experience in order to facilitate satisfaction and ease of use.”

Basically, it means structuring content so that a user can use it to a high level of satisfaction.

Peter Morville and Co. developed this honeycomb to represent the multiple facets of User Experience. Check out his explanation here.

Keeping usability and UX at the forefront, much of our semester has been focused on user demographics. We developed personas of specific users by highlighting the tasks they need to carry out and the kind of behaviors they bring to the computer. For example, one of my personas is a working mother who wants to find the best dance studio for her daughter, but doesn’t have a lot of time to spend looking up information and gets frustrated easily with technology (may or may not have been influenced by my own mother).

We also developed a project brief to keep the main benefits of restructuring in mind, and we analyzed parts of the current websites that work for users, and parts that could be improved. We did not (and could not) begin proposing our restructured website until we had a solid understanding of the users and their needs.

While learning about usability, I thought back to my graduate school application essay. I discussed focusing on digital libraries and archives in order to improve access to materials, which is my goal throughout my career. As I’m learning, I realize that improving access doesn’t mean digitizing for its own sake; it means digitizing and then presenting the materials in an accessible way. Even though material may be released on the web, that doesn’t guarantee that a user will find it and be able to use it.

As technology increasingly evolves, keeping the goals of the library in sync with the skills and needs of the user is crucial. This is where information architecture and user experience meet library technology.

How do you integrate usability and user experience with library technology in your institution? If you are an information architect or usability researcher, what advice do you have for others wishing to integrate these tools?

Open Knowledge Foundation: Global Open Data Index 2014: Reviewing in progress

planet code4lib - Thu, 2014-11-06 19:54

October was a very exciting month for us in the Index team. We spoke to so many of you about the Index, face to face or in the virtual world, and we got so much back from you. It was amazing for us to see how the community is pulling together not only with submissions, but also by giving advice on the mailing list, translating tweets and tutorials, and spreading the word about the Index. Thank you so much for your contributions.

This is the first time that we have done regional sprints, starting with the Americas in early October at AbreLATAM/ConDatos, continuing with our community hangout with Europe and MENA, and finishing with Asia, Africa and the Pacific. On Thursday last week, we hosted a Hangout with Rufus, who spoke about the Index, how it can be used and where it is headed. We were also very lucky to have Oscar Montiel from Mexico, who spoke with us about how they use the Index to demand datasets from the government and how they are now implementing a local data index in cities around Mexico so they can promote data openness at the municipal level. We were also excited to host Oludotun Babayemi from Nigeria, who explained how Nigeria’s inclusion in the Index helps them raise awareness of open data issues among government and citizens.

Now that the sprints are over, we still have a lot of work ahead of us. We are now reviewing all of the submissions. This year, we divided the previous editor role into two roles, ‘contributor’ and ‘reviewer’, so that a second pair of eyes can ensure information is reliable and of excellent quality. Around the world, teams of reviewers are working on the submissions from the sprints. We are still looking for reviewers for South Africa, Bangladesh, Finland, Georgia, Latvia, Philippines and Norway. You can apply to become one here.

We are finalising the Index 2014 over the next few weeks. Stay tuned for more updates. In the meantime, we are also collecting your stories about participating in the Index for 2014. If you would like to contribute to these regional blogs, please email us. We would love to hear from you and make sure your country is represented.

pinboard: Code4Lib shop

planet code4lib - Thu, 2014-11-06 19:08
tshirts, mugs, etc.

Library of Congress: The Signal: WITNESS: Digital Preservation (in Plain Language) as a Tool for Justice

planet code4lib - Thu, 2014-11-06 18:09

Illustration of video file and wrapper from WITNESS.

Some of you information professionals may have experienced incidents where, in the middle of a breezy conversation, you get caught off guard by a question about your work (“What do you do?”) and you struggle to come up with a straightforward, clear answer without losing the listener’s attention or narcotizing them into a stupor with your explanation.

Communicating lucid, stripped-down technical information to a general audience is a challenge…not dumbing down the information but simplifying it. Or, rather, un-complicating it and getting right to the point. At the Signal, we generally address our blog posts to institutions, librarians, archivists, students and information technologists. We preach to the choir and use peer jargon with an audience we assume knows a bit about digital preservation already. Occasionally we direct posts specifically to laypeople, yet we might still unintentionally couch some information in language that may be off-putting to them.

WITNESS, the human rights advocacy organization, has become expert in communicating complex technical information in a simple manner. WITNESS empowers people by teaching them how to use video as a tool to document human rights abuses and how to preserve digital video so they can use it to corroborate their story when the time is right. Their audience — who may or may not be technologically savvy — often comes to WITNESS in times of crisis, when they need immediate expertise and guidance.

Cell phone video interview on

What WITNESS has in common with the Library of Congress and other cultural institutions is a dedication to best practices in digital preservation. However, to the Library of Congress and its peer institutions, the term “digital preservation” pertains to cultural heritage; to victims of human rights violations, “digital preservation” pertains to evidence and justice.

For example, WITNESS advises people to not rename or modify the original video files. While that advice is in accord with the institutional practice of storing the original master file and  working only with derivative copies, that same advice, as applied to documenting human rights violations, protects people from the potential accusation of tampering with — or modifying — video to manipulate the truth. The original file might also retain such machine-captured metadata as the time, date and geolocation of the recording, which can be crucial for maintaining authenticity.

The Society of American Archivists recently honored WITNESS with their 2014 Preservation Publication Award for their “Activists Guide to Archiving Video.” The SAA stated, “Unlike other resources, (the guide) is aimed at content creators rather than archivists, enabling interventions that support preservation early in the digital life-cycle. The guide also uses easy-to-understand language and low-cost recommendations that empower individuals and grassroots organizations with fewer resources to take action to safeguard their own valuable collections. To date, the guide has found enthusiastic users among non-archivists, including independent media producers and archives educators, as well as archivists who are new to managing digital video content. The Award Committee noted that the guide was a ‘valuable contribution to the field of digital preservation’ and an ‘example of what a good online resource should be.’”

Screenshot from “What is Metadata” video by WITNESS.

That is an important distinction, the part about “…non-archivists, including independent media producers and archives educators, as well as archivists who are new to managing digital video content.” It means that WITNESS’s digital preservation resources are as useful to a broad audience as they are to its intended audience of human rights advocates. Like the Academy of Motion Picture Arts and Sciences’ 2007 publication, The Digital Dilemma (profiled in the Signal), the language WITNESS communicates in is so plain and direct, and the advice so comprehensive, that its digital video preservation instruction is broadly applicable and useful beyond its intended audience. Indeed, WITNESS’s “Activists Guide to Archiving Video” is used in training and college courses on digital preservation.

WITNESS’s latest resource, “Archiving for Activists,” is a video series aimed at improving people’s understanding of digital video so they can make informed choices for shooting and preserving the best possible copy of the event. The videos in this series are:


Some activists in the field have said that, thanks to WITNESS’s resources, they are organizing their footage better and adopting consistent naming conventions, which makes it easier to find files later on and strengthens the effectiveness of their home-grown archives. Yvonne Ng, senior archivist at WITNESS, said, “Even in a situation where they don’t have a lot of resources, there are simple things that can be done if you have a few hard drives and a simple system that everybody you are working with can follow in terms of how to organize your files and put them into information packages – putting things in folders and not renaming your files and not transcoding your files and having something like an Excel document to keep track of where your videos are.”
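The tracking document Ng describes can be as simple as a plain CSV file. Below is a minimal, hypothetical sketch of such a manifest in Python; the filenames and column names are invented examples, not taken from the WITNESS guide.

```python
import csv
import io

# A minimal, hypothetical manifest: one row per original video file,
# recording where each untouched original lives. Column and file names
# are invented examples, not from the WITNESS guide.
def write_manifest(rows, fileobj):
    fieldnames = ['filename', 'storage_location', 'date_recorded', 'notes']
    writer = csv.DictWriter(fileobj, fieldnames=fieldnames)
    writer.writeheader()
    for row in rows:
        writer.writerow(row)

buf = io.StringIO()
write_manifest(
    [{'filename': '20141106_interview_cam1.mp4',
      'storage_location': 'backup-drive-A/2014/11',
      'date_recorded': '2014-11-06',
      'notes': 'original file, not renamed or transcoded'}],
    buf)
print(buf.getvalue())
```

The point is the discipline rather than the tooling: the same columns could just as well live in an Excel sheet or a paper log.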

WITNESS will continue to offer professional digital video archival practices to those in need of human rights assistance, in the form of tools that are easy to use and readily available, in plain language. Ng said, “We talk about digital preservation in a way that is relevant and immediate to the people who are documenting abuses. It serves their end goals, which are not necessarily just to create an archive. It’s so that they can have a collection that they can easily use and it will maintain its integrity for years.”

HangingTogether: UCLA’s Center for Primary Resources and Training: A model for increasing the impact of special collections and archives

planet code4lib - Thu, 2014-11-06 17:00

Many of us in the special collections and archives community have long admired the purpose and scope of UCLA’s Center for Primary Resources and Training (CFPRT), so I was pleased to learn that the UCLA library would be celebrating the Center’s 10th anniversary with a symposium on 24 October. As a result, I now know that we should all be celebrating its remarkable success as well. The audience that day learned via stellar presentations by ten CFPRT “graduates” that the program’s impact on them, and on their students and colleagues, has been profound.

Vicki Steele, the Center’s founding director, talked about being inspired by the ARL “hidden collections” conference at the Library of Congress in 2003 (the papers were published here). She flew right back to UCLA and put together a strategy for not only making a dent in her department’s massive backlogs (she noted they had lost both collections and donors due to a well-deserved reputation for taking years to process new acquisitions) but for integrating special collections into the intellectual life of the university. Students have told her “you never know what you’re in training for” when describing the “life-changing experiences” fostered by working at CFPRT. And based on the presentations, it’s clear that this is not hyperbole. Oh, and it was great to learn that providing a very desirable wage to the Center’s fellows was a high priority from the beginning; one graduate noted that the stipend literally made it possible for her to focus on her studies and complete her M.A. program.

I confess that I’ve occasionally wondered how much the Center accomplishes beyond getting lots of special collections processed. In the wake of this symposium, I’m wondering no more. The achievements of the graduate students who have participated, their evangelism for the importance of primary sources research, and the effects of the CFPRT experience on their lives render this program a model for others to admire and, resources permitting, to replicate. Ensuring that special collections and archives achieve real impact is a huge emphasis these days—as it should be. The Center is a model for one meaningful approach.

A few of my takeaways:

  • Alexandra Apolloni, Ph.D. student in musicology, now uses sheet music to teach her students about the many aspects of society reflected in such sources. She teaches them to “read a primary source for context.” She noted that it was useful to think about how future researchers would use the materials in order to maintain objectivity in her approach to processing and description.
  • Yasmin Dessem, MA graduate in moving image archive studies and now an archivist at Paramount Studios, discovered the power of primary sources to change history: evidence found in a collection on the notorious Lindbergh kidnapping suggests that the person executed for the crime was innocent. Too little, too late.
  • Andrew Gomez, Ph.D. graduate in history, played a central role in designing and implementing the exceptional digital resource The Los Angeles Aqueduct Digital Platform. In the process of this work, he became a huge supporter of the digital humanities as a rigorous complement to traditional historical research: his work involved standard historical skills and outputs such as studying primary sources and creating historical narratives, as well as mastering a wide variety of digital tools. He also learned how to address audiences other than fellow scholars; in effect, he saw that scholarship can have a broad reach if designed to do so. He is currently on the academic job market and noted that he is seeing ads for tenure-track faculty positions focused on digital humanities. The game may be starting to change.
  • Rhiannon Knol, M.A. student in classics, worked on textual medieval manuscripts. I liked her elegant statement about the ability of a book’s materiality to “communicate knowledge from the dead to the living.” She also quoted Umberto Eco: “Books are not made to be believed, but to be subject to inquiry.” I can imagine reciting both statements to students.
  • Erika Perez, Ph.D. graduate in history and now on the faculty of the University of Arizona, reported that when looking for a job, her experience at CFPRT helped her get her foot in the door and tended to be a major topic during interviews.
  • Aaron Gorelik, Ph.D. graduate in English, said that CFPRT changed his life by leading to his becoming a scholar of the poet Paul Monette. He had his “wow” moment when he realized that “this was a life, not a novel.” His work on Monette has guided his dissertation, teaching, and reading ever since, and he’s in the process of getting more than 100 unpublished Monette poems into press.
  • Audra Eagle Yun, MLIS graduate and now Head of Special Collections and Archives at UC Irvine, spoke of the CFPRT as an “archival incubator.” She and her fellow students were amazed that they would be trusted “to handle the stuff of history” and learned the centrality of doing research before processing. They graduated from CFPRT with the assumption that MPLP is standard processing. Ah, the joys of a fresh education, to be unfettered by unproductive past practice! She felt like a “real archivist” when she realized that she could identify the best research resources and make processing decisions without input from her supervisor.
  • Thai Jones, curator of U.S. history at the Columbia University Rare Books and Manuscripts Library, gave a fascinating keynote in which he told the story of researching his activist grandmother, Annie Stein, who worked for integration of New York City public schools from the 1950s to the 1980s. He gathered a collection of materials entirely via FOIA requests, and the resulting Annie Stein papers are heavily used. (His own life story is fascinating too: he was born and spent his early years living underground with his family because his father was on the run as a member of the Weather Underground. Gosh. Rather different from my Republican childhood!) He opined that digitization has revolutionized discovery for historians but lamented that many of his colleagues today identify and use online resources only. Please digitize more, and faster, is his mantra. It’s ours too, but we know how difficult and expensive it is to achieve. We need to keep developing methodologies for turning it around.

Few special collections and archives can muster the resources to launch and maintain a program as impressive as UCLA’s Center for Primary Resources and Training, but many can do it on a smaller scale. Do you work at one that has gotten started and from which colleagues might learn? If not, what are the challenges that have stopped you from moving forward? Please leave a comment and tell your story.



About Jackie Dooley

Jackie Dooley leads OCLC Research projects to inform and improve archives and special collections practice. Activities have included in-depth surveys of special collections libraries in the U.S./Canada and the U.K./Ireland; leading the Demystifying Born Digital work agenda; a detailed analysis of the 3 million MARC records in ArchiveGrid; and studying the needs of archival repositories for specialized tools and services. Her professional research interests have centered on the development of standards for cataloging and archival description. She is a past president of the Society of American Archivists and a Fellow of the Society.


Jonathan Rochkind: Useful lesser known ruby Regexp methods

planet code4lib - Thu, 2014-11-06 15:50
1. Regexp.union

Have a bunch of regexes and want to see if a string matches any of them, but don’t actually care which one it matches, just whether it matches at least one? Don’t loop through them; combine them with Regexp.union.

union_re = Regexp.union(re1, re2, re3, as_many_as_you_want)
str =~ union_re

2. Regexp.escape

Have an arbitrary string that you want to embed in a regex, interpreted as a literal? Might it include regex special chars that you want interpreted as literals instead? Why even think about whether it might or not, just escape it, always.

val = 'Section 19.2 + [Something else]'
re = /key: #{Regexp.escape val}/

Yep, you can use #{} string interpolation in a regex literal, just like a double quoted string.

Filed under: General

Eric Hellman: If your website still uses HTTP, the X-UIDH header has turned you into a snitch

planet code4lib - Thu, 2014-11-06 14:54
Does your website still use HTTP? If so, you're a snitch.

As I talk to people about privacy, I've found a lot of misunderstanding. HTTPS applies encryption to the communication channel between you and the website you're looking at. It's an absolute necessity when someone's entering a password or sending a credit card number, but the modern web environment has also made it important for any communication that expects privacy.

HTTP is like sending messages on a postcard. Anyone handling the message can read the whole message. Even worse, they can change the message if they want. HTTPS is like sending the message in a sealed envelope. The messengers can read the address, but they can't read or change the contents.

It used to be that network providers didn't read your web browsing traffic or insert content into it, but now they do so routinely. This week we learned that Verizon and AT&T were inserting an "X-UIDH" header into your mobile phone web traffic. So for example, if a teen was browsing a library catalog for books on "pregnancy" using a mobile phone, Verizon's advertising partners could, in theory, deliver advertising for maternity products.
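Because the header is injected in transit, it reaches the server looking like any other request header. Here is a hypothetical WSGI sketch (my own illustration, not from the post) that surfaces it:

```python
# Hypothetical WSGI middleware: over plain HTTP, a carrier-injected
# X-UIDH header is indistinguishable from one the client sent, and it
# simply shows up in the request environment the application sees.
def uidh_detector(app):
    def middleware(environ, start_response):
        uidh = environ.get('HTTP_X_UIDH')
        if uidh is not None:
            print('Carrier-injected tracking header X-UIDH:', uidh)
        return app(environ, start_response)
    return middleware

def hello_app(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'hello']

# Simulate a request that passed through a header-injecting network.
wrapped = uidh_detector(hello_app)
body = wrapped({'HTTP_X_UIDH': 'OTgxNTk2NDk0'}, lambda status, headers: None)
```

Over HTTPS the intermediary never gets the chance to add the header, so the branch above simply never fires.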

The only way to stop this header insertion is for websites to use HTTPS. So do it. Or you're a snitch.

Sorry, this blog doesn't support HTTPS. So if you mysteriously get ads for snitch-related products, or if the phrase "Verizon and AT&T" is not equal to "V*erizo*n and A*T*&T" without the asterisks, blame me and blame Google.

Here's more on the X-UIDH header.

Open Knowledge Foundation: Open Knowledge Festival 2014 report: out now!

planet code4lib - Thu, 2014-11-06 14:46

Today we are delighted to publish our report on OKFestival 2014!

This is packed with stories, statistics and outcomes from the event, highlighting the amazing facilitators, sessions, speakers and participants who made it an event to inspire. Explore the pictures, podcasts, etherpads and videos which reflect the different aspects of the event, and uncover some of its impact as related by people striving for change – those with Open Minds to Open Action.

Want more data? If you are still interested in knowing more about how the OKFestival budget was spent, we have published details about the event's income and expenses here.

If you missed OKFestival this year, don’t worry – it will be back! Keep an eye on our blog for news and join the Open Knowledge discussion list to share your ideas for the next OKFestival. Looking forward to seeing you there!

OCLC Dev Network: Planned Downtime for November 9 Release

planet code4lib - Thu, 2014-11-06 14:30

WMS Web services will be down during the install window for this weekend's release. The install time for this release is between 2:00 – 7:00 am Eastern USA, Sunday Nov 9th.


Ted Lawless: Connecting Python's RDFLib and Stardog

planet code4lib - Thu, 2014-11-06 00:00
Connecting Python's RDFLib and Stardog

For a couple of years I have been working with the Python RDFLib library for converting data from various formats to RDF. This library serves this work well but it's sometimes difficult to track down a straightforward, working example of performing a particular operation or task in RDFLib. I have also become interested in learning more about the commercial triple store offerings, which promise better performance and more features than the open source solutions. A colleague has had good experiences with Stardog, a commercial semantic graph database (with a freely licensed community edition) from Clark & Parsia, so I thought I would investigate how to use RDFLib to load data in to Stardog and share my notes.

A "SPARQLStore" and "SPARQLUpdateStore" have been included with Python's RDFLib since version 4.0. These are designed to allow developers to use the RDFLib code as a client to any SPARQL endpoint. Since Stardog supports SPARQL 1.1, developers should be able to connect to Stardog from RDFLib in the similar way they would to other triple stores like Sesame or Fuseki.

Setup Stardog

You will need a working instance of Stardog. Stardog is available under a community license for evaluation after going through a simple registration process. If you haven't set up Stardog before, you might want to check out Geir Grønmo's triplestores repository, where he has Vagrant provisioning scripts for various triple stores. This is how I got up and running with Stardog.

Once Stardog is installed, start the Stardog server with security disabled. This will allow the RDFLib code to connect without a username and password. Obviously you will not want to run Stardog in this way in production but it is convenient for testing.

$./bin/stardog-admin server start --disable-security

Next create a database called "demo" to store our data.

$./bin/stardog-admin db create -n demo

At this point a SPARQL endpoint is available and ready for queries at http://localhost:5820/demo/query.
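Before involving RDFLib, you can sanity-check the endpoint with a raw SPARQL 1.1 Protocol request, since the protocol allows sending a query as a GET parameter. A small stdlib-only sketch (the helper name is my own; actually fetching the URL of course requires the Stardog server started above):

```python
from urllib.parse import urlencode

# Build a SPARQL 1.1 Protocol GET URL: the query travels as a
# URL-encoded 'query' parameter appended to the endpoint.
def sparql_get_url(endpoint, query):
    return endpoint + '?' + urlencode({'query': query})

url = sparql_get_url('http://localhost:5820/demo/query',
                     'SELECT * WHERE { ?s ?p ?o } LIMIT 5')
print(url)
# With the server running, urllib.request.urlopen(url) would return the
# results; an Accept header can request 'application/sparql-results+json'.
```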


For this example, we'll add three skos:Concepts to a named graph in the Stardog store.

@prefix rdf: <> .
@prefix rdfs: <> .
@prefix skos: <> .
@prefix xml: <> .
@prefix xsd: <> .

<> a skos:Concept ;
    skos:broader <> ;
    skos:preferredLabel "Baseball" .

<> a skos:Concept ;
    skos:preferredLabel "Sports" .

<> a skos:Concept ;
    skos:preferredLabel "Soccer" .

Code

The complete example code here is available as a Gist.

Setting up the 'store'

We need to initialize a SPARQLUpdateStore as well as a named graph where we will store our assertions.

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import RDF, SKOS
from rdflib.plugins.stores import sparqlstore

#Define the Stardog store
endpoint = 'http://localhost:5820/demo/query'
store = sparqlstore.SPARQLUpdateStore()
store.open((endpoint, endpoint))

#Identify a named graph where we will be adding our instances.
default_graph = URIRef('')
ng = Graph(store, identifier=default_graph)

Loading assertions from a file

We can load our sample turtle file to an in-memory RDFLib graph.

g = Graph()
g.parse('./sample-concepts.ttl', format='turtle')

#Serialize our graph to make sure we got what we expect.
print g.serialize(format='turtle')

Since our data is now loaded as an in memory Graph we can add it to Stardog with a SPARQL INSERT DATA operation.

ng.update(u'INSERT DATA { %s }' % g.serialize(format='nt'))

Use the RDFLib API to inspect the data

Using the RDFLib API, we can list all the Concepts in the Stardog store that were just added.

for subj in ng.subjects(predicate=RDF.type, object=SKOS.Concept):
    print 'Concept: ', subj

And, we can find concepts that are broader than others.

for ob in ng.objects(predicate=SKOS.broader):
    print 'Broader: ', ob

Use RDFLib to issue SPARQL read queries.

RDFLib allows for binding a prefix to a namespace. This makes our queries easier to read and write.

store.bind('skos', SKOS)

A SELECT query to get all the skos:preferredLabel for skos:Concepts.

rq = """
SELECT ?s ?label
WHERE {
    ?s a skos:Concept ;
       skos:preferredLabel ?label .
}
"""

for s, l in ng.query(rq):
    print s.n3(), l.n3()

Use RDFLib to add assertions.

The RDFLib API can also be used to add new assertions to Stardog.

soccer = URIRef('')
ng.add((soccer, SKOS.altLabel, Literal('Football')))

We can now read statements about soccer using the RDFLib API, which issues the proper SPARQL query to Stardog in the background.

for s, p, o in ng.triples((soccer, None, None)):
    print s.n3(), p.n3(), o.n3()

Summary

With a little setup, we can work with Stardog from RDFLib in much the same way we work with other backends. The sample code here is included in this Gist.

DuraSpace News: Recordings available for the Fedora 4.0 Webinar Series

planet code4lib - Thu, 2014-11-06 00:00

Winchester, MA

On November 5, 2014 the Hot Topics DuraSpace Community Webinar series, “Early Advantage: Introducing New Fedora 4.0 Repositories,” concluded with its final webinar, “Fedora 4.0 in Action at Penn State and Stanford.”

DuraSpace News: Fedora 4 Almost Out the Door: Final Community Opportunity for Feedback!

planet code4lib - Thu, 2014-11-06 00:00

From Andrew Woods, Technical Lead for Fedora 

Winchester, MA  Fedora 4 Beta-04 will be released before this coming Monday, November 10, 2014. The development sprint that also begins on November 10 will be focused on testing and documentation as we prepare for the Fedora 4.0 production release.

