
Eric Lease Morgan: distance.cgi – My first Python-based CGI script

planet code4lib - Fri, 2015-01-09 16:10

Yesterday I finished writing my first Python-based CGI script — distance.cgi. Given two words, it allows the reader to first disambiguate between various definitions of the words, and second, uses Wordnet’s network to display various relationships (distances) between the resulting “synsets”. (Source code is here.)

Reader input


Display result

The script relies on Python’s Natural Language Toolkit (NLTK) which provides an enormous amount of functionality when it comes to natural language processing. I’m impressed. On the other hand, the script is not zippy, and I am not sure how performance can be improved. Any hints?

Jonathan Rochkind: Fraud in scholarly publishing

planet code4lib - Fri, 2015-01-09 15:38

Should librarianship be a field that studies academic publishing as an endeavor, and works to educate scholars and students to take a critical perspective? Some librarians are expected or required to publish for career promotion; are investigations in this area something anyone does?

From Scientific American, For Sale: “Your Name Here” in a Prestigious Science Journal:

Klaus Kayser has been publishing electronic journals for so long he can remember mailing them to subscribers on floppy disks. His 19 years of experience have made him keenly aware of the problem of scientific fraud. In his view, he takes extraordinary measures to protect the journal he currently edits, Diagnostic Pathology. For instance, to prevent authors from trying to pass off microscope images from the Internet as their own, he requires them to send along the original glass slides.

Despite his vigilance, however, signs of possible research misconduct have crept into some articles published in Diagnostic Pathology. Six of the 14 articles in the May 2014 issue, for instance, contain suspicious repetitions of phrases and other irregularities. When Scientific American informed Kayser, he was apparently unaware of the problem. “Nobody told this to me,” he says. “I’m very grateful to you.”


The dubious papers aren’t easy to spot. Taken individually each research article seems legitimate. But in an investigation by Scientific American that analyzed the language used in more than 100 scientific articles we found evidence of some worrisome patterns—signs of what appears to be an attempt to game the peer-review system on an industrial scale.


A quick Internet search uncovers outfits that offer to arrange, for a fee, authorship of papers to be published in peer-reviewed outlets. They seem to cater to researchers looking for a quick and dirty way of getting a publication in a prestigious international scientific journal.

This particular form of the for-pay, mad-libs-style research paper appears to be prominent mainly among researchers in China. How can we talk about this without accidentally stooping to, or encouraging, anti-Chinese racism or xenophobia? There are other forms of research fraud and quality issues which are prominent in the U.S. and English-speaking research world too. If you follow this theme of scholarly quality issues, as I’ve been trying to do casually, you start to suspect the entire scholarly publishing system, really.

We know, for instance, that ghost-written scholarly pharmaceutical articles are not uncommon in the U.S. too. Perhaps in the U.S., scholarly fraud is more likely to come for ‘free’ from interested commercial entities than from researchers paying ‘paper salesmen’ for poor-quality papers. To me, a paper written by a pharmaceutical company employee but published under the name of an ‘independent’ researcher is arguably a worse ethical violation, even if everyone involved can think, “Well, the science is good anyway.” It also wouldn’t shock me if very similar systems to China’s paper-for-sale industry exist in the U.S. on a much smaller scale, but are more adept at avoiding reuse of nonsense boilerplate, making them harder to detect. Presumably the Chinese industry will get better at avoiding detection too, or perhaps already is at a higher end of the market.

In both cases, the context is extreme career pressure to ‘publish or perish’, into a system that lacks the ability to actually ascertain research quality sufficiently, but which the scholarly community believes has that ability.

Problems with research quality don’t end here; they go on and on, and are starting to get more attention.

  • An article from the LA Times from Oct 2013,
    “Science has lost its way, at a big cost to humanity: Researchers are rewarded for splashy findings, not for double-checking accuracy. So many scientists looking for cures to diseases have been building on ideas that aren’t even true.” (and the HN thread on it).
  • From the Economist, also from last year, “Trouble at the lab: Scientists like to think of science as self-correcting. To an alarming degree, it is not.”
  • From Nature, August 2013 (was 2013 the year of discovering scientific publishing ain’t what we thought?), “US behavioural research studies skew positive:
    Scientists speculate ‘US effect’ is a result of publish-or-perish mentality.”

There are also individual research papers investigating particular issues, especially statistical methodology problems, in scientific publishing.  I’m not sure if there are any scholarly papers or monographs which take a big picture overview of the crisis in scientific publishing quality/reliability — anyone know of any?

To change the system, we need to understand the system — and start by lowering confidence in the capabilities of existing ‘gatekeeping’.  And the ‘we’ is the entire cross-disciplinary community of scholars and researchers. We need an academic discipline and community devoted to a critical examination of scholarly research and publishing as a social and scientific phenomenon, using social science and history/philosophy of science research methods; a research community (of research on research) which is also devoted to education of all scholars, scientists, and students into a critical perspective.   Librarians seem well situated to engage in this project in some ways, although in others it may be unrealistic to expect.

Filed under: General

Islandora: Announcing the Islandora GIS Interest Group

planet code4lib - Fri, 2015-01-09 14:34

A new Islandora Interest Group has been convened by James Griffin of Lafayette College Libraries. The GIS IG is looking for interested members to join in discussions about how to handle geospatial data sets in Islandora. As with our other Interest Groups, group documents and membership details are handled through GitHub.

  • The primary objective of this interest group is to release a set of functionality that gives members of the Islandora Community the ability to ingest, browse, and discover geospatial data sets.
    • Essential to this solution is the ability to visualize geospatial vector and raster data sets (features and coverages), as well as the ability to index Fedora Commons Objects within Apache Solr using key geographic metadata fields.
  • In doing so, this interest group addresses the preservation, access, and discovery of geospatial data sets (including, but not limited to, Esri Shapefiles [vector data sets] and GeoTIFF images [raster data sets]).
  • It enables and structures descriptive and technical metadata for these data sets (including the use of ISO 19139-compliant XML documents, referred to as “Federal Geographic Data Committee” (FGDC) documents, and MODS documents).
  • It explores and defines a series of best practices for the generation of common and open standards for the serialization of entities within geospatial data sets (such as Keyhole Markup Language documents and GeoJSON objects).
  • It captures and manages user stories involving the management of geospatial metadata, and refactors these into feature or improvement requests for the solution being implemented by the Islandora Community.

If you are interested in joining up, please reply on this listserv thread or contact the convenor.

Library of Congress: The Signal: Web Archive Management at NYARC: An NDSR Project Update

planet code4lib - Fri, 2015-01-09 14:21

The following is a guest post by Karl-Rainer Blumenthal, National Digital Stewardship Resident at the New York Art Resources Consortium (NYARC).

A tipping point from traditional to emergent digital technologies in the regular conduct of art historical scholarship threatens to leave unprepared institutions and their researchers alike in a “digital black hole.” NYARC–the partnership of the Frick Art Reference Library, the Museum of Modern Art Library and the Brooklyn Museum Library & Archives–seeks to institute permanent and precedent-setting collecting programs for born-digital primary source materials that make this black hole significantly more gray.

Since the 2013 grant from the Andrew W. Mellon Foundation, for instance, NYARC has archived the web presences of its partner museums and those of prominent galleries, auction houses, artists, provenance researchers and others within their traditional collecting scopes. While working to define description standards and integrating access points with those of traditional resources, NYARC has further leveraged this leadership opportunity by designing this current National Digital Stewardship Residency project, which is to concurrently prepare their nascent collections for long-term management and preservation.

Archiving MoMA’s many exhibit sites preserves them for future art historians, but only if critical elements aren’t lost in the process.

Stewarding web archives to the future generations that will learn from them requires careful planning and policymaking. Sensitive preservation description and reliable storage and backup routines will ultimately determine the accessibility of these benchmarks of our online culture for future librarians, archivists, researchers and students. Before we can plan and prepare for the long term, however, it is incumbent upon those of us with responsibility to steward especially visually rich and complex cultural artifacts to assure their integrity at the point of collection–to assure their faithful rendition of the extent, behavior and appearance of visual information transmitted over this uniquely visual medium.

Quality assurance (QA)–the process of verifying and/or making the interventions necessary to improve the accuracy and integrity of archived web-based resources at the point of their collection–was therefore the logical place to begin defining long term stewardship needs.  As I quickly discovered, though, it also happens to be one of the slipperiest issues for even experienced web archivists. Like putting together a jigsaw puzzle, its success begins with having all of the right pieces, then requires fitting those pieces together in the correct order and sequence, and ultimately hinges on the degree to which our final product’s ‘look and feel’ resembles that of our original vision.

Unless and until the technologies that we use to crawl and capture content from the live web can simply replicate every conceivable experience that any human browser may have online, we are compelled to decide which specific properties of equally sprawling and ephemeral web presences are of primary significance to our respective missions and patrons, and which therefore demand our most assiduous and resource-intensive pursuit.

Determining those priority areas and then finding the requisite time and manpower to do them justice is challenging enough to any web archiving operation. To a multi-institutional partnership sharing responsibility for aesthetically diverse but equally rich and complex web designs, it’s enough to stop you right in your tracks. To keep NYARC’s small army of graduate student QA technicians all moving in the same direction as efficiently as possible, and to sustain a model of their work beyond the end of their grant-funded terms, I’ve therefore spent the bulk of this first phase of my NDSR project building towards the following procedural reference guide. I now welcome the broader web archiving community to review, discuss and adapt this to their own use:

This living document will be updated to reflect technical and practical developments throughout and beyond the remainder of my residency. In the meantime, it will provide NYARC’s decision-makers, and others who are designing permanent web archiving programs, an executive summary of the principles and technologies that influence the potential scopes of QA work. Its procedural guidelines walk our QA technicians through their regular assessment and documentation process. Perhaps most importantly, this roadmap directs them to the areas where they may make meaningful interventions, indicates where they alternatively must rely on help from our software service providers, Archive-It, and flags where future technical development still precludes any potential for improvement. Finally, it inventories the major problem areas and improvement strategies presently known to NYARC to make or break the whole process.

This iteration of NYARC’s documentation is the product of expansive literature review, hands-on QA work, regular consultation and problem solving with interns and professional staff, and the generous advice of colleagues throughout the community. As such, it has prepared me not only for upcoming NDSR project phases focused on preservation metadata and archival storage, but also for a much longer career in digital preservation.

As any such project must, it ties the success of any rapidly acquired technical knowledge or expertise to equally effective project management, communication and open documentation–skill sets that every emergent professional must cultivate in order to have a permanent role in the stewardship of our always tumultuous digital culture. I’m sure that this small documentation effort will provide NYARC, and similar partners in the field, with the tools to improve the quality of their web archives. I also sincerely hope that it provides a model of practice to sustain such improvements over radical and unforeseen technological changes–that it makes the digital black hole just a little more gray.

Shelley Gullikson: Web Rewriting Sprint

planet code4lib - Fri, 2015-01-09 14:08

At the end of October, I was watching tweets coming out of a UX webinar and saw this:

I thought it sounded great, so ran it by Web Committee that same week and we scheduled a sprint for the end of term. Boom. I love it when an idea turns into a plan so quickly!

We agreed that we needed common guidelines for editing the pages. I planned to point to an existing writing guide, but decided to draft one using examples from our own site.

I put together a spreadsheet of all the pages linked directly from the home page or navigation menus, plus all the pages owned by admin or by me. Subject guides and course guides were left out. The committee decided to start with content owned by committee members, rather than asking permission to edit other staff members’ content. We prioritized the resulting list of 57 pages (well, 57 chunks of content – some of those were Drupal “books” with multiple pages).

Seven of us got together on an early December afternoon (six in the room, one online from the East Coast). Armed with snacks, we spent 90 minutes editing and got through most of our top and mid-priority pages.

It was a very positive experience. We got a second set of eyes on content that may have only ever been looked at by one person. We were able to talk to each other to get feedback on clear and concise wording. And we saw pages that were already pretty good, which was a nice feeling too.

We’ve organized another sprint for reading week in February. We’re going to look at the top priority pages again, to see if we can make them even clearer and more concise.


DPLA: From Book Patrol: Happy New Calendar!

planet code4lib - Fri, 2015-01-09 14:00

Now that we have our new calendar in place to help track the year ahead, let’s have a look back at some of the thousands of calendars available for your perusal at the DPLA. The word “calendar” derives from the Latin kalendae, the name of the first day of every month, and there are as many varieties of calendars as there are days of the month.

From a 12th-century Book of Hours to a 16th-century perpetual calendar to a Native American calendar on buckskin to a handwritten calendar by Lee Harvey Oswald, there is no shortage of creative ways to track time and, in many cases, to advertise one’s business.


LITA: Tech Tools in Traditional Positions

planet code4lib - Fri, 2015-01-09 12:00

During this winter break, I’ve had a slight lull in library work and time to reflect on my first semester of library school, aside from reading for pleasure and beginning Black Mirror on Netflix (anybody?). Overall, I’m ready to dive in to the new semester, but one tidbit from fall semester keeps floating in my thoughts, and I’m curious what LITA Blog readers have to say.

Throughout my undergraduate education at the University of Nebraska-Lincoln, I was mainly exposed to two different sets of digital humanities practices: encoding and digital archive practices, and text analysis for literature. With my decision to attend library school, I assumed I would focus on the former for the next two to three years.

Last semester, in my User Services and Tools course, we had a guest speaker from User Needs Assessment in the Indiana University Libraries. As the title suggests, he spoke about developing physical user spaces in the libraries and facilitating assessments of current spaces.

For one portion of his assessments, he used text analysis, more specifically topic modeling with MALLET, a Java-based, natural language processing toolkit, to gain a better understanding of written survey results. This post by Shawn Graham, Scott Weingart, and Ian Milligan explains topic modeling, when/how/why to use it, and various tools to make it happen, focusing on MALLET.

If you didn’t follow the links, topic modeling works by aggregating many texts a user feeds into the algorithm and returns sets of related words from the texts. The user then attempts to understand the theme presented by each set of words and give reason to why it appears. Many times, this practice can reveal themes the user may not have noticed through traditional reading across multiple texts.

Image courtesy of Library Technology Consultants.

From a digital humanities perspective, we love it when computers show us things we missed or help make a task more efficient. Thus, using topic modeling seems an intuitive step for analyzing survey results, as the guest speaker presented. Yet it was also unexpected, considering his more traditional position.

I’m curious where you have used some sort of technology, coding, or digital tool to solve a problem or expedite a process in a more traditional library position. Librarians working with digital objects use these technologies and practices daily, but as digital processes, such as topic modeling and text analysis, become more widely used, I’m interested to see where else they crop up and for which reasons.

Feel free to respond with an example of when you unexpectedly used text analysis or another tech tool in your library to complete a task that didn’t necessarily involve digital objects! How did you discover the tool? How did you learn it? Would you use it again?

LITA: Jobs in Information Technology: January 8

planet code4lib - Thu, 2015-01-08 21:09

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Associate Dean for Technology and Digital Strategies, The Pennsylvania State University Libraries, University Park, PA

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.


DPLA: Metadata Aggregation Webinar: January 22, 2015 at 2 PM Eastern

planet code4lib - Thu, 2015-01-08 19:14

Metadata is the basis of the work of DPLA. We rely on a growing network of Content Hubs, large repositories of digital content, and Service Hubs that aggregate metadata from partners. We, in turn, aggregate the Hubs’ metadata into the DPLA datastore.

With new Hubs, we often work together to identify organizational and governance structures that make the most sense for their local situation. Once an administrative model is established, the practical matter of how to aggregate their partners’ metadata and how to deal with quality control over the resulting aggregated set plays a larger role.

DPLA’s Hub network does not rely on a single metadata aggregation workflow or tool, and our own aggregation practices are quite a bit different from our partners’. While diversity in approaches is good in that each Hub can create a process that works best for them, it also means that our community hasn’t decided on a set of standard practices or tools.

We’ve recently implemented an application process for new Hubs, so it seems timely to start a conversation about metadata aggregation practices among our current and potential Hubs, their partners, and really, anyone else interested in sharing and enhancing metadata. It seems that there’s always something to learn about metadata aggregation, and we’re hopeful that DPLA can be a conduit for a discussion about some of the fundamental concepts and requirements for local practice and aggregation at scale.
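In practice, much of this kind of aggregation happens over OAI-PMH. As a minimal, hypothetical sketch (not DPLA's actual tooling), pulling record identifiers out of a ListRecords response might look like this, here parsing an abbreviated sample document rather than a live endpoint:

```python
# Hypothetical sketch of parsing an OAI-PMH ListRecords response; the sample
# XML and identifiers are invented for illustration.
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0"?>
<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header><identifier>oai:example.org:rec1</identifier></header>
    </record>
    <record>
      <header><identifier>oai:example.org:rec2</identifier></header>
    </record>
  </ListRecords>
</OAI-PMH>"""

NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}

def record_identifiers(xml_text):
    """Return the <identifier> of every record header in the response."""
    root = ET.fromstring(xml_text)
    return [
        header.findtext("oai:identifier", namespaces=NS)
        for header in root.iterfind(".//oai:record/oai:header", NS)
    ]

ids = record_identifiers(SAMPLE)
```

A real harvester would fetch each page of results over HTTP, follow the protocol's resumptionToken to page through the full set, and map each record's metadata into the aggregator's schema before quality-control checks.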

To that end, on January 22, at 2 pm eastern, we will be hosting a webinar about metadata aggregation. We’ll be taking an inside look at aggregation best practices at two of our DPLA Service Hubs in North Carolina and South Carolina. In addition, DPLA has been working on improving our existing tools as well as creating some new ones for metadata aggregation and quality control. We’d like to share what’s in place and preview some of our plans, and we hope to get feedback on future directions.


This webinar will be offered to the public. Since we’ll be limited to 100 seats, please limit registration to no more than two seats per organization. Please get in touch with Gretchen with any questions.

Hydra Project: Indiana University and WGBH Boston share major NEH grant

planet code4lib - Thu, 2015-01-08 09:31

Hydra Partners Indiana University and WGBH Boston have jointly been awarded nearly $400,000 by the National Endowment for the Humanities to develop HydraDAM2, a Hydra-based software tool that will assist in the long-term preservation of audio and video collections.

HydraDAM2 will primarily address challenges posed by long-term preservation of digital audio and video files. Because these “time-based media” files are significantly larger than many other digital files managed by libraries and archives, they potentially require special solutions and workflows.

An important feature of HydraDAM2 is that it will be open source and can be used and shared freely among cultural institutions, including libraries, archives, universities and public broadcasters.

HydraDAM2 is also scalable to both small and large organizations, having the ability to interact with massive digital storage systems as well as with smaller digital tape storage systems.

The full press release can be found at

Congratulations to Jon Dunn at IU, Karen Cariani at WGBH and to their teams.

District Dispatch: Key ALA Offices to team up with “Advocacy Guru” at 2015 Midwinter Meeting

planet code4lib - Thu, 2015-01-08 06:06

In her inaugural column for American Libraries, titled “Advocate. Today.,” American Library Association (ALA) President Courtney Young challenged librarians of all types, and friends of libraries, to commit to spending just an hour a week advocating for libraries. To take the mystery out of just what “advocacy” means, how to do it and how to have fun along the way, ALA’s Offices of Intellectual Freedom (OIF), Library Advocacy (OLA), Public Information and the Washington Office will partner with all ALA divisions to present “An Hour a Week: Library Advocacy is Easy!!!” during the 2015 ALA Midwinter Meeting in Chicago. The program is being cosponsored and will be co-promoted by all of ALA’s twelve divisions.

Grassroots advocacy guru Stephanie Vance leads a session at a previous ALA conference.

The session, which will be held on Saturday, January 31, 2015, from 10:30–11:30 a.m., will be led by the ever-popular “Advocacy Guru,” Stephanie Vance, who will walk “newbies” and “old pros” alike through just what advocacy means today–from engaging with the local PTAs and library boards to lobbying the White House. With the help of panelists from OIF and OLA, Vance will share easy advocacy strategies and lead a lightning tour of the many terrific ALA advocacy resources available to give “ALAdvocates” everything they need to answer Courtney’s call.

For fully half the program, Vance also will “dive” into the audience to “get the story” of what works in the real world from real librarians and library supporters from all parts of the profession. These everyday advocates will share their own hands-on experiences at bringing libraries’ messages to allies, policy-makers and elected officials at every level and explain why every librarian’s (and library lover’s) involvement in advocacy is so critical to the future of libraries of every kind everywhere.

Additional speakers include Marci Merola, director of the ALA Office for Library Advocacy and Barbara Jones, director of the ALA Office for Intellectual Freedom.

View other ALA Washington Office Midwinter Meeting conference sessions

The post Key ALA Offices to team up with “Advocacy Guru” at 2015 Midwinter Meeting appeared first on District Dispatch.

Library of Congress: The Signal: Digital Preservation in Mid-Michigan: An Interview with Ed Busch

planet code4lib - Wed, 2015-01-07 18:12

Conferences, meetings and meet-ups are important networking and collaboration events that allow librarians and archivists to share digital stewardship experiences. While national conferences and meetings offer strong professional development opportunities, regional and local meetings offer opportunities for practitioners to connect and network with a local community of practice. In a previous blog post, Kim Schroeder, a lecturer at the Wayne State University School of Library and Information Science, shared her experiences planning and holding Regional Digital Preservation Practitioners (RDPP) meetings in Detroit. In this post, part of our Insights Interview series, I’m excited to talk to Ed Busch, Electronic Records Archivist at Michigan State University, about his experiences spearheading the Mid-Michigan Digital Practitioners Group.

Erin: Please tell us a little bit about yourself and what you do at Michigan State University.

Ed Busch with MSU archives collection of civil war documents. Photo courtesy of Communications and Brand Strategy.

Ed: I come from what I suspect is a unique background for an archivist. I have an undergraduate B.S. in Fisheries from Humboldt State University in California, took coursework in programming (BASIC, FORTRAN, APL), worked as a computer operator (loading punch cards and hanging tapes), performed software testing as well as requirements writing, and was a stay-at-home dad for a period of time.

It was during this period that I looked into librarianship; I thought I could bring an IT background along with my love of history and genealogy to the field. After I completed my MLIS and Archives Administration certificate at Wayne State in 2007, I began a processing archivist position at the MSU Archives that led to my current position as the Electronic Records Archivist.

As an archivist here, I work on a lot of different projects. This includes “digital projects” such as web crawling (via Archive-It), adding content to our website, managing our Archivists’ Toolkit installation, managing a couple of listservs (Michigan Archival Association and Big10 Archivists), working on our Trusted Digital Repository workflows and identifying useful tools to aid processing digital records. I also continue to do some paper processing, manage our Civil War Letters and Diaries Digitization project and the development of an AV Digitization Lab at the archives. I’m also the first person staff consults for PC or network issues at the archives.

Erin: How are you involved in digital preservation at your organization?

Ed: I supported my fellow electronic records archivist Lisa Schmidt on a NHPRC grant to create the Spartan Archive, a repository for Michigan State University’s born-digital institutional administrative records. For the grant, we focused on MSU’s Office of the Registrar digital records.

As a follow-on to the grant we are working on creating a Trusted Digital Repository for MSU. We are currently ingesting digital records using Archivematica into a preservation environment. Lisa and an intern do most of the actual ingesting while I provide technical advice, create workflows for unique items and identify useful tools. We are also evaluating applications that can help manage our digital assets and to provide access to them.

One area that has been on the “To Do list” is processing the digital assets from our university photographers and videographers. The challenges include selecting what to keep and what not, how to provide access and how to fund the storage for this large amount of data. I’ve also explored some facial recognition applications but haven’t found a good way to integrate into our TDR yet.

I’m also the person doing all the web archiving for the University and testing out migrating ArchivesSpace so that we can schedule a transition for it. Besides the Mid-Michigan Digital Practitioners (MMDP) meeting planning, I also attend meetings of Web Developers here at MSU (WebDev CAFE) and am a volunteer on the ArchivesSpace Technical Advisory Council.

Erin: Could you talk about Mid-Michigan Digital Practitioners Group. You have had some very successful regional meetings over the past couple of years. Can you tell us more about these?

Presentation during the first MMDP. Photo credit: Courtesy of MSU Archives.

Ed: In February of 2013, I heard about a new group for Digital Preservation Practitioners in the Detroit/Ann Arbor/Toledo/Windsor area. I recall thinking that this sounded neat, and I wanted to explore whether there was interest in holding a session for Mid-Michigan digital preservation practitioners, with the purpose of getting together to talk about what the various institutions are doing: projects, technologies, partners, etc.

After contacting some of my colleagues about this, the answer was a resounding yes! Portia Vescio (Assistant Director of the Archives) and myself contacted Digital Curation Librarian Aaron Collie and we created Mid-Michigan Digital Practitioners. Systems Librarian Ranti Junus joined the three of us to form the Mid-Michigan Digital Practitioners planning group.  We’ve had great support from the MSU Archives and MSU Libraries leadership for this effort.

We held our first meeting at MSU in August of 2013. From the beginning, we’ve been big on using email and surveys to get ideas and help from the Mid-Michigan professionals working with digital materials. For this first meeting, we came up with a rough agenda and started soliciting presenters to talk about what they were working on. We also communicated with Kim and Lance [Stuchell]’s group to keep them in the loop. There was some concern that there were two groups but we really wanted to serve the needs of the Mid-Michigan area. Many smaller shops don’t have the resources to go far. At that first meeting, we had over 50 attendees from around 15 different institutions. What most people kept saying they liked best was the chance to talk to other people trying to solve the same problems.

We held the second meeting in March 2014 at Grand Valley State University with over 50 attendees from 24 different institutions. We repeated the process and held the third meeting at Central Michigan University this past September with 50 attendees from over 20 institutions.

We’re now just starting the planning for the 4th meeting for March 27, 2015 at the University of Michigan in Ann Arbor. We have high hopes for a great meeting and hopefully some student involvement from the U of Michigan School of Information and Wayne State University School of Library Science. We’ve also set up a listserv to aid communication.

Erin: What did you feel was most successful about your meetings?

Participants at the first MMDP. Photo credit: Courtesy of MSU Archives.

Ed:  I think what’s been most successful is creating a chance for archivists, librarians and museum curators from all types and sizes of institutions to share experiences, what’s worked, what hasn’t, nifty tools, cool projects, etc. about their digital materials. Feedback from the meetings has this as the thing most people liked best. We also really do try to use the feedback we get to improve each meeting, try out new things and talk about what people are interested in learning more about.

Erin: What kind of impact do you think these meetings have had on your community and the organizations in your region?

Ed:  I think our greatest contribution to the region has been creating a place for professionals from large and small institutions to see what’s happening in the area of digital materials and to share experiences. Digital materials have the same issues/problems/situations for all of us; the main difference being what resources we can use to deal with them. By providing a forum for people to meet, hopefully everyone can get ideas to take back with them and to have information they can share with their leadership on the importance of this work.

Erin: What one piece of advice would you offer others who may be interested in starting up a regional practitioners group?

Ed:  One thing that I believe has kept our group going is that the core planning group is all located at MSU. We can meet every few weeks to work on the next meeting, assign tasks and share information with the host institution. That said, for the next MMDP meeting, we are expanding our planning group to include a few other people who will call in to the planning meetings. We’ll see how that works and regroup if needed or possibly add some more. Flexibility is important.

I do sincerely believe though that what really makes a difference is the interest and commitment of the planning team and its leadership at the Archives and Libraries to keep this going even though we each have a lot on our plates. We feel this is vital to the community of archivists, librarians and curators in the area.

Erin White: Easier access for databases and research guides at VCU Libraries

planet code4lib - Wed, 2015-01-07 15:00

Today VCU Libraries launched a couple of new web tools that should make it easier for people to find or discover our library’s databases and research guides.

This project’s goal was to help connect “hunters” to known databases and help “gatherers” explore new topic areas in databases and research guides [1]. Our web redesign task force identified these issues in 2012 user research.

1. New look for the databases list

Since the dawn of library-web time, visitors to our databases landing page were presented with an A to Z list of hundreds of databases with a list of subject categories tucked away in the sidebar.

The new design for the databases list presents a few ways to get at databases, in this order:

For the hunters:

  • Search by title with autocomplete (new functionality)
  • A to Z links

For the gatherers:

  • Popular databases (new functionality)
  • Databases by subject

And, on database subject pages and database search results, there are links to related research guides.

2. Suggested results for search

Building on the search feature in the new database list, we created an AJAX, Google AdWords-esque add-on to our search engine (Ex Libris’ Primo) that recommends database or research guide results based on the search query. For longer, more complex queries, no suggestions are shown.
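The post doesn’t show the implementation, but the gating behavior it describes (skip suggestions for long, complex queries) can be sketched roughly like this. The word-count threshold and the naive title matching are assumptions for illustration, not VCU’s actual logic:

```javascript
// Only attempt suggestions for short, simple queries; longer queries
// get none, as the post describes. maxWords is an assumed threshold,
// not VCU's actual rule.
function shouldSuggest(query, maxWords = 4) {
  const words = query.trim().split(/\s+/).filter(Boolean);
  return words.length > 0 && words.length <= maxWords;
}

// Naive substring matcher standing in for the real Primo-backed index.
function suggestDatabases(query, titles) {
  if (!shouldSuggest(query)) return [];
  const q = query.trim().toLowerCase();
  return titles.filter((title) => title.toLowerCase().includes(q));
}
```

In a real deployment the match would run server-side against the indexed metadata; the point here is only that suggestion logic can bail out early on queries too complex to match a title.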

Try these queries:

Included in the suggested results:

3. Updates to link pathways for databases

To highlight the changes to the databases page, we also made some changes to how we are linking to it. Previously, our homepage search box linked to popular databases, the alphabet characters A through Z, our subject list, and “all”.

The intent of the new design is to surface the new databases list landing page and wean users off the A-Z interaction pattern in favor of search.

The top three databases are still on the list both for easy access and to provide “information scent” to clue beginner researchers in on what a database might be.

Dropping the A-Z links will require advanced researchers to make a change in their interaction patterns, but it could also mean that they’re able to get to their favorite databases more easily (and possibly unearth new databases they didn’t know about).

Remaining questions/issues
  • Research guides search is just okay. The results are helpful a majority of the time and wildly nonsensical the rest of the time. And, this search is slowing down the overall load time for suggested results. The jury is still out on whether we’ll keep this search around.
  • Our database subject categories need work, and we need to figure out how research guides and database categories should relate to each other. They don’t connect right now.
  • We don’t know if people will actually use the suggested search results and are not sure how to define success. We are tracking the number of clicks on these links using Google Analytics event tracking – but what’s good? How do we know to keep this system around?
  • The change away from the A-Z link list will be disruptive for many and was not universally popular among our librarians. Ultimately it should be faster for “hunters”, but we will likely hear groans.
  • The database title search doesn’t yet account for common and understandable misspellings [2] of database names, which we hope to rectify in the future with alternate titles in the metadata.
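That alternate-titles fix could be sketched like this; the record shape and field names are hypothetical, not the actual metadata schema:

```javascript
// Hypothetical sketch of title matching that tolerates known
// misspellings by checking alternate titles stored in the metadata.
// The record shape ("title", "altTitles") is an assumption.
function normalizeTitle(s) {
  return s.toLowerCase().replace(/[^a-z0-9]/g, "");
}

function matchesDatabase(query, record) {
  const q = normalizeTitle(query);
  if (!q) return false;
  return [record.title, ...(record.altTitles || [])]
    .some((title) => normalizeTitle(title).startsWith(q));
}
```

With an alternate title like “PsychINFO” on the record, a user who types the common misspelling still lands on PsycINFO.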
Necessary credits

Shariq Torres, our web engineer, provided the programming brawn behind this project, completely rearchitecting the database list in Slim/Ember and writing an AJAX frontend for the suggested results. Shariq worked with systems librarians Emily Owens and Tom McNulty to get a Dublin Core XML file of the databases indexed and searchable in Primo. Web designer Alison Tinker consulted on look and feel and responsified the design for smaller-screen devices. A slew of VCU librarians provided valuable feedback and QA testing.

  1. I believe this hunter-gatherer analogy for information-seeking behaviors came from Sandstrom’s An Optimal Foraging Approach to Information Seeking and Use (1994) and have heard it in multiple forms from smart librarians over the years.
  2. Great info from Ken Varnum’s Database Names are Hard to Learn (2014)

DPLA: What’s Ahead for DPLA: Our New Strategic Plan

planet code4lib - Wed, 2015-01-07 14:45

The Digital Public Library of America launched on April 18, 2013, less than two years ago. And what a couple of years it has been. From a staff of three people, a starting slate of two million items, and 500 contributing institutions, we are now an organization of 12, with over eight million items from 1,300 contributing institutions. We have materials from all 50 states—and from around the world—in a remarkable 400 languages. Within our collection are millions of books and photographs, maps of all shapes and sizes, material culture and works of art, the products of science and medicine, and rare documents, postcards, and media.

But focusing on these numbers and their growth, while gratifying and a signal DPLA is thriving, is perhaps less important than what the numbers represent. DPLA has always been a community effort, and that community, which became active in the planning phase to support the idea of a noncommercial effort to bring together American libraries, archives, and museums, and to make their content freely available to the world, has strengthened even more since 2013. A truly national network and digital platform is emerging, although we still have much to do. A strong commitment to providing open access to our shared cultural heritage, and a deeply collaborative spirit, is what drives us every day.

Looking back, 2013 was characterized by a start-up mode: hiring staff, getting the site and infrastructure live, and bringing on a first wave of states and collections. 2014 was a year in which we juggled so much: many new hubs, partners, and content, lining up additional future contributors, and beginning to restructure our technology behind the scenes to prepare for an even more expansive collection and network.

Beginning this year, and with the release of our strategic plan for the next three years, the Digital Public Library of America will hit its stride. We encourage you to read the plan to see what’s in store, but also to know that it will require your help and support; so much in the plan is community-driven, and will be done with that same emphasis on widespread and productive collaboration.

We will be systematically putting in place what will be needed to ensure that there’s an on-ramp to the DPLA for every collection in the United States, in every state. We call this “completing the map,” or making sure that we have a DPLA service hub available to every library, archive, museum, and cultural heritage site that wishes to get their materials online and described in such a way as to be broadly discoverable. We also plan to make special efforts around certain content types—areas where there are gaps in our collection, or where we feel DPLA can make a difference as an agent of discovery and serendipity.

We have already begun to make some major technical improvements that will make ingesting content and managing metadata even better. This initiative will accelerate and be shared with our network. Moreover, we will make a major effort in the coming years to make sure that our valuable unified collection reaches every classroom, civic institution, and audience, to educate, inform, and delight.

There’s a lot to do. We just put a big pot of coffee on. Please join us for this next phase of rapid growth and communal effort!

LibUX: Links Should Open in the Same Window

planet code4lib - Wed, 2015-01-07 07:53

A question came up on ALA Think Tank:

What do you prefer: to click a link and it open in a new tab or for it to open in the same page? Is there a best practice?

There is. The best practice is to leave the default link behavior alone. Usually, this means that the link on a website will open in that same window or tab. Ideas about what links should do are taken for granted, and “best practices” that favor links opening new windows – well, aren’t exactly.

It’s worth taking a look at the original thread because I really hesitate to misrepresent it. I’m not bashing. Well-meaning Think Tankers were in favor of links opening new tabs. Below, I cherry-picked a few comments to communicate the gist:

  • “Most marketing folks will tell you that If it is a link outside your website open in a new tab, that way they don’t lose your site. Within your own site then stay with the default.”
  • “New tab because it’s likely that I want to keep the original page open. And, as [name redacted] mentions, you want to keep them on your site.”
  • “External links open in new tabs.”
  • “I choose to open in a new tab, so the user can easily return to the website in the original tab.”
  • “I was taught in Web design to always go with a new tab. You don’t want to navigate people away from your site.”
  • “I prefer a new tab.”
  • “I prefer a new tab” – not a duplicate.
  • “Marketers usually tell you new tab so people don’t move away from your page as fast.”
  • “I like new tabs because then I don’t lose the original page.”
  • “I prefer new tabs.”
  • “I think best practice is to open links on a new tab.”

There were three themes that kept recurring:

  1. We don’t want users to leave the website
  2. Users find new tabs or windows convenient
  3. I prefer …

I linked these up to a little tongue-in-cheek section at the bottom, but before we get squirrelly I want to make the case for linking within the same window.

Links should open in the same window

Best-in-show user experience researchers Nielsen Norman Group write that “links that don’t behave as expected undermine users’ understanding of their own system,” where unexpected external linking is particularly hostile. See, one of the benefits of the browser itself is that it frees users “from the whims of particular web page or content designers.” For as varied and unique as sites can be, browsers bake-in consistency. Consistency is crucial.


Jakob’s Law of the Web User Experience
Users spend most of their time on other websites.

Design conventions are useful. The menu bar isn’t at the top of the website because that’s the most natural place for it; it’s at the top because that is where every other website put it. The conventions set by the sites that users spend the most time on–Facebook, Google, Amazon, Yahoo, and so on–are conventions users expect to be adopted everywhere.

Vitaly Friedman summarizes a bunch of advice from usability-research powerhouses with this:

[A] user-friendly and effective user interface places users in control of the application they are using. Users need to be able to rely on the consistency of the user interface and know that they won’t be distracted or disrupted during the interaction.

And in case this just feels like a highfalutin excuse to rip off a design, interaction designer and expert animator Rachel Nabors makes the case that

Users … may be search-navigators or link-clickers, but they all have additional mental systems in place that keep them aware of where they are on the site map. That is, if you put the proper markers in place. Without proper beacons to home in on, users will quickly become disoriented.

This is all to stress the point that violating conventions, such as the default behaviors of web browsers, is a big no-no. The default behavior of hyperlinks is that they open within the same page.

While not addressing this question directly, Kara Pernice, the managing director at Nielsen Norman Group, wrote last month about the importance of confirming the person’s expectation of what a link is and where the link goes. Breaking that promise actually endangers the trust and credibility of the brand – in this case, the library.

Accessibility Concerns

Pop-ups and new windows have certain accessibility issues which can cause confusion for users relying on screen readers to navigate the website. WebAIM says:

Newer screen readers alert the user when a link opens a new window, though only after the user clicks on the link. Older screen readers do not alert the user at all. Sighted users can see the new window open, but users with cognitive disabilities may have difficulty interpreting what just happened.

Compatibility with WCAG 2.0 involves an “Understanding Guideline” which suggests that the website should “provide a warning before automatically opening a new window or tab.” Here is the technique.
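As a rough illustration of that advisory technique (not the WCAG-mandated markup), a script could annotate links already marked to open new windows so assistive technology announces the behavior. The label wording and DOM approach here are assumptions:

```javascript
// Add a textual warning to every link that opens a new window, so
// screen-reader users hear it before activating the link. The label
// wording is an assumption, not prescribed by WCAG.
function annotateNewWindowLinks(doc) {
  for (const link of doc.querySelectorAll('a[target="_blank"]')) {
    link.setAttribute(
      "aria-label",
      `${link.textContent} (opens in a new window)`
    );
  }
}
```

A visible cue (an icon or “(new window)” text) serves sighted users the same way; the point is that the warning comes before the click, not after.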


On Twitter, I said:

Your links shouldn't open new windows. There are exceptions, but this is a 90/10 rule here. #alatt #libux

— Michael Schofield (@schoeyfield) January 6, 2015

Hugh Rundle, who you might know, pointed out a totally legit use case:

@schoeyfield I don’t disbelieve you, but I do find it difficult to comprehend. If I’m reading something I want to look at the refs later.

— Hugh Rundle (@HughRundle) January 6, 2015

Say you’re reading In the Library with the Lead Pipe where the articles can get pretty long, and you are interested in a bunch of links peppered throughout the content. You don’t want to be just midway through the text then jump to another site before you’re ready. Sometimes, having a link open in a new tab or window makes sense.

But hijacking default behavior isn’t a light decision. Chris Coyier shows how to use target attributes in hyperlinks to force link behavior, but gives you no less than six reasons why you shouldn’t. Consider this: deciding that such-and-such link should open in a new window actually eliminates navigation options.

If a link is just marked up without any frills, like <a href="…">, users’ assumed behavior of that link is that it will open in the same tab/window, but by either right-clicking, using a keyboard command, or a lingering touch on a mobile device, the user can optionally open it in a new window. When you add target=_blank to the mix, those alternate options are mostly unavailable.

I think opening reference links in new windows midway through long content is a compelling use case, but it’s worth considering whether the inconvenience of default link behavior is greater than the interaction cost and otherwise downward drag on overall user experience.

Uh, you said “exceptions” …

In my mind, I do think it is a good idea to use target=_blank when opening the link will interrupt an ongoing process:

  • the user is filling out a form and needs to click on a link to review, say, terms of service
  • the user is watching video or listening to audio
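Those two interruption cases above boil down to a simple rule: open a new tab only when following the link would interrupt an ongoing process. A minimal sketch, with context flags made up for illustration:

```javascript
// Default to same-window links; reserve target=_blank for the
// interruption cases above. The context flags are hypothetical.
function linkTarget(context = {}) {
  const wouldInterrupt = context.fillingForm || context.playingMedia;
  return wouldInterrupt ? "_blank" : "_self";
}
```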

Jakob Nielsen and Vitaly Friedman think it’s a-okay to link to a non-html-document, like a pdf or mp3. Chris Coyier, however, doesn’t think so.

So, yeah, there are exceptions.

So, is there a best practice?

The best practice is to leave the default link behavior alone. It is only appropriate to open a link in a new tab or window in the rarest use cases.

Frequent Comments, or, Librarians Aren’t Their Users

We don’t want users to leave the website.

Marketing folks say this sort of thing. They are the same people who demand carousels, require long forms, and make ads that look like regular content. Using this reasoning, opening a link in a new window isn’t just an antipattern, it is a dark pattern – a user interface designed to trick people.

Plus, poor user experiences negatively impact conversion rate and the bottom line. Tricks like the above are self-defeating.

Users find new tabs or windows convenient.

No they don’t.

I prefer ….

You are not your user.

The post Links Should Open in the Same Window appeared first on LibUX.

Peter Sefton: Letter of resignation

planet code4lib - Wed, 2015-01-07 04:02

to: The boss

cc: Another senior person, HR

date: 2014-12-10

Dear <boss>,

As I discussed with you last week, I have accepted a position with UTS, starting Feb 9th 2015, and I resign my position with UWS. My last day will be Feb 6th 2015.

Regards, Peter

Dr PETER SEFTON Manager, eResearch, Office of the Deputy Vice-Chancellor (Research & Development) University of Western Sydney

Anticipated FAQ:

  • What? eResearch Support Manager – more or less the same gig as I’ve had at UWS, in a tech-focussed uni with a bigger team, dedicated visualisation service and HPC staff, an actual budget and mostly within a few city blocks.

  • Why UTS? A few reasons.

    • There was a job going, I thought I’d see if they liked me. They did. I already knew some of the eResearch team there. I’m confident we will be good together.

    • It’s a continuing position, rather than the five-year, more-than-half-over contract I was on, not that I’m planning to settle down for the rest of my working life as an eResearch manager or anything.

    • The physical concentration should be conducive to Research Bazaar #resbaz activities such as Hacky Hour.

  • But what about the travel!!!!? It will be 90 minutes laptop time each way on a comfy, reasonably cheap and fairly reliable train service with almost total mobile internet coverage, with a few hundred metres walking on either end. That’s a change from 35-90 minutes each way depending on what campus(es) I was heading for that day and the mode of transport, which unfortunately was mostly motor vehicle. I do not like adding yet another car to Sydney’s M4, M7 or M5, despite what I said in my UWS staff snapshot. I think I’ll be fine with the train. If not, oops. Anyway, there are inner-Sydney family members and mates I’ll see more of if only for lunch.

    When the internets stop working the view is at its best. OK, apart from the tunnel and the cuttings.

  • What’s the dirt on UWS? It’s not about UWS, I’ve had a great time there, learned how to be an eResearch manager, worked on some exciting projects, made some friends, and I’ll be leaving behind an established, experienced eResearch team to continue the work. I’m sorry to be going. I’d work there again.

  • Why did you use this mode of announcement? I was inspired by Titus Brown, a few weeks ago.

[updated 2015-01-07 – typo]

Library Tech Talk (U of Michigan): Website Refresh: Really Thinking It Through

planet code4lib - Wed, 2015-01-07 00:00
The Digital Library Production Service (DLPS) recently did a thoughtful and comprehensive update of its web presence on the University of Michigan Library website. This post summarizes the process and calls out the value of having a web content strategist in the mix.

Harvard Library Innovation Lab: Link roundup January 6, 2015

planet code4lib - Tue, 2015-01-06 19:15

Whoa! A batch of links in one day.

STUDIO for Creative Inquiry » Balance from Within

The sofa provides a space for a range of social interactions.

Career Spotlight: What I Do as a Librarian

Librarian career spotlight. “Customer service is always my number one goal.”

Cartoon: Dewey

Hilarious provisional additions to the Dewey Decimal System

Lincoln Book Tower | Ford’s Theatre

A 34 foot tower of books about Abraham Lincoln lives at the Ford’s Theatre Center for Education and Leadership

Watch This 3D-Printed Object Fold and Launch Paper Airplanes | Mental Floss

Use your 3D printer to make an all-in-one paper airplane folder and launcher

Library of Congress: The Signal: Report Available for the 2014 DPOE Training Needs Assessment Survey

planet code4lib - Tue, 2015-01-06 18:49

The following is a guest post by Barrie Howard, IT Project Manager at the Library of Congress.

In September, the Digital Preservation Outreach and Education (DPOE) program wrapped up the “2014 DPOE Training Needs Assessment Survey” in an effort to get a sense of current digital preservation practice, a better understanding about what capacity exists for organizations and professionals to effectively preserve digital content and some insight into their training needs. An executive summary (PDF) and full report (PDF) about the survey results are now available.

The respondents expressed an overwhelming concern for making their content accessible for at least a ten-year horizon, and showed strong support for educational opportunities, like the DPOE Train-the-Trainer Workshop, which provides training to working professionals, increasing organizational capacity to provide long-term access to digital content.

As mentioned in a previous blog post announcing the survey results, this survey was a follow-up to an earlier survey conducted in the summer and fall of 2010.  The questions addressed issues such as the primary function of an organization (library, archive, museum, etc.), staff size and responsibilities, collection items, preferred training content and delivery options and financial support for professional development and training. There was good geographic coverage in the responses from organizations in 48 states, Washington D.C. and Puerto Rico, and none of the survey questions were skipped by any of the respondents. Overall, the distribution of responses was about the same from libraries, archives, museums and historical societies between 2010 and 2014, although there was a notable increase in participation from state governments.

The most significant takeaways from the 2014 survey are:

1) an overwhelming expression of concern that respondents ensure their digital content is accessible for ten or more years (84%);

2) evidence of a strong commitment to support employee training opportunities (83%, which is an increase from 66% reported in 2010), and;

3) similar results between 2010 and 2014. This trend will be of particular interest when the survey is conducted again in 2016.

Other important discoveries reveal changes in staff size and configuration over the last four years. There was a marked 6% decrease in staff size at smaller organizations (those with 1-50 employees), and a slight 2% drop in staff size at large organizations with over 500 employees. In comparison, medium-size organizations reported a 4% uptick in the staff range of 51-200, and 3% for the 201-500 tier. There was a substantial 13% increase across all organizations in paid full-time or part-time professional staff with practitioner experience, and a 5% drop in organizations reporting no staff at all. These findings suggest positive trends across the digital preservation community, which bodes well for the long-term preservation of our collective cultural heritage. Born-digital content was not offered as a choice in the 2010 survey question about content held by respondents, yet it is now a close second to reformatted materials. This will be another closely-monitored data point in 2016.

Preparation of charts and graphs by Mr. Robert L. Bostick and Mrs. Florence A. Phillips, 1951. Library of Congress Prints and Photographs Division.

Regarding training needs, online delivery is trending upward across many sectors to meet the constraints of reduced travel and professional development budgets. However, results of the 2014 survey reveal respondents still value intimate, in-person workshops as one of their most preferred delivery options, with webinars and self-paced online courses as the next two choices. Respondents demonstrated a preference for training focused on applicable skills, rather than introductory material on basic concepts, and show a preference to travel off-site within a 100-mile radius for half- to full-day workshops over other options.

DPOE currently offers an in-person, train-the-trainer workshop, and is exploring options for extending the workshop Curriculum to include online delivery options for the training modules. These advancements will address some of the issues raised in the survey, and may include regularly scheduled webinars, on-demand videos, and pre- and post-workshop videos. Keep a watchful eye on the DPOE website and The Signal for subsequent DPOE training materials as they become available.

