Feed aggregator

Jonathan Rochkind: “Dutch universities start their Elsevier boycott plan”

planet code4lib - Fri, 2015-07-03 01:42

“We are entering a new era in publications”, said Koen Becking, chairman of the Executive Board of Tilburg University in October. On behalf of the Dutch universities, he and his colleague Gerard Meijer negotiate with scientific publishers about an open access policy. They managed to achieve agreements with some publishers, but not with the biggest one, Elsevier. Today, they start their plan to boycott Elsevier.

Dutch universities start their Elsevier boycott plan

Filed under: General

Mark E. Phillips: Characteristics of subjects in the DPLA

planet code4lib - Thu, 2015-07-02 14:49

There are still a few things that I have been wanting to do with the subject data from the DPLA dataset that I’ve been working with for the past few months.

This time I wanted to look at some characteristics of the subject strings themselves and see whether any of them are useful as an indicator of quality for the metadata record associated with that subject.

I took a look at the following metrics for each subject string: length, percentage integer, number of tokens, length of anagram, anagram complexity, and number of non-alphanumeric characters (punctuation).

In the tables below I present a few of the more interesting selections from the data.

Subject Length

This is calculated by stripping whitespace from the ends of each subject, and then counting the number of characters that are left in the string.
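A minimal sketch of this calculation in Python, together with the summary columns used in the tables. The function names `subject_length` and `summarize` are hypothetical, and the use of the population standard deviation is an assumption, since the post does not say which variant was used.

```python
import statistics


def subject_length(subject: str) -> int:
    """Length of a subject after stripping whitespace from both ends."""
    return len(subject.strip())


def summarize(values):
    """Summary statistics matching the table columns.

    Assumption: population standard deviation (pstdev); the original
    analysis may have used the sample standard deviation instead.
    """
    return {
        "min": min(values),
        "median": statistics.median(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "stddev": statistics.pstdev(values),
    }


if __name__ == "__main__":
    # Illustrative subject strings, not taken from the DPLA dataset.
    subjects = ["  Texas -- History  ", "Maps", "Agriculture"]
    lengths = [subject_length(s) for s in subjects]
    print(summarize(lengths))
```

Running `summarize` once per Hub over that Hub's unique subjects would give one table row per Hub.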

| Hub | Unique Subjects | Minimum Length | Median Length | Maximum Length | Average Length | stddev |
|---|---|---|---|---|---|---|
| ARTstor | 9,560 | 3 | 12.0 | 201 | 16.6 | 14.4 |
| Biodiversity_Heritage_Library | 22,004 | 3 | 10.5 | 478 | 16.4 | 10.0 |
| David_Rumsey | 123 | 3 | 18.0 | 30 | 11.3 | 5.2 |
| Digital_Commonwealth | 41,704 | 3 | 17.5 | 3490 | 19.6 | 26.7 |
| Digital_Library_of_Georgia | 132,160 | 3 | 18.5 | 169 | 27.1 | 14.1 |
| Harvard_Library | 9,257 | 3 | 17.0 | 110 | 30.2 | 12.6 |
| HathiTrust | 685,733 | 3 | 31.0 | 728 | 36.8 | 16.6 |
| Internet_Archive | 56,910 | 3 | 152.0 | 1714 | 38.1 | 48.4 |
| J._Paul_Getty_Trust | 2,777 | 4 | 65.0 | 99 | 31.6 | 15.5 |
| Kentucky_Digital_Library | 1,972 | 3 | 31.5 | 129 | 33.9 | 18.0 |
| Minnesota_Digital_Library | 24,472 | 3 | 19.5 | 199 | 17.4 | 10.2 |
| Missouri_Hub | 6,893 | 3 | 182.0 | 525 | 30.3 | 40.4 |
| Mountain_West_Digital_Library | 227,755 | 3 | 12.0 | 3148 | 27.2 | 25.1 |
| National_Archives_and_Records_Administration | 7,086 | 3 | 19.0 | 166 | 22.7 | 17.9 |
| North_Carolina_Digital_Heritage_Center | 99,258 | 3 | 9.5 | 3192 | 25.6 | 20.2 |
| Smithsonian_Institution | 348,302 | 3 | 14.0 | 182 | 24.2 | 11.9 |
| South_Carolina_Digital_Library | 23,842 | 3 | 26.5 | 1182 | 35.7 | 25.9 |
| The_New_York_Public_Library | 69,210 | 3 | 29.0 | 119 | 29.4 | 13.5 |
| The_Portal_to_Texas_History | 104,566 | 3 | 16.0 | 152 | 17.7 | 9.7 |
| United_States_Government_Printing_Office_(GPO) | 174,067 | 3 | 39.0 | 249 | 43.5 | 18.1 |
| University_of_Illinois_at_Urbana-Champaign | 6,183 | 3 | 23.0 | 141 | 23.2 | 14.3 |
| University_of_Southern_California._Libraries | 65,958 | 3 | 13.5 | 211 | 18.4 | 10.7 |
| University_of_Virginia_Library | 3,736 | 3 | 40.5 | 102 | 31.0 | 17.7 |

My takeaway from this is that three characters is just about the shortest subject that one is able to include. It is not an absolute rule, but it is the low end for this data.

The average length ranges from 11.3 characters for the David Rumsey hub to 43.5 characters for the United States Government Printing Office (GPO).

Put into a graph, the average subject length across the Hubs is a bit easier to see.

Average Subject Length

The length of a field can be helpful for finding values that are a bit outside of the norm. For example, five Hubs have maximum subject lengths of over 1,000 characters. In a quick investigation, these values appear to be abstracts and content descriptions accidentally coded as subjects.

Maximum Subject Length

The Portal to Texas History had a few subjects that came in at up to 152 characters long. It turns out that these are incorrectly formatted subject fields where a user has included a number of subjects in one field instead of separating them out into multiple fields.

Percent Integer

For this metric I stripped whitespace characters, and then divided the number of digit characters by the number of total characters in the string to come up with the percentage integer.
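The calculation described above could be sketched as follows. The function name `percent_integer` is hypothetical, and two details are assumptions: that all whitespace is removed (not just leading and trailing whitespace), and that only the ASCII digits 0-9 count as digit characters.

```python
def percent_integer(subject: str) -> float:
    """Percentage of characters in a subject that are digits.

    Assumptions: every whitespace character is removed before counting,
    and only ASCII digits 0-9 are treated as digit characters.
    """
    stripped = "".join(subject.split())  # drop all whitespace
    if not stripped:
        return 0.0  # avoid dividing by zero on empty subjects
    digits = sum(1 for ch in stripped if "0" <= ch <= "9")
    return 100.0 * digits / len(stripped)


if __name__ == "__main__":
    # Illustrative values, not taken from the DPLA dataset.
    for value in ["1900", "Texas", "World War, 1939-1945"]:
        print(value, round(percent_integer(value), 1))
```

A subject made up entirely of digits, such as a bare year, scores 100.0, which would explain the Hubs with a maximum of 100.0 in the table.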

| Hub | Unique Subjects | Maximum % Integer | Average % Integer | stddev |
|---|---|---|---|---|
| ARTstor | 9,560 | 61.5 | 1.3 | 5.2 |
| Biodiversity_Heritage_Library | 22,004 | 92.3 | 2.2 | 11.1 |
| David_Rumsey | 123 | 36.4 | 0.5 | 4.2 |
| Digital_Commonwealth | 41,704 | 66.7 | 1.6 | 6.0 |
| Digital_Library_of_Georgia | 132,160 | 87.5 | 1.7 | 6.2 |
| Harvard_Library | 9,257 | 44.4 | 4.6 | 9.0 |
| HathiTrust | 685,733 | 100.0 | 3.5 | 8.4 |
| Internet_Archive | 56,910 | 100.0 | 4.1 | 9.4 |
| J._Paul_Getty_Trust | 2,777 | 50.0 | 3.6 | 8.0 |
| Kentucky_Digital_Library | 1,972 | 63.6 | 5.7 | 9.9 |
| Minnesota_Digital_Library | 24,472 | 80.0 | 1.1 | 5.1 |
| Missouri_Hub | 6,893 | 50.0 | 2.9 | 7.5 |
| Mountain_West_Digital_Library | 227,755 | 100.0 | 1.1 | 5.5 |
| National_Archives_and_Records_Administration | 7,086 | 42.1 | 4.7 | 9.4 |
| North_Carolina_Digital_Heritage_Center | 99,258 | 100.0 | 1.5 | 5.9 |
| Smithsonian_Institution | 348,302 | 100.0 | 1.1 | 3.6 |
| South_Carolina_Digital_Library | 23,842 | 57.1 | 2.3 | 6.5 |
| The_New_York_Public_Library | 69,210 | 100.0 | 12.0 | 13.5 |
| The_Portal_to_Texas_History | 104,566 | 100.0 | 0.4 | 3.7 |
| United_States_Government_Printing_Office_(GPO) | 174,067 | 80.0 | 0.4 | 2.4 |
| University_of_Illinois_at_Urbana-Champaign | 6,183 | 50.0 | 6.1 | 10.9 |
| University_of_Southern_California._Libraries | 65,958 | 100.0 | 1.3 | 6.4 |
| University_of_Virginia_Library | 3,736 | 72.7 | 1.8 | 6.8 |

Average Percent Integer

If you group these into the Content-Hub and Service-Hub categories you can see things a little better.

It appears that the Content-Hubs on the left trend a bit higher than the Service-Hubs on the right. This probably has to do with the use of dates in subject strings, a common practice in bibliographic catalog-based metadata that isn’t always followed in metadata created for the more heterogeneous collections of content that we see in the Service-Hubs.


Tokens

For the tokens metric I replaced each punctuation character with a single space character and then used the nltk word_tokenize function to return a list of tokens. I then just took the length of that resulting list for the metric.
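Here is a rough sketch of the token count. It is not the original code: to keep the example runnable without NLTK and its tokenizer data, it swaps nltk's word_tokenize for plain whitespace splitting, which should give similar counts once punctuation has been replaced by spaces but may differ in edge cases. The name `token_count` is hypothetical, and treating only the ASCII characters in `string.punctuation` as punctuation is an assumption.

```python
import string

# Map every ASCII punctuation character to a single space.
# Assumption: non-ASCII punctuation (for example curly quotes) is left alone.
_PUNCT_TO_SPACE = str.maketrans({ch: " " for ch in string.punctuation})


def token_count(subject: str) -> int:
    """Number of whitespace-separated tokens after removing punctuation."""
    cleaned = subject.translate(_PUNCT_TO_SPACE)
    # str.split() with no arguments collapses runs of whitespace,
    # standing in here for nltk.word_tokenize.
    return len(cleaned.split())


if __name__ == "__main__":
    # Illustrative subject string, not taken from the DPLA dataset.
    print(token_count("United States--History--Civil War, 1861-1865"))
```

Dividing the stripped character length of a subject by this count would give the average token length mentioned as a possible follow-up.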

| Hub | Unique Subjects | Maximum Tokens | Average Tokens | stddev |
|---|---|---|---|---|
| ARTstor | 9,560 | 31 | 2.36 | 2.12 |
| Biodiversity_Heritage_Library | 22,004 | 66 | 2.29 | 1.46 |
| David_Rumsey | 123 | 5 | 1.63 | 0.94 |
| Digital_Commonwealth | 41,704 | 469 | 2.78 | 3.70 |
| Digital_Library_of_Georgia | 132,160 | 23 | 3.70 | 1.72 |
| Harvard_Library | 9,257 | 17 | 4.07 | 1.77 |
| HathiTrust | 685,733 | 107 | 4.75 | 2.31 |
| Internet_Archive | 56,910 | 244 | 5.06 | 6.21 |
| J._Paul_Getty_Trust | 2,777 | 15 | 4.11 | 2.14 |
| Kentucky_Digital_Library | 1,972 | 20 | 4.65 | 2.50 |
| Minnesota_Digital_Library | 24,472 | 25 | 2.66 | 1.54 |
| Missouri_Hub | 6,893 | 68 | 4.30 | 5.41 |
| Mountain_West_Digital_Library | 227,755 | 549 | 3.64 | 3.51 |
| National_Archives_and_Records_Administration | 7,086 | 26 | 3.48 | 2.93 |
| North_Carolina_Digital_Heritage_Center | 99,258 | 493 | 3.75 | 2.64 |
| Smithsonian_Institution | 348,302 | 25 | 3.29 | 1.56 |
| South_Carolina_Digital_Library | 23,842 | 180 | 4.87 | 3.45 |
| The_New_York_Public_Library | 69,210 | 20 | 4.28 | 2.14 |
| The_Portal_to_Texas_History | 104,566 | 23 | 2.69 | 1.36 |
| United_States_Government_Printing_Office_(GPO) | 174,067 | 41 | 5.31 | 2.28 |
| University_of_Illinois_at_Urbana-Champaign | 6,183 | 26 | 3.35 | 2.11 |
| University_of_Southern_California._Libraries | 65,958 | 36 | 2.66 | 1.51 |
| University_of_Virginia_Library | 3,736 | 15 | 4.62 | 2.84 |

Average number of tokens

Token counts end up tracking the overall character length of a subject very closely. If I were to do more processing, I would probably divide the length by the number of tokens to get an average word length for the tokens in the subjects. That might be interesting.


Anagram

I’ve always found anagrams of values in metadata to be interesting, sometimes helpful and sometimes completely useless. For this value I folded the case of the subject string, converted letters with diacritics to their ASCII versions, and then created an anagram of the resulting letters. I used the length of this anagram for the metric.
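One possible reading of this metric, sketched in Python. The maximum value of 26 in the table suggests that the anagram keeps each distinct letter once, so this sketch assumes the anagram is the sorted set of distinct ASCII letters in the string. That interpretation, the function name `anagram`, and the use of Unicode NFKD decomposition to strip diacritics are all assumptions rather than the original implementation.

```python
import unicodedata


def anagram(subject: str) -> str:
    """Sorted distinct ASCII letters of a subject (assumed definition).

    Steps: lowercase the string, decompose accented letters so the base
    letter is separated from its combining mark, keep only a-z, and
    return the distinct letters in alphabetical order.
    """
    decomposed = unicodedata.normalize("NFKD", subject.lower())
    letters = {ch for ch in decomposed if "a" <= ch <= "z"}
    return "".join(sorted(letters))


if __name__ == "__main__":
    # Illustrative values, not taken from the DPLA dataset.
    for value in ["Économie", "1861-1865", "Texas -- History"]:
        key = anagram(value)
        print(repr(value), repr(key), len(key))
```

Under this reading a subject containing no letters at all gets an anagram length of 0, which would match the minimum of 0 seen for many Hubs.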

| Hub | Unique Subjects | Min Anagram Length | Median Anagram Length | Max Anagram Length | Avg Anagram Length | stddev |
|---|---|---|---|---|---|---|
| ARTstor | 9,560 | 2 | 8 | 23 | 8.93 | 3.63 |
| Biodiversity_Heritage_Library | 22,004 | 0 | 7.5 | 23 | 9.33 | 3.26 |
| David_Rumsey | 123 | 3 | 12 | 13 | 7.93 | 2.28 |
| Digital_Commonwealth | 41,704 | 0 | 9 | 26 | 9.97 | 3.01 |
| Digital_Library_of_Georgia | 132,160 | 0 | 9.5 | 23 | 11.74 | 3.18 |
| Harvard_Library | 9,257 | 3 | 11 | 21 | 12.51 | 2.92 |
| HathiTrust | 685,733 | 0 | 14 | 25 | 13.56 | 2.98 |
| Internet_Archive | 56,910 | 0 | 22 | 26 | 12.41 | 3.96 |
| J._Paul_Getty_Trust | 2,777 | 3 | 19 | 21 | 13.02 | 3.60 |
| Kentucky_Digital_Library | 1,972 | 2 | 14.5 | 22 | 13.02 | 3.28 |
| Minnesota_Digital_Library | 24,472 | 0 | 12 | 22 | 9.76 | 3.00 |
| Missouri_Hub | 6,893 | 0 | 22 | 25 | 11.09 | 4.06 |
| Mountain_West_Digital_Library | 227,755 | 0 | 7 | 26 | 11.85 | 3.54 |
| National_Archives_and_Records_Administration | 7,086 | 3 | 11 | 22 | 10.01 | 3.09 |
| North_Carolina_Digital_Heritage_Center | 99,258 | 0 | 6 | 26 | 11.00 | 3.54 |
| Smithsonian_Institution | 348,302 | 0 | 8 | 23 | 11.53 | 3.42 |
| South_Carolina_Digital_Library | 23,842 | 1 | 12 | 26 | 13.08 | 3.67 |
| The_New_York_Public_Library | 69,210 | 0 | 10 | 24 | 11.45 | 3.17 |
| The_Portal_to_Texas_History | 104,566 | 0 | 10.5 | 23 | 9.78 | 2.98 |
| United_States_Government_Printing_Office_(GPO) | 174,067 | 0 | 14 | 24 | 14.56 | 2.80 |
| University_of_Illinois_at_Urbana-Champaign | 6,183 | 3 | 7 | 21 | 10.42 | 3.46 |
| University_of_Southern_California._Libraries | 65,958 | 0 | 9 | 23 | 9.81 | 3.20 |
| University_of_Virginia_Library | 3,736 | 0 | 9 | 22 | 12.76 | 4.31 |

Average anagram length

I find this interesting in that several of the Hubs (Digital Commonwealth, Internet Archive, Mountain West Digital Library, North Carolina Digital Heritage Center, and South Carolina Digital Library) have a subject instance that contains all 26 letters. That’s just neat. I didn’t look to see whether these are the same subject instances that were themselves 3000+ characters long.




Punctuation

It can be interesting to see what punctuation was used in a field, so I extracted all non-alphanumeric characters from the string, which left me with the punctuation characters. I took the number of unique punctuation characters for this metric.
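A small sketch of how this count might be computed. The function name `unique_punctuation` is hypothetical, and excluding whitespace from the non-alphanumeric characters is an assumption, since the description above does not say how spaces were handled.

```python
def unique_punctuation(subject: str) -> set:
    """Distinct characters that are neither alphanumeric nor whitespace.

    Assumption: whitespace is not counted as punctuation. Python's
    str.isalnum() is Unicode-aware, so accented letters are treated
    as letters rather than punctuation.
    """
    return {ch for ch in subject if not ch.isalnum() and not ch.isspace()}


if __name__ == "__main__":
    # Illustrative subject string, not taken from the DPLA dataset.
    found = unique_punctuation("United States--History--Civil War, 1861-1865")
    print(sorted(found), len(found))
```

Collecting these sets per Hub, rather than only their sizes, would show which punctuation characters each Hub actually uses.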

| Hub Name | Unique Subjects | min | median | max | mean | stddev |
|---|---|---|---|---|---|---|
| ARTstor | 9,560 | 0 | 0 | 8 | 0.73 | 1.22 |
| Biodiversity Heritage Library | 22,004 | 0 | 0 | 8 | 0.59 | 1.02 |
| David Rumsey | 123 | 0 | 0 | 4 | 0.18 | 0.53 |
| Digital Commonwealth | 41,704 | 0 | 1.5 | 10 | 1.21 | 1.10 |
| Digital Library of Georgia | 132,160 | 0 | 1 | 7 | 1.34 | 0.96 |
| Harvard_Library | 9,257 | 0 | 0 | 6 | 1.65 | 1.02 |
| HathiTrust | 685,733 | 0 | 1 | 9 | 1.63 | 1.16 |
| Internet_Archive | 56,910 | 0 | 2 | 11 | 1.47 | 1.75 |
| J_Paul_Getty_Trust | 2,777 | 0 | 2 | 6 | 1.58 | 0.99 |
| Kentucky_Digital_Library | 1,972 | 0 | 1.5 | 5 | 1.50 | 1.38 |
| Minnesota_Digital_Library | 24,472 | 0 | 0 | 7 | 0.42 | 0.74 |
| Missouri_Hub | 6,893 | 0 | 3 | 7 | 1.24 | 1.37 |
| Mountain_West_Digital_Library | 227,755 | 0 | 1 | 8 | 0.97 | 1.04 |
| National_Archives_and_Records_Administration | 7,086 | 0 | 3 | 7 | 1.68 | 1.61 |
| North_Carolina_Digital_Heritage_Center | 99,258 | 0 | 0.5 | 7 | 1.34 | 0.93 |
| Smithsonian_Institution | 348,302 | 0 | 2 | 7 | 0.84 | 0.96 |
| South_Carolina_Digital_Library | 23,842 | 0 | 3.5 | 8 | 1.68 | 1.41 |
| The_New_York_Public_Library | 69,210 | 0 | 1 | 7 | 1.57 | 1.12 |
| The_Portal_to_Texas_History | 104,566 | 0 | 1 | 7 | 0.84 | 0.91 |
| United_States_Government_Printing_Office_(GPO) | 174,067 | 0 | 2 | 7 | 1.38 | 0.99 |
| University_of_Illinois_at_Urbana-Champaign | 6,183 | 0 | 2 | 6 | 1.31 | 1.25 |
| University_of_Southern_California_Libraries | 65,958 | 0 | 0 | 7 | 0.75 | 1.09 |
| University_of_Virginia_Library | 3,736 | 0 | 5 | 7 | 1.67 | 1.58 |
| | 63 | 0 | 2 | 5 | 1.17 | 1.31 |

Average Punctuation Characters

Again, I don’t have much to say about this one. I do plan to take a look at which punctuation characters are being used by which hubs. I have a feeling that this could be very useful in identifying problems with mapping from one metadata world to another. For example, I know there are character patterns in the subject values of the DPLA dataset that resemble sub-field indicators from a MARC record (‡, |, and —); how many there are is something to look at.

Let me know if there are other pieces that you think might be interesting to look at related to this subject work with the DPLA metadata dataset and I’ll see what I can do.

Let me know what you think via Twitter if you have questions or comments.

Open Knowledge Foundation: Just Released: “Where Does Europe’s Money Go? A Guide to EU Budget Data Sources”

planet code4lib - Thu, 2015-07-02 11:57

The EU has committed to spending €959.988 billion between 2014 and 2020. This money is disbursed through over 80 funds and programmes that are managed by over 100 different authorities. Where does this money come from? How is it allocated? And how is it spent?

Today we are delighted to announce the release of “Where Does Europe’s Money Go? A Guide to EU Budget Data Sources”, which aims to help civil society groups, journalists and others to navigate the vast landscape of documents and datasets in order to “follow the money” in the EU. The guide also suggests steps that institutions should take in order to enable greater democratic oversight of EU public finances. It was undertaken by Open Knowledge with support from the Adessium Foundation.

As we have seen from projects like Farm Subsidy and journalistic collaborations around the EU Structural Funds it can be very difficult and time-consuming to put together all of the different pieces needed to understand flows of EU money.

Groups of journalists on these projects have spent many months requesting, scraping, cleaning and assembling data to get an overview of just a handful of the many different funds and programmes through which EU money is spent. The analysis of this data has led to many dozens of news stories, and in some cases even criminal investigations.

Better data, documentation, advocacy and journalism around EU public money is vital to addressing the “democratic deficit” in EU fiscal policy. To this end, we make the following recommendations to EU institutions and civil society organisations:

  1. Establish a single central point of reference for data and documents about EU revenue, budgeting and expenditure and ensure all the information is up to date at this domain (e.g. at a website such as At the same time, ensure all EU budget data are available from the EU open data portal as open data.
  2. Create an open dataset with key details about each EU fund, including name of the fund, heading, policy, type of management, implementing authorities, link to information on beneficiaries, link to legal basis in Eur-Lex and link to regulation in Eur-Lex.
  3. Extend the Financial Transparency System to all EU funds by integrating or federating detailed expenditure data from Member States, non-EU Members and international organisations. Data on beneficiaries should include, when relevant, a unique European identifier of company, and when the project is co-financed, the exact amount of EU funding received and the total amount of the project.
  4. Clarify and harmonise the legal framework regarding transparency rules for the beneficiaries of EU funds.
  5. Support and strengthen funding for civil society groups and journalists working on EU public finances.
  6. Conduct a more detailed assessment of beneficiary data availability for all EU funds and for all implementing authorities – e.g., through a dedicated “open data audit”.
  7. Build a stronger central base of evidence about the uses and users of EU fiscal data – including data projects, investigative journalism projects and data users in the media and civil society.

Our intention is that the material in this report will become a living resource that we can continue to expand and update. If you have any comments or suggestions, we’d love to hear from you.

If you are interested in learning more about Open Knowledge’s other initiatives around open data and financial transparency you can explore the Where Does My Money Go? project, the OpenSpending project, read our other previous guides and reports or join the Follow the Money network.

Peter Murray: Thursday Threads: New and Interesting from ALA Exhibits

planet code4lib - Thu, 2015-07-02 10:51

I’m just home from the American Library Association meeting in San Francisco, so this week’s threads are just a brief view of new and interesting things I found on the exhibit floor.

Funding for my current position at LYRASIS ran out at the end of June, so I am looking for new opportunities and challenges for my skills. Check out my resume/c.v. and please let me know of job opportunities in library technology, open source, and/or community engagement.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.


Book-Donations-Processing-as-a-Service. See something new everyday. #alaac15

— Peter Murray (@DataG) June 28, 2015

I didn’t get to talk to anyone at this booth, but I was interested in the concept. I remember donations processing being such a hassle: analyzing each book for its value, deciding whether it is part of your collection policy, determining where to sell it, managing the sale, and so forth. American Book Drive seems to offer such a service. Right now their service is limited to California. I wonder if it will expand, or if there are similar service providers in other areas of the country.

Free Driver’s Ed Resources for Libraries

Free driver's ed resources for librs. Group has a great story. – Another first at #alaac15

— Peter Murray (@DataG) June 28, 2015

This exhibitor had a good origin story. A family coming to the U.S. had a difficult time getting their driver’s licenses, so they created an online resource for all 50 states that covers the details. They’ve had success with the business side of their service, so they decided to give it away to libraries for free.

Free Online Obituaries Service from Orange County Library

Orange County Public Library offering free obituary service and publicizing through libraries. Via @us_imls

— Peter Murray (@DataG) June 28, 2015

With newspapers charging more for printing obituaries, important community details are no longer being printed. The Epoch Project from the Orange County (FL) Library System provides a simple service with text and media to capture this cultural heritage information. Funded initially by an IMLS grant [PDF], they are now in the process of rounding up partners in each state to be ambassadors to bring the service to other libraries around the country.

Open Library Data Additions: HathiTrust Metadata

planet code4lib - Thu, 2015-07-02 06:46

Metadata records from

This item belongs to: data/ol_data.

This item has files of the following types: Data, Data, Metadata

HangingTogether: The Evolving Scholarly Record Workshop — the San Francisco edition and the series wrap-up

planet code4lib - Wed, 2015-07-01 21:19

The report, The Evolving Scholarly Record, introduced a framework for discussing the changes in the scholarly record and in the roles of stakeholders.

Over the past year, OCLC has conducted a series of workshops to socialize the framework. You can read about the first three Evolving Scholarly Record workshops: the Amsterdam workshop, the DC workshop, and the Chicago workshop.

For the fourth and final workshop in the series, we wanted to be more cumulative so we took a different tack from the first three workshops.  Instead of having guest speakers in the morning and small group breakout discussions in the afternoon, presentations by OCLC staff set the context for plenary discussions.  I reviewed the ESR framework and recapped the 3 previous workshops, Constance Malpas previewed the report, Stewardship of the Evolving Scholarly Record: From the Invisible Hand to Conscious Coordination, and Jim Michalko talked about boundaries and internalizing and externalizing roles in managing scholarly outputs.  Slides and videos of these presentations are available.

In the previous workshops, breakout discussions had focused around these four topics:  Selection, Support for the Researcher, Collaboration within the University, and Collaboration with External Entities.  Here are some of the takeaways from those discussions. (There were also discussions under the broad topic of technology, but those have been integrated with the other topics.)


Selection

  • Establish priorities: for example, institutional (local) materials, at-risk materials, materials most valued by researchers in specific disciplines.
  • Establish limits: what doesn’t need to be saved? What can be de-selected?
  • Establish clear selection criteria, especially for non-traditional scholarly outputs: for example, blogs, web sites.
  • Accept adequate content sampling.
  • Be aware of system-wide context: how do local selection decisions complement/duplicate stewardship activities elsewhere? Which local collections are considered “collections of record” by the broader scholarly community?

Support for Researchers

  • Offer expertise with reliable external repositories to help researchers make good choices in use of disciplinary repositories.  Provide a local option for disciplines lacking good external choices.
  • Use the dissertation as the first opportunity to establish a relationship with a researcher.  Mint an ORCID and/or ISNI and provide DOIs.  Offer profiling, bibliography, and resume services that save researchers time.  Find ways to ensure portability of research outputs throughout a researcher’s career.
  • Determine how to link various research materials to a project and define for each project what an object is and how to link related bits to the object.
  • Become an integral part of the grant proposal process to ensure that materials flow to the right places instead of needing to be rescued after the fact.
  • Agree on and be explicit about service levels and end-of-life provisions.

Collaboration within the University

  • Use service offerings to re-position the library in the campus community.  Decide where the library will focus; it can’t be expert in all things.
  • Make alliances on campus so you can integrate library services into the campus infrastructure. Help other parts of the university negotiate licensing of data from vendors.
  • Use policy and financial drivers (mandates, ROI expectations, reputation and assessment) to motivate a variety of institutional stakeholders.
  • Create statements of organizational responsibility about selection, services, terms, and which parts of the university will do what.
  • Coordinate to optimize expertise, minimize duplication, rebalance resources, and contain costs.

Collaboration with External Entities

  • Identify the things that can be done elsewhere and those that need to be done locally. Figure out what kinds of relationships are needed with external repositories.
  • Help researchers negotiate on IP rights, terms of use, privacy, and so forth.
  • Determine which external repositories are committed to preservation and which will collect the related materials from processes and aftermaths.  Rely on external services like JSTOR, arXiv, SSRN, and ICPSR, which are dependable delivery and access systems with sustainable business models.
  • Learn how to interoperate with systems such as SHARE. Employ persistent object identifiers and multiple researcher name identifiers to interoperate with other systems.
  • Consider centers of excellence; host one and rely on others.

It is clear that no single institution can hope to gather and manage all of—or even a significant share of—the scholarly record.  This is the starting point for the new report, Stewardship of the evolving scholarly record: From the invisible hand to conscious coordination.

In the fourth workshop we had discussions with all attendees present.  Having started with what came out of the previous workshops, it was easier for them to stretch a little bit beyond that.  Highlights from the plenary discussions are:

Things that institutions should consider doing:

  • Establish when research outputs should be archived locally.  In many cases a citation with a pointer to outputs archived elsewhere will be satisfactory.
  • Decide which materials merit application of preservation protocols.
  • Embed data capture requirements in the researchers’ workflow and see that metadata is created early in the flow.
  • Partner with the sponsored projects office to communicate about data lifecycle.
  • Explore with the Office of Academic Affairs if there are opportunities to work together on collecting assets for promotion and tenure.
  • Do ongoing analysis on Data Management Plans to provide fundamental planning data.
  • Use the library’s space and its “power to convene” to foster critical cross-campus conversations.
  • Develop practices for library assignment and management of ORCID, ISNI, DOI…  Identifiers are crucial.
  • When other units are licensing services (such as those from Elsevier), help with the negotiations and help to ensure that the various campus systems will interoperate.
  • Establish a relationship with HathiTrust and others who can share the stewardship workload.
  • Think about what else you will archive, beyond your institutional output.

Things to consider as a community

  • Assemble case studies of successful faculty engagement.
  • Decipher and interpret impact calculations in different systems.
  • Develop models for above-campus infrastructure, with shared investment and governance.  For instance, instead of allowing commercial providers to mine and share our data, develop Open Source tools and retain our data and mine it ourselves.
  • Identify a way to coordinate selection decisions with those of other institutions.
  • Develop shared goals and criteria to influence vendors to improve tools:  Aggregate information about researcher workflow preferences and what the potential is for their tools interoperating with other systems.  Prioritize vendor metadata interoperability requirements for selected tools to allow machine-readable acquisition.
  • Assess the reliability of external repositories.
  • Develop best practices for agreement language, such as preservation commitments with repositories and exit plans with vendors.

The four workshops gave us a chance to not just socialize the framework, but to really hear about the concerns of libraries and other stakeholders, learn what is being done, and begin to think about what lies ahead.  In the near future, we’ll be synthesizing all this and considering next steps.

About Ricky Erway

Ricky Erway, Senior Program Officer at OCLC Research, works with staff from the OCLC Research Library Partnership on projects ranging from managing born digital archives to research data curation.

Mail | Web | Twitter | LinkedIn | More Posts (39)

FOSS4Lib Recent Releases: Collective Access - 1.5

planet code4lib - Wed, 2015-07-01 20:05
Package: Collective Access
Release Date: Thursday, June 11, 2015

Last updated July 1, 2015. Created by David Nind on July 1, 2015.

An exciting new version of CollectiveAccess is now available, with a number of new features and improvements!

  • Improved PDF reports
  • Additional external data sources such as Wikipedia and WorldCat
  • Media Annotation tools
  • Improved search functionality
  • Support for complex "interstitial" relationships
  • Collection management tools such as support for deaccessions and location tracking
  • "Check in/ Check out" library circulation module
  • Improvements to the data importer

...and more!

LITA: Jobs in Information Technology: July 1, 2015

planet code4lib - Wed, 2015-07-01 17:44

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Digital Systems, Training, and Support Coordinator, University of Arkansas at Little Rock, Little Rock, AR

Systems and Digital Services Librarian, University of Arkansas at Little Rock, Little Rock, AR

Digital Library Data Curation Developer, University of Notre Dame, Notre Dame, IN

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

In the Library, With the Lead Pipe: Revising Academic Library Governance Handbooks

planet code4lib - Wed, 2015-07-01 13:00

Original Image by Flickr user Sasquatch 1 (CC BY 2.0), with minimal modification by C. Strunk (10 June 2015).

In Brief

Regardless of our status (tenure track, non-tenure track, staff, and/or union), academic librarians at colleges and universities may use a handbook or similar document as a framework for self-governance. These handbooks typically cover rank descriptions, promotion requirements, and grievance rights, among other topics. Unlike employee handbooks used in the corporate world, these documents may be written and maintained by academic librarians themselves [1]. In 2010, a group of academic librarians at George Mason University was charged with revising our Librarians’ Handbook. Given the dearth of literature about academic librarians’ handbooks and their revision, we anticipate our library colleagues in similar situations will benefit from our experience and recommendations.


Background and Context

There are three handbooks at George Mason University (Mason) governing individuals in various faculty positions: the Administrative/Professional Faculty Handbook, the Librarians’ Handbook, and the Mason Faculty Handbook, which covers instructional faculty. [2] Librarians at Mason, a young institution founded in 1957, are classified as professional faculty, a non-tenured faculty classification. As such, librarians are governed by the University’s Administrative/Professional Faculty Handbook (A/P Handbook), as well as by the Librarians’ Handbook (Handbook), which became an appendix of the former in 2000. The Handbook contains provisions that apply only to professional faculty librarians. Although the history of the Handbook is not well documented, its precursor was an evaluation and promotion document that was used by library administration as early as the 1970s.

Librarians who hold a professional faculty position at Mason (~45) are members of the Librarians’ Council (Council), which plays a significant governance role by defining the standards for librarian rank, contract renewal, and promotion in the Librarians’ Handbook. Our Handbook differs significantly from the A/P Handbook because the Librarians’ Handbook includes a statement on academic freedom, a professional review process for contract renewal and promotion, professional ranks, and some aspects of the grievance and appeal processes. Consequently, our Handbook is more analogous to the Mason Faculty Handbook.

In late 2009, the Council voted to review and, as needed, revise the Handbook, especially sections related to professional peer review and librarian ranks. We began in the summer of 2010 with the appointment of an ad hoc handbook review committee that was selected by Council officers and approved by the library’s senior administrators. The A/P Handbook was under review at this same time, so revising the Librarians’ Handbook concurrently made sense. Our colleague representing the library in the A/P Handbook group was also appointed to chair the Council’s ad hoc committee, and her dual role proved to be most advantageous to our revision process.

Literature Review

Although there is a substantial body of literature regarding employee handbooks as a whole (most of it in Business and Human Resources), relatively little has been published on creating and revising faculty handbooks, let alone librarian faculty handbooks. Articles that do include library faculty tend to do so in a cursory fashion. One example is a 1985 Chronicle of Higher Education article “Writing a Faculty Handbook: a 2-Year, Nose-to-the-Grindstone Process” that provides a brief description of the recommended process from a legal standpoint and an outline for how to structure the resulting document. This outline includes librarians, along with other “Special Academic Staff and Categories,” but only as a line item (Writing 1985, 29).

One of the few articles describing the actual process of writing and revising faculty handbooks, James L. Pence’s “Adapting Faculty Personnel Policies,” focuses solely on instructional faculty (Pence 1990). A more detailed faculty handbook outline that addresses material applicable to librarians is provided in Drafting and Revising Employment Policies and Handbooks. 2002 Cumulative Supplement (Decker et al. 2002, 456-511), but it is more of a prescriptive example from the standpoint of human resources law than a “how to” or case study.

Although it is unclear why library faculty are not more fully included in such articles, one reason may be the lack of conformity regarding librarian status in higher education institutions. As surveys such as Mary K. Bolin’s “A Typology of Librarian Status at Land Grant Universities” indicate, librarian status varies widely (Bolin 2008). For the purpose of her survey, Bolin grouped librarian statuses into “Professorial,” “Other ranks with tenure,” “Other ranks without tenure,” and “Non-faculty (Staff)” (Bolin 2008, 223). These statuses span a continuum, with “Professorial” being closest to instructional faculty who have tenure and research requirements and “Non-Faculty (Staff)” being the furthest. More variation may exist within those statuses; for instance, at some institutions, “Non-Faculty (Staff)” librarians are represented in Faculty Senate, while at others, they are not (Bolin 2008, 224).3

We are not aware of any research that reports how many academic librarians are covered by broader faculty handbooks. Given the wide disparity of librarian status, and the fact that librarians may or may not be part of their institution’s larger faculty handbook, it isn’t surprising that librarian handbooks have not received a lot of attention in the literature.


Our review began in earnest in September 2010. We met weekly, which made it easier to maintain discussion continuity from one meeting to the next, and began with deciding on an approach and a tentative timeline. Our initial deadline was April 2011, which mirrored the working deadline for the A/P Handbook. Because both handbooks had to be approved by Mason’s Board of Visitors, we believed it would be advantageous to submit ours as part of the A/P Handbook.

To learn more about the history of pertinent sections, we talked to our library colleagues who worked on previous versions of the Handbook. We reviewed the Association of College and Research Libraries (ACRL) Standards for Faculty Status for Academic Librarians (2007) as well as librarian governance documents from other colleges and universities in Virginia.4 These document reviews confirmed that our handbook was already aligned with the published ACRL standards, provided insight into how governance was handled at other institutions, and gave us ideas to consider for our own handbook. For instance, we considered aligning the professional peer review process with the annual administrative review process and adjusting the professional review calendar.

We met with representatives from the University’s Human Resources Department, the Provost’s Office, and the Office of University Counsel several times to ask questions and learn more about the legal and administrative issues and policies affecting our Handbook. Information shared during these meetings indicated the roles of different faculty handbooks at Mason and how ours fits into the broader institutional picture.

Each committee member volunteered to revise specific sections based on their interest and experience. We reviewed sections as they were revised, rather than in any specific order. Several sections (e.g., Introduction, Professional Development) were revised quickly, whereas others involved deeper discussion. For example, we thought it was critical to discuss the sections on Librarian ranks and professional review together because they were so closely related. The complexity and sensitivity of this subject matter sparked discussions that spanned multiple meetings and content iterations.

Section discussions were often quite detailed, covering all possible aspects–from the overall intent and purpose of the content to the specific definitions of words and phrases. Decisions about the level of textual vagueness or detail desired had to be made. Proposed revisions were considered, modified, discussed, and modified again. We spent a lot of time on word choice to make the document more cohesive and minimize ambiguity. To enhance the Handbook’s professional appearance, we standardized punctuation, capitalization, and format, and removed references to specific web sites.

Throughout our work, we needed to share working documents easily with one another (there were seven committee members), which we accomplished using Dropbox. This practice alleviated some problems with version control, but edits were occasionally made to multiple versions of the same document that later needed to be reconciled. The “track changes” functionality within Microsoft Word was also critical to share changes we had made and add comments and questions. As the work progressed, the committee Chair compiled the revised sections into a final draft for review and created a list of major revisions to share with Council members and reviewers.

Feedback from our colleagues was critical, and we gathered it using online polls and surveys, and town hall meetings. We frequently presented reports at Council meetings to inform the larger body of our progress, receive verbal feedback, and address any questions or concerns.

Both the University Librarian and Vice President/CIO for Information Technology (CIO), the Libraries’ most senior administrators at that time5, were required to review and approve our revised Handbook prior to submission of the final draft to HR for integration with the A/P Handbook. To expedite this process, we gave the University Librarian revised sections as they were completed for his review and comments, and we met with him on several occasions to discuss his questions and concerns.

The revision schedule changed during the process, largely because it took us longer to revise some sections than we originally anticipated. Other delays occurred after we wisely decided to mirror the A/P Handbook revision schedule, which lagged behind ours. For example, protracted discussions of the grievance and termination sections of the A/P Handbook led to delayed revision of those same sections in the Librarians’ Handbook. We wanted to ensure that whatever modifications the A/P Handbook Committee made would not conflict with the rights conferred on librarians in our existing Handbook (e.g., grieving salary or filing a grievance as a group). We also chose to defer to the A/P Handbook for parts of the grievance and termination sections that were duplicative, which streamlined our document. A more flexible timeline meant our revision process took longer than it might have otherwise, but our revised text did not conflict with the revised A/P Handbook. As a result, it enabled submission of a combined, single document from HR to the Board of Visitors at one time rather than in pieces.

We finished the Handbook revision in July 2011 and sent electronic copies to the CIO and the University Librarian for review and comment. The University Librarian provided his feedback in late November and our final revision was completed in February 2012, after which it was sent to Human Resources for integration into the newly revised A/P Handbook. Subsequently, the combined document was submitted to Mason’s Board of Visitors, who approved it on March 21, 2012.

Table 1: Table of Contents. George Mason University Librarians’ Handbook (George Mason University 2012b).


Professional Peer Review for Librarians

Because professional review is the most important aspect of self-governance defined in our Handbook and detailed in our Council’s Bylaws, a brief description of this process is in order.6 The Council’s Professional Review Committee (PRC), a standing committee, consists of seven elected members who serve staggered two-year terms; a librarian is eligible to serve on the committee after having gone through this peer review process at least once. The University Librarian, in consultation with the PRC Chair, designates subcommittees of three reviewers for each reappointment or promotion review. Librarians are permitted to request that a subcommittee member be recused if a potential conflict of interest exists.7

Based on a librarian’s hire date (see the Professional Review Calendar section below), their review begins with submission of an annotated curriculum vitae (CV) or a “dossier” to the PRC. The dossier is, in fact, a notebook containing the librarian’s CV and detailed documentation of all accomplishments (e.g., publications, presentations, awards, grants, offices held, etc.) achieved during a specific period of time. Mason librarians report progress in three areas: 1) professional competence; 2) scholarship and professional service; and, 3) service to the university and the community. However, only information related to scholarship and service is included in the dossier; content related to professional competence, as well as the required supervisor’s evaluation letter, is neither reviewed by nor available to the PRC subcommittee.

Points to Consider

During the revision process, we identified several major issues, and we believe readers will benefit from learning how we handled, or did not handle, them. They are grouped by issues related to Handbook content and those related to our revision process (see below).

Professional Review Calendar

A critical situation the committee wanted to rectify was the inequity in time newly hired librarians were allowed before their initial professional peer review. All librarians, regardless of rank, receive an initial two-year contract.8 Librarians hired before March in a calendar year must submit their CV or dossier for review the first January after they are hired. However, librarians hired later in a year may have up to twice as much time before their initial review (Table 2). Several adjustments to the calendar were considered, but ultimately, we could only incorporate minor changes due to limits imposed by the Provost’s and University Librarian’s schedules, which, in turn, are dictated by the University’s fiscal year.

*For a subsequent promotion, the dossier should cover all professional activities since the last promotion. ** Contract Term = Rank + 1 year

Table 2. Documentation Requirements by Librarian Rank and Review Type (George Mason University 2012b).


Librarian Ranks

Handbook content related to librarian rank required a lot of attention, with one example being the basic definition of a librarian. The Handbook defines a Mason librarian as a library employee with a professional faculty appointment and an ALA recognized degree9; this definition also confers Council membership. Table 3 details the basic criteria required for each librarian rank.

Table 3. Mason librarian rank criteria (George Mason University 2012b).


Recently, however, professional faculty positions formerly held by librarians have been filled by individuals without an MLS, thus disqualifying those individuals from becoming Council members. Likewise, individuals who hold an MLS or similar degree and are hired in classified staff positions are not eligible for Council membership. The revision committee and Council discussed retiring the library degree requirement, but ultimately did not change the definition primarily because these colleagues would not be subject to professional peer review. If the Council membership definition were changed, three repercussions may take place:

  1. the Librarians’ Council would no longer be a “Librarians’” Council;
  2. non-MLS professional faculty would be subject to peer review or there would be two systems of review; and/or
  3. the Librarians’ Council would be dissolved and all rights conferred by the Librarians’ Handbook terminated.

None of these possible repercussions appealed to Council members at the time.

We also discussed whether to require Librarian 1s to apply for promotion to the Librarian 2 rank as part of their initial reappointment. This idea was dismissed because we were unable to make the desired changes to the professional review calendar. Under the existing calendar, Librarian 1s with no previous experience going up for initial professional peer review might have as little as 18 months of experience in an academic library. This is insufficient experience for advancement to the rank of Librarian 2, which at Mason requires a minimum of three years.

External Reviewers

Composition of the Professional Review Committee (PRC) subcommittee for individuals seeking promotion to Librarian 4 concerned the Handbook committee. Neither the Handbook nor Council Bylaws require reviewers to be at or above the level of the librarian under review or seeking promotion, even though making this a requirement would avert potential personnel problems (e.g., a negative review may be more easily challenged). Because no Mason librarians held the Librarian 4 rank when we revised the Handbook (and there still are none), we wanted to ensure reviews for promotion to this rank were conducted by reviewers who could draw from the maximum years of experience possible despite holding the rank of, at least, Librarian 3.

One solution we considered was to invite an external reviewer to participate in a PRC promotion subcommittee. This reviewer would be selected from either the Mason community (non-library), or from another institution. Although external reviewers typically participate in instructional faculty promotion and tenure reviews, we decided this option would not work for us and sought another solution. Eventually, we concluded that all reviewers for a Librarian 4 promotion should hold at least the rank of Librarian 3. Because PRC members are elected on staggered terms, however, there is no way to predict how many Librarian 3s may be serving on the PRC in a given year, nor do we know who has decided to seek promotion until a month before the reviews begin.10

Consequently, to increase the pool of reviewers needed for a given year, we revised the Handbook to allow eligible library faculty not currently on the PRC to be appointed as reviewers without holding an election. This change, which was incorporated into our Council’s Bylaws, ensured that all PRC subcommittee members who review a Librarian 4 promotion bring the experience of at least a Librarian 3 to the process. Furthermore, in years when there are a large number of reviews (15-20) to be conducted, the PRC now has the ability to appoint additional reviewers when needed.

Dossier Requirement for Review

Formerly, librarians being reviewed for each contract renewal and/or promotion submitted dossiers (i.e., often lengthy notebooks) to the PRC that documented their publications, presentations, service, and professional development activities. After much discussion, we proposed that librarians undergoing a second or later reappointment have the option to submit an annotated CV in lieu of a full dossier. Dossiers would continue to be required from Librarian 2s and higher undergoing their first contract renewal and/or all librarians applying for promotion in rank. We thought this option was logical because annual evaluations are required of each librarian anyway, so an annotated CV would suffice for the purpose of contract renewal. Because an annotated CV is a synopsis of one’s professional activities, it requires less work and documentation than a dossier. The University Librarian approved this option.

Council Approval

Neither the Council’s Bylaws nor the Handbook requires members to vote on a Handbook revision. We had to decide whether it was important to seek Council approval of the revised document. After much discussion, we chose to ask for a vote of endorsement before sending a complete draft revision to the University Librarian for his formal review. Council approved the draft by a substantial majority.

Follow Up on Implementation of Revisions

Once our ad hoc committee met its original charge of producing a revised and approved handbook, we were disbanded. We did not develop a plan to implement changes to the Handbook, and neither did the Librarians’ Council. As a result, three years after approval, much work remains to be done. Changes to the Handbook required revision of the Council’s Bylaws and procedural changes in the professional peer review process. The Bylaws were revised, but the Professional Review Committee has been slow to incorporate all the procedural changes and decisions described in the new Handbook into the PRC documents that are used to manage and guide the process. This has resulted in continuing confusion with the professional review process, ironically the primary reason we opened the Handbook for revision.

Recommendations for Revising Your Handbook

When we began this project, it seemed overwhelming. Early on, we discussed our revision strategy and made decisions about how to allocate and accomplish the work. Our plans changed over time, of course, and new approaches were proposed and adopted. Re-examination and adjustment of our workflow throughout the project contributed greatly to our success.

Library consolidation, changes in librarian status, and other factors are affecting even long-established academic libraries, public and private. These changes likely require modifications to documents governing librarians at these institutions. Despite our institution’s relative youth, we offer the following recommendations to other librarians embarking on a handbook or similar governing document revision.

Table 4. Recommended actions and resulting benefits when conducting a handbook revision.


Like most undertakings of this magnitude and importance, the Handbook revision project was extremely time-consuming and, at times, frustrating. Nevertheless, we successfully balanced our Council’s needs within the University’s framework. We were intent on working with our colleagues to create a more professional document that is applicable and fair to today’s members. Most importantly, we gained an intimate familiarity with our handbook—a responsibility all academic librarians with a similar governance structure should work toward. Even when librarians who have a handbook or similar governance documents never have the opportunity or need to revise their handbook, we believe that it is vital to be knowledgeable about its content and ready to advocate for and promote the rights it confers to their colleagues and administrations.


The authors would like to thank our peer review editor, Vicki Sipe, Catalog Librarian at the University of Maryland, Baltimore County and our editors at In the Library with the Lead Pipe, Ellie Collier and Annie Pho.


Association of College and Research Libraries. Association of College and Research Libraries Standards for Faculty Status for Academic Librarians. 2007. Accessed June 5, 2015.

Bolin, Mary K. “A Typology of Librarian Status at Land Grant Universities.” The Journal of Academic Librarianship. 2008. v. 34, issue 3, pp. 220-230. doi:10.1016/j.acalib.2008.03.005

Decker, Kurt H. Drafting and Revising Employment Policies and Handbooks, 2nd ed., 2 volumes. New York, NY: Wiley, 1994.

Decker, Kurt H., Louis R. Lessig, and Kermit M. Burley. Drafting and Revising Employment Policies and Handbooks. 2002 Cumulative Supplement. New York, NY: Panel Publishers, 2002.

George Mason University. Administrative/Professional Faculty Handbook, (2012a). Accessed December 22, 2014.

George Mason University. “Librarians’ Handbook,” in Administrative/Professional Faculty Handbook, Appendix C (2012b): 18-32. Accessed December 22, 2014.

George Mason University. Faculty Handbook, 2014. Accessed December 22, 2014.

Pence, James L. “Adapting Faculty Personnel Policies.” New Directions for Higher Education. Fall 1990. 59-68. doi:10.1002/he.36919907108

“Writing a Faculty Handbook: a 2-Year, Nose-to-the-Grindstone Process.” Chronicle of Higher Education. October 2, 1985. v. 31, issue 5. p. 28




  1. Some librarians are governed by documents developed by Human Resources, faculty unions, or content within a Faculty Handbook. There is a dearth of available information regarding handbooks for academic librarians. See Bolin 2008.
  2. For the purposes of this article, we define “instructional faculty” as faculty in the more commonly accepted traditional sense (i.e., professors of English or Chemistry) as well as non-teaching research faculty, and term and adjunct faculty, who are also included in this faculty handbook.
  3. Professional library faculty at George Mason do not have elected representation in the Faculty Senate.
  4. In addition to the George Mason University Faculty Handbook, we read faculty handbooks from the following Virginia colleges and universities: James Madison University, Radford University, University of Mary Washington, University of Virginia, Virginia Commonwealth University, and Virginia Tech. Since our review, all of these handbooks have been revised except for those of Radford University and the University of Mary Washington.
  5. The University Libraries is now directly under the purview of the Provost.
  6. The professional peer review process is distinct from Mason’s professional faculty annual performance review, which takes place between a librarian and his/her supervisor and is required by the A/P Handbook.
  7. The standard procedure is that librarians from the same department cannot review one another. This can cause problems for departments with large numbers of librarians.
  8. Mason librarians hold multi-year contracts, the duration of which is determined by an individual’s rank.
  9. According to the A/P Handbook, “Typical professional faculty positions are librarians, counselors, coaches, physicians, lawyers, engineers and architects…[that] require the incumbent to regularly exercise professional discretion and judgment and to produce work that is intellectual and varied and is not standardized” (George Mason University 2012a, 4).
  10. In 2014-2015, the PRC included three Librarian 2s and four Librarian 3s, while the entire Council comprised 2 Librarian 1s, 21 Librarian 2s, 18 Librarian 3s, and no Librarian 4s.

LITA: An Interview With Emerging Leader Isabel Gonzalez-Smith

planet code4lib - Wed, 2015-07-01 13:00

Tell us about your library job.  What do you love most about it?

I am an Undergraduate Experience Librarian at the University of Illinois at Chicago’s Richard J. Daley Library where I focus on how the library can support the academic success of our undergraduates. It’s hard to pick a single thing I love about my job because it is really personal to me. As an alumna, serving UIC undergrads is like stepping back into my own undergraduate experience and constantly thinking about ways I can improve that of our current students. Collaboration is key to many of our library efforts and my current role at UIC Library allows me to meet campus partners with the same mission. It doesn’t hurt that I work with an inspiring team of librarians that constantly push me to be the best professional I can be.

Where do you see yourself going from here?

My greatest motivator is improving the experience of the communities we serve as librarians. It might be nerdy but I geek out about data-driven decision making, the iterative process of refinement, and holistic problem solving when it comes to both virtual and physical services. I’m hoping my next career move is in user experience and assessment.

Why did you apply to be an Emerging Leader? What are your big takeaways from the ALA-level activities so far?

It’s funny – I applied to the program several years ago when a previous EL and friend of mine encouraged me to but I wasn’t accepted. I remember feeling really bummed about it! Years later, I had other friends who became Emerging Leaders bring it up and motivate me to try again. I’m so glad I did! I have found the Emerging Leaders and ALA community very welcoming – people want to see you succeed. Being an Emerging Leader means having the tools and the encouragement to engage more directly with ALA – developing a true appreciation and understanding that it is YOUR organization.

What have you learned about LITA governance and activities so far?

LITA is such an awesome division. I am very grateful I was selected as the LITA sponsored Emerging Leader because it has allowed me to get to know the members who make LITA happen. Members work so hard for each other and they’re truly an innovative bunch. I had no idea how many groups of people worked towards different initiatives in committees, task forces, interest groups and I’m still learning about each of them. Governance takes a lot of people and it is much clearer to me now that I have been more involved.

What was your favorite LITA moment? What would you like to do next in the organization?

Hands down – working with the search committee in selecting LITA’s next Executive Director. Special thanks to the LITA Board for inviting me to have a voice on the committee. It speaks volumes that LITA Board members embraced an early career librarian and allowed me the opportunity to have a say in LITA’s future. Very exciting moment!

Open Knowledge Foundation: UK Crime Data: Feeling is Believing

planet code4lib - Wed, 2015-07-01 10:01

Latest crime data shows that the UK is getting significantly more ‘peaceful’. Last month, the Institute for Economics and Peace published the UK Peace Index, revealing UK crime figures have fallen the most of all EU countries in the past decade. Homicide rates, to take one indicator, have halved over the last decade.

Crime Scene by Alan Cleaver, Flickr, CC-BY

But the British public still feels that crime levels are rising. How can opening up crime data play a part in convincing us we are less likely to experience crime than ever before?

The ‘Perception Gap’

The discrepancy between crime data and perceptions of the likelihood of crime is particularly marked in the UK. Although it has been found that a majority of the public broadly trust official statistics, the figures are markedly lower for those relating to crime. In one study, 85% of people agreed that the Census accurately reflects changes in the UK, but only 63% said the same of crime statistics.

Credibility of Police Data

Police forces have been publishing crime statistics in the UK since 2008, using their own web-based crime mapping tools or via the national crime mapping facility. This has purportedly been for the purpose of improving engagement with local communities alongside other policy objectives, such as promoting transparency. But allegations of ‘figure fiddling’ on the part of the police have undermined the data’s credibility and in 2014, the UK Statistics Authority withdrew its gold-standard status from police figures, pointing to ‘accumulating evidence’ of unreliability.

The UK’s open data site for crime figures allows users to download street-level crime and outcome data in CSV format and explore the API containing detailed crime data and information about individual police forces and neighbourhood teams. It also provides Custom CSV download and JSON API helper interfaces so you can more easily access subsets of the data.

But the credibility of the data has been called into question. Just recently, data relating to stop-search incidents for children aged under-12 was proved ‘inaccurate’. The site itself details many issues which call the accuracy of the data into question: inconsistent geocoding policies in police forces; “Six police forces we suspect may be double-reporting certain types of incidents”; ‘siloed systems’ within police records; and differing IT systems from regional force to force.

In summary, we cannot be sure the ‘data provided is fully accurate or consistent.’

The Role the Media Plays: If it Bleeds, it Leads

In response to persistent and widespread public disbelief, the policies of successive UK governments on crime have toughened: much tougher sentencing, more people in prison, more police on the streets. When the British public were asked why they think there is more crime now than in the past, more than half (57%) stated that it was because of what they see on television and almost half (48%) said it was because of what they read in newspapers [Ipsos MORI poll on Closing the Gaps]. One tabloid newspaper exclaimed just recently: “Rape still at record levels and violent crime rises” and “Crime shows biggest rise for a decade”. As the adage goes, If it Bleeds, it Leads.

Crime Data and Mistrust of the Police

Those engaged in making crime figures meaningful to the public face unique challenges. When Stephen Lawrence was murdered in 1993, and the following public inquiry found institutional racism to be at the heart of the Met police, public trust towards the police was shattered. Since then, the police have claimed to have rid their ranks of racism entirely.

Police by Luis Jou García, Flickr, CC BY-NC 2.0

But many remain less than convinced. According to official statistics, in 1999-2000, a black person was five times more likely than a white person to be stopped by police. A decade later, they were seven times more likely. One criminologist commented: “Claims that the Lawrence inquiry’s finding of institutional racism no longer apply have a hollow ring when we look at the evidence on police stops.” [Michael Shiner reported in the Guardian].

Equally, the police distrust the public too. The murder of two young, female police officers in Manchester in 2012 ignited the long-rumbling debate over whether the police should be armed. So the divide between the police and the public is a serious one.

A Different Tack?

In 2011, a review was undertaken by the UK Statistics Authority into Crime Data. Its recommendations included:

  • Improving the presentation of crime statistics to make them more authoritative
  • Reviewing the availability of local crime and criminal justice data on government websites to identify opportunities for consolidation
  • Sharing of best practice and improvements in metadata and providing reassurance on the quality of police crime records.

It’s clear that the UK police recognise the importance of improving their publication of data. But it seems that opening data alone won’t fix the shattered trust between the public and the police, even if the proof that Britons are safer than ever before is there in transparent, easily navigable data. We need to go further back in the chain of provenance and, for instance, scrutinise the reporting methods of the police.

But this is about forgiveness too, and the British public might just not be ready for that yet.

Terry Reese: MarcEdit 6 Update

planet code4lib - Wed, 2015-07-01 03:45

Changes in this update:

6.1.21

  • Bug Fix: Conditional Delete - When selecting regular expressions, there were times when the process wasn't being recognized.
  • Enhancement: Conditional Delete - This function used to work only when using the Regular Expression option. It now works for all options.
  • Bug Fix: ValidateISBNs - Process would only process the first subfield. If the subfield to be processed wasn't the first one, it wouldn't be validated.
  • Enhancement: ValidateISSN - Uses a mathematical formula to validate ISSNs.
  • Bug Fix: Generate Fast Headings (Stand alone tool) - LDR fields could be deleted.
  • Enhancement: Working to make the global edit functions a little more fault tolerant around record formatting.
  • Enhancement: Generate MARC record from URL - Program generates MARC records from webpages. If you pass it an LC URL, it will generate data from the MARCXML.

At this point, only the Windows and Linux downloads were updated. I'll be replacing the Mac download with the first version of the native OSX build the July 4th weekend.

You can get the updates either via the automated update tool or from the website at:

--tr
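For readers curious what a formula-based ISSN check might look like, here is a minimal Python sketch of the standard ISSN check-digit calculation (the first seven digits are weighted 8 down to 2, and the check character comes from the sum modulo 11). It is not MarcEdit's code; the function names `issn_check_digit` and `is_valid_issn` are made up for illustration, and the accepted input format (optional hyphen, upper- or lower-case X) is an assumption.

```python
import re

# Assumed input format: eight characters, optional hyphen after the fourth,
# last character a digit or X (hypothetical choice, not MarcEdit's).
ISSN_PATTERN = re.compile(r"^(\d{4})-?(\d{3})([\dXx])$")


def issn_check_digit(first_seven: str) -> str:
    """Return the check character for the first seven ISSN digits."""
    # Weights run 8, 7, 6, 5, 4, 3, 2 across the seven digits.
    total = sum(int(d) * w for d, w in zip(first_seven, range(8, 1, -1)))
    value = (11 - total % 11) % 11
    # A computed value of 10 is written as the letter X.
    return "X" if value == 10 else str(value)


def is_valid_issn(value: str) -> bool:
    """Check the format and the check digit of an ISSN string."""
    match = ISSN_PATTERN.match(value.strip())
    if match is None:
        return False
    body = match.group(1) + match.group(2)
    return issn_check_digit(body) == match.group(3).upper()


if __name__ == "__main__":
    for candidate in ("0378-5955", "2434-561X", "0378-5956"):
        print(candidate, is_valid_issn(candidate))
```

Checking the digit rather than only the shape of the string catches most single-character typos and transpositions, which is presumably the appeal of a formula-based validator.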

Eric Lease Morgan: JSTOR Workset Browser

planet code4lib - Tue, 2015-06-30 18:07

Given a citations.xml file, this suite of software — the JSTOR Workset Browser — will cache and index content identified through JSTOR’s Data For Research service. The resulting (and fledgling) reports created by this suite enable the reader to “read distantly” against a collection of journal articles.

The suite requires a hodgepodge of software: Perl, Python, and the Bash Shell. Your mileage may vary. Sample usage: cat etc/citations-thoreau.xml | bin/ thoreau

“Release early. Release often”.

Zotero: Zotero 4.0.27: Streamlined saving, easier bibliography language selection, and more

planet code4lib - Tue, 2015-06-30 17:22

Zotero 4.0.27, now available, brings some major new features, as well as many other improvements and bug fixes.

Streamlined saving (Zotero for Firefox)

In Zotero for Firefox, it’s now easier than ever to save items from webpages.

Zotero senses information on webpages through bits of code called site translators, which work with most library catalogs, popular websites such as Amazon and the New York Times, and many gated databases.

In the past, there have been two different ways of saving web sources to Zotero:

  • If Zotero detected a reference on a webpage, you could click an icon in the address bar — for example, a book icon on Amazon or a journal article icon on a publisher’s site — to save high-quality metadata for the reference to your Zotero library.
  • If a site wasn’t supported or a site translator wasn’t working, you could still save any webpage to your Zotero library by clicking the “Create Web Page Item from Current Page” button in the Zotero for Firefox toolbar or by right-clicking on the page background and choosing “Save Page to Zotero”. In such cases, you might need to fill in some details that Zotero couldn’t automatically detect.

In Zotero 4.0.27, we’ve combined the address bar icon and the “Create Web Page Item from Current Page” button into a single save button in the Firefox toolbar, next to the existing Z button for opening the Zotero pane.

The new save button on a New York Times article

(Don’t be confused by the book icon in the address bar in the top left — that’s a new Firefox feature, unrelated to Zotero.)

You can click the new save button on any webpage to create an item in your Zotero library, and Zotero will automatically use the best available method for saving data. If a translator is available, you’ll get high-quality metadata; if not, you’ll get basic info such as title, access date, and URL, and you can edit the saved item to add additional information from the webpage. The icon will still update to show you what Zotero found on the page, and, as before, you can hover over it to see which translator, if any, will be used.

This also means that a single shortcut key — Cmd+Shift+S (Mac) or Ctrl+Shift+S (Windows/Linux) by default — can be used to save from any webpage.

The new save button also features a drop-down menu for accessing additional functionality, such as choosing a non-default translator or looking up a reference in your local (physical) library without even saving it to Zotero.

Additional save options

(This functionality was previously available by right-clicking on the address bar icon, though if you knew that, you surely qualify for some sort of prize.) The new menu will be used for more functionality in the future, so stay tuned.

Prefer another layout? In addition to the new combined toolbar buttons, Zotero provides separate buttons for opening Zotero and saving sources that can be added using Firefox’s Customize mode.

Custom button layout

With the separate buttons, you can hide one or the other button and rely on a keyboard shortcut, move the buttons into the larger Firefox menu panel, or even move the new save button between the address bar and search bar, close to its previous position. (Since the new save button works on every page, it no longer makes sense for it to be within the address bar itself, but by using the separate buttons you can essentially recreate the previous layout.)

While all the above changes apply only to Zotero for Firefox for now, similar changes will come to the Chrome and Safari connectors for Zotero Standalone users in a future version. For now, Zotero Standalone users can continue to use the address bar (Chrome) or toolbar (Safari) icon to save recognized webpages and right-click (control-click on Macs) on the page background and choose “Save Page to Zotero” to save basic info for any other page.

Easier bibliography language selection

Making Zotero accessible to users around the world has always been a priority. Thanks to a global community of volunteers in the Zotero and Citation Style Language (CSL) projects, you can use the Zotero interface and also generate citations in dozens of different languages.

Now, thanks to community developers Rintze Zelle and Aurimas Vinckevicius, it’s much easier to switch between different languages when generating citations.

Previously, Zotero would automatically use the language of the Zotero user interface — generally the language of either Firefox or the operating system — when generating citations. While you’ve always been able to generate citations using a different language, doing so required changing a hidden preference.

You can now set the bibliography language at the same time you choose a citation style, whether you’re using Quick Copy, Create Bibliography from Selected Items, or the word processor plugins.

Choosing a bibliography language for Quick Copy

In the above example, even though the user interface is in English, the default Quick Copy language is being set to French. If an item is then dragged from Zotero into a text field, the resulting citation will be in French, using French terms instead of English ones (e.g., “édité par” instead of “edited by”).

The new language selector is even more powerful when using the word processor plugins. The bibliography language chosen for a document is stored in the document preferences, allowing you to use different languages in different documents — say, U.S. English for a document you’re submitting to an American journal and Japanese for a paper for a conference in Japan.

Note that, of the thousands of CSL styles that Zotero supports, not all can be localized. If a journal or style guide calls for a specific language, the language drop-down will be disabled and citations will always be generated using the required language. For example, selecting the Nature style will cause Zotero to use the “English (UK)” locale in all cases, as is required by Nature’s style guide.

Other changes

Zotero now offers an “Export Library…” option for group libraries, allowing the full collection hierarchy to be easily exported. If you find yourself facing many sync conflicts, you can now choose to resolve all conflicts with changes from one side or the other. For Zotero Standalone users, we’ve improved support for saving attachments from Chrome and Safari on many sites, bringing site compatibility closer to that of Zotero for Firefox. And we’ve resolved various issues that were preventing complete syncs for some people.

There’s too much else to discuss here, but see the changelog for the full list of changes.

Get it now

If you’re already using Zotero, your copy of Zotero should update to the new version automatically, or you can update manually from the Firefox Add-ons pane or by selecting the “Check for Updates” menu option in Zotero Standalone. If you’re not yet using Zotero, try it out today.

Ed Summers: Links in Obergefell v. Hodges

planet code4lib - Tue, 2015-06-30 16:58

Last week’s landmark ruling from the Supreme Court on same sex marriage was routinely published on the Web as a PDF. Given the past history of URL use in Supreme Court opinions I thought I would take a quick look to see what URLs were present. There are two, both in Justice Alito’s dissenting opinion, and one is broken … just four days after the PDF was published. You can see it yourself at the bottom of page 100 in the PDF.

If you point your browser at

you will get a page not found error:

Sadly even the Internet Archive doesn’t have a snapshot of the page available.

But notice it thinks it can get a copy of it still. That’s because the Centers for Disease Control’s website is responding with a 200 OK instead of a 404 Not Found:

zen:~ ed$ curl -I
HTTP/1.1 200 OK
Content-Type: text/html
X-Powered-By: ASP.NET
X-UA-Compatible: IE=edge,chrome=1
Date: Tue, 30 Jun 2015 16:22:18 GMT
Connection: keep-alive

At any rate, it’s not Internet Archive’s fault that they haven’t archived the Webpage originally published in 2009, because the URL is actually a typo. Instead it should be

which leads to:

So between the broken URL and the 200 OK for something not found we’ve got issues of link rot and reference rot all rolled up into a one character typo. Sigh.

I think a couple lessons for web publishers can be distilled from this little story:

  • when publishing on the Web include link checking as part of your editorial process
  • if you are going to publish links on the Web use a format that’s easy to check … like HTML.
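As a rough illustration of the first lesson, here is a minimal Python sketch of a link check that also flags the "200 OK for something not found" case described above. Everything in it is an assumption made for illustration: the function names, the URL pattern, and especially the NOT_FOUND_HINTS phrases, which are a crude heuristic rather than a reliable test.

```python
import re
import urllib.error
import urllib.request

URL_PATTERN = re.compile(r"https?://[^\s<>\"')\]]+")
# Hypothetical phrases that suggest an error page served with a 200 status.
NOT_FOUND_HINTS = ("page not found", "no longer available")

def extract_urls(text):
    """Return the URLs found in text, without trailing punctuation."""
    return [u.rstrip(".,;:") for u in URL_PATTERN.findall(text)]

def classify(status, body):
    """Label a response as 'broken', 'soft-404', or 'ok'."""
    if status >= 400:
        return "broken"
    lowered = body.lower()
    if any(hint in lowered for hint in NOT_FOUND_HINTS):
        return "soft-404"
    return "ok"

def check(url, timeout=10):
    """Fetch url and classify the result; network failures count as broken."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            body = resp.read(65536).decode("utf-8", errors="replace")
            return classify(resp.status, body)
    except urllib.error.HTTPError as err:
        return classify(err.code, "")
    except OSError:
        return "broken"
```

Running check over the result of extract_urls on a draft's text before publication would catch outright failures, and the soft-404 label gives an editor a reason to look at a page by hand even when the server claims success.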

LITA: ALA appoints Jenny Levine next LITA Executive Director

planet code4lib - Tue, 2015-06-30 15:10

The American Library Association is pleased to announce the appointment of Jenny Levine as the Executive Director of the Library and Information Technology Association, a division of the ALA, effective August 3, 2015. Ms. Levine has been at the American Library Association since 2006 as the Strategy Guide in ALA’s Information Technology and Telecommunications Services area, charged with providing vision and leadership regarding emerging technologies, development of services, and integration of those services into association and library environments. In that role she coordinated development of ALA’s collaborative workspace, ALA Connect, and provided ongoing support and documentation. She convened the staff Social Media Working Group and coordinated a team-based approach for strategic posting to ALA’s social media channels. In addition, she has been the staff liaison to ALA’s Games and Gaming Round Table (GameRT) and coordinated a range of activities, including the 2007 & 2008 Gaming, Learning, and Libraries Symposia and International Games Day @ your library. She developed the concept for and manages the Networking Uncommons gathering space at ALA conferences.

Prior to joining the ALA staff, Jenny Levine held positions as Internet Development Specialist and Strategy Guide at the Metropolitan Library System in Burr Ridge (IL), Technology Coordinator at the Grande Prairie Public Library District in Hazel Crest (IL), and Reference Librarian at the Calumet City Public Library in Calumet City (IL). She received the 2004 Illinois Library Association Technical Services Award and a 1999 Illinois Secretary of State Award of Recognition.

Jenny has an M.L.S. from the University of Illinois, Urbana-Champaign, and a B.S. in Journalism/Broadcast News from the University of Kansas, Lawrence. Within ALA, she is a member of LITA, GameRT, the Intellectual Freedom Round Table (IFRT), and the Gay, Lesbian, Bisexual, and Transgender Round Table (GLBTRT). She is also active outside ALA and belongs to the American Civil Liberties Union (ACLU), the Electronic Frontier Foundation (EFF), the ALA-tied Freedom to Read Foundation (FTRF), the Human Rights Campaign (HRC) and the Illinois Library Association (ILA).

Jenny Levine has been an active presenter and writer, including three issues of Library Technology Reports on Gaming & Libraries. Among the early explorers of Library 2.0 technologies, from the Librarians’ Site du Jour (the first librarian blog) to the ongoing The Shifted Librarian, she is active in a wide variety of social media.

Ms. Levine becomes executive director of LITA on the retirement of Mary Taylor, LITA executive director since 2001. Thanks go to the search committee for a thoughtful and successful process: Rachel Vacek, Thomas Dowling, Andromeda Yelton, Isabel Gonzalez-Smith, Keri Cascio, Dan Hoppe and Mary Ghikas.

David Rosenthal: Blaming the Victim

planet code4lib - Tue, 2015-06-30 15:00
The Washington Post is running a series called Net of Insecurity. So far it includes:
  • A Flaw In The Design, discussing the early history of the Internet and how the difficulty of getting it to work at all and the lack of perceived threats meant inadequate security.
  • The Long Life Of A Quick 'Fix', discussing the history of BGP and the consistent failure of attempts to make it less insecure, because those who would need to take action have no incentive to do so.
  • A Disaster Foretold - And Ignored,  discussing L0pht and how they warned a Senate panel 17 years ago of the dangers of Internet connectivity but were ignored.
Perhaps a future article in the series will describe how successive US administrations consistently strove to ensure that encryption wasn't used to make systems less insecure and that the encryption that was used was as weak as possible. They prioritized their (and their opponents') ability to spy over mitigating the risks that Internet users faced, and they got what they wanted, as we see with the compromise of the Office of Personnel Management and the possibly related compromise of health insurers including Anthem. These breaches revealed the kind of information that renders everyone with a security clearance vulnerable to phishing and blackmail. Be careful what you wish for!


The compromises at OPM and at Sony Pictures have revealed some truly pathetic security practices at both organizations, which certainly made the bad guy's job very easy. Better security practices would undoubtedly have made their job harder. But it is important to understand that in a world where Kaspersky and Cisco cannot keep their systems secure, better security practices would not have made the bad guy's job impossible.

OPM and Sony deserve criticism for their lax security. But blaming the victim is not a constructive way of dealing with the situation in which organizations and individuals find themselves.

Prof. Jean Yang of CMU has a piece in MIT Technology Review entitled The Real Software Security Problem Is Us that, at first glance, appears to make a lot of sense but actually doesn't. Prof. Yang specializes in programming languages and is a "cofounder of Cybersecurity Factory, an accelerator focused on software security". She writes:
we could, in the not-so-distant future, actually live in a world where software doesn’t randomly and catastrophically fail. Our software systems could withstand attacks. Our private social media and health data could be seen only by those with permission to see it. All we need are the right fixes.
A better way would be to use languages that provide the guarantees we need. The Heartbleed vulnerability happened because someone forgot to check that a chunk of memory ended where it was supposed to. This could only happen in a programming language where the programmer is responsible for managing memory. So why not use languages that manage memory automatically? Why not make the programming languages do the heavy lifting?
Another way would be to make software easier to analyze. Facebook had so much trouble making sense of the software it used that it created Hack and Flow, annotated versions of PHP and Javascript, to make the two languages more comprehensible.
Change won’t happen until we demand that it happens. Our software could be as well-constructed and reliable as our buildings. To make that happen, we all need to value technical soundness over novelty. It’s up to us to make online life as safe as it is enjoyable.

It isn't clear who Prof. Yang's "we" is, end users or programmers. Suppose it is end users. Placing the onus on end users to demand more secure software built with better tools is futile. There is no way for an end user to know what tools were used to build a software product, no way to compare how secure two software products are, no credible third-party rating agency to appeal to for information. So there is no way for the market to reward good software engineering and punish bad software engineering.

Placing the onus on programmers is only marginally less futile. No one writes a software product from scratch from the bare metal up. The choice of tools and libraries to use is often forced, and the resulting system will have many vulnerabilities that the programmer has no control over. Even if the choice is free, it is an illusion to believe that better languages are a panacea for vulnerabilities. Java was designed to eliminate many common bugs, and it manages memory. It was effective in reducing bugs, but it could never create a "world where software doesn’t randomly and catastrophically fail".
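To make the memory-management point concrete, here is a toy Python simulation of the kind of missing length check that the quoted passage attributes to Heartbleed. It is an illustration only, not OpenSSL's code: the flat MEMORY buffer, the constants, and both function names are assumptions invented for this sketch.

```python
# One flat bytearray stands in for process memory: a 4-byte request
# payload followed by unrelated data that should never be sent out.
MEMORY = bytearray(b"ping" + b"SECRET-KEY-MATERIAL")
PAYLOAD_START, PAYLOAD_LEN = 0, 4  # the request's real payload is b"ping"

def heartbeat_unchecked(claimed_len):
    """Echo claimed_len bytes, trusting the length field in the request."""
    return bytes(MEMORY[PAYLOAD_START:PAYLOAD_START + claimed_len])

def heartbeat_checked(claimed_len):
    """Echo the payload only if the claimed length fits what was sent."""
    if claimed_len > PAYLOAD_LEN:
        raise ValueError("claimed length exceeds actual payload")
    return bytes(MEMORY[PAYLOAD_START:PAYLOAD_START + claimed_len])
```

A request claiming 10 bytes gets b"pingSECRET" back from the unchecked version and an error from the checked one. A memory-managed language removes this particular class of over-read, but as the OPM example shows, it does nothing about stolen credentials or default keys.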

Notice that the OPM compromise used valid credentials, presumably obtained through social engineering, so it would have to be blamed on system administrators rather than programmers, or rather on management's failure to mandate two-factor authentication. But equally, even good system administration couldn't make up for Cisco's decision to install default SSH keys for "support reasons".

For a more realistic view, read A View From The Front Lines, the 2015 report from Mandiant, a company whose job is to clean up after compromises such as the 2013 one at Stanford. Or Dan Kaminsky's interview with Die Zeit Online in the wake of the compromise at the Bundestag:
No one should be surprised if a cyber attack succeeds somewhere. Everything can be hacked. ...  All great technological developments have been unsafe in the beginning, just think of the rail, automobiles and aircrafts. The most important thing in the beginning is that they work, after that they get safer. We have been working on the security of the Internet and the computer systems for the last 15 years.

Yes, automobiles and aircraft are safer but they are not safe. Cars kill 1.3M and injure 20-50M people per year, making them the 9th leading cause of death. And that is before they become part of the Internet of Things and their software starts being exploited. Clearly, some car crash victims are at fault and others aren't. Dan is optimistic about Prof. Yang's approach:
It is a new technology, it is still under development. In the end it will not only be possible to write a secure software, but also to have it happen in a natural way without any special effort, and it shall be cheap.

I agree that the Langsec approach and capability-based systems such as Capsicum can make systems safer. But making secure software possible is a long way from making secure software ubiquitous. Until it is at least possible for organizations to deploy a software and hardware stack that is secure from the BIOS to the user interface, and until there is liability on the organization for not doing so, blaming them for being insecure is beside the point.

The sub-head of Mandiant's report is:
For years, we have argued that there is no such thing as perfect security. The events of 2014 should put any lingering doubts to rest.

It is worth reading the whole thing, but especially their Trend 4, Blurred Lines, that starts on page 20. It describes how the techniques used by criminal and government-sponsored bad guys are becoming indistinguishable, making it difficult not merely to defend against the inevitable compromise, but to determine what the intent of the compromise was.

The technology for making systems secure does not exist. Even if it did it would not be feasible for organizations to deploy only secure systems. Given that the system vendors bear no liability for the security of even systems intended to create security, this situation is unlikely to change in the foreseeable future.

Cynthia Ng: Accessible Format Production Part 5: Editing the Document

planet code4lib - Tue, 2015-06-30 01:24
Much like for PDF, there are different levels of accessibility for an electronic text (e-text) document. The more that you complete for a document, the more accessible it is. However, you still want to balance quality vs. quantity.  First, I am going to assume that you have a document file of some sort, whether that be … Continue reading Accessible Format Production Part 5: Editing the Document

Access Conference: Announcing our AccessYYZ Binkley Lecturer: Molly Sauter!

planet code4lib - Mon, 2015-06-29 19:47

The Access 2015 Organizing Committee is thrilled to announce that our speaker for the Dave Binkley Memorial Lecture is Molly Sauter!

Molly is a Vanier Scholar and PhD student in Communication Studies at McGill University in Montreal, Canada. She holds a masters degree in Comparative Media Studies from MIT, and is an affiliate researcher at the MIT Center for Civic Media at the MIT Media Lab and at the Berkman Center for Internet and Society at Harvard University. Molly has published widely on internet activism, hacker culture, and depictions of technology in the media. Her recent book, The Coming Swarm, examines the use of Distributed Denial of Service (DDoS) actions as a form of political activism.

You can find Molly online at and on twitter at @oddletters.

More about the Dave Binkley Memorial Lecture.

LITA: 2015 LITA Forum, Registration Opens!

planet code4lib - Mon, 2015-06-29 17:00

Registration Now Open!

2015 LITA Forum
Minneapolis, MN
November 12-15, 2015

Plan now to join us in Minneapolis, Minnesota, at the Hyatt Regency Minneapolis for the 2015 LITA Forum, a three-day educational event that includes 2 preconferences, 3 keynote sessions, more than 55 concurrent sessions and 15 plus poster presentations.

2015 LITA Forum is the 18th annual gathering of technology-minded information professionals and is a highly regarded annual event for those involved in new and leading edge technologies in the library and information technology field. Registration is limited in order to preserve the important networking advantages of a smaller conference. Attendees take advantage of the informal Friday evening reception, networking dinners and other social opportunities to get to know colleagues and speakers. Comments from past attendees:

  • “Best conference I’ve been to in terms of practical, usable ideas that I can implement at my library.”
  • “I get so inspired by the presentations and conversations with colleagues who are dealing with the same sorts of issues that I am.”
  • “After LITA I return to my institution excited to implement solutions I find here.”
  • “This is always the most informative conference! It inspires me to develop new programs and plan initiatives.”

This Year’s featured Keynote Sessions

Mx A. Matienzo
Director of Technology for the Digital Public Library of America, he focuses on promoting and establishing digital library interoperability at an international scale. Prior to joining DPLA, Matienzo worked as an archivist and technologist specializing in born-digital materials and metadata management, at institutions including the Yale University Library, The New York Public Library, and the American Institute of Physics.

Carson Block
Carson Block of Carson Block Consulting Inc. has led, managed, and supported library technology efforts for more than 20 years. He has been called “a geek who speaks English” and enjoys acting as a bridge between the worlds of librarians and hard-core technologists.

Lisa Welchman
President of Digital Governance Solutions at ActiveStandards. In a 20-year career, Lisa Welchman has paved the way in the discipline of digital governance, helping organizations stabilize their complex, multi-stakeholder digital operations. Her book Managing Chaos: Digital Governance by Design was published in February of 2015 by Rosenfeld Media.

The Preconference Workshops include

So You Want to Make a Makerspace: Strategic Leadership to support the Integration of new and disruptive technologies into Libraries: Practical Tips, Tricks, Strategies, and Solutions for bringing making, fabrication and content creation to your library.
Leah Kraus is the Director of Community Engagement and Experience at the Fayetteville Free Library.
Michael Cimino is the Technology Innovation and Integration Specialist at the Fayetteville Free Library.

Beyond Web Page Analytics: Using Google tools to assess searcher behavior across web properties
Robert L. Nunez, Head of Collection Services, Kenosha Public Library, Kenosha, WI
Keven Riggle, Systems Librarian & Webmaster, Marquette University Libraries

for registration and additional information.

Join us in Minneapolis!

