planet code4lib


Tara Robertson: alternate formats: who pays?

Mon, 2016-06-06 22:44

Yesterday I had a big realization. Many textbook publishers continue to publish inaccessible content, and those costs are borne by the public education system through alternate format production. Publishers are not held responsible for producing accessible material; universities and colleges purchase things that aren’t accessible to all of their students and then pay again to make them accessible. In BC I’d estimate that at least $1 million per year is spent on obtaining or producing alternate formats. This is an access issue, a human rights issue, and it’s also an economics issue.

Here are some of the conversations and pieces of information that led to this observation.

Creating an Inclusive Quality Standard of Education

I was sad to miss The Guelph Accessibility Conference at the University of Guelph last week. Karen McCall presented Creating an Inclusive Quality Standard of Education (PDF handouts of her slides), where she argues that access to education is a human right. At work I’m more focused on the technical workflows and had forgotten about the human rights issues around access to education. She says that “accommodation is the norm, rather than the exception” and that this keeps people with disabilities “on the periphery of society” (slide 3). She states that “what this does is shift the ‘cost’ of inclusive design and inclusive communities to the corporate sector instead of in primary, secondary and tertiary education” (slide 3).

Karen states that in the US $79 billion is spent on ICT (information communication technology) a year, so there is enough purchasing power to demand that things are accessible from the start. She argues that “the best way to ensure inclusive communities is to mandate the procurement of eAccessible only products and services” (slide 6). This would also encourage competition and innovation in the market, which would benefit everyone.

Universal design for learning workshops

Recently I’ve presented a few workshops on universal design for learning (UDL) with Amanda Coolidge and Sue Doner. These workshops build on the personas from the Accessibility Toolkit. The workshop materials are also CC-BY licensed, so feel free to use or adapt them.


Appendix: Redesign or Accommodation Activity Guidelines

In this workshop we also compare disability accommodation and UDL. There will always be a need for disability accommodation, but we argue that using the UDL principles can solve many of the common access issues (videos without captions, images that lack descriptions, poor organization of information and concepts).

Disability accommodation:
  • reactive
  • accommodation is for one student who has appropriate documentation
  • for many students there is a stigma in accessing disability services

Universal design for learning:
  • proactive
  • improves accessibility for many students: students with disabilities; students who have a disability but lack the documentation; students with a disability for whom the stigma in accessing services is too great; students for whom English is not their first language; students with a variety of learning styles
  • the onus is on the instructor to think about how they are teaching rather than on the individual student to request a retrofit

Jennifer LeVecque, from Camosun’s Disability Services Department, pointed out that for print coursepacks from the campus bookstore it’s possible that the publisher gets paid more than once. First, the library might already be paying to license the journal databases that contain those articles. Second, the bookstore (or the copyright office) might be paying the publisher for the rights to produce the coursepack, then passing those costs on to the student. When most academic libraries opted out of the Access Copyright tariff in 2012, many worked to change the workflow for producing and licensing coursepacks, encouraging faculty to link directly to the articles that the library had licensed. This is also a UDL best practice, as it supports multiple means of representation and allows students who have print disabilities to access these digital files using whatever assistive technology they use.

CAPER-BC Advisory Committee meeting

At the CAPER BC Advisory Committee meeting there were questions about why publishers are producing new e-textbooks that are not accessible. Jewelles Smith, BC Director for NEADS, suggested that it would be useful to collaborate in assessing the accessibility of specific publisher e-textbook platforms, or of common e-textbook titles that are being used. Last month Benetech published their Buy Accessible guidelines, which is a list of specific questions for people who are selecting and purchasing textbooks to ask publishers and vendors.

So what?

Many for-profit textbook publishers continue to publish content that is inaccessible, and the public education system spends money to remediate these textbooks to make them accessible. Textbook publishers make a lot of money, yet have shrugged off their ethical and (depending on where you live) legal responsibilities to students with disabilities. Faculty keep choosing to use these textbooks, and bookstores keep buying them. Then Disability Service Offices and organizations like the one where I work spend a lot of time and money retrofitting. This is not a financially sustainable model.


We need to build accessibility language into procurement policies at universities and colleges. Where things are not accessible, we need to make the cost of the retrofit explicit and charge that cost back to the publisher. With digital workflows, publishers have the opportunity to make fully accessible digital versions of textbooks available for students to buy. Right now alternate format production is an externality to publishers, so they bear no cost for ignoring accessibility guidelines and have no financial incentive to meet them. If we believe that education is a human right for all, then we need procurement policies and laws that reflect this.

HangingTogether: “Ground Truthing” MARC

Mon, 2016-06-06 20:00

Although the thought was revolutionary back in 2002, librarians now widely recognize that our metadata requirements have outgrown the MARC standard. After 50 years of service it’s time to make library data more actionable and linkable on the web. But to do that we need to bring our existing data assets into some sort of new regime. And doing that well is going to take understanding what we have to work with.

Before I go on I need to make it clear that when I say “MARC” I’m really conflating a number of things that attempt to describe and control which data elements are recorded and how. For the purposes of this piece, MARC comprises the various flavors of the MARC standard (primarily MARC21), cataloging rules as expressed in AACR (first AACR1, then AACR2), and even punctuation rules as described by ISBD. These are the foundational elements of the library bibliographic universe as it has been developing since the 1960s.

In a perfect world these standards would have described exactly the right things to do and the right ways to accomplish them and they would have been widely understood and uniformly applied. This is not a perfect world.

So as the profession moves forward to create a new bibliographic infrastructure that is web-ready, linked and linkable, an important — indeed vital — question that must be answered is: What do we have to work with? That is, if we want to create canonical linked data entities for elements like authors, subjects, works, etc. we need data that is largely machine actionable in some critical ways. In other words, instead of looking at our data from afar, as satellites look at the Earth, we must “ground truth” our data just as those working with remote sensing do — we need to check our perceptions on the ground to make sure they are accurate. This is what my project, “MARC Usage in WorldCat” has been doing for the last 3 years.

For example, look at how many ways we have recorded the fact that an item includes “illustrations”. It’s clear that processing any of our data will require a lot of normalization, at minimum, to be able to understand that “ill.” and “illustrations” and “ilustrations” and “ilustr.” and “il.” and whatever, all mean the same thing.
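To make the normalization problem concrete, here is a minimal sketch of the kind of variant-collapsing step such processing implies. The variant list and the token-splitting rule below are illustrative assumptions, not the project’s actual data or code; the real list of spellings found in WorldCat is far longer.

```python
import re

# Illustrative variant spellings of "illustrations" seen in
# physical-description strings (300 $b); a real mapping table
# would be built from the observed data itself.
ILLUSTRATION_VARIANTS = {
    "ill.", "ill", "illus.", "illus", "illustrations",
    "ilustrations", "ilustr.", "il.",
}

def normalize_physical_description(value: str) -> str:
    """Collapse known variant tokens to the canonical term 'illustrations'."""
    tokens = re.split(r"[,;]\s*", value.strip())
    out = []
    for token in tokens:
        if token.lower().strip() in ILLUSTRATION_VARIANTS:
            out.append("illustrations")
        else:
            out.append(token)  # leave unrecognized tokens untouched
    return ", ".join(out)
```

For example, `normalize_physical_description("ilustr., maps")` yields `"illustrations, maps"`, while tokens it does not recognize pass through unchanged.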

This is not the only use for the site, as Reinhold Huevelmann at the German National Library (DNB) points out:

Mapping an element from an internal metadata format, e.g. Pica+, to MARC 21 (see ) sometimes needs grounded discussions and informed decisions.  In cases of doubt, apart from the MARC 21 standard and its documentation, the reality of MARC 21 data elements used in OCLC’s WorldCat provides good guidance in choosing the right target element, to prevent theoretically available, but exotic options, and thus to enhance the visibility of bibliographic data and the resources described.

As Reinhold points out, the standards are one thing, but another is how they have been used “on the ground”. There are often decisions that must be made between competing options and it can be helpful to know what decisions others have made.

Thus this project is unlike most of what OCLC Research does, as it does not come to any conclusions, but rather is a tool that can be used to help you reach your own conclusions about library bibliographic data and how it continues to evolve over time. A side benefit has also been that occasionally we catch errors that can be corrected.

About Roy Tennant

Roy Tennant works on projects related to improving the technological infrastructure of libraries, museums, and archives.


District Dispatch: CopyTalk webinar on state government overreach now available

Mon, 2016-06-06 17:36

From Lotus Head

In recent years entities of state government have attempted to rely on copyright as a means to suppress the dissemination of taxpayer-funded research and as a means to chill criticism, but failed in the courts due to a lack of copyright authority. Ernesto Falcon, legislative counsel with the Electronic Frontier Foundation, reviews the status of pending California legislation, the court decisions that led to its creation, and the debate that now faces the California legislature in this CopyTalk webinar.


The post CopyTalk webinar on state government overreach now available appeared first on District Dispatch.

Islandora: Registration now open: iCampMO 2016

Mon, 2016-06-06 16:31

Registration is now open for the last Islandora Camp of 2016: Islandora Camp Missouri, taking place in Kansas City from October 12 - 14. You can save some money with the Early Bird rate until August 15th. We also have a Call for Proposals open until August, so you can share your own work with Islandora and related systems. 

Interested in becoming a sponsor? Have questions about the event? Contact us.

Big thanks to our host and sponsor:

Villanova Library Technology Blog: The Community Bibliography – A Falvey Memorial Library Project

Mon, 2016-06-06 13:00

The Community Bibliography is “[a] celebration of Villanova University community authors and scholars past, present and future.” It is “an open repository of the entire published output of the Villanova University community.” The goal is to digitally preserve “our proud scholarly heritage, from our community’s historical publications of the 19th century to the cutting edge research of today.” Community is defined as any individual (faculty, staff, student, alumnus, Augustinian, administrator) affiliated with Villanova University.

This Bibliography may be of interest to Villanova alumni returning for Reunion 2016 (Thursday, June 9 – Sunday, June 12). The Community Bibliography hosts citations for alumni authors from the Class of 1920 through the Class of 2015. Here is an opportunity to check out what your classmates have accomplished.

The Community Bibliography evolved from discussions among Library Director (at the time) Joe Lucia; Darren Poley, Theology/Outreach librarian; Michael Foight, Special Collections and Digital Library coordinator;  and Andrew Nagy, a former Falvey technical developer. Poley explains, “The idea was to use the citation management software Andrew developed for the Finding Augustine project to manage a comprehensive list of published artifacts by anyone affiliated with Villanova since the inception of the University. Michael and I agreed that his team would manage the image management associated with creating an institutional repository, while my Outreach team would oversee the development and maintain a bibliography that would be fully searchable on the Web and that [we] would not need to worry about copyright issues since it would only be supplying the citations.”

A data entry pilot project began in January 2007 and that was a pivotal year for the Community Bibliography. In May the project officially came under the supervision of the Outreach team and, three months later, the project gained momentum with increased multi-faceted data gathering. Later that year Falvey personnel began talking to people outside of Falvey about inter-operability. In November a content review produced procedural and system refinements.

The Community Bibliography was unveiled to the University’s academic leaders at a March 1, 2008, gala dinner in Falvey. There, Poley said, “Our Community Bibliography specifically allows for all works, popular and scholarly, to be documented, but why bother? This information is already gathered both formally and informally. Professors keep track of works for Curriculum Vitae, offices and departments monitor faculty and staff publications. But how does one know altogether what Villanova as a community has published? The problem is that there is no one place where information on all of these works is available … Our Community Bibliography becomes the device for allowing ourselves and others to see in a measurable way what our community has produced.”

A February 2008 newsletter article, “The ‘institutional repository’ rethought:  Community Bibliography debuts,” not only explains the significance of the project, but also tells how it relates to the Faculty Fulltext project created by the Digital Library.

Stephen Spatz, assistant Outreach and Research Support librarian, does most of the day-to-day work on the Bibliography. He gathers and uploads citations of works by Villanova University community members; he searches mostly Falvey’s database collection, but also occasionally locates materials in faculty and departmental webpages and “even in a few cases, typewritten bibliographies, both published and unpublished.” He says, “There are currently about 12,000 citations in the database, most of which cover the most recent scholarly output of the VU community, but about 5% predate 1980 and, even in some cases, stretch back into the 19th century.” Spatz also maintains the Digital Library’s Faculty Fulltext database, “which aims to parallel the citation-only content of the Community Bibliography with full-text versions of the most recent scholarly output of VU faculty.” Spatz also supervises students who do some of the data entry.

The two projects, Community Bibliography and Faculty Fulltext, developed from an academic movement to counter the commercialization of intellectual property, making information freely available as a means of sharing and promoting scholarship. Falvey’s early creation of these two projects puts it on the cutting edge of new ways of using technology to share scholarly information.

For more information contact


Darren Poley, Stephen Spatz and Michael Foight generously contributed information for this article.


Terry Reese: MarcEdit Updates

Mon, 2016-06-06 05:27

This update has been a little while coming and represents a significant number of updates, bug fixes and enhancements. On the Mac side, the two largest updates were the implementation of the Delimited Text Translator and the OAI Harvester; on all platforms, the headline addition is Alma integration. You can find notes about the Alma integration here:

Please see the full list of updates below.  Downloads can be picked up through the automatic update mechanism in MarcEdit or via the downloads page at:


MarcEdit Windows/Linux Updates:

* Bug Fix: ILS Integration: Local Integration — corrected display rendering and search for keyword
* Bug Fix: Add/Delete Records — Corrected problem when using the Add field only if not a duplicate option
* Enhancement: Validate Headings — added dynamic caching
* Enhancement: Build Links — added dynamic caching
* Enhancement: ILS Integration — First version of Alma integration
* Bug Fix: Math conversion — Degree/minute/seconds to Degrees correction
* Settings Change: Updated the RDA Field conversion to limit abbreviation checking in the 245 field to the 245$c
* Enhancement: RDA Abbreviations — new abbreviations added
* Enhancement: Select/Delete MARC Records — added option to expose specialized search options like Field # searching, Range Searching, File Searching and Size searching.
* Bug Fix: OAI Harvester — Debug URL wasn’t correct when adding date values.
* Bug Fix: RDA Helper — Added Data validation to ensure that invalid 008 data doesn’t cause a data crash.
* Enhancement: Delimited Text Translator — Added more preview options
* Enhancement: Delimited Text Translator — Added Holdings LDR/008 values
* Enhancement: UI Improvements — a large number of textboxes that accept file paths now support drag and drop.

MarcEdit Mac Updates:

* Bug Fix: ILS Integration: Local Integration — corrected display rendering and search for keyword
* Bug Fix: Add/Delete Records — Corrected problem when using the Add field only if not a duplicate option
* Enhancement: Validate Headings — added dynamic caching
* Enhancement: Build Links — added dynamic caching
* Enhancement: ILS Integration — First version of Alma integration
* Bug Fix: Math conversion — Degree/minute/seconds to Degrees correction
* Settings Change: Updated the RDA Field conversion to limit abbreviation checking in the 245 field to the 245$c
* Enhancement: RDA Abbreviations — new abbreviations added
* Enhancement: Select/Delete MARC Records — added option to expose specialized search options like Field # searching, Range Searching, File Searching and Size searching.
* Bug Fix: RDA Helper — Added Data validation to ensure that invalid 008 data doesn’t cause a data crash.
* Enhancement: UI Improvements — a large number of textboxes that accept file paths now support drag and drop.
* Enhancement: OAI Harvester Implemented
* Enhancement: Delimited Text Translator implemented

Terry Reese: MarcEdit Alma Integration

Mon, 2016-06-06 05:27

Over the past month, I’ve been working with ExLibris (thank you to Ori Miller at ExLibris) and Boston College (thanks to Margaret Wolfe) to provide direct integration between MarcEdit and Alma via the Alma APIs. Presently, the integration allows users to search, create, and update records. Setup is pretty easy (I think), and once you have your API access set up correctly you should be off and running. But it will be interesting to see if that’s the case as more people play around with this in their sandboxes.

Setting up integration

MarcEdit Alma integration requires that you configure an API key with Alma that supports the bib API and the user API. The bib API provides the endpoints where record editing and retrieval happen, while the user API is used to provide a thin layer of authentication before MarcEdit attempts to run an operation (since Alma doesn’t have its own authentication process separate from having a key).
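For readers curious what those two calls look like on the wire, here is a rough sketch of the URL construction against Ex Libris’s documented /almaws/v1/users and /almaws/v1/bibs REST paths. The host, API key, and IDs below are placeholders, and this is only an illustration of the request pattern, not MarcEdit’s actual code.

```python
from urllib.parse import urlencode

ALMA_DOMAIN = "api-na.hosted.exlibrisgroup.com"  # placeholder: your region's API host
API_KEY = "your-sandbox-api-key"                 # placeholder: key from the Ex Libris developer portal

def user_check_url(user_id: str) -> str:
    """User API call usable as a lightweight auth check before editing."""
    qs = urlencode({"apikey": API_KEY})
    return f"https://{ALMA_DOMAIN}/almaws/v1/users/{user_id}?{qs}"

def bib_url(mms_id: str) -> str:
    """Bib API endpoint for retrieving (GET) or updating (PUT) a record by MMS ID."""
    qs = urlencode({"apikey": API_KEY})
    return f"https://{ALMA_DOMAIN}/almaws/v1/bibs/{mms_id}?{qs}"
```

A GET against `bib_url("991234567890")` would return the record as XML; the same URL accepts a PUT to write an edited record back.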

I’d recommend testing this first in your Sandbox.  To do this, you’ll need to know your sandbox domain, and be able to configure the API accordingly.  If you don’t know how to do this, you’ll want to contact ExLibris. 

Once you have your API key, open MarcEdit’s main window and click the Preferences icon.

This will open the Preferences window. Select the ILS Integration link, check the Enable ILS Integration checkbox, select Alma from the listbox, and then enter the domain for your sandbox. Alma’s API doesn’t require a username, so leave that blank, but enter your API key into the Password textbox. Finally, you’ll need to have set up a Z39.50 connection to your instance; this is how MarcEdit searches Alma for record retrieval. If you haven’t set up a Z39.50 connection, you can do that here, or you can open the Z39.50 Client, select Modify Databases, add a new Z39.50 server, and enter the information for your Alma instance. Here’s an example configuration (minus the username and password) for Boston College’s sandbox:

With your Z39.50 server configured and selected, the ILS Integration Preferences window will look something like this:

Save these settings.  Now, when you open the MarcEditor, you’ll see a new menu item:

This menu item will allow you to search and update/create records.  To find items, click on the menu and select Search.  You’ll get the following window:

If I run a search for Boston, I’ll retrieve 5 results based on the limit set in the Limit textbox:

You can either download all the items by clicking Download All Items, or you can select the items you want individually and right-click on the results. This will give you a menu allowing you to download the records.

When downloaded, the record will be opened into MarcEdit like the below:

A couple of notes about the download: if the download includes an 852 (and it can), you’ll want to delete that field, otherwise the field will get duplicated. Right now, I’m trying to figure out if MarcEdit should just remove the value, or if there is an applicable use case for keeping it.

Download the record, make the edits that you want to make to the record, and then click the Update/Create option from the Alma window.

When you click Update/Create, the tool will upload your data to your Alma server. If there is an error, you’ll receive the returned error message. If the process was successful, you’ll get a message telling you that the data has been processed. If you are interested in seeing the resulting XML output, MarcEdit automatically copies the data to the clipboard.
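Under the hood, an update against the Alma bib API is a PUT of the record XML. The sketch below shows the shape of that request using Python’s standard library; the URL and record body are placeholders, and this illustrates the call pattern rather than MarcEdit’s implementation.

```python
import urllib.request

def build_update_request(bib_api_url: str, marcxml_record: bytes) -> urllib.request.Request:
    """Prepare the PUT that sends an edited record back to Alma.

    The caller still sends it (urllib.request.urlopen) and handles
    any error payload Alma returns in the response body.
    """
    return urllib.request.Request(
        bib_api_url,
        data=marcxml_record,
        method="PUT",
        headers={"Content-Type": "application/xml"},
    )
```

On success Alma echoes back the stored record as XML, which is the output MarcEdit copies to the clipboard.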

A couple of notes about the process: in my testing, I found that updating serials records was spotty. I’m thinking this might have something to do with permissions, but I’m not positive about that. I’m hoping to do a bit more investigation, but I wanted to get this out for folks to start playing with it and maybe provide some feedback.

Secondly, there is a holdings API – it would be possible to allow users to modify holdings data via MarcEdit, but I’d need use-cases in order to see how it fits into this process.

I’m sure this will be a process that I’ll be refining over the next few weeks, but in the meantime, I’d welcome any and all comments. 


* I’ll be posting a short YouTube video and will update the URL here.

District Dispatch: Ask away: Get your copyright questions answered

Mon, 2016-06-06 05:26

Photo by Teddy Mafia via Flickr

Have a question about copyright policies? Library copyright experts will be available during the 2016 American Library Association’s (ALA) Annual Conference in Orlando, Fla. to respond to vexing copyright questions about licensing, fair use, electronic reserves, using music, images and video content, and more. Join copyright leaders during the interactive session “Ask Us Anything: Copyright Open House,” at which participants will have the opportunity to engage copyright experts on all of their copyright concerns. The session takes place on Sunday, June 26, 2016, from 1:00–2:30 p.m. in Orange County Convention Center, in room S329.

The program will include a late breaking copyright policy update from copyright leaders. The session will be a great opportunity to meet copyright geeks keen on helping academic, public and school librarians. The session is co-sponsored by the ALA Committee on Legislation (COL) Copyright Subcommittee. Participants will hear from a number of dynamic copyright experts, including Michael Brewer, head of the Research & Learning Department at the University of Arizona Libraries; Chris Le Beau, assistant teaching professor of the School of Information Science & Learning Technologies, University of Missouri; Laura Quilter, Copyright and Information Policy Librarian at the University of Massachusetts, Amherst; Carrie Russell, program director of the Public Access to Information for the American Library Association’s Office for Information Technology Policy; and Peggy Tahir, Education & Copyright Librarian at the University of California–San Francisco (UCSF) Library.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post Ask away: Get your copyright questions answered appeared first on District Dispatch.

LibUX: Tim Broadwater, UX Architect

Mon, 2016-06-06 04:44

Tim is an artist and front-end developer, presently the UX Architect at West Virginia University Libraries and several times certified by the Nielsen Norman Group. He’s written two articles for LibUX ( “Value vs. Feasibility” / “Why am I doing this to our users?” ), and he’s been super amazing to work with.

I asked to pick his brain about his experience in NN/g’s certification programs and the burgeoning UX degree field – and I am left feeling pretty good about the state of library user experience design.

If you like, you can download the MP3 or subscribe to LibUX on Stitcher, iTunes, Google Play Music, or just plug our feed straight into your podcatcher of choice. Help us out and say something nice. Your sharing and positive reviews are the best marketing we could ask for.

Here are the pulls
  • 3:50 – UX Certifications and the burgeoning UX degree field
  • 9:22 – Are we at peak UX?

I know a handful of professionals who are great web developers and great designers …, and they refer to themselves as UX designers or UX architects, however they have never once conducted any type of usability study or intercept or any type of evaluation that involves their users – they don’t meet their users, ever. I think the term gets broadly applied to where it becomes a buzzword. Tim Broadwater

  • 15:55 – Pitching user research to stakeholders
  • 16:35 – Tim’s case study

We can shoot from the hip over and over and over again and sometimes we get an “okay” success, but … most of the time we get an absolute failure. How do we go forward? We have to make decisions based on user data. … Our target audience is constantly changing so we have to always be able to take the pulse. Tim Broadwater

  • 20:21 – We — Michael and Tim — love the hamburger menu. Unashamedly. And it’s going to be around for years.

I can’t deny [the hamburger menu] affords a certain amount of convenience in terms of design because of the … complexity of maintaining a front-end framework that must be as malleable [as a libraries’ must be] to adapt to so many different kinds of applications and so many different kinds of users. Michael Schofield

  • 28:50 – This has become Navigation UX Talk with Tim and Mike.
  • 34:03 – Left navigation? Ugh! As if!

I think left-hand navigation is kind of a lazy way to deal with your secondary tier navigation. There are so many different options now that are out there. I think what we’re seeing now is that with long scrolling pages and different kinds of navigation items, or navigations that are sticky, staying on the page, … there are different ways to get to the same information and it’s more important to evaluate what works best for you or your users, as opposed to playing it safe or going with your peers. Tim Broadwater

Why do all higher-education websites look the same? Because we’re all looking at each other’s for peer research! No one is looking at, which has this great search box functionality and I would argue that’s a perfect example for a library website … – and it uses the hamburger icon as well. Tim Broadwater

The post Tim Broadwater, UX Architect appeared first on LibUX.

Patrick Hochstenbach: Portrait of Anna Boulais

Mon, 2016-06-06 04:29
Filed under: Figure Drawings, portraits Tagged: fountainpen, ink, paper, sktchy, twsbi

Hydra Project: Hydra Virtual Connect 2016

Sun, 2016-06-05 12:19
What is Hydra Virtual Connect?

Hydra Virtual Connect (HVC) is an opportunity for Hydra Project participants to gather online to touch base on the progress of community efforts at a roughly halfway point between face-to-face Hydra Connect meetings. Hydra is a growing, active community with many initiatives taking place across interest groups, working groups, local and collaborative development projects, and other efforts, and it can be difficult for community members to keep up with all of this activity on a regular basis. HVC will give the Hydra community a chance to come together to catch up on developments, make new connections, and re-energize itself towards Hydra Connect 2016 in Boston in October.

Suggestions for an event such as this have come from a number of members of the Hydra community, and the idea was further discussed and refined at the Hydra Power Steering meeting in March 2016.

When will Hydra Virtual Connect take place?

Hydra Virtual Connect 2016 will take place on Thursday, July 7 from 11:00 AM – 2:00 PM EDT / 8:00 AM – 11:00 AM PDT / 16:00-19:00 BST / 15:00-18:00 UTC.  Reserve the time slot!!!

Further details

Further details can be found on the HVC wiki page here.

Booking details for the face-to-face Hydra Connect in Boston this October will be announced shortly.

David Rosenthal: He Who Pays The Piper

Fri, 2016-06-03 22:00
As expected, the major publishers have provided an amazingly self-serving response to the EU’s proposed open access mandate. My suggestion for how the EU should respond in turn is:
When the EU pays for research, the EU controls the terms under which it is to be published. If the publishers want to control the terms under which some research is published, publishers should pay for that research. You can afford to. ;-)

District Dispatch: Upping people’s digital IQ

Fri, 2016-06-03 17:46

OITP’s Larra Clark participates in roundtable discussion, “What’s Your Digital IQ?”

It was my pleasure last week to join the Council of Better Business Bureaus (BBB), Nielsen and the Multicultural Media, Telecom and Internet Council (MMTC) in a roundtable discussion on the importance of digital empowerment. “What’s Your Digital IQ?” was opened by Congressman Gus Bilirakis (R-FL) and former Federal Trade Commissioner Julie Brill, who talked about the importance of the $1 trillion digital economy and the need for tools to help people be smart online and protect themselves against hackers and scams.

Brill referenced recent analysis from the National Telecommunications and Information Administration (NTIA) citing that a lack of trust in Internet privacy and security may deter online activities. Forty-five percent of online households reported that privacy and security concerns stopped them from conducting financial transactions, buying goods or services, posting on social networks or expressing opinions on controversial or political issues via the Internet. (This last finding reminded me of past research by the Pew Research Center on Social Media and the Spiral of Silence.)

“Consumers need help,” Brill said. “Digital literacy and consumer education are necessary” to address privacy concerns and keep online economic activities humming.

The BBB has begun to take up this charge with its Digital IQ initiative, and Genie Barton, vice president and director of the BBB’s Online Interest-Based Advertising Accountability Program and Mobile Marketing Initiatives, discussed its commitment to building a trusted marketplace. Nicol Turner-Lee, vice president and chief research & policy officer for MMTC, affirmed the need for increasing people’s digital savvy—particularly among communities of color.

Interestingly, when I took the Digital IQ “challenge,” it reminded me a lot of the digital and information literacy that happens in libraries. One question asked if all the useful results of a web search are found on the first two pages. Others asked about http vs. https, giving personal information for loyalty programs, and offered food for thought regarding online advertising.

Libraries have long been champions of the right to privacy and teachers/guides for improving the digital skills of our community members. Librarians in all types of libraries help youth and new Internet users better understand and protect their digital “footprint” and be smarter online. According to the Digital Inclusion Survey, 57 percent of public libraries report they offer training on safe online practices. And PLA’s online hub for digital literacy support and training provides modules on Internet privacy and online scams. (Thanks also to librarians who emailed me before the panel to share some of your new or favorite resources, like the San Jose Public Library Virtual Privacy Lab and 10 Tips for Protecting Your Digital Privacy.)

Pew researcher John Horrigan specifically calls out libraries as part of the solution for increasing digital readiness. “Libraries, who are already the primary curator on programs to encourage digital readiness in many communities, should embrace and expand that role.”

I think the BBB and libraries could do great things together in this space. Are any of you out there already working with your local BBB? Let me know at

The post Upping people’s digital IQ appeared first on District Dispatch.

LITA: Let’s look at gender in Library IT

Fri, 2016-06-03 17:42

So. Let’s talk about library technology organizations and gender.

I attended the 2015 LITA Forum last year, and like many good attendees, I tweeted thoughts as I went. Far more popular in the Twitterverse than anything original I sent out was a simple summary of a slide in a presentation by Angi Faiks, “Girls in Tech: A gateway to diversifying the library workforce.”

The tweet in question was:

That this struck a chord is shocking, presumably, to no one.

The slide that prompted my tweet (a) references a 2009 article by Melissa Lamont that you should read, and (b) briefly presents (among other interesting data) numbers from the 2014-2015 ARL Annual Salary Survey (paywalled).

What is the problem symptom?

Given the popularity of the tweet, I thought I’d dig a little deeper and see what I could find out about Library IT and gender, with the expectation that it would be pretty disappointing.

Spoiler alert: it is.

Before you start thinking, “But…I work in a library, where it’s all mutual respect and a near-perfect meritocracy as far as the eye can see,” well, think again. The overall message I received during conversations on the edges of the conference was that women — especially young women — are often ignored, and their talents squandered, in the higher-tech side of the library world. And when you move away from anecdotes and start looking at the data, well, the numbers are striking and no less upsetting.

  • At the beginning of 2016, Bobbi Newman published a great examination of the LITA Top Tech Trends panelists by sex. Roughly 2/3 of the seats between 2003 and 2016 went to men, and 3/4 of repeat panelists were men.

  • The Lamont article mentioned before — and please, go read it — does some great original research enumerating what is likely a leading indicator: percentage of women authoring papers in library technology journals vs. more generic library journals (with the latter used as a control). First authorship in the higher tech journals goes to women about 34% of the time (JASIS&T is a low outlier with only 28%), while 65% of articles in the control journals have female first authors, mirroring pretty closely the percentage of women librarians in ARL libraries overall.

What are the data?

The numbers in my tweet suffer a bit from an apples-and-oranges comparison, with the ALA gender/race information coming from (wait for it…) the ALA, while the Library IT Heads numbers come from the 2014-2015 ARL statistics (Table 18).

Much (most?) of the IT work in libraries is, of course, done by “off-label” librarians — those hired to do a specific non-IT job, who are then pressed into service to do some programming or sysadmin or whatnot. However, we don’t have numbers for those, so I’m going to focus on the US ARL statistics for self-identified library IT departments, partially because I work in an ARL library, and partially because large academic libraries often have an internal, labeled IT department which makes counting easy.

Obviously, I’ve made a decision to give up generality in order to be able to make stronger assertions (e.g., LITA membership breakdown, were it available, might be more appropriate). I’d be very interested in looking at other data (or other slices of these data) if people have any available.

Categorizing Library IT positions

The ARL stats have a number of position categories, four of which obviously relate to Library IT and on which I’m going to focus here.

The leadership position I’ll treat as its own thing.

  • Department Head, Library Technology

The other three non-head IT positions I’ll treat as a group, giving this collection the whimsical name Library IT, non-head.

  • Library IT, Library Systems
  • Library IT, Web Developer
  • Library IT, Programmer

There are obviously other jobs that might or might not fit into library IT, depending on how a particular institution is structured. For example, at Michigan we have people who do markup for TEI documents and digitization specialists, neither set of which would obviously fall into one of the above categories. All those folks are part of Library IT on the organization chart at Michigan (and might not be at other places).

Let’s start with the non-head librarians and then look at department heads.

Library IT, non-head positions

61% of all US ARL Librarians are women, but only 29% of US ARL Librarians working in Library IT are women.

Overall, women outnumber men in ARL libraries by a substantial margin. The ARL report notes that, “the largest percentage of men employed in ARL libraries was 38.2% in 1980–81; since then men have consistently represented about 35% of the professional staff in ARL libraries,” (p. 15). That number is closer to 40% when looking at ARL institutions just within the US, as stated above.

So, we’ll call it 40% male librarians overall. How about in library IT?

In Library IT, men outnumber women by 526 to 212, giving us the 29% quoted above. That means there are about two and a half times as many men as women in library IT.

IT in general has been a male-dominated profession for a few decades now. A fairly recent article reports 2013 numbers that show women holding about 26% of jobs in computing, with many Big Name Tech Companies (Google, Facebook, Twitter, etc.) doing significantly worse.

We also don’t know about non-librarians working in library IT (I would be considered one). Given the overall IT statistics, it’s hard to believe that including non-librarians would move the needle toward having more women employees.

So on the one hand, we’re probably doing a very-slightly-less-awful job of bringing in women than the IT world in general. On the other, well, it’s only very slightly less awful, and this in a profession that is majority-female.

Library IT Heads

63% of Department Heads for departments other than IT in US ARL libraries are women. About 30% of Library IT Heads are women.

Given the numbers we’re about to look at, it’s worthwhile to note that the majority doesn’t always hold the power, a message driven home by this tweet from Amy Buckland:

The library writ large, then, is female-majority, but not necessarily female-dominated. Library IT, of course, is neither female-dominated nor female-majority.

First, a broader look. Leadership positions in the wider, non-library IT world in general go overwhelmingly to men. Women hold positions at the CIO level in only about 17% of the Fortune 500. So, the baseline is terrible.

The ARL Stats for 2014-15 (table 30) show 91 US libraries that have a head of Library IT, 27 (30%) of whom are women. That’s about the same as the rank & file IT workers, but far different than the nearly two-thirds of other department heads that are women.

Many people presume this is indicative of what has been called the pipeline problem, the idea that it’s hard to hire women leaders because there aren’t many women coming up, and the lack of women in leadership roles makes it harder to recruit women at the lower levels. This is a truth, but certainly not a complete truth.

Sex and salary in Library IT

The good news, such as it is, is that there is (basically) salary parity between men and women at both the IT rank & file and IT head positions.

The bad news is that this is one place where Library IT does better than the library on average. Across the whole library, men make an average of 5% more than women, an inequality that is true at every level of experience (ACRL table 38).

What does it mean?

The numbers give us what, of course, not why. For that explanation, many people initially grab onto the pipeline problem.

“Oh, woe is us white guys trying to do the right thing,” we lament. “We want to hire women and minorities, but none ever apply. There’s a pipeline problem.”

So let’s revisit our friend the pipeline problem. The problem is not just that the pipeline is small. The pipeline leaks.

Rachel Thomas’s article, “If you think women in tech is just a pipeline problem, you haven’t been paying attention” notes right up front that:

According to the Harvard Business Review, 41% of women working in tech eventually end up leaving the field (compared to just 17% of men)

Women leave IT at a much higher rate than other positions. IT can be, in ways large and small, antagonistic to women. Odds are your organization is. For those of you who think otherwise, I challenge you to find a shortish young woman where you work and ask her if she ever feels ignored or undervalued precisely because she’s a shortish young woman. Ask her if she always gets attributed for her ideas. Ask her if her initiatives are given the same consideration as her male colleagues.

What can we do?

Admit that there’s a problem. And then talk about it.

That’s easy to say and crazy-hard to do. The more privilege one has — and I’m a white, well-educated, middle-aged male, so I’ve got privilege up the wazoo — the easier it is to dismiss bias as small, irrelevant, or “elsewhere.”

As to how to get a conversation started, my colleague Meghan Musolff enlisted me to help her with an ingenious plan:

  • Send out an invitation to talk about a diversity-related reading

  • Show up

Our first monthly meeting of what we’re calling the “Tech Diversity Reading Group” ended up drawing about 2/3 of the department (including the boss, who bought pizza) and revolved around the Rachel Thomas pipeline article from above. And yes, the conversation was dominated by men, and yes, there were some nods to, “But this isn’t Silicon Valley so it doesn’t apply to us” or “That doesn’t happen around here, does it?” and, yes, many of the women didn’t feel comfortable speaking out.

We got a lot of feedback, in both directions, but none of it was of the “this isn’t a problem” variety. It wasn’t perfect (or maybe even “good”), but we were there, giving it a shot.

And you can, too.

District Dispatch: Narrowing the Justice Gap

Fri, 2016-06-03 16:51

Photo via Flickr.

While criminal justice issues have been increasingly in the public eye, lack of access to civil legal information and resources is a less well-known challenge that results in people appearing in court without lawyers in critical life matters such as eviction, foreclosures, child custody and child support proceedings, and debt collection cases. According to the Legal Services Corporation (LSC), more than 64 million Americans are eligible for civil legal aid, yet civil legal aid providers routinely turn away 80% of those who need help because of a lack of resources.

LSC is considering how it might increase access and awareness of civil legal information and resources through public libraries. A planning grant is supporting research and input from a diverse advisory committee to inform the development of a training curriculum for public librarians. I was pleased to join Public Library Association President-Elect (and Cleveland Public Library Director) Felton Thomas at the advisory committee meeting with others from state law libraries, university law libraries, legal aid providers already partnering with public libraries, and OCLC to learn more about the justice gap and how libraries may play a role in helping people find the legal information they need to narrow the justice gap.

Fortunately, a significant body of work already exists, including a series of webinars for librarians developed by LSC; a Public Library Toolkit developed by the American Association of Law Libraries; the Law4AZ initiative developed and delivered by the State Library of Arizona; and collaboration among the Hawaii State Judiciary, the Hawaii State Public Library System, and the Legal Aid Society of Hawaii to expand court forms available online and support librarian training and public seminars.

I’d be glad to hear from readers about their own experiences in this area and/or what you’d like to see in any future training or resources that may be developed. Shoot me a line at

The post Narrowing the Justice Gap appeared first on District Dispatch.

Open Knowledge Foundation: Addressing Challenges in Opening Land Data – Resources Are Now Live

Fri, 2016-06-03 14:31

Earlier this year, Open Knowledge International announced a joint-initiative with Cadasta Foundation to explore open data in property rights with the ultimate goal of defining the land ownership dataset for the Global Open Data Index. Now, we are excited to share some initial, ground-breaking resources that showcase the complexity of working at the intersection of open data advocacy and the property rights space.

Land ownership information, including the Land Registry and Cadastre, is traditionally held in closed datasets within pay-for-access systems. In these situations, the instinct within the open data community is to default to open. While we believe more openness is vital to our aims of using data to secure property tenure, who can use this open data and for what purpose must also be taken into account. Further, property rights administration systems are highly complex and vary greatly from context to context. The implications of open data in a country where the frequent clash of government, community and private sector interests fosters mutual mistrust are very different from those in countries with established land administration systems where most of the population’s property rights are formally documented. Our acknowledgement of these nuances is reflected in our research thus far and has been the foundation of our process to define open data in land rights.



These guides also exemplify the results of a partnership between the open data community and actors with sector-specific expertise. We foresee these resources and the lessons learned providing a framework for cross-sector data explorations as well as specific guidance for the international open data community involved with the Global Open Data Index.

Our initial resources include a comprehensive Overview of Property Rights Data and a Risk Assessment. These two guides are intended to explain what land ownership data is, where it can be found, as well as outline the process that OKI and Cadasta conducted to determine what of this data should be open. All current and forthcoming resources, as well as additional background on this project, can be found on Cadasta’s Open Data page.

We are actively seeking feedback to inform our research going forward and ensure that this work becomes a core resource within the open data and land rights communities alike. Please reply with your comments and questions on the Discussion Forum or by reaching out to our researcher, Lindsay Ferris, directly at We look forward to hearing from you.

FOSS4Lib Recent Releases: veraPDF - 0.16.2

Fri, 2016-06-03 13:52

Last updated June 3, 2016. Created by Peter Murray on June 3, 2016.

Package: veraPDF
Release Date: Friday, June 3, 2016

OCLC Dev Network: Server-Side Linked Data Consumption with Ruby

Fri, 2016-06-03 13:30

Learn about how to use Ruby to consume linked data from a specific graph URL.

Hydra Project: OR2017 will be in Brisbane

Fri, 2016-06-03 11:48

Likely of interest to many Hydranauts:

The Open Repositories (OR) Steering Committee in conjunction with the University of Queensland (UQ), Queensland University of Technology (QUT) and Griffith University are delighted to inform you that Brisbane will host the annual Open Repositories 2017 Conference.

The University of Queensland (UQ), Queensland University of Technology (QUT) and Griffith University welcomed today’s announcement that Brisbane will host the International Open Repositories Conference 26-30 June 2017 at the Hilton Brisbane.

The annual Open Repositories Conference brings together users and developers of open digital repository platforms from higher education, government, galleries, libraries, archives and museums. The Conference provides an interactive forum for delegates from around the world to come together and explore the global challenges and opportunities facing libraries and the broader scholarly information landscape.

Eric Lease Morgan: Achieving perfection

Fri, 2016-06-03 09:48

Through the use of the Levenshtein algorithm, I am achieving perfection when it comes to searching VIAF. Well, almost.

I am making significant progress with VIAF Finder [0], and now I have exploited the Levenshtein algorithm. In fact, I believe I am now able to programmatically choose VIAF identifiers for more than 50 or 60 percent of the authority records.

The Levenshtein algorithm measures the “distance” between two strings. [1] This distance is the number of keystrokes necessary to change one string into another. For example, the distance between “eric” and “erik” is 1. Similarly, the distance between “Stefano B” and “Stefano B.” is also 1. Along with a colleague (Stefano Bargioni), I took a long, hard look at the source code of an OpenRefine reconciliation service which uses VIAF as the backend database. [2] The code included the calculation of a ratio to denote the relative distance of two strings. This ratio is the length of the longer string minus the Levenshtein distance, divided by the length of the longer string. From the first example, the distance is 1 and the length of the string “eric” is 4, thus the ratio is (4 – 1) / 4, which equals 0.75. In other words, 75% of the characters are correct. In the second example, “Stefano B.” is 10 characters long, and thus the ratio is (10 – 1) / 10, which equals 0.9. In other words, the second example is more correct than the first example.
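The distance and ratio described above can be sketched in a few lines of Python. This is my own sketch for illustration, not the reconciliation service's code:

```python
def levenshtein(a, b):
    # Classic dynamic-programming edit distance:
    # prev holds the previous row of the distance matrix.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def ratio(a, b):
    # (length of longer string - distance) / length of longer string
    longest = max(len(a), len(b))
    return (longest - levenshtein(a, b)) / longest

print(ratio('eric', 'erik'))             # 0.75
print(ratio('Stefano B', 'Stefano B.'))  # 0.9
```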

Using the value of MARC 1xx$a of an authority file, I can then query VIAF. The SRU interface returns 0 or more hits. I can then compare my search string with the search results to create a ranked list of choices. Based on this ranking, I am able to more intelligently choose VIAF identifiers. For example, from my debugging output, if I get 0 hits, then I do nothing:

query: Lucariello, Donato
hits: 0

If I get too many hits, then I still do nothing:

query: Lucas Lucas, Ramón
hits: 18
warning: search results out of bounds; consider increasing MAX

If I get 1 hit, then I automatically save the result, which seems to be correct/accurate most of the time, even though the Levenshtein distance may be large:

query: Lucaites, John Louis
hits: 1
score: 0.250 John Lucaites (57801579)
action: perfection achieved (updated name and id)

If I get many hits, and one of them exactly matches my query, then I “achieved perfection” and I save the identifier:

query: Lucas, John Randolph
hits: 3
score: 1.000 Lucas, John Randolph (248129560)
score: 0.650 Lucas, John R. 1929- (98019197)
score: 0.500 Lucas, J. R. 1929- (2610145857009722920913)
action: perfection achieved (updated name and id)

If I get many hits, and many of them are exact matches, then I simply use the first one (even though it might not be the “best” one):

query: Lucifer Calaritanus
hits: 5
score: 1.000 Lucifer Calaritanus (189238587)
score: 1.000 Lucifer Calaritanus (187743694)
score: 0.633 Luciferus Calaritanus -ca. 370 (1570145857019022921123)
score: 0.514 Lucifer Calaritanus gest. 370 n. Chr. (798145857991023021603)
score: 0.417 Lucifer, Bp. of Cagliari, d. ca. 370 (64799542)
action: perfection achieved (updated name and id)

If I get many hits, and none of them are perfect, but the ratio is above a configured threshold (0.949), then that is good enough for me (even if the selected record is not the “best” one):

query: Palanque, Jean-Remy
hits: 5
score: 0.950 Palanque, Jean-Rémy (106963448)
score: 0.692 Palanque, Jean-Rémy, 1898- (46765569)
score: 0.667 Palanque, Jean Rémy, 1898- (165029580)
score: 0.514 Palanque, J. R. (Jean-Rémy), n. 1898 (316408095)
score: 0.190 Marrou-Davenson, Henri-Irénée, 1904-1977 (2473942)
action: perfection achieved (updated name and id)
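The decision rules illustrated above can be summarized in a short Python sketch. This is a hypothetical rendition of the logic, not the actual VIAF Finder code: the MAX_HITS bound, the function names, and the shape of the hits list are my assumptions; only the 0.949 threshold comes from the description above.

```python
MAX_HITS = 10      # assumed bound; the text only says "too many hits"
THRESHOLD = 0.949  # the configured ratio threshold quoted above

def levenshtein(a, b):
    # Dynamic-programming edit distance, row by row.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1, curr[j - 1] + 1,
                            prev[j - 1] + (ca != cb)))
        prev = curr
    return prev[-1]

def ratio(a, b):
    longest = max(len(a), len(b))
    return (longest - levenshtein(a, b)) / longest

def choose_viaf_id(query, hits):
    """hits is a list of (name, viaf_id) pairs from the SRU search."""
    if not hits or len(hits) > MAX_HITS:
        return None                      # 0 hits, or results out of bounds
    if len(hits) == 1:
        return hits[0][1]                # a single hit is saved outright
    # Rank the results by their ratio against the query, best first.
    scored = sorted(((ratio(query, name), viaf_id) for name, viaf_id in hits),
                    reverse=True)
    best_score, best_id = scored[0]
    if best_score == 1.0 or best_score > THRESHOLD:
        return best_id                   # exact match, or good enough
    return None
```

Note that when several results tie at 1.000 (the Lucifer Calaritanus case), this sketch simply takes the first, just as the debugging output shows.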

By exploiting the Levenshtein algorithm, and by learning from the good work of others, I have been able to programmatically select VIAF identifiers for more than half of my authority records. When one has as many as 120,000 records to process, this is a good thing. Moreover, this use of the Levenshtein algorithm seems to produce more complete results when compared to the VIAF AutoSuggest API. AutoSuggest identified approximately 20 percent of my VIAF identifiers, while my Levenshtein algorithm/logic identifies more than 40 or 50 percent. AutoSuggest is much faster though. Much.

Fun with the intelligent use of computers, and think of the possibilities.

[0] VIAF Finder –

[1] Levenshtein –

[2] reconciliation service –