Feed aggregator

FOSS4Lib Recent Releases: Sufia - 6.2.0

planet code4lib - Mon, 2015-07-13 17:50

Last updated July 13, 2015. Created by Peter Murray on July 13, 2015.

Package: Sufia
Release Date: Thursday, July 9, 2015

Eric Hellman: The Library Digital Privacy Pledge

planet code4lib - Mon, 2015-07-13 17:33
I've been busy since my last post! We've created the Free Ebook Foundation, which will be the home for and GITenberg. I helped with the NISO "Consensus Framework to Support Patron Privacy in Digital Library and Information Systems", which I'll write more about soon. And some coding.

But I've also become a volunteer for the Library Freedom Project, run by radical librarian Alison Macrina. The project I'm working on is the "Library Digital Privacy Pledge."

The Library Digital Privacy Pledge is a result of discussions on several listservs about how libraries and the many organizations that serve libraries could work cooperatively to (putting it bluntly) start getting our shit together with regard to patron privacy.

I've talked to a lot of people about privacy in digital libraries, and there's remarkable unity about its importance. There's also a lot of confusion about some basic web privacy technology, like HTTPS. My view is that HTTPS sets a foundation for all the other privacy work that needs doing in libraries.

Someone asked me why I'm so passionate about working on this. After a bit of thought, I realized that the one thing that gives me the most satisfaction in my professional life is eliminating bugs. I hate bugs. Using HTTP for library services is a bug.

The draft of the Library Digital Privacy Pledge is open for comment and improvement for a few more weeks. We want all sorts of stakeholders to have a chance to improve it. The current text (July 12, 2015) is as follows:
The Library Digital Privacy Pledge of 2015

The Library Freedom Project is inviting the library community - libraries, vendors that serve libraries, and membership organizations - to sign the "Library Digital Privacy Pledge of 2015". For this first pledge, we're focusing on the use of HTTPS to deliver library services and the information resources offered by libraries. Building a culture of library digital privacy will not end with this 2015 pledge, but committing to this first modest step together will begin a process that won't turn back. We aim to gather momentum and raise awareness with this pledge; and will develop similar pledges in the future as appropriate to advance digital privacy practices for library patrons.

We focus on HTTPS as a first step because of its timeliness. At the end of July the Let's Encrypt initiative of the Electronic Frontier Foundation will launch a new certificate infrastructure that will remove much of the cost and technical difficulty involved in the implementation of HTTPS, with general availability scheduled for September. Due to a heightened concern about digital surveillance, many prominent internet companies, such as Google, Twitter, and Facebook, have moved their services exclusively to HTTPS rather than relying on unencrypted HTTP connections. The White House has issued a directive that all government websites must move their services to HTTPS by the end of 2016. We believe that libraries must also make this change, lest they be viewed as technology and privacy laggards, and dishonor their proud history of protecting reader privacy.

The 3rd article of the American Library Association Code of Ethics sets a broad objective:

We protect each library user's right to privacy and confidentiality with respect to information sought or received and resources consulted, borrowed, acquired or transmitted.

It's not always clear how to interpret this broad mandate, especially when everything is done on the internet. However, one principle of implementation should be clear and uncontroversial:

Library services and resources should be delivered, whenever practical, over channels that are immune to eavesdropping.

The current best practice dictated by this principle is as follows:

Libraries and vendors that serve libraries and library patrons should require HTTPS for all services and resources delivered via the web.

The Pledge for Libraries:
1. All web services and resources that this library directly controls will use HTTPS by the end of 2015.
2. Starting in 2016, this library will not sign or renew any contracts for web services or information resources that do not commit to use HTTPS by the end of 2016.

The Pledge for Service Providers (Publishers and Vendors):
1. All web services that we (the signatories) control will enable HTTPS by the end of 2015.
2. All web services that we (the signatories) offer will require HTTPS by the end of 2016.

The Pledge for Membership Organizations:
1. All web services that this organization directly controls will use HTTPS by the end of 2015.
2. We encourage our members to support and sign the appropriate version of the pledge.

Schedule:

  • This document will be open for discussion and modification until finalized by July 27, 2015. The finalized pledge will be published on the website of the Library Freedom Project. We expect a number of discussions to take place at the Annual Conference of the American Library Association and associated meetings.

  • The Library Freedom Project will broadly solicit signatures from libraries, vendors and publishers.

  • In September, in coordination with the Let's Encrypt project, the list of charter signatories will be announced and broadly publicized to popular media.

FAQ

Q: What is HTTPS and what do we need to implement it?
A: When you use the web, your browser software communicates with a server computer through the internet. The messages back and forth pass through a series of computers (network nodes) that work together to pass messages. Depending on where you and the server are, there might be 5 computers in that chain, or there might be 50, each possibly owned by a different service provider. When a website uses HTTP, the content of these messages is open to inspection by each intermediate computer, like a postcard sent through the postal system, as well as by any other computer that shares a network with those computers. If you’re connecting to the internet over wifi in a coffee shop, everyone else in the coffee shop can see the messages, too.

When a website uses HTTPS, the messages between your browser software and the server are encrypted so that none of the intermediate network nodes can see the content of the messages. It’s like sending sealed envelopes through the postal system.

Your web site and other library services may be sending sensitive patron data across the internet: often bar codes and passwords, but sometimes also catalog searches, patron names, contact information, and reading records. This kind of data ought to be inside a sealed envelope, not exposed on a postcard.

Most web server software supports HTTPS, but to implement it, you’ll need to get a certificate signed by a recognized authority. The certificate is used to verify that you are who you say you are. Certificates have added cost to HTTPS, but the Electronic Frontier Foundation is implementing a certificate authority that will give out certificates at no charge. To find out more, go to Let’s Encrypt.

Q: Why the focus on HTTPS?
A: We think this issue should not be controversial and is relatively easy to explain. Libraries understand that circulation information can’t be sent to patrons on postcards. Publishers don’t want their content scooped up by unauthorized entities. Service providers don’t want to betray the trust of their customers.

Q. How can my library/organization/company add our names to the list of signatories?
A. Email us at Please give us contact info so we can verify your participation.

Q. Is this the same as HTTPS Everywhere?
A. No, that's a browser plug-in which enforces use of HTTPS.

Q. My Library won't be able to meet the implementation deadline. Can we add our name to the list once we've completed implementation?
A. Yes.

Q. A local school uses an internet filter that blocks HTTPS websites to meet legal requirements. Can we sign the pledge and continue to serve them?
A. Most of the filtering solutions include options that will whitelist important services. Work with the school in question to implement a work-around.

Q. What else can I read about libraries using HTTPS?
A. The Electronic Frontier Foundation has published What Every Librarian Needs to Know About HTTPS.

Q. How do I know if I have implemented HTTPS correctly?
A. The developers behind the “Let’s Encrypt” initiative are ensuring that best practices are used in setting up the HTTPS configuration. If you are deploying HTTPS on your own, we encourage you to use the Qualys SSL Labs SSL Server Test service to review the performance of your implementation. You should strive for at least a “B” rating with no major security vulnerabilities identified in the scan.

Q. Our library subscribes to over 200 databases, only a fraction of them currently delivered via HTTPS. We might be able to say we will not sign new contracts, but the renewal requirement could be difficult for an academic library like ours. Can we sign the pledge?
A. No one is going to penalize libraries that aren’t able to comply 100% with their pledge. One way to satisfy the ethical imperatives of the pledge would be to clearly label for users the small number of insecure library resources that remain after 2016 as being subject to surveillance.

Q. I/We can contribute to the effort in a way that isn’t covered well by the pledges. Can I add another pledge?
A. We want to keep this simple, but we welcome your support. Email us with your individualized statement, and we may include it on our website when signatories are announced.
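For the FAQ question on checking an HTTPS deployment, a quick local sanity check can come before any external scan. The following is only a minimal sketch using Python's standard `ssl` module; the function names and the hostname in the usage comment are hypothetical, and it is not a substitute for a full server test. It confirms that a server presents a certificate a default client trusts and reports the negotiated protocol and the certificate's expiry date.

```python
import socket
import ssl


def make_strict_context() -> ssl.SSLContext:
    """Build a client context that verifies certificates and hostnames."""
    context = ssl.create_default_context()
    # Assumption for this sketch: treat anything older than TLS 1.2 as a failure.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


def check_https(hostname: str, port: int = 443, timeout: float = 10.0) -> dict:
    """Connect to hostname and summarize the TLS session.

    Raises ssl.SSLError (or a subclass) if the certificate is not trusted
    or does not match the hostname.
    """
    context = make_strict_context()
    with socket.create_connection((hostname, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=hostname) as tls:
            cert = tls.getpeercert()
            return {
                "protocol": tls.version(),        # e.g. a string such as "TLSv1.2"
                "expires": cert.get("notAfter"),  # expiry date as reported by the server
            }


# Usage (hypothetical hostname, requires network access):
#   print(check_https("catalog.example.org"))
```

A failed handshake here is a strong hint that patrons' browsers will show warnings too; a successful one only says the basics are in place.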

Library of Congress: The Signal: Exploring Web Archiving at the Library of Congress

planet code4lib - Mon, 2015-07-13 13:22

The following is a guest post by Samantha Abrams, an intern for the Web Archiving Team at the Library of Congress.

Madison, Wisconsin’s Lake Mendota, where the iSchool at UW-Madison sits. Credit: Samantha Abrams

As a library school graduate student, I developed an interest in archives and born-digital objects (content pulled from floppy disks, web pages, Tweets, and on) but I lack practical, professional experience working with these materials. But after my time interning with the Web Archiving Team at the Library of Congress, I am confident in my exposure to a subset of digital materials and to the professional world of web archives: its relationships, its openness, its complexities.

The Library’s Web Archiving Team works to manage and preserve at-risk digital content born from the web: web pages and more, social media included. The team considers the task of archiving the web from every angle: by working with software, like OpenWayback, and developing tools to assist with crawls; considering copyright issues; and building collections that help paint a comprehensive picture of the web as it stands today (or, as it stood yesterday).

Abrams’ “RAD!” notes from a web archiving meeting. Credit: Samantha Abrams

In four weeks, I have learned about the ins and outs of what web archiving really is (and what it can be). At a recent meeting, we discussed the look, and feel, and design of the collections: how can we keep users focused as they interact with the massive collection, yet allow them to discover both related and unrelated content while introducing them to the web of the past? I have spent time cleaning up data in preparation for migration to a new curator tool. And, in what will be my final project with the Library, I have helped lay the groundwork for a Business in America Web Archive. It has been a process of learning and asking questions: web archiving is an emerging and changing field, and the way professionals consider its quirks and processes requires constant readjustment and creative thinking. To be on a team so interested in following those changes as they occur has been as challenging as it has been rewarding.

I have also spent time at the Library getting to know the archival profession on an individual level: person to person, process to process, idea to idea. Early on in my time here, I reached out to archivist Kathleen O’Neill, and asked her if she would be willing to explain the way the Manuscript Division handles the acquisition and processing of born-digital materials. She introduced me to software the Division uses to access content on tangible media, and spoke about the ethical questions this process often raises. For instance: how do archivists handle uncovering once-deleted files stored on tangible media? I’ve also spoken with Andrew Cassidy-Amstutz, an archivist with the Veterans History Project, and he spoke openly about the Project’s goals: reaching out to veterans, and seeking very specific content, which, in turn, leads to a workflow focused on processing digital items in bulk, and pulling as much content as possible, as quickly as possible, from the media donated to the Project. All of my questions have been answered eagerly, with thoughtful recommendations including: You know who you should talk to next about this? You know what I once read about this exact question? Have you heard of this archivist, with this institution? You should reach out to them. And on.

And this, I have realized, has been the most rewarding experience of my time with the Library. I have been introduced to an institution filled with connected, passionate individuals, eager to share their knowledge with those interested in asking about it. The people I have met here have helped introduce me to the archival world as a whole: the way we stand connected, bound by our interest in the same field, in its materials, and in its people. And just like the rest of the Library, the Web Archiving Team is composed of talented individuals, interested in sharing what they know. And these individuals, in turn, contribute to an archival profession that is vast, far-reaching, and eager to share.

LITA: Organizing Library Workflows with Asana

planet code4lib - Mon, 2015-07-13 13:00

As coordinator for non-Roman language cataloging at my library, I have to keep track of several workflows simultaneously without actual fluency in any of the 10+ languages that my section deals with. As a librarian it goes without saying that I’m a big fan of organization and efficiency. So I’ve implemented a free task-based program called Asana in order to keep track of my section’s productivity, statistics, and progress.

Asana was created with the objective of eliminating dependence on email in order to manage projects. Tasks and conversations are all in one place to promote transparency and accessibility, which is extremely valuable when you are on a team of five or more people with multiple established workflows. I’m certain I’m not alone when I say that email can often seem like a void that creates more confusion than clarity when it comes to communicating important work updates. Not everyone that I have to correspond with is well-versed in the proper use and etiquette involved with emailing, which often inspires me to do this:

Ron Swanson, Parks & Recreation.

With Asana, I have the ability as a project manager to create a timeline with due dates, assign particular tasks to people as needed, organize initiatives & meetings, and keep track of progress all in one interface. This also cuts down on the necessity for constant meetings and prevents things from slipping by unnoticed in an email thread where there are already twenty-some responses and everyone is using the Reply All button.

I have been primarily using Asana to organize cooperative cataloging projects in my section. My library is a member of several initiatives to connect with other academic institutions (e.g. the CIC) in order to catalog materials on behalf of a fellow library that may not have staff who can create bibliographic records in a particular language or format.

An example of Asana’s interface.

Here, a team member is able to log their progress on tasks assigned to them, keep track of the established timeline, and upload documents like title inventories. Having all this information in one (free!) place makes it easier for a project manager to create reports and to aggregate statistics. I’ve successfully implemented Asana for two cooperative cataloging projects thus far.

Have you used Asana in your library? Do you have a favorite task managing program?


LibUX: 024: Anticipatory Design

planet code4lib - Mon, 2015-07-13 06:57

In this episode, we (Amanda and Michael) are talking about “The next design trend is one that eliminates all choices” by Anne Quito. Anticipatory design describes design decisions that anticipate what the user is looking for and through which lens, but we are interested in extreme “anticipatory design” through personalization.

Can personal data and context craft a user experience that eliminates choice?

We use the term “interaction cost” to similarly describe what Quito and Aaron Shapiro call “decision fatigue”.

The irony of creating so much choice for ourselves is that—from our health and diet to finances and fitness—people make bad decisions every day. Little ones that add up over time and, sometimes, big ones that ruin their lives. And even more importantly, people can suffer real consequences from the well-documented phenomenon of decision fatigue.

When there is too much going on, whether on the page or in the totally unrelated context the user is working from, if the cost of dealing with the interface is too great then the user will move on. Improving usability often involves reducing the interaction cost.

The next big breakthrough in design and technology will be the creation of products, services, and experiences that eliminate the needless choices from our lives and make ones on our behalf, freeing us up for the ones we really care about: Anticipatory design.

Tune-in in other ways

Listen to this and other episodes on Stitcher, find us on iTunes, subscribe through RSS, or download the MP3.

In addition to weekly podcasts and articles, I write the Web for Libraries — a newsletter chock-full of data-informed commentary about user experience design, including the bleeding-edge trends and web news I think user-oriented thinkers should know.

The post 024: Anticipatory Design appeared first on LibUX.

Peter Sefton: Opal Mining for fun and profit, or how to travel to the CBD all week in Sydney for $27.60 even if you live a long way out

planet code4lib - Sun, 2015-07-12 22:00

When I started working at UTS in the city I assumed I’d be paying $60 a week in train fares commuting from the Blue Mountains using an Opal card. The one way fare is $8.30 but there’s a $15/day cap, and after 8 journeys the ‘Travel Reward’ kicks in and the whole system is free apart from airport stations.

Turns out, most weeks I pay $30 or less. There are lots of sites that will tell you how to accomplish this, but most of them involve hoofing it between train stations, or catching 29 buses in a row to exploit the fact that the opal card has a limited built-in memory. If you’re a long distance Sydney commuter and you work near a bus route where the stops aren’t too far apart then all you need to do is spend a few minutes out of your tea breaks and lunch breaks on Monday and Tuesday hopping on a bus for $2.10. Go a single stop and walk back, or catch a bus to lunch for at least an hour then catch one back.

Me, I walk out the door of the UTS Tower, hop on a bus one stop to Railway Square then walk back via the underpass, up through the DAB cafe, and across the bridge back to the tower. This transit takes more or less the same amount of time as it takes the staff who frequent Knight’s to walk to and from their ristretto fix. Catch a bus to nowhere thrice on Monday and twice on Tuesday and you’re set.

I particularly love how the machine at Springwood (or Faulco, or Katoomba) will tell me on Monday night that my journey home cost $0.40.

Screenshot of my Opal statement showing how to get 8 journeys (not trips) in the first two days of the week for a total weekly bill of $27.50

Now, people have raised eyebrows about this behaviour. One likened my antics to tax avoidance (which I don’t practice, cos I think tax is sharing and my parents taught me sharing is good). Another wondered if I was getting paid enough. Anyway, the transport minister approves for some reason.

Ms Berejiklian said she wanted people to use more transport and was glad they were finding cheaper ways to travel. “I love hearing people tell me ‘I am catching transport more now because it feels like I am not paying for it’”, the minister said.

Maybe she approves of a similar attitude to taxation?

Or maybe it’s that Opal made travel more expensive for most commuters. We may never know; the Herald says that we can’t find out.

Ms Berejiklian had not said before what the government expected to earn under the Opal smartcard. Attempts to prise that information from Transport for NSW using freedom of information laws have been rebuffed.

Anyway, I’m uneasy about this whole thing. I think it’s wrong that you can’t buy cards from (most) stations, only online or at shops and I wonder what all these spurious trips do to the transport planning process. I presume the whole Opal card rollout is part of a push to reduce staff numbers, and to further privatise our infrastructure, because, the NSW government thinks, private is axiomatically better.

Hints and tips:

  • If you’re using buses, always check your balance online - the bus machines are not very reliable, and the system will often miss your tap-off. When it does, always contact customer service via the web and say “I tapped off and it didn’t register” and they fix it.

  • Wait an hour after you last tapped off or your short bus trips get joined together into a single journey (unless you go past 4 at which point you overflow the Opal card buffer and start a new journey).

  • Aim for 5 Journeys on the first day and you’ll hit the $15 cap but do remember to tap off for that final $0.40 ride or your last leg won’t count as a journey.

  • After you reach the reward at 8 journeys it’s all free so you don’t need to tap off. I usually do, just so my travel gets counted. I know, I know, they’re gathering data on me, but I assume every time I get off at Faulconbridge, for example, that’s a vote for keeping a decent level of service there.
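The arithmetic behind the trick can be sketched in a few lines of Python. This is a simplified model, not Opal's actual fare engine: it assumes only the figures quoted in this post (an $8.30 train fare, $2.10 bus fare, a $15 daily cap and free travel after 8 journeys), treats every listed trip as its own journey, and ignores the one-hour joining rule and airport stations. The function and variable names are made up for illustration.

```python
# All amounts are in cents to avoid floating-point rounding.
DAILY_CAP = 1500    # $15/day cap quoted in the post
REWARD_AFTER = 8    # journeys before the weekly Travel Reward kicks in
TRAIN = 830         # one-way train fare from the Blue Mountains
BUS = 210           # short bus hop


def weekly_cost(week):
    """Total cost in cents for a week given as a list of days,
    each day being a list of journey fares in travel order."""
    total = 0
    journeys = 0
    for day in week:
        spent_today = 0
        for fare in day:
            if journeys >= REWARD_AFTER:
                charge = 0  # reward reached: the rest of the week is free
            else:
                # never charge more than what is left under the daily cap
                charge = min(fare, DAILY_CAP - spent_today)
            spent_today += charge
            journeys += 1
        total += spent_today
    return total


# Plain commute: train in and train home, five days a week.
plain = [[TRAIN, TRAIN]] * 5

# Bus to nowhere thrice on Monday and twice on Tuesday.
mined = [
    [TRAIN, BUS, BUS, BUS, TRAIN],
    [TRAIN, BUS, BUS, TRAIN],
] + [[TRAIN, TRAIN]] * 3

print(weekly_cost(plain) / 100)  # 60.0
print(weekly_cost(mined) / 100)  # 27.5
```

Under these assumptions the plain commute comes to the $60 a week mentioned at the top of the post, the Monday trip home is charged $0.40, and the week with bus hops comes to the $27.50 shown in the statement caption.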

Am I evil for saving myself over $1000 a year? I’m just off to catch a bus to nowhere, will check the comments when I get back.

Opal Mining for fun and profit, or how to travel to the CBD all week in Sydney for $27.60 even if you live a long way out by Peter (Petie) Sefton is licensed under a Creative Commons Attribution 4.0 International License.

Patrick Hochstenbach: Listening to Jazz

planet code4lib - Sun, 2015-07-12 13:42
Filed under: Sketchbook Tagged: brushpen, jazz, mixedmedia, satchmo, sketch, sketchbook

Cynthia Ng: Cascadiafest: Server JS Afternoon Part 2 Notes

planet code4lib - Sat, 2015-07-11 00:47
Can’t believe it, but it’s the last set of talks: Part 2 of CascadiaJS Server Day afternoon. Michael Lanzetta: Containing the Chaos: Building and Scaling IoT Node Services on Azure Using PaaS and Containerization IoT = Internet of Things; Two problems = Internet (TCP, but devices are flaky in connection, can lose packets; scaling to … Continue reading Cascadiafest: Server JS Afternoon Part 2 Notes

Cynthia Ng: CascadiaFest: Server JS Afternoon Part 1 Notes

planet code4lib - Fri, 2015-07-10 22:55
Hope you didn’t eat too much at lunch. Stay awake for the first part of the CascadiaJS Server Day afternoon. Rebecca Murphey: Deploying Client-Side Apps, 1,000 at a Time A talk about how we’re rethinking the systems we build and deploy client-side web application at her work specifically (not a how-to talk). Context Does ratings … Continue reading CascadiaFest: Server JS Afternoon Part 1 Notes

Cynthia Ng: CascadiaFest: Browser JS Morning Part 1 Notes

planet code4lib - Fri, 2015-07-10 22:36
Our second day at CascadiaFest is Browser side JS. Jana Beck: Data Visualization on the Web Talk today about all the things I wish somebody had told me before embarking on the project of writing a large data visualization library. Most common toolset: D3: Data-Driven Documents + SVG Basics & Best Practices Why D3 + … Continue reading CascadiaFest: Browser JS Morning Part 1 Notes

Cynthia Ng: CascadiaFest: Server JS Morning Part 1 Notes

planet code4lib - Fri, 2015-07-10 21:49
A little sleepy this morning, but I’m certain the talks will help wake us up with the first part of CascadiaJS Server Day. Jennifer Wong: I Think I Know What You’re Talking About, But I’m Not Sure Sometimes when I’m in conversation with other developers, it’s … what? “There are two hard things in computer science. … Continue reading CascadiaFest: Server JS Morning Part 1 Notes

Cynthia Ng: CascadiaFest: Browser JS Morning Talks Part 2 Notes

planet code4lib - Thu, 2015-07-09 19:53
The second half of the morning talks for CascadiaJS Browser Day. Martin Gontovnikas: Death to Cookies: Long Live JSON Web Tokens How does the web work now? Start with browser and server. To buy something, you have to log in. POST (to server) /users/login with username and password Creates a User session (at server) Returns … Continue reading CascadiaFest: Browser JS Morning Talks Part 2 Notes

Brown University Library Digital Technologies Projects: ORCID and the Humanities

planet code4lib - Thu, 2015-07-09 17:49

ORCID recently announced integration with the MLA International Bibliography.

We are delighted to announce that, as of June 17, the Modern Language Association’s prestigious MLA International Bibliography connects to ORCID.  The Bibliography joins other repositories in supporting discoverability through use of digital identifiers, and is the first primarily focused on the humanities to integrate ORCID.

ACRL TechConnect: How is programming work supported (or not…) by administrators in libraries?

planet code4lib - Thu, 2015-07-09 16:47

[Editor’s Note:  This post is part of a series of posts related to ACRL TechConnect’s 2015 survey on Programming Languages, Frameworks, and Web Content Management Systems in Libraries.  The survey was distributed between January and March 2015 and received 265 responses.  The first post in this series is available here.]

In our last post in this series, we discussed how library programmers learn about and develop new skills in programming in libraries.  We also wanted to find out how library administrators or library culture in general does or does not support learning skills in programming.

From anecdotal accounts, we hypothesized that learning new programming skills might be impeded by factors including lack of access to necessary technologies or server environments, lack of support for training, travel or professional development opportunities, or overloaded job descriptions that make it difficult to find the time to learn and develop new skills.  While respondents to our survey did in some cases indicate these barriers, we actually found that most respondents felt supported by their administration or library to develop new programming skills.

Most respondents feel supported, but lack of time is a problem

The question we asked respondents was:

Please describe how your employing institution either does or does not support your efforts to learn or improve programming or development skills. “Support” can refer to funding, training, mentoring, work time allocation, or other means of support.

The question was open-ended, enabling respondents to provide details about their experiences.  We received 193 responses to this question and categorized responses by whether they overall indicated support or lack of support.  74% of respondents indicated at least some support for learning programming by their library administration, while 26% reported a lack of support for learning programming.

Of those who mentioned that their administration or supervisors provide a supportive environment for learning about programming, the top kind of support mentioned was training, closely followed by funding for professional development opportunities.  Flexibility in work time was also frequently mentioned by respondents.  Mentoring and encouragement were mentioned less frequently.


However, even among those who feel supported in terms of funding and training opportunities, respondents indicated that time to actually complete training or professional development, is, in practice, scarce:

Work time allocation is a definite issue – I’m the only systems librarian and have responsibilities governing web site, intranet, discovery layer, link resolver, ereserve system, meeting room booking system and library management system. No time for deep learning.

Low staffing often contributes to the lack of time to develop skills, even in supportive environments:

They definitely support developing new skills, but we have a very small technology staff so it’s difficult to find time to learn something new and implement it.

Respondents indicated the importance to their employers of aligning training and funding requests with current work projects and priorities:

I would be able to get support in terms of work time allocation, limited funding for training. I’m limited by external control of library technology platforms (centrally administrated), need to identify utility of learning language to justify training, use, &c.

26% of respondents indicate a lack of support for learning programming

Of those respondents who indicated that their workplace is not supportive of programming professional development or learning opportunities, lack of funding and training was the most commonly cited type of support that respondents found lacking.

Lack of Funding and Training

The main lack of support comes in the form of funding and training. There are few opportunities to network and attend training events (other than virtually online) to learn how to do my job better. I basically have to read and research (either with a book or on the web) to learn about programming for libraries.

Respondents mentioned that though they could do training during their work hours, they are not necessarily funded to do so:

I am given time for self-education, but no formal training or provision for formal education classes.

Lack of Mentoring / Peer Support

Peer support was important to many respondents, both in supportive and unsupportive environments.  Many respondents who felt supported mentioned how important it was to have colleagues in their workplace to whom they can turn to get advice and help with troubleshooting.  Comments such as this one illustrate the difficulty of being the only systems or technology support person in one’s workplace:

They are very open to supporting me financially and giving me work time to learn (we have an institutional license to and they have funded off site training), but there is not a lot of peer support for learning. I am a solo systems department and most of our campus IT staff are contractors, so there is not the opportunity for a community of colleagues to share ideas and to learn from each other.

Understaffing / Low Pay for Programming Skills

Closely related to the lack of peer support, respondents specifically mentioned that being the only technical staff person at their institution can make it difficult to find time for learning, and that understaffing contributes to the high workload:

There’s no money for training and we are understaffed so there’s no time for self-taught skills. I am the only non-Windows programmer so there’s no one I can confer with on programming challenges. I learn whatever I need to know on the fly and only to the degree it’s necessary to get the job done.

I’m the only “tech” on site, so I don’t have time to learn anything new.

One respondent mentioned that pay for those with programming skills is not competitive at his or her institution:

We have zero means for support, partially due to a complex web of financial reasons. No training, little encouragement, and a refusal to hire/pay at market rates programming staff.

Future Research and Other Questions

As with the first post in this series, the analysis of the data yields more questions than clear conclusions.  Some respondents indicated they have very supportive workplaces, where they feel like their administration and supervisors provide every opportunity to develop new skills and learn about the technologies they want to learn about.  Others express frustration with the lack of funding or ability to collaborate with colleagues on projects that require programming skills.

One question that requires a more thorough examination of the data is whether those whose jobs do not specifically require programming skills feel as supported in learning about programming as those who were hired to be programmers. 30% of survey respondents indicated that programming is *not* part of their official job duties, but that they do programming or similar activities to perform job functions. Initial analysis indicates there is no significant difference between these respondents and respondents as a whole. However, there may be differences in support based on the type of position one has in a library (e.g., staff, faculty, or administration), and we did not gather that information from respondents in this survey. At least two respondents, however, indicate that this may be the case at some libraries:

Training & funding is available; can have release time to attend; all is easier for librarians to obtain than for staff to obtain which is sad since staff tend to do more of the programming

Some staff have a lot of support, some have nill, it depends on where/what project you are working on.

In the next (and final) post in this series, we’ll explore some preliminary data on popular programming languages in libraries, and examine how often library programmers get to use their preferred programming languages in their work.

HangingTogether: WorldCat’s smallest and largest worksets

planet code4lib - Thu, 2015-07-09 16:01


Most titles are published only once—with no subsequent editions, no translations into other languages. At OCLC we refer to such titles as they appear in WorldCat as “singleton worksets.” And there are a lot of them.

How many? My colleague Jenny Toves provided statistics. WorldCat has 207 million worksets, and 80% are singletons.  The accompanying pie chart shows the percentage of WorldCat worksets with one, two, three, four and five or more “manifestations” – those with various reproductions, editions, translations, etc.

That pie sliver for “five or more” manifestations masks that there are also huge worksets. Thirty-one thousand worksets in WorldCat include 100 or more manifestations. Dante’s La Divina Commedia is the largest, with 6,875 manifestations. The snippet from the WorldCat display of Bunyan’s The Pilgrim’s Progress at the top of this blog post represents the fourth largest workset. Care to guess what works comprise the other eight of WorldCat’s Ten Largest Worksets?

Check your answers with the list below.

WorldCat’s Ten Largest Worksets

  1. La Divina Commedia by Dante Alighieri
  2. The Whole Book of Psalmes by John Hopkins, Thomas Sternhold
  3. The Life and Adventures of Robinson Crusoe by Daniel Defoe
  4. The Pilgrim’s Progress by John Bunyan
  5. The Vicar of Wakefield by Oliver Goldsmith
  6. Paradise Lost and Paradise Regained by John Milton
  7. Commentarii de bello Gallico by Julius Caesar
  8. Pride and Prejudice by Jane Austen
  9. Les Aventures de Télémaque by François de Salignac de La Mothe-Fénelon
  10. Treasure Island by Robert Louis Stevenson

About Karen Smith-Yoshimura

Karen Smith-Yoshimura, program officer, works on topics related to renovating descriptive and organizing practices with a focus on large research libraries and area studies requirements.


Library of Congress: The Signal: We Welcome Our Email Overlords: Highlights from the Archiving Email Symposium

planet code4lib - Thu, 2015-07-09 13:45

This post is co-authored with Erin Engle, a Digital Archivist in the Office of Strategic Initiatives.

Despite the occasional death knell claims, email is alive, well and exponentially thriving in many organizations. It’s become an increasingly complex challenge for collecting and memory institutions as we struggle with the same issues: How is email processed differently from other collections? Are there donor issues specific to email? What are the laws or regulations surrounding email records for cultural heritage institutions? Are there standard preservation file formats for email? How can we make email archives available for research?

Archiving Email Symposium. Photo courtesy of Erin Engle.

On June 2, 2015, the Library of Congress and the National Archives and Records Administration co-hosted the Archiving Email Symposium at the Library to share information about the state of practice in accessioning and preserving email messages and related attachments. The approximately 150-person audience included a wide range of practitioners: technologists and software developers, librarians, curators, records managers, lone arranger archivists, academics, and representatives from large federal agencies with many thousands of employees, as well as grant funding programs including the National Endowment for the Humanities, the Institute of Museum and Library Services and the National Historical Publications and Records Commission. In addition, we hosted an informal workshop on June 3 with a subset of participants to discuss issues and challenges identified during the Symposium in order to better define the gaps in our tools, processes and policies for archiving email collections.

In this first post in a series about the event, we’ll cover the overarching themes of the Symposium. Future posts will go into more depth about each of the four perspectives described below, which will include links to webcasts of the presentations, and a summary of the June 3 workshop.

The idea for this project first took root last August when we gathered an informal group of practitioners to share our collective but disparate work to preserve email. This led to the formation of the (again informal) Email Interest Group which initiated a series of online discussions and tool demonstrations from projects including Stanford Library’s ePADD project, Harvard’s Electronic Archiving System, the Smithsonian Institution Archives and the Rockefeller Archive Center coordinated CERP project and more. The strong attendance and engagement during these meetings demonstrated a significant and sustained interest in the multifaceted problems of email preservation from a variety of perspectives including selection, processing, accessioning, format identification and normalization and long-term preservation and use.

Clearly, we were onto something. The “email problem” had legs, as the saying goes. Online discussion is great, but sometimes a face-to-face meeting is in order to investigate the issues more deeply and to network with others working in the same space. As we started working on the agenda, our program committee helped us bring four different perspectives on the email problem into focus. The full agenda (PDF) lists speakers for each perspective.

• The Technical perspective looked at institutional approaches to processing and archiving email in which presenters discussed the reasons and approaches for the normalization of email archives (or the considerations of when normalization might be appropriate), strategies for PII and other redaction needs, tools for providing patron access, repository needs including ingest requirements, and workflow selections for implementing specific technical email archiving solutions.

• The Archival perspective focused on practical approaches to accessioning and processing email from “boots on the ground” archivists who presented lessons learned, including real-life challenges and success stories, to help participants understand how policies and decision-making practices were applied to accessioning and processing email archives.

• The Records Management perspective considered the challenges of “email as a record” including technological barriers, legal mandates and retention periods.

• The Policy and Guidelines Development perspective included institutional approaches to how private, public and state government institutions are managing email to not only maximize long-term research value but also to comply with technical, legal, access and intellectual policy issues in processing email archives.

These information-packed sessions were bracketed at the start by welcoming remarks from senior leaders from both hosting institutions and at the end of the day with a thought-provoking summary by Chris Prom, assistant university archivist and assistant professor of library administration at the University of Illinois Urbana-Champaign and author of the guide to email preservation (PDF) for the Digital Preservation Coalition’s Technology Watch Report series.

By all accounts (including the Twitter hashtag #ArchEmail), the Symposium was a rousing success. So yes, we welcome our email overlords with open arms. And we should – we are all already under email’s thumb. It’s not going anywhere except into our respective repositories and archives. Let’s continue the conversation so we can learn from each other how to manage these substantial and challenging issues.

Next up in this series on the Archiving Email Symposium, an in-depth look at the institutional approaches to processing and archiving email from the Technical perspective.

LITA: Online Surveys in Libraries: Getting Started

planet code4lib - Thu, 2015-07-09 13:00

Editor’s Note: This is part one of a two-part guest post on survey use in libraries by Celia Emmelhainz.

Surveys are everywhere. You go to a government website, a vendor’s blog, an organization’s page, or step into a building: “We just want a few minutes of your time.” A scattering of survey requests linger in my email: ACRL, RDA, data librarians, IndieGoGo, four campus programs, the International Librarians’ Network, Thomson Reuters, and Elsevier. And that’s just the past month!

Then, when you try to actually open a survey, there are tiny little buttons: you have a large screen, but you can’t manage to hit any of them. There are pages and pages of Likert scales. Do they want your life’s story, told in rankings of five items and slider bars? They definitely want you to brainstorm for them, but who has time to think of the top 15 libraries in the world, ranked by specialization?

On Using Surveys Well 

If I sound skeptical of surveys, it’s because I am: People are over-surveyed. Organizations repeatedly survey-blast the same users, not caring about the value of each person’s time. Samples aren’t representative; results aren’t analyzed, and we just present pie charts and summary graphs as if that’s all we can do. We use them to justify our existence, not to understand the world or improve services. In the hands of the wrong person, surveys can be deceptive tools.

And yet, I find mixed-method surveys to be tremendously useful for librarians, particularly if we’re exploring a new area on which there’s little to no data in the existing LIS literature. As Dwight B. King, Jr. writes for librarians:

Focus groups are effective in drawing out users’ true feelings, but because the group is small, it is difficult to make generalizations… Interviews are good for obtaining in-depth information, but… can be very time-consuming. Survey questionnaires are often the best choice for ‘an economical method to reach a large number of people’ with a large number of questions.

So, surveys: use them with care. Make sure they’re necessary, and well used. Ideally we should be moving to well-designed national surveys on library issues, at no cost to local libraries, plus occasional targeted surveys at the local level.

But there is still a role for local surveys. And so, I’ll talk here about how I’ve used various survey tools in libraries, and end with some advice for when you create your own survey.

Choosing a Survey Tool

I’ve worked with SurveyMonkey, LibSurveys, SurveyGizmo, Google Forms, and Qualtrics. Most have a free/student option or trials, but institutional accounts offer many more features.

Google Forms: Free to anyone with a Google account. It’s easy to create forms in Google Drive. I’d use short Google Forms to gather librarian preferences on an issue, as a pre-survey for library instruction to gauge student interest in various topics, or for thoughts from people who are using our trial databases. You’re not going to be able to do a lot of analysis, so keep it short and sweet, and download a summary report in PDF. You can also send responses to Google Sheets to analyze, and/or download to Excel from there.

SurveyMonkey: I’ve used the free accounts, which allow 100 responses, as well as paid accounts. This is a great tool if you’re starting small, and just learning to design and analyze surveys. I’ve used a paid subscription to survey different sets of students or faculty, and have also used it for pre/post surveys of library instruction. It’s easy to filter results by date and only download the responses you need, so you can, for example, put up a feedback form and download just the current day’s batch.

LibSurveys: As part of LibApps, Springshare offers LibSurveys, including both simple forms and longer surveys. The interface is meant to be simple, but adding and adjusting fields (questions) is a somewhat buggy process. Once you’ve collected responses, you can view answers by question and download to CSV. Play with it if you’ve got access to it, but let me be honest; it’s not my fave! I’d rather see Springshare integrate with one of the other survey options listed here.

SurveyGizmo: This is easily my favorite. I’ve surveyed students, teachers/faculty, and librarians at school and university libraries, done usability surveys for websites, collected reference data (before I had access to Springshare), and even surveyed 385 young recent MLIS grads about their experiences in the job market last year. I find the interface and layout attractive and easy to use, and the reports and exports also easy to use. For more advanced users, you can clean data, code textual results, and even analyze data online using cross-tab reports.

Qualtrics: The institutional subscription is wonderful but expensive, so you won’t be using it unless your library has access to a university subscription. This is a sophisticated piece of survey software that allows for detailed ‘skip logic’ (adjusting the next questions based on prior responses, to keep all the questions relevant) and survey layout. I’m just getting started in using this, through a Qualtrics working group on our campus.
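To make 'skip logic' a little more concrete, here is a minimal Python sketch of the idea. It is not how Qualtrics implements or exposes the feature; the question ids, the wording, and the branching rules are all invented for illustration.

```python
# Hypothetical survey definition: every question carries a rule that picks
# the next question id based on the answer just given (None ends the survey).
QUESTIONS = {
    "uses_library": {
        "text": "Have you used the library this semester?",
        "next": lambda answer: "visit_frequency" if answer == "yes" else "why_not",
    },
    "visit_frequency": {
        "text": "How often do you visit?",
        "next": lambda answer: "comments",
    },
    "why_not": {
        "text": "What keeps you from using the library?",
        "next": lambda answer: "comments",
    },
    "comments": {
        "text": "Any other comments?",
        "next": lambda answer: None,
    },
}


def questions_shown(answers, start="uses_library"):
    """Return the ids of the questions a respondent would actually see."""
    shown = []
    current = start
    while current is not None:
        shown.append(current)
        answer = answers.get(current, "")
        current = QUESTIONS[current]["next"](answer)
    return shown


print(questions_shown({"uses_library": "no"}))
```

A respondent who answers "no" to the first question is routed past the frequency question and straight to "why_not", which is the whole point: nobody is asked a question that their earlier answers have made irrelevant.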

If you’ve used surveys, I’d love to hear in the comments about which tools or projects have and haven’t worked for you!

Celia Emmelhainz is the social sciences data librarian at Colby College, and leads a collaborative blog for data librarians at She has worked on library ethnography and survey projects, and currently studies qualitative data archiving, data literacy, and global information research. Find her at @celiemme on Twitter, or in the Facebook databrarians group.

Peter Murray: Thursday Threads: Battles over strong encryption, IPv4 addresses exhausted while IPv6 surges

planet code4lib - Thu, 2015-07-09 10:43

Two articles in each of two threads this week:

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

If Strong Encryption is Outlawed…

Later this year the [U.K.] government intends to introduce legislation that will ensure that any form of communication, whether it’s an email, text message, or video chat, can always be read by the police or intelligence services if they have a warrant.

Few would disagree with the idea that criminals shouldn’t be allowed to plot in secret. But in reality there are huge technical, legal, and moral problems with what the British government wants to do, setting it on a collision course with both the tech industry and privacy campaigners.

The impossible war on encryption, by Steve Ranger, ZDnet, 8-Jul-2015

[U.S.] Federal law enforcement officials warned Wednesday that data encryption is making it harder to hunt for pedophiles and terror suspects, telling senators that consumers’ right to privacy is not absolute and must be weighed against public-safety interests.

The testimony before the Senate Judiciary Committee marked the latest front in a high-stakes dispute between the Obama administration and some of the world’s most influential tech companies, moving the discussion squarely before Congress.

FBI, Justice Dept. take encryption concerns to Congress, by Eric Tucker, Associated Press via The Washington Post, 8-Jul-2015

When I was in my teens, I saw this written on a bathroom stall: “If freedom is outlawed, only outlaws will be free.” The same idea is being applied to strong encryption. These two articles are among many published in recent weeks on the regulation and use of encryption technologies. I don’t envy the task of law enforcement in an age where technology makes covert communication easier. I would have thought, though, that at least the U.S. government learned from the Clipper Chip fiasco of the 1990s. Encryption is based on mathematical principles. Mathematical principles are not subject to legislation. You might make it illegal to publish encryption algorithms, but you cannot make it illegal for someone to think about encryption algorithms. And who will have a vested interest in having people think about encryption algorithms? If strong encryption is outlawed…

Allocations of IPv4 Internet Addresses Now Restricted; It’s a Good Thing IPv6 is Finally Here

Remember how, a decade ago, we told you that the Internet was running out of IPv4 addresses? Well, it took a while, but that day is here now: Asia, Europe, and Latin America have been parceling out scraps for a year or more, and now the ARIN wait list is here for the US, Canada, and numerous North Atlantic and Caribbean islands. Only organizations in Africa can still get IPv4 addresses as needed. The good news is that IPv6 seems to be picking up the slack.

ARIN, the American Registry for Internet Numbers, has now activated its “IPv4 Unmet Requests Policy.” Until now, organizations in the ARIN region were able to get IPv4 addresses as needed, but yesterday, ARIN was no longer in the position to fulfill qualifying requests. As a result, ISPs that come to ARIN for IPv4 address space have three choices: they can take a smaller block (ARIN currently still has a limited supply of blocks of 512 and 256 addresses), they can go on the wait list in the hopes that a block of the desired size will become available at some point in the future, or they can buy addresses from an organization that has more than it needs.

It’s official: North America out of new IPv4 addresses, by Iljitsch van Beijnum, Ars Technica, 2-Jul-2015
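For readers who think in prefix lengths rather than address counts, the "blocks of 512 and 256 addresses" in the excerpt correspond to /23 and /24 networks. The short Python sketch below shows the arithmetic using the standard library's ipaddress module. The helper name prefix_length_for is my own, and 10.0.0.0 is a private-use address picked only for illustration, not a block ARIN would hand out.

```python
import ipaddress


def prefix_length_for(address_count):
    """Return the IPv4 prefix length for a power-of-two block size."""
    if address_count < 1 or address_count & (address_count - 1):
        raise ValueError("block size must be a power of two")
    # An IPv4 address has 32 bits; a block of 2**n addresses leaves 32 - n for the prefix.
    return 32 - (address_count.bit_length() - 1)


for count in (512, 256):
    prefix = prefix_length_for(count)
    # 10.0.0.0 is an illustrative private-use address, not a real allocation.
    block = ipaddress.ip_network(f"10.0.0.0/{prefix}")
    print(f"{count} addresses -> /{prefix} ({block.num_addresses} addresses in {block})")
```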

It is now three years since World IPv6 Launch, and solid growth in global IPv6 adoption continues at a steady pace.

With over 17% of the country’s end-users actively using IPv6, the United States continues to be a dominant force in IPv6 traffic levels and adoption, with the top three U.S. broadband operators and all four of the top U.S. mobile operators actively rolling out IPv6 to their end-users. Other countries including Germany, Belgium, Japan, and Peru continue to have solid IPv6 traffic growth, and network operators in additional countries including Brazil, Saudi Arabia, Portugal, Estonia, and Greece have started large-scale IPv6 deployments to end-users.

Three years since World IPv6 Launch: strong IPv6 growth continues, by Erik Nygren, The Akamai Blog, 8-Jun-2015

I do remember when IPv6 made it through the IETF processes and became a standard. It was roughly just after the point where it was collectively decided that the 7-layer OSI network model had lost out to TCP/IP. (Okay, that was a bunch of geek — this was all getting hashed out in the mid-1990s.) Needless to say, actual implementation of the next version of the rules by which machines communicate with each other on the internet has been coming for a long time.

Is this something to worry about? Probably not — there are a bunch of really smart people making sure that the internet appears to work tomorrow just like it does today. (If you are technically minded, check out the latter half of the Akamai blog post — it has all sorts of interesting details about bridging IPv6 to IPv4 as we start to contemplate a world where IPv6 dominates.) One warning: if your work deals with “dotted quads” like, then you have a whole new addressing scheme to get used to.
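If you want a feel for the difference between the two addressing schemes, here is a small Python sketch using the standard library's ipaddress module. The two sample addresses are my own picks from ranges reserved for documentation, and describe_address is just an illustrative helper name.

```python
import ipaddress


def describe_address(text):
    """Summarize an IPv4 or IPv6 address given in its usual text form."""
    address = ipaddress.ip_address(text)
    return (
        f"IPv{address.version}, {address.max_prefixlen} bits, "
        f"written out in full as {address.exploded}"
    )


# Both sample addresses come from ranges reserved for documentation.
for sample in ("192.0.2.1", "2001:db8::1"):
    print(sample, "->", describe_address(sample))
```

The dotted quad is four decimal numbers covering 32 bits, while the IPv6 form is eight groups of hexadecimal digits covering 128 bits, with runs of zeros usually abbreviated as "::".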


Terry Reese: MarcEdit OSX Preview Build Update

planet code4lib - Thu, 2015-07-09 03:58

This build is a continued refinement of the preview build.  It really doesn’t include anything that is significantly new, but addresses a couple of early gaps folks had noticed while working with the tool.  Change log is below.

Download URL:
Direct URL:


1.0.8
****************************
** 1.0.8 ChangeLog
****************************
* Bug Fix: Field Count -- When clicking on a field to retrieve information about specific indicator/subfield usage, an error would be thrown. This has been corrected.
* Enhancement: Main Menu -- Added a Windows menu to the MarcEdit OSX main window to make it easier to get back to windows that might have been hidden.
* Enhancement: Main Menu/Help/Help -- Linked to the Online Help
* Enhancement: Main Menu/Help/Report Bug/Suggestion -- Linked to the MarcEdit online reporting tool.
* Enhancement: Main Menu/Help/About Author -- Linked to online contact information.
* Enhancement: Join MARC Records -- Added an Edit File button so that users can move directly from Joining files together to editing the data in the MarcEditor.
* Enhancement: MarcEditor -- Exposed the mrc extension so that users can now open mrc files directly into the MarcEditor. This isn't quite as smooth as the Windows version yet, but it's getting there.
* Enhancement: MarcEditor/Reports/Validate ISSNs -- Exposed the Validate ISSNs function.
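For anyone curious what validating an ISSN involves, the sketch below checks the standard modulus-11 check digit in Python. It is not MarcEdit's code and makes no claim about how the Validate ISSNs report works internally; the function name is_valid_issn is only illustrative.

```python
import re


def is_valid_issn(issn):
    """Check the form and modulus-11 check digit of an ISSN such as 0378-5955."""
    text = issn.strip().upper()
    if not re.fullmatch(r"[0-9]{4}-?[0-9]{3}[0-9X]", text):
        return False
    chars = text.replace("-", "")
    # The first seven digits are weighted 8, 7, 6, 5, 4, 3, 2.
    total = sum(int(digit) * weight for digit, weight in zip(chars[:7], range(8, 1, -1)))
    check = (11 - total % 11) % 11
    expected = "X" if check == 10 else str(check)
    return chars[7] == expected


print(is_valid_issn("0378-5955"))
print(is_valid_issn("0378-5954"))
```

The first call passes the check and the second fails it, because changing the final character breaks the weighted sum.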

Cynthia Ng: CascadiaFest: CSS Afternoon Part 2 Notes

planet code4lib - Thu, 2015-07-09 01:04
The last talks of the CSS day and first day of CascadiaFest.

Clarissa Peterson: Responsive Color

Remember the days when they would say color is done by Technicolor? e.g. Merrie Melodies, Wizard of Oz. Was uncommon to have color TV. Computer monitors were similar. Started in one colour, then 256 colours. Movies and TV, colour …

