Feed aggregator

William Denton: Lehman Libraries

planet code4lib - Wed, 2014-10-15 02:25

This summer I spied and acquired a copy of The Foolish Gentlewoman by Margery Sharp, who also wrote Cluny Brown and The Rescuers and sequels (none of which I’ve read). It’s the 1948 Canadian edition, published by Wm. Collins Sons & Co., Canada, at 70 Bond Street in Toronto (the Collins in HarperCollins).

What caught my eye was this, on the front endpapers:

I suppose Lehman Libraries was a private subscription library, but I’ve never heard of it and a quick search online didn’t turn anything up. If anyone knows anything about it I’d be happy to hear.

SearchHub: Stump the Chump is Coming to D.C.!

planet code4lib - Tue, 2014-10-14 21:38

In just under a month, Lucene/Solr Revolution will be coming to Washington D.C. — and once again, I’ll be in the hot seat for Stump The Chump.

If you are not familiar with “Stump the Chump” it’s a Q&A style session where “The Chump” (That’s Me!) is put on the spot with tough, challenging, unusual questions about Lucene & Solr — live, on stage, in front of hundreds of rambunctious convention goers, with judges who have all seen and thought about the questions in advance and get to mock The Chump (still me) and award prizes to people whose questions do the best job of “Stumping The Chump”.

People frequently tell me it’s the most fun they’ve ever had at a tech conference. You can judge for yourself by checking out the videos from last year’s events: Lucene/Solr Revolution 2013 in Dublin, and Lucene/Solr Revolution 2013 in San Diego.

I’ll be posting more details in the weeks ahead, but until then you can subscribe to this blog (or just the “Chump” tag) to stay informed.

And if you haven’t registered for Lucene/Solr Revolution yet, what are you waiting for?!?!

FOSS4Lib Recent Releases: Umlaut - 4.0

planet code4lib - Tue, 2014-10-14 20:09
Package: Umlaut
Release Date: Monday, October 6, 2014

Last updated October 14, 2014. Created by Peter Murray on October 14, 2014.

This release is mostly back-end upgrades, including:

  • Support for Rails 4.x (Rails 3.2 support is included to make migration easier for existing installations, but upgrading to Rails 4.1 as soon as possible is recommended, as is starting new apps on Rails 4.1)
  • Based on Bootstrap 3 (Umlaut 3.x was Bootstrap 2)
  • internationalization/localization support
  • A more streamlined installation process with a custom installer

LITA: Midwinter Workshop Highlight: Meet the UX Presenters!

planet code4lib - Tue, 2014-10-14 20:08

We asked our LITA Midwinter Workshop Presenters to tell us a little more about themselves and what to expect from their workshops in January. This week, we’re hearing from Kate Lawrence, Deirdre Costello, and Robert Newell, who will be presenting the workshop:

From Lost to Found: How User Testing Can Improve the User Experience of Your Library Website
(For registration details, please see the bottom of this blog post)

LITA: We’ve seen your formal bios but can you tell us a little more about you?

Kate: If I didn’t work as a user researcher, I would be a professional backgammon player or cake decorator (I am a magician with fondant!). Or both.

Deirdre: I’m horse crazy!

Robert: In a past life I was a professional actor. If you pay really really close attention (like, don’t blink), you might spot me in a few episodes of Friday Night Lights or Prison Break.

LITA: User Testing is a big area. Who is your target audience for this workshop?

Presenters: This is a perfect workshop for people who want to learn user testing in a supportive environment. We will teach people how to test their websites in the real world – we understand that time and other resources are limited. This is for anyone who wants to know what it’s like for patrons to try accessing their library’s resources through their website.

LITA: How much experience with UX do attendees need to succeed in the workshop?

Presenters: Experience isn’t required, but an understanding of the general UX field and goals is useful. Attendees are encouraged to come with a potential usability study topic in mind. From Robert: “You just need to be able to put your social scientist hat on and look at user testing as an informal (and fun!) psychology experiment.”

LITA: If your workshop was a character from the Marvel or Harry Potter universe, which would it be, and why?

Kate: Having just read the Harry Potter series with my two kids, I can say that our workshop will inspire like Dumbledore, give you a chuckle like those naughty Weasley twins, teach you like the astute Minerva McGonagall would, and leave you smiling with satisfaction just like the brilliant Hermione Granger.

Deirdre: Marvel: definitely Wolverine. Tough and sassy with a heart of gold, calls everyone “bub.” Harry Potter: 100% Hermione. I’m an avid reader, rule-follower and overachiever. (LITA note: I think those describe Deirdre, maybe not the workshop?)

Robert: I’m gonna say Mystique. Mystique can literally put herself in someone else’s shoes (human or Mutant). When we conduct usability testing, we’re directly observing what it’s like to be in the user’s shoes and we’re seeing things from their perspective.

LITA: Name one concrete thing your attendees will be able to take back to their libraries after participating in your workshop.

Kate: The knowledge about how to conduct a user test on their library site, a coupon for a free test from, and support and encouragement from a team of experienced researchers.

Deirdre: The skills to plan, recruit for and execute small-sample usability tests. The ability to communicate the findings for those tests in a way that will advocate for their users.

Robert: The ability to validate your ideas about your website with direct, reliable user feedback. Whenever you think, “This might work, but would it make sense to our users?” You’ll have the skills and tools to go find out.

LITA: What kind of gadgets/software do your attendees need to bring?

Presenters: Whatever note taking method you prefer; a laptop or mobile device to follow along is recommended but isn’t required. Kate recommends “A laptop. A pen and paper. A positive, can-do attitude!”

LITA: Respond to this scenario: You’re stuck on a desert island. A box washes ashore. As you pry off the lid and peer inside, you begin to dance and sing, totally euphoric. What’s in the box?

Kate: I’m assuming my family is on the island with me, and in that case – I want that box to contain Hershey’s hugs, the white chocolate kisses with milk chocolate swirls. I’m obsessed!

Deirdre: Hostess Orange Cupcakes.

Robert: A gallon of Coppertone Oil Free Faces SPF 50+ Sunscreen. I’m sorry but I’m fair skinned with a ton of freckles and a desert island scenario just screams melanoma to me.

Thank you to Kate, Deirdre, and Robert for giving us this interview! We’re looking forward to their UX Workshop at Midwinter in Chicago. We’ll hear from our other workshop presenters in the coming weeks!

More information about Midwinter Workshops. 

Registration Information: LITA members get one third off the cost of Midwinter workshops. Use the discount promotional code LITA2015 during online registration to automatically receive your member discount.

Start the process at the ALA web sites: Conference web site: Registration start page: LITA Workshops registration descriptions:

When you start the registration process and BEFORE you choose the workshop, you will encounter the Personal Information page. On that page there is a field to enter the discount promotional code LITA2015, as in the example below. If you do so, then when you get to the workshop selection page the discounted price of $235 is automatically displayed and entered. The discounted total will be reflected in the Balance Due line on the payment page.

Please contact the LITA Office if you have any registration questions.

District Dispatch: Free Wi-fi in the Allegheny Mountains

planet code4lib - Tue, 2014-10-14 19:13

Allegheny Mountains. Photo by Nicholas A. Tonelli via flickr.

Last week, Emily Sheketoff, executive director of the American Library Association (ALA) Washington Office, Cathleen Bourdon, associate executive director of ALA Communication and Member Relations, and I (staff lackey) took a road trip to the Snowshoe resort in West Virginia to speak at the West Virginia Library Association Conference. The five-hour drive from D.C. to Snowshoe, W.V., was a pastoral treat, with fall leaves at their peak in the Allegheny Mountains.

There was a gas station in Warrensville where a gallon was only $3.09! The folksy diner there served a grilled cheese sandwich for $2.50. We saw a lot of cows (which is a big deal for folks who live in cities and rarely leave their offices). Emily’s theory that pending rainfall could be determined by whether a cow was standing or lying down on the ground proved to be inconclusive.

Once we got to Snowshoe, we experienced firsthand the difficulties a rural state like West Virginia has with access to broadband. We were assured prior to the trip that Wi-Fi was free, but upon arrival learned that that meant free at the Starbucks (which closes at 4pm). AT&T and T-Mobile were the only cellular networks supported. Because of the Robert C. Byrd Green Bank Telescope and potential interference with its operation, a large swath of land surrounding the area requires that all radio transmissions be severely limited. Check out the West Virginia Broadband map to see for yourself. Library-wise, over 65 percent of West Virginia libraries still require increased broadband based on the Digital Inclusion Survey.

For those of us suffering digital overload, this might not seem too bad. Cheap gas, low cost grilled cheese sandwiches, and beautiful mountains sound great, so who needs broadband? Everyone. In today’s connected world, how can people succeed without broadband?

The post Free Wi-fi in the Allegheny Mountains appeared first on District Dispatch.

HangingTogether: The Elusive User

planet code4lib - Tue, 2014-10-14 15:04


[city man watching fog dust | pixabay]

Each year, OCLC Research staff gather together to review current activities and to plan for the upcoming year. During this year’s meeting, which happened in September, we reviewed our activity areas. I lead the User Behavior Studies and Synthesis activity area; our group engaged in a discussion about describing and possibly renaming the activity area. We discussed “user behavior studies” and whether this terminology is overused and whether it reflects the whole picture of studying and identifying how individuals engage with technology; how they seek, access, and use information; and how and why they demonstrate these behaviors and do what they do.

I wonder if we, as librarians and information professionals, spend too much time contemplating and discussing users of our services and resources, and whether this energy would be better spent on identifying those individuals who choose not to use library services and resources. I wonder why we are fixated on users of library services and resources and why we do not expend energy on learning about those who go elsewhere for their technology and information needs and try to position library services and resources in their workflows and personal and professional landscapes. Marie L. Radford and I define these individuals who do not use library services and resources as potential users.

If we do buy into this need to identify potential users and their behaviors, what do we call this group? Are these individuals users also, just not users of library services and resources? The term potential user seems cumbersome and not very enticing when trying to promote interest and activity in this area. Even more difficult is identifying a term that describes both users and potential users of library services and resources. Could that term be Elusive Users? According to Choose Your Words, “Anything elusive is hard to get a hold of. It eludes you.” Does this term, elusive, accurately describe the individuals who we observe, interview, and track in various contexts of using technology and acquiring information? I invite you to share your ideas in the comments!

About Lynn Connaway

Senior Research Scientist at OCLC Research. I study how people get & use information & engage with technology.


LibraryThing (Thingology): Job: Library Developer at LibraryThing (Telecommute)

planet code4lib - Tue, 2014-10-14 14:22
UPDATE: We are offering $1,000 of books to the person who finds us a library developer. Code! Code! Code!

LibraryThing, the company behind and LibraryThing for Libraries, is looking to hire a top-notch developer/programmer.

We like to think we make “products that don’t suck,” as opposed to much of what’s developed for libraries. We’ve got new ideas and not enough developers to make them. That’s where you come in.

The Best Person
  • Work for us in Maine, or telecommute in your pajamas. We want the best person available.
  • If you’re junior, this is a “junior” position. If you’re senior, a “senior” one. Salary is based on your skills and experience.
Technical Skills
  • LibraryThing is mostly non-OO PHP. You need to be a solid PHP programmer or show us you can become one quickly.
  • You should be experienced in HTML, JavaScript, CSS and SQL.
  • We welcome experience with design and UX, Python, Solr, and mobile development.
The highly-photogenic LibraryThing staff only use stock photos ironically.
What We Value
  • Execution is paramount. You must be a sure-footed and rapid coder, capable of taking on jobs and finishing them with diligence and expedition.
  • Creativity, diligence, optimism, and outspokenness are important.
  • Experience with library data and systems is favored.
  • LibraryThing is an informal, high-pressure and high-energy environment. This puts a premium on speed and reliability, communication and responsibility.
  • Working remotely gives you freedom, but also requires discipline and internal motivation.
  • Gold-plated health insurance.
  • Cheese.
How To Apply
  • We have a simple quiz, developed back in 2011. If you can do it in under five minutes, you should apply for the job! If not, well, wasn’t that fun anyway?
  • To apply, send a resume. Skip the cover letter, and go through the blog post in your email, responding to the tangibles and intangibles bullet-by-bullet.
  • Also include your solution to the quiz, and how long it took you. Anything under five minutes is fine. If it takes you longer than five minutes, we won’t know. But the interview will involve lots of live coding.
  • Feel free to send questions to, or Skype chat Tim at LibraryThingTim.
  • Please put “Library developer” somewhere in your email subject line.

LibraryThing (Thingology): Send us a programmer, win $1,000 in books.

planet code4lib - Tue, 2014-10-14 14:04

We just posted a new job listing: Job: Library Developer at LibraryThing (Telecommute).

To sweeten the deal, we are offering $1,000 worth of books to the person who finds us that developer. That’s a lot of books.

Rules! You get a $1,000 gift certificate to the local, chain or online bookseller of your choice.

To qualify, you need to connect us to someone. Either you introduce them to us—and they follow up by applying themselves—or they mention your name in their email (“So-and-so told me about this”). You can recommend yourself, but if you found out about it from someone else, we hope you’ll do the right thing and make them the beneficiary.

Small print: Our decision is final, incontestable, irreversible and completely dictatorial. It only applies when an employee is hired full-time, not part-time, contract or for a trial period. If we don’t hire someone for the job, we don’t pay. The contact must happen in the next month. If we’ve already been in touch with the candidate, it doesn’t count. Void where prohibited. You pay taxes, and the insidious hidden tax of shelving. Employees and their families are eligible to win, provided they aren’t work contacts. Tim is not.


Library of Congress: The Signal: Close Reading, Distant Reading: Should Archival Appraisal Adjust?

planet code4lib - Tue, 2014-10-14 13:34

From time to time, co-chairs of the National Digital Stewardship Alliance Arts and Humanities Content Working Group will bring you guest posts addressing the future of research and development for digital cultural heritage as a follow-up to a dynamic forum held at the 2014 Digital Preservation Conference.  Anyone interested in contributing a posting for The Signal on this topic should contact either jsternfeld at or gail at

The following is a guest post from Meg Phillips, External Affairs Liaison, National Archives and Records Administration. Opinions expressed are those of the author and do not necessarily represent positions of the National Archives and Records Administration.

Meg Phillips, External Affairs Liaison at the National Archives and Records Administration and member of the NDSA Coordinating Committee.

Digital humanists and digital historians are employing research methods that most of us did not anticipate when we were learning to be archivists.  Do new types of research mean archivists should re-examine the way we learned to do appraisal?

The new types of researchers are experimenting with methods beyond the scholarly tradition of “close reading.”  When paper archives were the only game in town, close reading was all a researcher could do – it’s what we generally mean by “reading.”  Researchers studied individual records, extracting meaning and context from the information contained in each document.  Now, however, digital humanists are using born-digital or digitized collections to explore the benefits of computational analysis techniques, or “distant reading.” They are using computer programs to analyze patterns and find meaning in entire corpora of records without a human ever reading any individual record at all.

I have been interested in digital scholarship and its implications for archives for a while, but I hadn’t heard the phrase “distant reading” until seeing Franco Moretti’s book “Distant Reading” reviewed earlier this year. (See  “What is Distant Reading?” in the New York Times and “In Praise of Overstating the Case: A review of Franco Moretti, Distant Reading” in Digital Humanities Quarterly for a taste of the debate over the book.)  The phrase stuck with me as provocative shorthand for a new way of using records, and I started thinking about what distant reading might mean for archival appraisal.

Our traditions of archival appraisal are based on locating records that reward close reading.  A series appraised as permanent contains individual records that contain historically valuable information.  Both appraisal itself and the culling that happens during transfer or processing focus on removing records that do not contain permanently valuable information.

Now, however, it is possible to ask and answer entirely new kinds of questions with born-digital or digitized records. What did the network of influence in an organization look like?  How did communication flow? Was the chief executive interacting with a particular vendor unusually often? When did a new concept or term first appear and how quickly did use of the new term spread?  How did a disease spread through a community?  Not only is it possible, but early adopters are now teaching these research methods to a new generation of students.  For example, Professor Matthew Connelly is teaching a seminar at the London School of Economics called Hacking the Archives.  The course challenges students of international history to explore the new kinds of questions computational research allows.  These are questions whose answers emerge not from deep reading of individual records but from analysis of patterns in  large bodies of records.

The National Archives from user silbersam on Flickr.

The interesting thing about these questions is that the answers may rely on the presence of records that would clearly be temporary if judged on their individual merits. Consider email messages like “Really sick today – not coming in” or a message from the executive of a  regulated company saying “Want to meet for lunch?” to a government policymaker. In the aggregate, the patterns of these messages  may paint a picture of disease spread or the inner workings of access and influence in government.  Those are exactly the kinds of messages traditional archival practice would try to cull. In these cases, appraising an entire corpus of records as permanent would support distant reading much better.  The informational value of the whole corpus cannot be captured by selecting just the records with individual value.
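To make the aggregate-pattern idea concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the six-message corpus is invented, and the helper names `weekly_counts` and `first_seen` are hypothetical rather than taken from any archival tool. It counts, week by week, how many messages mention being sick and finds when the term first appears, which is the kind of question that only works if the individually trivial messages were kept.

```python
import re
from collections import Counter
from datetime import date

# Invented toy corpus of (date sent, message text) pairs.
MESSAGES = [
    (date(2014, 1, 6), "Really sick today - not coming in"),
    (date(2014, 1, 7), "Want to meet for lunch?"),
    (date(2014, 1, 13), "Out sick, back tomorrow"),
    (date(2014, 1, 14), "Still sick today, sorry"),
    (date(2014, 1, 15), "Budget draft attached"),
    (date(2014, 1, 20), "Feeling sick, working from home"),
]

# Whole-word, case-insensitive match on "sick".
SICK = re.compile(r"\bsick\b", re.IGNORECASE)


def weekly_counts(messages, pattern):
    """Count messages matching `pattern`, grouped by ISO (year, week)."""
    counts = Counter()
    for sent, text in messages:
        if pattern.search(text):
            iso = sent.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(sorted(counts.items()))


def first_seen(messages, pattern):
    """Return the earliest date on which `pattern` appears, or None."""
    dates = [sent for sent, text in messages if pattern.search(text)]
    return min(dates) if dates else None


print(weekly_counts(MESSAGES, SICK))
print(first_seen(MESSAGES, SICK))
```

No single message here would be appraised as permanent on its own, yet removing them would erase the weekly trend. A real study would need far more than keyword matching, but the dependence on a complete corpus is the same.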

If we adjusted practice to support more distant reading, archivists would still do appraisal, deciding what is worth permanent preservation.  We would just be doing it at a different level of granularity – appraising the research value of an entire email system, SharePoint site or social media account, for example.

Incidentally, on a practical level this level of appraisal might also lead to disposition instructions that are easier for creating offices to carry out.

Figuring out how to do appraisal to support both distant reading and close reading would be an excellent project for the archival and digital preservation fields.  What questions would we want to answer?  We could start with some questions like these:

  • How many researchers are actually engaged in distant reading?  What fields do they work in?  Are their numbers increasing?
  • Do they want to apply computational techniques to archival materials, for example Federal records in the National Archives, or in any other environment?  Perhaps they are getting their source material somewhere else, bypassing archives.
  • To what extent do their research methods rely on having a complete set of the records created rather than a subset of the most permanently valuable records?
  • Do current definitions of a record and current recordkeeping regulations support a change to appraisal of entire corpora of records?
  • How would we know which corpora of records were most useful to researchers?
    • Is the benefit of distant reading worth the cost and risk of retaining more material that could have personal privacy or other protected content?
  • Is there a meaningful difference between trying to support computational research and actually just keeping everything?  (Perhaps this whole discussion is just the modern version of the old tension between historians who want to save everything and archivists who are trying to put their resources toward the most important materials.)

Staff at the National Archives and other institutions are starting to create opportunities for archivists to discuss questions like these.  Josh Sternfeld of NEH, Jordan Steele of Johns Hopkins and Paul Wester and I from NARA will be holding a panel discussion of these issues at the Fall 2014 Mid-Atlantic Regional Archives Conference meeting in Baltimore, for example.   Paul and I will also be speaking with Matthew Connelly and others on an American Historical Association panel at the 2015 annual meeting in New York City, “Are We Losing History? Capturing Archival Records for a New ERA of Research.”

However, we need to create even more opportunities for archivists to explore these issues with digital humanists. A forum that pulled together digital researchers, archivists, librarians and technologists could be a great opportunity for us all to learn from each other. Such an event could also spread the word about the exciting new things that can be done with digital primary sources and the rich collections of digital resources that are now available in archives and libraries.

Of course, we can also blog about the issues and hope that the community leaps into the fray!

In that spirit, do you think archival appraisal needs to change, and if so, how?

PeerLibrary: Towards Open Access to Research and Knowledge for Development

planet code4lib - Tue, 2014-10-14 04:00

Beyond establishing an online database of publicly accessible academic articles, PeerLibrary has committed itself to providing an open space where people are encouraged to collaborate and communicate with one another. This week we want to highlight a fascinating article in our database, titled “Towards Open and Equitable Access to Research and Knowledge for Development”, by Professor Leslie Chan et al. at the University of Toronto. Professor Chan’s team focuses on the importance of research and the necessity for anyone, not just academic scholars, to be able to engage in and conduct research as well as share and critique one another’s ideas. He claims that only open collaboration will effectively promote human development and allow the merging of different identities. With the words of Professor Chan in mind, PeerLibrary hopes to aid the development of free collaboration under the guiding principle of the universal right to education.

Dan Scott: DCMI 2014: holdings in open source library systems

planet code4lib - Tue, 2014-10-14 01:07

My slides from DCMI 2014: in the wild: open source libraries++.

Last week I was at the Dublin Core Metadata Initiative 2014 conference, where Richard Wallis, Charles McCathie Nevile and I were slated to present on and the work of the W3C Bibliographic Extension Community Group (#schemabibex). As a first-timer at DCMI, I wasn't sure what kind of an audience to expect: there is a peer-reviewed papers track, and a series of sessions on a truly intimidating topic (RDF Application Profiles), but on the other hand our own topic was fairly basic. As it turned out, there was an invigoratingly mixed set of backgrounds present, and Eric Miller's opening keynote, which gave an oral history of the origins of DCMI and a look towards the future challenges for the organization, reassured me that I wasn't going to be out of my depth.

Special kudos to Eric for his analogy of the Web to a credit card, which offers both human-readable and machine-readable data. A nice, clean image!

Richard, Charles and I opted to structure our 1.5 hour session as a series of short talks followed by a long period of discussion. However, as often happens, the excitement of speaking in front of a room that drew so many attendees that we had to jam with more chairs led to that plan breaking down. I cut my own materials back to illustrating how one of my primary contributions to the #schemabibex effort--representing library holdings using's GoodRelations-based Product/Offer model--had been implemented in free software library systems, including Evergreen, Koha, and VuFind. I walked from a basic bibliographic record (represented as a Product), through to the associated borrowable items (represented as Offers with a price of $0.00, call numbers as SKUs, and barcodes as serialNumbers), that were offered by a specific Library with its own set of operating hours, address, and contact information... all published out of the box as RDFa in modern Evergreen systems.
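The Product/Offer mapping described above can be sketched roughly as follows. This is an illustration, not the markup Evergreen actually emits: it assumes the vocabulary in question is schema.org (which the #schemabibex tag suggests), it serializes to JSON-LD rather than RDFa for brevity, and the function name, sample title, call numbers, barcodes, currency, and library details are all invented.

```python
import json


def holdings_as_product(title, library, items, currency="CAD"):
    """Describe a bibliographic record and its copies as a Product with Offers."""
    # The lending library, with hours, address, and contact information.
    library_node = {
        "@type": "Library",
        "name": library["name"],
        "address": library["address"],
        "telephone": library["telephone"],
        "openingHours": library["opening_hours"],
    }
    offers = []
    for item in items:
        offers.append({
            "@type": "Offer",
            "price": "0.00",                  # borrowing costs nothing
            "priceCurrency": currency,        # assumption: currency is not given in the post
            "sku": item["call_number"],       # call number stands in as the SKU
            "serialNumber": item["barcode"],  # barcode stands in as the serial number
            "offeredBy": library_node,
        })
    return {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": title,
        "offers": offers,
    }


# Hypothetical sample data, for illustration only.
record = holdings_as_product(
    title="An Example Title",
    library={
        "name": "Example Public Library",
        "address": "1 Example Street, Exampleville",
        "telephone": "+1-555-0100",
        "opening_hours": "Mo-Fr 09:00-17:00",
    },
    items=[
        {"call_number": "PR 6037 .E9 2014", "barcode": "30000000000001"},
        {"call_number": "PR 6037 .E9 2014 c.2", "barcode": "30000000000002"},
    ],
)
print(json.dumps(record, indent=2))
```

Each borrowable copy becomes its own Offer, so a consumer can list copies, call numbers, and the owning library without knowing anything about the underlying library system.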

I did stray a little to posit that the use case for is not and should not be limited to "search engine optimization", but that this very simple level of structured data could fairly easily form the basis of an API. In the rather limited discussion that we were able to hold at the end of the session (and encroaching on break time), Charles counselled that libraries shouldn't really bother with dumbing down their beautiful metadata simply to publish while I countered that the pursuit of publishing beautiful metadata in the past has generally led librarians to publish no metadata at all, and that was a great first step towards building a web of cultural heritage metadata meant for machine consumption.

I wish I could have stayed longer at DCMI, but it was Thanksgiving in Canada and there were families to visit and feast with--not to mention children to help take care of--so I had to depart after just a day and a half. I'm encouraged by the steps the organization is taking to renew itself, and I hope to be able to participate again in the future.

DPLA: DPLA and DigitalNZ present GIF IT UP, an international GIF-making competition: October 13 – December 1, 2014

planet code4lib - Mon, 2014-10-13 14:28

It’s a public domain celebration! The Digital Public Library of America and DigitalNZ are very excited to announce the launch of GIF IT UP, an international competition over the next six weeks to find the best GIFs reusing public domain and openly licensed digital video, images, text, and other material available via our search portals. The winners will have their work featured and celebrated online at the Public Domain Review and Pretty sweet, huh?


Cat Galloping (1887). The still images used in this GIF come from Eadweard Muybridge’s “Animal locomotion: an electro-photographic investigation of consecutive phases of animal movements” (1872-1885). Courtesy USC Digital Library, 2010. View original record (item is in the public domain). GIF available under a CC-BY license.

How it works. The GIF IT UP competition has six categories:

  1. Animals
  2. Planes, trains, and other transport
  3. Nature and the environment
  4. Your hometown, state, or province
  5. WWI, 1914-1918
  6. GIF using a stereoscopic image

A winner will be selected in each of these categories and, if necessary, a winner will be awarded in two fields: use of an animated still public domain image, and use of video material.

To view the competition’s official homepage, visit

Judging. GIF IT UP will be co-judged by Adam Green, Editor of the Public Domain Review and by Brian Wolly, Digital Editor of Entries will be judged on coherence with category theme, thoroughness of entry (correct link to source material and contextual information), creativity, and originality.

Gallery. All entries that meet the criteria outlined below in the Guidelines and Rules will be posted to the GIF IT UP Tumblr Gallery. The gallery entries with the most Tumblr “notes” will receive the people’s choice award and will appear online at the Public Domain Review and alongside the category winners.

Submit. To participate, please first take a moment to read “How it Works” and the guidelines and rules on the GIF IT UP homepage, and then submit your entry by clicking here.

Deadline. The competition deadline is December 1, 2014 at 5:00 PM EST / December 2, 2014 at 10:00 AM GMT+13.

GIFtastic Resources. You can find more information about GIF IT UP–including select DPLA and DigitalNZ collections available for re-use and a list of handy GIF-making tips and tools–over on the GIF IT UP homepage.

Questions. For questions or other inquiries, email us at or, or tweet us @digitalnz or @dpla. Good luck and happy GIFing!

 All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

LITA: ADE in the Library eBook Data Lifecycle

planet code4lib - Mon, 2014-10-13 13:53

Reader: “Hey, I heard there is some sort of problem with those ebooks I checked out from the library?”

Librarian: “There are technical problems, potential legal problems, and philosophical problems – but not with the book itself nor your choice to read it.”

As mentioned, there are (at least) three sides to the problem. Nate Hoffelder* discovered the technical problem with the way the current version (4) of Adobe Digital Editions (ADE) manages the ebook experience, which was confirmed by security researcher Benjamin Daniel Mussler, and later reviewed by Eric Hellman. The technical problem, that arguably private data is sent in plain text from a reader’s device to a central data-store, seemed pretty obvious once it was discovered. The potential legal problem stems from laws in every state that protect reader privacy and set expectations for data security, plus other laws which may apply. The philosophical problem has several facets, which could be simplified down to the tension between privacy and convenience.

When a widely used software platform is found to be logging data unexpectedly and transmitting it for some unknown use, it causes great unease among users. When that transmission is happening in plain text over easily intercepted channels, it causes anger among technologists who think a leading software developer should know better. When this is all happening in the context of the library world, where privacy is highly valued, there is outrage, as expressed by LITA Board member Andromeda Yelton.

Here are the library profession’s basic positions:

  1. Each individual’s reading choices and behavior should be private (i.e. anonymized or, better, not tracked)
  2. Data gathered for user-desired functionality across devices should be private (i.e. anonymized)
  3. Insofar as there is any tracking of reading choices and behavior, there should be an opt-out option readily available to individuals (i.e., not buried in the fine print)

In his October 9th post from The Digital Shift, Matt Enis reports that Adobe is working to correct the problem of data being transmitted in clear text but “maintains that its collection of this data is covered under its user agreement.” The data that corporations transmit should be limited to the data and data elements necessary to provide desired functionality yet also restricted enough for an individual’s activity to remain private.

To join the conversation, begin to educate yourself using our ADE Primer, below, plus the following resources:

A Primer on how Adobe Digital Editions (ADE) works with library ebooks

I’m a reader and I go to use a library ebook
(via Overdrive or other downloading service offered):

  1. what will I need to install on my device(s)?
    (laptop, tablet, phone, & iPod let’s assume)

    • laptop/computer: Adobe Digital Editions (ADE), activated with an Adobe ID
    • tablet, phone, iPod, etc.: Bluefire Reader (or compatible) app, activated with an Adobe ID
  2. how do the various devices know which page to show me next when I switch between them?
    • access and synchronization across devices are managed using the Adobe ID and the information associated with the ebook and by data tracked with ADE
  3. what technologies are behind the scenes?
    • the ADE managed digital rights management (DRM) required by the ebook publisher
    • the ebook reader software/app
    • the internet
  4. what data is needed to be able to do the sync?
    • the minimum required data is arguably the UserID, BookID, and a page-accessed timestamp
    • the current ADE version, ADE4, tracks significantly more data than the minimums above
  5. how is that data shared between devices?
    • Users can access their ADE account from up to 6 different devices. When accessing the ID/account from a new device the user must “activate” the device by logging into the Adobe ID/Account to prove that the user is the legitimate account holder.
    • ADE4 shares all ebook data it tracks in plain text over an unsecured channel on the internet
  6. what functionality would not work if this were suddenly not provided?
    • if ADE did not provide reader tracking data, then each time a reader opened an ebook on a different device, they would have to remember the page they were on and navigate to it to continue reading from where they left off
    • A computer can be anonymously activated using ADE; however, this will prevent the items from being accessible from more than one computer/device. The ebooks would then be considered to be “owned” by that computer and would not be available to be accessed from other devices.
    • if ADE were completely withdrawn from availability, ebook DRM would prevent use of ADE-managed DRM-protected ebooks
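
As a rough illustration of the "minimum required data" point in item 4, the sketch below models a sync record holding only a user ID, a book ID, a page, and a page-accessed timestamp, and looks up where a reader left off. Everything here is hypothetical: the names `SyncRecord` and `latest_position` are invented for illustration, the explicit `page` field is an assumption implied by "page-accessed", and none of it reflects Adobe's actual data format or protocol.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Iterable, Optional


@dataclass(frozen=True)
class SyncRecord:
    """Hypothetical minimal sync record; NOT Adobe's actual format."""
    user_id: str           # identifies the account (could be an opaque token)
    book_id: str           # identifies the ebook
    page: int              # assumed field: the page that was accessed
    accessed_at: datetime  # when that page was accessed


def latest_position(records: Iterable[SyncRecord],
                    user_id: str, book_id: str) -> Optional[int]:
    """Return the most recently accessed page for this user and book, if any."""
    matching = [r for r in records
                if r.user_id == user_id and r.book_id == book_id]
    if not matching:
        return None
    # The record with the newest timestamp is where the reader left off.
    return max(matching, key=lambda r: r.accessed_at).page


# Example: the same reader opens one book on two devices on the same day.
records = [
    SyncRecord("user-1", "book-1", 10,
               datetime(2014, 10, 13, 9, 0, tzinfo=timezone.utc)),
    SyncRecord("user-1", "book-1", 42,
               datetime(2014, 10, 13, 21, 0, tzinfo=timezone.utc)),
    SyncRecord("user-1", "book-2", 7,
               datetime(2014, 10, 13, 22, 0, tzinfo=timezone.utc)),
]
print(latest_position(records, "user-1", "book-1"))
```

The point of the sketch is only that cross-device sync can, in principle, be served by these few fields; anything collected beyond them would need a separate justification, such as the licensing models discussed below.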

From a technology point of view, the clear-text data transmitted suggests the data may be for synchronization, but it seems, first and foremost, to support various licensing business models. Because Adobe might in the future have customers who want to use Adobe DRM to expire a book after a certain number of hours or pages read, they may feel the need to collect that data. Adobe’s data collection seems to be working as intended here. Clear-text transmission is clearly a bug, but that this data about patron reading habits is being transmitted to Adobe is a feature of the software.

The philosophical discussion which needs to happen around ebooks and DRM should include:

  • what data elements enable user-desired functionality
  • what data elements enable digital rights management
  • what data elements above are/are not within ALA’s stated professional ethics
  • whether tracking ebook user behavior is acceptable *at all*

From libraryland conversations around the issue so far, opinions have ranged from ‘tracking is not the problem, the clear-text transmission is‘ to ‘tracking is very much a problem, it’s unacceptable.’

Issues like this highlight the need to revisit stated positions and evaluate where the balance point is between accommodating user functionality and protecting against collection of personally identifiable data, or metadata.

*Post updated to correctly credit Nate Hoffelder as the original discoverer (my apologies!)

John Miedema: How Watson Works in Four Steps

planet code4lib - Mon, 2014-10-13 13:34

A good overview of how IBM’s Watson works. When humans seek to understand something and to make a decision we go through four steps.

  1. Observe visible phenomena and bodies of evidence;
  2. Draw on what we know to interpret evidence and to generate hypotheses;
  3. Evaluate which hypotheses are right or wrong; and
  4. Decide the best option and act accordingly.

So does Watson. Key to the success is the ability to process unstructured inputs using Natural Language Processing.

District Dispatch: Free webinar: Helping patrons understand Ebola

planet code4lib - Mon, 2014-10-13 06:48

Photo by Phil Moyer

Reminder: On Tuesday, October 14, 2014, library leaders from the U.S. National Library of Medicine will host the free webinar “Fighting Ebola and Infectious Diseases with Information: Resources and Search Skills can Arm Librarians.” The webinar will teach participants how to find and share reliable health information.

Recent outbreaks across the globe and in the U.S. have increased public awareness of the potential public health impacts of infectious diseases. As a result, many librarians are assisting their patrons in finding credible information sources on topics such as Ebola, Chikungunya and pandemic influenza.

Speakers include:

Siobhan Champ-Blackwell
Siobhan Champ-Blackwell is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center. She selects material to be added to the NLM disaster medicine grey literature database and is responsible for the Center’s social media efforts. She has over 10 years of experience in providing training on NLM products and resources.

Elizabeth Norton
Elizabeth Norton is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center where she has been working to improve online access to disaster health information for the disaster medicine and public health workforce. She has presented on this topic at national and international association meetings and has provided training on disaster health information resources to first responders, educators, and librarians working with the disaster response and public health preparedness communities.

Date: Tuesday, October 14, 2014
Time: 2:00 PM – 3:00 PM Eastern
Register for the free event

If you cannot attend this live session, a recorded archive will be available to view at your convenience. To view past webinars also done in collaboration with iPAC, please visit

The post Free webinar: Helping patrons understand Ebola appeared first on District Dispatch.

Patrick Hochstenbach: Homework assignment #3 Sketchbookskool

planet code4lib - Sun, 2014-10-12 07:07
This week we were asked to go to a park and draw people using line art. It was raining, so I decided to go to the Church of Our Lady, which is always a tourist attraction. Filed under: Doodles

Cynthia Ng: Batch Appending a Single PDF to multiple PDFs

planet code4lib - Sun, 2014-10-12 03:59
So recently, I came up against the problem of having to add a page at the end of multiple PDFs. A couple of years ago, I’d done some work with GhostScript to merge a bunch of PDFs, so I thought I’d start there. Use Case I have a bunch of PDFs, and what I have is […]

DuraSpace News: The Archivematica + DuraCloud “Soup-to-Nuts” Preservation Service Launches a Beta Test

planet code4lib - Sun, 2014-10-12 00:00

Winchester, MA - The Archivematica + DuraCloud hosted service has launched a beta test with pilot partners that will be ongoing from October 2014 to January 2015.

Ensuring that robust Archivematica Archival Information Packages (AIPs) have a secure long-term home is the idea behind the new Archivematica + DuraCloud hosted service. The new integrated service is designed to provide users with a robust preservation workflow plus long-term archiving in a single hosted solution.

John Miedema: Orlando: the lives and works of British women writers. Digital resources working together in unexpected and insightful ways.

planet code4lib - Sat, 2014-10-11 19:55

Orlando is a digital resource, indexing the lives and works of British women writers.

The full name of the project is Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. It is the work of scholars Susan Brown, Patricia Clements, and Isobel Grundy. The name of the work was inspired by Virginia Woolf’s 1928 novel, Orlando: A Biography. The project, like the novel, is an important resource in the history of women’s writing. It grew out of the limitations of a print-based publication, The Feminist Companion to Literature in English. The Companion presented a great deal of research on women writers but lacked an adequate index. The researchers decided to compile a digital index.

I have the good fortune to work with Susan Brown and the Orlando resource. I have extracted bibliographic and literary data from Orlando, and intend to integrate it with unstructured literary content using Natural Language Processing. The aim is a first demonstration of how digital resources like Orlando can provide new ways of reading and understanding literature. In particular I hope to show how digital resources can work together in unexpected and insightful ways.

More information:

The Orlando Project

Bigold, Melanie (2013) “Orlando: Women’s Writing in the British Isles from the Beginnings to the Present, edited by Susan Brown, Patricia Clements, and Isobel Grundy,” ABO: Interactive Journal for Women in the Arts, 1640-1830: Vol. 3: Iss. 1, Article 8.
Available at:

Orlando: A Biography. Wikipedia



