planet code4lib

Mita Williams: The update to the setup

Sat, 2015-04-25 20:26
In my last post, I described my current computer setup. I did so to encourage a mindfulness in my own practice (I am not ashamed of writing the previous sentence - I really do mean it). Forcing myself to inventory the systems that I use made two things readily apparent to me. First, it is abundantly clear that not only am I profoundly dependent on Google products such as Google Drive, but almost all of the access to my online world is tied together by my Gmail account. I aspire to, one day, be one among the proud and the few who are willing to use alternatives such as ownCloud and Fastmail just to establish a little more independence.

But before even considering this move, I first needed to address the second glaring problem that emerged from this self-reflection of my setup: I desperately needed a backup strategy. Massive loss was just a hard drive failure or malicious hack away.

As I write this, my old Windows XP computer is sending years’ worth of mp3s, documents and digital photos to my new WD Book, which I bought on recommendation from Wirecutter. When that’s done, I’m going to copy over the backups of my Google Drive contents, Gmail, Calendar, Blogger, and Photos that I generated earlier this week using Google Takeout.

I know myself well enough that I cannot rely on making regular manual updates to an external hard drive. So I have also invested in a family membership to CrashPlan. It took a loooong time for the documents of our family computers to be uploaded to the CrashPlan central server but now the service works unobtrusively in the background as new material accumulates. If you go this route of cloud-service backups, be aware that it’s likely that you are going to exceed your monthly data transfer limit for your ISP. Hopefully your ISP is as understanding as mine, who waived the additional costs as this was a ‘first offense’ (Thank you Cogeco!).

My next step? I’m going to re-join the Archiveteam.

Because history is our future.

Coral Sheldon-Hess: Fitness Trackers

Sat, 2015-04-25 20:01

I’m pretty sure I promised to write a post about the fitness tracker I use. It’s very pretty! But since it hasn’t synced to my phone in eight days(!), and I can’t get it to manually sync, I find I’m too frustrated to write the in-depth post I had intended to write. That’s too bad for Misfit, since their app has just been updated, and every in-depth post out there is going to talk about the old app. (Apps, really. Apparently the Android one and the iOS one are very different.)

Because I’m cranky, I’m going to open with some reasons why you might not want a fitness tracker. And then I’ll do a quick compare/contrast of the four I know the most about, because why not?

Fitness trackers aren’t amazing yet

Now that my Misfit Shine seems to be on the fritz, I have to decide whether or not a fitness tracker is even something I want to start to budget for, at all. Every single one of them, to my knowledge, shares these serious downsides (listed in no particular order):

  • They’re either kind of ugly or completely useless as a motivator. The Shine is lovely; it looks like a piece of jewelry when I wear it on its necklace! But its unobtrusiveness means that it doesn’t serve arguably its greatest purpose: reminding and/or motivating me to get up and move. Most of the other trackers are worn on the wrist (which makes me feel like they should be so inaccurate, but I’ve been assured that isn’t the case), and most are pretty ugly. But at least having something on your wrist all day will give you that visual cue: “Oh, right, I have goals.” As far as avoiding fitness bracelets, there are clip-on trackers, but I’m a forgetful person; I had more days without data than days with it, some weeks, with my clip-on Fitbit. And the real problem with attractive or “invisible” trackers: you don’t see them, so they don’t motivate you. This could be mitigated, as I’ll mention below, by adding an inactivity alert to any of these devices, plus maybe a signal from your phone when it gets too far from the device, to help prevent forgetting them. (Sure, this would have consequences for battery life. Look: engineering is hard. Deal with it.)
  • There are legal and privacy concerns. (A warning for my friends from Anchorage: I’m going to talk about Wil. Skip this if you want.) Fitbits are being used in court cases, hopefully responsibly; there are also employers who know way more about their employees’ activities than I would be entirely comfortable with, if I worked for them. Having seen the O’Reilly talk about Intridea (video), I can tell some of these employers mean well, so I might go along with being tracked if I worked for one of them; I’d just want to see what they were doing with the data, first. Still, fitness tracking can also be used for evil: a few years ago, a friend of mine was killed by a motorist while commuting by bike. The police force’s “investigation” was almost entirely based around his tracking app’s location data—which was obviously, visibly inaccurate, showing him swerving across a four lane street in a way that the snow berms would have made impossible. (Because consumer-grade location data is like that. Inaccurate.) Long story short, the lazy, incompetent police officers decided that my friend was at fault for riding in the wrong direction, in the street (the sidewalk in that area is a bike path, so riding the wrong way down that is legal/expected). The motorist who hit him (while he was, verifiably, in a crosswalk) was not punished. Now, I realize, no amount of properly interpreted data could have saved my friend’s life, and punishment isn’t really justice; still, it seems obvious that misinterpretation of fitness tracker data could easily hurt other people.
  • It’s a little bit ableist, isn’t it? You know? 10,000 steps per day isn’t a good goal for me, at least not more than a couple of times a week. With my foot injury, “more steps” isn’t always going to be better. But you aren’t rewarded for resting, or taking care of yourself; you’re rewarded for more steps. Worse, some trackers (and Misfit is one) use “streaks” as their primary motivational mechanic, which I hate: I’m going to have bad days, and I hate that my tracker punishes me for them. Fitbit and Garmin at least let you work toward badges over time, so you don’t “lose” all of your progress on a bad day.
  • What if your goal isn’t weight loss? This varies by tracker, but some of them really harp on calories. Really, really, really. I know some people who find this triggering. For me, it’s just annoying. My goal isn’t really to lose weight; it’s to get more exercise per day (when I can).
  • If your goal is weight loss, bad news: many users gain weight. Apparently. (Obvious and probably unnecessary warning: the discussion of weight, at that link, is not nuanced.) Calorie burn estimates are wildly inaccurate, who knew? (I know! And, if you didn’t already, now you know!)
Features and flaws

So let’s see how these things all stack up, shall we?

Fitbit
  • Features: Relatively easy to get your data, web interface, works with some cool third party apps.
  • Maybe features/Maybe bugs: Counts flights of stairs climbed.
  • Bugs: Not waterproof, requires charging, no inactivity monitor.

This one’s really popular, and now that they offer a purple wrist band, I’ve thought hard about whether or not I want one again. I really liked my clip-on Fitbit, even though I got fed up with it (or myself) for its (my) “forgetting to wear it” and “forgetting to change its mode” issues—that latter one was particularly annoying, because if you do something like cycling or sleeping, it has trouble figuring that out, and everything will be horribly inaccurate unless you change its mode while you do those things, or go in and fix it later. But since that seems to be the case with every other tracker I’ve seriously considered (with the exception of the newest/fanciest Jawbone), it doesn’t even make it in the features/bugs list.

I liked that it had a web interface; it wasn’t perfect—some settings were a little hard to find—but you could look at trends over time, which I appreciated. I haven’t been back in there in months, but the last time I looked, it didn’t appear that they’d changed the website at all. I feel like all of their development efforts are on hardware, not UI, which is too bad.

As far as wrist-borne Fitbits, my friends who have them seem to like them well enough. Nobody has complained that theirs is too big, or too ugly to wear to work. It’s apparently wildly inaccurate if you use a treadmill, but otherwise, it seems to do a good job for them.

If you get the one without the heart rate monitor, it’s a little bit water resistant; the one with HRM (so… the purple one) is, apparently, not. I don’t know about you, but, even if I can’t wear my fitness monitor in the pool (which I would like) or the shower (which I would very much like, due to that whole “forgetfulness” issue), I’d like to be able to wear it in heavy rain without worrying about it. That alone might nix the whole Fitbit idea, for me.

Also, since I can’t reliably climb stairs (arthritis), I think a Fitbit would potentially not be great for my motivation; it has an onboard altimeter that tells you how many flights you’ve climbed, and I know I hated seeing “0,” in the past. Plus, I’d never beat my record, from ALA Midwinter in Seattle, when I stayed at the hotel at the top of the hill. ;) But for people who have set “climb more stairs” as a goal for themselves, it is very, very good in this respect, and I believe it’s the only tracker to do that.

That said, Fitbit is probably the most popular tracker out there, so it works with pretty much every other app, including FitRPG. (I am very interested in FitRPG.) If you want good app integration or lots of friends to be socially competitive with, Fitbit’s probably the winner (with Jawbone in second place).

Plus? There’s an R library for pulling in Fitbit data. I think that’s kind of cool. :)

Misfit Shine
  • Features: Waterproof! Aesthetically pleasing.
  • Maybe features/Maybe bugs: Totally flexible about how you wear it (wrist, necklace, etc.), though accuracy varies. No charging; replace the battery after four months.
  • Bugs: The data is locked inside an app, nearly impossible to get out. Syncing is really shaky. Terrible sleep monitor.

Despite all this writing, which I’m told is supposed to be therapeutic or something, I’m still pretty mad about my Misfit not syncing. I’ll try to be balanced in my approach, though, because it does have some great features, beyond just being pretty.

First off, it uses a CR2032 watch battery, which means you don’t have to take it off and recharge it all the time. Every four months or so, you pry it open, put the dead battery in your “take to the dump on toxic waste day” pile (which is a down side, yes), and put a new one in. As watch batteries go, you can’t really get more ubiquitous or any cheaper than the model the Misfit uses.

Because there’s no charging port, it’s waterproof. It even has a setting for swimming! (Like the other devices, it wants you to tell it when you’re doing any activity besides walking. I think you’re supposed to wear it on your wrist to swim and on your ankle to bike; they even sell special socks, though I didn’t buy those.) “Waterproof” is a really hard feature to find in the fitness tracker market and was a major selling point for me. I can wear it 24/7, in any weather! I really like that!

Something that they state as a pro is, honestly, more of a toss-up: you can wear it on your wrist (they provide a basic wrist band and sell fancier ones), on a necklace (they sell necklaces), clipped to your shoe (they provide a magnetic clip), or basically wherever. The perception of flexibility is great! That said, it is wickedly inaccurate when worn on the wrist—either that, or Charlottesville’s trail markers are wrong. It seems accurate enough when worn on the chest, though, so I generally go with that. There’s a place in the app to tell it where you’re wearing it, so it does try to be accurate; I assume this will improve over time.

The thing I dislike most about it, though, and which is probably going to cost them further business from me, is that they have no web interface and no data export. Your only interface is the app. You can’t really look at trends over time—not in a way that’s useful—and its integration with IF/IFTTT is really, really weak; if it doesn’t sync for a day, for whatever reason, it records zeros and never overwrites them. And the syncing is really, really spotty, so this happens a lot. (It used to only sync manually, and you were supposed to put the device on your phone to do it. It nominally syncs automatically, nowadays, but it fails. Often.)

I haven’t looked to see if there are any API wrappers in any programming languages I know, and I haven’t been grumpy enough about this glaring lack, on their part, to actually jump through the hoops to get a developer API key so I could write my own. I might, though. If mine ever syncs again.

Also, since it’s sort of new and syncs poorly, Misfit just isn’t there yet with the other app integration. They’re adding services all the time, so this will presumably improve rapidly.

I listed “terrible sleep monitor” as a bug, but this is honestly not the main thing I use these devices for. The Fitbit’s sleep monitor also seemed inaccurate (I hope, or else I’m living on 4-5 hours of actual sleep per night), but the Shine is even worse, splitting up one night of sleep so that it looks like several nights. I avoid that part of the interface altogether, honestly. They sell a separate device that you can put on your bed to monitor sleep, but given how bad the interface for the Shine is, on that front, and how little access you get to your data, I don’t see the point.

Jawbone Up
  • Features: Inactivity monitor! Relatively slim band. The fanciest one doesn’t require logging workouts separately; it can tell when you start a workout.
  • Maybe features/Maybe bugs: Super super super into weight loss as a thing, like whoa.
  • Bugs: Not waterproof (though it is water resistant and can be worn in the shower). No display (the older models had this problem; it looks like there’s a display, now).

I haven’t actually owned an Up, but it was the other brand I considered really hard, last time I was looking at fitness trackers. I know a few people who have them, and they like them a lot.

It has one feature that I think is killer and that I really want: you can set it to vibrate if you haven’t moved in some given time frame. You can even pick the time frame! (Note: this is true for their bands, but not for the Up Move. That’s too bad, because that one comes in purple.)

When my arthritis was at its worst, I felt severe pain if I sat still for more than about 45 minutes at a time—the problem was, the sitting itself wasn’t what hurt; it was the standing up afterward. I didn’t have to get up and walk for long—just a stand-and-stretch did the trick, most of the time—but the consequences of staying seated too long were awful. So I tried all kinds of alarms, calendar reminders, apps, you name it; nothing was very good. I just kind of limped along (bad joke) with a hodgepodge of things until I got better meds.

Now, of course, I’m much better, and I can sit for however long I need to, within reason; but it turns out you’re still supposed to get up and move every so often, to be healthy.

And where the calendars and apps fail, an inactivity alert is such a clear win: it isn’t about getting up and moving at every arbitrary time mark, but about not sitting still for too long. If you have a day that’s nicely broken up by meetings and various activities, you don’t need a reminder on the 45/60/115 minute mark; that’s annoying. You definitely don’t need something that makes actual sound. But on those “head down, do work” days, a bracelet that vibrates gently to say “Hey, you haven’t moved in a bit”? That sounds perfect.
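To make that distinction concrete, here is a minimal sketch in Python of the logic an inactivity alert needs, next to a plain fixed-interval reminder. Everything in it is an assumption for illustration: the function names, the idea of feeding in a list of "minutes at which I moved," and the 45-minute threshold. It is not how Jawbone or any other vendor actually implements the feature.

```python
def inactivity_alerts(movement_minutes, day_length, threshold=45):
    """Return the minutes at which a hypothetical tracker would buzz.

    movement_minutes: sorted, unique minutes (counted from the start of the
        day) at which the wearer got up and moved.
    day_length: number of minutes to simulate.
    threshold: minutes of stillness allowed before an alert (assumed value).
    """
    alerts = []
    last_reset = 0  # last minute at which the wearer moved or was alerted
    moves = iter(movement_minutes)
    next_move = next(moves, None)
    for minute in range(1, day_length + 1):
        if next_move is not None and minute == next_move:
            # Movement resets the clock, so no alert is needed.
            last_reset = minute
            next_move = next(moves, None)
        elif minute - last_reset >= threshold:
            # Still for too long: buzz, then start counting again.
            alerts.append(minute)
            last_reset = minute
    return alerts


def fixed_interval_alerts(day_length, interval=45):
    """A calendar-style reminder: fires on every mark, moving or not."""
    return list(range(interval, day_length + 1, interval))


# A day broken up by meetings every 30 minutes: the inactivity alert stays quiet,
# while the fixed reminder still interrupts on every mark.
busy_day = inactivity_alerts([30, 60, 90, 120, 150, 180], 180)
nagging = fixed_interval_alerts(180)

# A "head down, do work" day with no movement: the alert fires each threshold.
heads_down = inactivity_alerts([], 180)
```

On the busy day the inactivity version never fires, and on the heads-down day it behaves like the fixed reminder, which is exactly the behavior described above.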

If the Shine or one of the other non-wrist trackers implemented this feature, they would suddenly be useful for motivation again!

It seems like Jawbone falls between Fitbit and Misfit on the app integration front, but probably closer to Misfit, honestly. There are some sleep apps and some weight loss apps and a whole bunch of food apps, but, besides Fitocracy, there’s not a lot on the gamification front. (And Fitocracy is really more oriented toward weight lifting than cardio or walking.) That’s too bad. Honestly, the whole Jawbone website seems so focused on weight loss that I’m a little grossed out by their approach.

Garmin Vívosmart
  • Features: Waterproof! Inactivity alarm! Nice display! A find-your-phone utility.
  • Maybe features/Maybe bugs: It wants to also be a smart watch. You apparently have to tap to turn the display on. Goals seem to be set for you automatically, and it doesn’t penalize you for not meeting your goal.
  • Bugs: It’s a wrist band.

I don’t own one of these, and I don’t know anyone who does. I had previously written Garmin off entirely, because their prices were so much higher. But at $150, the Vívosmart is still within the ballpark, with a few added features. (I didn’t think I wanted a smart watch, but the idea of seeing text messages on my wrist kind of appeals to me, especially as I consider getting an iPhone that won’t fit in my pocket.)

Since it has the two features I most want (waterproofness and an inactivity alert), plus one you think I’m joking about but I’m not (purple highlights), it’s a strong contender. It appears that you cannot set the duration for the inactivity alert; it’s always an hour. I could live with that, though.

Some people might not like that it (reportedly) sets your goals for you, but for me this is a major bonus. My goal is to be constantly improving, not to hit some arbitrary mark right off the bat. Plus, it will create training plans for you, based on goals (which seem to take the form of, for instance, “I want to complete [event—triathlon, 5k, whatever] on [date]”), which is a feature that interests me. Overall, the focus on activity, rather than weight, seems really positive.

People don’t seem impressed by the Garmin app, in reviews, but all of the screen shots I’ve seen make it look at least as good as Misfit’s app, with the added bonus of website availability. It doesn’t seem to integrate with IFTTT or with anything similar to FitRPG, so I’d say its app integration is also probably comparable to Misfit’s. (Slightly better, for my purposes, since I’ve used MyFitnessPal in the past, and the integration between that and Garmin seems to be really tight.)


I may not buy another fitness tracker; like I said, there are good reasons not to do so. If I do, it’s almost certainly going to be the Vívosmart, or something similar to it, because I really like that feature set. It’s still not perfect, but I think it would fit into my life pretty well. If you have one, please leave me a comment or tweet at me to let me know how you like it! :)


Ed Summers: Human Nature and Conduct

Fri, 2015-04-24 20:29

Human Nature and Conduct by John Dewey
My rating: 5 of 5 stars

This book came recommended by Steven Jackson when he visited UMD last year. I’m a fan of Jackson’s work on repair, and was curious about how his ideas connected back to Dewey’s Human Nature and Conduct.

I’ve been slowly reading it, savoring each chapter on my bus rides to work since then. It’s a lovely & wise book. Some of the language puts you back into the 1920s, but the ideas are fresh and still so relevant. I’m not going to try to summarize it here. You may have noticed I’ve posted some quotes here. Let’s just say it is a very hopeful book and provides a very clear and yet generous view of the human enterprise.

I don’t know if I was imagining it, but I seemed to see a lot of parallels between it and some reading I’m doing about Buddhism. I noticed over at Wikipedia that Dewey spent some time in China and Japan just prior to delivering these lectures. So maybe it’s not so far-fetched a connection.

I checked it out of the library, but I need to buy a copy of my own so I can re-read it. You can find a copy at Internet Archive for your ebook reader too.

Ed Summers: Method and Materials

Fri, 2015-04-24 20:26

Now it is a wholesome thing for any one to be made aware that thoughtless, self-centered action on his part exposes him to the indignation and dislike of others. There is no one who can be safely trusted to be exempt from immediate reactions of criticism, and there are few who do not need to be braced by occasional expressions of approval. But these influences are immensely overdone in comparison with the assistance that might be given by the influence of social judgments which operate without accompaniments of praise and blame; which enable an individual to see for himself what he is doing, and which put him in command of a method of analyzing the obscure and usually unavowed forces which move him to act. We need a permeation of judgments on conduct by the method and materials of a science of human nature. Without such enlightenment even the best-intentioned attempts at the moral guidance and improvement of others often eventuate in tragedies of misunderstanding and division, as is so often seen in the relations of parents and children.

John Dewey in Human Nature and Conduct (p. 321)

Ed Summers: Something Horrible

Fri, 2015-04-24 20:18

There is something horrible, something that makes one fear for civilization, in denunciations of class-differences and class struggles which proceed from a class in power, one that is seizing every means, even to a monopoly of moral ideals, to carry on its struggle for class-power.

John Dewey in Human Nature and Conduct (p. 301)

Ed Summers: Energies

Fri, 2015-04-24 20:14

Human nature exists and operates in an environment. And it is not “in” that environment as coins are in a box, but as a plant is in the sunlight and soil. It is of them, continuous with their energies, dependent upon their support, capable of increase only as it utilizes them, and as it gradually rebuilds from their crude indifference an environment genially civilized.

John Dewey in Human Nature and Conduct (p. 296)

LITA: Build a Circuit & Learn to Program an Arduino in a Silicon Valley Hackerspace

Fri, 2015-04-24 16:05
Panel of Inventors & Librarians Working Together for a More Creative Tomorrow
A LITA Preconference at 2015 ALA Annual

Register online for the ALA Annual Conference and add a LITA Preconference

Friday, June 26, 2015, 8:30am – 4:00pm

Computers have changed our lives, but what do we really know about them? Library/information centers can provide answers. In this innovative, experiential session hosted at a hackerspace, attendees will learn practical skills such as soldering and the basics of Arduino programming, and will be able to create and adapt programs for their own needs. A panel of Silicon Valley insiders and librarians will share how their institutions’ programs on programming contribute to analytical thinking.

This experiential session is for anyone, with or without experience, who is curious about the Do-It-Yourself (DIY) / Do-It-Together (DIT) movement, and how it can help libraries. Come join LITA at Noisebridge, for a day at one of the first US hackerspaces. In the morning, attendees will learn to solder their own limited edition LITA project, learn the basics of electronics, and leave not only with the projects they made and inspiration to experiment on their own, but also with ideas for implementation of hackerspaces in their libraries.

There will be an afternoon panel led by a Silicon Valley inventor and library colleagues from School, Public and University Libraries that will provide different perspectives on how a hackerspace and its programming can provide a catalyst for lifelong learning in students/patrons, and how libraries can remain relevant and supportive far into the future. The discussion will include helpful hints to decide what type of space and tools are right for your institution. Finally, there will be a choice of experiential small group projects along with tours of the space.

An additional materials fee of $25, payable at the door, may apply for this session

Additional resources

A Librarian’s Guide to Makerspaces
Mitch Altman TEDxBrussels talk
Tod Colegrove TEDxReno talk
Castilleja School Bourn Idea Lab
The Maker Jawn Initiative at the Free Library of Philadelphia


  • Mitch Altman, Co-founder of Noisebridge, President and CEO of Cornfield Electronics
  • Tod Colegrove, Head of DeLaMare Science & Engineering Library, University of Nevada – Reno
  • Angi Chau, Director of Bourn Idea Lab, Castilleja School (Palo Alto,CA)
  • Brandon (BK) Klevence, Maker Mentor and Prototyper, The Maker Jawn Initiative (Philadelphia, PA)
  • Tara M Radniecki, Engineering Librarian at DeLaMare Science & Engineering Library at the University of Nevada, Reno
  • Daniel Verbit, MLIS Candidate, University of Alabama


The fun will take place at the well-known Noisebridge hackerspace, which is accessible using the BART system.


SparkFun is an online retail store that sells the bits and pieces to make your electronics projects possible. Whether it’s a robot that can cook your breakfast or a GPS cat tracking device, our products and resources are designed to make the world of electronics more accessible. Learn more at



  • LITA Member $235 (coupon code: LITA2015)
  • ALA Member $350
  • Non-Member $380


To register for any of these events, you can include them with your initial conference registration or add them later using the unique link in your email confirmation. If you don’t have your registration confirmation handy, you can request a copy by emailing You also have the option of registering for a preconference only. To receive the LITA member pricing during the registration process on the Personal Information page enter the discount promotional code: LITA2015

Register online for the ALA Annual Conference and add a LITA Preconference
Call ALA Registration at 1-800-974-3084
Onsite registration will also be accepted in San Francisco.

Questions or Comments?

For all other questions or comments related to the course, contact LITA at (312) 280-4269 or Mark Beatty,

FOSS4Lib Upcoming Events: Fedora Committers Meeting at Open Repositories

Fri, 2015-04-24 13:48
Date: Monday, June 8, 2015 - 09:00 to 17:00
Supports: Fedora Repository

Last updated April 24, 2015. Created by Peter Murray on April 24, 2015.
Open Repositories represents an annual opportunity to bring current and prospective Fedora developers together to review, discuss, and share: current initiatives; upcoming roadmap; design issues; collaboration opportunities, etc. Although this meeting is open to community developers interested in joining the Fedora effort, this is a working/planning session and not a Fedora tutorial.

Agenda on the DuraSpace Wiki.

LITA: Tips for Managing Electronic Resources

Fri, 2015-04-24 13:00

Credit: Pixabay user Geralt, CC0 Public Domain

Last fall, I unexpectedly took on the electronic resources management (ERM) role at my university. Consequently, I had to teach myself–on the fly–how to manage 130+ electronic resources, along with a budget of several hundred thousand dollars. My initial six months focused on finances, licensing, and workflows rather than access, discoverability, or other key issues. So here are some life-saving tips for all you new e-librarians, because I know you didn’t learn this in library school!

Let’s start, as always, with the users.

Evaluate user needs.

Are you new at your job? Then begin by conducting a needs assessment, formal or informal. Check the programs and course offerings to make sure they still align with the e-resources for which you pay. Seek out faculty, colleagues, and students to get a sense of what resources they assign, use, or see used. Pull usage statistics from each database–and be sure to cross-reference this vendor data with web analytics because vendor data can be self-serving to the point of fictitious. Do your users use each resource enough to justify its cost? And do they really require the level of access you’re paying for? If not, can the resources be marketed and usage increased? And if there’s just no market, can those funds be reallocated and more relevant sources acquired?
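One rough way to put a number on "enough to justify its cost" is cost per use: annual price divided by annual uses. The Python sketch below is only an illustration under stated assumptions. The resource names, prices, usage counts, and the $10 ceiling are all made up, and what counts as a "use" (sessions, searches, full-text downloads) is a decision you would have to make for your own library.

```python
def cost_per_use(annual_cost, annual_uses):
    """Annual cost divided by annual uses; None when there was no use at all."""
    if annual_uses == 0:
        return None
    return annual_cost / annual_uses


def flag_for_review(resources, ceiling):
    """Return names of resources whose cost per use exceeds the ceiling.

    resources: list of (name, annual_cost, annual_uses) tuples.
    ceiling: the highest cost per use you consider acceptable (an assumption
        you set yourself; there is no universal figure).
    """
    flagged = []
    for name, annual_cost, annual_uses in resources:
        cpu = cost_per_use(annual_cost, annual_uses)
        if cpu is None or cpu > ceiling:
            flagged.append(name)
    return flagged


# Hypothetical figures, for illustration only.
sample = [
    ("Database A", 12000.0, 6000),  # 2.00 per use
    ("Database B", 8000.0, 200),    # 40.00 per use
    ("Database C", 3000.0, 0),      # never used
]
needs_review = flag_for_review(sample, ceiling=10.0)
```

A flagged resource is a prompt for the follow-up questions above (market it, renegotiate it, or reallocate the funds), not an automatic cancellation.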

Be budget-conscious.

Budgets are a huge consideration for any e-resources manager given that libraries are constantly absorbing budget cuts while vendors raise prices 3-5% a year, on average. Can your library afford to provide the resources it currently offers? More importantly, can the funds be used better? Can you save ten thousand dollars on one contract simply by renegotiating the number of concurrent users so as to reflect enrollment? Can you review your databases for duplication of content? Can you tap free, open access resources to plug content gaps or replace proprietary platforms? Can you talk to vendors and peruse old records to check for any unused credits lying around? And above all, how can you make the case for spending more money on electronic resources?
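To see why those 3-5% annual increases matter against a flat or shrinking budget, it helps to compound them over a few years. This is a minimal Python sketch; the $100,000 starting figure and the five-year horizon are assumptions chosen only to make the arithmetic easy to follow.

```python
def projected_cost(current_cost, annual_increase, years):
    """Compound a yearly price increase: cost * (1 + rate) ** years."""
    return current_cost * (1 + annual_increase) ** years


# Hypothetical budget line of $100,000, projected five years out.
low = projected_cost(100_000, 0.03, 5)   # roughly 115,900
high = projected_cost(100_000, 0.05, 5)  # roughly 127,600

# The gap between the projection and a flat budget is what has to be
# found through cuts, renegotiation, or new funding.
shortfall_low = low - 100_000
shortfall_high = high - 100_000
```

Under these assumed numbers, the same set of subscriptions costs roughly 16% to 28% more after five years without adding a single new resource.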

Negotiate terms.

Often you don’t actually need to throw more money at e-resources to get the best value. Most vendor reps are authorized to reduce off-the-shelf pricing by 20-25% without consulting their boss, and if you push hard enough–especially with smaller or longstanding service providers with a stake in the clientele–you can save potentially huge sums that can then be reallocated to purchase more databases or ebooks. And even if you don’t get a big discount, at least you can get special add-ons or other privileges. But you have to be willing to negotiate and drive a hard bargain. Don’t be mean, because vendors are people too–usually very nice people; I’m Facebook friends with several. But we have to remember that our first duty is to get the best value for our taxpayers or students, not to “be nice” to the private sector and hand them all our money without demur.

Take advantage of add-ons.

Even if you aren’t a tough negotiator, you can derive maximum benefit from your subscriptions by exploring untapped services and add-ons most vendors provide. Want to market an e-resource? Check with the vendor; chances are that they can provide free web-based training and marketing materials. Annoyed that a database doesn’t integrate with your discovery layer? Talk to the vendor’s tech team; chances are that you can work something out. And major subscriptions often come with package deals and free add-ons. For example, libraries that use OCLC’s WorldShare as their ILS may be surprised to discover that ContentDM comes bundled with a WMS subscription.

Think consortia.

Speaking of packages, remember the value of group or consortial deals! We save 15% on our EBSCO databases through our free membership in an independent college consortium. Scan your environment to see if there are any great consortial arrangements out there. If not, consider initiating one with area libraries that have similar user populations and information needs. Talk to your state association and regional network or cooperative as well as to folks at your university. That said, be sure to evaluate critically the e-resources and terms of each consortial deal–beware of paying for stuff you don’t need, let alone paying twice for databases you already have.

Learn to love documentation.

Document everything. Seriously. When I started my position, there was no systematic workflow or documentation in place, older invoices were packed loose into folders, and invoices would trickle in randomly through snail mail. I created budget spreadsheets listing databases, vendors, pricing, and period of service; digitized and classified a year’s worth of records; and converted the system to e-invoicing. I also created a master password list for all administrative logins and a contact list for the reps and tech support for each e-resource. Not only does this streamline your workflows and preempt internal audits, but it also enables you to document what e-resources you have, how much money you have saved, and how much money you can spend before the new fiscal year.

Read the contracts.

Read licensing agreements and contracts before signing. PLEAZ. Words are negotiable, same as prices. Can you tweak the wording to soften your legal obligations and remove financial penalties for violating the terms of use? Can you demand a VPAT documenting the e-resource’s accessibility? Can you add a clause excluding the library from liability if a user or advocacy group sues because disabled users cannot access the e-resource? Can you give the library a quick out clause in cases of multiyear contracts? Can you get reimbursed if the e-resource goes offline for an extended period? . . . In short, can you modify the standard contract? In all cases, the answer is yes. You can.

Ensure legal compliance.

Credit: Pixabay user Geralt, CC0 Public Domain

Be sure your institution is complying with the terms of the contract. You don’t want to get sued or have your access terminated without notice because people didn’t read the contract carefully enough and gave two hundred students access to an e-resource budgeted for only two users.

Closing thought.

Be that person who interrogates assumptions, saves the library money, and better serves staff and end users. If something was done that way for years, chances are it can be done better.

Do you manage electronic resources? Have you done so in the past? Please share your tips below!

ACRL TechConnect: Best Practices for Hacking Third-Party Sites

Fri, 2015-04-24 11:52

While customizing vendor web services is not the most glamorous task, it’s something almost every library does. Whether we have full access to a templating system, as with LibGuides 2, or merely the ability to insert an HTML header or footer, as on many database platforms, we are tested by platform limitations and a desire to make our organization’s fractured web presence cohesive and usable.

What does customizing a vendor site look like? Let’s look at one example before going into best practices. Many libraries subscribe to EBSCO databases, which have a corresponding administrative side “EBSCOadmin”. Electronic Resources and Web Librarians commonly have credentials for these admin sites. When we sign into EBSCOadmin, there are numerous configuration options for our database subscriptions, including a “branding” tab under the “Customize Services” section.

While EBSCO’s branding options include specifying the primary and secondary colors of their databases, there’s also a “bottom branding” section which allows us to inject custom HTML. Branding colors can be important, but this post focuses on effectively injecting markup onto vendor web pages. The steps for doing so in EBSCOadmin are numerous and not informative for any other system, but the point is that when given custom HTML access one can make many modifications, from inserting text on the page, to an entirely new stylesheet, to modifying user interface behavior with JavaScript. Below, I’ve turned footer links orange and written a message to my browser’s JavaScript console using the custom HTML options in EBSCOadmin.

These opportunities for customization come in many flavors. We might have access only to a section of HTML in the header or footer of a page. We might be customizing the appearance of our link resolver, subscription databases, or catalog. Regardless, there are a few best practices which can aid us in making modifications that are effective.

General Best Practices

What happens when vendors don’t put headings in HTML elements:

— Matthew Reidsma (@mreidsma) April 21, 2015

Ditch best practices when they become obstacles

It’s too tempting; I have to start this post about best practices by noting their inherent limitations. When we’re working with a site designed by someone else, the quality of our own code is restricted by decisions they made for unknown reasons. Commonly-spouted wisdom—reduce HTTP requests! don’t use eval! ID selectors should be avoided!—may be unusable or even counter-productive.

To note but one shining example: CSS specificity. If you’ve worked long enough with CSS then you know that it’s easy to back yourself into a corner by using overly powerful selectors like IDs or—the horror—inline style attributes. These methods of applying CSS have high specificity, which means that CSS written later in a stylesheet or loaded later in the HTML document might not override them as anticipated, a seeming contradiction in the “cascade” part of CSS. The hydrogen bomb of specificity is the !important modifier which automatically overrides anything but another !important later in the page’s styles.

So it’s best practice to avoid inline style attributes, ID selectors, and especially !important. Except when hacking on vendor sites, where they’re often necessary. What if we need to override an inline style? Suddenly, !important looks necessary. So let’s not get caught up following rules written for people in greener pastures; we’re already in the swamp, and throwing some mud around may be called for.
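As a rough sketch of the inline-style case, the snippet below builds a rule in which every declaration carries !important and, when run in a browser, appends it to the page in a <style> element. The helper names (buildOverrideRule, injectCss) and the .vendor-banner selector are hypothetical, made up for illustration rather than taken from any vendor platform.

```javascript
// Hypothetical helper: turn a map of declarations into one CSS rule in which
// every declaration carries !important, so it can beat an inline style attribute.
function buildOverrideRule(selector, declarations) {
    var body = Object.keys(declarations).map(function (property) {
        return '    ' + property + ': ' + declarations[property] + ' !important;';
    }).join('\n');
    return selector + ' {\n' + body + '\n}';
}

// Hypothetical helper: append the rule to the page in a <style> element.
// Outside a browser there is no document, so it does nothing and returns false.
function injectCss(cssText) {
    if (typeof document === 'undefined') {
        return false;
    }
    var style = document.createElement('style');
    style.textContent = cssText;
    document.head.appendChild(style);
    return true;
}

// ".vendor-banner" is a made-up selector standing in for an element that the
// vendor ships with an inline style attribute.
var rule = buildOverrideRule('.vendor-banner', { color: '#333', 'font-size': '1em' });
injectCss(rule);
```

Keeping the override in one small, readable helper also makes it easy for other staff to see exactly which vendor styles are being trumped, and why.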

There are dozens of other examples that come to mind. For instance, in serving content from a vendor site where we have no server-side control, we may be forced to violate web performance best practices such as sending assets with caching headers and utilizing compression. While minifying code is another performance best practice, for small customizations it adds little but obfuscates our work for other staff. Keeping a small script or style tweak human-readable might be more prudent. Overall, understanding why certain practices are recommended, and when it’s appropriate to sacrifice them, can aid our decision-making.

Test. Test. Test. When you’re done testing, test again

Whenever we’re creating an experience on the web it’s good to test. To test with Chrome, with Firefox, with Internet Explorer. To test on an iPhone, a Galaxy S4, a Chromebook. To test on our university’s wired network, on wireless, on 3G. Our users are vast; they contain multitudes. We try to represent their experiences as best as possible in the testing environment, knowing that we won’t emulate every possibility.

Testing is important, sure. But when hacking a third party site, the variance is more than doubled. The vendor has likely done their own testing. They’ve likely introduced their own hacks that work around issues with specific browsers, devices, or connectivity conditions. They may be using server-side device detection to send out subtly different versions of the site to different users; they may not offer the same functionality in all situations. All of these circumstances mean that testing is vitally important and unending. We will never cover enough ground to be sure our hacks are foolproof, but we’d better try or they’ll not work at all.

Analytics and error reporting

Speaking of testing, how will we know when something goes wrong? Surely, our users will send us a detailed error report, complete with screenshots and the full schematics of every piece of hardware and software involved. After all, they do not have lives or obligations of their own. They exist merely to make our code more error-proof.

If, however, for some odd reason someone does not report an error, we may still want to know that one occurred. It’s good to set up unobtrusive analytics that record errors or other measures of interaction. Did we revamp a form to add additional validation? Try tracking what proportion of visitors successfully submit the form, how often the validation is violated, how often users submit invalid data multiple times in a row, and how often our code encounters an error. There are some intriguing client-side error reporting services out there that can catch JavaScript errors and detail them for our perusal later. But even a little work with events in Google Analytics can log errors, successes, and everything in between. With the mere information that problems are occurring, we may be able to identify patterns, focus our testing, and ultimately improve our customizations and end-user experience.
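One way to picture unobtrusive error logging is a small handler attached to window.onerror that forwards a compact description of each error to a reporting function. This is only a sketch: createErrorReporter and the shape of the report object are hypothetical, and here the reports are simply collected in an array; on a real page the send callback might wrap whatever analytics event call the site already uses.

```javascript
// Hypothetical factory: returns a handler suitable for window.onerror that
// forwards a compact description of each error to `send`.
function createErrorReporter(send) {
    return function (message, source, line) {
        send({
            category: 'JavaScript error',
            action: String(message),
            label: (source || 'unknown source') + ':' + (line || 0)
        });
        // Returning false lets the browser's default error handling continue.
        return false;
    };
}

// Collect reports in an array for this sketch (an assumption; swap in a real
// analytics or logging call as appropriate).
var reports = [];
var reportError = createErrorReporter(function (report) {
    reports.push(report);
});

// Only attach the handler when running in a browser.
if (typeof window !== 'undefined') {
    window.onerror = reportError;
}

// Simulate one error so the shape of a report is visible.
reportError('TypeError: x is undefined', 'https://example.com/custom.js', 12);
```

Passing the reporting function in as a parameter keeps the handler independent of any one analytics service, so it can be pointed somewhere else without touching the customization itself.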

Know when to cut your losses

Some aspects of a vendor site are difficult to customize. I don’t want to say impossible, since one can do an awful lot with only a single <script> tag to work with, but unfeasible. Sometimes it’s best to know when sinking more time and effort into a customization isn’t worth it.

For instance, our repository has a “hierarchy browse” feature which allows us to present filtered subsets of items to users. We often get requests to customize the hierarchies for specific departments or purposes—can we change the default sort, can we hide certain info here but not there, can we use grid instead of list-based results? We probably can, because the hierarchy browse allows us to inject arbitrary custom HTML at the top of each section. But the interface for doing so is a bit clumsy and would need to be repeated everywhere a customization is made, sometimes across dozens of places simply to cover a single department’s work. So while many of these change requests are technically possible, they’re unwise. Updates would be difficult and impossible to automate, virtually ensuring errors are introduced over time as I forget to update one section or make a manual mistake somewhere. Instead, I can focus on customizing the site-wide theme to fix other, potentially larger issues with more maintainable solutions.

A good alternative to tricky and unmaintainable customizations is to submit a feature request to the vendor. Some vendors have specific sites where we can submit ideas for new features and put our support behind others’ ideas. For instance, the Innovative Users Group hosts an annual vote where members can select their most desired enhancement requests. Remember that vendors want to make a better product after all; our feedback is valued. Even if there’s no formal system for submitting feature requests, a simple email to our sales representative or customer support can help.

CSS Best Practices

@mreidsma @phette23 Sounds like a LITA Guide. z-index: 100001 !important; how to customize vendor sites.

— Michael Schofield (@schoeyfield) April 9, 2015

While the above section spoke to general advice, CSS and JavaScript have a few specific peculiarities to keep in mind while working within a hostile host environment.

Don’t write brittle, overly-specific selectors

There are two unifying characteristics of hacking on third-party sites: 1) we’re unfamiliar with the underlying logic of why the site is constructed in a particular way and 2) everything is subject to change without notice. Both of these make targeting HTML elements, whether with CSS or JavaScript, challenging. We want our selectors to be as flexible as possible, to withstand as much change as possible without breaking. Say we have the following list of helpful tools in a sidebar:

<div id="tools">
    <ul>
        <li><span class="icon icon-hat"></span><a href="#">Email a Librarian</a></li>
        <li><span class="icon icon-turtle"></span><a href="#">Citations</a></li>
        <li><span class="icon icon-unicorn"></span><a href="#">Catalog</a></li>
    </ul>
</div>

We can modify the icons listed with a selector like #tools > ul > li > span.icon.icon-hat. But many small changes could break this style: a wrapper layer injected in between the #tools div and the unordered list, a switch from unordered to ordered list, moving from <span>s for icons to another tag such as <i>. Instead, a selector like #tools .icon.icon-hat assumes that little will stay the same; it thinks there’ll be icons inside the #tools section, but doesn’t care about anything in between. Some assumptions have to stay, that’s the nature of customizing someone else’s site, but it’s pretty safe to bet on the icon classes to remain.

In general, sibling and child selectors make for poor choices for vendor sites. We’re suddenly relying not just on tags, classes, and IDs to stay the same, but also the particular order that elements appear in. I’d also argue that pseudo-selectors like :first-child, :last-child, and :nth-child() are dangerous for the same reason.

Avoid positioning if possible

Positioning and layout can be tricky to get right on a vendor site. Unless we’re confident in our tests and have covered all the edge cases, try to avoid properties like position and float. In my experience, many poorly structured vendor sites employ ad hoc box-sizing measurements, float-based layout, and lack a grid system. These are all a recipe for weird interconnections between disparate parts—we try to give a call-out box a bit more padding and end up sending the secondary navigation flying a thousand pixels to the right offscreen.

display: none is your friend

display: none is easily my most frequently used CSS property when I customize vendor sites. Can’t turn off a feature in the admin options? Hide it from the interface entirely. A particular feature is broken on mobile? Hide it. A feature is of niche appeal and adds more clutter than it’s worth? Hide it. The footer? Yeah, it’s a useless advertisement, let’s get rid of it. display: none is great but remember it does affect a site’s layout; the hidden element will collapse and no longer take up space, so be careful when hiding structural elements that are presented as menus or columns.

Attribute selectors are excellent

Attribute selectors, which enable us to target an element by the value of any of its HTML attributes, are incredibly powerful. They aren’t very common, so here’s a quick refresher on what they look like. Say we have the following HTML element:

<a href="" title="the best site, seriously" target="_blank">

This is an anchor tag with three attributes: href, title, and target. Attribute selectors allow us to target an element by whether it has an attribute or an attribute with a particular value, like so:

/* applies to <a> tags with a "target" attribute */
a[target] {
    background: red;
}

/* applies to <a> tags with an "href" that begin with "http://"
   this is a great way to style links pointed at external websites
   or one particular external website! */
a[href^="http://"] {
    cursor: help;
}

/* applies to <a> tags with the text "best" anywhere in their "title" attribute */
a[title*="best"] {
    font-variant: small-caps;
}

Why is this useful among the many ways we can select elements in CSS? Vendor sites often aren’t anticipating all the customizations we want to employ; they may not provide handy class and ID styling hooks where we need them. Or, as noted above, the structure of the document may be subject to change either over time or across different pieces of the site. Attribute selectors can help mitigate this by making style bindings more explicit. Instead of saying “change the background icon for some random span inside a list inside a div”, we can say “change the background icon for the link that points at our citation management tool”.

If that’s unclear, let me give another example from our institutional repository. While we have the ability to list custom links in the main left-hand navigation of our site, we cannot control the icons that appear with them. What’s worse, there are virtually no styling hooks available; we have an unadorned anchor tag to work with. But that turns out to be plenty for a selector of the form a[href$=hierarchy] to target all <a>s with an href ending in “hierarchy”; suddenly we can define icon styles based on the URLs we’re pointing at, which is exactly what we want to base them on anyways.

Attribute selectors are brittle in their own ways—when our URLs change, these icons will break. But they’re a handy tool to have.
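The same attribute selectors also work from JavaScript through querySelectorAll, which can help when we need to attach a class or behavior rather than only a style. A minimal sketch, assuming a hypothetical helper name (tagLinksEndingWith) and a made-up class name (icon-hierarchy):

```javascript
// Hypothetical helper: add a class to every link whose href ends with `suffix`.
// `root` is anything that offers querySelectorAll, normally `document`.
function tagLinksEndingWith(root, suffix, className) {
    var links = root.querySelectorAll('a[href$="' + suffix + '"]');
    for (var i = 0; i < links.length; i++) {
        links[i].classList.add(className);
    }
    // Report how many links were tagged, which is handy for logging.
    return links.length;
}

// Only touch the page when running in a browser.
if (typeof document !== 'undefined') {
    tagLinksEndingWith(document, 'hierarchy', 'icon-hierarchy');
}
```

Because the helper only assumes that some links end in a given string, it makes no bets about the surrounding markup, in keeping with the advice on flexible selectors.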

JavaScript Best Practices

Avoid the global scope

JavaScript has a notorious problem with global variables. By default, all variables lacking the var keyword are made global. Furthermore, variables outside the scope of any function will also be global. Global variables are considered harmful because they too easily allow unrelated pieces of code to interact; when everything’s sharing the same namespace, the chance that common names like i for index or count are used in two conflicting contexts increases greatly.

To avoid polluting the global scope with our own code, we wrap our entire script customizations in an immediately-invoked function expression (IIFE):

(function() {
    // do stuff here
}())

Wrapping our code in this hideous-looking construction gives it its own scope, so we can define variables without fear of overwriting ones in the global scope. As a bonus, our code still has access to global variables like window and navigator. However, global variables defined by the vendor site itself are best avoided; it is possible they will change or are subject to strange conditions that we can’t determine. Again, the fewer assumptions our code makes about how the vendor’s site works, the more resilient it will be.
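To make the scoping concrete, here is a small sketch in which a top-level variable stands in for a global that the vendor site defines (an assumption for illustration). Our own code reuses the same name inside the function wrapper without disturbing it.

```javascript
// Stands in for a global variable defined by the vendor site (hypothetical).
var count = 'set by the vendor';

(function () {
    'use strict';
    // This `count` is local to the function and does not touch the outer one.
    var count = 0;
    for (var i = 0; i < 3; i++) {
        count += 1;
    }
}());

// The outer variable is unchanged and the loop index never leaked out.
var countAfter = count;
var indexLeaked = typeof i !== 'undefined';
```

The 'use strict' line is an optional extra: in strict mode, assigning to a variable that was never declared throws an error instead of silently creating a global.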

Avoid calling vendor-provided functions

Oftentimes the vendor site itself will put important functions in the global scope, functions like submitForm or validate where their intention seems quite obvious. We may even be able to reverse engineer their code a bit, determining what parameters we should pass to these functions. But we must not succumb to the temptation to actually reference their code within our own!

Even if we have a decent handle on the vendor’s current code, it is far too subject to change. Instead, we should seek to add or modify site functionality in a more macro-like way; instead of calling vendor functions in our code, we can automate interactions with the user interface. For instance, say the “save” button is in an inconvenient place on a form and has the following code:

<button type="submit" class="btn btn-primary" onclick="submitForm(0)">Save</button>

We can see that the button saves the form by calling the submitForm function when it’s clicked with a value of 0. Maybe we even figure out that 0 means “no errors” whereas 1 means “error”. (True story: I reverse engineered a vendor form where this appeared to be the case.) So we could create another button somewhere which calls this same submitForm function. But many changes could break our code: the meaning of the “0” could change, the function name could change, or something else could happen when the save button is clicked that’s not evident in the markup. Instead, we can have our new button trigger the click event on the original save button exactly as a user interacting with the site would. In this way, our new save button should emulate exactly the behavior of the old one through many types of changes.
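A minimal sketch of that approach follows. The helper name mirrorClick and the #our-save-button selector are hypothetical; the selector for the original button is only a guess based on the markup shown above.

```javascript
// Hypothetical helper: make `proxyButton` behave like `originalButton` by
// clicking the original, rather than calling the vendor's submitForm directly.
function mirrorClick(proxyButton, originalButton) {
    proxyButton.addEventListener('click', function (event) {
        // Stop the proxy from doing anything of its own, then hand off.
        event.preventDefault();
        originalButton.click();
    });
}

// Only wire things up when running in a browser.
if (typeof document !== 'undefined') {
    // Both selectors are assumptions about the page, not from a real vendor site.
    var original = document.querySelector('button.btn-primary[type="submit"]');
    var proxy = document.querySelector('#our-save-button');
    if (original && proxy) {
        mirrorClick(proxy, original);
    }
}
```

Whatever the vendor attaches to the original button, whether an onclick attribute or listeners added elsewhere, runs as it would for a real click, so our code never needs to know what submitForm or its argument means.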

{{Insert Your Best Practices Here}}

Web-savvy librarians of the world, what are the practices you stick to when modifying your LibGuides, catalog, discovery layer, databases, etc.? It’s actually been a while since I did customization outside of my college’s IR, so the ideas in this post are more opinion than practice. If you have your own techniques—or disagree with the ones in this post!—we’d love to hear about it in the comments.

DuraSpace News: Open Repository Welcomes New Client: South African Medical Research Council

Fri, 2015-04-24 00:00

By James Evans, Open Repository  Open Repository is delighted to announce another new client, the South African Medical Research Council (SAMRC). They are Open Repository’s first client in sub-Saharan Africa. The platform now operates repositories on behalf of a growing client base in six continents.

DuraSpace News: DSquare Technology News: DSpace 4 and 5 Now Available in Hindi

Fri, 2015-04-24 00:00

DSquare Technologies is focused on providing turnkey solution in the field of Enterprise Content Management.  Recently the National Institute of Immunology, New Delhi (NII) selected DSpace for hosting its institutional repository, which will contain contents like books, theses, annual reports, research reports and more.  Users will be able to access these contents based on nature of contents e.g.

pinboard: Southeast - Code4Lib

Thu, 2015-04-23 15:28

Harvard Library Innovation Lab: Link roundup April 23, 2015

Thu, 2015-04-23 15:24

This is the good stuff.


then I went here

How Dalziel and Pow Realized This Awesome Interactive Touch Wall – Core77




John Harvard ‘speaks’ | Harvard Gazette

Harvard is animating the famous John Harvard Statue


HTTP search. Maybe? Searching is so dang common.

DPLA: DPLAfest 2015: That’s a Wrap!

Thu, 2015-04-23 14:50

DPLAfest 2015 was one for the history books! Bringing together more than 300 people from across the country (and world!), this year’s DPLAfest was two days’ worth of excellent conversations, workshops, networking, hacking, and more. Missed the action, or just looking for a one-stop summary of the event? Look no further: this post contains all media, outputs, and other materials associated with the second annual DPLAfest in Indianapolis.


On the second anniversary of the Digital Public Library of America’s launch, DPLA announced a number of new partnerships, initiatives, and milestones that highlight its rapid growth, and prepare it to have an even larger impact in the years ahead. At DPLAfest 2015 in Indianapolis, hundreds of people from DPLA’s expanding community gathered to discuss DPLA’s present and future. Announcements included:

  • Over 10 Million Items from 1,600 Contributing Institutions
  • New Hub Partnerships
  • PBS-DPLA Partnership
  • Learning Registry Collaboration
  • Sloan Foundation-funded Work on Ebooks
  • Collaboration with HathiTrust for Open Ebooks
  • New Board Chair and New Board Member Announced
  • New IMLS-funded Hydra project
  • DPLA Becomes an Official Hydra Project Partner

To find out more about these announcements and milestones, click here.

To read more about the collaboration with HathiTrust for open ebooks, click here.

Slides and notes

To find presentation slides and notes from DPLAfest 2015 sessions, visit the online agenda (click on each session to find attached slides and links to notes, where available).


Historic Indianapolis Tour (via Historypin)

Follow the Digital Public Library of America channel on Historypin to take our tour of historic sites in downtown Indianapolis, featuring some great images from the collection! Also, make sure to download the Historypin app to access the tour on-the-go. Inspired to make your own Historypin tour? Share it with us!


The Digital Public Library of America wishes to thank its generous DPLAfest Sponsors:

  • The Alfred P. Sloan Foundation
  • Anonymous Donor
  • Bibliolabs
  • Central Indiana Community Foundation (CICF)
  • Digital Divide Data
  • Digital Library Federation
  • Digital Library Systems Group at Image Access
  • OCLC

DPLA also wishes to thank its gracious hosts:

  • Indianapolis Public Library
  • Indiana State Library
  • Indiana Historical Society
  • IUPUI University Library

Host DPLAfest 2016

If your organization is interested in hosting DPLAfest 2016, please let us know! We will put out a formal call for proposals in late April or early May.



Peter Murray: Thursday Threads: Fake Social Media, Netflix is Huge, Secret TPP is Bad

Thu, 2015-04-23 10:55

In this week’s Thursday Threads we look at the rise of fake social media influence, how a young media company (Netflix) is now bigger than an old media company (CBS), and a reminder of how secrecy in constructing trade agreements is a bad idea.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to Pinboard are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

Buying Social Media Influence

Click farms jeopardize the existential foundation of social media: the idea that the interactions on it are between real people. Just as importantly, they undermine the assumption that advertisers can use the medium to efficiently reach real people who will shell out real money. More than $16 billion was spent worldwide on social media advertising in 2014; this money is the primary revenue for social media companies. If social media is no longer made up of people, what is it?

The Bot Bubble: How Click Farms Have Inflated Social Media Currency, by Doug Bock Clark, New Republic, 20-Apr-2015

Think that all that happens on the social networks is real? You may think differently after reading this article about the business of buying follows, likes, and mentions. How to win friends and influence people in the 21st century? Buy in bulk. (Is that too cynical?)

Netflix is Big. Really Big.

In a letter to investors released on Wednesday, Netflix announced that by the end of March, it had reached a staggering 40 million subscriptions in the U.S. That means there’s a Netflix subscription for more than a third of the households in the United States — 115,610,216, according to the U.S. Census. Which is pretty insane. In the same letter, Netflix announced it had reached more than 20 million international subscribers as well, bringing the total to about 60 million.

Netflix Now Has One Subscriber For Every Three Households In America, by Brendan Klinkenberg, Buzzfeed News, 15-Apr-2015

Netflix shares are soaring after another outstanding quarter. And as of right now, that’s pushed the market value of the disruptive streaming TV company above CBS Corp, which, by most measures, operates the highest rating broadcast TV network in the US.

Netflix is now bigger than CBS, by John McDuling, Quartz, 16-Apr-2015

These two articles about the size of Netflix came out back-to-back. I find both of them astounding. Sure, I believe that Netflix’s share price, and therefore its market capitalization, is pushed up in an internet bubble. But one in three households in America is a subscriber? Really? I wonder what the breakdown by age demographic is. If media stereotypes are to be believed, it skews highly towards young cable-cutting households.

Secrecy Surrounding Trans-Pacific Partnership

When WikiLeaks recently released a chapter of the Trans-Pacific Partnership Agreement, critics and proponents of the deal resumed wrestling over its complicated contents. But a cover page of the leaked document points to a different problem: It announces that the draft text is classified by the United States government. Even if current negotiations over the trade agreement end with no deal, the draft chapter will still remain classified for four years as national security information. The initial version of an agreement projected by the government to affect millions of Americans will remain a secret until long after meaningful public debate is possible.

National security secrecy may be appropriate to protect us from our enemies; it should not be used to protect our politicians from us. For an administration that paints itself as dedicated to transparency and public input, the insistence on extensive secrecy in trade is disappointing and disingenuous. And the secrecy of trade negotiations does not just hide information from the public. It creates a funnel where powerful interests congregate, absent the checks, balances and necessary hurdles of the democratic process.

Don’t Keep the Trans-Pacific Partnership Talks Secret, op-ed by Margot E. Kaminski, New York Times, 14-Apr-2015

Have you seen what’s in the new TPP trade deal?

Most likely, you haven’t – and don’t bother trying to Google it. The government doesn’t want you to read this massive new trade agreement. It’s top secret.

Why? Here’s the real answer people have given me: “We can’t make this deal public because if the American people saw what was in it, they would be opposed to it.”

You can’t read this, by Elizabeth Warren, 22-Apr-2015

This is bad policy. The intellectual property provisions of it — at least the leaked versions that we have seen — are particularly odious. There should not be fast track authority for a treaty that our elected representatives haven’t seen and haven’t heard from their constituencies about.


HangingTogether: Going, going, gone: The imperative for archiving the web

Thu, 2015-04-23 00:55

We all know that over the past 30+ years the World Wide Web has become an indispensable tool (understatement!) for disseminating information, extending the reputations of organizations and businesses, enabling Betty the Blogger to establish an international reputation, and ruining dinner table debate by providing the answer to every conceivable question. It has caused a sea change in how humans communicate and learn. Some types of content are new, but huge quantities of material once published in print are now issued only in bytes. For example, if you’re a university archivist, you know that yesterday’s endless flood of high-use content such as graduate program brochures, course listings, departmental newsletters, and campus information dried up a decade or more ago. If you’re a public policy librarian, you know that the enormously important “grey literature” once distributed as pamphlets is now mostly available only on the web. Government information? It’s almost all e-only. In addition, the scope of the scholarly record is evolving to embrace new types of content, much of which is also web-only. Without periodic harvesting of the websites that host all this information, the content is gone, gone, gone. In general, we’ve been very slow to respond to this imperative. Failure to adequately preserve the web is at the heart of the Digital Dark Ages.

The Internet Archive’s astonishing Wayback Machine has been archiving the web since the mid-1990s, but its content is far from being complete or reliable, and searching is possible only by URL. In some countries, such as the U.K. and New Zealand, the national library or archives is charged with harvesting the country’s entire web domain, and they struggle to fulfill this charge. In the U.S., some archives and libraries have been harvesting websites for a number of years, but few have been able to do so at scale. Many others have yet to dip their toes in the water. Why do so many of us lack a sense of urgency about preserving all this content? Well, for one thing, web archiving is rife with challenges.

Within the past week Ricky, Dennis, and I hosted two Webex conversations with members of our OCLC Research Library Partnership to surface some of the issues that are top-of-mind for our colleagues. Our objective was to learn whether there are shared problems that make sense for us to work on together to identify community-based solutions. All told, more than sixty people came along for the ride, which immediately suggested that we had touched a nerve. In promoting the sessions, we posited ten broad issues and asked registrants to vote for their top three. The results of this informal poll gave us a good jumping-off point. Master synthesizer Ricky categorized the issues and counted the aggregate votes for each: capture (37), description (41), and use (61). (I confess to having been glad to see use come out on top.)

OK, take a guess … what was the #1 issue? Not surprisingly … metadata guidelines! As with any type of cataloging, no one wants to have to invent the wheel themselves. Guidelines do exist, but they don’t meet the needs of all institutions. #2: Increase access to archived websites. Many sites are archived but are not then made accessible, for a variety of good reasons. #3: Ensure capture of your institution’s own output. If you’re worried about this one, you should be. #4: Measure access to archived websites. Hard to do. Do you have an analytics tool that can ever do what you really want it to?

Other challenges received some votes: getting descriptions of websites into local catalogs and WorldCat, establishing best practices for quality assurance of crawls, collaborating on selection of sites, and increasing discovery through Google and other search engines (we were a tad mystified about why this last one didn’t get more votes). Some folks offered up their own issues, such as capture of file formats other than HTML, providing access in a less siloed way, improving the end-user experience, sustaining a program in the face of minimal resources, and developing convincing use cases.

When we were done, Ricky whipped out a list of her chief off-the-cuff takeaways, to wit:

  • We need strong use cases to convince resource allocators that this work is mission-critical.
  • Let’s collaborate on selection so we don’t duplicate each other’s work.
  • Awareness of archived websites is low across our user communities: let’s fix that.
  • In developing metadata guidelines, we should bridge the differing approaches of the library and archival communities.
  • We need meaningful use metrics.
  • We need to know how users are navigating aggregations of archived sites and what they want to do with the content.
  • Non-HTML file formats are the big capture challenge.

Our Webex conversations were lively and far-ranging. Because we emphasized that we needed experienced practitioners at the table, we learned that even the experts responsible for large-scale harvesting struggle in various ways. Use issues loomed large: no one tried to claim that archived websites are easy to locate, comprehend, or use. Legal issues are sometimes complex depending on the sites being crawled. Much like ill-behaved serials, websites change title, move, split, and disappear without warning. Cataloging at the site or document level isn’t feasible if, like the British Library, you crawl literally millions of sites. Tools for analytics are too simplistic for answering the important questions about use and users.

Collecting, preserving, and providing access to indispensable informational, cultural, and scholarly content has always been our shared mission. The web is where today’s content is. Let’s scale up our response before we lose more decades of human history.

What are your own web archiving challenges? Let us know by submitting a comment below, or get in touch by whatever means you prefer so we can add your voice to the conversation. We’re listening.

About Jackie Dooley

Jackie Dooley leads OCLC Research projects to inform and improve archives and special collections practice. Activities have included in-depth surveys of special collections libraries in the U.S./Canada and the U.K./Ireland; leading the Demystifying Born Digital work agenda; a detailed analysis of the 3 million MARC records in ArchiveGrid; and studying the needs of archival repositories for specialized tools and services. Her professional research interests have centered on the development of standards for cataloging and archival description. She is a past president of the Society of American Archivists and a Fellow of the Society.

DuraSpace News: Asian Development Bank Embraces Open Access

Thu, 2015-04-23 00:00

Since it was founded in 1966, the Asian Development Bank has been the leading organization fighting poverty in the Asian and Pacific region. The organization set its goal to enhance economic collaboration by investing in regional projects. Currently, the Asian Development Bank has 67 members.

DuraSpace News: Chris Wilper Joins @mire

Thu, 2015-04-23 00:00

For people acquainted with digital repositories, conferences or mailing lists, Chris Wilper's name should ring a bell. After being part of the initial Fedora team at Cornell University, Chris was the Fedora tech lead at DuraSpace between 2008 and 2012. We are excited to add Chris’ vast experience with digital repositories and his unique perspective on the grass-roots of the repository community to our team.

DuraSpace News: SIGN UP for Current SHARE (SHared Access Research Ecosystem) News

Thu, 2015-04-23 00:00

Winchester, MA  Interested in following SHARE news and events?