Feed aggregator

OCLC Dev Network: Systems Maintenance on Dec 13

planet code4lib - Fri, 2014-12-12 14:45

Web services that require user level authentication will be down for systems maintenance to the Identity Management system (IDM) for 30 minutes beginning at 3am on Dec 13th. This down time will affect OCLC’s worldwide data centers as follows:

District Dispatch: IMLS announces Sparks! Ignition library grants

planet code4lib - Fri, 2014-12-12 06:26

The Institute of Museum and Library Services (IMLS) is now accepting applications for Sparks! Ignition Grants for Libraries, a small grants program that encourages libraries and archives to prototype and evaluate innovations that result in new tools, products, services, or organizational practices.

Photo by Cushing Library Holy Names University via Flickr

The grants enable grantees to undertake activities that involve risk and require them to share project results–whether they succeed or fail–to provide valuable information to the library field and help improve the ways libraries serve their communities.

Libraries may qualify for $10,000 to $25,000 in small grants, and there are no matching requirements. Projects must begin on October 1, November 1, or December 1, 2015. Learn more about the program guidelines and the funding opportunity. The application deadline is February 2, 2015.

Have questions? IMLS staff members are available by phone and email to discuss general issues relating to the Sparks! Ignition Grants for Libraries program. Library staff members are encouraged to participate in a webinar to learn more about the program, ask questions, and listen to the questions and comments of other participants.

The webinar is scheduled for Tuesday, January 6, 2015, at 4 p.m. ET. See grant program guidelines for additional webinar details.

The post IMLS announces Sparks! Ignition library grants appeared first on District Dispatch.

DuraSpace News: COMMENTS REQUESTED: VIVO Strategic Goals for 2015-2016

planet code4lib - Fri, 2014-12-12 00:00

From Layne Johnson, VIVO Project Director

Winchester, MA  15 VIVO Goals for 2015-2016 have been identified and selected by the VIVO Strategy Group and are presented for comment.

District Dispatch: ALA wants high-resolution images of your library

planet code4lib - Thu, 2014-12-11 20:01

As part of the American Library Association (ALA) Washington Office’s ongoing efforts to modernize and reinvigorate the District Dispatch blog, the office is seeking large high-resolution images of patrons using services offered by your library.

Photo by Jason Armstrong.

We will use high-resolution images of your library on the District Dispatch when we discuss government and information technology policies that impact libraries.

High-resolution images can include photos of:

  • Library patrons, young and old, using digital tools and resources, such as 3D printers, computers, tablets or digital collections
  • Students participating in tutoring or mentoring programs
  • Jobseekers participating in employment programs
  • Book storytimes
  • Makerspace activities
  • Innovative library programs or classes
  • Community forums
  • Outdoor images of your library

Your library images will bolster the ALA Washington Office’s advocacy efforts in Washington. Please send all images to ALA Washington Office Press Officer Jazzy Wright at jwright[at]

The post ALA wants high-resolution images of your library appeared first on District Dispatch.

Library of Congress: The Signal: NDSA New England Regional Meeting Recap

planet code4lib - Thu, 2014-12-11 14:53

The following is a guest post by Meghan Banach Bergin, Bibliographic Access and Metadata Coordinator, University of Massachusetts Amherst Libraries.

NDSA NE Regional Meeting. Photo Credit: Jennifer Gunter King

On October 30th, the second New England Regional National Digital Stewardship Alliance (NE NDSA) meeting was held at the University of Massachusetts Amherst Libraries.  The meeting was generously sponsored by the Five Colleges Digital Preservation Task Force and the UMass Amherst Libraries and coordinated by myself and Jennifer Gunter King, Director of the Harold F. Johnson Library at Hampshire College.  The first NE NDSA meeting was hosted last year by WGBH and the Harvard Library at WGBH in Boston and coordinated by Karen Cariani and Andrea Goethals.

This year’s meeting began with an overview of the NDSA and its goals and purpose by Dr. Micah Altman, Director of Research at MIT Libraries and Chair of the NDSA Coordinating Committee.  Dr. Altman also discussed the NDSA’s 2015 National Agenda, which is aimed at senior institutional decision makers and includes recommendations on specific actions that can be taken now to coordinate the large-scale acquisition and management of all different types of born-digital content – some of which may not be the type of content that is traditionally collected by libraries and archives.  The actions recommended in the National Agenda include things like advocating for resources; enhancing staffing and training; fostering multi-institutional collaboration as well as shared software platforms, tools and services; and developing standards and best practices, especially in the areas of format migrations and long-term data integrity.

This was followed with a presentation by Aaron Rubinstein and Shaun Trujilo about collaborative digital preservation efforts (PDF) among the five schools in the Five Colleges Consortium (University of Massachusetts Amherst, Smith College, Mount Holyoke College, Hampshire College and Amherst College).  Recent efforts included hosting Nancy McGovern’s Digital Preservation Management workshop and the Digital POWRR (Preserving Digital Objects with Restricted Resources) workshop, preparing a digital preservation readiness guide and checklist, and applying what was learned in a practical way with a pilot project to install and test Archivematica.

From Kathryn Gronsbell’s, AVPreserve, Presentation. Photo Credit: Jennifer Gunter King

Next was a presentation by Eleni Castro, Research Coordinator at Harvard University, on DataVerse, which is a repository for sharing, citing and preserving research data.  Her presentation highlighted some of DataVerse’s most recent data publishing efforts (PDF), which include dataset versioning, standards-based data citations and integration with journal publishing workflows.

After that we heard from Michele Kimpton, Chief Executive Officer of DuraSpace, about the new DuraCloud/Archivematica Pilot project to integrate the two services (PDF) and provide a hosted digital preservation platform that will hopefully meet all of the needs identified in the Digital POWRR tool grid.  Our last presentation before we broke for lunch was by Kathryn Gronsbell of AVPreserve, who discussed the role of taxonomies in digital preservation strategies (PDF) and how they can help us to more efficiently find and organize the information we are preserving.

After lunch, we reconvened the meeting with a lightning talk by Casey Davis, Project Manager for the American Archive of Public Broadcasting at the WGBH Media Library and Archives.  Casey talked about digital media failures during the born-digital phase of the American Archive of Public Broadcasting Project.

We then had a series of lightning talks by the residents of the National Digital Stewardship Residency program in Boston.   Andrea Goethals, Manager of Digital Preservation and Repository Services at Harvard, and Nancy McGovern, Head of Curation and Preservation Services at MIT Libraries, gave an overview of the NDSR program (PDF) and discussed the Boston NDSR program more specifically.  Then we heard from the residents themselves about their projects:

  • Samantha DeWitt talked about her residency project at Tufts University.  Samantha is helping Tufts to gain a more complete understanding of the research data produced by its faculty, research staff and graduate students.  She is also investigating strategies for producing metadata for Tufts-created datasets for their Fedora-based repository.
  • Rebecca Fraimow, who is doing her residency at WGBH, explained her involvement with many different aspects of daily operations within the WGBH Media Library and Archives department and her project, which is to examine and help improve the overall workflow for preserving digital media as WGBH migrates from managing files with Filemaker databases and a proprietary DAM system to a Fedora-based Hydra repository.
  • Joey Heinen talked about his project at Harvard Library, which is to develop migration plans for three specific, now-obsolete formats: Kodak PhotoCD, RealAudio and SMIL Playlists.
  • Jen LaBarbera then discussed her residency at Northeastern University’s Archives and Special Collections, where she is working on ingesting recently born-digital content into the Our Marathon digital archive that was created as a digital humanities project following the bombing at the 2013 Boston Marathon.  She is working on transferring all of the materials (in a wide variety of formats) from their current states/platforms (Omeka, external hard drives, Google Drive, local server) to a new iteration of Northeastern’s Fedora-based digital repository.
  • Tricia Patterson talked about her residency at the MIT Lewis Music Library.  Tricia is working to develop a digital preservation workflow for digital audio files that are part of the “Music at MIT” digital audio project.

NDSA NE Regional Meeting discussion session. Photo credit: Meghan Banach Bergin

The day concluded with a breakout discussion session where we broke into groups and talked about several topics chosen by meeting attendees.  The topics included preserving born-digital versus digitized content, digital preservation systems and tools, leveraging intellectual data using taxonomies and other tools, and video archiving.

The day’s agenda, presentations, and notes from our afternoon discussions (PDF) are posted here:

Since the first two meetings were so successful, we are hoping to make this an annual meeting with different institutions volunteering to take on hosting and coordination from year to year.  Some plans are already being discussed for next year’s meeting, so stay tuned for more information.

Peter Murray: Thursday Threads: All about online privacy, or lack thereof

planet code4lib - Thu, 2014-12-11 11:44

Are you paranoid yet? Are you worried that the secret you shared anonymously might come right back to you? Or wondering why advertisements seem to follow you around from web page to web page? Or just creeped out by internet-enabled services tracking your every move? Or angry that mobile carriers made it very easy for anyone to track every page you visited from your smartphone? Or maybe you will simply give up any personal information for a delicious cookie? (Are you paranoid now?)

This week’s DLTJ Thursday Threads highlights a selection of stories from the past couple months that show what’s happening with information we might consider private and how companies are trying to monetize our every move and our every click. It would seem, at least from my tiny view of the internet, that concerns about online privacy are growing. For the librarians reading this post, you’ll know that protecting patron privacy is core to our ethos. Yet sometimes our seemingly innocent actions (adding a Facebook “Like” button or gathering usage reports via Google Analytics) feed our patrons’ information right into the heart of corporate interests whose ideals may not align with our own. If you are a member of the Library Information Technology Association, I encourage you to look at the newly formed Patron Privacy Interest Group and, if you will be at the ALA Midwinter meeting, come to the interest group’s first meeting on Saturday, January 31, 2015 from 8:30am to 9:30am.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to Pinboard are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

Your Favorite Anonymous App Is Not Anonymous At All

Not all of the incidents with the major anonymous messaging apps in the past year have been identical. Some were straight up hacks. Some were exploits revealed. Some were just invasive data collection measures exposed. They all support the thesis that anonymous apps tend not to stay anonymous, however. And you shouldn’t surrender a bunch of sensitive information through these apps, because you’ll probably get screwed. Let’s look at this issue app by app.

- Your Favorite Anonymous App Is Not Anonymous At All, by Adam Clark Estes, Gizmodo, 9-Dec-2014

When data gets creepy: the secrets we don’t realise we’re giving away

At the same time, something much more interesting has been happening. Information we have happily shared in public is increasingly being used in ways that make us queasy, because our intuitions about security and privacy have failed to keep up with technology. Nuggets of personal information that seem trivial, individually, can now be aggregated, indexed and processed. When this happens, simple pieces of computer code can produce insights and intrusions that creep us out, or even do us harm. But most of us haven’t noticed yet: for a lack of nerd skills, we are exposing ourselves.

- When data gets creepy: the secrets we don’t realise we’re giving away, by Ben Goldacre, The Guardian, 5-Dec-2014

We Can’t Trust Uber [or anyone else collecting our data]

Uber isn’t alone. Numerous companies, from social media sites like Facebook to dating sites like OKCupid, make it their business to track what we do, whom we know and what our typical behaviors and preferences are. OKCupid unashamedly announced that it experimented on its users, sometimes matching them with incompatible dates, just to see what happened.

The data collection gets more extensive at every turn. Facebook is updating its terms of service as of Jan. 1. They state in clearer terms that Facebook will be tracking your location (unless you disable it), vacuuming up data that other people provide about you and even contacts from your phone’s address book (if you sync it to your account) — important provisions many of Facebook’s 1.35 billion users may not even notice when they click “accept.”

We use these apps and websites because of their benefits. We discover new music, restaurants and movies; we meet new friends and reconnect with old ones; we trade goods and services. The paradox of this situation is that while we gain from digital connectivity, the accompanying invasion into our private lives makes our personal data ripe for abuse — revealing things we thought we had not even disclosed.

- We Can’t Trust Uber, op-ed by Zeynep Tufekci and Brayden King, New York Times, 7-Dec-2014

AT&T Stops Using Undeletable Phone Tracking IDs

AT&T says it has stopped its controversial practice of adding a hidden, undeletable tracking number to its mobile customers’ Internet activity….

The move comes after AT&T and Verizon received a slew of critical news coverage for inserting tracking numbers into their subscribers’ Internet activity, even after users opted out. Last month, ProPublica reported that Twitter’s mobile advertising unit was enabling its clients to use the Verizon identifier. The tracking numbers can be used by sites to build a dossier about a person’s behavior on mobile devices – including which apps they use, what sites they visit and for how long.

The controversial type of tracking is used to monitor users’ behavior on their mobile devices where traditional tracking cookies are not as effective. The way it works is that a telecommunications carrier inserts a uniquely identifying number into all the Web traffic that transmits from a user’s phone.

- AT&T Stops Using Undeletable Phone Tracking IDs, by Julia Angwin, ProPublica, 14-Nov-2014
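As a rough illustration of the mechanism described in that excerpt, the sketch below checks the headers a web server receives for a carrier-injected identifier. `X-UIDH` is the header name widely reported for Verizon's identifier; the function name, the header list and the sample values are assumptions made up for this example, not anything taken from the article.

```python
# Header names to look for. "x-uidh" is the name widely reported for
# Verizon's identifier; treat this list as an assumption, not a catalogue.
SUSPECT_HEADERS = ("x-uidh",)


def find_tracking_headers(headers):
    """Return the request headers that look like carrier-injected IDs.

    `headers` is a plain dict of header name -> value, as most web
    frameworks can provide. Header names are compared case-insensitively.
    """
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in SUSPECT_HEADERS
    }


# Hypothetical request as seen by the server after the carrier has added
# its identifier in transit.
sample_request = {
    "Host": "example.org",
    "User-Agent": "ExampleBrowser/1.0",
    "X-UIDH": "made-up-identifier-value",
}
found = find_tracking_headers(sample_request)
```

Because the identifier is added after the request leaves the phone, nothing on the device can remove it, which is why the article calls it undeletable.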

How Much of Your Data Would You Trade for a Free Cookie?

In a highly unscientific but delicious experiment last weekend, 380 New Yorkers gave up sensitive personal information — from fingerprints to partial Social Security numbers — for a cookie. “It is crazy what people were willing to give me,” said artist Risa Puno, who conducted the experiment, which she called “Please Enable Cookies,” at a Brooklyn arts festival. … To get a cookie, people had to turn over personal data that could include their address, driver’s license number, phone number and mother’s maiden name. More than half of the people allowed Puno to take their photographs. Just under half — or 162 people — gave what they said were the last four digits of their Social Security numbers. And about one-third — 117 people — allowed her to take their fingerprints. She examined people’s driver’s licenses to verify some of the information they provided.

- How Much of Your Data Would You Trade for a Free Cookie?, by Lois Beckett, ProPublica, 1-Oct-2014

DuraSpace News: DSpace 5.0 Previews: Auto Upgrade and Batch Import Features

planet code4lib - Thu, 2014-12-11 00:00

Winchester, MA  Throughout the month DuraSpace is highlighting key features that will be available to the community in the upcoming release of DSpace 5.0.

Auto Upgrade Feature

Moving your repository to DSpace 5 from earlier versions of DSpace is about to get easier with the new “Auto Upgrader”.

DuraSpace News: DuraSpace Adds 40 New Contributors in 2014 Campaign

planet code4lib - Thu, 2014-12-11 00:00
Winchester, MA  Many thanks to our community for the success of the 2014 DuraSpace fundraising campaign. The total dollars raised was $1.25 million, ensuring that DuraSpace open source projects, DSpace, Fedora and VIVO, continue to serve the global communities that depend on them into the future. In addition, the generous, ongoing institutional contributions of developer time, talent and commitment are critical to the success of our projects. We deeply appreciate your support.

DuraSpace News: Webinar Recording Available: "The SHARE Notification Service"

planet code4lib - Thu, 2014-12-11 00:00

Winchester, MA  Eric Celeste, SHARE’s Technical Lead, presented a webinar, “The SHARE Notification Service,” on December 10, 2014.  This was the second webinar in the DuraSpace Community Webinar Series, “All About the SHared Access Research Ecosystem (SHARE).”  Eric included an architectural overview of the service, a demonstration, and information about how to include new resources in the service. He also described how the service may be integrated with future SHARE initiatives or local systems.

Jonathan Rochkind: debugging apache Passenger without enterprise

planet code4lib - Wed, 2014-12-10 18:35

I kind of love Passenger for my Rails deployments. It Just Works, it does exactly what it should do, no muss, no fuss.  I use Passenger with apache.

I very occasionally have a problem that I am not able to reproduce in my dev environment, and only seems to reproduce on production using Passenger apache. Note well: In every case so far, the problem actually had nothing to do with passenger or apache, there were other differences in environment that were causing it.

But still, being able to drop into a debugger in the Rails app actually running under apache Passenger would have helped me find it quicker.

Support for dropping into the debugger, remotely, when running under Apache is included only in Passenger Enterprise.  I recommend considering purchasing Passenger Enterprise to support the Passenger team, the price is reasonable… for one server or two. But I admit I have not yet purchased Enterprise, mainly because the number of dev/staging/production servers I would want it on to have it everywhere starts to make the cost substantial for my environment.

But it looks like there’s a third-party open source gem meant to provide the same support! See .   It’s two years old in fact, but just noticing it today myself, huh.

I haven’t tried it yet, but making this post as a note to myself and others who might want to give it a try.

The really exciting thing only in Passenger Enterprise, to me, is the way it can deploy with a hybrid multiple process+multi-threaded-request-dispatch setup. This is absolutely the best way to deploy under MRI, I have no doubts at all, it just is (and I’m surprised it’s not getting more attention).   This lower-level  feature is unlikely to come from a third-party open source, and I’m not sure I’d trust it if it did. The open source Puma, an alternative to Passenger, also offers this deploy model. I haven’t tried it in Puma myself beyond some toy testing like the benchmark mentioned above.  But I know I absolutely trust Passenger to get it right with no fuss. If you need to maximize performance (or deal with avoiding end-user latency spikes in the presence of some longer-running requests), and deploy under MRI, you should definitely consider Passenger Enterprise just for this multi-process/multi-thread combo feature.

Filed under: General

Open Knowledge Foundation: A round-up of Open Knowledge Community events around the world!

planet code4lib - Wed, 2014-12-10 18:28

One of the best opportunities that being part of a community offers is the chance to collaborate and make things happen together – and when we want this to happen in sync, what’s better than convening an (in person or online) event?

Just before the end of the year, let’s collect a few highlights from the Open Knowledge Community events you posted about on the Community Stories Tumblr (so nicely curated by Kathleen Luschek of the Public Library of Science – thank you!)!

Joseph De Guia, Open Knowledge Philippines local group ambassador, TJ Dimacali, journalist and media manager, and Happy Feraren, School of Data Fellow participated in the festival exhibition and lightning talks series spreading the word about the Open Government Data, Lobbying Transparency, Open Education, Open Spending working groups and the School of Data programme. Find out more about it here.

Open Knowledge El Salvador local ambassador Iris Palma joined the panel focusing on Open Data and Open Access together with Caroline Burle from W3C (Brazil) and Pilar Saenz from Fundacion Karisma (Colombia). Further information about the event can be found here.

In line with the OKFestival (in Berlin) and the Latin American and Caribbean Internet Governance Forum (in San Salvador), Open Knowledge El Salvador, Creative Commons El Salvador and Association of Librarians of El Salvador celebrated the first Open Knowledge Meeting in El Salvador. The event focused on Open Knowledge, Open Data, Creative Commons Licenses, Open Education and the Declaration for Open Knowledge in El Salvador. Congratulations!

Open Knowledge Greece organized an open workshop to discuss and propose the positions and proposals of the group on the National Action Plan. Please find here all comments and suggestions that were stated in the meeting, published in both Greek and English.

Open Knowledge France hosted a data expedition in Paris at La Gaité Lyrique during the digital festival Futur en Seine to find, analyse, visualise and tell stories with existing open data on air pollution. All about it on the group’s blog!

These are wonderful examples of what happens when we get together. All you event organizers out there rock! Are you running an Open Knowledge event? We want to hear from you – please submit quick posts about your events to the Community Tumblr (details about how/where here). Let’s share the community’s great work, inspire each other, and spread the open knowledge love far and wide!

Post a link to your favorite 2014 open knowledge event in the comments below:

LITA: Jobs in Information Technology: December 10

planet code4lib - Wed, 2014-12-10 18:21

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Deputy County Librarian,  County of Santa Clara, San Jose, CA
Digital Access & Discovery Specialist, Tennessee Technological University, Cookeville, TN
Librarian, Santa Barbara City College Luria Library, Santa Barbara, CA
System Administrator, University at Albany,  Albany, NY

Visit the LITA Job Site for more available jobs and for information on submitting a  job posting.


Dan Scott: Dear database vendor: defending against scraping is going to be very difficult

planet code4lib - Wed, 2014-12-10 16:16

Our library receives formal communications from various content/database vendors about "serious intellectual property infringement" on a reasonably regular basis, that urge us to "pay particular attention to proxy security". Here is part of the response I sent to the most recent such request:

We use the UsageLimit directives that OCLC's EZProxy solution offers to block users who go over certain thresholds. However, the UsageLimit directives are really too coarse to be extremely useful. For example, you can set a limit based on the number of transfers in a given time period, but you can't set different thresholds for content types (such as CSS, JavaScript, HTML, images, or PDFs). The compromised account had gathered a set of URLs that enabled them to directly request a series of PDFs, thus staying below the general threshold for transfers. If EZProxy offered a "transfer threshold by MIME type" directive, then we could easily block users who tried to download more than, say, 100 PDFs in an hour.
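To make the "transfer threshold by MIME type" idea concrete, here is a minimal sketch of a sliding-window counter keyed by user and content type. It is not an EZProxy feature or directive: the class name, the `limits` mapping and the one-hour window are all assumptions chosen to mirror the "100 PDFs in an hour" example.

```python
from collections import defaultdict, deque


class TypeLimiter:
    """Sliding-window transfer counter keyed by (user, content type)."""

    def __init__(self, limits, window_seconds=3600):
        # `limits` maps a MIME type to the maximum number of transfers
        # allowed per window, e.g. {"application/pdf": 100}.
        # Types without an entry are not limited at all.
        self.limits = limits
        self.window = window_seconds
        self.events = defaultdict(deque)

    def allow(self, user, mime_type, timestamp):
        """Record one transfer; return False once the limit is exceeded."""
        limit = self.limits.get(mime_type)
        if limit is None:
            return True
        times = self.events[(user, mime_type)]
        # Drop transfers that have aged out of the window.
        while times and timestamp - times[0] >= self.window:
            times.popleft()
        times.append(timestamp)
        return len(times) <= limit


# Hypothetical use: allow 3 PDFs per hour, leave other types unlimited.
limiter = TypeLimiter({"application/pdf": 3})
results = [limiter.allow("jdoe", "application/pdf", t) for t in (0, 10, 20, 30)]
```

With a rule like this, CSS, JavaScript and image requests would no longer eat into the same budget as PDF downloads, so the PDF threshold could be set much lower than a general transfer limit.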

We also set UsageLimit directives for total bandwidth consumed. However, again this is limited by the coarseness of the directives available to us in EZProxy, as well as the increased richness of the variety of content available from electronic resources these days. With individual PDFs varying in size from 0.25 MB to 2.5 MB, not to mention streaming audio and video services, finding the right threshold without locking out legitimate users is quite challenging.

I therefore urge you to contact OCLC directly and demand that they add the ability to include finer-grained directives for UsageLimit throttling to EZProxy. As EZProxy is by far the most common proxy solution deployed by libraries worldwide, this would enable many of your customers to benefit from the enhancement. While OCLC's customers have been requesting functionality like this for years via the EZProxy mailing list, they are slow to react (having taken months to update EZProxy to address recent SSL vulnerabilities, for example). Perhaps OCLC will listen to an enterprise partner.

For our part, at Laurentian, I have asked our IT Services department (who controls our proxy server) to write a simple script that parses the EZProxy event logs and emails us when a user is blocked due to going past a threshold. This would have helped us catch the compromised account much earlier on, and should also be another basic feature of EZProxy. Right now, every library has to implement their own solution for this basic requirement, and many do not.
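A script of that kind could look roughly like the sketch below. The log line format is hypothetical (it is not EZProxy's actual event log format), as are the function names and the e-mail addresses; the point is only the shape: scan the log for block events, then build a notification message.

```python
import re
from email.message import EmailMessage

# Hypothetical line format, NOT EZProxy's real one:
#   2014-12-10 16:02:11 BLOCKED user=jdoe reason=transfers
BLOCK_LINE = re.compile(
    r"^(?P<when>\S+ \S+) BLOCKED user=(?P<user>\S+) reason=(?P<reason>\S+)$"
)


def find_blocks(lines):
    """Return one dict per log line that records a blocked account."""
    blocks = []
    for line in lines:
        match = BLOCK_LINE.match(line.strip())
        if match:
            blocks.append(match.groupdict())
    return blocks


def build_alert(blocks, sender, recipient):
    """Build (but do not send) an e-mail summarising the block events."""
    msg = EmailMessage()
    msg["Subject"] = f"Proxy alert: {len(blocks)} blocked account(s)"
    msg["From"] = sender
    msg["To"] = recipient
    body = "\n".join(
        f"{b['when']} {b['user']} ({b['reason']})" for b in blocks
    )
    msg.set_content(body)
    return msg


sample_log = [
    "2014-12-10 16:01:58 LOGIN user=jdoe",
    "2014-12-10 16:02:11 BLOCKED user=jdoe reason=transfers",
]
blocks = find_blocks(sample_log)
alert = build_alert(blocks, "proxy@example.org", "library@example.org")
```

Sending the message is left out so the sketch stays self-contained; in practice it could be handed to `smtplib.SMTP(...).send_message(alert)` from a scheduled job.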

All that said, even with finer-grained threshold directives and active monitoring of account blocking events, I have to note that a savvy attacker intent on harvesting your content will, once they have compromised an account, simply slow down the number of requests to the level that emulates the activity that a normal human would generate, and spread the requests out across all of the accounts they have compromised, and introduce a level of randomness into the requests so that they aren't detectable patterns (such as linear requests for only PDFs), etc. No system is going to offer a perfect defence against those efforts.

I'm sympathetic to the content vendors' concerns, but really, even if OCLC does add some of these features to their core EZProxy offering, the content-scraping approaches will simply increase in sophistication. Removing proxy access isn't a real option for our users, even though cutting off proxy access is what the content vendors do. This is a game that nobody is going to win.

John Miedema: PirateBay went down yesterday. Text analysts can take a page from pirates.

planet code4lib - Wed, 2014-12-10 14:55

This post deserves an essay. I’m going to take big leaps with too little explanation, but it’s been rattling in my head for a while and yesterday’s bust of PirateBay compelled me to write something down.

PirateBay went down yesterday. Police in Sweden seized computers and the site went down. This is not the first time the site went down and people expect it to come back up. Torrent technology was invented for just this kind of event. A torrent only stores metadata about files available elsewhere. The entire PirateBay set of magnets can be stored on a USB disk. Cached versions of PirateBay still exist on the web and people can still download files.
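A magnet link shows how little has to be stored: it is just a short URI carrying a content hash and, optionally, a display name and tracker addresses. The sketch below pulls those fields apart with the Python standard library. The function name and the sample link (hash, name and tracker) are made up for illustration.

```python
from urllib.parse import urlparse, parse_qs


def parse_magnet(uri):
    """Split a magnet URI into its info hash, display name and trackers."""
    parsed = urlparse(uri)
    if parsed.scheme != "magnet":
        raise ValueError("not a magnet link")
    params = parse_qs(parsed.query)
    info_hash = None
    # The "xt" (exact topic) parameter carries the BitTorrent info hash.
    for topic in params.get("xt", []):
        if topic.startswith("urn:btih:"):
            info_hash = topic[len("urn:btih:"):]
    return {
        "info_hash": info_hash,
        "name": params.get("dn", [None])[0],  # "dn" is the display name
        "trackers": params.get("tr", []),     # "tr" lists tracker URLs
    }


# Made-up example link; the hash does not refer to any real torrent.
link = (
    "magnet:?xt=urn:btih:0123456789abcdef0123456789abcdef01234567"
    "&dn=Example+File"
    "&tr=udp%3A%2F%2Ftracker.example.org%3A6969"
)
info = parse_magnet(link)
```

At roughly a hundred bytes per entry, even a very large catalogue of such links is small, which is consistent with the claim that the whole set fits on a USB disk.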

One might dismiss torrent technology as a hack by pirates unwilling to pay for content, but torrents are driving real-world innovation. In earlier posts, I compared the classical “Hot Water Tank” architecture of a QA system with an alternative “Tank-less” architecture. The Tank approach is solid but cumbersome, while the Tank-less approach is deft. The idea is part of a larger shift in the world of big data processing and a demand for real-time stream processing. One of the technologies in play is torrents.

The pirate flag flies in winter in Wakefield Quebec

Go ahead and question the motive of pirates but their purpose overlaps with freedom of information advocates. Consider PirateBox. PirateBox is a do-it-yourself file sharing system, built with a cheap router and open source software. Bring it to a public space and anyone can anonymously upload and download content. It can be used to share movies. It could also be used to legally share health care information in the aftermath of a natural disaster when the internet is not available. It is no surprise that the technology has been taken up by librarians in the form of LibraryBox.

The fight for net neutrality does not seem to end. A two-tiered internet seems inevitable. Those who seek greater internet surveillance powers keep coming back. What can be done? In 2012 PirateBay experienced a downtime. It came back on, announcing a plan to move its servers to the sky, tethered to drones. It got me thinking: strap a PirateBox to a drone from BestBuy, and you have a flying internet. The cost is cheap. Build a fleet. A flying internet would deftly sidestep unwanted controls, for geeks wanting the latest Marvel movie, for teachers in Syria.

PirateBay, PirateBox, a drone-based internet. It sounds fantastic but the driver is practical. People want agile access to content. If things get too boxed in then people will invent PirateBoxes to get out. It is the same challenge faced in big data and text analytics today. Faced with an ocean of unstructured content waiting to be mined, traditional database design and top-down programming are simply too rigid. New approaches with Natural Language Processing divide content into fragments and apply bottom-up pattern recognition to extract meaning. You can see the parallel with the pirates, the use of sophisticated techniques to preserve access to distributed content.

I think of Fahrenheit 451 and the character Granger, the leader of a group of exiled drifters. Each has memorized fragments of books in readiness for a time when society will be ready to discover them.

ACRL TechConnect: This Is How I Work (Nadaleen Tempelman-Kluit)

planet code4lib - Wed, 2014-12-10 14:37

Editor’s Note: This post is part of ACRL TechConnect’s series by our regular and guest authors about The Setup of our work.


Nadaleen Tempelman-Kluit @nadaleen

Location: New York, NY

Current Gig: Head, User Experience (UX), New York University Libraries

Current Mobile Device: iPhone 6

Current Computer:

Work: MacBook Pro 13-inch and Apple 27-inch Thunderbolt Display

Old Dell PC that I use solely to print and to access our networked resources


I carry my laptop to and from work with me and have an old MacBook Pro at home.

Current Tablet: First generation iPad, supplied by work

One word that best describes how you work: has anyone said frenetic yet?

What apps/software/tools can’t you live without?

Communication / Workflow

Slack is the UX Dept. communication tool in which all our communication takes place, including instant messaging, etc. We create topic channels in which we add links and tools and thoughts, and get notified when people add items. We rarely use email for internal communication.

Boomerang for Gmail - I write a lot of emails early in the morning, so I can schedule them to be sent at different times of the day without forgetting.

Pivotal Tracker - a user story-based project planning tool based on agile software development methods. We start with user flows, then break them into bite-size user stories in Pivotal, and then point them for development.

Google Drive


Google Hangouts - We work closely with our Abu Dhabi and Shanghai campus libraries, so we do a lot of early morning and late night meetings using Google Hangouts (or GoToMeeting, below) to include everyone.

Wireframing, IA, Mockups

Sketch: A great lightweight design app

OmniGraffle: A more heavy-duty tool for wireframing, IA work, mockups, etc. Compatible with a ton of stencil libraries, including the great Konigi and Google material design icons. Great for interactive interface demos, and for user flows and personas.

Adobe Creative Cloud

Post It notes, Graph paper, White Board, Dry-Erase markers, Sharpies, Flip boards

Tools for User Centered Testing / Methods 

GoToMeeting - to broadcast formal usability testing to observers in another room, so they can take notes, view the testing in real time, and send follow-up questions for the facilitator to ask participants.

Crazy Egg - a heat-mapping, hot-spotting, A/B testing tool which, when coupled with analytics, really helps us get a picture of where users are going on our site.

Silverback - screen-capturing usability testing software.

Post-it Plus - We do a lot of affinity grouping exercises and interface sketches using Post-it notes, so this app is super cool and handy.

OptimalSort - online card sorting software.

Personas - to think through our user flows when working through a process, service, or interface. We then use these personas to create more granular user stories in Pivotal Tracker (above).

What’s your workspace like?

I’m on the mezzanine of Bobst Library which is right across from Washington Square Park. I have a pretty big office with a window overlooking the walkway between Bobst and the Stern School of Business.

I have a huge old subway map on one wall with an original heavy wood frame, and everyone likes looking at old subway lines, etc. I also have a map sheet of the mountain I’m named after. Otherwise, it’s all white board and I’ve added our personas to the wall as well so I can think through user stories by quickly scanning and selecting a relevant persona.

I’m in an area where many of my colleagues’ mailboxes are, so people stop by a lot. I close my door when I need to concentrate, and on Fridays we try to work collaboratively in a basement conference room with a huge whiteboard.

I have a heavy wooden L-shaped desk which I am trying to replace with a standing desk.

Every morning I go to Oren’s, a great coffee shop nearby, with the same colleague and friend, and we usually do “loops” around Washington Square Park to problem solve and give work advice. It’s a great way to start the day.

What’s your best time saving trick?

Informal (but not happenstance) communication saves so much time in the long run and helps alleviate potential issues that can arise when people aren’t communicating. Though it takes a few minutes, I try to touch base with people regularly.

What’s your favorite to do list manager?

My whiteboard, supplemented by Stickies (Mac), and my huge flip chart notepad with my wish list on it. Completed items get transferred to a “leaderboard.”

Besides your phone and computer, what gadget can’t you live without?


What everyday thing are you better at than everyone else?

I don’t think I do things better than other people, but I think my everyday strengths include: encouraging and mentoring, thinking up ideas and potential solutions, getting excited about other people’s ideas, trying to approach issues creatively, and dusting myself off.

What are you currently reading?

I listen to audiobooks and podcasts on my bike commute. Among my favorites:

In print, I’m currently reading:

What do you listen to while at work?

Classical is the only type of music I can play while working and still be able to (mostly) concentrate. So I listen to the masters, like Bach, Mozart and Tchaikovsky.

When we work collaboratively on creative things that don’t require earnest concentration I defer to one of the team to pick the playlist. Otherwise, I’d always pick Josh Ritter.

Are you more of an introvert or an extrovert?

Mostly an introvert who fakes being an extrovert at work, but as other authors have said (Eric, Nicholas), it’s very dependent on the situation and the company.

What’s your sleep routine like?

Early to bed, early to rise. I get up between 5 and 6 and go to bed around 10.

Fill in the blank: I’d love to see _________ answer these same questions.

@Morville (Peter Morville)

@leahbuley (Leah Buley)

What’s the best advice you’ve ever received?

Show up

LITA: Virtual Machines in a Nutshell

planet code4lib - Wed, 2014-12-10 13:27

Many of you have probably heard the term “virtual machine”, but might not be familiar with what a VM is or does. Virtualization is a complicated topic, as there are many different kinds and it can be difficult for the novice to tell which is which. Today we’re going to talk specifically about OS virtualization and why you should care about this pretty fabulous piece of tech.

Let’s start with a physical computer. For the sake of having a consistent example, we’ll say it’s a Dell laptop running Windows 7. Dual booting is a popular method of installing an additional operating system onto a physical computer in order to have more options and flexibility with what programs you want to run. Lots of Mac users run Boot Camp so they can have both OS X and Windows side by side. While dual booting is a great choice for many, it has limitations. Installing an OS directly onto the hardware is expensive in terms of time and system resources, and doesn’t scale very well if you want to install LOTS of operating systems as a test. What if we want Mac, Windows, and a few flavors of Linux? Bringing more than two operating systems onto the hardware is asking for trouble. Dual booting is also overkill if you are just experimenting with an OS. If you are like me and you like to install things just to see if you like them and then throw them away when you are done, dual booting just takes too long.

Enter OS virtualization. Using virtualization software like VirtualBox, a user can have any number of operating systems running as virtual machines. Our trusty Dell laptop, henceforth referred to as the “host machine”, running Windows 7, henceforth referred to as the “host OS”, downloads a copy of VirtualBox for Windows and installs it just like any other program. Virtualization software is built to manage VMs (each also known as a “guest OS”) just like Microsoft Word manages documents and iTunes manages music. VMs are just files that the virtualization software runs, making it far easier to download, install, back up and destroy any number of operating systems at will. It also allows the host machine to run several operating systems at once; Windows can be running VirtualBox, which is running a Mac OS X guest OS and a Linux guest OS. As you could probably guess, having several VMs running at once can be a drain on memory, so just because you can run several at once doesn’t mean you should.

Now let’s talk about why you would want to virtualize operating systems. The first and most obvious reason is that it’s more convenient than installing a new OS straight onto the hardware. Many libraries are starting to leverage OS virtualization as part of their IT strategy. When you have hundreds of computers to manage, it’s a lot easier to install virtualization software on all of them and then deploy a single managed VM file (called an “image”) to all of them instead of installing the exact same set of programs on each one individually. It’s also a great way for regular users to experiment with new environments without fear of turning their computers into expensive paperweights. Since the host OS is never overwritten, there’s never any danger of accidentally deleting your entire system, and you can always go back to the OS you are familiar with when finished.

If you are a coder, VMs are manna from heaven for many reasons. The first is that they allow you to download whatever you want without mucking up your host machine. I’m constantly downloading new tools and programs to test out, and I don’t keep 95% of them. Testing them out in a virtual machine means that I can just delete the entire VM when I’m done, taking all that junk leftover from the installation and any test files I created along with it. I can also play around with configurations in a VM without fear of doing irreparable damage.

Perhaps one of the most useful aspects of a VM for coders is the ability to mimic target environments. Here at FSU, all of our servers are running a specific kind of Linux called Red Hat v6.5. With OS virtualization, I can download a Red Hat v6.5 image and go hog wild installing, deleting and reconfiguring whatever I want without fear of accidentally trashing the server and taking down our website. If I do inadvertently break something in the VM, I just delete it and spin up another instance. This can be a great tool for teaching newbies how to work on your production server without actually letting them anywhere near your production server.
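To make the "break it, delete it, spin up another" loop a little more concrete, here is a minimal sketch in Python that assembles the VirtualBox command-line calls (VBoxManage) for importing an image, snapshotting it, rolling back, and throwing it away. It assumes VirtualBox is installed with VBoxManage on the PATH. The file name sandbox.ova, the VM name sandbox, and the snapshot name clean are hypothetical placeholders, not anything from a real setup. By default it only prints the commands rather than running them.

```python
"""Dry-run sketch of a disposable-VM workflow built on VirtualBox's VBoxManage CLI."""
import shlex
import subprocess

VBOXMANAGE = "VBoxManage"  # assumption: the VirtualBox CLI is on PATH


def import_appliance(ova_path, vm_name):
    """Import a prepackaged image (.ova) and register it under vm_name."""
    return [VBOXMANAGE, "import", ova_path, "--vsys", "0", "--vmname", vm_name]


def take_snapshot(vm_name, snapshot_name):
    """Record the VM's current state so it can be restored later."""
    return [VBOXMANAGE, "snapshot", vm_name, "take", snapshot_name]


def start_headless(vm_name):
    """Boot the VM without opening a console window."""
    return [VBOXMANAGE, "startvm", vm_name, "--type", "headless"]


def power_off(vm_name):
    """Hard power-off; the VM must be stopped before a snapshot restore."""
    return [VBOXMANAGE, "controlvm", vm_name, "poweroff"]


def restore_snapshot(vm_name, snapshot_name):
    """Roll the VM back to a previously taken snapshot."""
    return [VBOXMANAGE, "snapshot", vm_name, "restore", snapshot_name]


def destroy_vm(vm_name):
    """Unregister the VM and delete its files from the host."""
    return [VBOXMANAGE, "unregistervm", vm_name, "--delete"]


def run(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    lines = [shlex.join(command) for command in commands]
    for command, line in zip(commands, lines):
        print(line)
        if not dry_run:
            subprocess.run(command, check=True)
    return lines


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    run([
        import_appliance("sandbox.ova", "sandbox"),
        take_snapshot("sandbox", "clean"),
        start_headless("sandbox"),
        power_off("sandbox"),
        restore_snapshot("sandbox", "clean"),
        destroy_vm("sandbox"),
    ])
```

Keeping dry_run=True is the safe default here: you can read the printed commands, check them against your own VM names, and only then flip the switch. The same steps are available through the VirtualBox GUI if you would rather click than type.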

You can prepackage software on an image as well, which is handy when you and your team want a simple way to play around with some software that might be difficult to install. The Islandora project distributes a virtual machine containing all the necessary parts configured correctly to create a working Islandora instance. This has been a huge boon to the project because it lets newbies who don’t know what they are doing (such as myself) have access to a disposable Islandora to hack on without the pain of setting one up themselves. Catmandu, a bibliographic data processing toolkit, can also be downloaded as a VM for experimentation. Expect to see this trend of software being distributed in a virtual machine continue in the future.

Learning to leverage OS virtualization effectively has changed the way I work. I do almost all of my work inside of disposable VMs now just because it’s so much cleaner and more convenient; it’s like a quarantined area for when you are working on things that may or may not explode. Even if you aren’t a developer, there are plenty of convenient ways to use virtualization in your everyday work environment. Despite the complicated technology running under the hood, getting started with virtualization has never been easier. Give it a shot today and let me know what you think in the comments!

Journal of Web Librarianship: A Review of “Building and Managing E-Book Collections: A How-to-Do-It Manual for Librarians”

planet code4lib - Wed, 2014-12-10 06:24
Volume 8, Issue 4, October-December 2014, pages 418-419
David Gibbs

Journal of Web Librarianship: A Review of “The CSS3 Anthology: Take Your Sites to New Heights, 4th ed.”

planet code4lib - Wed, 2014-12-10 06:24
Volume 8, Issue 4, October-December 2014, pages 419-420
Elizabeth Fronk

Journal of Web Librarianship: A Review of “Going Beyond Google Again: Strategies for Using and Teaching the Invisible Web”

planet code4lib - Wed, 2014-12-10 06:24
Volume 8, Issue 4, October-December 2014, pages 420-421
Kali Davis

