planet code4lib

M. Ryan Hess: New Thoughts on Digital Publishing Services

Fri, 2014-11-07 16:32

Back in early 2011, I gave an overview of the library as a disruptive publishing platform. Three years is a long time in “disruptive agent” years. So where do we stand today?

First of all, the publishing industry has not fallen yet…but the great disruption goes on.

A friend of mine was recently describing his rodent control neighbor, a charmingly opaque Eastern European gentleman whose central point about controlling rats can be summed up in a single pronouncement: “Fighting rats is F@#%ing 24×7 War!”

I’m seeing value in this statement for the effort to liberate information. As I’m learning in my contact with faculty and other librarians, the rat warrens run deep into our institutions. So invasive are their labyrinths that they threaten the very financial underpinnings of our information services.

Luckily, we are not passive observers in this state of affairs. We are active participants in creating something new. We have tools at our disposal to fill in the rat holes with a digital foundation that will ensure a long, fruitful future of open access publishing that will empower our users in ways traditional publishing could never do.

New Openings

I’m seeing a number of openings libraries are beginning to exploit that build on the “library as publishing platform” model I wrote about earlier. Namely, librarians are often becoming central hubs for a variety of digital services that include:

  • digital humanities and academic computing support
  • digital project consultant services for everything from how to migrate online content to advice on metadata to search engine optimization (SEO) and usability
  • helping faculty navigate scholarly communications issues from copyright to developing readership and recognition
  • and, of course, providing the place on campus for online publishing

Taken together, all of these emerging services suggest a fairly promising future for librarians interested in transforming the profession into something more in line with current and future trajectories for information.

Ready to enlist as a disruptive agent yet?

Over the next few posts, I’ll explore each of the above and how my library is building new services or augmenting older services to meet these emerging digital publishing needs.

First up, that thing that goes by the very vague and unhelpful term of digital humanities…

Ground Zero for Digital Humanities

At my Library, we have not rolled out a formal digital humanities support program…yet.

Nonetheless, we receive regular, unsolicited inquiries about platforms like Omeka and Digital Commons from faculty interested in creating exhibits and online course projects. To meet the demand so far, we’ve rolled out services, but what people really want is full-blown Omeka with plugins like Neatline and others the hosted version does not support.

Clearly, this organic demand suggests a far more robust DH service is required. As I write, we’ve deployed a faculty survey based loosely on one created by Rose Fortier at Marquette University. With this, we hope not only to build awareness of our digital collections and services (spoiler: early results have 60% of faculty being unaware of our institutional repository, for example…24×7 war indeed!), but also to learn what services, like digital humanities support, would interest faculty.

Based on our experience, my guess is that digital humanities support services will generate healthy interest. If this is the case, then we will probably roll out self-hosted Omeka plus Neatline and GeoServer, along with trainings and baseline technical support, sometime in 2015. The one hitch that will need to be overcome will be multi-site capability, which will enable us to install Omeka once and then launch as many separate sites as are required with a single click of a button. That particular feature does not exist yet outside, but according to, the forthcoming Omeka 3/Omeka-S will provide this, greatly enhancing the practicality of launching an Omeka service for any library.

Meanwhile, as I recently presented at the 2014 Digital Commons Great Lakes User Group, we are also continuing to provide a measure of digital humanities support on our Digital Commons institutional repository. While not as sexy as Neatline, the repository now hosts the student-generated Map of the Month from the Geography Department, for example, in PDF format.

The recently enhanced, zoomable image viewer available in Digital Commons may also help in this regard.

We’ve also seen a few faculty interested in using Digital Commons for student projects, particularly around courses focused on digital publishing issues.

But, of course, as non-librarian content creators enter the collection-building business, they come ill-prepared for overcoming the kinds of problems library professionals excel at solving. And so, this is where I’d like to turn next: the library as a digital project consultant service.

Library of Congress: The Signal: The Value of the NDSR: Residents and Mentors Weigh In

Fri, 2014-11-07 14:49

The following is a guest post by Vicky Steeves, National Digital Stewardship Resident at the American Museum of Natural History in New York City. This is the first in a series of posts by the residents from NDSR class of 2014-2015.

I wanted to take this opportunity, as the first 2014-2015 resident to post on The Signal, to discuss how valuable the National Digital Stewardship Residency program is. Among many things, it has given me the opportunity to work at the American Museum of Natural History in New York, surveying scientific research data and recommending preservation strategies. Nowhere else could I have gotten this opportunity. In this post I will look at the value of NDSR, showing that the NDSR is an innovative and important program for furthering the field of library and information science.

Current 2014-2015 NDSR-NY Cohort (left to right): Vicky Steeves, Peggy Griesinger, Karl Blumenthal, Shira Peltzman, and Julia Kim. Photo by Alan Barnett.

The National Digital Stewardship Residency participants (hosts and residents) have demonstrated how this residency fulfills the need for emerging professionals to be placed in important institutions. Here, residents’ skills have the space to expand. This allows for the growth of the field in two ways: residents contribute to the growing body of research in digital preservation and gain skills which they can use throughout their careers as they continue to advance the field. For host institutions, the ability to bring in additional, knowledgeable staff at little or no cost is transformative.

When evaluating the NDSR program, it’s important to look at both simple numbers and testimonials. In terms of the quantitative, 100% of the residents from the 2013-2014 team in Washington DC have found relevant positions upon completion of the residency. (See previous posts on that subject, parts one and two.) I sought out this first class of residents, and asked them how important they feel NDSR has been for them:

Vicky Steeves: Why did you apply to the NDSR program?

Margo Padilla, (Strategic Programs Manager at Metropolitan New York Library Council): “It seemed like a great way to meet and collaborate with people doing critical work in the field. I was also excited about all the projects and knew that even though I was the resident at only one location, I would learn a lot from the other residents and projects.”

Molly Schwartz, (Fulbright Scholar, Aalto University and the National Library of Finland): “As a new graduate I knew that I needed more hands-on experience and I wasn’t sure exactly what type of institution would be the right professional fit for me. NDSR seemed like a great option for many reasons: I would get more experience, come out of it with a completed project, I would learn what it is like to work at a small non-profit institution (ARL), and I would have the freedom to dive into digital information research full-time, both working on my own project and attending conferences and meetings where I could collaborate with others in the field.”

Julia Blase, (Project Manager, Field Book Project, Smithsonian Libraries): “I was very interested in working on the digital side of libraries and archives after graduate school, but knew that it could be difficult to find entry-level positions in the field, particularly those that would provide practical, complex experience in multiple aspects of the field and train me for the next step in my career. NDSR seemed to offer that chance.”

Vicky Steeves: Why do you think it’s important (or not) for the library science field to have programs like this?

Margo Padilla: “I think programs like this are important because it helps new graduates grow into the field, discover their niche, and contribute to a larger body of research. Recent graduates lend a fresh perspective to work already being done. It is also a chance for them to learn, make mistakes, and test what works and what doesn’t.”

Molly Schwartz: “The digital information field, especially from the information steward perspective, is at a point where we need to retain and cultivate professionals who have the desire to work in a fast-paced environment and have the skill sets to get employed elsewhere. It is crucial that we provide opportunities for these types of people to develop within the field and get exposed to all the cool work they can do, work that will have real impact, if we are to tackle the challenges facing the profession.”

Julia Blase: “It is very difficult, in my experience and in the experiences of my friends, for a young professional to make the jump from an entry-level or paraprofessional position to a mid-level position, which may begin to incorporate more complex projects, strategic planning, and perhaps even the management of a project, program or other staff members. Programs like the Residency offer that in-between path, supporting and training their graduates so that they are prepared and qualified for that first mid-level position after the program, advancing both the individual careers and also, by providing motivated and prepared staff, the quality of the profession as a whole.”

Heidi Elaine Dowding, (Ph.D. Research Fellow at the Royal Dutch Academy of Arts and Sciences, Huygens ING Institute): “I think paid fellowships like this are really important, especially for students who can’t afford to accept unpaid internships to forward their career. They even the playing field in some ways, and help build really strong networks of practitioners.”

These testimonials demonstrate how impactful the NDSR curriculum is to professional development and career opportunities for postgraduates. The current resident at the Museum of Modern Art in NYC, Peggy Griesinger, remarked, “I applied to NDSR because I wanted the opportunity to contribute to how cultural heritage institutions are developing long-term digital preservation practices.” The ability to “test drive” a career and preferred setting (public institution, private, non-profit, etc.) while accumulating and refining skills in digital preservation is an invaluable part of the program. Residents also had the opportunity to network and establish relationships with mentors who have invaluable experience in the field, which often led to gainful employment.

Additionally, having diverse institutions buy into this program affirms the value of NDSR. While these institutions are getting a resident at little or no cost to them, it takes a lot of trust to give an incubating project to an outside professional, especially one fresh from their master’s degree. In this way, NDSR takes an important step in public trust for digital archives. I reached out to a few mentors from the 2013-2014 Washington D.C. host institutions, to get their take on the value of the NDSR program.

Vicky Steeves: How useful was the program for you and your institution in hindsight? Are you using the results from the project that your resident worked on?

Shalimar White, (Manager of the Image Collections and Fieldwork Archives at Dumbarton Oaks Research Library and Collection): “One of the benefits of the NDSR program was the ability to bring in someone like Heidi [Dowding] who could evaluate a complex organization like Dumbarton Oaks from an external perspective. Heidi’s report was delivered to our new Manager of Information Technology. As recommended in the report, the IT Manager is currently developing DO’s general technical infrastructure and building out the operations of the new IT department. In the future, when the IT Manager is able to turn her attention to more strategic planning, she has indicated that the report will be a helpful guide for developing the systems and operational procedures necessary for long-term digital asset management at Dumbarton Oaks. We expect that Heidi’s work will continue to be useful and valuable in the long-term.”

Vickie Allen, (Director of the Media Library at the Public Broadcasting Service): “Having a skilled NDSR fellow at our organization for an extended period of time was critical in getting the necessary focus, interest and leadership support for our efforts to launch a complex digitization initiative. As a direct result of the quality and scope of our resident’s work, we were allocated internal funds during the final month of the residency to begin digitization. The completed project plan and associated documentation were invaluable in filling critical knowledge gaps, allowing us to move forward quickly and confidently with our digitization initiative. We plan to use these guidelines long into the future as we continue our digitization efforts, as well as translate findings into strengthening digital media management policy for our born digital content.”

Christie Moffatt, (Manager of the Digital Manuscripts Program at the National Library of Medicine): “The NDSR program was a valuable experience for the National Library of Medicine, both in terms of project accomplishments with the addition of a new thematic Web archive collection, and our participation in the NDSR community. Maureen [McCormick Harlow] shared her experiences wrestling with the technical and intellectual challenges of scoping out and creating a new collection with NLM staff involved in Web collecting, which enabled us all to learn together and apply lessons learned throughout the duration of the project. The collection Maureen developed, “Disorders of the Developing and Aging Brain: Autism and Alzheimer’s on the Web,” serves as a model for thematic Web collecting at the Library, and the workflows that she helped to develop are now being implemented in our current Ebola Outbreak web collecting initiative announced earlier this month. NLM’s web collecting efforts have and will continue to benefit from this experience.”

These host institutions have not only used their resident’s work, but will continue to use their project deliverables, recommendations and associated documentation as digital initiatives are further developed. In this way, residents are contributing to the future developments at their host institutions. This ability to impact the present and future of host institutions is what makes NDSR such an advantage. As one of the newest members of the NDSR program, I can say that the opportunities granted to me have been phenomenal. As a resident, you truly have endless possibilities in this program.


LITA: IA & UX Meet Library Technology

Fri, 2014-11-07 13:00

The class I enjoy the most this semester at Indiana University is Information Architecture. It is a class where theory and practical application are blended so that we can create something tangible, but also understand the approaches – my favorite kind!

As defines it, Information Architecture (IA) “focuses on organizing, structuring, and labeling content in an effective and sustainable way.” While the class doesn’t necessarily focus on Library Science since it is offered through the Information Science courses, this concept may sound a bit familiar to those working in a library.

In the class, we have chosen a small website we believe could benefit from restructuring. Some students chose public library websites, and others websites from the private sector. Regardless of each website’s purpose, the process of restructuring is the same. The emphasis is placed on usability and user experience (UX), which the ALA Reference and User Services Association defines as “employing user research and user-centered design methods to holistically craft the structure, context, modes of interaction, and aesthetic and emotional aspects of an experience in order to facilitate satisfaction and ease of use.”

Basically, it means structuring content so that a user can use it to a high level of satisfaction.

Peter Morville and Co. developed this honeycomb to represent the multiple facets of User Experience. Check out his explanation here.

Keeping usability and UX at the forefront, much of our semester has been focused on user demographics. We developed personas of specific users by highlighting the tasks they need to carry out and the kind of behaviors they bring to the computer. For example, one of my personas is a working mother who wants to find the best dance studio for her daughter, but doesn’t have a lot of time to spend looking up information and gets frustrated easily with technology (may or may not have been influenced by my own mother).

We also developed a project brief to keep the main benefits of restructuring in mind, and we analyzed parts of the current websites that work for users, and parts that could be improved. We did not (and could not) begin proposing our restructured website until we had a solid understanding of the users and their needs.

While learning about usability, I thought back to my graduate school application essay. I discussed focusing on digital libraries and archives in order to improve accession of materials, which is my goal throughout my career. As I’m learning, I realize that accession doesn’t mean digitizing to digitize, it means digitizing then presenting the materials in an accessible way. Even though the material may be released on the web, that doesn’t always imply that a user will find it and be able to use it.

As technology increasingly evolves, keeping the goals of the library in sync with the skills and needs of the user is crucial. This is where information architecture and user experience meet library technology.

How do you integrate usability and user experience with library technology in your institution? If you are an information architect or usability researcher, what advice do you have for others wishing to integrate these tools?

Open Knowledge Foundation: Global Open Data Index 2014: Reviewing in progress

Thu, 2014-11-06 19:54

October was a very exciting month for us in the Index team. We spoke to so many of you about the Index, face to face or in the virtual world, and we got so much back from you. It was amazing for us to see how the community is pulling together, not only with submissions but also by giving advice on the mailing list, translating tweets and tutorials, and spreading the word about the Index. Thank you so much for your contributions.

This is the first time that we have done regional sprints, starting from the Americas in early October in AbreLATAM/ConDatos, through to our community hangout with Europe and MENA, and finishing off with Asia, Africa and Pacific. On Thursday last week, we hosted a Hangout with Rufus, who spoke about the Index, how it can be used and where it is headed. We were also very lucky to have Oscar Montiel from Mexico, who spoke with us about how they use the Index to demand datasets from the government and how they are now implementing the local data index in cities around Mexico so they can promote data openness at the municipal level. We were also excited to host Oludotun Babayemi from Nigeria, who explained how an Index that includes Nigeria can help them to promote awareness of open data issues among government and civilians.

Now that the sprints are over, we still have a lot of work ahead of us. We are now reviewing all of the submissions. This year, we divided the editor role from 2014 into two roles known as ‘contributor’ and ‘reviewer’. This has been done so we can have a second pair of eyes to ensure information is reliable and of excellent quality. Around the world, a team of reviewers is working on the submissions from the sprints. We are still looking for reviewers for South Africa, Bangladesh, Finland, Georgia, Latvia, Philippines and Norway. You can apply to become one here.

We are finalising the Index 2014 over the next few weeks. Stay tuned for more updates. In the meantime, we are also collecting your stories about participating in the Index for 2014. If you would like to contribute to these regional blogs, please email us. We would love to hear from you and make sure your country is represented.

pinboard: Code4Lib shop

Thu, 2014-11-06 19:08
tshirts, mugs, etc.

Library of Congress: The Signal: WITNESS: Digital Preservation (in Plain Language) as a Tool for Justice

Thu, 2014-11-06 18:09

Illustration of video file and wrapper from WITNESS.

Some of you information professionals may have experienced incidents where, in the middle of a breezy conversation, you get caught off guard by a question about your work (“What do you do?”) and you struggle to come up with a straightforward, clear answer without losing the listener’s attention or narcotizing them into a stupor with your explanation.

Communicating lucid, stripped-down technical information to a general audience is a challenge…not dumbing down the information but simplifying it. Or, rather, un-complicating it and getting right to the point. At the Signal, we generally address our blog posts to institutions, librarians, archivists, students and information technologists. We preach to the choir and use peer jargon with an audience we assume knows a bit about digital preservation already. Occasionally we direct posts specifically to laypeople, yet we might still unintentionally couch some information in language that may be off-putting to them.

WITNESS, the human rights advocacy organization, has become expert in communicating complex technical information in a simple manner. WITNESS empowers people by teaching them how to use video as a tool to document human rights abuses and how to preserve digital video so they can use it to corroborate their story when the time is right. Their audience, who may or may not be technologically savvy, often comes to WITNESS in times of crisis, when they need immediate expertise and guidance.

Cell phone video interview on

What WITNESS has in common with the Library of Congress and other cultural institutions is a dedication to best practices in digital preservation. However, to the Library of Congress and its peer institutions, the term “digital preservation” pertains to cultural heritage; to victims of human rights violations, “digital preservation” pertains to evidence and justice.

For example, WITNESS advises people to not rename or modify the original video files. While that advice is in accord with the institutional practice of storing the original master file and working only with derivative copies, that same advice, as applied to documenting human rights violations, protects people from the potential accusation of tampering with, or modifying, video to manipulate the truth. The original file might also retain such machine-captured metadata as the time, date and geolocation of the recording, which can be crucial for maintaining authenticity.
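As a rough illustration of "keep the original, work on a copy," here is a minimal Python sketch. It is not taken from the WITNESS guide: the function names, the folder layout and the use of SHA-256 checksums to confirm that the original was left untouched are all assumptions made for this example.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def make_working_copy(original: Path, work_dir: Path) -> Path:
    """Copy the original into work_dir under the same file name.

    The original is only read, never renamed or rewritten. The checksum
    comparison (an assumption of this sketch) confirms that the original
    is unchanged and that the copy is byte-for-byte identical.
    """
    before = sha256_of(original)
    work_dir.mkdir(parents=True, exist_ok=True)
    copy_path = work_dir / original.name
    shutil.copy2(original, copy_path)  # copy2 also tries to keep timestamps
    if sha256_of(original) != before or sha256_of(copy_path) != before:
        raise RuntimeError(f"Checksum mismatch for {original}")
    return copy_path
```

Any editing, trimming or transcoding would then happen on the file returned by `make_working_copy`, while the original stays exactly as the camera or phone wrote it.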

The Society of American Archivists recently honored WITNESS with their 2014 Preservation Publication Award for their “Activists Guide to Archiving Video.” The SAA stated, “Unlike other resources, (the guide) is aimed at content creators rather than archivists, enabling interventions that support preservation early in the digital life-cycle. The guide also uses easy-to-understand language and low-cost recommendations that empower individuals and grassroots organizations with fewer resources to take action to safeguard their own valuable collections. To date, the guide has found enthusiastic users among non-archivists, including independent media producers and archives educators, as well as archivists who are new to managing digital video content. The Award Committee noted that the guide was a ‘valuable contribution to the field of digital preservation’ and an ‘example of what a good online resource should be.’”

Screenshot from “What is Metadata” video by WITNESS.

That is an important distinction, the part about “…non-archivists, including independent media producers and archives educators, as well as archivists who are new to managing digital video content.” It means that WITNESS’s digital preservation resources are as useful to a broad audience as they are to their intended audience of human rights advocates. Like the Academy of Motion Picture Arts and Sciences’ 2007 publication, The Digital Dilemma (profiled in the Signal), the language that WITNESS communicates in is so plain and direct, and the advice so comprehensive, that the digital video preservation instruction in the publication is broadly applicable and useful beyond its intended audience. Indeed, WITNESS’s “Activists Guide to Archiving Video” is used in training and college courses on digital preservation.

WITNESS’s latest resource, “Archiving for Activists,” is a video series aimed at improving people’s understanding of digital video so they can make informed choices for shooting and preserving the best possible copy of the event. The videos in this series are:

Some activists in the field have said that, thanks to WITNESS’s resources, they are organizing their footage better and adopting consistent naming conventions, which makes it easier to find files later on and strengthens the effectiveness of their home-grown archives. Yvonne Ng, senior archivist at WITNESS, said, “Even in a situation where they don’t have a lot of resources, there are simple things that can be done if you have a few hard drives and a simple system that everybody you are working with can follow in terms of how to organize your files and put them into information packages – putting things in folders and not renaming your files and not transcoding your files and having something like an Excel document to keep track of where your videos are.”
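The "something like an Excel document" that Ng describes could be as simple as a CSV inventory generated from the folders themselves. The sketch below is one hypothetical way to do that in Python. The column names, the list of video extensions and the `build_inventory` function are assumptions for illustration, not part of any WITNESS tool. It only reads the files and never renames or transcodes them.

```python
import csv
from datetime import datetime, timezone
from pathlib import Path

# Assumed set of video extensions; adjust for the cameras actually in use.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".avi", ".3gp", ".mts"}

FIELDS = ["folder", "file_name", "size_bytes", "modified_utc"]


def build_inventory(root: Path, inventory_csv: Path) -> int:
    """Write one CSV row per video file under root; return the row count."""
    rows = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in VIDEO_EXTENSIONS:
            stat = path.stat()
            rows.append({
                # Folder relative to root ("." means the top-level folder).
                "folder": path.parent.relative_to(root).as_posix(),
                # File name recorded exactly as found, never changed.
                "file_name": path.name,
                "size_bytes": stat.st_size,
                "modified_utc": datetime.fromtimestamp(
                    stat.st_mtime, tz=timezone.utc
                ).isoformat(),
            })
    with inventory_csv.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

The resulting file opens directly in Excel or any other spreadsheet program, so a group can add its own columns (who shot the footage, which hard drive holds it) without touching the videos.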

WITNESS will continue to offer professional digital video archival practices to those in need of human rights assistance, in the form of tools that are easy to use and readily available, in plain language. Ng said, “We talk about digital preservation in a way that is relevant and immediate to the people who are documenting abuses. It serves their end goals, which are not necessarily just to create an archive. It’s so that they can have a collection that they can easily use and it will maintain its integrity for years.”

HangingTogether: UCLA’s Center for Primary Resources and Training: A model for increasing the impact of special collections and archives

Thu, 2014-11-06 17:00

Many of us in the special collections and archives community have long admired the purpose and scope of UCLA’s Center for Primary Resources and Training (CFPRT), so I was pleased to learn that the UCLA library would be celebrating the Center’s 10th anniversary with a symposium on 24 October. As a result, I now know that we should all be celebrating its remarkable success as well. The audience that day learned via stellar presentations by ten CFPRT “graduates” that the program’s impact on them, and on their students and colleagues, has been profound.

Vicki Steele, the Center’s founding director, talked about being inspired by the ARL “hidden collections” conference at the Library of Congress in 2003 (the papers were published here). She flew right back to UCLA and put together a strategy for not only making a dent in her department’s massive backlogs (she noted they had lost both collections and donors due to a well-deserved reputation for taking years to process new acquisitions) but for integrating special collections into the intellectual life of the university. Students have told her “you never know what you’re in training for” when describing the “life-changing experiences” fostered by working at CFPRT. And based on the presentations, it’s clear that this is not hyperbole. Oh, and it was great to learn that providing a very desirable wage to the Center’s fellows was a high priority from the beginning; one graduate noted that the stipend literally made it possible for her to focus on her studies and complete her M.A. program.

I confess that I’ve occasionally wondered how much the Center accomplishes beyond getting lots of special collections processed. In the wake of this symposium, I’m wondering no more. The achievements of the graduate students who have participated, their evangelism for the importance of primary sources research, and the effects of the CFPRT experience on their lives render this program a model for others to admire and, resources permitting, to replicate. Ensuring that special collections and archives achieve real impact is a huge emphasis these days—as it should be. The Center is a model for one meaningful approach.

A few of my takeaways:

  • Alexandra Apolloni, Ph.D. student in musicology, now uses sheet music to teach her students about the many aspects of society reflected in such sources. She teaches them to “read a primary source for context.” She noted that it was useful to think about how future researchers would use the materials in order to maintain objectivity in her approach to processing and description.
  • Yasmin Dessem, MA graduate in moving image archive studies and now an archivist at Paramount Studios, discovered the power of primary sources to change history: evidence found in a collection on the notorious Lindbergh kidnapping suggests that the person executed for the crime was innocent. Too little, too late.
  • Andrew Gomez, Ph.D. graduate in history, played a central role in designing and implementing the exceptional digital resource The Los Angeles Aqueduct Digital Platform. In the process of this work, he became a huge supporter of the digital humanities as a rigorous complement to traditional historical research: his work involved standard historical skills and outputs such as studying primary sources and creating historical narratives, as well as mastering a wide variety of digital tools. He also learned how to address audiences other than fellow scholars; in effect, he saw that scholarship can have a broad reach if designed to do so. He is currently on the academic job market and noted that he is seeing ads for tenure-track faculty positions focused on digital humanities. The game may be starting to change.
  • Rhiannon Knol, M.A. student in classics, worked on textual medieval manuscripts. I liked her elegant statement about the ability of a book’s materiality to “communicate knowledge from the dead to the living.” She also quoted Umberto Eco: “Books are not made to be believed, but to be subject to inquiry.” I can imagine reciting both statements to students.
  • Erika Perez, Ph.D. graduate in history and now on the faculty of the University of Arizona, reported that when looking for a job, her experience at CFPRT helped her get her foot in the door and tended to be a major topic during interviews.
  • Aaron Gorelik, Ph.D. graduate in English, said that CFPRT changed his life by leading to his becoming a scholar of the poet Paul Monette. He had his “wow” moment when he realized that “this was a life, not a novel.” His work on Monette has guided his dissertation, teaching, and reading ever since, and he’s in the process of getting more than 100 unpublished Monette poems into press.
  • Audra Eagle Yun, MLIS graduate and now Head of Special Collections and Archives at UC Irvine, spoke of the CFPRT as an “archival incubator.” She and her fellow students were amazed that they would be trusted “to handle the stuff of history” and learned the centrality of doing research before processing. They graduated from CFPRT with the assumption that MPLP is standard processing. Ah, the joys of a fresh education, to be unfettered by unproductive past practice! She felt like a “real archivist” when she realized that she could identify the best research resources and make processing decisions without input from her supervisor.
  • Thai Jones, curator of U.S. history at the Columbia University Rare Books and Manuscripts Library, gave a fascinating keynote in which he told the story of researching his activist grandmother, Annie Stein, who worked for integration of New York City public schools from the 1950s to the 1980s. He gathered a collection of materials entirely via FOIA requests, and the resulting Annie Stein papers are heavily used. (His own life story is fascinating too: he was born and spent his early years living underground with his family because his father was on the run as a member of the Weather Underground. Gosh. Rather different from my Republican childhood!) He opined that digitization has revolutionized discovery for historians but lamented that many of his colleagues today identify and use online resources only. Please digitize more, and faster, is his mantra. It’s ours too, but we know how difficult and expensive it is to achieve. We need to keep developing methodologies for turning it around.

Few special collections and archives can muster the resources to launch and maintain a program as impressive as UCLA’s Center for Primary Research and Training, but many can do it on a smaller scale. Do you work at one that has gotten started and from which colleagues might learn? If not, what are the challenges that have stopped you from moving forward? Please leave a comment and tell your story.



About Jackie Dooley

Jackie Dooley leads OCLC Research projects to inform and improve archives and special collections practice. Activities have included in-depth surveys of special collections libraries in the U.S./Canada and the U.K./Ireland; leading the Demystifying Born Digital work agenda; a detailed analysis of the 3 million MARC records in ArchiveGrid; and studying the needs of archival repositories for specialized tools and services. Her professional research interests have centered on the development of standards for cataloging and archival description. She is a past president of the Society of American Archivists and a Fellow of the Society.


Jonathan Rochkind: Useful lesser known ruby Regexp methods

Thu, 2014-11-06 15:50
1. Regexp.union

Have a bunch of regexes, and want to see if a string matches any of them, but don’t actually care which one it matches, just whether it matches at least one? Don’t loop through them, combine them with Regexp.union.

union_re = Regexp.union(re1, re2, re3, as_many_as_you_want)
str =~ union_re

2. Regexp.escape

Have an arbitrary string that you want to embed in a regex, interpreted as a literal? Might it include regex special chars that you want interpreted as literals instead? Why even think about whether it might or not, just escape it, always.

val = 'Section 19.2 + [Something else]'
re = /key: #{Regexp.escape val}/

Yep, you can use #{} string interpolation in a regex literal, just like a double quoted string.


Eric Hellman: If your website still uses HTTP, the X-UIDH header has turned you into a snitch

Thu, 2014-11-06 14:54
Does your website still use HTTP? If so, you're a snitch.

As I talk to people about privacy, I've found a lot of misunderstanding. HTTPS applies encryption to the communication channel between you and the website you're looking at. It's an absolute necessity when someone's entering a password or sending a credit card number, but the modern web environment has also made it important for any communication that expects privacy.

HTTP is like sending messages on a postcard. Anyone handling the message can read the whole message. Even worse, they can change the message if they want. HTTPS is like sending the message in a sealed envelope. The messengers can read the address, but they can't read or change the contents.

It used to be that network providers didn't read your web browsing traffic or insert content into it, but now they do so routinely. This week we learned that Verizon and AT&T were inserting an "X-UIDH" header into your mobile phone web traffic. So for example, if a teen was browsing a library catalog for books on "pregnancy" using a mobile phone, Verizon's advertising partners could, in theory, deliver advertising for maternity products.

The only way to stop this header insertion is for websites to use HTTPS. So do it. Or you're a snitch.
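A site operator curious whether visitors are arriving with this header attached could check incoming plain-HTTP requests for it. The following is a minimal sketch, assuming the request headers are available as a dictionary; the function name and the sample values are illustrative only, and only the header name X-UIDH comes from the reports above.

```python
# Header names are case-insensitive in HTTP, so compare in lower case.
TRACKING_HEADERS = {'x-uidh'}


def find_tracking_headers(headers):
    """Return the subset of request headers that match known tracking headers."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower() in TRACKING_HEADERS
    }


# Hypothetical request headers as a web framework might expose them.
sample = {'Host': 'catalog.example.org', 'X-UIDH': 'opaque-token'}
found = find_tracking_headers(sample)
```

Over HTTPS the carrier cannot see or modify the request, so a check like this should come back empty.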

Sorry, this blog doesn't support HTTPS. So if you mysteriously get ads for snitch-related products, or if the phrase "Verizon and AT&T" is not equal to "V*erizo*n and A*T*&T" without the asterisks, blame me and blame Google.

Here's more on the X-UIDH header.

Open Knowledge Foundation: Open Knowledge Festival 2014 report: out now!

Thu, 2014-11-06 14:46

Today we are delighted to publish our report on OKFestival 2014!

This is packed with stories, statistics and outcomes from the event, highlighting the amazing facilitators, sessions, speakers and participants who made it an event to inspire. Explore the pictures, podcasts, etherpads and videos which reflect the different aspects of the event, and uncover some of its impact as related by people striving for change – those with Open Minds to Open Action.

Want more data? If you are still interested in knowing more about how the OKFestival budget was spent, we have published details about the event’s income and expenses here.

If you missed OKFestival this year, don’t worry – it will be back! Keep an eye on our blog for news and join the Open Knowledge discussion list to share your ideas for the next OKFestival. Looking forward to seeing you there!

OCLC Dev Network: Planned Downtime for November 9 Release

Thu, 2014-11-06 14:30

WMS Web services will be down during the install window for this weekend's release. The install time for this release is between 2:00 – 7:00 am Eastern USA, Sunday Nov 9th.


Ted Lawless: Connecting Python's RDFLib and Stardog

Thu, 2014-11-06 00:00

For a couple of years I have been working with the Python RDFLib library for converting data from various formats to RDF. This library serves this work well but it's sometimes difficult to track down a straightforward, working example of performing a particular operation or task in RDFLib. I have also become interested in learning more about the commercial triple store offerings, which promise better performance and more features than the open source solutions. A colleague has had good experiences with Stardog, a commercial semantic graph database (with a freely licensed community edition) from Clark & Parsia, so I thought I would investigate how to use RDFLib to load data into Stardog and share my notes.

A "SPARQLStore" and "SPARQLUpdateStore" have been included with Python's RDFLib since version 4.0. These are designed to allow developers to use the RDFLib code as a client to any SPARQL endpoint. Since Stardog supports SPARQL 1.1, developers should be able to connect to Stardog from RDFLib in much the same way they would connect to other triple stores like Sesame or Fuseki.

Setup Stardog

You will need a working instance of Stardog. Stardog is available under a community license for evaluation after going through a simple registration process. If you haven't set up Stardog before, you might want to check out Geir Grønmo's triplestores repository, where he has Vagrant provisioning scripts for various triple stores. This is how I got up and running with Stardog.

Once Stardog is installed, start the Stardog server with security disabled. This will allow the RDFLib code to connect without a username and password. Obviously you will not want to run Stardog in this way in production but it is convenient for testing.

$ ./bin/stardog-admin server start --disable-security

Next create a database called "demo" to store our data.

$ ./bin/stardog-admin db create -n demo

At this point a SPARQL endpoint is available and ready for queries at http://localhost:5820/demo/query.
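Before bringing RDFLib into the picture, it can help to see what a query against that endpoint looks like at the HTTP level. The following is a rough sketch using only the standard library and the SPARQL 1.1 Protocol's GET form (a `query` parameter plus an Accept header for JSON results). The helper name and the sample query are illustrative, and the request is only built here, not sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Query endpoint of the "demo" database created above.
ENDPOINT = 'http://localhost:5820/demo/query'


def build_select_request(query, endpoint=ENDPOINT):
    """Build (but do not send) a SPARQL Protocol GET request."""
    url = endpoint + '?' + urlencode({'query': query})
    # Ask for the standard SPARQL JSON results format.
    return Request(url, headers={'Accept': 'application/sparql-results+json'})


request = build_select_request('SELECT * WHERE { ?s ?p ?o } LIMIT 5')
# With the server running, urllib.request.urlopen(request) would send it.
```

With security disabled as above, no credentials need to be attached to the request.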


For this example, we'll add three skos:Concepts to a named graph in the Stardog store.

@prefix rdf: <> .
@prefix rdfs: <> .
@prefix skos: <> .
@prefix xml: <> .
@prefix xsd: <> .

<> a skos:Concept ;
    skos:broader <> ;
    skos:preferredLabel "Baseball" .

<> a skos:Concept ;
    skos:preferredLabel "Sports" .

<> a skos:Concept ;
    skos:preferredLabel "Soccer" .

Code

The complete example code here is available as a Gist.

Setting up the 'store'

We need to initialize a SPARQLUpdateStore as well as a named graph where we will store our assertions.

from rdflib import Graph, Literal, URIRef
from rdflib.namespace import RDF, SKOS
from rdflib.plugins.stores import sparqlstore

# Define the Stardog store. open() takes a (query endpoint, update endpoint) pair.
endpoint = 'http://localhost:5820/demo/query'
store = sparqlstore.SPARQLUpdateStore()
store.open((endpoint, endpoint))

# Identify a named graph where we will be adding our instances.
default_graph = URIRef('')
ng = Graph(store, identifier=default_graph)

Loading assertions from a file

We can load our sample turtle file to an in-memory RDFLib graph.

g = Graph()
g.parse('./sample-concepts.ttl', format='turtle')

# Serialize the in-memory graph to make sure we got what we expect.
print(g.serialize(format='turtle'))

Since our data is now loaded as an in-memory Graph, we can add it to Stardog with a SPARQL INSERT DATA operation.
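As a rough illustration of the shape of such an operation, the helper below wraps N-Triples text in an INSERT DATA string, optionally targeting a named graph. It is a standalone sketch and not part of RDFLib; the function name and the example.org URIs are placeholders. It also decodes bytes, since the return type of serialize() can differ between RDFLib versions.

```python
def build_insert_data(ntriples, graph_uri=None):
    """Wrap N-Triples text in a SPARQL 1.1 INSERT DATA operation."""
    if isinstance(ntriples, bytes):
        # Some RDFLib releases return bytes from serialize().
        ntriples = ntriples.decode('utf-8')
    body = ntriples.strip()
    if graph_uri is not None:
        # Direct the triples into a specific named graph.
        body = 'GRAPH <%s> { %s }' % (graph_uri, body)
    return 'INSERT DATA { %s }' % body


# Placeholder triple for illustration only.
triple = '<http://example.org/a> <http://example.org/p> "x" .\n'
update = build_insert_data(triple, 'http://example.org/g')
```

Building the string separately like this makes it easy to print and inspect the update before sending it to the store.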

ng.update(
    u'INSERT DATA { %s }' % g.serialize(format='nt')
)

Use the RDFLib API to inspect the data

Using the RDFLib API, we can list all the Concepts in Stardog that were just added.

for subj in ng.subjects(predicate=RDF.type, object=SKOS.Concept):
    print('Concept: %s' % subj)

And, we can find concepts that are broader than others.

for ob in ng.objects(predicate=SKOS.broader):
    print('Broader: %s' % ob)

Use RDFLib to issue SPARQL read queries.

RDFLib allows for binding a prefix to a namespace. This makes our queries easier to read and write.

store.bind('skos', SKOS)

The following SELECT query gets every skos:preferredLabel for the skos:Concepts.

rq = """
SELECT ?s ?label
WHERE {
    ?s a skos:Concept ;
       skos:preferredLabel ?label .
}
"""
for s, l in ng.query(rq):
    print('%s %s' % (s.n3(), l.n3()))

Use RDFLib to add assertions.

The RDFLib API can also be used to add new assertions to Stardog.

soccer = URIRef('')
ng.add((soccer, SKOS.altLabel, Literal('Football')))

We can now read statements about soccer using the RDFLib API, which issues the proper SPARQL query to Stardog in the background.

for s, p, o in ng.triples((soccer, None, None)):
    print('%s %s %s' % (s.n3(), p.n3(), o.n3()))

Summary

With a little setup, we can begin working with Stardog in RDFLib in much the same way that we work with RDFLib and other backends. The sample code here is included in this Gist.

DuraSpace News: Recordings available for the Fedora 4.0 Webinar Series

Thu, 2014-11-06 00:00

Winchester, MA

On November 5, 2014 the Hot Topics DuraSpace Community Webinar series, “Early Advantage: Introducing New Fedora 4.0 Repositories,” concluded with its final webinar, “Fedora 4.0 in Action at Penn State and Stanford.”

DuraSpace News: Fedora 4 Almost Out the Door: Final Community Opportunity for Feedback!

Thu, 2014-11-06 00:00

From Andrew Woods, Technical Lead for Fedora 

Winchester, MA  Fedora 4 Beta-04 will be released before this coming Monday, November 10, 2014. The development sprint that also begins on November 10 will be focused on testing and documentation as we prepare for the Fedora 4.0 production release.

SearchHub: What Could Go Wrong? – Stump The Chump In A Rum Bar

Wed, 2014-11-05 22:56

The first time I ever did a Stump The Chump session was back in 2010. It was scheduled as a regular session — in the morning if I recall correctly — and I (along with the panel) was sitting behind a conference table on a dais. The session was fun, but the timing, and setting, and seating, made it feel very stuffy and corporate.

We quickly learned our lesson, and subsequent “Stump The Chump!” sessions have become “Conference Events”. Typically held at the end of the day, in a nice big room, with tasty beverages available for all. Usually, right after the winners are announced, it’s time to head out to the big conference party.

This year some very smart people asked me a very smart question: why make attendees who are having a very good time (and enjoying tasty beverages) at “Stump The Chump!”, leave the room and travel to some other place to have a very good time (and enjoy tasty beverages) at an official conference party? Why not have one big conference party with Stump The Chump right in the middle of it?

Did I mention these were very smart people?

So this year we’ll be kicking off the official “Lucene/Solr Revolution Conference Party” by hosting Stump The Chump at the Cuba Libre Restaurant & Rum Bar.

At 4:30 PM on Thursday (November 13), there will be a fleet of shuttle buses ready and waiting at the Omni Hotel’s “Parkview Entrance” (on the South East side of the hotel) to take every conference attendee to Cuba Libre. Make sure to bring your conference badge, it will be your golden ticket to get on the bus, and into the venue — and please: Don’t Be Late! If you aren’t on a shuttle bus leaving the Omni by 5:00 PM, you might miss the Chump Stumping!

Beers, Mojitos & Soft Drinks will be ready and waiting when folks arrive, and we’ll officially be “Stumping The Chump” from 5:45 to 7:00-ish.

The party will continue even after we announce the winners, and the buses will be available to shuttle people back to the Omni. The last bus back to the hotel will leave around 9:00 PM — but as always, folks are welcome to keep on partying. There should be plenty of taxis in the area.

To keep up with all the “Chump” news fit to print, you can subscribe to this blog (or just the “Chump” tag).

The post What Could Go Wrong? – Stump The Chump In A Rum Bar appeared first on Lucidworks.

LITA: Game Night at LITA Forum

Wed, 2014-11-05 22:13

Are you attending the 2014 LITA Forum in Albuquerque? Like board games? If so, come to the LITA Game Night!

Thursday, November 6, 2014
8:00 – 11:00 pm
Hotel Albuquerque, Room Alvarado C

Games that people are bringing:

  • King of Tokyo
  • Cheaty Mages
  • Cards Against Humanity
  • One Night Ultimate Werewolf
  • Star Fluxx
  • Love Letter
  • Seven Dragons
  • Pandemic
  • Coup
  • Avalon
  • Bang!: The Dice Game
  • Carcassonne
  • Uno
  • Gloom
  • Monty Python Fluxx
  • and probably more…

Hope you can come!

FOSS4Lib Recent Releases: Evergreen - 2.7.1, 2.6.4, 2.5.8

Wed, 2014-11-05 21:21
Package: Evergreen
Release Date: Wednesday, November 5, 2014

Last updated November 5, 2014. Created by Peter Murray on November 5, 2014.

"In particular, they fix a bug where even if a user had logged out of the Evergreen public catalog, their login session was not removed. This would permit somebody who had access to the user’s session cookie to impersonate that user and gain access to their account and circulation information."

Evergreen ILS: SECURITY RELEASES – Evergreen 2.7.1, 2.6.4, and 2.5.8

Wed, 2014-11-05 21:11

On behalf of the Evergreen contributors, the 2.7.x release maintainer (Ben Shum) and the 2.6.x and 2.5.x release maintainer (Dan Wells), we are pleased to announce the release of Evergreen 2.7.1, 2.6.4, and 2.5.8.

The new releases can be downloaded from:

THESE RELEASES CONTAIN SECURITY UPDATES, so you will want to upgrade as soon as possible.

In particular, they fix a bug where even if a user had logged out of the Evergreen public catalog, their login session was not removed. This would permit somebody who had access to the user’s session cookie to impersonate that user and gain access to their account and circulation information.

After installing the Evergreen software update, it is recommended that memcached be restarted prior to restarting Evergreen services and Apache.  This will clear out all user login sessions.

All three releases also contain bugfixes that are not related to the security issue. For more information on the changes in these releases, please consult the change logs:

District Dispatch: IRS provides update to libraries on tax form program

Wed, 2014-11-05 21:06

Photo by AgriLifeToday via Flickr

On Tuesday, the Internal Revenue Service (IRS) announced that the agency will continue to deliver 1040 EZ forms to public libraries that are participating in the Tax Forms Outlet Program (TFOP). TFOP offers tax products to the American public primarily through participating libraries and post offices. The IRS will distribute new order forms to participating libraries in the next two to three weeks.

The IRS released the following statement on November 4, 2014:

Based on the concerns expressed by many of our TFOP partners, we are now adding the Form 1040 EZ, Income Tax Return for Single and Joint Filers with No Dependents, to the list of forms that can be ordered. We will send a supplemental order form to you in two to three weeks. We strongly recommend you keep your orders to a manageable level primarily due to the growing decline in demand for the form and our print budget. Taxpayers will be able to file Form 1040 EZ and report that they had health insurance coverage, claim an exemption from coverage or make a shared responsibility payment. However, those who purchased health coverage from the Health Insurance Marketplace must use the Form 1040 or 1040A. Your help communicating this to your patrons within your normal work parameters would be greatly appreciated.

We also heard and understood your concerns of our decision to limit the number of Publication 17 we plan to distribute. Because of the growing cost to produce and distribute Pub 17, we are mailing to each of our TFOP partners, including branches, one copy for use as a reference. We believe that the majority of local demand for a copy of or information from Publication 17 can be met with a visit to our website at or by ordering it through the Government Printing Office. We value and appreciate the important work you do providing IRS tax products to the public and apologize for any inconvenience this service change may cause.

Public library leaders will have the opportunity to discuss the management and effectiveness of the Tax Forms Outlet Program with leaders from the IRS during the 2015 American Library Association Midwinter Meeting session “Tell the IRS: Tax Forms in the Library.” The session takes place on Sunday, February 1, 2015.

The post IRS provides update to libraries on tax form program appeared first on District Dispatch.

Roy Tennant: How Some of Us Learned To Do the Web Before it Existed

Wed, 2014-11-05 20:58

Perhaps you really had to be there to understand what I’m about to relate. I hope not, but it’s quite possible. Imagine a world without the Internet, as totally strange as that is. Imagine that we had no world-wide graphical user interface to the world of information. Imagine that the most we had were green screens and text-based interfaces to “bulletin boards” and “Usenet newsgroups”. Imagine that we were so utterly ignorant of the world we would very soon inhabit. Imagine that we were about to have our minds utterly blown.

But we didn’t know that. We only had what we had, and it wasn’t much. We had microcomputers of various kinds, and the clunkiest interfaces to the Internet that you can possibly imagine. Or maybe you can’t even imagine. I’m not sure I could, from this perspective. Take it from me — it totally sucked. But it was also the best that we had ever had.

And then along came HyperCard. 

HyperCard was a software program that ran on the Apple Macintosh computer. It would be easy to write it off as being too narrow a niche, as Microsoft was even more dominant in terms of its operating system than it is now. But that would be a mistake. Much of the true innovation at that point was happening on the Macintosh. This was because it had blown the doors off the user interface and Microsoft was still playing catchup. You could argue in some ways it still is. But back then there was absolutely no question who was pushing the boundaries, and it wasn’t Redmond, WA, it was Cupertino, CA. Remember that I’m taking you back before the Web. All we had were clunky text-based interfaces. HyperCard gave us this:

  • True “hypertext”. Hypertext is what we called the proto-web — that is, the idea of linking from one text document to another before Tim Berners-Lee created HTML.
  • An easy to learn programming language. This is no small thing. Having an easy-to-learn scripting language put the ability to create highly engaging interactive interfaces into the hands of just about anyone.
  • Graphical elements. Graphics, as we know, are a huge part of the Web. The Web didn’t really come into its own until graphics could show up in the UI. But we already had this in HyperCard. The difference on the Web was that anyone with a network connection could see your graphics — not just those who had your HyperCard “stack”.

As a techie, I was immediately taken with the possibilities, so as a librarian at UC Berkeley at the time I found some other willing colleagues and we built a guide to the UC Berkeley Libraries. Unfortunately I’ve been unable to locate a copy of it, which is a shame since it’s still possible to run a HyperCard stack in emulation. I’d give a lot to be able to play with it again.

Doing this exposed us to principles of “chunking up” information and linking it together in different ways that we eventually took with us to the web. We also learned to limit the amount of text with online presentations, to enhance “scannability”. We were introduced to visual metaphors like buttons. We learned to use size to indicate priority. We experimented with bread crumb trails to give users a sense of where they were in the information space. And we strove to be consistent. All of these lessons helped us to be better designers of web sites, before the web even existed.

For more, here is another viewpoint on what HyperCard provided a web-hungry world.