
planet code4lib


Dan Scott: Putting the "Web" back into Semantic Web in Libraries 2014

Thu, 2014-12-04 21:15

I was honoured to lead a workshop and speak at this year's edition of Semantic Web in Bibliotheken (SWIB) in Bonn, Germany. It was an amazing experience; so many rich projects were described, with obvious dividends for the users of libraries. Once again, the European library community fills me with hope for the future success of the semantic web.

The subject of my talk "Cataloguing for the open web with RDFa and schema.org" (slides and video recording - gulp) pivoted while I was preparing materials for the workshop. I was searching library catalogues around Bonn looking for a catalogue with persistent URIs that I could use for an example. To my surprise, catalogue after catalogue used session-based URLs; it took me quite some time before I was able to find ULB, who had hosted a VuFind front end for their catalogue. Even then, the robots.txt restricted crawling by any user agent. This reminded me rather depressingly of my findings from current "discovery layers", which entirely restrict crawling and therefore put libraries into a black hole on the web.

These findings in the wild are so antithetical to the basic principles of enabling discovery of web resources that, in a conference about the semantic web, I opted to spend over half of my talk making the argument that libraries need to pay attention to the old-fashioned web of documents first and foremost. The basic building blocks that I advocated were, in priority order:

  • Persistent URIs, on which everything else is built
  • Sitemaps, to facilitate discovery of your resources
  • A robots.txt file to filter portions of your website that should not be crawled (for example, search results pages)
  • RDFa, microdata, or JSON-LD only after you've sorted out the first three
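The robots.txt point can be checked mechanically. The following is a minimal sketch, assuming a hypothetical catalogue at catalogue.example.org whose record pages live under /record/ and whose search results live under /search (none of these names come from the talk); it uses Python's standard urllib.robotparser to confirm that record pages stay crawlable while search results are filtered out.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block search result pages only, and
# advertise the sitemap so crawlers can discover record pages.
ROBOTS_TXT = """\
User-agent: *
Disallow: /search
Sitemap: https://catalogue.example.org/sitemap.xml
"""


def build_parser(robots_txt):
    """Parse robots.txt content that is already in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Record detail pages should be crawlable; search results should not.
record_ok = parser.can_fetch("ExampleBot", "https://catalogue.example.org/record/12345")
search_ok = parser.can_fetch("ExampleBot", "https://catalogue.example.org/search?q=poetry")
print(record_ok, search_ok)
```

Running the same check against a catalogue's real robots.txt is a quick way to spot the "black hole" configuration described above, where every user agent is denied everything.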

Only after setting that foundation did I feel comfortable launching into my rationale for RDFa and schema.org as a tool for enabling discovery on the web: a mapping of the access points that cataloguers create to the world of HTML and aggregators. The key point for SWIB was that RDFa and schema.org can enable full RDF expressions in HTML; that is, we can, should, and must go beyond surfacing structured data to surfacing linked data through @resource attributes and schema:sameAs properties.

The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation. Tim Berners-Lee, Scientific American, 2001

I also argued that using RDFa to enrich the document web was, in fact, truer to Berners-Lee's 2001 definition of the semantic web, and that we should focus on enriching the document web so that both humans and machines can benefit before investing in building an entirely separate and disconnected semantic web.

I was worried that my talk would not be well received; that it would be considered obvious, or scolding, or just plain off-topic. But to my relief I received a great deal of positive feedback. And on the next day, both Eric Miller and Richard Wallis gave talks on a similar, but more refined, theme: that libraries need to do a much, much better job of enabling their resources to be found on the web--not by people who already use our catalogues, but by people who are not library users today.

There were also some requests for clarification, which I'll try to address generally here (for the benefit of anyone who wasn't able to talk with me, or who might watch the livestream in the future).

"When you said anything could be described in schema.org, did you mean we should throw out MARC and BIBFRAME and EAD?"

tldr: I intended "and", not "instead of"!

The first question I was asked was whether there was anything that I had not been able to describe in schema.org, to which I answered "No"--especially since the work that the W3C SchemaBibEx group had done to ensure that some of the core bibliographic requirements were added to the vocabulary. It was not as coherent or full a response as I would have liked to have made; I blame the livestream camera.

But combined with a part of the presentation where I countered a myth about schema.org being a very coarse vocabulary by pointing out that it actually contained 600 classes and over 800 properties, a number of the attendees interpreted one of the takeaways of my talk as suggesting that libraries should adopt schema.org as the descriptive vocabulary, and that MARC, BIBFRAME, EAD, RAD, RDA, and other approaches for describing library resources were no longer necessary.

This is not at all what I'm advocating! To expand on my response, you can describe anything in schema.org, but you might lose significant amounts of richness in your description. For example, short stories and poems would best be described in schema.org as a CreativeWork. You would have to look at the associated description or keyword properties to be able to figure out the form of the work.

What I was advocating was that you should map your rich bibliographic description into corresponding classes and properties in RDFa at the time you generate the HTML representation of that resource and its associated entities. So your poem might be represented as a CreativeWork, with a name, author, description, keywords, and about values and relationships. Ideally, the author will include at least one link (either via sameAs, url, or @resource) to an entity on the web; and you could do the same with about if you are using a controlled vocabulary.
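As a rough illustration of that mapping, here is a minimal sketch of rendering a poem as a schema.org CreativeWork in RDFa at HTML generation time. The function name, the sample poem, and the example.org author URI are all hypothetical; a real catalogue would draw these values from its underlying bibliographic record and link the author to an actual authority URI.

```python
from html import escape


def creative_work_rdfa(name, author_name, author_uri, description, keywords):
    """Render a schema.org CreativeWork as an RDFa HTML fragment."""
    keyword_spans = ", ".join(
        f'<span property="keywords">{escape(k)}</span>' for k in keywords
    )
    return (
        '<div vocab="http://schema.org/" typeof="CreativeWork">\n'
        f'  <h1 property="name">{escape(name)}</h1>\n'
        # @resource makes the author a linked entity rather than a bare string.
        f'  <p property="author" typeof="Person" resource="{escape(author_uri)}">\n'
        f'    <span property="name">{escape(author_name)}</span>\n'
        '  </p>\n'
        f'  <p property="description">{escape(description)}</p>\n'
        f'  <p>{keyword_spans}</p>\n'
        '</div>'
    )


# Hypothetical record values, for illustration only.
fragment = creative_work_rdfa(
    name="Frost & Fire",
    author_name="A. Poet",
    author_uri="http://example.org/authors/a-poet",
    description="A short poem about winter.",
    keywords=["poetry", "winter"],
)
print(fragment)
```

Note that the form of the work ("poem") survives only in the keywords here, which is exactly the loss of richness described above; the full description stays in the source record.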

If you take that approach, then you can serve up descriptions of works in HTML that most web-oriented clients will understand (such as search engines) and provide basic access points such as name / author / keywords, while retaining and maintaining the full richness of the underlying bibliographic description--and potentially providing access to that, too, as part of the embedded RDFa, via content negotiation, or <link rel="">, for clients that can interpret richer formats.

"What makes you think Google will want to surface library holdings in search results?"

There is a perception that Google and other search engines just want to sell ads, or their own products (such as Google Books). While Google certainly does want to sell ads and products, they also want to be the most useful tool for satisfying users' information needs--possibly so they can learn more about those users and put more effective ads in front of them--but nonetheless, the motivation is there.

By marking up your resources with the Product / Offer portion of schema.org, you are able to provide search engines with availability information in the same way that Best Buy, AbeBooks, and other online retailers do (as Evergreen, Koha, and VuFind already do). That makes it much easier for the search engines to use everything they may know about their users, such as their current location, their institutional affiliations, their typical commuting patterns, their reading and research preferences... to provide a link to a library's electronic or print copy of a given resource in a knowledge graph box as one of the possible ways of satisfying that person's information needs.

We don't see it happening with libraries running Evergreen, Koha, and VuFind yet, realistically because the open source library systems don't have enough penetration to make it worth a search engine's effort to add that to their set of possible sources. However, if we as an industry make a concerted effort to implement this as a standard part of crawlable catalogue or discovery record detail pages, then it wouldn't surprise me in the least to see such suggestions start to appear. The best proof that we have that Google, at least, is interested in supporting discovery of library resources is the continued investment in Google Scholar.

And as I argued during my talk, even if the search engines never add direct links to library resources from search results or knowledge graph sidebars, having a reasonably simple standard like the GoodRelations product / offer pattern for resource availability enables new web-based approaches for building applications. One example could be a fulfillment system that uses sitemaps to intelligently crawl all of its participating libraries, normalizes the item request to a work URI, and checks availability by parsing the offers at the corresponding URIs.
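A minimal sketch of two pieces of such a fulfillment system follows: listing record URLs from a sitemap and checking availability from the offer markup on a record page. The sample sitemap, the sample record HTML, and the rule that an availability value ending in "InStock" means "available" are assumptions made for illustration; fetching pages over the network and normalizing requests to a work URI are left out.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", SITEMAP_NS)]


class OfferParser(HTMLParser):
    """Collect values of elements marked property="availability"."""

    def __init__(self):
        super().__init__()
        self.availability = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if attrs.get("property") == "availability":
            value = attrs.get("href") or attrs.get("content")
            if value:
                self.availability.append(value)


def is_available(record_html):
    """True if any offer on the page reports an InStock availability."""
    parser = OfferParser()
    parser.feed(record_html)
    return any(value.endswith("InStock") for value in parser.availability)


# Hypothetical sample data standing in for fetched documents.
SAMPLE_SITEMAP = """\
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://catalogue.example.org/record/1</loc></url>
  <url><loc>https://catalogue.example.org/record/2</loc></url>
</urlset>
"""

SAMPLE_RECORD = """\
<div vocab="http://schema.org/" typeof="Product">
  <span property="name">An Example Book</span>
  <div property="offers" typeof="Offer">
    <link property="availability" href="http://schema.org/InStock">
  </div>
</div>
"""

urls = sitemap_urls(SAMPLE_SITEMAP)
print(urls, is_available(SAMPLE_RECORD))
```

The attribute scan above is deliberately naive; a production system would more likely use a full RDFa parser so that offers are tied to the correct item.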

District Dispatch: ALA seeks nominations for 2015 James Madison awards

Thu, 2014-12-04 19:08

The American Library Association’s (ALA) Washington Office is calling for nominations for two awards to honor individuals or groups who have championed, protected and promoted public access to government information and the public’s right to know.

The James Madison Award, named in honor of President James Madison, was established in 1986 to celebrate an individual or group who has brought awareness to these issues at the national level. Madison is widely regarded as the Father of the Constitution and as the foremost advocate for openness in government.

The Eileen Cooke Award honors an extraordinary leader who has built local grassroots awareness of the importance of access to information. Cooke, former director of the ALA Washington Office, was a tireless advocate for the public’s right to know and a mentor to many librarians and trustees.

Both awards are presented during Freedom of Information (FOI) Day, an annual event on or near March 16, Madison’s birthday.

Nominations should be submitted to the ALA Washington Office no later than January 16, 2015. Submissions should include a statement (maximum one page) about the nominee’s contribution to public access to government information, why it merits the award and one seconding letter. Please include a brief biography and contact information for the nominee.

Send e-mail nominations to Jessica McGilvray, Assistant Director for the ALA Office of Government Relations. Submissions can also be mailed to:

James Madison Award / Eileen Cooke Award
American Library Association
Washington Office
1615 New Hampshire Avenue, NW
Washington, D.C. 20009-2520

The post ALA seeks nominations for 2015 James Madison awards appeared first on District Dispatch.

LITA: Don’t Miss the OpenStreetMaps Webinar

Thu, 2014-12-04 17:32

Before Hackforge’s Mita Williams Masters session on new spaces at the ALA 2015 Midwinter Meeting, you can attend her next LITA webinar, part of the “Re-drawing the Map”–a webinar series:

OpenStreetMaps: Trust the map that anyone can change

Tuesday December 9, 2014
1:00 pm – 2:00 pm Central Time
Instructor: Mita Williams
Register for this webinar

Ever had a map send you the wrong way and wished you could change it? Learn how to add your local knowledge to the “Wikipedia of Maps.”

It’s been said that “the map is not the territory”. But when most of the world’s websites and mobile apps rely on maps from private corporations who selectively show you places based on who you are (and who pays for the privilege), perhaps we should cede that territory for higher ground. It’s counter-intuitive to trust a map that anyone can edit, but OpenStreetMap is already the geospatial foundation of some of the world’s most popular sites including Pinterest, Evernote, and GitHub. This session will introduce you to OpenStreetMap and show you how you can both contribute to and make use of the “Wikipedia of Maps”.

Full details

Can’t make the date but still want to join in? Registered participants will have access to the recorded webinar.


  • LITA Member: $39
  • Non-Member: $99
  • Group: $190

Registration Information:

Register Online page arranged by session date (login required)


Mail or fax form to ALA Registration
OR call 1-800-545-2433 and press 5
OR email

Questions or Comments?

For all other questions or comments related to the course, contact LITA at (312) 280-4269 or Mark Beatty,

Library of Congress: The Signal: New FADGI Report: Creating and Archiving Born Digital Video

Thu, 2014-12-04 16:20

As part of a larger effort to explore file formats, the Born Digital Video subgroup of the Federal Agencies Digitization Guidelines Initiative Audio-Visual Working Group is pleased to announce the release of a new four-part report, “Creating and Archiving Born Digital Video.”

This report has already undergone review by FADGI members and invited colleagues including the IASA Technical Committee. With this release, we seek comments and feedback from all interested parties.

The report is the result of over 18 months of collaborative effort from a range of federal agencies including the Smithsonian Institution Archives as well as the Smithsonian Institution Office of the CIO, National Archives and Records Administration, National Oceanic and Atmospheric Administration, Voice of America, and several Library of Congress units including the American Folklife Center, the Web Archiving team and the Packard Campus for Audio-Visual Conservation.

The four documents that comprise the “Creating and Archiving Born Digital Video” report provide practical technical information for both file creators and file archivists to help them make informed decisions and understand the long-term consequences of those decisions when creating or archiving born digital video. The information is intended to serve memory institutions, especially in the U.S. federal sector. But of course we also hope that this report will serve the broader cultural heritage community who may produce and/or receive and ingest materials that range from high-end professional productions to more modest (but culturally important) grass-roots footage.

Clap On. Clap Off. photo by Chuck Olsen courtesy of Flickr.

The scope of the report is outlined in the introduction (Part I) (PDF) including background information and rationale on some of the choices made during the project. The eight case histories (Part II) (PDF) document aspects of the current state of practice in six U.S. federal agencies working with born digital video. These case histories not only describe deliverables and file specifications but also tell the story of each project, and provide background information about the institution and the collection, as well as lessons learned.

As the case histories developed, a set of high level recommended practices (Part III) (PDF) emerged from the collective project experiences. Drafting a uniform guideline or a cookbook felt premature at this point so these recommended practices are intended to support informed decision-making and guide file creators and archivists as they seek out workflows, file characteristics and other practices that will yield files with the greatest preservation potential.

Finally, the annotated resource guide (Part IV) (PDF) provides links to useful documentation, including reports, guidelines, software applications and other technical tools. Many of these resources are referenced in the “Case Histories” and “Recommended Practices” documents.

The report covers both the perspective of the archive that is receiving born digital video and seeks to preserve it for the long term (a group we call “file archivists”) and the perspective of the organization that oversees production (termed “file creators”). In many cases the “file creator” organization is itself an archive. Thus one of the goals of this report is to encourage dialog between stakeholders involved in creating born digital video files and those responsible for protecting the legacy of those files. Dialog between producers and archivists is essential to sustainability and interoperability of born digital video; this report aims to broach that topic in earnest by looking at thoughtful approaches and helpful practices.

IASA Technical Committee meeting. Photo courtesy of Carl Fleischhauer.

The goal of the three “Creating Born Digital Video” case histories, which we summarized as “start with nothing; end up with ingest-ready video,” is to encourage a thoughtful approach from the very beginning of the video production project, before even shooting the video, which takes the “long tail” perspective of preservation, use and reuse into account. These case histories illustrate the advantages of starting with high-quality data capture from the very start because choices made during the file creation process will have impacts on the long-term archiving and distribution processes.

The five “Archiving Born Digital Video” case histories tell the story of bringing the born digital video files into managed data repositories for long-term retention and access. Our shorthand for this group is “identify what you have and understand what you need to do to it.”

These case histories explore the issues which emerge when the born digital video objects arrive at the archive. They cover topics including the challenges of dealing with diverse formats, understanding and documenting relationships among the video files and related objects, and metadata. A major topic for this case history set is the technical characteristics of file formats: how to identify and document the formats coming into the archive, when changes to the file attributes are needed, and what the impacts of changes to the format and encoding are.

It bears mentioning that as this report was being compiled, the Library of Congress received the “Preserving Write-Once DVDs: Producing Disk Images, Extracting Content, and Addressing Flaws and Errors” (PDF) report from George Blood Audio/Video. The report was one product of a contract with GBAV in which the company converted a set of write-once DVDs for the Library. The report describes the issues encountered and provides some detail about GBAV’s methods for carrying out the work, thus providing a complement to the DVD section of the “Creating and Archiving Born Digital Video,” drafted by the Smithsonian Institution Archives.

Burned DVD by Roman Soto courtesy of Flickr.

The case histories (PDF) report includes summary tables of the file characteristics of the case history projects, one for “Creating Born Digital Video” projects and a separate one for the “Archiving Born Digital Video” projects. These two tables are interesting because they hint at the trends for the “right now solutions.” This is how some institutions are working today – using what they have to do what they can. It will be very interesting to see how this changes over time as practices advance and mature.

The recommended practices (PDF) are organized into three categories:

  • Advice for File Creators, also known as “advice for shooters,” focuses on providing video content producers, including videographers and, by extension, the project managers within cultural heritage institutions who are responsible for the creation of new born digital video, with a set of practices that emphasize the benefits of aiming for high quality and planning for archival repository ingest from the point of file creation.
  • Advice for File Archivists seeks to provide guidance about video-specific issues which come into play when ingesting the files into a managed storage repository.
  • Advice for File Creators and File Archivists are grouped together because they transcend specific lifecycle points. This guidance focuses on selecting sustainable encodings and wrappers whether at initial file creation or during normalization.

As mentioned in a previous blog post, the use, or more accurately the lack of use, of uncompressed video encodings is one marked example of how the case history projects deviate from the Recommended Practices. Quite simply, we didn’t follow our own advice. All five case history projects which specified encodings used compression. And of the five case history projects that implement compression, only one (The Library of Congress’s Packard Campus) implements mathematically lossless compression. The remaining four use various forms of lossy compression, including visually lossless, and all for good reasons. The specific goals of the case history projects necessitated different decisions in order to meet business needs – in this case, the need for smaller files and/or systems-specific compressed formats outweighed the benefits of uncompressed video.

Let’s start the dialog now! We welcome comments and feedback through the FADGI page or direct email to this writer from the interested public on the “Creating and Archiving Born Digital Video” report through the end of January 2015, after which we will review them and publish a “final” version early in the new year. Of course, comments received after our closing are equally welcome although they may have to wait until a planned revision to be addressed. We look forward to hearing from you.

David Rosenthal: A Note of Thanks

Thu, 2014-12-04 16:00
I have a top-of-the-line MacBook Air, which is truly a work of art, but I discovered fairly quickly that subjecting a machine that cost almost $2000 to the vicissitudes of today's travel is worrying. So for years now the machine I've travelled with is a netbook, an Asus Seashell 1005PE. It is small, light, has almost all-day battery life and runs Ubuntu just fine. It cost me about $250, and with both full-disk encryption and an encrypted home directory, I just don't care if it gets lost, broken or seized.

But at last the signs of the hard life of a travelling laptop are showing. I looked around for a replacement and settled on the Acer C720 Chromebook. This cost me $387 including tax and same-day delivery from Amazon. Actually, same-day isn't accurate. It took less than 9 hours from order to arrival! If I'd waited until Black Friday to order it would have been more than $40 cheaper.

For that price, the specification is amazing:
  • 1.7GHz 4-core Intel Core i3
  • 4GB RAM
  • 32GB SSD
  • 11.6" 1366x768 screen
Thanks to these basic instructions from Jack Wallen and the fine work of HugeGreenBug in assembling a version of Ubuntu for the C720, 24 hours after ordering I had a light, thin, powerful laptop with a great display running a full 64-bit installation of Ubuntu 14.04. I'm really grateful to everyone who contributed to getting Linux running on Chromebooks in general and on the C720 in particular. Open source is wonderful.

Of course, there are some negatives. The bigger screen is great, but it makes the machine about an inch bigger in width and depth. Like the Seashell and unlike full-size laptops, it will be usable in economy seats on the plane even if the passenger in front reclines their seat. But it'll be harder than it was with the Seashell to claim that the computer and the drink can co-exist on the economy seat-back table.

Below the fold, some details for anyone who wants to follow in my footsteps.

Jack Wallen's instructions for creating a recovery disk didn't work. After I had updated the Chrome OS, I discovered there's an app to create a recovery disk that requires the updated OS, which worked perfectly.

My attempt to install ChrUbuntu from Jack's instructions failed with a long string of errors as the install script tried to patch some files. I then installed Bodhi Linux from his instructions, which worked fine except for the part about enabling legacy boot via crosh. I had to follow his ChrUbuntu instructions to log in as chronos before enabling legacy boot. You can see whether what you did worked by executing crossystem - with no arguments it dumps all the settable parameters.

I'm used to Ubuntu, so some Googling I should have done before installing Bodhi Linux led me to HugeGreenBug's instructions for installing it, which worked like a charm.

The 32GB of SSD is not a lot of space. I added a 64GB SD card, but there is a problem. When inserted in the SD slot the card is only half inside the machine, so it is vulnerable and has to be removed when it's being carried.

The 24 hours didn't include transferring all my customizations from the Seashell, but I don't expect any trouble doing that when I get to it shortly before my next trip.

DPLA: VT Community Rep introduces DPLA to Veterans at Disability Awareness event

Thu, 2014-12-04 14:30

As a Community Rep for Vermont, I introduced Digital Public Library of America (DPLA) to employees of the Veterans Affairs Medical Center (VAMC) in White River Junction, Vermont, during its Disability Employment Awareness event on October 16, 2014. Drawing upon my law degree (Vermont Law School) and Masters in Library and Information Science (Simmons College), I have been volunteering at VAMC this year in order to contribute toward programs that benefit current and retired members of the U.S. Armed Forces and their families. Bringing DPLA to veterans and civilians with disabilities was my first effort as the Community Rep to bridge digital divides among under-represented populations. The event’s motto “Expect. Employ. Empower.” was about creating a society of inclusion, thus it seemed to be a perfect fit for DPLA. Ten other participants were local entities that provide adaptive technologies to people with disabilities. The community programs cover healthy eating; vision, hearing, and mobility assistance; as well as outdoor and sports activities.

The DPLA Info Table was equipped with a laptop for a hands-on presentation, which attracted around thirty or so VA veterans and civilian employees, including Deborah Amdur, VAMC Director. Most attendees had never heard of DPLA before, while others were quite informed: “We are nurses, and we learn a lot at our workshops.” Still, another group, albeit the smallest, was not only aware of DPLA as a portal but also as a platform; those were IT employees.

Community Rep Natalia May demonstrates DPLA at the Veterans Affairs Medical Center in White River Junction, Vermont, during its October 2014 Disability Employment Awareness event.

Despite the various degrees of DPLA awareness, the attendees’ responses may be grouped in the following common threads:

  • Attendees recognize DPLA as a “library” (asking if the Info Table got managed by the VAMC in-house library)
  • Attendees with visual and/or audio impairments need visual/audio features to enhance DPLA content for them (asking if I could make the font large, or if a video had subtitles, etc.)
  • Attendees with limited mobility enjoy DPLA all together (commenting that discovering DPLA is akin to “a travel-free, and thus, trouble-free, visit to a neighborhood library or a museum”)
  • Attendees take DPLA swag to share (not only with their family and friends, but also with volunteer organizations they belong to: libraries, peer-to-peer veterans help groups, etc.)
  • Attendees are particularly fond of the DPLA’s Timeline feature (that directly answers their needs in authentic sources on history: “This may be another way to enhance our exhibit at a historical society I belong to,” genealogy: “I bet I can find out something for my family’s genealogical tree, as I am an ancestry-buff among my siblings,” and warfare: “to match time of deployment with the country’s events”).

Introducing DPLA to disabled veterans made me realize two things. First, there is a real desire among VA populations to learn more about DPLA and other online resources. The Info Table format allowed only a short introductory presentation, while attendees were curious to spend more time searching beyond the major DPLA tabs, such as Bookshelf (“to search public records”) and Partners. The veterans are likely to benefit from a more formal sit-down instruction with individual computer access. Second, veterans are extremely fond of volunteers, as most of them are volunteers themselves; without exception, they were full of gratitude for the fact that I had brought DPLA into their lives.

FOSS4Lib Recent Releases: Fedora Repository - 3.8.0

Thu, 2014-12-04 14:21
Package: Fedora Repository
Release Date: Thursday, December 4, 2014

Last updated December 4, 2014. Created by Peter Murray on December 4, 2014.

The Fedora 3.8 release features an improved REST API interaction with correct headers returned for better caching along with performance improvements and bug fixes.

FOSS4Lib Recent Releases: Fedora Repository - 4.0

Thu, 2014-12-04 14:19
Package: Fedora Repository
Release Date: Thursday, December 4, 2014

Last updated December 4, 2014. Created by Peter Murray on December 4, 2014.

The international Fedora repository community and DuraSpace are very pleased to announce the production release of Fedora 4. This significant release signals the effectiveness of an international and complex community source project in delivering a modern repository platform with features that meet or exceed current use cases in the management of institutional digital assets. Fedora 4 features include vast improvements in scalability, linked data capabilities, research data support, modularity, ease of use and more.

OCLC Dev Network: Today at Developer House: Data strategy, lightning talks, oh my!

Thu, 2014-12-04 02:30

Today’s developer house activities were a real adventure. The morning was filled with an overview of OCLC’s data strategy and plans for exposing entities. The latter half of the morning saw a bevy of lightning talks ranging from user experience to Hadoop and inspired lots of great conversations over lunch.

SearchHub: Infographic: Gender Gap – Women in Technology

Thu, 2014-12-04 01:00
Women who choose careers in technology and other STEM fields are pivotal to technological innovation but they are increasingly relegated to the sidelines inside their own organizations. Here’s a snapshot of the gender gap in technology and how it compares to the rest of the workforce – and why we should reprogram the gender balance:

The post Infographic: Gender Gap – Women in Technology appeared first on Lucidworks.

DuraSpace News: CALL for OR2015 Scholarship Programme Applicants

Wed, 2014-12-03 00:00

Indianapolis, IN  The Tenth International Conference on Open Repositories, OR2015, will take place on June 8-11, 2015 in Indianapolis (Indiana, USA). The organizers are pleased to invite you to apply to the 2015 Scholarship Programme.

District Dispatch: Put your library on Digital Inclusion map before Dec 12!

Tue, 2014-12-02 21:19

Last call! Add your voice now to a nationally representative study of public libraries and the roles they play in community digital inclusion. Participate in the 2014 Digital Inclusion Survey by December 12 to add your library to interactive community maps and support efforts to educate policymakers and the media about modern library services, resources and infrastructure.

Participation in the survey can also help your library identify the impacts of public computer and Internet access on your community and demonstrate library contributions to community digital inclusion efforts. The study is funded by the Institute of Museum and Library Services, and conducted by the American Library Association (ALA), the Information Policy & Access Center (iPAC) at University of Maryland, the International City/County Management Association (ICMA), and Community Attributes International (CAI).

Find your community on our new interactive map here and check out the rest of our tools and resources here. With your help we can further build on these tools and products with the 2014 survey results. To participate, go to and follow the Take Survey Now button. The survey is open until December 12. (By participating you can also register to win one of three Kindles!)

Questions? E-mail Thank you!

The post Put your library on Digital Inclusion map before Dec 12! appeared first on District Dispatch.

OCLC Dev Network: Four Projects Started at Developer House

Tue, 2014-12-02 20:30

Developer House is underway!  We have four teams working on four different projects.  Each of us has the same goal: We will have fun developing these projects and we will have working code to demonstrate on Friday morning.

FOSS4Lib Upcoming Events: CollectionSpace: Getting it up and running at your museum

Tue, 2014-12-02 19:24
Date: Monday, February 9, 2015 - 12:00 to 17:00
Supports: CollectionSpace

Last updated December 2, 2014. Created by Peter Murray on December 2, 2014.

This workshop is designed for anyone interested in or tasked with the technical setup and configuration of CollectionSpace for use in any collections environment (museum, library, special collection, gallery, etc.). For more information about CollectionSpace, visit

HangingTogether: Gifts for archivists (and librarians)?

Tue, 2014-12-02 17:44

Last year we asked on the ArchiveGrid blog for suggestions for gifts for archivists — and we were blown away by the number (and quality!) of suggestions (posted in 24 fun and practical gifts for archivists). This year, we’re moving the conversation over to HangingTogether and extending the fun to librarians. So, librarians and archivists, what would you like as a gift? We’ll assemble the best of the best and post them in a week or two. Then it’s up to you to leave the link for that special someone to find. Or use it to treat your colleagues. We look forward to your suggestions in the comments below!

[Untitled, Anacostia family c. 1950. Smithsonian Institution]

About Merrilee Proffitt


David Rosenthal: Henry Newman's Farewell Column

Tue, 2014-12-02 16:00
Henry Newman has been writing a monthly column on storage technology for Enterprise Storage Forum for 12 years, and he's decided to call it a day. His farewell column is entitled Follow the Money: Picking Technology Winners and Losers and it starts:

I want to leave you with a single thought about our industry and how to consistently pick technology winners and losers. This is one of the biggest lessons I’ve learned in my 34 years in the IT industry: follow the money.

It's an interesting read. Although Henry has been a consistent advocate for tape for "almost three decades", he uses tape as an example of the money drying up. He has a table showing that the LTO media market is less than half the size it was in 2008. He estimates that the total tape technology market is currently about $1.85 billion, whereas the disk technology market is around $35 billion.

Following the money also requires looking at the flip side and following the de-investment in a technology. If customers are reducing their purchases of a technology, how can companies justify increasing their spending on R&D? Companies do not throw good money after bad forever, and at some point they just stop investing.

Go read the whole thing and understand why Henry's regular column will be missed, and how perceptive the late Jim Gray was when in 2006 he stated that Tape is Dead, Disk is Tape, Flash is Disk.

Open Knowledge Foundation: Introducing Open Education Data

Tue, 2014-12-02 15:29

Open education data is a relatively new area of interest with only dispersed pockets of exploration having taken place worldwide. The phrase ‘open education data’ remains loosely defined but might be used to refer to:

  • all openly available data that could be used for educational purposes
  • open data that is released by education institutions

Understood in the former sense, open education data can be considered a subset of open education resources (OERs) where data sets are made available for use in teaching and learning. These data sets might not be designed for use in education, but can be repurposed and used freely.

In the latter sense, the interest is primarily around the release of data from academic institutions about their performance and that of their students. This could include:

  • Reference data such as the location of academic institutions
  • Internal data such as staff names, resources available, personnel data, identity data, budgets
  • Course data, curriculum data, learning objectives
  • User-generated data such as learning analytics, assessments, performance data, job placements
  • Benchmarked open data in education that is released across institutions and can lead to change in public policy through transparency and raising awareness.

Last week I gave a talk introducing open education data at the LTI NetworkED Seminar series run by the London School of Economics Learning Technology and Innovation Department. The talk ended up being a very broad overview of how we can use open data sets to meet educational needs and of the challenges and opportunities this presents, including, for example, issues around monitoring and privacy. Prior to giving the talk I was interviewed for the LSE blog.

A video of the talk is available on the CLTSupport YouTube Channel.

DPLA: Open Technical Advisory Committee Call: Wednesday, December 3, 2:00 PM Eastern

Tue, 2014-12-02 14:00

The DPLA Technical Advisory Committee will lead an open committee call on Wednesday, December 3 at 2:00 PM Eastern. To register, complete the short registration form available via the link below.

  1. AWS migration
  2. Ingestion development
  3. Frontend usability assessment work
  4. Recent open source contributions (non-DPLA-specific projects) by tech team members
  5. Upcoming events with DPLA tech team participation
  6. DPLA Hubs application
  7. Questions, comments, and open discussion

Islandora: Research Data in Islandora

Tue, 2014-12-02 13:01

The idea of storing research data in Islandora has come up a fair bit lately at camps and on the listserv, so here is a short overview of the current state of tools and projects that touch on the topic:

  • Combining the Compound Solution Pack with the Binary Solution Pack can get your data into Islandora and make it browsable. The Binary SP, which is still in development, can accommodate any kind of data with a barebones ingestion that adds only the objects necessary for Fedora. The Compound SP can be used to 'attach' these files to a parent object more suitable to display and browsing, such as a PDF or image.
  • Islandora Scholar contains tools for disseminating information about citations. When used in conjunction with the Entities Solution Pack (recently offered to the Islandora Foundation and likely to be in the 7.x-1.5 release next year), it can manage authority records for scholars and projects.
  • The Data Solution Pack, being developed by Alex Garnett at Simon Fraser University, uses Ethercalc to display and manipulate data from XLSX, XLS, ODS, and CSV sources in a spreadsheet viewer.
  • Simon Fraser also has a Research Data Repository environment with SFUdora, which supports DDI and desktop synchronization. It is demonstrated here by Alex Garnett.
  • Research data can also be handled by using existing solution packs in novel ways. One of the first Islandora projects at UPEI involved storing electron microscope images with the Large Image Solution Pack, which was perfectly suited to storing and presenting such massive files. UPEI has also employed the Image Annotation Solution Pack to steward and annotate goat anatomy photos for veterinary students.
  • A quantum chemist at UPEI is updating the Chemistry Solution Pack to work with Islandora 7.x.
  • UPEI is also developing a Biosciences Solution Pack to serve their biodiversity and bioscience wetlab.
  • The UPEI team is developing integration of the DDC Data Management Planning Tool into the Islandora stack, with work nearly complete.
  • CNR IPSP and CNR IRCrES in Italy are using Islandora to store, preserve, and make accessible scientific data produced by the Institute of Plant Virology and the Institute of Plant Protection of the Italian National Research Council with the V2P2 project. This repository handles data relating to plant, microorganism, and virus interactions.
  • The University of Toronto Scarborough has begun a project for Learning in Neural Circuits. More projects are in development, such as Eastern Himalaya Research Network and Mediating Israel. A broader Research Commons service is also in the works.
  • The Smithsonian Institution uses a heavily customized Islandora instance called SIdora for field research data.

Are you working with research data in Islandora? Are you planning to? Contact us and share your story.

Ed Summers: Inter-face

Tue, 2014-12-02 01:49

Image from page 315 of “The elements of astronomy; a textbook” (1919)

Every document, every moment in every document, conceals (or reveals) an indeterminate set of interfaces that open into alternate spaces and temporal relations.

Traditional criticism will engage this kind of radiant textuality more as a problem of context than a problem of text, and we have no reason to fault that way of seeing the matter. But as the word itself suggests, “context” is a cognate of text, and not in any abstract Barthesian sense. We construct the poem’s context, for example, by searching out the meanings marked in the physical witnesses that bring the poem to us. We read those witnesses with scrupulous attention, that is to say, we make our detailed way through the looking glass of the book and thence to the endless reaches of the Library of Babel, where every text is catalogued and multiple cross-referenced. In making the journey we are driven far out into the deep space, as we say these days, occupied by our orbiting texts. There objects pivot about many different points and poles, the objects themselves shapeshift continually and the pivots move, drift, shiver, and even dissolve away. Those transformations occur because “the text” is always a negotiated text, half perceived and half created by those who engage with it.

Radiant Textuality by Jerome McGann