Feed aggregator

D-Lib: The Sixth Annual VIVO Conference

planet code4lib - Tue, 2015-09-15 14:43
Article by Carol Minton Morris, Duraspace

D-Lib: Enduring Access to Rich Media Content: Understanding Use and Usability Requirements

planet code4lib - Tue, 2015-09-15 14:43
Article by Madeleine Casad, Oya Y. Rieger and Desiree Alexander, Cornell University Library

D-Lib: Success Criteria for the Development and Sustainable Operation of Virtual Research Environments

planet code4lib - Tue, 2015-09-15 14:43
Article by Stefan Buddenbohm, Goettingen State and University Library; Heike Neuroth, University of Applied Science Potsdam; Harry Enke and Jochen Klar, Leibniz Institute for Astrophysics Potsdam; Matthias Hofmann, Robotics Research Institute, TU Dortmund University

D-Lib: Year Twenty-One

planet code4lib - Tue, 2015-09-15 14:43
Editorial by Laurence Lannom, CNRI

DuraSpace News: REGISTER for the Midwest Fedora User Group Meeting

planet code4lib - Tue, 2015-09-15 00:00

From Stefano Cossu, Director of Application Services, Collections, The Art Institute of Chicago

Chicago, Ill.  I am pleased to announce that the first Midwest Fedora User Group Meeting will be held in Chicago on October 22 and 23, 2015. Event details are posted on the DuraSpace wiki page.

Registration is still open and presentation proposals are welcome! If you are interested in participating, please register through the registration form.

DuraSpace News: Toward the Next DSpace User Interface: The DSpace UI Prototype Challenge

planet code4lib - Tue, 2015-09-15 00:00

Winchester, MA  Help us discover the technology/platform for our new user interface (UI) for DSpace!  You are invited to create a prototype UI on a platform of your choice (in any programming language), with basic DSpace-like capabilities as described below. The goal of a UI prototype is to exercise a new UI technology/platform to see whether it would meet the needs of our DSpace community.

District Dispatch: Important win for fair use and for babies who dance

planet code4lib - Mon, 2015-09-14 21:47

In Lenz v. Universal, an appeals court in San Francisco today ruled that a rights holder must consider whether a use is fair before sending a takedown notice. The “Dancing Baby Case,” you may recall, is about a takedown notice a mother received after uploading a video to YouTube showing her baby dancing to Prince’s “Let’s Go Crazy.” The court found that rights holders cannot send takedown notices without first considering whether the use of the copyrighted content is fair. This ruling not only clarifies the steps that rights holders should consider before issuing a takedown notice, it also bolsters the notion that fair use is a right, not just an affirmative defense to infringement.

“Fair use is not just excused by the law, it is wholly authorized by the law . . . The statute explains that the fair use of a copyrighted work is permissible because it is a non-infringing use.”

“Although the traditional approach is to view ‘fair use’ as an affirmative defense . . . it is better viewed as a right granted by the Copyright Act of 1976. Originally, as a judicial doctrine without any statutory basis, fair use was an infringement that was excused–this is presumably why it was treated as a defense. As a statutory doctrine, however, fair use is not an infringement. Thus, since the passage of the 1976 Act, fair use should no longer be considered an infringement to be excused; instead, it is logical to view fair use as a right. Regardless of how fair use is viewed, it is clear that the burden of proving fair use is always on the putative infringer.” Bateman v. Mnemonics, Inc., 79 F.3d 1532, 1542 n.22 (11th Cir. 1996).

The court’s ruling is one that reflects what people understand to be a fair use. The general public thinks that integrating portions of copyrighted content in non-commercial user-created videos is reasonable. Today, there are so many dancing baby videos on YouTube that people are starting to curate them!

I like it when the law makes sense to regular people – after all, in today’s digital environment, copyright affects the lives of everyday people, not just the content industry. Many hope that Congress also understands this as it considers copyright review. Congratulations to the Electronic Frontier Foundation for their leadership on this litigation over the past several years.


SearchHub: Searching and Querying Knowledge Graphs with Solr/SIREn: a Reference Architecture

planet code4lib - Mon, 2015-09-14 19:30
As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Renaud Delbru and Giovanni Tummarello’s session on querying knowledge graphs with Solr/SIREn:

Knowledge Graphs have recently gained press coverage as information giants like Google, Facebook, Yahoo and Microsoft announced having deployed Knowledge Graphs at the core of their search and data management capabilities. Very richly structured datasets like “Freebase” or “DBPedia” can be said to be examples of these. In this talk we discuss a reference architecture for high-performance structured querying and search on knowledge graphs. While graph databases, e.g., triplestores or graph stores, have a role in this scenario, it is Solr, along with its schemaless structured search plugin SIREn, that makes it possible to deliver fast and accurate entity search with rich structured querying. During the presentation we will discuss an end-to-end case study built on a tourism social data use case. We will cover extraction, graph databases, SPARQL, JSON-LD, and the role of Solr/SIREn both as a search component and as a high-speed structured query component. The audience will leave this session with an understanding of the Knowledge Graph idea and how graph databases, SPARQL, JSON-LD and Solr/SIREn can be combined to implement high-performance real-world applications on rich and diverse structured datasets.

Renaud Delbru, Ph.D., CTO and Founder at SindiceTech, is leading the research and development of the SIREn engine and of all aspects related to large-scale data retrieval and analytics. He is the author of over a dozen academic works in the area of semi-structured information retrieval and big data RDF processing. Prior to SindiceTech, Renaud completed his Ph.D. on Information Retrieval for Semantic Web data at the Digital Enterprise Research Institute, Galway, where he worked on the semantic search engine project. Among his achievements, he led the team that won the Entity Search track of Yahoo’s Semantic Search 2011.

Searching and Querying Knowledge Graphs with Solr/SIREn – A Reference Architecture: Presented by Renaud Delbru & Giovanni Tummarello, SIREn Solutions, from Lucidworks

Join us at Lucene/Solr Revolution 2015, the biggest open source conference dedicated to Apache Lucene/Solr, October 13-16, 2015 in Austin, Texas. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…
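
As an illustration of the data flow the abstract sketches, here is a minimal example in Python: pull entity bindings from a SPARQL endpoint as standard JSON results, flatten them into documents, and post them to Solr’s JSON update handler. The endpoint URLs, core name, and field names below are assumptions, and the SIREn-specific schema configuration is omitted; this is a sketch of the pipeline, not the presenters’ implementation.

from __future__ import print_function
import json, urllib, urllib2

# assumed endpoints: a generic SPARQL service and a plain Solr core
SPARQL_ENDPOINT = 'http://localhost:3030/tourism/sparql'
SOLR_UPDATE = 'http://localhost:8983/solr/entities/update?commit=true'

query = """
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?s ?label WHERE { ?s rdfs:label ?label } LIMIT 100
"""

# ask the endpoint for standard SPARQL JSON results
req = urllib2.Request(SPARQL_ENDPOINT + '?' + urllib.urlencode({'query': query}),
                      headers={'Accept': 'application/sparql-results+json'})
results = json.load(urllib2.urlopen(req))

# flatten each binding into a small document, one per entity
docs = [{'id': b['s']['value'], 'label_t': b['label']['value']}
        for b in results['results']['bindings']]

# post the documents to Solr's JSON update handler
update = urllib2.Request(SOLR_UPDATE, json.dumps(docs),
                         {'Content-Type': 'application/json'})
urllib2.urlopen(update)
print('indexed %d documents' % len(docs))

With SIREn added to Solr, the same flow can carry nested JSON-LD documents rather than flat fields, which is the schemaless structured-search capability the talk focuses on.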


Thom Hickey: Extracting information from VIAF

planet code4lib - Mon, 2015-09-14 18:04

Occasionally I run into someone trying to extract information out of VIAF and having a difficult time. Here's a simple example of how I'd begin extracting titles for a given VIAF ID.  Far from industrial strength, but might get you started.

The problem: Have a file of VIAF IDs (one/line).  Want a file of the titles, each preceded by the VIAF ID of the record they were found in.

There are lots of ways to do this, but my inclination is to do it in Python (I ran this in version 2.7.1) and to use the raw VIAF XML record:

from __future__ import print_function
import sys, urllib
from xml.etree import cElementTree as ET

# reads in list of VIAF IDs one/line
# writes out VIAFID\tTitle one/line

# worry about the name space used by VIAF cluster records
ns = {'v': 'http://viaf.org/viaf/terms#'}

# path to the work titles within a cluster record
ttlPath = 'v:titles/v:work/v:title'

def titlesFromVIAF(viafXML, path):
    vel = ET.fromstring(viafXML)
    for el in vel.findall(path, ns):
        yield el.text

for line in sys.stdin:
    viafid = line.strip()
    # raw XML for a cluster record lives at /viaf/{id}/viaf.xml
    viafURL = 'http://viaf.org/viaf/%s/viaf.xml' % viafid
    viafXML = urllib.urlopen(viafURL).read()
    for ttl in titlesFromVIAF(viafXML, ttlPath):
        print('%s\t%s' % (viafid, ttl.encode('utf-8')))

That's about as short as I could get it and have it readable in this narrow window.  We've been using the new print function (and division!) for some time now, with an eye towards Python 3.
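
Saved as, say, viaftitles.py (a name used here just for illustration), it runs as a plain stdin-to-stdout filter:

python viaftitles.py < viafids.txt > titles.txt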


Update 2015.09.16: Cleaned up how namespace is specified

SearchHub: Infographic: The Dangers of Bias in High-Stakes Data Science

planet code4lib - Mon, 2015-09-14 17:04
A data set is only as powerful as the ability of data scientists to interpret it, and insights gleaned can have huge ramifications in business, public policy, health care, and elsewhere. As the stakes of data-driven decisions become increasingly high, let’s look at some of the most common data science fallacies.


SearchHub: Stump The Chump: Meet The Panel Keeping Austin Weird

planet code4lib - Mon, 2015-09-14 14:14

As previously mentioned: On October 15th, Lucene/Solr Revolution 2015 will once again be hosting “Stump The Chump” in which I (The Chump) will be answering tough Solr questions — submitted by users like you — live, on stage, sight unseen.

Today, I’m happy to announce the Panel of experts that will be challenging me with those questions, and deciding which questions were able to Stump The Chump!

In addition to taunting me with the questions, and ridiculing all my “Um”s and “Uhh”s as I stall for time while I rack my brain to come up with a non-gibberish answer, the Panel members will be responsible for awarding prizes to the folks who have submitted the questions that do the best job of “Stumping” me.

Check out the session information page for details on how to submit questions. Even if you can’t make it to Austin to attend the conference, you can still participate — and do your part to humiliate me — by submitting your questions.

To keep up with all the “Chump” related info, you can subscribe to this blog (or just the “Chump” tag).


FOSS4Lib Upcoming Events: Open Journal Systems 3.0: A User Experience Webinar

planet code4lib - Mon, 2015-09-14 13:57
Date: Tuesday, October 20, 2015 - 10:00 to 11:00
Supports: Open Journal Systems


From the announcement:

Open Journal Systems 3.0: A User Experience Webinar
Thinking about user experience and libraries? Be sure to register for this UX webinar.

In August, Open Journal Systems 3.0 was released in beta. The new version has major upgrades, including improvements to usability.
In this webinar, Kevin Stranack of the Public Knowledge Project will provide a case study of integrating UX into a major web design project: OJS.

Islandora: Islandora Community Sprint 001 - Complete!

planet code4lib - Mon, 2015-09-14 13:27

For the past couple of weeks, the Islandora Community has been working together on a collective experiment in community-driven code: the community sprint. The brainchild of Jared Whiklo and Nick Ruest, the sprint was announced in mid-August with a call for volunteers to tackle maintenance work: fixing bugs, handling code tasks, and updating documentation. Ten people signed up:

  • Nick Ruest
  • Jared Whiklo
  • Melissa Anez
  • Diego Pino
  • Mark Cooper
  • Brian Harrington
  • Kim Pham
  • Peter Murray
  • Lingling Jiang
  • Jacob Sanford

And on Monday, August 31st, they got to work. 118 outstanding issues from our JIRA tracker were tagged as possible tasks for the sprint, ranging from complex bug fixes that spanned multiple modules to 'newbie' tickets updating readme files and user documentation. The sprint got off to a brisk start, knocking off 15 tickets in the first 24 hours. The pace slowed a little as we ran out of low-hanging fruit and got into the more challenging stuff, but I'll let this JIRA tracker gif speak for itself:

By the end, 38 tickets were down for the count. 

Some of the highlights include:

JIRA 1087 - Get a WSOD when using xml form with template and changing tabbed template values. A particularly tricky little error that was difficult to trigger, but pointed to deeper issues. Reported nearly a year ago, and actually affecting the code since 2012, this bug finally fell to the efforts of Diego Pino near the end of the sprint. If you're curious about the details, just check the long and involved comment thread on the ticket - this one was a doozy!

JIRA 1383 and JIRA 1394 weren't particularly complex or difficult, but they share the record for pull requests for a single sprint ticket. Both involved making updates to the readme file of every module in the release, to point to our new static wiki documentation and to the new instructions on contributing to Islandora that will ship with every module. They were also my biggest tickets, and I include them in this report not to brag, but to demonstrate that someone with no programming skills using the GitHub GUI can still be an active part of a sprint. Kim Pham and I were both 'newbie' sprinters and tackled low-level tickets accordingly, but we racked up a decent chunk of the 38 completed tickets, so I hope more 'newbies' will consider joining us on the next sprint.

JIRA 1274 was a Blocker ticket finished up during the sprint by Jared Whiklo. This one brings Ashok Modi's great Form Field Panel into Islandora coding standards so that it can be included in the 7.x-1.6 release.

Our sprinters also did a lot of testing, enabling some fixes from outside the sprint to finally be implemented, such as:

  • JIRA 1181 - newspaper view is broken when newspaper issues have XACML policies
  • JIRA 1184 - islandora_batch failing to import MARCXML
  • JIRA 1292 - Citeproc shouldn't try to render dates it doesn't have
  • JIRA 1419 - Derivative Generation on Web Archive is Broken After Adding JPG Functionality (a Release Blocker)

Many thanks to the first team of Islandora community sprinters. Your hard work and coordination have proven this approach can have great results, so we will definitely be sprinting again. Keep an eye on the listserv and here for the next announcement.

LITA: Creating High-Quality Online Video Tutorials

planet code4lib - Mon, 2015-09-14 11:00

Lately it seems all I do all day is create informational or educational video tutorials on various topics for my library. The Herbert Wertheim College of Medicine Medical Library at Florida International University in Miami, FL has perfected a system. First, a group of three librarians writes and edits a script on a topic. In the past we have done multiple videos on American Medical Association (AMA) and American Psychological Association (APA) citation styles, Evidence-Based Medicine research to support a course, and other titles on basic library services and resources. After a script has been finalized, we record the audio. We have one librarian who has become “the voice of the library,” one simple method to brand the library. After that, I coordinate the visuals – a mixture of PowerPoint slides, visual effects and screen recordings. We use Camtasia to edit and produce our videos and upload them to our fiumedlib YouTube channel. Below are some thoughts and notes to consider if starting your own collection of online video tutorials for your organization.

Zoom In
As my former photography teacher declared: rather than zoom in with a telephoto lens, walk toward your subject. You, as the photographer, should reposition yourself to get the best shot. The same holds true for screen recordings. If recording from your browser, it is good practice to use the zoom feature before capturing to get the sharpest footage. If using Chrome, click the customize-and-control icon (the three-bar icon) at the top right of the browser window and you will see the option to zoom in or out. Keep in mind that the look of the video also depends on the viewer's monitor resolution and other factors – so sometimes you have to let it go. The screenshots below show one recording made at 100% zoom and another at 175%. This small change affected the clarity of the footage.

recorded at 100%

recorded at 175%

Write a Script and Record Audio First – (then add Visuals)
Most people multitask by recording the voiceover and the screen video at the same time. I have found that this creates multiple mistakes and the need to record multiple takes. Preparation steps help projects run smoothly.

Brand Your Library
The team brands our library by using the same opening title slide with our logo, the same ending slide with our contact email, and the same background music clip. In addition, we try to use a common look and feel throughout the video lesson to further cement that these are from the same library. As mentioned before, we use the same narrator for our videos.

PowerPoint Slides
I cringe at the thought of seeing a PowerPoint slide with a header and a list of bullet points. PowerPoint is not necessarily bad; I just like to promote using the software in a creative manner by approaching each slide as a canvas. I steer clear of templates and the usual “business” organization of a slide.

Check out the current videos our department has created and let me know if you have any questions at

Herbert Wertheim College of Medicine Medical Library YouTube Channel:

Karen Coyle: Models of our World

planet code4lib - Mon, 2015-09-14 09:57

This is to announce the publication of my book, FRBR, Before and After, by ALA Editions, available in November, 2015. As is often the case, the title doesn't tell the story, so I want to give a bit of an introduction before everyone goes: "Oh, another book on FRBR, yeeech." To be honest, the book does have quite a bit about FRBR, but it's also a think piece about bibliographic models, and a book entitled "Bibliographic Models" would look even more boring than one called "FRBR, Before and After."

The before part is a look at the evolution of the concept of Work, and, yes, Panizzi and Cutter are included, as are Lubetzky, Wilson, and others. Then I look at modeling and how goals and models are connected, and the effect that technology has (and has not) had on library data. The second part of the book focuses on the change that FRBR has wrought both in our thinking and in how we model the bibliographic description. I'll post more about that in the near future, but let me just say that you might be surprised at what you read there.

The text will also be available as open access in early 2016, thanks to the generosity of ALA Editions, who agreed to this model. I do hope that enough libraries and individuals decide to purchase the hard copy that ALA Publishing puts out so that this model of print plus OA is economically viable. I can attest that the editorial work and design that went into the book produced a final version I could not have even approximated on my own.

DuraSpace News: JOIN "VIVO Stories": Introducing People, Projects, Ideas and Innovation

planet code4lib - Mon, 2015-09-14 00:00

The Telling VIVO Stories Task Force is underway! The task force goal is to grow our open source community and actively engage with its members by sharing each other's stories. The first three stories are now available to inspire and answer questions about how others have implemented VIVO at their institutions.

