Planet Code4Lib

DuraSpace News: REGISTER for the Midwest Fedora User Group Meeting

Tue, 2015-09-15 00:00

From Stefano Cossu, Director of Application Services, Collections, The Art Institute of Chicago

Chicago, Ill.  I am pleased to announce that the first Midwest Fedora User Group Meeting will be held in Chicago on October 22 and 23, 2015. Event details are posted on the DuraSpace wiki page:

Registration is still open and presentation proposals are welcome! If you are interested in participating, please register through this form:

DuraSpace News: Toward the Next DSpace User Interface: The DSpace UI Prototype Challenge

Tue, 2015-09-15 00:00

Winchester, MA  Help us discover the technology/platform for our new user interface (UI) for DSpace!  You are invited to create a prototype UI on a platform of your choice (in any programming language), with basic DSpace-like capabilities as described below. The goal of a UI prototype is to exercise a new UI technology/platform to see whether it would meet the needs of our DSpace community.

District Dispatch: Important win for fair use and for babies who dance

Mon, 2015-09-14 21:47


In Lenz v. Universal, an appeals court in San Francisco today ruled that a rights holder must consider whether a use is fair before sending a takedown notice. The “Dancing Baby Case,” you may recall, is about a takedown notice a mother received after uploading a video to YouTube showing her baby dancing to Prince’s “Let’s Go Crazy.” The court found that rights holders cannot send takedown notices without first considering whether the use of the copyrighted content is fair. This ruling not only clarifies the steps that rights holders should consider before issuing a takedown notice, it also bolsters the notion that fair use is a right, not just an affirmative defense to infringement.

“Fair use is not just excused by the law, it is wholly authorized by the law . . . The statute explains that the fair use of a copyrighted work is permissible because it is a non-infringing use.”

“Although the traditional approach is to view ‘fair use’ as an affirmative defense . . . it is better viewed as a right granted by the Copyright Act of 1976. Originally, as a judicial doctrine without any statutory basis, fair use was an infringement that was excused–this is presumably why it was treated as a defense. As a statutory doctrine, however, fair use is not an infringement. Thus, since the passage of the 1976 Act, fair use should no longer be considered an infringement to be excused; instead, it is logical to view fair use as a right. Regardless of how fair use is viewed, it is clear that the burden of proving fair use is always on the putative infringer.” Bateman v. Mnemonics, Inc., 79 F.3d 1532, 1542 n.22 (11th Cir. 1996).

The court’s ruling is one that reflects what people understand to be a fair use. The general public thinks that integrating portions of copyrighted content in non-commercial user-created videos is reasonable. Today, there are so many dancing baby videos on YouTube that people are starting to curate them!

I like when the law makes sense to regular people – after all, in today’s digital environment, copyright affects the lives of everyday people, not just the content industry. Many hope that Congress also understands this as it considers copyright review. Congratulations to the Electronic Frontier Foundation for their leadership on this litigation over the past several years.

The post Important win for fair use and for babies who dance appeared first on District Dispatch.

SearchHub: Searching and Querying Knowledge Graphs with Solr/SIREn: a Reference Architecture

Mon, 2015-09-14 19:30
As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we're highlighting talks and sessions from past conferences. Today, we're highlighting Renaud Delbru and Giovanni Tummarello's session on querying knowledge graphs with Solr/SIREn:

Knowledge Graphs have recently gained press coverage as information giants like Google, Facebook, Yahoo and Microsoft announced having deployed Knowledge Graphs at the core of their search and data management capabilities. Very richly structured datasets like "Freebase" or "DBpedia" can be said to be examples of these. In this talk we discuss a reference architecture for high-performance structured querying and search on knowledge graphs. While graph databases, e.g., triplestores or graph stores, have a role in this scenario, it is via Solr, along with its schemaless structured search plugin SIREn, that it is possible to deliver fast and accurate entity search with rich structured querying. During the presentation we will discuss an end-to-end example case study, a tourism social data use case. We will cover extraction, graph databases, SPARQL, JSON-LD and the role of Solr/SIREn both as search and as high-speed structured query component. The audience will leave this session with an understanding of the Knowledge Graph idea and how graph databases, SPARQL, JSON-LD and Solr/SIREn can be combined to implement high-performance real-world applications on rich and diverse structured datasets.

Renaud Delbru, Ph.D., CTO and Founder at SindiceTech, leads the research and development of the SIREn engine and all aspects related to large-scale data retrieval and analytics. He is the author of over a dozen academic works in the area of semi-structured information retrieval and big data RDF processing. Prior to SindiceTech, Renaud completed his Ph.D. on Information Retrieval for Semantic Web data at the Digital Enterprise Research Institute, Galway, where he worked on the semantic search engine project. Among his achievements, he led the team that won the Entity Search track of Yahoo's Semantic Search 2011.

Searching and Querying Knowledge Graphs with Solr/SIREn – A Reference Architecture: Presented by Renaud Delbru & Giovanni Tummarello, SIREn Solutions from Lucidworks

Join us at Lucene/Solr Revolution 2015, the biggest open source conference dedicated to Apache Lucene/Solr, on October 13-16, 2015 in Austin, Texas. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post Searching and Querying Knowledge Graphs with Solr/SIREn: a Reference Architecture appeared first on Lucidworks.

Thom Hickey: Extracting information from VIAF

Mon, 2015-09-14 18:04

Occasionally I run into someone trying to extract information out of VIAF and having a difficult time. Here's a simple example of how I'd begin extracting titles for a given VIAF ID.  Far from industrial strength, but might get you started.

The problem: Have a file of VIAF IDs (one/line).  Want a file of the titles, each preceded by the VIAF ID of the record they were found in.

There are lots of ways to do this, but my inclination is to do it in Python (I ran this in version 2.7.1) and to use the raw VIAF XML record:

from __future__ import print_function
import sys, urllib
from xml.etree import cElementTree as ET

# reads in list of VIAF IDs one/line
# writes out VIAFID\tTitle one/line

# worry about the name space used by VIAF cluster records
ns = {'v': 'http://viaf.org/viaf/terms#'}

# path to the title elements within a cluster record
ttlPath = 'v:titles/v:work/v:title'

def titlesFromVIAF(viafXML, path):
    vel = ET.fromstring(viafXML)
    for el in vel.findall(path, ns):
        yield el.text

for line in sys.stdin:
    viafid = line.strip()
    viafURL = 'http://viaf.org/viaf/%s/viaf.xml' % viafid
    viafXML = urllib.urlopen(viafURL).read()
    for ttl in titlesFromVIAF(viafXML, ttlPath):
        print('%s\t%s' % (viafid, ttl.encode('utf-8')))

That's about as short as I could get it and have it readable in this narrow window.  We've been using the new print function (and division!) for some time now, with an eye towards Python 3.
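If you are already moving to Python 3, a minimal sketch of the same script (assuming the same VIAF URL pattern, namespace and title path as above) might look like this:

import sys
from urllib.request import urlopen
from xml.etree import ElementTree as ET

# same namespace and title path assumed as in the Python 2 version
ns = {'v': 'http://viaf.org/viaf/terms#'}
ttlPath = 'v:titles/v:work/v:title'

def titlesFromVIAF(viafXML, path):
    vel = ET.fromstring(viafXML)
    for el in vel.findall(path, ns):
        yield el.text

for line in sys.stdin:
    viafid = line.strip()
    viafURL = 'http://viaf.org/viaf/%s/viaf.xml' % viafid
    viafXML = urlopen(viafURL).read()
    for ttl in titlesFromVIAF(viafXML, ttlPath):
        # strings are already Unicode in Python 3, so no .encode() needed
        print('%s\t%s' % (viafid, ttl))

Run it the same way as the original, e.g. python3 viaf_titles.py < viaf_ids.txt > titles.tsv (the file names are just examples).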


Update 2015.09.16: Cleaned up how namespace is specified

SearchHub: Infographic: The Dangers of Bias in High-Stakes Data Science

Mon, 2015-09-14 17:04
A data set is only as powerful as the ability of data scientists to interpret it, and insights gleaned can have huge ramifications in business, public policy, health care, and elsewhere. As the stakes of data-driven decisions become increasingly high, let’s look at some of the most common data science fallacies.

The post Infographic: The Dangers of Bias in High-Stakes Data Science appeared first on Lucidworks.

SearchHub: Stump The Chump: Meet The Panel Keeping Austin Weird

Mon, 2015-09-14 14:14

As previously mentioned: On October 15th, Lucene/Solr Revolution 2015 will once again be hosting “Stump The Chump” in which I (The Chump) will be answering tough Solr questions — submitted by users like you — live, on stage, sight unseen.

Today, I’m happy to announce the Panel of experts that will be challenging me with those questions, and deciding which questions were able to Stump The Chump!

In addition to taunting me with the questions, and ridiculing all my "Um"s and "Uhh"s as I stall for time while I rack my brain to come up with a non-gibberish answer, the Panel members will be responsible for awarding prizes to the folks who have submitted the questions that do the best job of "Stumping" me.

Check out the session information page for details on how to submit questions. Even if you can’t make it to Austin to attend the conference, you can still participate — and do your part to humiliate me — by submitting your questions.

To keep up with all the “Chump” related info, you can subscribe to this blog (or just the “Chump” tag).

The post Stump The Chump: Meet The Panel Keeping Austin Weird appeared first on Lucidworks.

FOSS4Lib Upcoming Events: Open Journal Systems 3.0: A User Experience Webinar

Mon, 2015-09-14 13:57
Date: Tuesday, October 20, 2015 - 10:00 to 11:00
Supports: Open Journal Systems

Last updated September 14, 2015. Created by Peter Murray on September 14, 2015.

From the announcement:

Open Journal Systems 3.0: A User Experience Webinar
Thinking about user experience and libraries? Be sure to register for this UX webinar.

In August, Open Journal Systems 3.0 was released in beta. The new version has major upgrades, including improvements to usability.
In this webinar, Kevin Stranack of the Public Knowledge Project will provide a case study of integrating UX into a major web design project: OJS.

Islandora: Islandora Community Sprint 001 - Complete!

Mon, 2015-09-14 13:27

For the past couple of weeks, the Islandora Community has been working together on a collective experiment in community-driven code: the community sprint. The brainchild of Jared Whiklo and Nick Ruest, the sprint was announced in mid-August with a call for volunteers to tackle maintenance work: fixing bugs, doing code tasks, and updating documentation. Ten people signed up:

  • Nick Ruest
  • Jared Whiklo
  • Melissa Anez
  • Diego Pino
  • Mark Cooper
  • Brian Harrington
  • Kim Pham
  • Peter Murray
  • Lingling Jiang
  • Jacob Sanford

And on Monday, August 31st, they got to work. 118 outstanding issues from our JIRA tracker were tagged as possible tasks for the sprint, ranging from complex bug fixes that spanned multiple modules to 'newbie' tickets updating readme files and user documentation. The sprint got off to a brisk start, knocking off 15 tickets in the first 24 hours. The pace slowed a little as we ran out of low-hanging fruit and got into the more challenging stuff, but I'll let this JIRA tracker gif speak for itself:

By the end, 38 tickets were down for the count. 

Some of the highlights include:

JIRA 1087 - Get a WSOD when using xml form with template and changing tabbed template values. A particularly tricky little error that was difficult to trigger, but pointed to deeper issues. Reported nearly a year ago, and actually affecting the code since 2012, this bug finally fell to the efforts of Diego Pino near the end of the sprint. If you're curious about the details, just check the long and involved comment thread on the ticket - this one was a doozy!

JIRA 1383 and JIRA 1394 weren't particularly complex or difficult, but they share the record for pull requests for a single sprint ticket. Both involved making updates to the readme file of every module in the release, to point to our new static wiki documentation and to the new instructions on contributing to Islandora that will ship with every module. They were also my biggest tickets, and I include them in this report not to brag, but to demonstrate that someone with no programming skills using the GitHub GUI can still be an active part of a sprint. Kim Pham and I were both 'newbie' sprinters and tackled low-level tickets accordingly, but we racked up a decent chunk of the 38 completed tickets, so I hope more 'newbies' will consider joining us on the next sprint.

JIRA 1274 was a Blocker ticket finished up during the sprint by Jared Whiklo. This one brings Ashok Modi's great Form Field Panel into Islandora coding standards so that it can be included in the 7.x-1.6 release.

Our sprinters also did a lot of testing, enabling some fixes from outside the sprint to finally be implemented, such as:

  • JIRA 1181 - newspaper view is broken when newspaper issues have XACML policies
  • JIRA 1184 - islandora_batch failing to import MARCXML
  • JIRA 1292 - Citeproc shouldn't try to render dates it doesn't have
  • JIRA 1419 - Derivative Generation on Web Archive is Broken After Adding JPG Functionality (a Release Blocker)

Many thanks to the first team of Islandora community sprinters. Your hard work and coordination have proven this approach can have great results, so we will definitely be sprinting again. Keep an eye on the listserv and here for the next announcement.

LITA: Creating High-Quality Online Video Tutorials

Mon, 2015-09-14 11:00

Lately it seems all I do all day is create informational or educational video tutorials on various topics for my library.  The Herbert Wertheim College of Medicine Medical Library at Florida International University in Miami, FL has perfected a system.  First, a group of three librarians write and edit a script on a topic.  In the past we have done multiple videos on American Medical Association (AMA) and American Psychological Association (APA) citation styles, Evidence-Based Medicine research to support a course, and other titles on basic library services and resources. After a script has been finalized, we record the audio.  We have one librarian who has become “the voice of the library,” one simple method to brand the library.  After that, I go ahead and coordinate the visuals – a mixture of PowerPoint slides, visual effects and screen video shots.  We use Camtasia to edit our videos and produce and upload them to our fiumedlib YouTube Channel.  Below are some thoughts and notes to consider if starting your own collection of online video tutorials for your organization.

Zoom In
As my past photography teacher declared: rather than zoom in with a telephoto lens, walk towards your subject. You as the photographer should reposition yourself to get the best shot. The same holds true for screen recordings. If recording from your browser, it is good practice to use the browser's zoom feature while recording to get the sharpest footage. If using Chrome, click on the customize-and-control (three-bar) icon at the top right of the browser window and you will see the option to zoom in or out. Keep in mind that the look of the video also depends on the viewer's monitor resolution and other factors – so sometimes you have to let it go. The screen shots below show one recorded at 100% and another at 175%. This small change affected the clarity of the footage.

recorded at 100%

recorded at 175%

Write a Script and Record Audio First – (then add Visuals)
Most people multi-task by recording the voice-over and the screen video at the same time. I have found that this creates multiple mistakes and the need to record multiple takes. Preparation steps help projects run smoothly.

Brand Your Library
The team brands our library by having the same beginning title slide with our logo and the ending slide with contact email with the same background music clip.  In addition, we try to use a common look and feel throughout the video lesson to further cement that these are from the same library. As mentioned before, we use the same narrator for our videos.

PowerPoint Slides
I cringe at the thought of seeing a PowerPoint slide with a header and a list of bullet points. PowerPoint is not necessarily bad; I just like to promote using the software in a creative manner by approaching each slide as a canvas. I steer clear of templates and of following the usual "business" organization of a slide.

Check out the current videos our department has created and let me know if you have any questions at

Herbert Wertheim College of Medicine Medical Library YouTube Channel:

Karen Coyle: Models of our World

Mon, 2015-09-14 09:57

This is to announce the publication of my book, FRBR, Before and After, by ALA Editions, available in November, 2015. As is often the case, the title doesn't tell the story, so I want to give a bit of an introduction before everyone goes: "Oh, another book on FRBR, yeeech." To be honest, the book does have quite a bit about FRBR, but it's also a think piece about bibliographic models, and a book entitled "Bibliographic Models" would look even more boring than one called "FRBR, Before and After."

The before part is a look at the evolution of the concept of Work, and, yes, Panizzi and Cutter are included, as are Lubetzky, Wilson, and others. Then I look at modeling and how goals and models are connected, and the effect that technology has (and has not) had on library data. The second part of the book focuses on the change that FRBR has wrought both in our thinking and in how we model the bibliographic description. I'll post more about that in the near future, but let me just say that you might be surprised at what you read there.

The text will also be available as open access in early 2016. This is thanks to the generosity of ALA Editions, who agreed to this model. I do hope that enough libraries and individuals decide to purchase the hard copy that ALA Publishing puts out so that this model of print plus OA is economically viable. I can attest to the fact that the editorial work and application of design to the book has produced a final version that I could not have even approximated on my own.

DuraSpace News: JOIN "VIVO Stories": Introducing People, Projects, Ideas and Innovation

Mon, 2015-09-14 00:00

The Telling VIVO Stories Task Force is underway! The task force goal is to grow our open source community and actively engage with its members by sharing each other's stories. The first three stories are now available to inspire and answer questions about how others have implemented VIVO at their institutions:

William Denton: Thinks and Burns

Sat, 2015-09-12 14:39

Yesterday I stumbled on the Thinks and Burns with Fink and Byrne podcast. I have no idea where or when (indeed if) it was announced, but I was happy to find it. It’s John Fink (@adr) and Gillian Byrne (@redgirl13) talking about library matters. I’m acquainted with both of them and we all work near each other, so it’s of more interest to me than if it were two complete strangers rambling on, but if you know either of them, or are a Canadian librarian, or like podcasts where two librarians talk like you’re hanging out down the pub and one keeps hitting the table for emphasis, it’s worth a listen. They’ve done three episodes, and I hope they keep it going, even if irregularly.

Ed Summers: Zombie Information Science

Sat, 2015-09-12 04:00

One of the readings for INST800 this week was (Bates, 2007). It's a short piece that outlines how she and Mary Maack organized their work on the third edition of the Encyclopedia of Library and Information Sciences. When creating an encyclopedia it becomes immediately necessary to define the scope, so you have some hope of finishing the work. As she points out, information science is contested territory now because of all the money and power aggregated in Silicon Valley. Everyone wants a piece of it, whereas it struggled to be recognized as a discipline before people started billion-dollar companies in their garages:

We have been treated as the astrologers and phrenologists of modern science— assumed to be desperately trying to cobble together the look of scholarship in what are surely trivial and nearly content-free disciplines.

It’s fun to pause for a moment to consider how much of what comes out of Silicon Valley feels like astrology or phrenology in its marketing and futurism.

So, in large part Bates and Maack needed to define what information science is before they could get to work. But Bates seems to feel like this theory transcended its use to scope the work, and really did define the field. Or perhaps it's more likely that her previous work in the field (she is a giant) informed the scope chosen for the encyclopedia. What better way to establish the foundations of a field (ideology) than to write an encyclopedia about it?

A few things struck me about this article. The small addition of “s” to the end of the title of the encyclopedia seemed like an important detail. It recognizes the pluralistic nature of information, its inter-disciplinarity – how information science has grown out of many disciplines across the humanities, social sciences and physical sciences. But she goes on to point out that in fact all these disciplines do have something in common:

… we engage in living and working our daily lives, and these vast numbers of human activities give off or throw off a remarkably extensive body of documentation of one sort or another: Business records, family histories, scholarly books, scientific and technical journals, websites, listservs and blogs for groups with common interests of a work or avocational nature, religious texts, educational curricula, x-rays, case records in law, medicine, and criminal justice, architectural drawings and purchase orders from construction sites, and on and on and on. The universe of living throws off documentary products, which then form the universe of documentation.

This description of different universes is compelling for me because it seems to recognize the processes and systems that information is a part of. She also includes a quirky mindmap-like diagram of these two universes in interaction which helps illustrate what she sees going on:

The Universe of Living and the Universe of Documentation

Now I would be sold if things stopped here, but of course they don’t. Bates goes on to criticize Briet’s idea of documents (anything that can be used as evidence) in order to say:

I argue that a document, above all, contains “recorded information,” that is, “communicatory or memorial information preserved in a durable medium”. (Bates, 2006).

For Bates, the antelope in the zoo is a specimen, not a document. Now I should probably dig down into (Bates, 2006) to understand fully what's going on here; on the surface this definition seems specific, but it begs a few questions. Who or what is in communication? Is understanding required for it to be communication? Does this communication need to be intentional? What does durable mean, over what time scales? Maybe I'm just getting defensive because I've always been a bit partial to Briet's definition.

Bates goes on to use this distinction as an argument for not including the study of living things as information in the encyclopedia; which seems like a perfectly fine definition of scope for an encyclopedia, but not as a definition of what is and is not information science:

In the definition of scope of the Encyclopedia of Library and Information Sciences, the first two branches of the collections sciences, all working with collections of non-living but highly informative objects, are being included in the coverage of the encyclopedia, while the collectors of live specimens— the branch most remote from document collecting— are not included at this time.

Can we really say information is dead, or rather, that it has never been alive? Is information separable from living things (notably us) and still meaningful as an object of study? Where do you draw the line between the living and the unliving? I suspect she herself would agree that this is not a sustainable view of the information sciences.


Bates, M. J. (2006). Fundamental forms of information. Journal of the American Society for Information Science and Technology, 57(8), 1033–1045.

Bates, M. J. (2007). Defining the information disciplines in encyclopedia development. Information Research, 12.

Ed Summers: Red Thread

Sat, 2015-09-12 04:00

Bates, M. (1999). The invisible substrate of information science. Journal of the American Society for Information Science, 50(12), 1043–1050.

Of all the introductions to information science we've read so far I'm quite partial to this one. It does have a few moments of "just drink the kool aid already", but the general thrust is to help familiarize the growing number of people working with information with the field of information science. So its purpose is largely educational, not theoretical. Bates wrote this in 1999, and I think there is still a real need to broaden the conversation about what the purpose of information science is today, although perhaps we know it more in the context of human-computer interaction. I also suspect this article helped define the field for those who were already working in the middle of this highly inter-disciplinary space and trying to find their way.

The reason why it appealed to me so much is because it speaks to the particularly strange way information science permeates, but is not contained by other disciplines. Information science is distinguished by the way its practitioners:

… are always looking for the red thread of information in the social texture of people’s lives. When we study people, we do so with the purpose of understanding information creation, seeking, and use. We do not just study people in general. (Bates, 1999, p. 1048)

Her definitions are centered on people and these weird artifacts they create bearing information, which are only noticed with a particular kind of attention, which you learn when you study information science. I was reminded a bit of (Star, 1999), written in the same year. I can hear echoes of STS in Bates’ exhortation to follow “the red thread” of information–which reminds me of Bruno Latour more than it does Woodward and Bernstein.


Bates, M. (1999). The invisible substrate of information science. Journal of the American Society for Information Science, 50(12), 1043–1050.

Star, S. L. (1999). The ethnography of infrastructure. American Behavioral Scientist, 43(3), 377–391.

LibUX: Does the best library web design eliminate choice?

Fri, 2015-09-11 21:38

There is research to suggest that libraries' commitment to privacy may be their ace up the sleeve, as constant tracking and creepy Facebook ads repulse growing numbers of users. We can use libraries' killer track record to our advantage, insulating our patrons' trust and raising our esteem. The only conniption I have when we are talking shop is how often privacy is at odds with personalization – this is a real shame.

Does the best library web design eliminate choice?
This writeup is also on medium.

The root of the “Library User Experience Problem” is not design. No – design is just a tool. What gets in the way of good UX is that there is just too much. These websites suffer from too many menu items, too many images, too many hands in the pot, too much text on the page, too many services, too many options.
The solution is less:

  • fewer menu items increase navigability
  • fewer images increase site-speed, which increases conversion
  • fewer hands in the pot increase consistency and credibility
  • less text on the page increases readability, navigability
  • fewer options decrease the interaction cost.

Interaction cost describes the amount of effort users exhaust to interact with a service. Tasks that require particularly careful study to navigate, validate, complete – whether answering crazy-long forms or clicking through a ton of content – are emotionally and mentally taxing. High cost websites churn. To increase usability, reduce this cost.

Decision fatigue

John Tierney, in the New York Times in 2011, called this “decision fatigue.”

Decision fatigue helps explain why ordinarily sensible people get angry at colleagues and families, splurge on clothes, buy junk food at the supermarket and can’t resist the dealer’s offer to rustproof their new car.

These well-founded negative repercussions of cognitive load inspired Aaron Shapiro to write, a couple years later, that "choice is overrated." As Orwellian as that sounds, I think I agree. Functional design is design that gets out of your way. It facilitates – when you need it. It is the unwritten servant in Jane Austen who takes care of trivial concerns like cooking, fetching, cleaning, delivering, and dressing, so that our heroines can hob-knob and gossip.

Already we engage with nascent services anticipating our choices, and these will mature. In the next couple of years, when I schedule a flight in my calendar it will go ahead and reserve the Uber and inform the driver of my destination (the airport).

It is not that these choices were eliminated for me, but context and past behavior spared me from dealing with the nitty gritty.

Anticipatory design is a way to think about using context and user behavior as well as personal data – if and when ethically available – to craft a “user experience for one” to reduce the interaction cost, the decision fatigue, or – in the Brad Frost way of doing things – cut out the bullshit.

Context, behavior, and personal data

Here are the pillars for developing an anticipatory service. The last one, personal data, is what makes librarians – who, you should know if you're not one, care more about your privacy than you probably do – pretty uneasy.

The context can be inferred from neutral information such as the time of day, business hours, weather, events, holidays. If the user opts in, then information such as device orientation, location or location in relation to another, and motion from the accelerometer can build a vague story about that patron and make educated guesses about which content to present.

Behavior comes in two flavors: general analytics-driven user behavior of your site or service on the web or in house, and individual user behavior such as browsing history. I distinguish the latter from personal data because I consider this information available to the browser without need for actually retrieving information from a database. General analytics reveals a lot about how a site functions, what’s most sought after, at what times, by which devices. Specific user behavior, which can be gleaned through cookies, can then narrow the focus of analytics-driven design.

It can only be a user experience for one when personal data can expose real preferences — Michael loves science fiction, westerns, prefers Overdrive audiobooks to other vendors and formats — to automatically skip the hunt-and-peck and curate a landing page unique to the patron. Jane is a second-year student at the College of Psychology, she reserves study rooms every Thursday, it’s finals week, she’s in the grind and it’s in the evening: when she visits the library homepage, we should ensure that she can reserve her favorite study room, give her the heads up that the first-floor cafe is opened late if she needs a pick-me-up, and give her the databases she most frequents.
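To make that concrete, here is a rough sketch in Python of how context, behavior, and opt-in personal data might combine to pick homepage modules. This is mine, not anything a library vendor ships, and every module name and rule in it is hypothetical:

from datetime import datetime

def landing_modules(now=None, opted_in=False, history=None):
    """Choose homepage modules from context, behavior and opt-in data.

    Every module name and rule here is a hypothetical illustration.
    """
    now = now or datetime.now()
    modules = ['search-box', 'todays-hours']

    # context: neutral signals like the time of day need no personal data
    if now.hour >= 20:
        modules.append('late-night-study-spaces')

    # behavior: general analytics might show finals weeks drive room
    # reservations; this calendar check is a rough stand-in for that
    if now.month in (4, 12):
        modules.append('study-room-reservations')

    # personal data: only with explicit opt-in do we narrow further
    if opted_in and history and 'databases' in history:
        modules.append('your-frequent-databases')

    return modules

# Jane, opted in, on a December evening during finals:
print(landing_modules(datetime(2015, 12, 10, 21), True, ['databases']))

The point of the sketch is that the first two pillars get you surprisingly far before any personal data enters the picture.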

We just have to reach out and take the ring

Libraries have access to all the data we could want, but as Anne Quito wrote over on Quartz, anticipatory design requires a system of trust.

This means relinquishing personal information – passwords, credit card numbers, activity tracking data, browsing histories, calendars – so the system can make and execute informed decisions on your behalf.

This would never fly.

Anticipatory design presents new ethical checkpoints for designers and programmers behind the automation, as well as for consumers. Can we trust a system to safeguard our personal data from hackers and marketers – or does privacy become a moot concern?

I do not believe that privacy and personalization are mutually exclusive, but I am skeptical of libraries' present ability to safeguard this data. As I told Amanda in our podcast about anticipatory design, I do not trust the third-party vendors with which libraries do business not to exploit personal information, nor do I trust libraries without seasoned DevOps to deeply personalize the experience without leaving it vulnerable.

Few libraries benefit from the latter. So …, bummer. The shame to which I alluded above is that users not only benefit from the convenience; libraries, by so drastically improving usability, drastically improve the likelihood of mission success. These things matter. Investing in a net-positive user experience matters, because libraries thrive and rely on the good vibes from their patron base – especially during voting season.

The low-fat flavor of "anticipatory design" without the personal-data part has also been referred to as context-driven design, which I think is a compelling strategy. It doesn't require libraries to store and safeguard more information than is necessary for basic function. Context inferred from device or browser information is usually opt-in by default, and this would do most of the heavy lifting without crossing that deep, deep line in the sand, or crossing into the invasive valley.

The post Does the best library web design eliminate choice? appeared first on LibUX.

William Denton: Now with schema.org and COinS structured metadata

Fri, 2015-09-11 21:19

Thanks to a combination of Access excitement, a talk by Dan Scott, a talk by Mita Williams, and wanting to learn more, I added schema.org and COinS metadata to this site. It validates, though I'm not sure if the semantic structure is correct. Here's what I've got so far.

My Jekyll setup

I build this site with Jekyll. It uses static HTML templates in which you can place content as needed or do a little bit of simple scripting (inside double braces, which here I've spaced out: { { something } }). My main template is _layouts/miskatonic.html, which (leaving out the sidebar, the footer, CSS and much else) looks like this:

<!DOCTYPE html>
<html itemscope itemtype="http://schema.org/CreativeWork">
<head>
<meta charset="utf-8">
<meta name="referrer" content="origin-when-cross-origin" />
<title>{ {page.title} } | { { } }</title>
<meta itemprop="creator" content="William Denton">
<meta itemprop="name" content="Miskatonic University Press">
<link rel="icon" type="image/x-icon" href="/images/favicon.ico">
<link rel="alternate" type="application/rss+xml" title="Miskatonic University Press RSS" href="" />
</head>
<body>
<article>
{ { content } }
</article>
<aside></aside>
</body>
</html>

It declares that the web site is a CreativeWork, what its name is, and who owns it.

I have two types of pages: posts and pages. Posts are blog posts like this, and pages are things like Fictional Footnotes and Indexes.

My page template, _layouts/page.html, sets out that the page is, in the schema.org sense, an Article:

---
layout: miskatonic
---
<div itemscope itemtype="http://schema.org/Article">
{ % include coins.html % }
<meta itemprop="creator" content="William Denton">
<meta itemprop="license" content="">
<meta itemprop="name" content="{ { page.title } }">
<meta itemprop="headline" content="{ { page.title } }">
{ % if page.description % }
<meta itemprop="description" content="{ { page.description } }">
{ % endif % }
<img itemprop="image" src="/images/dentograph-400px-400px.png" alt="" style="display: none;">
<h1>{ { page.title } }</h1>
<p>
<time itemprop="datePublished" datetime="{ { page.date | date_to_xmlschema } }">
{ { page.date | date_to_long_string } }
</time>
<span class="tags">
{ % for tag in page.tags % }
<a href="/posts/tags.html#{ { tag } }"><span itemprop="keywords">{ { tag } }</span></a>
{ % endfor % }
</span>
</p>
<div class="post" itemprop="articleBody">
{ { content } }
</div>
</div>

Those meta tags declare some properties of the Article. Every Article is required to have a headline and an image, which doesn’t really suit my needs and shows the commercial nature of the system. For the headline, I just use the title of the page. For the image, I use a generic image that will repeat on every page, and what’s more I style it with CSS so it’s not visible. I may come back to this later and make it work better.

The layout: miskatonic at the top means that this content gets inserted into that layout where the { { content } } line is.

The _layouts/post.html template looks like this:

---
layout: miskatonic
---
<div itemscope itemtype="http://schema.org/BlogPosting">
{ % include coins.html % }
<meta itemprop="creator" content="William Denton">
<meta itemprop="license" content="">
<h1 itemprop="name">{ { page.title } }</h1>
<p>
<time itemprop="datePublished" datetime="{ { page.date | date_to_xmlschema } }">
{ { page.date | date_to_long_string } }
</time>
<span class="tags">
{ % for tag in page.tags % }
<a href="/posts/tags.html#{ { tag } }"><span itemprop="keywords">{ { tag } }</span></a>
{ % endfor % }
</span>
</p>
<div class="post" itemprop="articleBody">
{ { content } }
</div>
</div>

Every blog post is a BlogPosting. The same kind of properties are given about it as for a page, and the same image trick. I usually include images with blog posts and maybe there’s a simple way to make Jekyll find the first one and bung it in there. I don’t think I want to get into listing all the images I use in the YAML header … that’s too much work.

When I write a blog post, like this one, I start it with

---
layout: post
title: Now with schema.org and COinS structured metadata
tags: jekyll metadata
date: 2015-09-11 16:24:17 -0400
---

That defines that this is a post, so the Markdown is processed and inserted into the post layout, which is processed and put into the miskatonic layout, which is processed, and that’s turned into static HTML and dumped to disk. (Or something along those lines.)

Proper semantics?

This all validates, but I’m not sure if the semantics are correct. Google’s Structured Data Testing Tool says this about a recent blog post:

CreativeWork (my site) and the BlogPosting (the post) are at the same level. I'm not sure if the BlogPosting should be a child of the CreativeWork. It is in the schema, but I don't know if that should apply to this structure here.

Useful links



While I was at all this, I decided to add COinS metadata to everything so Zotero could make sense of it. Adapting Matthew Lincoln’s COinS for your Jekyll blog, I created _includes/coins.html, which looks like this, though if you want to use it, reformat it to remove all the newlines and spaced braces, and change the name:

<span class="Z3988" title="ctx_ver=Z39.88-2004&amp;rft_val_fmt=info%3Aofi%2Ffmt%3Akev%3Amtx%3Adc &amp;rft.title={ { page.title | cgi_escape } } &amp;rft.aulast=Denton&amp;rft.aufirst=William &amp;rft.source={ { | cgi_escape } } &amp;{ { | date_to_xmlschema } } &amp;rft.type=blogPost&amp;rft.format=text &amp;rft.identifier={ { site.url | cgi_escape } }{ { page.url | cgi_escape } } &amp;rft.language=English"></span>

I just noticed that this says the thing is a blog post, and I’m using this COinS snippet on both my pages and posts, so Zotero thinks the pages are posts, but I’ll let that ride for now. Zotero users, if you ever cite one of my pages, watch out.
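For what it's worth, the title attribute is just a percent-encoded key/value (KEV) string, so you can build it outside of Liquid too. A small Python sketch, with placeholder values standing in for the template variables:

from urllib.parse import urlencode
from html import escape

# placeholder metadata; the keys follow the Dublin Core KEV format
# used in the span above
fields = [
    ('ctx_ver', 'Z39.88-2004'),
    ('rft_val_fmt', 'info:ofi/fmt:kev:mtx:dc'),
    ('rft.title', 'Now with schema.org and COinS structured metadata'),
    ('rft.aulast', 'Denton'),
    ('rft.aufirst', 'William'),
    ('rft.type', 'blogPost'),
    ('rft.format', 'text'),
    ('rft.language', 'English'),
]

# urlencode percent-encodes the values; escape() turns the '&'
# separators into '&amp;' for embedding in HTML
title_attr = escape(urlencode(fields))
print('<span class="Z3988" title="%s"></span>' % title_attr)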

COinS is over ten years old now! There must be a more modern way to do this. Or is there?

Bibliographic information in schema.org

Now that I’ve done this, search engines like Google can make better sense of the content of the site, which is nice enough, though I hardly ever use Google (I’m a DuckDuckGo man—it’s not as good, but it’s better). I would like to mark up my talks and publications so all of that citation information is machine-readable, but the bibliographic extension to schema.org hasn’t been formally approved yet, from what I can see. And look at all of the markup going on for something like a Chapter!

Blecch. I don’t want to type all that kind of cruft every time I want to describe a chapter or article. There are things like jekyll-scholar that would let me turn a set of BibTeX citations into HTML, but it doesn’t do microformats. Maybe that would be something to hack on. Or maybe I’ll just leave it all for now and come back to it next time I feel like doing some fiddling with this site. That’s enough template hacking for one week!

Corrections welcome

If anyone who happens to read this sees any errors in what I’ve done, please let me know. I don’t really care if my headlines could be better, but if there’s something semantically wrong with what I’ve described here, I’d like to get it right.

Nicole Engard: Bookmarks for September 11, 2015

Fri, 2015-09-11 20:30

Today I found the following resources and bookmarked them on Delicious.

  • Roundcube: Free and Open Source Webmail Software
  • Bolt: an open source Content Management Tool, which strives to be as simple and straightforward as possible. It is quick to set up, easy to configure, uses elegant templates, and above all: It’s a joy to use.

Digest powered by RSS Digest

The post Bookmarks for September 11, 2015 appeared first on What I Learned Today....


Alf Eaton, Alf: What Aaron understood

Fri, 2015-09-11 18:52

I didn’t know Aaron personally, but I’d been reading his blog as he wrote it for 10 years. When it turned out that he wasn’t going to be writing any more, I spent some time trying to work out why. I didn’t find out why the writing had stopped, exactly, but I did get some insight into why it might have started.

Philip Greenspun, founder of ArsDigita, had written extensively about the school system, and Aaron felt similarly, documenting his frustrations with school, leaving formal education and teaching himself.

In 2000, Aaron entered the competition for the ArsDigita Prize and won, with his entry The Info Network — a public-editable database of information about topics. (Jimmy Wales & Larry Sanger were building Nupedia at around the same time, which became Wikipedia. Later, Aaron lost a bid to be on the Wikimedia Foundation’s Board of Directors, in an election).

Aaron’s friends and family added information on their specialist subjects to the wiki, but Aaron knew that a centralised resource could lead to censorship (he created zpedia, for alternative views that would not survive on Wikipedia). Also, some people might add high-quality information, but others might not know what they’re talking about. If everyone had their own wiki, and you could choose which trusted sources to subscribe to, you’d be able to collect just the information that you trusted, augment it yourself, and then broadcast it back out to others.

In order to pull information in from other people’s databases, you needed a standard way of subscribing to a source, and a standard way of representing information.

RSS feeds (with Aaron’s help) became a standard for subscribing to information, and RDF (with Aaron’s help) became a standard for describing objects.

I find — and have noticed others saying the same — that to thoroughly understand a topic requires access to the whole range of items that can be part of that topic — to see their commonalities, variances and range. To teach yourself about a topic, you need to be a collector, which means you need access to the objects.

Aaron created Open Library: a single page for every book. It could contain metadata for each item (allowable up to a point - Aaron was good at pushing the limits of what information was actually copyrightable), but some books remained in copyright. This was frustrating, so Aaron tried to reform copyright law.

He saw that many of the people incarcerated in America were there for breaking drug-related laws, so he tried to do something about that, as well.

He found that it was difficult to make political change when politicians were highly funded by interested parties, so he tried to do something about that. He also saw that this would require politicians being open about their dealings (but became sceptical about the possibility of making everything open by choice; he did, however, create a secure drop-box for people to send information anonymously to reporters).

To return to information, though: having a single page for every resource allows you to make statements about those resources, referring to each resource by its URL.

Aaron had read Tim Berners-Lee’s Weaving The Web, and said that Tim was the only other person who understood that, by themselves, the nodes and edges of a “semantic web” had no meaning. Each resource and property was only defined in terms of other nodes and properties, like a dictionary defines words in terms of other words. In other words, it’s ontologies all the way down.

To be able to understand this information, a reader would need to know which information was correct and reliable (using a trust network?).

He wanted people to be able to understand scientific research, and to base their decisions on reliable information, so he founded Science That Matters to report on scientific findings. (After this launched, I suggested that Aaron should be invited to SciFoo, which he attended; he ran a session on open access to scientific literature).

He had the same motivations as many LessWrong participants: a) trying to do as little harm as possible, and b) ensuring that information is available, correct, and in the right hands, for the sake of a “good AI”.

As Alan Turing said (even though Aaron spotted that the “Turing test” is a red herring), machines can think, and machines will think based on the information they’re given. If an AI is given misleading information it could make wrong decisions, and if an AI is not given access to the information it needs it could also make wrong decisions, and either of those could be calamitous.

Aaron chose not to work at Google because he wanted to make sure that reliable information was as available as possible in the places where it was needed, rather than being collected by a single entity, and to ensure that the AI which prevails will be as informed as possible, for everyone’s benefit.

District Dispatch: (Bene)tech as a leveler

Fri, 2015-09-11 16:58

Disability issues are a third rail in our public discourse. To “de-electrify” this rail is no simple task. It requires us to be critical of our own predilections and instincts. Here’s the problem: We’re human. Humans are reflexively disconcerted by what we perceive as an aberration from the norm. For this reason, we celebrate difference in the abstract, but are often paralyzed by it in practice; we exalt people and things we perceive as different, but devote too little time to truly understanding them. How can we have robust conversations about addressing the unique challenges facing people with disabilities if we’re afraid to broach the subject of disability in the first place? We can’t. To make real headway on these challenges, we have to check those parts of our nature and our milieu that compel us to clam up in the face of “otherness.” We have to bridge the gap between our best of intentions and our actions in the world.

3D printer in action

Thankfully, there are social advocacy organizations that realize this. An example par excellence: the Silicon Valley-based non-profit, Benetech. The men and women of Benetech realize that one of the greatest opportunities for progress on disability issues lies at the confluence of education, technology, science and public policy. They encourage individuals from across these fields – both with and without disabilities – to work together to develop solutions to accessibility challenges. Benetech’s latest effort on this front: A convening of library, museum and school professionals from across the country to discuss strategies for using 3D printing technology to improve the quality of education for students with disabilities. I was honored to be given the chance to attend and share my perspective on 3D printing as a policy professional.

The convening’s discussions and workshops highlighted a number of ways in which 3D printers can level the playing field for students with disabilities. 3D printers can render molecules and mathematical models coated with braille to bring STEM learning to life for individuals with print disabilities; they can provide a boost of confidence to a child whose motor skills are compromised by cerebral palsy by helping him or her create an intricately shaped object; and they can energize a student with a learning disability by illustrating a practical application of a subject with which he or she may struggle. Participants discussed how libraries, schools and museums can collaborate to help disabled students everywhere enjoy these and more “leveling” applications of 3D printing technology.

As fruitful as Benetech’s San Jose convening was, its participants all agreed that it should represent the beginning of a broader conversation on the need to use technology to address the myriad learning challenges facing disabled students; one that must include not just professionals from the library, museum and school communities, but also government decision makers, academics and the public. The more people we involve in the conversation, the closer we will come to de-electrifying the third rail of disability issues in the education, tech and policy arenas.

Benetech is already taking steps to broaden the conversation. Last month, Benetech staff and participants in its June convening on 3D printing held a webinar in which they highlighted several projects they’ve spearheaded to raise awareness of the capacity of 3D printing to put students with disabilities on a level playing field with their peers. These include the development of draft technical standards aimed at making 3D printing technology accessible, and the creation of 3D printing curricula that encourage teachers to use 3D printers and 3D printed objects to create new learning opportunities for students with disabilities.

The ALA Washington Office would like to thank Lisa Wadors Verne, Robin Seaman, Anh Bui, Julie Noblitt and the rest of the Benetech team for the opportunity to participate in its June convening on 3D printing. ALA Washington looks forward to continued engagement with Benetech – we have already begun discussions with Lisa, Robin, Anh, Julie and others about how libraries can promote their work to improve access to digital content and empower all people to enjoy the transformative power of technology.

You can read about a program on 3D printing and educational equity that Benetech has proposed for the 2016 SXSWedu conference here.

The post (Bene)tech as a leveler appeared first on District Dispatch.