
Feed aggregator

Tara Robertson: Inclusive is not a feature list

planet code4lib - Mon, 2015-09-21 17:48

INCLUSIVE from Microsoft Design on Vimeo.

I love the video that Microsoft recently put out about Inclusive Design. It uses several design stories to illustrate how inclusive design needs to start with an individual and be user centered. I learned so much from many of the amazing people featured in this video. Also, it’s delightful to watch a video that presents such strong ideas and has such high production values.

This quote from interaction designer Mike Vanis at the 5 minute mark really stuck with me:

If you start with technology, then it just becomes a feature list.  But if you start with the person then this really amazing thing happens. They dictate the technology and you come to surprises. You arrive at a point where the technology and the person feel so close, so intimate, that you don’t actually see the technology at all anymore.

One of the stories in Inclusive is about Skype Translator (starts at 13:42). There are two threads to this story. First, the video shows a school in Seattle and a school in Beijing that are using Skype Translator to bridge their linguistic differences and video chat with each other. Skype Translator is impressive: it uses speech to text, machine translation, and then text to speech to translate what someone is saying in one language into another. As part of this exchange, the text of what is being said is shown on the screen. The second thread is that this technology is useful in including Deaf and Hard of Hearing students in mainstreamed hearing classrooms.

Will Lewis, Principal Technical PM, Microsoft Research, says that for Deaf or Hard of Hearing students in a hearing classroom they “often require an interpreter, whether that’s a sign language interpreter, or closed captioning. The problem is that it doesn’t scale.” The underlying assumption is that there is a problem with people who are Deaf or Hard of Hearing and that there is a problem in making them fit in a hearing classroom.

This story doesn’t fit with the fundamental concept that it’s important to start with the individual, and it should have been left out of this video. This segment focuses on how amazing Skype Translator is as a technology (which it is) and then tacks on two Deaf or Hard of Hearing students as an afterthought. Also, presenting cochlear implants as an amazing, value-neutral technology is an example of audism: “the notion that one is superior based on one’s ability to hear or to behave in the manner of one who hears, or that life without hearing is futile and miserable, or an attitude based on pathological thinking which results in a negative stigma toward anyone who does not hear”.

Talking this through with a friend who is Hard of Hearing and a PhD candidate revealed some underlying privacy concerns. If the machine translation happens on Microsoft servers, then conversations are being saved at least temporarily in order to translate them, and likely saved permanently in order to improve the technology. So if Deaf and Hard of Hearing people are reliant on this technology, they will be under more surveillance than hearing people. This is really problematic. If the design process had started with the individual who valued privacy and dictated the technology, the amazing thing that Mike Vanis talked about might have happened. Instead the story of Skype Translator is just a software feature list.


Cynthia Ng: A Community Supported on Obsolete Technology

planet code4lib - Mon, 2015-09-21 16:39
This was first posted on The Pastry Box Project on September 21, 2015 with the same name. I encourage you to go read it there, since lots of great articles are written on The Pastry Box. Re-posted here for my own archiving purposes. People frequently express their abhorrence, or at least dislike, of obsolete technology … Continue reading A Community Supported on Obsolete Technology

Islandora: Islandora Committers are now Official

planet code4lib - Mon, 2015-09-21 16:02

In another step forward as a mature and sustainable open source project, the Islandora community has adopted an official process for defining and nominating Committers. As with things like our Licensed Software Acceptance Procedure and our Contributor Licence Agreements, we opted to follow a model tried and tested by one of our fellow travellers in the world of open source repositories, looking to the guidelines used by Fedora Committers (who were guided in turn by how it's done at Apache). We particularly liked their method for selecting new Committers, with its emphasis on community engagement at several levels - and how it would leave the selection of new Committers to those in the best position to judge: existing Committers.

With Fedora as an example, Nick Ruest wrote up a proposed set of guidelines for Committers in the Islandora Community. To sum up some key points, Committers are granted the following rights:

  • Write access to the codebase
  • Nomination privileges of new committers
  • Release management privileges
  • Binding votes on procedural, code modification, and release issues
  • Access to the private committers mailing list

Balanced by the responsibility to:

  • Monitor and respond to project mailing lists
  • Attend project and technical meetings
  • Monitor and vet bug-tracker issues
  • Review and commit code contributions
  • Ensure code contributions are properly licensed
  • Guide and mentor new committers

There are 17 initial Committers, consisting of those community members who already had push access to the Islandora GitHub:

  • Daniel Aitken, discoverygarden inc.
  • Morgan Dawe, discoverygarden inc.
  • Jordan Dukart, discoverygarden inc.
  • Nelson Hart, discoverygarden inc.
  • Mark Jordan, Simon Fraser University
  • Danny Lamb, discoverygarden inc.
  • Rosemary LeFaive, University of Prince Edward Island
  • Mitch MacKenzie, discoverygarden inc.
  • Donald Moses, University of Prince Edward Island
  • William Panting, discoverygarden inc.
  • Matthew Perry, discoverygarden inc.
  • Diego Pino, REUNA
  • Paul Pound, University of Prince Edward Island
  • Nick Ruest, York University
  • Alan Stanley, Agile Humanities
  • Adam Vessey, discoverygarden inc.
  • Jared Whiklo, University of Manitoba

discoverygarden's historical status as the primary contributor of Islandora code is reflected in the composition of the list, but the project's growth as software owned and created by a wider community is also apparent - and we expect that as the list grows, so too will the diversity of institutions represented.

Want to become a Committer? Here's what they will be looking for:

  • Ability to be a mentor. How do we evaluate? By the interactions they have through mail; by how clear they are and how willing they are to point at appropriate background materials (or even create them).
  • Community. How do we evaluate? By the interactions they have through mail. Do they help to answer questions raised on the mailing list? Do they show a helpful attitude and respect for others' ideas?
  • Commitment. How do we evaluate? By time, by sticking through tough issues, by helping on not-so-fun tasks as well.
  • Personal skill/ability. How do we evaluate? A solid general understanding of the project; quality of discussion in mail; patches (where applicable) easy to apply with only a cursory review.

HangingTogether: Data Management and Curation in 21st Century Archives – Part 1

planet code4lib - Mon, 2015-09-21 15:21

I attended the 79th Annual Meeting of the Society of American Archivists (SAA) last month in Cleveland, Ohio and was invited to participate on the Research Libraries Roundtable panel on Data Management and Curation in 21st Century Archives. Dan Noonan, e-Records/Digital Resources Archivist, moderated the discussion. Wendy Hagenmaier, Digital Collections Archivist, Georgia Tech Library and Sammie Morris, Director, Archives and Special Collections & University Archivist, Purdue University Libraries joined me on the panel. Between the three of us there was a nice variety of perspectives given our different experiences and interests.

It was a great panel so I decided to discuss it in two parts. In this part, Managing and Curating Data with Reuse in Mind, I summarize key points from my presentation. In Part 2, I will highlight key points from Wendy and Sammie’s presentations that made an impression on me.

Managing and Curating Data with Reuse in Mind

I was excited to be invited to participate in a panel discussion on Data Management and Curation in 21st Century Archives at SAA, given my perspective is not that of an archivist. I’ve been studying data reuse in academic communities and more recently I’ve been examining libraries’ role in e-research and data on their campuses.

Given my experiences and interests, my goal in participating on the panel was to convince archivists to bring their expertise to the table with an eye toward satisfying, perhaps even delighting, data reusers. I believe centering conversations about data management and curation in 21st century archives on the needs of data reusers serves to inform the preservation of data’s meaning as well as other archival practices, particularly the partnerships archivists form, the questions they ask, and the activities they pursue.

Preservation of data’s meaning

When we think about preserving the meaning of research data, the goal is that someone not involved in the study can come along and make sense of the data. It’s no surprise that contextual information about how data are collected is critical.

For instance, a zoologist uses field notes to sort out whether a wolf might have been a dog or coyote hybrid. A social scientist references the instructions and layout of a survey to understand differences between survey responses. An archaeologist thinks artifacts are meaningless in absence of information about where they came from and how they were acquired and excavated.

While the need for data collection information is obvious, what is often surprising to some is the level of contextual detail reusers want about it and the additional kinds of context they seek, including information about the data producer, data repository, data analysis, digitization and curation, preservation, and prior reuse.

Questions asked: It’s not just about context

Ask data reusers what contextual information they need as well as why they need it and where they go to get it. What we have heard has enlightened us about disciplinary attitudes and practices. We have learned what constitutes data quality and how it contributes to their decision making and satisfaction. Our understanding of data quality has become more nuanced and given us something tangible to work toward given its importance in data management and curation.

A zoologist deciding whether to combine data from different studies needs to know if the definitions for a concept hold across the two different data sources. We call this need to evaluate whether and how data from different studies can be integrated “ease of operation.” A social scientist determining whether data are relevant given research objectives looks at how variables are defined and measured. An archaeologist relies on information about data producers to judge whether their data are credible.

When asking researchers to talk about how they reused others’ data, we’ve learned that it’s not just about capturing context so researchers can understand data. Our findings show researchers judge others’ data in various ways to decide if the data are worthy of reuse. We need to know more about these judgements. If we are going to curate and preserve data to be reusable, we need to have a better sense of what reusable means.

Partnerships formed: It’s bigger than the archive

Looking at data management and curation from a reuser’s perspective also might influence the partnerships archivists form. It’s bigger than the archive. Archivists cannot go it alone. Our work shows how actions in one part of data’s lifecycle influence other parts.

How data producers collect, record, and document their data impacts repository staff and data reusers. For instance, we found archaeologists collecting data in the field had systems to identify tooth wear, but there were no guidelines for documenting tooth wear. Consequently they recorded it in different ways impacting repository staff’s data processing time and reusers’ understanding. We’ve also found instances where repository staff’s actions motivated data producers to share and impacted the satisfaction of data reusers and where data reusers influenced repository policy and data producers’ future actions.

While we’ve revealed interdependencies in an attempt to improve data sharing, management, and reuse experiences, we’ve only looked at three roles. Of those roles, we’ve only considered one that sits between data producers and data reusers – repository staff (i.e. the data curator). We know there are more – archivists, librarians, technologists, compliance officers, administrative staff, etc.

When asked what facilitates research data services, two-thirds of librarians mentioned communication, coordination, and collaboration with people from other units on their campus as a means to define, develop, and deliver services, pool expertise, and outline roles and responsibilities. Our research suggests that the key will be managing these stakeholders’ interdependencies through data’s lifecycle by identifying pain points and supportive actions that will move things forward.

Activities pursued: It’s always about designated communities 

Lastly incorporating data reusers’ perspectives and practices into conversations about data management and curation might influence the activities archivists pursue. It’s always about the designated community of users. We witnessed this in our interviews with staff at three data repositories – Inter-university Consortium for Political and Social Research, University of Michigan Museum of Zoology, and Open Context. Findings showed staff dealt with six types of change in data repositories, one of which was responding to their user communities.

At each repository, staff were found to adjust their processes and procedures to accommodate the developing needs of their users. The museum developed new specimen preparation, preservation, and loan procedures when DNA testing became available. ICPSR staff were deciding when and how they could meet demand for new data formats such as video. Rather than design the Open Context website to be “Flickry” and collaborative, staff decided on a more straightforward publication platform because archaeologists wanted something more professional that they could cite on their CVs.

In our roles, whether archivists, librarians, technologists, researchers, etc., we need to think about how we can talk, listen, observe, learn from, teach, and delight data reusers. We should strive to ensure our actions encompass the audience we are trying to reach.

Are any of you actively engaged with your scholarly communities to understand data management, curation, and reuse from their perspective? If so, please comment or respond to this post and tell us about your experiences – How have you done it? What have you learned? What have they learned? What challenges remain?

About Ixchel Faniel

Ixchel M. Faniel is a Research Scientist at OCLC. She is currently working on projects examining data reuse within academic communities to identify how contextual information about the data that supports reuse can best be created and preserved. She also examines librarians' early experiences designing and delivering research data services with the objective of informing practical, effective approaches for the larger academic community.


Information Technology and Libraries: Editorial Board Thoughts: Information Technology and Libraries: Anxiety and Exhilaration

planet code4lib - Mon, 2015-09-21 05:20

Information Technology and Libraries: Editorial Thoughts: Rise of the Innovation Commons

planet code4lib - Mon, 2015-09-21 04:00

That the practice of libraries and librarianship is changing is an understatement. Throughout their history, libraries have adapted and evolved to better meet the needs of the communities served. Framed against the historical development of the library commons and technological support, this piece introduces the concept of an innovation commons as a natural evolution for libraries, from information through learning commons, to the organic development and incorporation of library makerspaces.

Information Technology and Libraries: Self-Archiving with Ease in an Institutional Repository: Microinteractions and the User Experience

planet code4lib - Mon, 2015-09-21 04:00

Details matter, especially when they can influence whether or not users engage with a new digital initiative that relies heavily on their support. During the recent development of MacEwan University’s institutional repository, the librarians leading the project wanted to ensure the site would offer users an easy and effective way to deposit their works, in turn helping to ensure the repository’s long-term viability. The following paper discusses their approach to user-testing, applying Dan Saffer’s framework of microinteractions to how faculty members experienced the repository’s self-archiving functionality. It outlines the steps taken to test and refine the self-archiving process, shedding light on how others may apply the concept of microinteractions to better understand a website’s utility and the overall user experience that it delivers.

Information Technology and Libraries: What Technology Skills Do Developers Need? A Text Analysis of Job Listings in Library and Information Science (LIS) from

planet code4lib - Mon, 2015-09-21 04:00

Technology plays an indisputably vital role in library and information science (LIS) work; this rapidly moving landscape can create challenges for practitioners and educators seeking to keep pace with such change.  In pursuit of building our understanding of currently sought technology competencies in developer-oriented positions within LIS, this paper reports the results of a text analysis of a large collection of job listings culled from the Code4lib jobs website.  Beginning over a decade ago as a popular mailing list covering the intersection of technology and library work, the Code4lib organization's current offerings include a website that collects and organizes LIS-related technology job listings.  The results of the text analysis of this dataset suggest the currently vital technology skills and concepts that existing and aspiring practitioners may target in their continuing education as developers.


Information Technology and Libraries: Evaluation of Semi-Automatic Metadata Generation Tools: A Survey of the Current State of the Art

planet code4lib - Mon, 2015-09-21 04:00
Assessment of the current landscape of semi-automatic metadata generation tools is particularly important considering the rapid development of digital repositories and the recent explosion of big data. Utilization of (semi)automatic metadata generation is critical in addressing these environmental changes and may be unavoidable in the future considering the costly and complex operation of manual metadata creation. To address such needs, this study examines the range of semi-automatic metadata generation tools (n=39) while providing an analysis of their techniques, features, and functions. The study focuses on open-source tools that can be readily utilized in libraries and other memory institutions. The challenges and current barriers to implementation of these tools were identified. The greatest area of difficulty lies in the fact that the piecemeal development of most semi-automatic generation tools only addresses part of the issue of semi-automatic metadata generation, providing solutions to one or a few metadata elements but not the full range of elements. This indicates that significant local efforts will be required to integrate the various tools into a coherent, working whole. Suggestions toward such efforts are presented for future developments that may assist information professionals with incorporation of semi-automatic tools within their daily workflows.

Information Technology and Libraries: Hidden online surveillance: What librarians should know to protect their own privacy and that of their patrons

planet code4lib - Mon, 2015-09-21 04:00
Librarians have a professional responsibility to protect the right to access information free from surveillance. This right is at risk from a new and increasing threat: the collection and use of non-personally identifying information such as IP addresses through online behavioral tracking. This paper provides an overview of behavioral tracking, identifies the risks and benefits, describes the mechanisms used to track this information, and offers strategies that can be used to identify and limit behavioral tracking. We argue that this knowledge is critical for librarians in two interconnected ways. First, librarians should be evaluating recommended websites with respect to behavioral tracking practices to help protect patron privacy; second, they should be providing digital literacy education about behavioral tracking to empower patrons to protect their own privacy online.

DuraSpace News: The UK’s Largest Health Board, NHS Greater Glasgow and Clyde, Joins Open Repository

planet code4lib - Mon, 2015-09-21 00:00

From James Evans, Product Manager, Open Repository

DuraSpace News: #VIVO15 Conference Materials Now in Figshare!

planet code4lib - Mon, 2015-09-21 00:00

We are delighted to partner with figshare to make #VIVO15 presentations openly available through their new figshare for institutions service. This makes the terrific work of the VIVO community more accessible than ever in a beautiful, easy-to-use interface. Moreover, the materials are now persistent and citable with a DOI. Our sincere thanks to the figshare and Digital Science teams for their work to make this possible!

Manage Metadata (Diane Hillmann and Jon Phipps): Semantic Versioning and Vocabularies

planet code4lib - Sun, 2015-09-20 23:41

A decade ago, when the Open Metadata Registry (OMR) was just being developed as the NSDL Registry, the vocabulary world was a very different place than it is today. At that point we were tightly focussed on SKOS (not fully cooked at that point, but Jon was on the WG that was developing it, so we felt pretty secure diving in).

But we were thinking about versioning in the Open World of RDF even then. The NSDL Registry kept careful track of all changes to a vocabulary (who, what, when) and the only way to get data in was through the user interface. We ran an early experiment in making versions based on dynamic, timestamp-based snapshots (we called them ‘time slices’, Git calls them ‘commit snapshots’) available for value vocabularies, but this failed to gain any traction. This seemed to be partly because, well, it was a decade ago for one, and while it attempted to solve an Open World problem with versioned URIs, it created a new set of problems for Closed World experimenters. Ultimately, we left the versions issue to sit and stew for a bit (6 years!).

All that started to change in 2008 as we started working with RDA, and needed to move past value vocabularies into properties and classes, and beyond that into issues around uploading data into the OMR. Lately, Git and GitHub have started taking off and provide a way for us to make some important jumps in functionality that have culminated in the OMR/GitHub-based RDA Registry. Sounds easy and intuitive now, but it sure wasn’t at the time, and what most people don’t know is that the OMR is still where RDA/RDF data originates — it wasn’t supplanted by Git/Github, but is chugging along in the background. The OMR’s RDF CMS is still visible and usable by all, but folks managing larger vocabularies now have more options.

One important aspect of the use of Git and GitHub was the ability to rethink versioning.

Just about a year ago our paper on this topic (Versioning Vocabularies in a Linked Data World, by Diane Hillmann, Gordon Dunsire and Jon Phipps) was presented to the IFLA Satellite meeting in Paris. We used as our model the way software on our various devices and systems is updated–more and more these changes happen without much (if any) interaction with us.

In the world of vocabularies defining the properties and values in linked data, most updating is still very manual (if done at all), and the important information about what has changed and when is often hidden behind web pages or downloadable files that provide no machine-understandable connections identifying changes. And just solving the change management issue does little to solve the inevitable ‘vocabulary rot’ that can make published ‘linked data’ less and less meaningful, accurate, and useful over time.
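To make “machine-understandable connections identifying changes” a bit more concrete, here is a small sketch using Python’s rdflib. It is only an illustration of the idea, not how the OMR or the RDA Registry actually publishes changes: the namespace, the concept, the semver-style owl:versionInfo string, and the use of dcterms:modified and skos:changeNote are all assumptions on my part.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, OWL, RDF, SKOS, XSD

EX = Namespace("http://example.org/vocab/")  # placeholder vocabulary namespace

g = Graph()
concept = EX["1001"]

# The concept itself, with its current label.
g.add((concept, RDF.type, SKOS.Concept))
g.add((concept, SKOS.prefLabel, Literal("Cartographic dataset", lang="en")))

# Machine-readable change information: a semantic-version string, the date of
# the last change, and a note saying what changed -- so a consumer can detect
# and explain updates without scraping web pages or diffing downloaded files.
g.add((concept, OWL.versionInfo, Literal("2.1.0")))
g.add((concept, DCTERMS.modified, Literal("2015-09-20", datatype=XSD.date)))
g.add((concept, SKOS.changeNote,
       Literal("Broadened the definition to cover born-digital maps.", lang="en")))

print(g.serialize(format="turtle"))
```

A client that cached the vocabulary could then compare owl:versionInfo or dcterms:modified values on each fetch and update only what has actually changed.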

Building stable change management practices is a critical missing piece of the linked data publishing puzzle. The problem will grow exponentially as language versions and inter-vocabulary mappings start to show up as well — and it won’t be too long before that happens.

Please take a look at the paper and join in the conversation!

Ed Summers: Three Papers, Three Cultures

planet code4lib - Sun, 2015-09-20 13:07

Here are my reading notes for week 4 of the Engaged Intellectual. Superficially these papers seemed oriented around the three cultures of social science, the humanities and the physical sciences. But there were some interesting cross-currents between them.

Crotty, M. (1998). The foundations of social research : Meaning and perspective in the research process. London ; Thousand Oaks Calif.: Sage Publications.

Crotty starts out by outlining a set of questions that must be answered when embarking on a research project:

  1. What methods do we propose to use?
  2. What methodology governs our choice and use of methods?
  3. What theoretical perspective lies behind the methodology in question?
  4. What epistemology informs this theoretical perspective?

He then goes on to define this terminology, while pointing out that we often talk about them all together:

  • methods: the techniques or procedures (e.g. Participant Observation)
  • methodology: the strategy or design of the chosen methods that fit the desired outcomes (e.g. Ethnography)
  • theoretical perspective: the philosophical underpinnings for the methodology (e.g. Symbolic Interactionism)
  • epistemology: the theory of knowledge that is embedded in the theory; how we know what we know (e.g. Constructivism)

The pragmatist in me wants to pause here to reflect that the real value does not lie in the truth of this picture of research, but in its usefulness for distinguishing between all the concepts that are flying around when learning about research. A firm understanding of these different levels helps ground the decisions made about what methodologies to use, and how to interpret the results.

The theoretical perspective is often assumed as part of the methodology and needs to be made explicit. Epistemology (how we know) is distinguished from Ontology (the study of being). They often get muddled up too. Crotty points out that we often start with particulars: a specific problem that needs to be solved, a research question or set of research questions.

We typically start with a real-life issue that needs to be addressed, a problem that needs to be solved, a question that needs to be answered. We plan our research in terms of that issue or problem or question. What, we go on to ask, are the objectives of our research? What strategy seems likely to provide what we are looking for? What does that strategy direct us to do to achieve our aims and objectives? In this way our research question, incorporating the purposes of our research, leads us to methodology and methods. (Crotty, 1998, p. 13)

The methodology needs to be defended, so that people will understand the results. Methodologies are created based on need, and it can help to understand the menu of methodologies that are available. Methodologies can also be combined, and of course new ones can be created.

The discussion has helped me partially unravel my own muddled thoughts about what I want to research from how I want to research. It at least helped me feel OK about being muddled! I have come into the program wanting to study Web archives, specifically the ways we decide what to archive, also known as appraisal. At the same time I am interested in looking at these decisions as an expression of individual and collective notions of values in particular contexts. How are these values arrived at? What’s the best way to study them? As I’ve mentioned in the past, reading Steven Johnson’s work about repair was an inspiration for me to enter the PhD program, so I’m hoping to use his ethnographic approach, possibly in combination with Geiger’s trace ethnography (Geiger & Ribes, 2011). But as far as methods go, I’m still not sure a) what methods this approach suggests, and b) whether they are a good fit for the problem I’m studying (appraisal in Web archives). Hopefully mixed methods will grant me some license to use several methods in a coherent way.

This was a lot to pack into a book introduction. I might have to return to read more Crotty (1998) when there is time since he was able to explain some pretty complicated things in a very clear, compelling and useful way.

Uzzi, B., Mukherjee, S., Stringer, M., & Jones, B. (2013). Atypical combinations and scientific impact. Science, 342(6157), 468–472.

It was really quite interesting having read (Crotty, 1998) beforehand, because it made me think about this paper a little bit differently in terms of method, methodology, theory, and epistemology. They are trying to measure the novelty of scientific ideas, in an attempt to show that high impact articles combine conventional and unusual ideas. At first it seemed like it would be extremely hard to try to measure novelty. The methods were firmly from descriptive and inferential statistics (the methodology), from statistical theory. Wikipedia tells me that statistics are from the Formal Epistemology school which uses model and inference to answer questions.

One of the epistemological foundations of the paper also seems to be that Science (writ large) is measurable by the body of scientific literature and the citations found within it. I guess the authors would agree that it is possible that non-documentary factors could determine the creativity and impact of scientific ideas as well. But this does nothing to necessarily invalidate their own findings. Since the Philosophical Transactions in 1887 the very idea of science has been explicitly tied to publishing. But this is a relatively new occurrence, and the history of science extends much further back in time.

Still the authors have done a nice job of using the structure of the literature to infer novelty. It’s pretty cool that they were able to use all the literature (no need to sample), and it was already structured and easy to process. Hats off to them for writing this article without mentioning Big Data once. I guess they must have had a contact at Thomson Reuters to get access. One caveat here is the biases built in to Web of Science when it comes to what is indexed and how. WoS is a black box of sorts. Also, only data for 1980-2000 was examined, and the scientific endeavor has been going on for a while.

I liked how they cited Darwin and Einstein so that they could increase their own z-score. The Google-colored cubes in Fig 2 are also cute in their suggestiveness. What if this were applied to the Web graph, where a journal was replaced with a website, and an article was replaced with a web page?

At any rate, their z-score approach seemed like an interesting technique.
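If I am reading the method right (this is my gloss, not the paper's notation), the conventionality of each pair of journals i and j that appear together in a paper's reference list is scored as a plain z-score against an ensemble of randomized citation networks:

\[
z_{ij} = \frac{\mathrm{obs}_{ij} - \mathrm{exp}_{ij}}{\sigma_{ij}}
\]

where obs_ij is the observed number of papers citing the pair together, and exp_ij and sigma_ij are the mean and standard deviation of that count across the randomized networks. Large positive z means a conventional pairing; low or negative z means an atypical one, which is how they get at their claim that high impact articles mix mostly conventional pairings with a few unusual ones.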

Mayer, N. (2004, October). Reclaiming our history: The mysterious birth of art and design. American Institute of Graphic Arts.

Professor Kraus asked us to focus on the rhetoric that Mayer uses when reading this piece. Immediately I was struck by a) the number of large, colorful, and really quite beautiful images, and b) the sparse layout, which read almost like poetry in places. I guess it helps to know that this was a presentation at a conference, presumably slides with reading notes. It feels like it was written to be heard and seen rather than read, although it reads well. Her format underlines her essential point that reading words is ultimately looking at symbols and imagery.

Mayer seems focused on helping us see how dependent we are on culture for interpreting symbols. I liked how she humorously used the perceived penises and vulvas in the cave paintings to deconstruct the anthropologists who were studying the paintings. I also liked how she ultimately grounded the analysis in art, and the multiple ways of seeing as subjects, and as groups. I was reminded of Neil Gaiman’s Long Now talk How Stories Last, in which he talked about the effort by Thomas Sebeok and the Human Interference Task Force of the Department of Energy to figure out ways of warning people in the future about the nuclear waste stored underneath Yucca Mountain.

How Stories Last is a very entertaining talk about stories and information if you have 45 minutes spare. Here is the specific segment about Sebeok’s work if you want to quickly listen. As Gaiman so artfully points out, Sebeok’s advice in the end was to use culture to create an information relay across the generations. Here are Sebeok’s words in the conclusion of his report:

It follows that no fail-safe method of communication can be envisaged 10,000 years ahead. To be effective, the intended messages have to be recoded, and recoded again and again, at relatively brief intervals. For this reason, a “relay-system” of communication is strongly recommended, with a built-in enforcement mechanism, for dramatic emphasis here dubbed an “atomic priesthood”, i.e., a commission, relatively independent of future political currents, self-selective in membership, using whatever devices for enforcement are at its disposal, including those of a folkloristic character. (Sebeok, 1984, p. 28)

This in turn reminds me of Janée, Frew, & Moore (2009) which talks about how digital preservation systems can model this type of relay…but that’s a rabbit hole for another post. I only brought this up to highlight Mayer’s essential point that our understanding of information is mediated by shared culture and context. We are lost without it. But understanding is always imperfect. All models are wrong but some are useful. Maybe learning the art of feeling good lost has its uses. Yes, I’m just seeing if you are still awake. Hi!


Crotty, M. (1998). The foundations of social research : Meaning and perspective in the research process. London ; Thousand Oaks Calif.: Sage Publications.

Geiger, R. S., & Ribes, D. (2011). Trace ethnography: Following coordination through documentary practices. In 44th hawaii international conference on system sciences (pp. 1–10). IEEE. Retrieved from

Janée, G., Frew, J., & Moore, T. (2009). Relay-supporting archives: Requirements and progress. International Journal of Digital Curation, 4(1), 57–70.

Sebeok, T. A. (1984). Communication measures to bridge ten millennia. (6705990). United States Department of Energy. Retrieved from

District Dispatch: IMLS Webinar: strengthen your executive skills

planet code4lib - Fri, 2015-09-18 20:40

How about investing a couple hours to learn how libraries and museums can strengthen executive skills?

Mind in the Making (MITM), a program of the Families and Work Institute (FWI), and partner, the Institute of Museum and Library Services (IMLS), will present a free webinar for museum and library professionals on executive function life skills. The webinar will feature findings from the just-released groundbreaking report, Brain-Building Powerhouses: How Museums and Libraries Can Strengthen Executive Function Life Skills.

The webinar presenters include report contributors Mimi Howard and Andrea Camp, Mind in the Making author and FWI President Ellen Galinsky, and IMLS Supervisory Grants Management Specialist Helen Wechsler. They will discuss new findings from research on brain development, the importance of executive function skills, and how museums and libraries across the country are incorporating this research into their programs and exhibits.

Some of the outstanding initiatives in museums and libraries featured in the report will be presented in the webinar by the following:

• Laurie Kleinbaum Fink, Science Museum of Minnesota
• Stephanie Terry, Children’s Museum of Evansville
• Kerry Salazar, Portland Children’s Museum
• Kimberlee Kiehl, Smithsonian Early Enrichment Center
• Holly Henley, Arizona State Library
• Anne Kilkenny, Providence Public Library
• Kathy Shahbodaghi, Columbus Metropolitan Library

Executive function skills are built on the brain processes we use to pay attention and exercise self control, to hold information in our minds so that we can use it, and to think flexibly. These skills become foundational for other skills, including delaying gratification, understanding the perspectives of others, reflection, innovation, critical thinking, problem solving, and taking on challenges.

Webinar: Brain-Building Powerhouses: How Museums and Libraries Can Strengthen Executive Function Life Skills
Date:        Tuesday, September 22, 2015
Time:        2:00 PM EDT

Link:         Join the webinar with this link to the Blackboard Collaborate system.
Ph code:  1-866-299-7945, Enter guest code 5680404#

Note: IMLS-hosted webinars use the Blackboard Collaborate system. If you are a first-time user of Blackboard Collaborate, click here to check your system compatibility in advance of the webinar. You will be able to confirm that your operating system and Java are up-to-date, and enter a Configuration Room that will allow you to configure your connection speed and audio settings before the IMLS webinar begins. (If you choose to enter a Configuration Room, please note that the IMLS webinar will use Blackboard version 12.6.)

      # # #

The post IMLS Webinar: strengthen your executive skills appeared first on District Dispatch.

SearchHub: How CareerBuilder Executes Semantic and Multilingual Strategies with Apache Lucene/Solr

planet code4lib - Fri, 2015-09-18 18:35
As we count down to the annual Lucene/Solr Revolution conference in Austin this October, we're highlighting talks and sessions from past conferences. Today, we're highlighting Trey Grainger's session on multilingual search at CareerBuilder.

When searching on text, choosing the right CharFilters, Tokenizer, stemmers, and other TokenFilters for each supported language is critical. Additional tools of the trade include language detection through UpdateRequestProcessors, parts of speech analysis, entity extraction, stopword and synonym lists, relevancy differentiation for exact vs. stemmed vs. conceptual matches, and identification of statistically interesting phrases per language. For multilingual search, you also need to choose between several strategies, such as 1) searching across multiple fields, 2) using a separate collection per language combination, or 3) combining multiple languages in a single field (custom code is required for this and will be open sourced), each with their own strengths and weaknesses depending upon your use case. This talk will provide a tutorial (with code examples) on how to pull off each of these strategies. We will also compare and contrast the different kinds of stemmers, discuss the precision/recall impact of stemming vs. lemmatization, and describe some techniques for extracting meaningful relationships between terms to power a semantic search experience per language. Come learn how to build an excellent semantic and multilingual search system using the best tools and techniques Lucene/Solr has to offer!

Trey Grainger is the Director of Engineering for Search & Analytics at CareerBuilder and is the co-author of Solr in Action (2014, Manning Publications), the comprehensive example-driven guide to Apache Solr. His search experience includes handling multilingual content across dozens of markets/languages, machine learning, semantic search, big data analytics, customized Lucene/Solr scoring models, data mining, and recommendation systems. Trey is also the founder of a gluten-free search engine and is a frequent speaker at Lucene- and Solr-related conferences.

Semantic & Multilingual Strategies in Lucene/Solr: Presented by Trey Grainger, CareerBuilder from Lucidworks

Join us at Lucene/Solr Revolution 2015, the biggest open source conference dedicated to Apache Lucene/Solr, on October 13-16, 2015 in Austin, Texas. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…
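As a rough sketch of strategy 1 above (searching across multiple per-language fields), here is what a query against Solr's eDisMax parser might look like over HTTP. The core name, field names, and boosts are invented placeholders, not CareerBuilder's actual schema, and the example assumes the per-language fields already exist with appropriate analyzers.

```python
import requests

# Placeholder core and per-language fields; substitute your own schema.
SOLR_SELECT = "http://localhost:8983/solr/jobs/select"

params = {
    "q": "software engineer",
    "defType": "edismax",          # eDisMax query parser
    # Strategy 1: one query spread across several per-language fields,
    # boosting titles over descriptions.
    "qf": "title_en^2 title_es^2 title_de^2 "
          "description_en description_es description_de",
    "wt": "json",
    "rows": 10,
}

resp = requests.get(SOLR_SELECT, params=params)
resp.raise_for_status()

for doc in resp.json()["response"]["docs"]:
    title = doc.get("title_en") or doc.get("title_es") or doc.get("title_de")
    print(doc.get("id"), title)
```

The other two strategies change where the language split lives (separate collections, or multiple languages analyzed into one field) rather than the query pattern itself.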

The post How CareerBuilder Executes Semantic and Multilingual Strategies with Apache Lucene/Solr appeared first on Lucidworks.

DPLA: New Contract Opportunity: Request for Proposals for

planet code4lib - Fri, 2015-09-18 14:57

The Digital Public Library of America and Europeana invite interested and qualified individuals or firms to submit a proposal for development related to the infrastructure for the International Rights Statements Working Group.

  • RFP issued: 18 September 2015
  • Deadline for proposals: 00:00 GMT, 6 October 2015
  • Work is to be performed no sooner than 8 October 2015.
  • Functional prototypes for components A and C must be completed by 24 December 2015.
  • Work for components A, B, and C below must be completed by 15 January 2016.

This document specifies the project scope and requirements for technical infrastructure supporting a framework and vocabulary of machine-readable rights statements under development by the International Rights Statements Working Group, a joint Digital Public Library of America (DPLA)–Europeana Foundation working group.

The working group shall provide and maintain RDF descriptions of the rights statements, with canonical serializations in Turtle, modeled as a vocabulary in the Simple Knowledge Organization System (SKOS). These descriptions will include multiple official translations of each statement, and support versioning of the statements and/or vocabulary scheme. Alongside the statement descriptions, the working group will produce a summary of the data model and properties used.

The contractor will provide an implementation that acts as a platform for hosting these statements. The platform consists of an application for publishing the rights statements according to linked data best practices and a content management system to be used by the working group to publish materials related to the project. These two components should provide the feel of an integrated user experience, and must be served publicly from a single domain. As part of this contract, the contractor will also provide one year of maintenance, support, and security updates for these components, their dependencies, and the operating systems for servers on which they are deployed.

Components

Component A. Rights statements application

A web application that provides both machine-readable representations of the rights statements (in RDF serializations including JSON-LD and Turtle) and human-readable representations. The working group will provide a canonical version of the rights statements in Turtle-serialized RDF as needed for launch, as well as a testing version used to implement and test specific features, including, but not limited to, versions (see 3a) and translations (see 4b and 4c).

  1. Human readable representations
    1. The application shall provide a human-readable web page representing each rights statement, with provision for versions, multiple language support, and additional request parameters as described in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
    2. All human-readable representations shall be generated from the canonical Turtle-serialized RDF.
    3. Human-readable representations must be available as HTML5 with RDFa 1.1 or RDFa 1.1 Lite.
    4. Human-readable representations must provide links to the RDF representations listed below.
  2. RDF representations
    1. The application shall provide multiple RDF serializations of the individual rights statements through content negotiation on the statement URI. Minimally, it must support Turtle and JSON-LD. Additional serializations are desirable but not required. (See the non-normative sketch following the Component A requirements below.)
    2. The application shall provide multiple RDF serializations of the entire vocabulary through content negotiation on the vocabulary version URI.  The vocabulary shall support the same RDF serializations as the individual statements.
    3. All RDF serializations must be equivalent to the canonical Turtle-serialized RDF.
  3. Versions
    1. The application shall support multiple versions of each statement. The structure of the versions is described in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
    2. Otherwise valid statement URIs that omit the version number should respond with code 404.
  4. Languages and translation
    1. Human-readable representations should dynamically handle requests for translations of the statements through HTTP Accept-Language headers and through the use of parameters as specified in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
    2. The working group will provide text in one or more languages for each statement as RDF language-tagged literals in compliance with IETF BCP47. All language-tagged literals will be encoded as UTF-8.
    3. The working group will provide translations for content not derived from the statement RDF, e.g., navigational elements. The application will support this through an internationalization framework, e.g., GNU gettext.
  5. Additional request parameters
    1. For specific statements, human-readable representations must accept query string parameters and generate a view of the statement enhanced by additional metadata described in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
  6. Resource URIs and HTTP request patterns
    1. The HTTP behavior of the application shall follow the URI structure and interaction patterns described in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
    2. Resources must follow best practices for serving both human- and machine-readable representations for linked data vocabularies.
  7. Visual identity
    1. The working group will provide static HTML templates developed by another vendor charged with implementing the site’s visual identity.
    2. These templates must be transformed to work in the context of the application to ensure that human-readable representations follow the visual identity of the site as provided by the working group.
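To make the content-negotiation and language-handling requirements above concrete, here is a minimal, non-normative sketch of the expected request handling. It is illustrative only and not a required deliverable: the Flask framework, the example.org URI pattern, and the hard-coded statement record are placeholder assumptions standing in for the canonical Turtle-serialized RDF that the working group will supply.

```python
import json
from flask import Flask, abort, request

app = Flask(__name__)

# Placeholder record standing in for the canonical Turtle-serialized RDF.
STATEMENTS = {
    "InC": {"1.0": {"label": {"en": "In Copyright", "de": "Urheberrechtlich geschützt"}}},
}

SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPES = ["text/turtle", "application/ld+json"]


@app.route("/vocab/<stmt>/<version>/")
def statement(stmt, version):
    data = STATEMENTS.get(stmt, {}).get(version)
    if data is None:
        abort(404)  # unknown statement, or a URI with a missing/unknown version

    uri = "http://example.org/vocab/%s/%s/" % (stmt, version)  # placeholder URI pattern
    best = request.accept_mimetypes.best_match(RDF_TYPES + ["text/html"],
                                               default="text/html")

    if best == "text/turtle":
        body = '<%s> a <%sConcept> ;\n    <%sprefLabel> "%s"@en .\n' % (
            uri, SKOS, SKOS, data["label"]["en"])
        return body, 200, {"Content-Type": "text/turtle"}

    if best == "application/ld+json":
        doc = {
            "@id": uri,
            "@type": SKOS + "Concept",
            SKOS + "prefLabel": [{"@value": v, "@language": k}
                                 for k, v in data["label"].items()],
        }
        return json.dumps(doc), 200, {"Content-Type": "application/ld+json"}

    # Human-readable fallback: pick a label language from the Accept-Language header.
    lang = request.accept_languages.best_match(list(data["label"]), default="en")
    return "<!DOCTYPE html><html lang='%s'><body><h1>%s</h1></body></html>" % (
        lang, data["label"][lang])
```

A conforming implementation would of course generate all of these representations from the canonical Turtle (per requirement 1.2) rather than from hard-coded data, but the shape of the request handling would be similar.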
Component B. Content management system

An implementation of an off-the-shelf, free/libre/open source content management system (CMS), and associated plugins to publish pages about the project and initiative, related publications, etc.

  1. The CMS will be separate from the rights statements application.
  2. The CMS may be a static site generator.
  3. The CMS should support multilingual versions of content, either natively or through the use of plugin modules.
  4. A theme or templates for the CMS must be provided, which follow the visual identity defined for the site.
  5. The CMS must provide export of static content (text and multimedia).
  6. All content will be edited and maintained by members of the working group.
Component C. Server configuration, deployment, and maintenance implementation

An implementation of an existing free/libre/open source configuration management and deployment automation system, and any needed templates, scripts, etc., used to install dependencies, to configure and deploy components A and B above, and to manage the servers.

  1. The implementation must be published to a version control repository under the working group’s organization on GitHub.
  2. The implementation should support a shared set of configuration with templating to allow the components above to be deployed to a staging virtual machine and a production virtual machine using a common set of procedures.
  3. An implementation of an agentless configuration and deployment management system (e.g., Ansible) is strongly preferred.
  4. The implementation must include a configuration for an HTTP proxy server (e.g., Nginx, Apache HTTPD, etc.) that will allow components A and B to be presented together through a single domain name.
    1. The proxy server configuration must allow components A and B to be served from a common domain name.
    2. The proxy server configuration should provide caching for requests that respects the HTTP interaction patterns described in Requirements for the Technical Infrastructure for Standardized International Rights Statements.
  5. The vendor will also develop, execute, and provide reports for a load testing strategy for the implemented configuration.
Other restrictions

All components must run within a shared Linux virtual machine, preferably running Debian stable. The virtual machine will be hosted on a server physically located in a Luxembourg-based data center. The working group is providing both a staging environment and a production environment.

All materials developed during this project shall be released under open source/open content licensing. Source code will be licensed under the European Union Public License, version 1.1. Documentation will be licensed under a CC0 Public Domain Dedication.

Guidelines for proposals

All proposals must adhere to the following submission guidelines and requirements.

  • Proposals are due no later than 00:00 GMT, 6 October 2015.
  • Proposals should be sent via email to as a single PDF file attached to the message. Questions about the proposal can also be sent to this address.
  • Please format the subject line with the phrase “ Proposal – [Name of respondent].”
  • You should receive confirmation of receipt of your proposal no later than 00:00 GMT, 8 October 2015. If you have not received confirmation of your proposal by this time, please send an email to, otherwise follow the same guidelines as above.

All proposals should include the following:

  • Pricing, in US Dollars and/or Euros, as costs for each work component identified above, and as an hourly rate for any maintenance costs. The exchange rate will be set in the contract. The currency for payment will be chosen by the agent of the working group that is the party to this contract.
  • Proposed staffing plan, including qualifications of project team members (resumes/CVs and links or descriptions of previous projects such as open source contributions).
  • References, listing all clients/organizations with whom the proposer has done business similar to that required by this solicitation within the last three years.
  • Qualifications and experience, including
    • General qualifications and development expertise
      • Information about development and project management skills and philosophy
      • Examples of successful projects, delivered on time and on budget
      • Preferred tools and methodologies used for issue tracking, project management, and communication
      • Preferences for change control tools and methodologies
    • Project specific strategies
      • History of developing software in the library, archives, or museum domain
      • Information about experience with hosting and maintenance of RDF/SKOS vocabularies and linked data resources
  • Legal authority/capacity, or proof that the vendor is authorized to perform the contract under national law. Proof of the above is to be provided by (a copy of) a certificate of registration in the relevant trade or professional registers in the country of establishment/incorporation.
Contract guidelines
  • Proposals must be submitted by the due date.
  • Proposers are asked to guarantee their proposal prices for a period of at least 60 days from the date of the submission of the proposal.
  • Proposers must be fully responsible for the acts and omissions of their employees and agents.
  • The working group reserves the right to extend the deadline for proposals.
  • The working group reserves the right to include a mandatory meeting via teleconference with proposers individually before acceptance. Top scored proposals may be required to participate in an interview to support and clarify their proposal.
  • The working group reserves the right to negotiate with each contractor.
  • There is no allowance for project expenses, travel, or ancillary expenses that the contractor may incur.
  • Ownership of any intellectual property will be shared between the Digital Public Library of America and the Europeana Foundation.

LITA: Putting Pen to Paper

planet code4lib - Fri, 2015-09-18 14:00

Back in January, The Atlantic ran an article on a new device being used at the Cooper Hewitt design museum in New York City. This device allows museum visitors to become curators of their own collections, saving information about exhibits to their own special account they can access via computer after they leave. This device is called a pen; Robinson Meyer, the article’s author, likens it to a “gray plastic crayon the size of a turkey baster”. I think it’s more like a magic wand.

Courtesy of the Cooper Hewitt Museum website

Not only can you use the pen to save information you think is cool, you can also interact with the museum at large: in the Immersion Room, for example, you can draw a design with your pen and watch it spring to life on the walls around you. In the Process Lab, you use the pen to solve real-life design problems. As Meyer puts it, “The pen does something that countless companies, organizations, archives, and libraries are trying to do: It bridges the digital and the physical.”

The mention of libraries struck me: how could something like the Cooper Hewitt pen be used in your average public library?

The first thing that came to my mind was RFID. In my library, we use RFID to tag and label our materials. There are currently RFID “wands” that, when waved over stacks, can help staff locate books they thought were missing.

But let’s turn that around: give the patron the wand – rather, the pen – and program in a subject they’re looking for…say, do-it-yourself dog grooming. As the patron wanders, the pen talks with the stacks via RFID, asking where those materials are. Soon the pen vibrates and a small LED light shines on the materials. Eureka!
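Purely as a thought experiment, the pen-side logic of that interaction might look something like the sketch below. Everything in it is imagined: the RFID read function, the subject headings attached to tags, and the vibration/LED call are stand-ins, since no such pen or stack-side subject index exists.

```python
import time

SUBJECT = "dog grooming"  # the subject the patron programmed into the pen


def read_nearby_tags():
    """Imagined RFID read: tags within range, with subject headings attached."""
    return [
        {"tag_id": "A123", "subjects": ["dogs -- grooming", "pets"]},
        {"tag_id": "B456", "subjects": ["cooking, french"]},
    ]


def matches(subject, item):
    """Crude keyword match between the programmed subject and an item's headings."""
    words = subject.lower().split()
    return any(word in heading.lower() for heading in item["subjects"] for word in words)


def notify(hits):
    """Stand-in for the vibration motor and LED."""
    print("Buzz! Shining LED on items:", [h["tag_id"] for h in hits])


def wander():
    """Keep polling the shelves as the patron walks until something matches."""
    while True:
        hits = [tag for tag in read_nearby_tags() if matches(SUBJECT, tag)]
        if hits:
            notify(hits)
            break
        time.sleep(0.5)


if __name__ == "__main__":
    wander()
```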

Or, just as the Cooper Hewitt allows visitors to build their own virtual collection online, we can have patrons build their own virtual libraries. Using the same RFID scanning technology as before, patrons can link items they’ve already borrowed – or maybe want to view in the future – to their library card number. It could be a system similar to Goodreads (or maybe even link it to Goodreads itself) or it could be a personal website that only the user – not the library – has access to.

What are some ways you might be able to use this tech in your library system?

Code4Lib: studied

planet code4lib - Thu, 2015-09-17 16:47

Creating Tomorrow’s Technologists: Contrasting Information Technology Curriculum in North American Library and Information Science Graduate Programs against Code4lib Job Listings by Monica Maceli recently appeared in the Journal of Education for Library and Information Science 56.3 (DOI:10.12783/issn.2328-2967/56/3/3). As the title states, it studies listings on

This research study explores technology-related course offerings in ALA-accredited library and information science (LIS) graduate programs in North America. These data are juxtaposed against a text analysis of several thousand LIS-specific technology job listings from the Code4lib jobs website. Starting in 2003, as a popular library technology mailing list, Code4lib has since expanded to an annual conference in the United States and a job-posting website. The study found that database and web design/development topics continued to dominate course offerings with diverse sub-topics covered. Strong growth was noted in the area of user experience but a lack of related jobs for librarians was identified. Analysis of the job listings revealed common technology-centric librarian and non-librarian job titles, as well as frequently correlated requirements for technology skillsets relating to the popular foci of web design/development and metadata. Finally, this study presents a series of suggestions for LIS educators in order that they continue to keep curriculum aligned with current technology employment requirements.

