
planet code4lib

Planet Code4Lib - http://planet.code4lib.org

FOSS4Lib Recent Releases: ArchivesSpace - 1.5.0

Wed, 2016-07-20 17:45

Last updated July 20, 2016. Created by Peter Murray on July 20, 2016.

Package: ArchivesSpace
Release Date: Wednesday, July 20, 2016

OCLC Dev Network: Bib It: Choosing a UI Framework

Wed, 2016-07-20 15:00

Learn about the UI frameworks used in Bib It, a simple application that allows non-catalogers to add data to WorldCat.

Tim Ribaric: Another hot take... This time about Library Holdings

Wed, 2016-07-20 14:23

I came across another grad school library use dilemma.


LITA: Wearable Technology Resources

Wed, 2016-07-20 12:00

The world of wearable technology (WT) is fascinating, but a little overwhelming. Last month I attended the Digital Humanities Summer Institute where I completed a week-long course entitled “Palpability and Wearable Computing.” We engaged in movement exercises, experimented with sensors, learned about haptics, and critiqued consumer wearables including the Fitbit, Spire, Leaf, and Athos. I expected to walk away with some light-up sneakers, but instead I left with lots of questions, inspiration, and resources.

What follows is a list of books, videos, and project tutorials that I’ve found most helpful in my exploration of wearable technology.

Textile Messages | Edited by Leah Buechley, Kylie Peppler, Michael Eisenberg, and Yasmin Kafai

  • Textile Messages is a great primer; it includes a little bit of history, lots of project ideas, and ample discussion of working with WT in the classroom. This is the most practical resource I’ve encountered for librarians of all types.

    Textile Messages: Dispatches from the World of E-Textiles and Education

Garments of Paradise | Susan Elizabeth Ryan

  • The history of WT goes back further than you’d think. Chapter 1 of Garments of Paradise will take you all the way from the pocket watch to the electric dress to Barbarella.

Atsuko Tanaka models the electric dress, 1956.

MAKE Presents

  • If you want to make your own wearables, then you’ll need a basic understanding of electronics. MAKE magazine has a fantastic video series that will introduce you to Ohm’s Law, oscilloscopes, and a whole slew of teeny tiny components.

Wired Magazine

  • If you’re interested in consumer wearables, Wired will keep you up to date on all the latest gadgetry. Recent reviews include a temporary tattoo that measures UV exposure and Will.i.am’s smart watch.

    My UV Patch from L’Oreal is currently in development

Project Tutorials

  • One easy and inexpensive way to get started with WT is to create your own sensors. In class we created a stroke sensor made of felt and conductive thread. If you’re working with a limited budget, Textile Messages has an entire chapter devoted to DIY sensors.  
  • Adafruit is a treasure trove of project tutorials. Most of them are pretty advanced, but it’s interesting to see how far you can go with DIY projects even if you’re not ready to take them on yourself.
  • Sparkfun is a better option if you’re interested in projects for beginners.

My first attempt at making a stroke sensor

What WT resources have you encountered?

pinboard: The Code4Lib Journal – Metadata Analytics, Visualization, and Optimization: Experiments in statistical analysis of the Digital Public Library of America (DPLA)

Tue, 2016-07-19 22:12
RT @chrpr: Hey so I published a thing about visualization & predictive analytics of #DPLA metadata in the #code4lib journal.

Mita Williams: The Observer or Seeing What You Mean

Tue, 2016-07-19 20:43

If you are new to my writing, my talks and work tend to resemble an entanglement of ideas. Sometimes it all comes together in the end and sometimes I know that I’ve just overwhelmed my audience.

I’m trying to get better at reducing the sheer amount of information I convey in a single sitting. So for this post, I’m going to tell you briefly what I’m going to say before I tell you what I’m going to say in a more meandering fashion.

In brief, libraries would do better to acknowledge the role of the observer in our work.

Now, true to my meandering style, we need to walk back a bit before we can move forward. In fact, I’m going to ask you to look back at my last post (“The Library Without a Map”), which was about how traditional library catalogues do a poor job of modeling subject relationships and how non-traditional libraries such as The Prelinger Library have tried to improve discovery through their own means of organization.

One of the essays I linked to about The Prelinger was from a zine series called Situated Knowledges; Issue 3 is devoted to The Prelinger Library. It is the only zine series I know of that’s been named after a journal article:

Situated Knowledges: The Science Question in Feminism and the Privilege of Partial Perspective
Donna Haraway
Feminist Studies
Vol. 14, No. 3 (Autumn, 1988), pp. 575-599
Published by: Feminist Studies, Inc.
DOI: 10.2307/3178066
Stable URL: http://www.jstor.org/stable/3178066
Page Count: 25

I have to admit that I struggled with this paper, but in the end I was glad to have worked through the struggle. To sum up the paper in one sentence: we need to resist the idea that there exists a ‘god-like’ vision of objectivity and remember that our vision and our knowledge are limited by location and situation. Or as Haraway puts it:

I want a feminist writing of the body that metaphorically emphasizes vision again, because we need to reclaim that sense to find our way through all the visualizing tricks and powers of modern sciences and technologies that have transformed the objectivity debates. We need to learn in our bodies to name where we are and are not, in dimensions of mental and physical space we hardly know how to name. So, not so perversely, objectivity turns out to be about particular and specific embodiment and definitely not about the false vision promising transcendence of all limits and responsibility. The moral is simple: only partial perspective promises objective vision. All Western cultural narratives about objectivity are allegories of the ideologies governing the relations of what we call mind and body, distance and responsibility. Feminist objectivity is about limited location and situated knowledge, not about transcendence and splitting of subject and object. It allows us to become answerable for what we learn how to see.

I’ve been thinking a lot recently about the power of the observer.

On my other blog, The Magnetic North, I wrote about how a world-weariness brought on by watching tragedies unfold on social media has led me to spend more time with art. I go on to suggest that being better versed in observing art without the burden of taste might help us better navigate a world that shows us only what we choose to see and perhaps even bring about a more just world.

But on this blog, I want to direct your attention to a more librarian-focused reason to be concerned with the matter of the observer.

You see, after I published my last post about how poorly our library catalogue handles subject headings, I received a recommended read from Trevor Owens:

@copystar @shannonmattern y’all read Murray & Tillett’s “Cataloging Theory in Search of Graph Theory” https://t.co/30gr2TuKQR

— Trevor Owens (@tjowens) July 11, 2016

I found the paper super interesting. But among all the theory, I have to admit my favourite takeaway from the paper was that its model incorporates business rules as a means to capture an institution’s particular point of view, constraints, or reasons for interest. It is as if we are recognizing the constraints and situation of the observer who is describing a work:

Following the scientific community’s lead in striving to describe the physical universe through observations, we adapted the concept of an observation into the bibliographic universe and assert that cataloging is a process of making observations on resources. Human or computational observers following institutional business rules (i.e., the terms, facts, definitions, and action assertions that represent constraints on an enterprise and on the things of interest to the enterprise) create resource descriptions — accounts or representations of a person, object, or event being drawn on by a person, group, institution, and so on, in pursuit of its interests.

Given this definition, a person (or a computation) operating from a business rules–generated institutional or personal point of view, and executing specified procedures (or algorithms) to do so, is an integral component of a resource description process (see figure 1). This process involves identifying a resource’s textual, graphical, acoustic, or other features and then classifying, making quality and fitness for purpose judgments, etc., on the resource. Knowing which institutional or individual points of view are being employed is essential when parties possessing multiple views on those resources describe cultural heritage resources. How multiple resource descriptions derived from multiple points of view are to be related to one another becomes a key theoretical issue with significant practical consequences.

Murray, R. J., & Tillett, B. B. (2011). Cataloging theory in search of graph theory and other ivory towers: Object: Cultural heritage resource description networks. Information Technology and Libraries, 30(4), 170-184.

I’ll end this post with a video of the first episode of Ways of Seeing, a remarkable four-part series about art from the BBC in 1972. It is some of the smartest TV I have ever seen and begins with the matter of perspective and the observer:

The first episode is based on the ideas of Walter Benjamin’s The Work of Art in the Age of Mechanical Reproduction, which, I must admit with some shame, I still have not read.

Art takes into account the observer.

I’m not sure that librarianship does.

But perhaps this observation is not sound. Perhaps it is limited by my particular situation and point of view.

District Dispatch: There’s a reason why we complain

Tue, 2016-07-19 20:39


The Social Science Research Network (SSRN) could be called the “academic version” of user-generated content on the web. Scholars and academics generate content in the form of scholarly papers and post them on the SSRN for all to see, read, and comment on.  Often, academics who post their forthcoming papers or “pre-prints” intend to eventually publish them in scholarly journals that research libraries and academic societies acquire. But in the meantime, academics want to quickly share their works in a pre-published form on the SSRN.  It’s a valuable and heavily used resource with over 682,100 scholarly working papers and forthcoming papers freely available.

After the scholarly publisher Elsevier acquired SSRN in May, people thought, what the h***?! Many were inclined to think that Elsevier would develop a way to monetize SSRN, because Elsevier does that sort of thing; it has a history. It sells journal subscriptions to academics at lunatic prices (its current profit margin is more than 40%) by re-selling content produced by scholars who work at publicly funded higher education institutions. Then libraries have to find the money to purchase the journals…you know the story. (If not, see SPARC.) Elsevier assured those concerned that SSRN would remain unchanged, specifically that “both existing and future SSRN content will be largely unaffected.”

The Authors Alliance, whose members want to facilitate the “widespread access to works of authorship” and “disseminate knowledge,” was particularly concerned because SSRN is one of the primary venues for sharing works of social science rapidly and freely. So it asked Elsevier to accept principles affirming a willingness to honor the open access preferences of scholars.

Well, Elsevier did not. Surprise!

Last week, several authors noted that their papers had been removed from SSRN by Elsevier without notice. Apparently Elsevier wants to remove all papers whose copyright status is unclear. Ahh…come again? Elsevier is asking authors who have written an unpublished paper and have not transferred their copyright to submit documentation proving that they are the rights holders! What kind of world do we live in?

Now there is a movement by scholars and academics to drop SSRN.  Luckily, a new pre-print archive is under development. It is called SocArXiv. Stay tuned to the District Dispatch for more information.

The post There’s a reason why we complain appeared first on District Dispatch.

LibUX: The UX of VR

Tue, 2016-07-19 17:12

Max Glenister has curated a list of resources about the user experience of virtual reality. These range from actual code to conceptual principles and broadly applicable truisms about immersion and design, like

The last 40 years have seen the rise of the digital landscape; a two dimensional plane that abstracts familiar real-world concepts like writing, using a calendar, storing documents in folders into user interface elements (UI). This approach allows for a high level of information density and multitasking. The down-side is that new interaction models need to be learned and there is a higher cognitive load to decision making.

Matt Sundstrom, Immersive Design: Learning to Let Go of the Screen

The UX of VR by Max Glenister

The post The UX of VR appeared first on LibUX.

LITA: LITA Forum 2016 – Call for Library School Student Volunteers

Tue, 2016-07-19 15:28

2016 LITA Forum
Ft Worth, Texas
November 17-20, 2016

STUDENT REGISTRATION RATE AVAILABLE: 50% OFF THE REGULAR RATE ($180)

The Library and Information Technology Association (LITA), a division of the American Library Association, is offering a discounted student registration rate for the 2016 LITA Forum. This offer is limited to graduate students enrolled in ALA-accredited programs. In exchange for the lower registration cost, these graduate students will be asked to assist the LITA organizers and Forum presenters with onsite operations. This is a great way to network and meet librarians active in the field.

The selected students will be expected to attend the full LITA Forum, Friday noon through Sunday noon. Attendance during the preconferences on Thursday afternoon and Friday morning is not required. While you will be assigned a variety of duties, you will be able to attend the Forum programs, which include 3 keynote sessions, over 50 concurrent sessions, and poster presentations, as well as many opportunities for social engagement.

The Forum will be held November 17-20, 2016 at the Omni Hotel in Fort Worth, Texas. The student rate is $180 – half the regular registration rate for LITA members. A real bargain, this rate includes a Friday night reception, continental breakfasts, and Saturday lunch.

For more information about the Forum, visit http://litaforum.org. We anticipate an attendance of 300 decision makers and implementers of new information technologies in libraries.

To apply to be a student volunteer, complete and submit this form by September 30, 2016.

http://goo.gl/forms/e6UeOsfqTW0hhsfu2

You will be asked to provide the following:
1. Contact information, including email address and cell phone number
2. Name of the school you are attending
3. Statement of 150 words (or less) explaining why you want to attend the LITA National Forum

Those selected to be volunteers registered at the student rate will be notified no later than Friday, October 14, 2016.

Additional questions should be sent to Christine Peterson, peterson@amigos.org, or Mary Duffy, mduffy@southalabama.edu.

Code4Lib Journal: Editorial Introduction – Summer Reading List

Tue, 2016-07-19 15:08
New additions for your summer reading list!

Code4Lib Journal: Emflix – Gone Baby Gone

Tue, 2016-07-19 15:08
Enthusiasm is no replacement for experience. This article describes a tool developed at the Emerson College Library by an eager but overzealous cataloger. Attempting to enhance media-discovery in a familiar and intuitive way, he created a browseable and searchable Netflix-style interface. Though it may have been an interesting idea, many of the crucial steps that are involved in this kind of high-concept work were neglected. This article will explore and explain why the tool ultimately has not been maintained or updated, and what should have been done differently to ensure its legacy and continued use.

Code4Lib Journal: Introduction to Text Mining with R for Information Professionals

Tue, 2016-07-19 15:08
The 'tm: Text Mining Package' in the open source statistical software R has made text analysis techniques easily accessible to both novice and expert practitioners, providing useful ways of analyzing and understanding large, unstructured datasets. Such an approach can yield many benefits to information professionals, particularly those involved in text-heavy research projects. This article will discuss the functionality and possibilities of text mining, as well as the basic setup necessary for novice R users to employ the RStudio integrated development environment (IDE). Common use cases, such as analyzing a corpus of text documents or spreadsheet text data, will be covered, as well as the text mining tools for calculating term frequency, term correlations, clustering, creating wordclouds, and plotting.
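As a rough illustration of the kind of workflow the abstract describes (this sketch is not taken from the article itself, and the two sample documents are invented), a basic term-frequency analysis with the tm package is quite short:

    # A minimal text mining sketch with the tm package in R.
    # The documents are made-up stand-ins for a real corpus.
    library(tm)

    docs <- VCorpus(VectorSource(c(
      "The library digitized its map collection last year.",
      "Digitized maps from the library support new research."
    )))

    # Normalize the text before building the term-document matrix.
    docs <- tm_map(docs, content_transformer(tolower))
    docs <- tm_map(docs, removePunctuation)
    docs <- tm_map(docs, removeWords, stopwords("english"))

    tdm <- TermDocumentMatrix(docs)
    findFreqTerms(tdm, lowfreq = 2)   # terms appearing at least twice
    findAssocs(tdm, "library", 0.5)   # terms correlated with "library"

From a matrix like tdm, the wordclouds and plots the article mentions are a short step further.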

Code4Lib Journal: Data for Decision Making: Tracking Your Library’s Needs With TrackRef

Tue, 2016-07-19 15:08
Library services must adapt to changing patron needs. These adaptations should be data-driven. This paper reports on the use of TrackRef, an open source and free web program for managing reference statistics.

Code4Lib Journal: Are games a viable solution to crowdsourcing improvements to faulty OCR? – The Purposeful Gaming and BHL experience

Tue, 2016-07-19 15:08
The Missouri Botanical Garden and partners from Dartmouth, Harvard, the New York Botanical Garden, and Cornell recently wrapped up a project funded by IMLS called Purposeful Gaming and BHL: engaging the public in improving and enhancing access to digital texts (http://biodivlib.wikispaces.com/Purposeful+Gaming). The goal of the project was to significantly improve access to digital texts by applying purposeful gaming to the data enhancement tasks needed for content found within the Biodiversity Heritage Library (BHL). This article will share our approach in terms of game design choices and the use of algorithms for verifying the quality of inputs from players, as well as challenges related to transcriptions and marketing. We will conclude by giving an answer to the question of whether games are a successful tool for analyzing and improving digital outputs from OCR and whether we recommend their uptake by libraries and other cultural heritage institutions.
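The abstract doesn't spell out the verification algorithms, but one common approach to verifying crowdsourced transcriptions is to accept a value only when independent players agree. A toy sketch in R, with invented transcriptions, and not necessarily the project's actual method:

    # Toy agreement check for crowdsourced OCR corrections.
    transcriptions <- c("Quercus alba", "Quercus alba", "Quercus alha")

    votes <- table(transcriptions)
    if (max(votes) >= 2) {
      accepted <- names(which.max(votes))   # 2-of-3 agreement wins
    } else {
      accepted <- NA                        # no consensus; route to review
    }
    accepted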

Code4Lib Journal: From Digital Commons to OCLC: A Tailored Approach for Harvesting and Transforming ETD Metadata into High-Quality Records

Tue, 2016-07-19 15:08
The library literature contains many examples of automated and semi-automated approaches to harvesting electronic theses and dissertations (ETD) metadata from institutional repositories (IR) into the Online Computer Library Center (OCLC). However, most of these approaches could not be implemented with the institutional repository software Digital Commons for various reasons, including proprietary schema incompatibilities and demands for a level of programming expertise our institution did not want to pursue. Only one semi-automated approach found in the library literature met our requirements for implementation, and even though it catered to the particular needs of the DSpace IR, it could be applied to other IR software with further customization. The following paper presents an extension of this semi-automated approach originally created by Deng and Reese, customized and adapted to address the particular needs of the Digital Commons community and updated to integrate the latest Resource Description & Access (RDA) content standards for ETDs. Advantages and disadvantages of this workflow are discussed as well.

Code4Lib Journal: Metadata Analytics, Visualization, and Optimization: Experiments in statistical analysis of the Digital Public Library of America (DPLA)

Tue, 2016-07-19 15:08
This paper presents the concepts of metadata assessment and “quantification” and describes preliminary research results applying these concepts to metadata from the Digital Public Library of America (DPLA). The introductory sections provide a technical outline of data pre-processing, and propose visualization techniques that can help us understand metadata characteristics in a given context. Example visualizations are shown and discussed, leading up to the use of "metadata fingerprints" (D3 star plots) to summarize metadata characteristics across multiple fields for arbitrary groupings of resources. Fingerprints are shown comparing metadata characteristics for different DPLA "Hubs" and also for used versus not-used resources based on Google Analytics "pageview" counts. The closing sections introduce the concept of metadata optimization and explore the use of machine learning techniques to optimize metadata in the context of large-scale metadata aggregators like DPLA. Various statistical models are used to predict whether a particular DPLA item is used based only on its metadata. The article concludes with a discussion of the broad potential for machine learning and data science in libraries, academic institutions, and cultural heritage.
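As a hedged sketch of the closing idea (predicting use from metadata alone), here is what such a model might look like in R with simulated features; the study's actual features and models are in the article:

    # Simulated stand-in for DPLA item metadata; the features below are
    # hypothetical, chosen only to illustrate the modeling approach.
    set.seed(42)
    items <- data.frame(
      n_subjects = rpois(500, 3),         # number of subject headings
      has_date   = rbinom(500, 1, 0.8),   # is a date field present?
      title_len  = rpois(500, 8),         # words in the title
      used       = rbinom(500, 1, 0.3)    # did the item get pageviews?
    )

    # Logistic regression: predict use from metadata characteristics.
    model <- glm(used ~ n_subjects + has_date + title_len,
                 data = items, family = binomial)
    summary(model)
    head(predict(model, type = "response"))   # predicted probabilities of use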

David Rosenthal: More on Terms of Service

Tue, 2016-07-19 15:00
When Jefferson Bailey & I finished writing My Web Browser's Terms of Service I thought I was done with the topic, but two recent articles brought it back into focus. Below the fold are links, extracts and comments.

In Ticking all the boxes, The Economist writes about an interesting legal theory underpinning a rash of cases in New Jersey:
The suits seek to exploit the Truth-in-Consumer Contract, Warranty and Notice Act, enacted in New Jersey 35 years ago. This was intended to prevent companies that do business in the state from using contracts, notices or signs to limit consumer rights protected by law.

These suits:

generally include allegations that online terms violate consumers’ rights to seek damages as protected by New Jersey law and fail to explain which provisions cover New Jersey. ... plaintiffs need not show injury or loss in order to sue but merely prove violation of the TCCWNA.

The risks to companies are significant:

the TCCWNA entitles each successful plaintiff to at least $100 in damages, plus fees to lawyers and so on. If a website has millions of visitors, the costs to a company could be staggering.

But they are balanced by longer-term risks to consumers:

A growing number of firms, emboldened by favourable Supreme Court rulings, have adopted clauses that limit class-action suits. Consumers are instead restricted to resolving disputes individually, in arbitration. The TCCWNA cases may inspire more firms to add such caveats. That might limit frivolous suits. But consumers with grave complaints would be unable to sue, either. In the end lawsuits over restrictive contracts may make them more restrictive still.

An example of this trend is Pokemon Go:

to play Pokemon Go, you have to accede to a binding arbitration clause, surrendering your right to sue and promising only to seek redress for any harms that the company visits upon you in a system of secretive, one-sided shadow courts paid for by corporations where class actions are not permitted and the house always wins. ... Pokemon joins a small but growing movement of online services that strip their customers of their legal rights as a condition of sale, including Google Fiber and Airbnb

It could be worse; in this case you can send an email:

within 30 days of creating your account, and include in the body "a clear declaration that you are opting out of the arbitration clause in the Pokémon Go terms of service."

In The Biggest Lie on the Internet: Ignoring the Privacy Policies and Terms of Service Policies of Social Networking Services, Jonathan Obar and Anne Oeldorf-Hirsch report on:

an empirical investigation of privacy policy (PP) and terms of service (TOS) policy reading behavior. An experimental survey (N=543) assessed the extent to which individuals ignore PP and TOS when joining a fictitious social networking site, NameDrop. Results reveal 74% skipped PP, selecting ‘quick join.’ For readers, average PP reading time was 73 seconds, and average TOS reading time was 51 seconds. Based on average adult reading speed (250-280 words per minute), PP should have taken 30 minutes to read, TOS 16 minutes.

Among the clauses that almost all experimental subjects missed were ones requiring:

data sharing with the NSA and employers, and .. providing a first-born child as payment

The thing is, consumers are probably being rational in ignoring the mandatory arbitration and the terms of service. Even with a class action, the terms of service are so stacked against the consumer that a win is highly unlikely, and if one happens, the most the consumer can expect is a mountain of paperwork asking for proofs that they almost certainly don't possess, in order to stake a claim to the crumbs left over after the class action lawyers get paid.

A social network that started suing its members for the kinds of things everyone does would be out of business very quickly, so the details of the terms are pretty irrelevant compared to the social norms of the network. The privacy terms are perhaps more important, but if you care about privacy the last thing you should be doing is using a social network.

Eric Lease Morgan: How not to work during a sabbatical

Tue, 2016-07-19 14:43

This presentation — given at Code4Lib Midwest (Chicago, July 14, 2016) — outlines the various software systems I wrote during my recent tenure as an adjunct faculty member at the University of Notre Dame. (This presentation is also available as a one-page PDF handout designed to be duplex printed and folded in half as if it were a booklet.)

  • How rare is rare? – In an effort to determine the “rarity” of items in the Catholic Portal, I programmatically searched WorldCat for specific items, counted the number of libraries in the United States holding each one, and recorded the list of holding libraries. Through the process I learned that most of the items in the Catholic Portal are “rare”, but I also learned that “rarity” can be defined as the triangulation of scarcity, demand, and value. Thus the “rare” things may not be rare at all.
  • Image processing – By exploiting the features and functions of an open source library called OpenCV, I started exploring ways to evaluate images in the same way I have been evaluating texts. By counting & tabulating the pixels in an image it is possible to create ratios of colors, do facial recognition, or analyze geometric composition. Through these processes it may be possible to supplement art history and criticism. For example, one might be able to ask things like, “Show me all of the paintings from Picasso’s Rose Period.” (A minimal pixel-tabulation sketch appears after this list.)
  • Library Of Congress Name Authorities – Given about 125,000 MARC authority records, I wrote an application that searched the Library Of Congress (LOC) Name Authority File, and updated the local authority records with LOC identifiers, thus making the local authority database more consistent. For items that needed disambiguation, I created a large set of simple button-based forms allowing librarians to choose the most correct name.
  • MARC record enrichment – Given about 500,000 MARC records describing ebooks, I wrote a program that found the richest OCLC record in WorldCat and then merged the found record with the local record. Ultimately the local records included more access points and thus proved to be more useful in a library catalog setting.
  • OAI-PMH processing – I finally got my brain around the process of harvesting & indexing OAI-PMH content into VUFind. Whoever wrote the original OAI-PMH applications for VUFind did a very good job, but there is a definite workflow to the process. Now that I understand the workflow it is relatively easy to ingest metadata from things like ContentDM, but issues with the way Dublin Core is implemented still make the process challenging. (See the harvesting sketch after this list.)
  • EEBO/TCP – Given the most beautiful TEI mark-up I’ve ever seen, I have systematically harvested the Early English Books Online (EEBO) content from the Text Creation Partnership (TCP) and done some broad & deep, but also generic, text analysis against subsets of the collection. Readers are able to search the collection for items of interest, save the full text to their own space for analysis, and have a number of rudimentary reports run against the result. This process allows the reader to see the corpus from a “distance”. Very similar work has been done against subsets of content from JSTOR as well as the HathiTrust.
  • VIAF Lookup – Given about 100,000 MARC authority records, I wrote a program to search VIAF for the most appropriate identifier and associate it with the given record. Through the process I learned two things: 1) how to exploit the VIAF API, and 2) how to exploit the Levenshtein algorithm. Using the latter I was able to make automated and “intelligent” choices when it came to name disambiguation. In the end, I was able to accurately associate more than 80% of the authority names with VIAF identifiers. (A name-matching sketch appears after this list.)
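Here is the pixel-tabulation sketch promised above. It uses the jpeg package rather than OpenCV, and the file name is a placeholder; the point is simply that counting pixels per channel yields colour ratios:

    # Minimal colour-ratio sketch in R (jpeg package, not OpenCV;
    # "painting.jpg" is a hypothetical file).
    library(jpeg)

    img <- readJPEG("painting.jpg")   # height x width x 3 array, values in [0,1]

    # Tabulate the pixels: mean intensity per colour channel.
    channel_means <- apply(img, 3, mean)
    names(channel_means) <- c("red", "green", "blue")

    # Ratios of colours; a Rose Period canvas should skew toward red.
    round(channel_means / sum(channel_means), 3)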
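And the harvesting sketch: the OAI-PMH workflow boils down to requesting ListRecords and following resumptionToken until it runs out. The endpoint below is a placeholder, and a real harvest (say, from ContentDM into VUFind) adds metadata mapping on top:

    # Bare-bones OAI-PMH harvest loop in R (httr + xml2); the base URL
    # is hypothetical.
    library(httr)
    library(xml2)

    base_url <- "http://example.org/oai"
    records  <- list()
    token    <- NULL

    repeat {
      query <- if (is.null(token)) {
        list(verb = "ListRecords", metadataPrefix = "oai_dc")
      } else {
        list(verb = "ListRecords", resumptionToken = token)
      }
      doc <- read_xml(content(GET(base_url, query = query), as = "text"))

      # local-name() sidesteps OAI-PMH namespace handling.
      records <- c(records, as.list(xml_find_all(doc, "//*[local-name()='record']")))

      token <- xml_text(xml_find_first(doc, "//*[local-name()='resumptionToken']"))
      if (is.na(token) || token == "") break   # no token means the harvest is done
    }
    length(records)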
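Finally, the name-matching sketch. Base R’s adist() computes Levenshtein distances, which is enough to pick the closest VIAF candidate and to reject matches that are too far off; the names and threshold below are invented:

    # Levenshtein-based name disambiguation in base R (invented names;
    # the 25% threshold is an arbitrary illustration).
    local_name <- "Twain, Mark, 1835-1910"
    candidates <- c("Twain, Mark, 1835-1910",
                    "Twain, Shania",
                    "Clemens, Samuel Langhorne")

    d    <- adist(local_name, candidates)   # edit distances
    best <- which.min(d)

    # Accept only if the edit distance is small relative to the name length.
    if (min(d) / nchar(local_name) < 0.25) candidates[best] else NA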

My tenure as an adjunct faculty member was very much akin to a one-year college education, but for a fifty-five-year-old. I did many of the things college students do: go to class, attend sporting events, go on road trips, make friends, go to parties, go home for the holidays, write papers, give oral presentations, eat too much, drink too much, etc. Besides the software systems outlined above, I gave four or five professional presentations, attended & helped coordinate five or six professional meetings, taught an online, semester-long, graduate-level class on the topic of XML, took many different classes (painting, sketching, dance, & language) many times, lived many months in Chicago, Philadelphia, and Rome, visited more than two dozen European cities, painted about fifty paintings, bound & filled about two dozen hand-made books, and took about three thousand photographs. The only thing I didn’t do is take tests.
