Feed aggregator

Code4Lib Journal: Using Google Tag Manager and Google Analytics to track DSpace metadata fields as custom dimensions

planet code4lib - Wed, 2015-01-21 17:35
DSpace can be problematic for those interested in tracking download and pageview statistics granularly. Some libraries have implemented code to track events on websites and some have experimented with using Google Tag Manager to automate event tagging in DSpace. While these approaches make it possible to track download statistics, granular details such as authors, content types, titles, advisors, and other fields for which metadata exist are generally not tracked in DSpace or Google Analytics without coding. Moreover, it can be time consuming to track and assess pageview data and relate that data back to particular metadata fields. This article will detail the learning process of incorporating custom dimensions for tracking these detailed fields including trial and error attempts to use the data import function manually in Google Analytics, to automate the data import using Google APIs, and finally to automate the collection of dimension data in Google Tag Manager by mimicking SEO practices for capturing meta tags. This specific case study refers to using Google Tag Manager and Google Analytics with DSpace; however, this method may also be applied to other types of websites or systems.

Code4Lib Journal: Using SemanticScuttle for managing lists of recommended resources on a library website

planet code4lib - Wed, 2015-01-21 17:35
Concordia University Libraries has adopted SemanticScuttle, an open source and locally-hosted PHP/MySQL application for social bookmarking, as an alternative to Delicious for managing lists of recommended resources on the library’s website. Two implementations for displaying feed content from SemanticScuttle were developed: (1) using the Google Feed API and (2) using direct SQL access to SemanticScuttle’s database.

Code4Lib Journal: Training the Next Generation of Open Source Developers: A Case Study of OSU Libraries & Press’ Technology Training Program

planet code4lib - Wed, 2015-01-21 17:35
The Emerging Technologies & Services department at Oregon State University Libraries & Press has implemented a training program for our technology student employees on how and why they should engage in Open Source community development. This article will outline what they've done to implement this program, discuss the benefits they've seen as a result of these changes, and will talk about what they viewed as necessary to build and promote a culture of engagement in open communities.

Code4Lib Journal: Communication Between Devices in the Viola Document Delivery System

planet code4lib - Wed, 2015-01-21 17:35
Viola is a newly developed document delivery system that handles incoming and outgoing requests for printed books, articles, sharing electronic resources, and other document delivery services on the local level in a library organisation. An important part of Viola is the stack fetching Android application that enables librarians to collect books in the open and closed stacks in an efficient manner using a smartphone and a Bluetooth connected portable printer. The aim of this article is to show how information is transferred between systems and devices in Viola. The article presents code examples from Viola that use current .NET technologies. The examples span from the creation of high-level REST-based JSON APIs to byte array communication with a Bluetooth connected printer and the reading of RFID tags. Please note that code examples in this article are for illustration purposes only. Null checking and other exception handling has been removed for clarity. Code that is separated in Viola for testability and other reasons has been brought together to make it more readable.

Code4Lib Journal: Query Translation in Europeana

planet code4lib - Wed, 2015-01-21 17:35
Europeana – a database containing European digital cultural heritage objects – recently introduced query translation in order to aid users in searching the collections regardless of language. The user enters query terms, and the portal searches for those terms in multiple languages. This article discusses the technical details of query translation with the aim of assisting similar projects to implement similar features.

DPLA: Libraries and Copyright: Big Wins in 2014 and Big Challenges Ahead for 2015

planet code4lib - Wed, 2015-01-21 16:41

In terms of copyright, 2014 was a big year for libraries. The highlights were the release of decisions in two major copyright cases on appeal, largely in favor of library uses and affirming the applicability of fair use to certain aspects of digitization. Other developments, such as the release of a new code of best practices in fair use of collections containing orphan works, have created opportunities for libraries and other memory institutions to make further progress on addressing copyright obstacles to digital access to their collections.

Before the first open committee call of the year for the DPLA Legal Committee (which is later today at 2:00 PM Eastern), now is a good time for a short recap of what we’ve seen over the last year and what we can expect in 2015.

The first and maybe the most important development of 2014 comes in Authors Guild v. HathiTrust, a major copyright case before the Second Circuit Court of Appeals that was decided in June in favor of the HathiTrust Digital Library (a DPLA content hub with over 13 million digitized volumes). The suit was filed by the Authors Guild in 2011, largely in response to HathiTrust’s efforts to make its collections of orphan works more accessible. In its complaint, the Authors Guild objected to HathiTrust’s digitization project for that and several other reasons.

As it turned out, orphan works were not much of an issue in that case. The courts concluded that those claims were not ripe for adjudication because HathiTrust stopped its orphan works program and has no plans to continue it. Instead, most of the lawsuit focused on other uses of the HathiTrust collection, such as creating an indexed search of the contents of digitized books (and related research uses), full-text access for the blind and print-disabled, and preservation in digital formats.

HathiTrust initially prevailed in the suit before the district court for the Southern District of New York, with that court finding that all contested uses qualified as “fair use” under the Copyright Act. In June 2014 HathiTrust won an even bigger victory when the Second Circuit Court of Appeals largely affirmed the ruling of the lower court.

While the HathiTrust case addressed only a subset of the copyright issues raised by library mass digitization, it still represents a major positive development for digital libraries like those that contribute to DPLA to enhance access to their collections. The case makes clear that library digitization for purposes of enhanced search and for full-text use by the blind are acceptable under fair use. While this short summary can’t do justice to the importance of the case, the Association of Research Libraries has done a great job explaining it. One of the best resources is a seven-page document prepared by Jonathan Band (ARL counsel) titled “What Does the HathiTrust Decision Mean for Libraries?”

The second major decision of 2014 came in Cambridge University Press v. Becker, which was decided by the Eleventh Circuit Court of Appeals. That case began in 2008 when Cambridge University Press, Oxford University Press and Sage Publications sued Georgia State University over faculty use of book excerpts scanned for use in electronic course reserves. After a lengthy trial, the district court in that case issued a painstaking, 300+ page opinion detailing why, in the vast majority of instances, Georgia State’s e-reserves practices fell within the bounds of copyright’s fair use doctrine. While not everything in the district court decision represented a positive development for libraries, it was still an important victory, especially on more generally applicable issues: the weight and importance of the nonprofit, educational use of the work in the fair use analysis, and the weight that the court placed on whether a digital licensed copy was made available by the publishers (if no license was offered, the court generally found that this favored Georgia State’s fair use assertion).

The publishers in that case appealed to the Eleventh Circuit Court of Appeals, and in November that court issued its decision. Formally, the Eleventh Circuit reversed the district court. But in the Eleventh Circuit’s reasoning for the reversal, it was clear that the vast majority of the principles contained within the district court’s analysis–for example, the importance of the nonprofit, educational purpose of the use–were preserved. Like HathiTrust, this decision has a lot to unpack that would be impossible to review here. The best summary and analysis I have seen comes from University of Minnesota Copyright Librarian Nancy Sims. You should know that this case is still active; the Eleventh Circuit recently rejected a petition to rehear the case, but it remains possible for either side to petition the U.S. Supreme Court to review the case.

Beyond litigation, one major development worth noting is the release of a new Statement of Best Practices in Fair Use of Collections Containing Orphan Works for Libraries, Archives, and Other Memory Institutions (disclaimer: I was on the team that helped draft this document). That document, endorsed by DPLA and several of its hubs, along with many other leading national libraries, archives, and other memory institutions, takes aim at helping libraries and archives address the longstanding issue of what to do with orphan works–i.e., works for which copyright owners are difficult or impossible to locate–especially when they are embedded in larger collections that libraries seek to digitize.

The Best Practices was released in December 2014 and discussed in an ALA-sponsored webinar. In February during Fair Use Week (February 23-27), the team that helped draft these best practices from American University and UC Berkeley will be hosting an event in Washington, D.C. (also webcast live) explaining the document, discussing situations in which it might be most useful, and fielding questions via a panel of experts about its application on the ground. More details to come on that.

While 2014 was a big year for library copyright issues in the courts, it also contained the beginnings of several important discussions about legislative efforts to revise copyright law. Over the course of the year the House Subcommittee on Courts, Intellectual Property & the Internet held a series of hearings on the Copyright Act with an eye toward possible revision. Among other topics, the subcommittee addressed the scope of fair use, and preservation and reuse of copyrighted works. In addition, the U.S. Copyright Office and the U.S. Patent & Trademark Office both held roundtable meetings addressing possible areas of legislative revision. More broadly, a major conference hosted by UC Berkeley’s Center for Law and Technology, titled “The Next Great Copyright Act,” addressed an even wider range of potential Copyright Act revisions.

If I had to pick the biggest challenge for the upcoming year for libraries in this area, it would be continuing library engagement with legislative and administrative efforts to propose changes to the text of the Copyright Act itself. Similar hearings and other studies are likely to continue throughout 2015. Add to that the efforts ramping up to create a library and archive copyright exceptions treaty through the World Intellectual Property Organization, and it will be a busy and difficult year in which librarians must make concerted efforts to have their voices heard on how legislation should be crafted to ensure better online access to library collections. My hope is that the DPLA, along with other organizations such as ALA and ARL, can continue to help keep us informed about issues like this on which librarians should speak up and present a positive agenda for reform.

FOSS4Lib Upcoming Events: DC Area Fedora User Group Meeting

planet code4lib - Wed, 2015-01-21 16:40
Date: Tuesday, March 31, 2015 - 08:00 to Wednesday, April 1, 2015 - 17:00
Supports: Fedora Repository, Islandora, Hydra

Last updated January 21, 2015. Created by Peter Murray on January 21, 2015.

The next DC Area Fedora User Group Meeting will be held on Tuesday, March 31 and Wednesday, April 1 at the USDA National Agriculture Library. Please register in advance (registration is free) by completing this brief form:

David Rosenthal: New Yorker on Web Archiving

planet code4lib - Wed, 2015-01-21 16:00
Do not hesitate, do not pass Go, right now please read Jill Lepore's really excellent New Yorker article Cobweb: can the Web be archived?

FOSS4Lib Recent Releases: DSpace - 5.0

planet code4lib - Wed, 2015-01-21 15:54
Package: DSpace
Release Date: Tuesday, January 20, 2015

Last updated January 21, 2015. Created by Peter Murray on January 21, 2015.

From the release announcement:

With a new, modern look and feel for every device, the ability to auto-upgrade from older versions of DSpace, to batch import content and more, the release of DSpace 5 offers its far-flung global community of developers and stakeholders an even easier-to-use and more efficient institutional repository solution.

LITA: Amazon Echo

planet code4lib - Wed, 2015-01-21 13:00

Have you read about Amazon Echo? It is a new consumer product from Amazon: users can ask it questions and receive answers, tell it to play music, ask it to add items to a shopping or to-do list, and so on.

I first saw a video about it in October and quickly signed up to receive an invitation to purchase the product. I received my invitation this month and Echo should arrive at my house in May.

I’m pretty excited about it for a few reasons. First, Amazon is letting people develop for it. I’m already brainstorming ways the product can be used in both my home and office.

Second, I can’t wait to be able to talk to a device without having to push a button. The reviews for the voice recognition aren’t perfect, but they seem really good for a first launch.

Finally, I’m also really interested in it as an information retrieval tool. I don’t claim to be able to predict the future, but I think devices like Echo will be a new way that people access information. It seems like a logical next step.

This only emphasizes how important it is for people to understand their information needs, to understand the biases associated with information retrieval tools (to answer questions, Echo consults Wikipedia and a few other databases, and will conduct a Bing search), and to recognize the amazing role that algorithms are going to play in the future. Algorithms already play a big role in how people retrieve information. With tools that simply tell people the answers to their questions, users won’t even see other options. They will only be told one answer.

Image Courtesy of Flickr user jm_escalante CC BY-NC-SA 2.0

I’d love to chat about Echo. Do you have ideas for how to use it?

District Dispatch: What does the new Congress mean for libraries?

planet code4lib - Wed, 2015-01-21 06:59

A panel of experts from the ranks of politics, academia and the press will explore the implications of the November mid-term Congressional elections for America, libraries and library advocacy at the 2015 American Library Association (ALA) Midwinter Meeting in Chicago. ALA invited U.S. Senator and Democratic Majority Whip Richard Durbin to keynote the conference session.

The session, titled “Whither Washington?: The 2014 Election and What it Means for Libraries,” takes place from 8:30–10:00 a.m. on Saturday, January 31, 2015, in the McCormick Convention Center, room W183A. With critical bills to reauthorize federal library funding, efforts to reform key privacy and surveillance statutes, and changes to copyright law all likely to be on legislators’ plates, libraries will engage heavily with the newly elected 114th Congress.

Speakers include J. Mark Hansen, professor in the Department of Political Science at the University of Chicago, and Thomas Susman, director of government affairs for the American Bar Association.

View other ALA Washington Office Midwinter Meeting conference sessions


John Miedema: The author slip selects and boosts words for questioning unread content

planet code4lib - Wed, 2015-01-21 02:58

Lila uses author slips to “question” a collection of unread articles and books, suggesting “answers” or responses that extend the author’s material. The term, question, is appropriate because Lila uses natural language processing to enhance search. The application of natural language is shown here in three ways.

1. Distill the focus of the author slip

Perhaps the most important step is to decide which words are the most meaningful for questioning unread content. The design of the slip provides the necessary structures for making this decision, as shown in the figure:

An algorithm could use these design features to group keywords and calculate relative weights for use in searching, as shown in the table:

Figure # | Field | Calculation | Word (weight)
4 | Content | Weight of 1 for each uncommon word. Increase by 1 for each occurrence, so static and dynamic add up to 3. | static (3), dynamic (3), quality (1), scientific (1), knowledge (1), cave (1), political (1), institutions (1), centuries (1), king (1), constitution (1), destroying (1), government (1)
3 | Subject Line | Weight of 2, twice that of Content. Stop words removed. | pencil (2), mightier (2), pen (2)
2 | Tags | Weight of 4, twice that of Subject Line. | staticVsDynamic (4)
1 | Categories | Weight of 8, twice that of Tags. “Quality” appears in both Content and Categories; the weight for this word could be their sum, 9. | quality (8+1=9)

The words can be used as keywords in a natural language query. The weights would be included as boost factors, ranking search results higher if they contain those words.
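As a rough illustration of the weighting scheme in the table, here is a minimal Python sketch. It is not Lila's code: the function names, the `slip` dictionary layout, and the choice of Lucene-style `term^boost` syntax for the output are all assumptions made for the example, and the Content list is trimmed to three words for brevity. It also assumes stop words and common words have already been filtered out and that words are already normalized to a single case.

```python
from collections import Counter

# Per-field weights from the table: each field counts twice as much as the one before.
FIELD_WEIGHTS = {"content": 1, "subject": 2, "tags": 4, "categories": 8}


def keyword_weights(slip):
    """Sum field weights for every keyword occurrence in a slip.

    `slip` (hypothetical layout) maps a field name to a list of keywords
    that have already been filtered and normalized.
    """
    weights = Counter()
    for field, words in slip.items():
        for word in words:
            weights[word] += FIELD_WEIGHTS[field]
    return weights


def boosted_query(weights):
    """Render the keywords as a boosted query string, heaviest terms first."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return " ".join(f"{word}^{weight}" for word, weight in ranked)


# A cut-down version of the example slip from the table.
slip = {
    "content": ["static", "dynamic", "static", "dynamic", "static", "dynamic", "quality"],
    "subject": ["pencil", "mightier", "pen"],
    "tags": ["staticVsDynamic"],
    "categories": ["quality"],
}

weights = keyword_weights(slip)
query = boosted_query(weights)
print(query)
```

Because "quality" appears once in Content and once in Categories, its weights are summed (1 + 8), matching the last row of the table.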

2. Apply other natural language analysis, such as word frequency

In the above table, not all words were selected. In the Subject Line, stop words (e.g., “the”) are removed. This is a common practice in query construction, since stop words are too common to add value. Similarly, in Content, only uncommon words are kept. In this case, word frequency could be calculated using a scientific measure. Words falling below a threshold could be skipped. Word frequency and other linguistic features, such as repetition and word concreteness, will be discussed in detail later on. These steps use knowledge of language to improve search relevance.

3. Take advantage of natural language index configurations

Unread content will be crawled and organized in a natural language index, such as Apache Solr’s Lucene index. An index of this sort can be configured to apply other natural language processing, e.g., synonym matching between queries and documents.

John Miedema: Eliza, Turing, and Whatson vs Lila. Enlist the cooperation of the human rather than design around a fight.

planet code4lib - Sun, 2015-01-18 16:09

Remember Eliza, the psychotherapist program? Eliza is a computer program written by Joseph Weizenbaum in the mid-sixties and circulated widely in the early days of personal computing. Eliza is modeled on non-directional Rogerian therapy, programmed with a few prompts and a simulation of human understanding by feeding back content from the user. It is an early example of natural language processing. Eliza appears smart as long as the user plays along, but it is not hard to confuse the program. And it has bad grammar. People delight in teasing Eliza.

I am looking forward to seeing the new movie about Alan Turing, The Imitation Game, with Benedict Cumberbatch. Turing is regarded as the father of the computer, and he introduced the Turing Test, a natural language test of machine intelligence. In short, a human asks questions to determine if the hidden respondent is a machine or not. The human is trying to mess with the machine, focused on tripping it up.

‘Whatson’ was my first run at designing a cognitive system. It was designed to be a Question-Answer system for literature. I sensed that a big challenge would be the same one faced by Eliza or any program put to the Turing Test. People like Question-Answer systems because they make life easy. Ask a question, get an answer. The system does the work for them. Do a little more than everyone expects and everyone expects a little more. The expectations increase. The questions get trickier. Even if the questioner was trying to help, the clues for finding the answer would often be missing. I would have to design a dialog mechanism for collecting more information. But often the questioner would be deliberately trying to test the intelligence and limits of the system. It’s what we humans do: push systems with the Turing Test. I needed a way to enlist the cooperation of the human user, so that I would not design around a fight.

In 2012 I patented a search technology, “Silent Tagging” (US 8,250,066). The technology solves a problem with social tagging. In the heyday of Web 2.0 people actively tagged content on the open web, as an aid to findability. It works on the open web, but in smaller closed contexts like company intranets, workers are much less likely to tag content. In a small population there are fewer adopters of emerging technology and workers are focused on immediate tasks. Was there a way to benefit from tagging without interrupting an employee’s workflow? I introduced the idea of Silent Tagging. The method associates two things in an employee’s normal workflow: keyword searches and clicks on search results. Keywords are like tags, intelligently selected by a searcher for findability. Clicks on search results follow a small cognitive act, deciding that one search result is better than another. The keyword-click association can be silently captured to adjust rankings of content and benefit other users. The key point here is that human cooperation can be implicitly enlisted in the design.
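To make the keyword-click association concrete, here is a small Python sketch. It is only an illustration of the general idea described above, not the patented method: the `SilentTagIndex` class, its method names, and the simple click-count scoring are all hypothetical.

```python
from collections import defaultdict


class SilentTagIndex:
    """Hypothetical store of keyword -> document click counts."""

    def __init__(self):
        # keyword -> {doc_id -> number of clicks following a search with that keyword}
        self._clicks = defaultdict(lambda: defaultdict(int))

    def record_click(self, query, doc_id):
        """Silently associate each search keyword with the clicked result."""
        for keyword in query.lower().split():
            self._clicks[keyword][doc_id] += 1

    def rerank(self, query, results):
        """Move results that earlier searchers clicked for these keywords to the top.

        The sort is stable, so results with equal scores keep their original order.
        """
        keywords = query.lower().split()

        def score(doc_id):
            return sum(self._clicks[k].get(doc_id, 0) for k in keywords)

        return sorted(results, key=score, reverse=True)


index = SilentTagIndex()
index.record_click("vacation policy", "doc-b")
index.record_click("vacation policy", "doc-b")
index.record_click("vacation", "doc-c")

reranked = index.rerank("vacation policy", ["doc-a", "doc-b", "doc-c"])
print(reranked)
```

The searcher never does anything beyond searching and clicking; the association is captured as a side effect of the normal workflow.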

In January of this year I switched gears from Whatson to Lila. Lila is also a cognitive system but its design implicitly enlists human cooperation in natural language processing tasks. Lila is a cognitive writing system, designed to extend human reading, thinking and writing capabilities. The human user is involved in a writing project. In Lila, the author creates content in short units of text called slips. As I have described lately, in Lila, author slips are questions asked of unread content, just like questions in a Question-Answer system. The difference in Lila is that the author’s intent and work is implicitly intelligent, generating slips or questions with high signal and little noise. In one way, Lila is like Eliza, in that both depend on the intelligence of the user. The difference is that in Lila the purpose of the user and the system are implicitly (silently) aligned. No design work is required to convince or negotiate with a user.

Patrick Hochstenbach: Homework assignment #2 Sketchbookskool

planet code4lib - Sun, 2015-01-18 14:28
Filed under: Doodles Tagged: brugge, doodles, urbansketching

William Denton: Setting up Sonic Pi on Ubuntu, with Emacs

planet code4lib - Sun, 2015-01-18 00:39

It’s no trouble to get Sonic Pi going on a Raspberry Pi (Raspbian comes with it), and as I wrote about last week I had great fun with that. But my Raspberry Pi is slow, so it would often choke, and the interface is meant for kids so they can learn to make music and program, not for middle-aged librarians who love Emacs, so I wanted to get it running on my Ubuntu laptop. Here’s how I did it.

I wanted to get away from this and into Emacs.

There’s nothing really new here, but it might save someone some time, because it involved getting JACK working, which is one of those things where you begin carefully documenting everything you do and an hour later you have thirty browser tabs open, three of them to the same mailing list archive showing a message from 2005, and you’ve edited some core system files but you’re sure you’ve forgotten one and don’t have a backup, and then everything works and you don’t want to touch it in case it breaks.

Linux and Unix users should go to the GitHub Sonic Pi repo and follow the generic Linux installation instructions, which is what I did. I run Ubuntu; I had some of the requirements already installed, but not all. Then:

    cd /usr/local/src/
    git clone
    cd sonic-pi/app/server/bin
    ./compile-extensions
    cd ../../gui/qt
    ./rp-build-app
    ./rp-app-bin

Interesting lichens.

The application compiled without any trouble, but it didn’t run because jackd wasn’t running. I had to get JACK going. The JACK FAQ helped.

sudo apt-get install qjackctl

qjackctl is a little GUI front end to control JACK. I installed it and ran it and got an error:

    JACK is running in realtime mode, but you are not allowed to use realtime scheduling.
    Please check your /etc/security/limits.conf for the following line and correct/add it if necessary:
      @audio - rtprio 99
    After applying these changes, please re-login in order for them to take effect.
    You don't appear to have a sane system configuration. It is very likely that you encounter xruns. Please apply all the above mentioned changes and start jack again!

Editing that file isn’t the right way to do it, though. This is:

    sudo apt-get install jackd2
    sudo dpkg-reconfigure -p high jackd2

This made /etc/security/limits.d/audio.conf look so:

    # Provided by the jackd package.
    #
    # Changes to this file will be preserved.
    #
    # If you want to enable/disable realtime permissions, run
    #
    #    dpkg-reconfigure -p high jackd
    @audio - rtprio 95
    @audio - memlock unlimited
    #@audio - nice -19

Then qjackctl gave me this error:

    JACK is running in realtime mode, but you are not allowed to use realtime scheduling.
    Your system has an audio group, but you are not a member of it.
    Please add yourself to the audio group by executing (as root):
      usermod -a -G audio (null)
    After applying these changes, please re-login in order for them to take effect.

Replace “(null)” with your username. I ran:

    usermod -a -G audio wtd

Logged out and back in and ran qjackctl again and got:

    JACK compiled with System V SHM support.
    cannot lock down memory for jackd (Cannot allocate memory)
    loading driver ..
    apparent rate = 44100
    creating alsa driver ... hw:0|hw:0|1024|2|44100|0|0|nomon|swmeter|-|32bit
    ALSA: Cannot open PCM device alsa_pcm for playback. Falling back to capture-only mode
    cannot load driver module alsa

Here I searched online, looked at all kinds of questions and answers, made a cup of tea, tried again, gave up, tried again, then installed something that may not be necessary, but it was part of what I did so I’ll include it:

    sudo apt-get install pulseaudio-module-jack

My tuner, with knobs and buttons that are easy to frob.

Then, thanks to some helpful answer somewhere, I got onto the real problem, which is about where the audio was going. I grew up in a world where home audio signals (not including the wireless) were transmitted on audio cables with RCA jacks. (Pondering all the cables I’ve used in my life, I think the RCA jack is the closest to perfection. It’s easy to identify, it has a pleasing symmetry and design, and there is no way to plug it in wrong.) Your cassette deck and turntable would each have one coming out and you’d plug them into your tuner and then everything just worked, because when you needed you’d turn a knob that meant “get the audio from here.” I have only the haziest idea of how audio on Linux really works, but at heart there seems to be something similar going on, because what made it work was telling JACK which audio thingie I wanted.

I had to change the interface setting

You can pull up that window by clicking on Settings in qjackctl. The Interface line said “Default,” but I changed it to “hw:PCH (HDA Intel PCH (hw: 1)”, whatever that means) and it worked. What’s in the screenshot is different, and it works too. I don’t know why. Don’t ask me. Just fiddle those options and maybe it will work for you too.

I hit Start and got JACK going, then back in the Sonic Pi source tree I ran ./rp-app-bin and it worked! Sound came out of my speakers! I plugged in my headphones and they worked. Huzzah!


That was all well and good, but nothing is fully working until it can be run from Emacs. A thousand thanks go to sonic-pi.el!

I used the package manager (M-x list-packages) to install sonic-pi; I didn’t need to install dash and osc because I already had them for some reason. Then I added this to init.el:

    ;; Sonic Pi
    (require 'sonic-pi)
    (add-hook 'sonic-pi-mode-hook
              (lambda ()
                ;; This setq can go here instead if you wish
                (setq sonic-pi-path "/usr/local/src/sonic-pi/")
                (define-key ruby-mode-map "\C-c\C-c" 'sonic-pi-send-buffer)))

That last line is a customization of my own: I wanted C-c C-c to do the right thing the way it does in Org mode: here, I want it to play the current buffer. A good key combination like that is good to reuse.

Then I could open up test.rb and try whatever I wanted. After a lot of fooling around I wrote this:

    define :throb do |note, seconds|
      use_synth :square
      with_fx :reverb, phase: 2 do
        s = play note, attack: seconds/2, release: seconds/2, note_slide: 0.25
        (seconds*4).times do
          control s, note: note
          sleep 0.25
          control s, note: note-2
          sleep 0.25
        end
      end
    end

    throb(40, 32)

To get it connected, I ran M-x sonic-pi-mode then M-x sonic-pi-connect (it was already running, otherwise M-x sonic-pi-jack-in would do; sometimes M-x sonic-pi-restart is needed), then I hit C-c C-c … and a low uneasy throbbing sound comes out.

Emacs with sonic-pi-mode running

Amazing. Everything feels better when you can do it in Emacs. Especially coding music.

John Miedema: Lila Slip Factory I: “Question” rather than “query” for natural language processing

planet code4lib - Sat, 2015-01-17 20:35

The Lila cognitive writing system extends your reading abilities by converting unread content into slips, units of text for later visualization and analysis. How does Lila convert content into slips? The Lila Slip Factory has two processes. The first process, represented here, involves converting slips written manually by the author into questions to be asked of the unread content. I use the word “question” rather than “query” because I am using natural language processing in addition to more structured query methods. I want to create the association between author slips and natural language questions, such as one might find in a Question-Answer system.

  1. The first process begins with a stack of slips generated manually by the author. Each slip is processed.
  2. Natural language processing is applied to convert the slip into tokens and analyze parts of speech.
  3. The keyword analysis is an algorithm that converts the outputs of step two into keywords. The selection of keywords will depend on their placement in the author slip, i.e., in the subject line, content, categories and tags. It will depend on other text analytics such as word frequency and word concreteness. Weighting factors may be applied to the keywords. This algorithm will be explained more later.
  4. Once the keywords have been selected, a structured question can be formed.
  5. Each question is added to a collection that will be used in the second process, to be represented in a following post.
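The five steps above can be sketched in a few lines of Python. This is only a toy outline under loud assumptions: the slip layout, the function names, and the tiny stop-word list are invented for the example; a regular expression stands in for real natural language processing (there is no part-of-speech analysis here); and the weighting described in step 3 is left out.

```python
import re

# A deliberately tiny stop-word list, just for this example.
STOP_WORDS = {"a", "an", "and", "in", "is", "of", "than", "the", "to"}


def tokenize(text):
    """Step 2 (simplified): split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def select_keywords(slip):
    """Step 3 (simplified): keep non-stop-words, subject line first, without duplicates."""
    keywords = []
    for field in ("subject", "content"):
        for token in tokenize(slip.get(field, "")):
            if token not in STOP_WORDS and token not in keywords:
                keywords.append(token)
    return keywords


def build_question(slip):
    """Step 4: form a structured question from the selected keywords."""
    return {"slip_id": slip["id"], "keywords": select_keywords(slip)}


def slip_factory(slips):
    """Steps 1 and 5: process each author slip and collect the resulting questions."""
    return [build_question(slip) for slip in slips]


slips = [
    {
        "id": 1,
        "subject": "The pencil is mightier than the pen",
        "content": "Static and dynamic quality",
    }
]

questions = slip_factory(slips)
print(questions)
```

The collection returned by `slip_factory` corresponds to the set of questions that the second process would later ask of the unread content.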

Harvard Library Innovation Lab: Link roundup January 17, 2015

planet code4lib - Sat, 2015-01-17 17:37

Legos. JavaScript. Photos. And request logs. Smart teams. What a range.

Why Some Teams Are Smarter Than Others

How to compose a smart team: 1. Equal talk time 2. Good at reading facial expressions 3. Not all dudes.

Issues to Readers

The living library. A video of a log of realtime book requests at the British Library.

Wonderful head shots of hand models

The faces attached to the hands that get photographed. I’d drop this in the Awesome Box.

How Lego Became The Apple Of Toys | Fast Company | Business + Innovation

Lego innovates with a walled garden Future Lab that relies on extensive user research


A ghost in the machine. A human ghost, typing to us. Or, maybe just a cool JavaScript library.

District Dispatch: Free webinar: Understanding credit reports and scores

planet code4lib - Fri, 2015-01-16 21:36

On January 29th, the Consumer Financial Protection Bureau and the Institute of Museum and Library Services will offer a free webinar on financial literacy. This session has limited space, so please register quickly.

Tune in to the Consumer Financial Protection Bureau’s monthly webinar series intended to instruct library staff on how to discuss financial education topics with their patrons. As part of the series, the Bureau invites experts from other government agencies and nonprofit organizations to speak about key topics of interest.

What’s the difference between your credit report and your credit score? How are scores used, and what makes them go up or down? Find out during this webinar when financial literacy experts discuss ways that past credit habits can affect your ability to get loans and lower interest rates in the future. If you would like to be notified of future webinars, or ask about in-person trainings for large groups of librarians, email; Subject: Library financial education training. All webinars will be recorded and archived for later viewing.

Webinar Details
January 29, 2015
2:30–3:30 p.m. EDT
To join the webinar, please click on the following link at the time of the webinar: Join the webinar

  • Conference number: PW1172198
  • Audience passcode: LIBRARY

If you are participating only by phone, please dial the following number:

  • Phone: 1-888-947-8930
  • Participant passcode: LIBRARY

The post Free webinar: Understanding credit reports and scores appeared first on District Dispatch.

M. Ryan Hess: Digital Author Services

planet code4lib - Fri, 2015-01-16 20:39

The producers of information at our academic institutions are brilliant at what they do, but they need help from experts in sharing their work online. Libraries are uniquely suited for the task.

There are three important areas where we can help our authors:

  1. Copyright and Author Rights Issues
  2. Developing Readership and Recognition
  3. Helping authors overcome technical hurdles to publishing online

Several libraries are now promoting copyright and author rights information services. These services provide resources (often LibGuides) to scholars who may be sold on the benefits of publishing online, but are unclear what their publishers allow. In fact, in my experience, this is one of the most common problems. Academics are busy people focused on their area of specialization, which rarely includes reading the legalese of their publisher agreements, let alone keeping a paper trail handy. This is particularly true for authors who began their careers before the digital revolution.

At any rate, providing online information followed up with face-to-face Q&A is an invaluable service for scholars.

Lucretia McCulley of the University of Richmond and Jonathan Bull of Valparaiso University have put together a very concise presentation on the matter, detailing how they’ve solved these issues at their institutions.

Another service, which I’m actually developing at my institution presently, is providing copyright clearance as a service for scholars. In our case, I hope to begin archiving all faculty works in our institutional repository. The problem has been that faculty are busy and relying on individual authors to find the time to do the due diligence of checking their agreements just ain’t gonna happen. In fact, this uncertainty about their rights as authors often stops them cold.

In the service model I’m developing, we would request faculty activity reports or query some other resource on faculty output and then run the checks ourselves (using student labor) on services like SHERPA/RoMEO. When items check out, we publish. When they don’t, we post the metadata and link to the appropriate online resource (likely in an online journal).
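The decision at the heart of that service model can be sketched in a few lines of Python. This is only an illustration: the policy table, the `Work` record, and the `needs_review` fallback for journals nobody has checked yet are all assumptions, and the policies are entered by hand here rather than fetched from any real policy service.

```python
from dataclasses import dataclass

# Hypothetical policy table keyed by journal title. In practice a staff member
# or student worker would fill this in after checking a policy service.
ARCHIVING_POLICY = {
    "Journal of Open Examples": True,   # self-archiving allowed
    "Closed Access Quarterly": False,   # self-archiving not allowed
}


@dataclass
class Work:
    title: str
    journal: str
    publisher_url: str


def triage(work, policies):
    """Decide how to deposit one work based on its journal's archiving policy."""
    allowed = policies.get(work.journal)
    if allowed is True:
        # The item checks out: publish the full text in the repository.
        return {"title": work.title, "action": "publish_full_text"}
    if allowed is False:
        # Post the metadata only and link to the publisher's copy.
        return {"title": work.title, "action": "metadata_only",
                "link": work.publisher_url}
    # Unknown journals are held for a manual check rather than guessed at.
    return {"title": work.title, "action": "needs_review"}
```

Running every item from a faculty activity report through `triage` gives three work queues: full-text deposits, metadata-only records, and items that still need someone to read the agreement.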

Developing Readership & Recognition

Another area where libraries can provide critical support is assisting authors in growing their reputations and readership. Skills commonly found in libraries, from search engine optimization (SEO) to cataloging, play a role in this service offering.

At my institution, we use Digital Commons for our repository, which we selected partly because it has powerful SEO built into it. I’ve seen this at work: a faculty member posts something to the repository and within weeks (or even days), that content is rising to the top of Google search results, beating out even Facebook and LinkedIn for searches on an author’s name.

And of course, while we don’t normally mark up the content with metadata for the authors, we do provide training on using the repository and understanding the implications for adding good keywords and disciplines (subject headings) which also help with SEO.

The final bit is the reporting. With Digital Commons, reports come out every month via email to the authors, letting them know what their top downloads were and how many they had. This is great, and I find the reports help spur word-of-mouth marketing of the repository and enthusiasm for it among authors. This is built into Digital Commons, but no matter what platform you use, I think this is a basic requirement that helps win authors’ hearts, drives growth and is a vital assessment tool.

Walking The Last Mile

MacKenzie Smith of MIT has described the Last Mile Problem (Bringing Research Data into the Library, 2009), which is essentially where technical difficulties, uncertainty about how to get started and basic time constraints keep authors from ever publishing online.

As I touched on above, I’m currently developing a program to help faculty walk the last mile, starting with gathering their CVs and then doing the copyright checks for them. The next step would be uploading the content, adding useful metadata and publishing it for them. A key step before all of this, of course, is setting up policies for how the collection will be structured. This is particularly true for non-textual objects like images, spreadsheets, data files, etc.

So, when we talk about walking the last mile with authors, there’s some significant preparatory work involved. Creating a place for authors to understand your digital publishing services is a good place to start. Some good examples of this include:

Once your policies are in place, you can provide a platform for accepting content. In our case (with Digital Commons), we get stellar customer service from Bepress, which includes training users how to use their tools. At institutions where such service is not available, two things will be critical:

  1. Provide a drop-dead easy way to deposit content, which includes simple but logical web forms that guide authors in giving you the metadata and properly-formatted files you require.
  2. Provide personal assistance. If you’re not providing services for adding content, you must have staffing for handling questions. Sorry, an FAQ page is not enough.

Bottom Line

Digital publishing is just such a huge area of potential growth. In fact, as more and more academic content is born digital, preserving it for the future in sustainable and systematic ways is more important than ever.

The Library can be the go-to place on your campus for making this happen. Our buildings are brimming with subject specialists and experts on archives, metadata and web technologies, making us uniquely qualified to help authors of research overcome the challenges they face in getting their stuff out there.

OCLC Dev Network: WMS Web Services Install January 18

planet code4lib - Fri, 2015-01-16 17:30

All Web services that require user level authentication will be unavailable during the installation window, which is between 2:00 – 8:00 AM Eastern USA, Sunday Jan 18th.

