
SearchHub: Lucidworks Fusion Now Available in AWS Marketplace

planet code4lib - Tue, 2015-06-02 22:22
While it’s been easy to download and install Fusion on your own machine, we’ve worked with Amazon Web Services to offer it pre-installed on Amazon Machine Images, so it’s even easier to trial or get started using Fusion. If you don’t have easy access to a machine with the recommended OS, disk space, or memory, you can launch an instance of Fusion in the AWS cloud and have it running on a suitably configured system with just one click. We are offering two editions of Fusion in the AWS Marketplace. Both editions run the same software, but are offered under different terms and usage scenarios:
  • The Lucidworks Fusion demo is available without any software fees. It will only run on selected sizes of AWS EC2 hardware. You can use this edition to experiment with features and functionality of Fusion. Support is not offered on this version.
  • The Lucidworks Fusion standard server is intended for production use and is offered at hourly or annual rates. It is available on a wider variety of more powerful hardware, allowing you to use Fusion at larger scales. Importantly, this edition also includes standard product support, providing assurance for your production deployment.
If you haven’t had the chance to try out Fusion, here’s one more option to get you up and running, and see how you can use it to create powerful search-based applications.

The post Lucidworks Fusion Now Available in AWS Marketplace appeared first on Lucidworks.

FOSS4Lib Recent Releases: Koha - 3.16.11

planet code4lib - Tue, 2015-06-02 21:53

Last updated June 2, 2015. Created by Peter Murray on June 2, 2015.

Package: Koha
Release Date: Sunday, May 31, 2015

Eric Lease Morgan: HathiTrust Workset Browser on GitHub

planet code4lib - Tue, 2015-06-02 21:17

I have put my (fledgling) HathiTrust Workset Browser on GitHub. Try:

The Browser is a tool for doing “distant reading” against HathiTrust “worksets”. Given a workset rsync file, it will cache the workset’s content locally, index it, create some reports against the content, and provide the means to search/browse the collection. It should run out of the box on Linux and Macintosh computers. It requires the bash shell and Python, which come for free on these operating systems. Some sample content is available at:

Developing code with and through GitHub is interesting. I’m learning.

District Dispatch: You did it! USA FREEDOM on its way to the President’s desk

planet code4lib - Tue, 2015-06-02 20:38

Moments ago, after three amendments that would have gutted it failed in a series of roll call votes, the USA FREEDOM Act passed the Senate by a vote of 67 to 32. Having previously passed the House by one of the largest bipartisan margins in recent history (338-88), the first serious reform of the nation’s surveillance law is now en route to the White House where the President is poised to sign it this evening.

Victories like this don’t just happen.  They take collaboration with equally committed organizations and activists, and they take thousands of librarians fighting for years to restore the civil liberties of Americans eroded by the PATRIOT Act and the excesses of the NSA and other agencies.  This first but important victory is yours.

Tomorrow, the fight for further badly needed reforms will go on. Tonight, ALA thanks its allies in Washington and, most of all, you – for your commitment, for your passion, and for answering the call when it came to contact Congress and demand that it right serious wrongs.

You passed the USA FREEDOM Act today because you answered the call. BRAVO!

(ALA’s official statement is online here.)

The post You did it! USA FREEDOM on its way to the President’s desk appeared first on District Dispatch.

Shelley Gullikson: My CAIS poster about library space: background

planet code4lib - Tue, 2015-06-02 20:01

I’m presenting a poster at the CAIS Conference on Wednesday, June 3 on some of the work I’ve been doing on student use of library space. I tried to limit the wordiness of the poster, so am including the background information here. I still have to check with the organizers about whether I can post a copy of the poster here after the conference, but if not I’ll give some high-level findings.

Our newly renovated library includes a great new space for undergraduate research (the Discovery Centre), which is administered outside of the library. Work is being done to evaluate the use of that space and I wanted to make sure that the rest of the space in the building was also being evaluated. Reading the literature about library space, I saw that many evaluations of space happen at a single point in time or, if they recur, a few times over a single day, or at the same time of day over a single week. I knew that the use of our building changed throughout the day and throughout the term. I wanted to look at how the various spaces in our building were being used, and see how that use changed over time.

I started with pretty basic seat sweeps, using a floor plan and tracking where people were sitting and if any group work was being done. Happily, this was quickly taken over by the library’s Stacks staff as part of their regular routine. They did sweeps of all five floors of the library morning, afternoon, and evening, Monday to Friday from the beginning of November until the end of April. I analyzed that data, looking particularly at any trends that emerged around time of day or over the course of the term.

(The sweeps involved stupidly labour-intensive data collection, data entry, and data analysis. There is definitely a better way to do this. Libraries at NCSU and GVSU have some great models, but I knew that taking the time to investigate and adapt these to my own library would push the data collection even later in the school year, so I chose more difficult data gathering sooner rather than more efficient data gathering later. I find it easy to postpone projects when I know I can’t do them “the right way” but in this case I decided to Just Fucking Do It.)

At the same time, I was part of another project to evaluate how students were using the space in the Discovery Centre and the rest of the library. Part of that project included a questionnaire, and the poster includes selected results from that – mostly around group work. I wanted to include results from photo elicitation in the poster but wasn’t able to get enough participation this spring. I hope to get that part done this fall.

District Dispatch: My First Book Expo America

planet code4lib - Tue, 2015-06-02 19:34

From Wikimedia Commons.

I went for work, of course. Our assignment was to meet with middle- to small-sized trade publishers to talk about ebooks and business models, make contacts, share expectations, and identify obstacles. We came away with a lot of good information and ideas, which are discussed in Carol Anthony’s e-content blog post.

But on a personal note, it was weird talking about ebooks when there were so many print books around! Sure, there are a lot of books in libraries, in bookstores, or at ALA conference exhibits, but not like in this extravaganza. Imagine well-crafted displays everywhere, piles of books, ribbons and banners, huge house-size posters of David Baldacci, publisher swag, and people, lots of people.

Who cares about ebooks! Long live print!

It was hard to concentrate on ebooks at times, because the exhibit hall was loud, and, did I mention, there were all of these books around! Pretty, colorful books; darling children’s books; gardening books; a bunch of recipe books that I wanted; and of course, cute photos of puppies and kittens on some of the book covers. Don’t get me wrong. I love books, but … I could not wait to get out of there.

On the last day, I arrived early to meet with a representative from Sourcebooks. We planned to meet at the Sourcebooks exhibit. Already, there were lines of people outside of the exhibit hall. I then learned that registrants have a limited opportunity to grab free books each morning when the exhibits open. There were several guards positioned, anticipating a frenzy. Everyone was pretty excited, and ready to run.

When the exhibits did open, the mass rush ensued. One contingent of women dashed past me, their lieutenant yelling “Go to HarperCollins!”  I tried to get out of the way but not before I was bumped firmly from behind on my right side. Due to the ridiculously heavy bag I had over my shoulder, the blow was magnified to such an extent that I twirled a full 360 degrees, but remained standing … and still in everybody’s way. I could hear breathing down my neck. “Go! Go! Move out of the way!” (Some people even cursed.)

Luckily, Sourcebooks was only halfway up the exhibit aisle so I was relatively unscathed, but some were not so lucky. Later in the day, I did see one woman down… and she was carrying a cane.

Oh the humanity!

Apparently, BookCon is even more intense, making Book Expo America look like a walk in the park. Print is alive, books are beautiful, and free books, they can be dangerous.

The post My First Book Expo America appeared first on District Dispatch.

David Rosenthal: Brittle systems

planet code4lib - Tue, 2015-06-02 15:00
In my recent rant on the Internet of Things, I linked to Mike O'Dell's excellent post to Dave Farber's IP list, Internet of Obnoxious Things, and suggested you read it. I'm repeating that advice as, below the fold, I start from a different part of Mike's post.

Mike writes:
The problem with pursuing such a goal is that it has led us down a path of "brittle failure" where things work right up until they fail, and then they fail catastrophically. The outcome is forced to be binary.

In most of Computer Science, there have been only relatively modest efforts directed at building systems which fail gracefully, or partially. Certainly some sub-specialties have spent a lot of effort on this notion, but it is not the norm in the education of a journeyman system builder.

If it is the case that we are unlikely to build any large system which is fail-proof, and that certainly seems to be the situation, we need to focus on building systems which can tolerate, isolate, and survive local failures.

My response also made the IP list:
Mike is absolutely right to point out the brittle nature of most current systems. But education isn't going to fix this. My co-authors and I won Best Paper at SOSP2003 for showing a system in a restricted application space that, under attack, failed slowly and made "alarming noises". The analogy is with suspension bridges - they use stranded cables for just this reason.

However, the cost differential between stranded and solid cables in a bridge is small. Brittle fault-tolerant systems such as Byzantine Fault Tolerance are a lot more expensive than a non-fault-tolerant system that (most of the time) does the same job. Systems such as the one we showed are a lot more expensive than BFT. This is because three essential aspects of a (and, I believe, any) solution are rate limits, excess replication and randomization.

The problem is that vendors of systems are allowed to disclaim liability for their products. Given that even the most egregious failure is unlikely to cause more than reputational harm, why would a vendor even implement BFT, let alone something much more expensive?

Just finding techniques that allow systems to fail gracefully is not going to be enough (not that it is happening). We need techniques that do so with insignificant added cost. That is a truly hard problem. But we also need to change the law so that vendors cannot escape financial liability for the failures of their products. That is an even harder problem.

I should explain the comment about the importance of "rate limits, excess replication and randomization":
  • Rate Limits: The design goal of almost all systems is to do what the user wants as fast as possible. This means that when the bad guy wrests control of the system from the user, the system will do what the bad guy wants as fast as possible. Doing what the bad guy wants as fast as possible pretty much defines brittleness in a system; failures will be complete and abrupt. In last year's talk at UC Berkeley's Swarm Lab I pointed out that rate limits were essential to LOCKSS, and linked to Paul Vixie's article Rate-Limiting State making the case for rate limits on DNS, NTP and other Internet services. Imposing rate limits on system components makes the overall system more expensive.
  • Excess Replication: The standard fault-tolerance technique, Byzantine Fault Tolerance (BFT), is brittle. As faults in the system increase, it works perfectly until they pass a threshold. After that the system is completely broken. The reason is that BFT defines the minimum number of replicas that can survive a given number of faults. In order to achieve this minimum, every replica is involved in every operation of the system. There is no cushion of excess, unnecessary replicas to help the system retain some functionality above the threshold at which it stops behaving perfectly. The LOCKSS system was not concerned with minimizing the number of replicas. It assumed that it had excess replicas, Lots Of Copies, so it could Keep Stuff Safe by failing gradually as faults increased. Adding replicas to the system makes it more expensive.
  • Randomization: In general, the more predictable the behavior of the system the easier it is to attack. Randomizing the system's behavior makes it unpredictable. A significant part of the LOCKSS system's defenses is that since the selection of replicas to take part in each operation is random, the bad guy cannot predict which they are. Adding randomization to the system makes it more expensive (and harder to debug and test).
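A rate limit of the kind described above is well understood in isolation. The sketch below is a generic token-bucket limiter, a rough illustration of the idea only, not the mechanism used by LOCKSS or any other system named here: even if the bad guy takes over the caller, the component will not act faster than the bucket allows.

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: permits at most `rate` operations
    per second on average, with bursts of up to `capacity` operations."""

    def __init__(self, rate, capacity):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill tokens in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False  # denied: the caller is forced to slow down

bucket = TokenBucket(rate=10, capacity=5)
results = [bucket.allow() for _ in range(20)]
# Only the initial burst (roughly `capacity` calls) succeeds; a compromised
# caller hammering the component in a tight loop is throttled immediately.
```

The point of the exercise is the failure mode: a limiter like this makes an attack slow and noisy rather than complete and abrupt, at the cost of also limiting legitimate peak throughput.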
Debugging and testing were key to Karl Auerbach's contribution to the IP list discussion (reproduced in full by permission):
One of the motivations for packet switching and the ARPAnet was the ability to continue communications even during/after a nuclear holocaust. (Yes, I know that some people claim that that was not the purpose - but I was there, at SDC, from 1972 building ARPAnet like networks with that specific purpose.)

In recent years, or decades, we seem to be moving towards network architectures that are more brittle.

For example, there is a lot of discussion about "Software Defined Networks" and Openflow - which to my mind is ATM re-invented. Every time I look at it I think to myself "this design invites brittle failures."

My personal concern is slightly different. I come from a family of repairmen - radio and then TV - so when I look at something I wonder "how can it break?" and "how can it be repaired?".

We've engineered the internet so that it is not easy to diagnose problems. Unlike Ma Bell we have not learned to make remote loopbacks a mandatory part of many parts of the system. Thus we often have a flat, one sided view of what is happening. And if we need the view from the other end we often have to ask assistance of non-technical people who lack proper tools or knowledge how to use them.

As a first step we ought to be engineering more test points and remote loopback facilities into internet protocols and devices.

And a second step ought to be the creation of a database of network pathology. With that we can begin to create tools that help us reason backwards from symptoms towards causes. I'm not talking artificial intelligence or even highly expert systems. Rather this would be something that would help us look at symptoms, understand possible causes, and know what tests we need to run to begin to evaluate which of the possible causes are candidates and which are not.

Examples of brittle systems abound:
  • SSL is brittle in many ways. Browsers trust a pre-configured list of certificate authorities, whose role is to provide the illusion of security. If any one of them is malign or incompetent, the system is completely broken, as we see with the recent failure of the official Chinese certificate authority.
  • IP routing is brittle. Economic pressures have eliminated the "route around failure" property of the IP networks that Karl was building to survive nuclear war. Advertising false routes is a routine trick used by the bad guys to divert traffic for interception.
  • Perimeter security as implemented in firewalls is brittle. Once the bad guy is inside there are few limits on what, and how fast, he can do Bad Things.
  • The blockchain, and its applications such as Bitcoin, are brittle.
The blockchain is brittle because it can be taken over by a conspiracy. As I wrote in another of my contributions to the IP list, responding to and quoting from this piece of techno-optimism:
The revolution in progress can generally be described as “disintermediation”. It is the transference of trust, data, and ownership infrastructure from banks and businesses into distributed peer to peer network protocols.

A distributed “world wide ledger” is one of several technologies transforming our highly centralized structures. This technology, cryptically named the “block chain” is embodied in several distributed networks such as Bitcoin, Eris Industries DB, and Ethereum.

Through an encrypted world wide ledger built on a block chain, trust in the systems maintained by third party human institutions can be replaced by trust in math. In block chain systems, account identity and transactions are cryptographically verified by network “consensus” rather than by trust in a single third party.

These techno-optimists never seem to ask "what could possibly go wrong?" To quote from this blog post:
Since then, there has been a flood of proposals to base other P2P storage systems, election voting, even a replacement for the Internet on blockchain technology. Every one of these proposals for using the blockchain as a Solution for Everything I've looked at appears to make three highly questionable assumptions:
There have been times in the past when a single mining pool controlled more than 50% of the mining power, and thus the blockchain. That pool is known to have abused its control of the blockchain.

As I write this, 3 pools control 57% of the mining power. Thus a conspiracy between three parties would control the blockchain.

More than two decades ago at Sun I was convinced that making systems ductile (the opposite of brittle) was the hardest and most important problem in system engineering. After working on it in the LOCKSS Program for nearly 17 years I'm still convinced that this is true.
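The majority-control arithmetic is easy to check. The sketch below uses hypothetical pool shares (the real distribution shifts constantly) to find the smallest coalition of pools that would control a majority of mining power:

```python
def smallest_majority_coalition(shares):
    """Given pool -> fraction of total mining power, return the smallest set
    of pools whose combined share exceeds 50%, plus that combined share.
    Greedy from the largest pool down is optimal here, since we only want to
    minimize the number of pools in the coalition."""
    total = 0.0
    coalition = []
    for pool, share in sorted(shares.items(), key=lambda kv: -kv[1]):
        coalition.append(pool)
        total += share
        if total > 0.5:
            return coalition, total
    return None, total  # no majority exists among the listed pools

# Hypothetical shares, loosely modeled on the 57%/3-pool situation above.
pools = {"PoolA": 0.25, "PoolB": 0.18, "PoolC": 0.14, "PoolD": 0.10}
coalition, power = smallest_majority_coalition(pools)
# Three pools suffice: 25% + 18% + 14% = 57% > 50%.
```

This is what makes the brittleness binary: below the threshold the system behaves perfectly, and the moment a coalition crosses it, control is total.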

Library of Congress: The Signal: Dodge that Memory Hole: Saving Digital News

planet code4lib - Tue, 2015-06-02 14:00

Newspapers are some of the most-used collections at libraries. They have been carefully selected and preserved and represent what is often referred to as “the first draft of history.” Digitized historical newspapers provide broad and rich access to a community’s past, enabling new kinds of inquiry and research. However, these kinds of resources are at risk of being lost to future users. Networked digital technologies have changed how we communicate with each other and have rapidly changed how information is disseminated. These changes have had a drastic effect on the news industry, disrupting delivery mechanisms, upending business models and dispersing resources across the world wide web.

Current library acquisition and preservation methods for news are closely linked to the physical newspaper. Ensuring that the new modes of journalism, which are moving toward a “digital- and mobile-first” model, are captured and preserved at libraries and other memory institutions is the main goal of the Dodging the Memory Hole series of events. The first was organized in November 2014 by the Reynolds Journalism Institute at the University of Missouri.  The most recent took place in May of 2015 and was organized by the Educopia Institute at the Charlotte Mecklenburg Public Library in Charlotte, NC.

Hong Kong, 31st day of the Umbrella Revolution, taken October 28, 2014 by Pasu Au Yeung.

I had the opportunity to close out the May meeting and highlight areas where continued work would have an impact in helping libraries collect, preserve and provide access to born-digital news. A (slightly longer but hopefully clearer) version of my talk (pdf) is below.

I want to start with a photograph from last year’s protest in Hong Kong known as the Umbrella Revolution. The picture speaks to the complexity of the problem we face in capturing and preserving the news of today. The protest was unique in that it was one of the first protests in China organized, sustained and broadcast via social media. Capturing a diverse set of materials about this news event would mean capturing the stories from established media companies and the writings and images from individual blogs and other social media. This is especially important in the case of the Umbrella Revolution because official media outlets (and social media accounts) in China are often censored. This protest was also an example of how activism in general has adapted due to networked digital technologies. Future researchers studying social and political movements happening right now would never get the whole story without access to the social media.

The role of the journalist is to get the story out, and just like other publishers in the digital age, they’ve had to adapt to stay relevant. Digital storytelling is becoming more dynamic, exemplified by publications like Highline, a new long-form product from Huffington Post which is richly illustrated with audio and visual elements and is translated into a variety of languages. We can expect that in the pursuit of getting the story out and advancing storytelling, news content will come from more sources, be more dynamic and continue using all kinds of formats and distribution mechanisms.

Memory hole.

Libraries have also been transformed by digital technologies. There are a large number of digitized collections; we are creating vast and rich resources and, I think, providing great access and good stewardship to a large amount of this digitized content. Chronicling America and the Digital Public Library of America are great examples of this. However, there are gaps–or holes–in our collections, especially in born-digital content about contemporary events. Libraries haven’t broadly adapted their collecting practices to the current publishing environment, which today is dominated by the web.

Several people at this meeting mentioned the study done by Andy Jackson (ppt) at the British Library. I have his permission to share these slides, which he presented at the recent General Assembly of the International Internet Preservation Consortium. It is a simple but powerful study of ten years’ (2004-2014) worth of content from the UK Web Archive. It aims to find out what they have in their archive that is no longer on the live web. He looked at a sample of URLs per year and analyzed the content to determine whether the content at the URL in the archive was still at the same URL on the live web. He broke down and color-coded the URLs according to a percentage scale expressing whether the content was moved, changed, missing or gone. He found that after one year half of the content was either gone or had been changed so much as to be unrecognizable. After ten years almost no content still resides at its original URL. This analysis was done across all domains, but you can make a logical assumption that news content wouldn’t fare any better if subjected to the same type of analysis.

Fifty percent of URLs in the UK Web Archive have lost or missing content after one year. After ten years nearly all content is moved, changed, missing or gone. Credit: Taken from a presentation given by Andy Jackson at the IIPC GA, Apr 27, 2015.
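The classification at the heart of this kind of study can be sketched in a few lines. This is a simplified illustration, not Jackson's actual method: his study scored similarity on a percentage scale, while the sketch below reduces it to an exact-match check, and all inputs here are hypothetical.

```python
import hashlib

def classify_url(archived_text, live_status, live_text):
    """Coarse link-rot classification for one URL: is the content that was
    archived still being served at the same address on the live web?"""
    if live_status is None or live_status >= 400:
        return "gone"      # the URL no longer resolves, or returns an error
    if live_text is None or not live_text.strip():
        return "missing"   # the URL resolves but serves no content
    # An exact-match test is a crude stand-in for a similarity score; a real
    # study measures *how much* the content has changed, not just whether.
    same = (hashlib.sha256(archived_text.encode()).digest()
            == hashlib.sha256(live_text.encode()).digest())
    return "unchanged" if same else "changed"

# A page that still resolves but has been rewritten counts as "changed":
print(classify_url("original article text", 200, "redesigned page"))  # changed
```

Run over a yearly sample of archived URLs, counts of these categories produce exactly the kind of decay curve the study reports.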

We have clear data that if content is not captured from the web soon after its creation, it is at risk. Which brings me to where I think our main challenge is with collecting born-digital news: library acquisition policies and practices. Libraries collect the majority of their content by buying something–a newspaper subscription, a standing order for a serial publication, a package of titles from a publisher, an access license from an aggregator, etc. The news content that’s available for purchase and printed in a newspaper is a small subset of the content that’s created and available online. Videos, interactive graphs, comments and other user-generated data are almost exclusively available online. The absence of an acquisition stream for this content puts it at risk of being lost to future library and archives users.

Establishing relationships (and eventually agreements) with the organizations that create, distribute and own news content is one of the more promising strategies for libraries to collect digital news content.  Brian Hocker from KXAS-TV, an NBC affiliate in the Dallas area, shared the story of how KXAS partnered with the University of North Texas Libraries to digitize, share and ultimately preserve their station’s video archives as part of the Portal for Texas History. Jim Kroll from the Denver Public library also shared his story of acquiring the archives of the Rocky Mountain News after the newspaper ceased publication. Both stories emphasized the importance of establishing lasting relationships with decision-makers from news outlets in their respective communities. They also each created donor agreements that provided community access to the news archives which can serve as models for future agreements.

The relationships that enabled these agreements were the result of what I think of as entrepreneurial collection development, in the model of acquiring special collections. The archives were pursued actively and over time; they represent a new type of content, required a new type of relationship with a donor, and were a good fit, both geographically and topically, with existing collections at UNT and DPL.

Web archiving is another promising strategy to capture and preserve born-digital news. The Library of Congress recently announced its effort to save news websites, specifically those not affiliated with traditional news companies. Ben Walsh, creator of PastPages, announced that his service is now Memento-compliant, which will allow the archived front pages of the major-market newspaper websites that PastPages collects to be available in a Memento search. These projects will capture content at a national level, but the hyper-local news sites, citizen journalism and other niche blogs (news that used to be published as community newsletters or pamphlets) are most likely not being captured. Internet Archive’s Archive-It service is a mechanism for smaller libraries to engage in web archiving and capture some of this unique content. Capturing the social media around news events continues to be challenging, but tools have been developed to capture tweets, and collections of tweets around news events are being captured and shared.
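Memento compliance (RFC 7089) means a service lists its archived captures ("mementos") of a page in a machine-readable "TimeMap". As a hedged sketch: assuming the link-format layout shown in the hypothetical sample below (real TimeMaps can order attributes differently), a few lines of Python can extract every capture of a page.

```python
import re

def parse_timemap(link_format_text):
    """Extract (url, datetime) pairs for each capture ("memento") from an
    RFC 7089 link-format TimeMap. Assumes rel comes before datetime, as in
    the sample below."""
    pattern = r'<([^>]+)>;\s*rel="memento";\s*datetime="([^"]+)"'
    return re.findall(pattern, link_format_text)

# Abridged, hypothetical TimeMap; a real one would be fetched over HTTP from
# a Memento-compliant service or aggregator.
sample = (
    '<http://example.com/>;rel="original",'
    '<http://archive.example/20150101/http://example.com/>;'
    'rel="memento";datetime="Thu, 01 Jan 2015 00:00:00 GMT",'
    '<http://archive.example/20150601/http://example.com/>;'
    'rel="memento";datetime="Mon, 01 Jun 2015 00:00:00 GMT"'
)
captures = parse_timemap(sample)  # two captures of the page, oldest first
```

This is what makes Memento compliance valuable for news preservation: any client that speaks the protocol can discover every archived front page, across archives, without knowing each archive's internal URL scheme.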

The Dodging the Memory Hole events have thus far been excellent opportunities to bring librarians, archivists, the news industry and technologists together to help save news content for future generations. Look for more from this group on awareness raising, studies on what news content has already been lost, collaborations with the developers of news content management systems, and more guidance on developing donation agreements. To read more about the event, check out Trevor Owens’ report on the IMLS blog.

Open Knowledge Foundation: Why Open Contracting Matters to the OGP Agenda in Africa

planet code4lib - Tue, 2015-06-02 13:53

This is a guest post by Seember Nyager. Seember is an Open Knowledge/Code4Africa Open Government Fellow advocating for the adoption of open contracting data standards in Nigeria.

To be honest, the state of public services across Africa shames us. Often, public services do not meet generally accepted standards of efficiency, regular maintenance and service delivery. In most cases, it is unclear whether any specifications were followed during contract execution, and service delivery is often poor and non-standardized.

The state of public services on the continent is hard to reconcile with the abundance of our natural resources and the amount of external financing that is channeled to Africa each year. The standard of public service delivery has consequences, sometimes tragic ones, and the prevalence of tragedy is witnessed in our health care systems. Arguably the most tragic consequence of low standards in public service delivery is the erosion of trust between the Government and the people, as this is the greatest saboteur of good intentions in the public interest.

There is no quick fix to the infrastructure and service delivery deficit that plagues the continent. Some public services such as efficient transportation networks may only be fully operational after a decade. But there are ways to rebuild trust between Governments and the citizens and chart a formidable course for sustained efficiency in public service delivery.

In another vein, citizens of OGP participating countries may not know about the OGP and, in light of the current commitments being made by countries, may view the OGP as an abstract concept they need not involve themselves with. But there is compelling reason to believe that citizens of OGP participating countries would be better able to relate to and internalize the values behind the OGP if open contracting practices were made part of the OGP agenda in each of these countries.

Open contracting advocates for all stages that lead to public service delivery to be exposed to scrutiny, subject to narrowly defined exceptions. It also advocates that such routine information ought not have to be requested but should be made readily available through multiple channels, so that as far as possible the people know where responsibility for the success or failure of a public project lies and can participate in the contracting process that ultimately leads to public service delivery.

The scrutiny of the public contracting process requires that information is presented in ways that enable one set of information to be linked to other related information on a public project or service to be delivered; this requires data standards to be followed. Open contracting also requires that information is shared through multiple channels and brought to people in formats they understand; that information on public contracts includes milestones showing what is expected at each stage of implementation and the specifications that must be met at each milestone; that there is publicly available information on the service to be expected at the end of contract execution; and that contracting information is regularly updated and facilitates continuous dialogue between representatives of Government, the people, the contractors and other stakeholders within a community.

For OGP Africa participating countries like Kenya and Ghana who have FOI and RTI bills currently going through parliament, it is recommended that their bills reflect the proactive disclosure provisions on public finance information as contained in the Model Law on Access to Information. This would provide the legal backing for a robust open contracting practice to thrive. For OGP Africa participating countries like South Africa that are currently undergoing a reform to public sector procurement, it is recommended that there are clear requirements backed by law to ensure public participation in each phase of the contracting process.

For OGP participating countries like Sierra Leone that already have robust access-to-information and public procurement laws, it is recommended that contracting data, such as pricing benchmarks for public contracts, be made readily available, follow a specified standard, be updated regularly, and be distributed through multiple channels in ways that people can understand.

Committing to open contracting practices would require Government and civil society organizations to work closely together, and the OGP provides that platform. Further, the Open Contracting Partnership and the Web Foundation have developed an Open Contracting Data Standard that would be of great help to any country willing to adopt open contracting practices. Though Nigeria is not an OGP participant, I am hopeful that my own country will prioritize trust in public service delivery by adopting the spirit and practice of open contracting.
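To make the Open Contracting Data Standard mentioned above more concrete, here is a rough sketch of the kind of structured, linkable "release" it defines. The field names follow common OCDS conventions (ocid, tag, tender, milestones), but the values and the completeness check are invented for illustration; a real publisher would validate against the official OCDS release schema rather than this simplified function.

```python
# Illustrative sketch of a single contracting "release" shaped loosely on the
# Open Contracting Data Standard (OCDS). All values are invented examples.
release = {
    "ocid": "ocds-example-000001",       # globally unique Open Contracting ID
    "id": "000001-tender-2015-06",       # identifier for this release
    "date": "2015-06-02T00:00:00Z",
    "tag": ["tender"],                   # which contracting stage is described
    "buyer": {"name": "Ministry of Works"},
    "tender": {
        "title": "Rural road rehabilitation",
        "status": "active",
        "value": {"amount": 250000, "currency": "USD"},
        # Milestones let the public check expectations at each stage.
        "milestones": [
            {"title": "Grading complete", "dueDate": "2015-09-30"},
            {"title": "Surfacing complete", "dueDate": "2015-12-31"},
        ],
    },
}

def missing_fields(rel, required=("ocid", "id", "date", "tag")):
    """Report top-level fields a publisher still needs to supply."""
    return [f for f in required if f not in rel]

print(missing_fields(release))
```

Because every release carries the same ocid, reports on the tender, award, and implementation stages of one project can be linked together, which is exactly the traceability the paragraph above argues for.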

Seember can be reached on twitter @Seember1

LITA: Create, Build, Code and Hack with a choice of 4 LITA preconferences

planet code4lib - Tue, 2015-06-02 13:00

Register now for one of four exciting LITA pre conferences at 2015 ALA Annual in San Francisco.

On Friday, June 26, at the 2015 ALA Annual Conference in San Francisco, the Library and Information Technology Association (LITA) brings you a choice of 4 dynamic, useful and fun preconferences. These all-day preconferences, 8:30 a.m. – 4:00 p.m., will teach you how to create, build, code and hack the newest trends in technology for libraries. Register through the 2015 ALA Annual Conference website. The price to register is: $235 for LITA members (use special code LITA2015); $350 for ALA members; and $380 for non-members.

Creating Better Tutorials Through User-Centered Instructional Design. Hands-on workshop with experts from the University of Arizona. Event Code: LIT1

Build a Circuit & Learn to Program an Arduino in a Silicon Valley Hackerspace: Panel of Inventors & Librarians Working Together for a More Creative Tomorrow. This workshop will convene at Noisebridge, a maker space in San Francisco. Clearly, it will be hands on. Event Code: LIT3

Learn to Teach Coding and Mentor Technology Newbies – in Your Library or Anywhere! Work with experts from Black Girls CODE to become master technology teachers. Event Code: LIT2

Let’s Hack a Collaborative Library Website! This hands-on experience will consist of a morning in-depth introduction to the tools, followed by an afternoon building a single collaborative library website. Event Code: LIT4

Through hands-on activities, participants will learn to code, build, create, and teach others about new initiatives such as video tutorials, collaborative website tools, programming languages, and Arduino boards. These events are intended for any librarian wanting to stretch themselves and meet their patrons in these new hands-on technology worlds.

Notable preconference presenters include: Yvonne Mery, Leslie Sult and Rebecca Blakiston from the University of Arizona Libraries; Mitch Altman of Noisebridge; Brandon (BK) Klevence of The Maker Jawn Initiative (Philadelphia, PA); Angi Chau of the Castilleja School (Palo Alto, CA); Tod Colegrove and Tara M Radniecki of the University of Nevada – Reno; Kimberly Bryant and Lake Raymond from Black Girls CODE; Kate Bronstad and Heather J Klish of Tufts University; and Junior Tidal of the New York City College of Technology.

See the LITA conference web site for information about LITA events including details on the preconferences, the LITA Presidents program with Lou Rosenfeld, the Top Technology Trends panel, and social events.

For questions, contact Mark Beatty, LITA Programs and Marketing Specialist at or (312) 280-4268.

Open Library Data Additions: Amazon Crawl: part 13

planet code4lib - Tue, 2015-06-02 06:35

Part 13 of the Amazon crawl.

This item belongs to: data/ol_data.

This item has files of the following types: Data, Data, Metadata, Text

District Dispatch: ALA draws line in sand on USA FREEDOM amendments

planet code4lib - Tue, 2015-06-02 02:37


The United States Senate adjourned today with the stage set for votes Tuesday afternoon on at least three “hostile” amendments to the USA FREEDOM Act filed by Senate Majority Leader Mitch McConnell (R-KY).  As explained in a letter by Washington Office Executive Director Emily Sheketoff that will be delivered to all Senators ahead of Tuesday’s votes, passage of any one such amendment would water down the USA FREEDOM Act so seriously as to cause ALA to reverse course and oppose the bill.

Now is the time for one last push by librarians everywhere to again call and email their Senators to deliver a simple message: 1) VOTE “NO” on any and every amendment that would weaken the USA FREEDOM Act; and 2) PASS the bill now without change so that the President can sign it without delay.

Please, visit ALA’s Legislative Action Center to send that urgent message now.

For detailed information on the pending amendments and why they’re utterly unacceptable, please see this analysis by our coalition compatriots at the Center for Democracy and Technology.  The ALA Washington Office’s “line in the sand” letter is available here: USAF Letter 060115.

The post ALA draws line in sand on USA FREEDOM amendments appeared first on District Dispatch.

LibUX: 020: Localizing the User Experience with Robert Laws

planet code4lib - Mon, 2015-06-01 23:59

Robert Laws is the Digital Services Librarian for Georgetown University’s School of Foreign Service in Qatar. In this episode of LibUX, Robert discusses customizing Drupal and LibGuides to present a more localized version of those sites for his campus. He gives tips on how he got started and how to stay relevant in the world of web services. As our first international guest, Amanda asked him about the challenges of regional restrictions on content.

You can listen to LibUX on Stitcher, find us on iTunes, or subscribe to the straight feed. Consider signing-up for our weekly newsletter, the Web for Libraries.

The post 020: Localizing the User Experience with Robert Laws appeared first on LibUX.

District Dispatch: Update on 1201 proceedings

planet code4lib - Mon, 2015-06-01 22:06

In the last two weeks, the Copyright Office held ten hearings in Los Angeles and Washington, D.C. and heard the arguments for and against circumvention of digital locks—Section 1201 of the Digital Millennium Copyright Act—on the proposed classes of works, including cell phones, video games, e-readers, and oh yes, farm equipment. Many have said that these hearings are unbearable and long, but in a weird way, I like to attend them (and ALA Council). Unfortunately, I was out of town and missed the hearings. So read along with me: reports on the hearings from Brandon Butler of the Washington College of Law at American University and Rebecca Tushnet of Georgetown Law.

The post Update on 1201 proceedings appeared first on District Dispatch.

HangingTogether: What’s changed in linked data implementations?

planet code4lib - Mon, 2015-06-01 20:47

Last year we received 96 responses to the OCLC Research “International Linked Data Survey for Implementers” reporting 172 linked data projects or services in 15 countries, of which 76 were described. Of the 76 projects described, 27 (36%) were not yet implemented and 13 (17%) had been in production in less than a year.

So we were curious – what might have changed in the last year? OCLC Research decided to repeat its survey to learn details of specific projects or services that format metadata as linked data and/or make subsequent uses of it.  We’re curious to see whether the projects that had not yet been implemented have now been, whether any of last year’s respondents would have any different answers, and whether we could encourage linked data implementers who didn’t respond to last year’s survey to respond to this year’s.

The questions are the same so we can more easily compare results. (Some multiple-choice questions have more options taken from the “other” responses in last year’s responses, and some open-ended questions are now multiple-choice, again based on last year’s responses.) The target audiences are staff who have implemented or are implementing linked data projects or services-either by publishing data as linked data or ingesting linked data resources into their own data or applications, or both.
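For readers wondering what "publishing data as linked data" looks like at its simplest, here is a minimal sketch using JSON-LD, one common linked data serialization. The record, the example.org URIs, and the schema.org vocabulary choice are all invented for illustration; real projects of the kind the survey asks about would use their own identifiers and vocabularies.

```python
import json

# Invented example: one catalog record expressed as JSON-LD. The URIs and
# the schema.org typing are illustrative assumptions, not real data.
record = {
    "@context": "http://schema.org/",
    "@id": "http://example.org/bib/12345",   # the record's own URI
    "@type": "Book",
    "name": "Linked Data for Libraries",
    "author": {"@id": "http://example.org/person/67"},  # link, not a string
}

# Serializing to JSON text is all "publishing" needs at its simplest:
# serve this document at the @id URI with an application/ld+json media type.
doc = json.dumps(record, indent=2)
print(doc)
```

The point of the `@id` links is that another institution's application can "ingest" the record and follow the author URI to more data, which is the consuming side of the survey's question.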

The survey is available at

We are asking that responses be completed by 17 July 2015. As with last year’s survey, we will share the examples collected for the benefit of others wanting to undertake similar efforts, wondering what is possible to do and how to go about it. We summarized last year’s results in a series of blog posts here: 1) Who’s doing it; 2) Examples in production; 3) Why and what institutions are consuming; 4) Why and what institutions are publishing; 5) Technical details; 6) Advice from the implementers.

What do you think has changed in the last year?



About Karen Smith-Yoshimura

Karen Smith-Yoshimura, program officer, works on topics related to renovating descriptive and organizing practices with a focus on large research libraries and area studies requirements.


Patrick Hochstenbach: Triennale Brugge 2015

planet code4lib - Mon, 2015-06-01 18:23
Filed under: Doodles Tagged: brugge, triennale, urban, urbansketching

District Dispatch: Experts to demystify 3D printing policies at 2015 ALA Conference

planet code4lib - Mon, 2015-06-01 18:22

As more and more libraries nationwide begin to offer 3D printing services, library leaders are now confronting a litany of copyright, trademark and patent complications that arise from the new technology. To help the library community address 3D printing concerns, the American Library Association (ALA) Committee on Legislation’s (COL) Copyright Subcommittee will explore 3D printing policy issues at the 2015 ALA Annual Conference in San Francisco.

Join Tomas A. Lipinski, dean of the University of Wisconsin-Milwaukee’s School of Information Studies and COL Copyright Subcommittee member, St. Louis’ University City Public Library Director Patrick Wall, and other policy experts at the session “Copyright and 3D Printing: Be Informed, Be Fearless, Be Smart!” for a “plain English” discussion of 3D printing, its copyright implications, and the patent and trademark issues that this breakthrough technology raises for libraries everywhere. The session will take place from 10:30 to 11:30 a.m. on Saturday, June 27, 2015, at the Moscone Convention Center in room 2001 of the West Building.

Lipinski has worked in a variety of legal settings including the private, public and non-profit sectors. He currently teaches, researches and speaks frequently on various topics within the areas of information law and policy, especially copyright, free speech and privacy issues in schools and libraries. Patrick Wall has been the director of St. Louis’ University City Public Library since March of 2011 and was its assistant director for the previous eight years. He also serves as President of the Municipal Library Consortium of St. Louis County, a group of nine libraries providing collective public access to more than 700,000 volumes.

  • Tomas A. Lipinski, dean of the University of Wisconsin-Milwaukee’s School of Information Studies, member of the American Library Association Committee on Legislation
  • Patrick Wall, director, University City Public Library (St. Louis)

View all ALA Washington Office conference sessions

The post Experts to demystify 3D printing policies at 2015 ALA Conference appeared first on District Dispatch.

ACRL TechConnect: Where do Library Staff Learn About Programming? Some Preliminary Survey Results

planet code4lib - Mon, 2015-06-01 14:05

[Editor’s Note:  This post is part of a series of posts related to ACRL TechConnect’s 2015 survey on Programming Languages, Frameworks, and Web Content Management Systems in Libraries.  The survey was distributed between January and March 2015 and received 265 responses.  A longer journal article with additional analysis is also forthcoming.  For a quick summary of the article below, check out this infographic.]

Our survey on programming languages in libraries has resulted in a mountain of fascinating data.  One of the goals of our survey was to better understand how staff in libraries learn about programming and develop their coding skills.  Based upon anecdotal evidence, we hypothesized that library staff members are often self-taught, learning through a combination of on-the-job learning and online tutorials.  Our findings indicate that respondents use a wide variety of sources to learn about programming, including MOOCs, online tutorials, Google searches, and colleagues.

Are programming skills gained by formal coursework, or in Library Science Master’s Programs?

We were interested in identifying sources of programming learning, whether that involved course work (either formal coursework as part of a degree or continuing education program, or through Massive Online Open Courseware (MOOCs)).  Nearly two-thirds of respondents indicated they had an MLS or were working on one:

When asked about coursework taken in programming, application, or software development, results were mixed, with the most popular choice being 1-2 classes:

However, of those respondents who have taken a course in programming (about 80% of all respondents) AND indicated that they either had an MLS or were attending an MLS program, only about a third had taken any of those courses as part of a Master’s in Library Science program:

Resources for learning about programming

The final question of the survey asked respondents, in an open-ended way, to describe resources they use to learn about programming.  It was a pretty complex question:

Please list or describe any learning resources, discussion boards or forums, or other methods you use to learn about or develop your skills in programming, application development, or scripting. Please include links to online resources if available. Examples of resources include, but are not limited to: MOOC courses, local community/college/university courses on programming, books, the Code4Lib listserv, Stack Overflow, etc.

Respondents gave, in many cases, incredibly detailed responses – and most respondents indicated a list of resources used.  After coding the responses into 10 categories, some trends emerged.  The most popular resources for learning about programming, by far, were courses (whether those courses were taken formally in a classroom environment, or online in a MOOC environment):

To better illustrate what each category entails, here are the top five resources in each category:

By far, the most commonly cited learning resource was Stack Overflow, followed by the Code4Lib listserv and books/ebooks (unspecified). Results may skew a little toward these resources because they were mentioned as examples in the question, priming respondents to include them in their responses. Since links to the survey were distributed, among other places, on the Code4Lib listserv, its prominence may also be influenced by response bias. One area that was a little surprising was the number of respondents who included social networks (including in-person networks like co-workers) as resources – indeed, respondents who mentioned colleagues as learning resources were particularly enthusiastic. As one respondent put it:

…co-workers are always very important learning resources, perhaps the most important!
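Coding hundreds of free-text answers into categories like the ones above is often done with a first pass of simple keyword matching before manual review. The categories, keywords, and sample responses below are invented for illustration and are not the actual coding scheme used in the survey.

```python
from collections import Counter

# Hypothetical keyword map for sorting free-text survey answers into
# resource categories. A real coding pass would refine this by hand.
CATEGORIES = {
    "Q&A forums": ["stack overflow", "stackoverflow"],
    "Listservs": ["code4lib", "listserv"],
    "Courses/MOOCs": ["mooc", "coursera", "course", "codecademy"],
    "Books": ["book", "ebook"],
    "Colleagues": ["co-worker", "coworker", "colleague"],
}

def code_response(text):
    """Return every category whose keywords appear in one response."""
    text = text.lower()
    return [cat for cat, keys in CATEGORIES.items()
            if any(k in text for k in keys)]

# Invented sample responses; one answer can land in several categories.
responses = [
    "Codecademy got me started, plus Stack Overflow",
    "mostly books and my co-workers",
]
tally = Counter(cat for r in responses for cat in code_response(r))
print(tally.most_common())
```

Tallying the coded categories with `Counter` then yields exactly the kind of frequency ranking the charts above summarize.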

Preliminary Analysis

While the data isn’t conclusive enough to draw any strong conclusions yet, a few thoughts come to mind:

  • About 3/4 of respondents indicated that programming was either part of their job description, or that they use programming or scripting as part of their work, even if it’s not expressly part of their job.  And yet, only about a third of respondents with an MLS (or in the process of getting one) took a programming class as part of their MLS program.  Programming is increasingly an essential skill for library work, and this survey seems to support the view that there should be more programming courses in library school curriculum.
  • Obviously programming work is not monolithic – there’s lots of variation among those who do programming work that isn’t reflected in our survey, and this survey may have unintentionally excluded those who are hobby coders.  Most questions focused on programming used when performing work-related tasks, so additional research would be needed to identify learning strategies of enthusiast programmers who don’t have the opportunity to program as part of their job.
  • Respondents indicated that learning on the job is an important aspect of their work; they may not have time or institutional support for formal training or courses, and figure things out as they go along using forums like Stack Overflow and Code4Lib’s listserv.  As one respondent put it:

Codecademy got me started. Stack Overflow saves me hours of time and effort, on a regular basis, as it helps me with answers to specific, time-of-need questions, helping me do problem-based learning.

TL;DR?  Here’s an infographic:

In the next post, I’ll discuss some of the findings related to ways administration and supervisors support (or don’t support) programming work in libraries.

LITA: Negotiate!

planet code4lib - Mon, 2015-06-01 13:00

I’m going to say it: Librarians are rarely effective negotiators. Way too often we pay full prices for mediocre resources without demur. Why?

Credit: Flickr user changeorder

First of all, most librarians are introverts and/or peaceable sorts who dislike confrontation. Second, we are unlikely to get bonuses or promotions when we save our organizations money, so there goes most of the extrinsic motivation for driving a hard bargain with vendors. Third and most importantly, we go into the library business because libraries aren’t a business. Most of us deliver government-funded public services, so we have zero profit motive, and our non-business mentality is almost a professional value in itself. But this failure to negotiate weakens our value to the communities we serve.

Libraries pay providers over a billion dollars a year for digital services and resources, only to get overpriced subscriptions and comparatively shoddy products. When did you last meet a librarian who loved their ILS? Meanwhile, we lose whatever dignity remains to us when our national associations curry favor with “Library Champions” like Elsevier, soliciting these profiteers to give back a minuscule fraction of their profits squeezed from libraries. We forget that vendors exist because of us.

Recently I sat in a dealer’s office for ninety minutes, refusing to budge till I got a better deal on my new car. The initial offer was 7% APR. The final offer was 0.9% APR with new all-season floor mats thrown in. The experience awoke me to the realization that I, as the customer, always held the leverage in any business relationship. I was thrilled.

I applied that realization to my work managing electronic resources, renegotiating contracts, haggling reduced rates, and saving about 10% of my annual budget in my first year while delivering equivalent levels of service. This money could then be shuffled to fund other e-resources and services, or saved so as to forestall forced budget cuts and make the library look good to external administrators keen to cut costs.

The key to negotiation is not to fold at the first “no.” Initial price quotes and contracts are a starting point for negotiation, by no means the final offer. Trim unneeded services to obtain a price reduction. Renegotiate, don’t renew, contracts. Ask to renew existing subscriptions at the previous year’s price, dodging the 5% annual increase that most providers slap on products. And take nothing at face value! I once saved $4000 on a single bill because I phoned to ask for a definitive list of our product subscriptions only to discover that the provider had neglected to document one very active subscription. Sooo… we didn’t have to pay for it.
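The cumulative cost of those 5% annual increases is easy to underestimate. A quick back-of-the-envelope sketch, with an invented subscription price, shows what negotiating flat renewals saves over a few years:

```python
# Compare renewing at a 5% annual increase vs. holding the previous year's
# price. The $10,000 base price is an invented example figure.
def total_cost(base, years, increase=0.05):
    """Total paid over `years` renewals with a fixed annual percent increase."""
    return sum(base * (1 + increase) ** y for y in range(years))

base = 10_000                             # hypothetical annual subscription
escalated = total_cost(base, 5)           # 5% bump every year
flat = total_cost(base, 5, increase=0)    # negotiated flat renewals

print(round(escalated - flat, 2))         # extra paid over five years
```

On these assumed numbers, the compounding increase adds more than half a year's subscription price over five years, which is why renegotiating rather than renewing pays off.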

Don’t hesitate to call out bad service either. A company president once personally phoned me because I had rather vociferously objected to his firm’s abysmal customer service. Bear in mind, though, that most vendor reps are delightful people who care about libraries too. So when you’re negotiating, be firm and persistent but please don’t be a jerk.

Long-term solutions to vendor overpricing and second-rate products include consortiums, open access publishing, and open source software. But the simplest and quickest short-term solution for us individuals is to negotiate to get your money’s worth. Vendors want to keep your business, so to get a better deal, sometimes all you have to do is ask.

Michael Rodriguez is the E-Learning Librarian at Hodges University in Florida. He manages the library’s digital services and resources, including 130-plus databases, the library website, and the ILS. He also teaches and tutors students in the School of Liberal Studies and the School of Technology, runs social media for LITA, and does freelance professional training and consulting. He tweets @topshelver and blogs at Shelver’s Cove.

