Feed aggregator

District Dispatch: Free webinar: Understanding Social Security

planet code4lib - Thu, 2014-09-04 16:26

Photo by the Knight Foundation

Do you know how to help your patrons locate information on Supplemental Security Income or Social Security? The American Library Association (ALA) is encouraging librarians to participate in “My SSA,” a free webinar that will teach participants how to use My Social Security (MySSA), the online Social Security resource.

Presented by leaders and members of the development team of MySSA, this session will provide attendees with an overview of MySSA. The Social Security Administration is encouraging librarians to create an online MySSA account to view and track benefits, in addition to receiving benefits information in print.

Attendees will learn about viewing earnings records and receiving instant estimates of their future Social Security benefits. Those already receiving benefits can check benefit and payment information and manage their benefits.

Speakers include:

  • Maria Artista-Cuchna, Acting Associate Commissioner, External Affairs
  • Kia Anderson, Supervisory Social Insurance Specialist
  • Arnoldo Moore, Social Insurance Specialist
  • Alfredo Padilia Jr., Social Insurance Specialist
  • Diandra Taylor, Management Analyst

Date: Wednesday, September 17, 2014
Time: 2:00 PM – 3:00 PM EDT
Register for the free event

If you cannot attend this live session, a recorded archive will be available. To view past webinars also hosted collaboratively with iPAC, please visit

The post Free webinar: Understanding Social Security appeared first on District Dispatch.

Library of Congress: The Signal: DPOE Working Group Moves Forward on Curriculum

planet code4lib - Thu, 2014-09-04 13:03

The working group at their recent meeting. Photo by Julio Diaz.

For many organizations that are just starting to tackle digital preservation, it can be a daunting challenge – and particularly difficult to figure out the first steps to take.  Education and training may be the best starting point, creating and expanding the expertise available to handle this kind of challenge.  The Digital Preservation Outreach and Education  program here at the Library aims to do just that, by providing the materials as well as the hands-on instruction to help build the expertise needed for current and future professionals working on digital preservation.

Recently, the Library was host to a meeting of the DPOE Working Group, consisting of a core group of experts and educators in the field of digital preservation.  The Working Group participants were Robin Dale (Institute of Museum and Library Services), Sam Meister (University of Montana-Missoula), Mary Molinaro (University of Kentucky), and Jacob “Jake” Nadal (Princeton University).  The meeting was chaired by George Coulbourne of the Library of Congress, and Library staffers Barrie Howard and Kris Nelson also participated.

The main goal of the meeting was to update the existing DPOE Curriculum, which is used as the basis for the Program’s training workshops and, subsequently, by the trainees themselves. A survey is being conducted to gather even more information, and it will help inform this curriculum as well (see a related blog post). The Working Group reviewed and edited all six of the substantive modules, which are based on terms from the OAIS Reference Model framework:

  • Identify   (What digital content do you have?)
  • Select   (What portion of your digital content will be preserved?)
  • Store   (What issues are there for long-term storage?)
  • Protect  (What steps are needed to protect your digital content?)
  • Manage   (What provisions are needed for long-term management?)
  • Provide   (What considerations are there for long-term access?)

The group also discussed adding a seventh module on implementation.  Each of these existing modules contains a description, goals, concepts and resources designed to be used by current and/or aspiring digital preservation practitioners.

Mary Molinaro, Director, Research Data Center at the University of Kentucky Libraries, noted that “as we worked through the various modules it became apparent how flexible this curriculum is for a wide range of institutions.  It can be adapted for small, one-person cultural heritage institutions and still be relevant for large archives and libraries. ”

Mary also spoke to the advantages of having a focused, group effort to work through these changes: “Digital preservation has some core principles, but it’s also a discipline subject to rapid technological change.  Focusing on the curriculum together as an instructor group allowed us to emphasize those things that have not changed while at the same time enhancing the materials to reflect the current technologies and thinking.”

These curriculum modules are currently in the process of further refinement and revision, including an updated list of resources. The updated version of the curriculum will be available later this month. The Working Group also recommended some strategies for extending the curriculum to address executive audiences, and how to manage the process of updating the curriculum going forward.

Peter Murray: Thursday Threads: History of the Future, Kuali change-of-focus, 2018 Mindset List

planet code4lib - Thu, 2014-09-04 10:22

This week’s threads are a mixture of the future, the present and the past. Starting things off is A History of the Future in 100 Objects, a revealing look at what technology and society have in store for us. Parts of this resource are available freely on the website with the rest available as a $5 e-book. Next, in the present, is the decision by the Kuali Foundation to shift to a for-profit model and what it means for open source in the academic domain. And finally, a look at the past with the mindset list for the class of 2018 from Beloit College.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to Pinboard are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

A History of the Future in 100 Objects

What are the 100 objects that future historians will pick to define our 21st century? A javelin thrown by an ‘enhanced’ Paralympian, far further than any normal human? Virtual reality interrogation equipment used by police forces? The world’s most expensive glass of water, mined from the moons of Mars? Or desire modification drugs that fuel a brand new religion?
A History of the Future in 100 Objects describes a hundred slices of the future of everything, spanning politics, technology, art, religion, and entertainment. Some of the objects are described by future historians; others through found materials, short stories, or dialogues. All come from a very real future.

- About A History of the Future, by Adrian Hon

I was turned on to this book-slash-website-slash-resource by a tweet from Herbert Van de Sompel:

I'm assuming @apple doesn't believe in the future – "A history of the Future in 100 objects" not in iBooks / @cni_org

— Herbert (@hvdsomp) August 21, 2014

The name is intriguing, right? I mean, A History of the Future in 100 Objects? What does it mean to have a “History of the Future”?

The answer is an intriguing book that places the reader in the year 2082 looking back at the previous 68 years. (Yes, if you are doing the math, the book starts with objects from 2014.) Whether it is high-tech gizmos or the impact of world events, the author makes a projection of what might happen by telling the brief story of an artifact. For those in the library arena, you will want to read about the reading rooms of 2030, but I really suggest starting at the beginning and working your way through the vignettes from the book that the author has published on the website. There is a link in the header of each page that points to e-book purchasing options.

Kuali Reboots Itself into a Commercial Entity

Despite the positioning that this change is about innovating into the next decade, there is much more to this change than might be apparent on the surface. The creation of a for-profit entity to “lead the development and ongoing support” and to enable “an additional path for investment to accelerate existing and create new Kuali products” fundamentally moves Kuali away from the community source model. Member institutions will no longer have voting rights for Kuali projects but will instead be able to “sit on customer councils and will give feedback about design and priority”. Given such a transformative change to the underlying model, there are some big questions to address.

- Kuali For-Profit: Change is an indicator of bigger issues, by Phil Hill, e-Literate

As Phil noted in yesterday’s post, Kuali is moving to a for-profit model, and it looks like it is motivated more by sustainability pressures than by some grand affirmative vision for the organization. There has been a long-term debate in higher education about the value of “community source,” which is a particular governance and funding model for open source projects. This debate is arguably one of the reasons why Indiana University left the Sakai Foundation (as I will get into later in this post). At the moment, Kuali is easily the most high-profile and well-funded project that still identifies itself as Community Source. The fact that this project, led by the single most vocal proponent for the Community Source model, is moving to a different model strongly suggests that Community Source has failed.
It’s worth taking some time to talk about why it has failed, because the story has implications for a wide range of open-licensed educational projects. For example, it is very relevant to my recent post on business models for Open Educational Resources (OER).

- Community Source Is Dead, by Michael Feldstein, e-Literate blog

I touched on the cosmic shift in the direction of Kuali on DLTJ last week, but these two pieces from Phil Hill and Michael Feldstein on the e-Literate blog go into more depth. I have certainly been a proponent of the open source method of building software and the need for sustainable open source software to develop a community around that software. But I can’t help but think there is more to this story than meets the eye: that there is something about a lack of faith by senior university administrators in having their own staff own the needs and issues of their institutions. Or maybe it has something to do with the high levels of fiscal commitment to elaborate “community source” governance structures. In thinking about what happened with Kuali, I can’t help but compare it to the reality of Project Hydra, where libraries participate with in-kind donations of staff time, travel expenses and good will to a self-governing organization that has only as much structure as it needs.

The 2018 Mindset List

Students heading into their first year of college this year were generally born in 1996.

Among those who have never been alive in their lifetime are Tupac Shakur, JonBenet Ramsey, Carl Sagan, and Tiny Tim.

On Parents’ Weekend, they may want to watch out in case Madonna shows up to see daughter Lourdes Maria Ciccone Leon or Sylvester Stallone comes to see daughter Sophia.

For students entering college this fall in the Class of 2018…

- 2018 List, by Tom McBride and Ron Nief, Beloit College Mindset List

So begins the annual “mindset list” — a tool originally developed to help the Beloit College instructors use cultural references that were relevant to the students entering their classrooms. I didn’t see as much buzz about it this year in my social circles, so I wanted to call it out (if for no other reason than to make you feel just a little older…).

Peter Murray: Blocking /xmlrpc.php Scans in the Apache .htaccess File

planet code4lib - Thu, 2014-09-04 02:41

Someone out there on the internet is repeatedly hitting this blog’s /xmlrpc.php service, probably looking to enumerate the user accounts on the blog as a precursor to a password scan (as described in Huge increase in WordPress xmlrpc.php POST requests at Sysadmins of the North). My access logs look like this:

- - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:21 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:22 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:24 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:24 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
- - [04/Sep/2014:02:18:26 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"

By itself, this is just annoying — but the real problem is that the PHP stack is getting invoked each time to deal with the request, and at several requests per second from different hosts this was putting quite a load on the server. I decided to fix the problem with a slight variation from what is suggested in the Sysadmins of the North blog post. This addition to the .htaccess file at the root level of my WordPress instance rejects the connection attempt at the Apache level rather than the PHP level:

RewriteCond %{REQUEST_URI} =/xmlrpc.php [NC]
RewriteCond %{HTTP_USER_AGENT} .*Mozilla\/4.0\ \(compatible:\ MSIE\ 7.0;\ Windows\ NT\ 6.0.*
RewriteRule .* - [F,L]

Which means:

  1. If the requested path is /xmlrpc.php, and
  2. you are sending this particular agent string, then
  3. send back a 403 error message and don’t bother processing any more Apache rewrite rules.

If you need to use this yourself, you might find that the HTTP_USER_AGENT string has changed. You can copy the user string from your Apache access logs, but remember to preface each space or each parenthesis with a backslash.
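The escaping step described above is easy to get wrong by hand, so here is a minimal Python sketch of it. The function name `escape_user_agent` is hypothetical, and the sketch only escapes spaces and parentheses, as the text describes; other regular expression metacharacters (such as the dots in version numbers) are left alone, as they are in the rule above.

```python
import re


def escape_user_agent(user_agent: str) -> str:
    """Put a backslash in front of every space and parenthesis.

    The result is meant to be pasted into a RewriteCond pattern;
    no other characters are escaped (an assumption of this sketch).
    """
    return re.sub(r"([ ()])", r"\\\1", user_agent)


# User agent string copied from the access log entries above.
agent = "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"
print(escape_user_agent(agent))
```

The printed string can be placed between leading and trailing `.*` in the RewriteCond line, in the same way as the rule above.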

Peter Murray: 2nd Workshop on Sustainable Software for Science: Practice and Experiences — Accepted Papers and Travel Support

planet code4lib - Thu, 2014-09-04 02:08

The conference organizers for WSSSPE2 have posted the list of accepted papers and the application for travel support. I was on the program committee for this year’s conference, and I can point to some papers that I think are particularly useful to libraries and the cultural heritage community in general:

William Denton: Moodie's Tale

planet code4lib - Thu, 2014-09-04 01:19

Somebody said we need a Moo for libraries. We still do. But I just read Moodie’s Tale by Eric Wright and I think it’s the Moo of Canadian academia. I don’t know Susanna Moodie or The Canterbury Tales so I think I’m missing a fair bit, but I still enjoyed it very much.

There are a few mentions of libraries, like this:

“Here’s an example,” the president continued. “I propose that henceforth you fellows be called ‘deans.’ Most places have deans nowadays. Sound the others out to see if there’s a problem. Now what else? What else does a college have? A proper college.”

“A library?”

“We’ve got one of sorts, haven’t we? In the corner room of the Drug Mart.”

“Just a few shelves, Gravely. Not many of the faculty know about it. It ought to have some standard reference works. Encyclopedias, that kind of thing.”

“We can afford a couple of thousand from the cleaning budget. Draw up a list. But now you’ve mentioned it, what is the real mark of a library?”

“Other than books?”

“Yes. What else?”

“A copying machine?”

“What else?”

It was important to guess right. Cunningham was getting impatient. “I am not sure of your emphasis, Gravely,” he hedged.

“Emphasis? How do you know it is a library?”

“The sign on the door?”

“Exactly. The label, William, the label. Get a sign made. And what do people find inside the door?”

“The librarian?”

“Now you’re on to it. Apart from the sign, the cheapest thing in the library is the librarian, especially since they aren’t unionized. We could put anyone in and call him the librarian. Now who have we got?”


Beckett was a religious maniac, a clerk in the maintenance department who spent his hours walking the streets with a billboard, warning of the end. His fellow workers complained constantly of his proselytizing in the storeroom.

“Perfect. He’s a bit more eccentric than most librarians, I suppose, but he’ll do. Is he conscientious?”

“It’s the other thing his colleagues dislike about him.”

“Done, then.”

Islandora: Varnish, Islandora, and

planet code4lib - Thu, 2014-09-04 00:24
Varnish and Islandora

Below you will find some information on how UPEI's Robertson Library configured Varnish for use with Islandora. Currently we have Varnish running on our Newspaper site and it is working well with the OpenSeadragon viewer, but we have not tested with the IA Bookviewer yet.

Why use Varnish?

At Robertson Library we have been digitizing the Guardian newspaper for a while now. We expected there would be a good amount of traffic to this site when it went live so prior to launch we wanted to do some benchmarks. We also noticed with the stock Islandora Newspaper solution pack that loading the Guardian newspaper page was very slow and we expected we would have to try to optimize things to handle load.

The benchmarks we used were pretty simple and were really just a way to help us determine whether or not an optimization was worth keeping. We used The Grinder, a Java based load testing framework.

We loaded Grinder with a simple scenario: hit the homepage, the main Guardian newspaper page, a newspaper page (in the OpenSeadragon viewer), and the main Guardian page again (the one that lists all the issues of the Guardian; we have almost 20,000 issues so far). Grinder was configured to hit these pages 250 times with 50 threads.
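For readers without Grinder, the shape of this scenario can be approximated in a few lines of Python. This is a rough sketch and not the Grinder script used for the tests: the function names, the injectable `fetch` callable, and the reading of "250 times with 50 threads" as 250 runs of the page sequence shared across a pool of 50 worker threads are all assumptions made for illustration.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def http_fetch(url):
    """Fetch one URL and return the number of bytes received."""
    with urllib.request.urlopen(url) as response:
        return len(response.read())


def run_scenario(pages, runs=250, threads=50, fetch=http_fetch):
    """Request every page in order, `runs` times, using `threads` workers.

    Returns (requests_made, bytes_received, elapsed_seconds).
    """
    def one_run(_):
        # One pass through the scenario: each page is requested in sequence.
        return [fetch(page) for page in pages]

    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        results = list(pool.map(one_run, range(runs)))
    elapsed = time.perf_counter() - started

    sizes = [size for run in results for size in run]
    return len(sizes), sum(sizes), elapsed
```

Calling `run_scenario` with a list of the four page URLs (homepage, main Guardian page, one newspaper page, main Guardian page again) gives a request count, total bytes, and elapsed time, from which requests per second and KB/sec can be derived. Passing a stub as `fetch` lets the harness be exercised without touching a server.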

Our first run at it was with the stock Islandora Newspaper solution pack.

The numbers were not great with the stock Islandora Newspaper solution pack: we could handle about 1 request per second and we were starting to receive some errors. Total throughput was 1106.59 KB/sec. CPU usage on the server was very high; all cores were pretty steady at or near 100%.

The biggest problem seemed to be hitting the resource index over and over again and manipulating the resulting array. So to try and speed things up a little we modified the code to query Solr instead of the Resource Index.

Test results with Solr query.

By querying Solr we were able to speed things up quite a bit. We were now getting close to 5 requests per second, no errors and a throughput of 4874.92 KB/sec. Our CPU usage was still very high, all cores at or near 100%.

We couldn’t see other ways to make the main Guardian page load faster without significantly changing how the Newspaper solution pack worked. Dynamically listing almost 20,000 issues on one page was going to take time no matter how we did it, unless we broke the page up into several requests. Breaking the page up into several requests would not be ideal either, as we would have to make roundtrips to the server to get the list of years available as well as all issues for a selected year. Instead of breaking this page up into several requests we discussed caching it.

So our next step was to install and configure Varnish so that this page would be cached. With Varnish installed and configured we ran the same Grinder tests.

Test with Varnish enabled

By using Varnish our numbers improved again. We were now handling 10 requests per second, with no errors and a throughput of 9808.21 KB/sec. Our CPU usage was way down, with all cores between 3% and 20% usage (most were closer to 3%). By using Varnish we got a speed boost, but I think the biggest advantage will be in the number of users we can handle, as our most expensive requests now come from the cache with little server overhead.

Of course, using Grinder to test with Varnish makes Varnish look even better, as we are hitting the same URLs over and over, but the results, especially the low CPU usage, lead us to believe Varnish is worth using on the site.
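To put the three runs side by side, the short Python snippet below computes each throughput figure relative to the stock solution pack. The numbers are the ones reported above; the dictionary labels are just shorthand for the three test runs.

```python
# Throughput (KB/sec) reported for the three Grinder runs above.
throughput = {
    "stock": 1106.59,
    "Solr query": 4874.92,
    "Varnish": 9808.21,
}

baseline = throughput["stock"]

# Ratio of each run's throughput to the stock solution pack.
speedups = {name: kb / baseline for name, kb in throughput.items()}

for name, factor in speedups.items():
    print(f"{name}: {factor:.1f}x the stock throughput")
```

By this measure the Solr change gives roughly 4.4 times the stock throughput and Varnish roughly 8.9 times, which lines up with the move from about 1 to about 5 to 10 requests per second.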

Since we have launched we have had as many as 75 concurrent users and response times are great even under load.

Configuring Drupal and Islandora for Varnish

Configure Drupal Performance

On the Drupal Performance admin page (admin/config/development/performance) we configured Drupal to cache and compress pages. We also aggregate and compress CSS and JavaScript.

Configure Islandora

On the Islandora config page (admin/islandora/configure) we disabled setting the cache headers.

If we enable the Generate/parse datastream HTTP cache headers option, Varnish doesn’t serve the page thumbnail images from its cache. On the plus side, we may get better browser caching of thumbnails.

We seemed to get better performance with Generate/parse datastream HTTP headers unchecked so we have left it off for now.

Installing and configuring Varnish

We installed Varnish on Ubuntu with sudo apt-get install varnish. We are currently using Varnish 3.0.2.

Varnish Configuration

We modified the default.vcl in /etc/varnish.

Our vcl file looks like this:

# This is a basic VCL configuration file for varnish. See the vcl(7)
# man page for details on VCL syntax and semantics.
#
# Default backend definition. Set this to point to your content
# server.
#
backend default {
    .host = "";
    .port = "8090";
    .connect_timeout = 30s;
    .first_byte_timeout = 30s;
    .between_bytes_timeout = 30s;
}

sub vcl_recv {
    // Remove has_js and Google Analytics __* cookies.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    // Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    // Remove empty cookies.
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }

    // In testing, pipe seemed to give us better results than pass.
    if (req.url ~ "^/adore-djatoka") {
        unset req.http.Cookie;
        return (pipe);
    }

    if (req.url ~ "\.(png|gif|jpg|js|css)$") {
        unset req.http.Cookie;
        return (lookup);
    }

    if (req.url ~ "^/search") {
        unset req.http.Cookie;
        return (pass);
    }

    if (req.request == "GET" || req.request == "HEAD") {
        return (lookup);
    }
}

sub vcl_pipe {
    #
    # This forces every pipe request to be the first one.
    set bereq.http.connection = "close";
}
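The order of the checks in vcl_recv matters, so it can help to trace a few requests through them. The Python sketch below mirrors the cookie cleanup and routing decisions of the VCL above for illustration only; it is not part of the Varnish setup, and the function names `strip_cookies` and `route` are hypothetical. A return value of "default" stands for falling through to Varnish's built-in vcl_recv logic.

```python
import re


def strip_cookies(cookie):
    """Mirror the three cookie steps at the top of vcl_recv."""
    # Remove has_js and Google Analytics __* cookies.
    cookie = re.sub(r"(^|;\s*)(__[a-z]+|has_js)=[^;]*", "", cookie)
    # Remove a leading ";" if one is left behind.
    cookie = re.sub(r"^;\s*", "", cookie, count=1)
    # An empty header is dropped entirely (None stands for "unset").
    return None if re.match(r"^\s*$", cookie) else cookie


def route(method, url):
    """Return the action vcl_recv would choose for a request."""
    if re.search(r"^/adore-djatoka", url):
        return "pipe"      # Djatoka traffic bypasses the cache
    if re.search(r"\.(png|gif|jpg|js|css)$", url):
        return "lookup"    # static assets are served from the cache
    if re.search(r"^/search", url):
        return "pass"      # search results are never cached
    if method in ("GET", "HEAD"):
        return "lookup"    # other reads are cache candidates
    return "default"       # anything else falls through
```

For example, `route("GET", "/search/guardian")` returns "pass" even though it is a GET request, because the /search check comes before the generic GET/HEAD check.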

In /etc/default/varnish (Ubuntu/Debian) or /etc/sysconfig/varnish (Centos/Fedora) you will have to change your DAEMON_OPTS. Ours look like this:

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,5g"

You can see from the two config files that we have Varnish listening on port 80 and looking for the backend on port 8090.

Our Apache server is configured to listen on port 8090; other than that, Apache is using a standard Islandora type setup.

The timeouts in our VCL are pretty high and could probably be set a lot lower. With an earlier version of Varnish we were having some inconsistencies with loading times when using the OpenSeadragon viewer. The higher timeouts were left over from testing with the older version of Varnish and we will adjust them.

We have Varnish configured to use RAM (malloc) for its cache but this could be set to a file.

One thing we decided to do is pipe requests to Djatoka. Since Djatoka is already caching images we decided not to cache them twice.

We have also made some optimizations to Djatoka’s configs. Basically, we increased the number of tiles and images Djatoka would keep in its cache.

Note: We are not using the Varnish Drupal module.

There are many great resources for Varnish on the web. Pantheon has a great page regarding Varnish and Drupal.

Ai Weiwei and the Apocalypse

unalog - Tue, 2014-09-02 14:34

CS 171 - Visualization

unalog - Tue, 2014-09-02 07:17

