Planet Code4Lib - http://planet.code4lib.org
Lukas Koster: Looking for data tricks in Libraryland

Fri, 2014-09-05 12:12

IFLA 2014 Annual World Library and Information Congress Lyon – Libraries, Citizens, Societies: Confluence for Knowledge

After attending the IFLA 2014 Library Linked Data Satellite Meeting in Paris I travelled to Lyon for the first three days (August 17-19) of the IFLA 2014 Annual World Library and Information Congress. This year’s theme “Libraries, Citizens, Societies: Confluence for Knowledge” was named after the confluence or convergence of the rivers Rhône and Saône where the city of Lyon was built.

This was the first time I attended an IFLA annual meeting and it was very much unlike all conferences I have ever attended. Most of them are small and focused. The IFLA annual meeting is very big (but not as big as ALA) and covers a lot of domains and interests. The main conference lasts a week, including all kinds of committee meetings, and has more than 4000 participants and a lot of parallel tracks and very specialized Special Interest Group sessions. Separate Satellite Meetings are organized before the actual conference in different locations. This year there were more than 20 of them. These Satellite Meetings actually resemble the smaller and more focused conferences that I am used to.

A conference like this requires a lot of preparation and organization. Many people are involved, but I especially want to mention the hundreds of volunteers who were present not only in the conference centre but also at the airport, the railway stations, on the road to the location of the cultural evening, etc. They were all very friendly and helpful.

Another feature of such a large global conference is that presentations are held in a number of official languages, not only English. A team of translators is available for simultaneous translations. I attended a couple of talks in French, without translation headset, but I managed to understand most of what was presented, mainly because the presenters provided their slides in English.

It is clear that you have to prepare for the IFLA annual meeting and select in advance a number of sessions and tracks that you want to attend. With a large multi-track conference like this it is not always possible to attend all interesting sessions. In the light of a new data infrastructure project I recently started at the Library of the University of Amsterdam I decided to focus on tracks and sessions related to aspects of data in libraries in the broadest sense: “Cloud services for libraries – safety, security and flexibility” on Sunday afternoon, the all day track “Universal Bibliographic Control in the Digital Age: Golden Opportunity or Paradise Lost?” on Monday and “Research in the big data era: legal, social and technical approaches to large text and data sets” on Tuesday morning.

Cloud Services for Libraries

It is clear that the term “cloud” is a very ambiguous term and consequently a rather unclear concept. Which is good, because clouds are elusive objects anyway.

In the Cloud Services for Libraries session there were five talks in total. Kee Siang Lee of the National Library Board of Singapore (NLB) described the cloud based NLB IT infrastructure consisting of three parts: a private, a public and a hybrid cloud. The private (restricted access) cloud is used for virtualization, an extensive service layer for discovery, content, personalization, and “Analytics as a service”, which is used for pushing and recommending related content from different sources and of various formats to end users. This “contextual discovery” is based on text analytics technologies across multiple sources, using a Hadoop cluster on virtual servers. The public cloud is used for the Web Archive Singapore project which is aimed at archiving a large number of Singapore websites. The hybrid cloud is used for what is called the Enquiry Management System (EMS), where “sensitive data is processed in-house while the non-sensitive data resides in the cloud”. It seems that in Singapore “cloud” is just another word for a group of real or virtual servers.

In the talk given by Beate Rusch of the German Library Network Service Centre for Berlin and Brandenburg KOBV the term “cloud” meant: the shared management of data on servers located somewhere in Germany. KOBV is one of the German regional Library Networks involved in the CIB project targeted at developing a unified national library data infrastructure. This infrastructure may consist of a number of individual clouds. Beate Rusch described three possible outcomes: one cloud serving as a master for the others, a data roundabout linking the other clouds, and a cross cloud dataspace where there is an overlapping shared environment between the individual clouds. An interesting aspect of the CIB project is that cooperation with two large commercial library system vendors, OCLC and Ex Libris, is part of the official agreement. This is of interest for other countries that have vested interests in these two companies, like The Netherlands.

Universal Bibliographic Control in the Digital Age

The Universal Bibliographic Control (UBC) session was an all day track with twelve very diverse presentations. Ted Fons of OCLC gave a good talk explaining the importance of the transition from the description of records to the modeling of entities. My personal impression lately is that OCLC all in all has been doing a good job with linked data PR, explaining the importance and the inevitability of the semantic web for libraries to a librarian audience without using technical jargon like URI, ontology, dereferencing and the like. Richard Wallis of OCLC, who was at the IFLA 2014 Linked Data Satellite Meeting and in Lyon, is spreading the word all over the globe.

Of the rest of the talks the most interesting ones were given in the afternoon. Anila Angjeli of the National Library of France (BnF) and Andrew MacEwan of the British Library explained the importance, similarities and differences of ISNI and VIAF, both authority files with identifiers used for people (both real and virtual). Gildas Illien (also one of the organizers of the Linked Data Satellite Meeting in Paris) and Françoise Bourdon, both BnF, described the future of Universal Bibliographic Control in the web of data, which is a development closely related to the topic of the talks by Ted Fons, Anila Angjeli and Andrew MacEwan.

The ONKI project, presented by the National Library of Finland, is a very good example of how bibliographic control can be moved into the digital age. The project entails the transfer of the general national library thesaurus YSA to the new YSO ontology, from libraries to the whole public sector and from closed to open data. The new ontology is based on concepts (identified by URIs) instead of monolingual text strings, with multilingual labels and machine readable relationships. Moreover the management and development of the ontology is now a distributed process. On top of the ontology the new public online Finto service has been made available.
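To make the difference between text strings and concepts concrete, here is a minimal sketch in Python. It is not the YSO data model or the Finto API; the `Concept` class, the `example.org` URIs and the sample labels are all assumptions made up for illustration. It only shows the idea described above: a concept is identified by a URI, carries labels in several languages, and links to other concepts through machine readable relationships.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Concept:
    """A concept identified by a URI rather than by a monolingual text string."""
    uri: str
    labels: Dict[str, str]                              # language code -> preferred label
    broader: List[str] = field(default_factory=list)    # URIs of broader concepts

    def label(self, lang: str, fallback: str = "en") -> str:
        # Return the label in the requested language, else the fallback language.
        return self.labels.get(lang, self.labels[fallback])


# Hypothetical concepts; the URIs and labels are placeholders, not real YSO entries.
ANIMALS = Concept("http://example.org/onto/c1",
                  {"fi": "eläimet", "sv": "djur", "en": "animals"})
CATS = Concept("http://example.org/onto/c2",
               {"fi": "kissat", "sv": "katter", "en": "cats"},
               broader=[ANIMALS.uri])

INDEX = {c.uri: c for c in (ANIMALS, CATS)}


def broader_labels(concept: Concept, lang: str) -> List[str]:
    """Follow the machine readable 'broader' links and return labels in one language."""
    return [INDEX[uri].label(lang) for uri in concept.broader]


print(CATS.label("sv"))             # Swedish label of the same concept
print(broader_labels(CATS, "fi"))   # Finnish labels of its broader concepts
```

Because every label hangs off the same URI, switching the display language does not change which concept a record points to.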

The final talk of the day “The local in the global: universal bibliographic control from the bottom up” by Gordon Dunsire applied the “Think globally, act locally” aphorism to the Universal Bibliographic Control in the semantic web era. The universal top down control should make place for local bottom up control. There are so many old and new formats for describing information that we are facing a new biblical confusion of tongues: RDA, FRBR, MARC, BIBO, BIBFRAME, DC, ISBD, etc. What is needed are a number of translators between local and global data structures. On a logical level: Schema Translator, Term Translator, Statement Maker, Statement Breaker, Record Maker, Record Breaker. These black boxes are a challenge to developers. Indeed, mapping and matching of data of various types, formats and origins are vital in the new web of information age.
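As a rough illustration of what some of these black boxes might do, the sketch below implements toy versions of a Statement Breaker, a Term Translator and a Record Maker in Python. The component names come from the talk; everything else is an assumption of mine: the function names, the local field names `titel` and `auteur`, and the choice of `dc:title` and `dc:creator` as the global properties.

```python
def break_record(record_id, record):
    """Statement Breaker: split a record into (subject, property, value) statements."""
    statements = []
    for prop, values in record.items():
        # Accept both single values and lists of values for a field.
        for value in values if isinstance(values, list) else [values]:
            statements.append((record_id, prop, value))
    return statements


def translate_terms(statements, mapping):
    """Term Translator: rename local properties to global ones; unmapped ones are kept."""
    return [(s, mapping.get(p, p), o) for s, p, o in statements]


def make_record(statements):
    """Record Maker: regroup statements into records keyed by subject."""
    records = {}
    for s, p, o in statements:
        records.setdefault(s, {}).setdefault(p, []).append(o)
    return records


# Hypothetical local schema mapped onto Dublin Core style property names.
LOCAL_TO_DC = {"titel": "dc:title", "auteur": "dc:creator"}

local_record = {"titel": "Example title", "auteur": ["A. Author", "B. Author"]}
statements = break_record("rec1", local_record)
global_record = make_record(translate_terms(statements, LOCAL_TO_DC))
print(global_record)
```

Real mappings are rarely one-to-one like this, which is presumably where the challenge to developers lies.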

Research in the big data era

The Research in the big data era session had five presentations on essentially two different topics: data and text mining (four talks) and research data management (one talk). Peter Leonard of Yale University Library started the day with a very interesting presentation of how advanced text mining techniques can be used for digital humanities research. Using the digitized archive of Vogue magazine he demonstrated how the long term analysis of statistical distribution of related terms, like “pants”, “skirts”, “frocks”, or “women”, “girls”, can help visualise social trends and identify research questions. To do this there are a number of free tools available, like Google Books N-Gram Search and Bookworm. To make this type of analysis possible, researchers need full access to all data and text. However, rights issues come into play here, as Christoph Bruch of the Helmholtz Association, Germany, explained. What is needed is “intelligent openness” as defined by the Royal Society: data must be accessible, assessable, intelligible and usable. Unfortunately European copyright law stands in the way of the idea of fair use. Many European researchers are forced to perform their data analysis projects outside Europe, in the USA. The plea for openness was also supported by LIBER’s Susan Reilly. Data and text mining should be regarded as just another form of reading, that doesn’t need additional licenses.
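The kind of term distribution analysis mentioned above can be approximated in a few lines of Python. This is only a sketch under my own assumptions, not the method or tooling shown in the talk: the `relative_frequency` function, the tokenization rule and the two toy "issues" are invented for illustration. It counts how often each term occurs per 1000 words in each year, which is the raw material for a trend plot.

```python
import re
from collections import Counter, defaultdict


def relative_frequency(docs, terms):
    """docs: iterable of (year, text) pairs.

    Returns {year: {term: occurrences per 1000 words}} for the given terms.
    """
    totals = Counter()             # total word count per year
    hits = defaultdict(Counter)    # per-year counts of the wanted terms
    wanted = set(terms)
    for year, text in docs:
        words = re.findall(r"[a-z]+", text.lower())   # crude tokenizer (assumption)
        totals[year] += len(words)
        hits[year].update(w for w in words if w in wanted)
    return {
        year: {t: 1000 * hits[year][t] / totals[year] for t in terms}
        for year in sorted(totals)
        if totals[year]
    }


# Toy corpus standing in for a digitized magazine archive.
toy_docs = [
    (1920, "frocks and skirts and frocks"),
    (1970, "pants pants pants skirts"),
]
trend = relative_frequency(toy_docs, ["pants", "skirts", "frocks"])
for year, freqs in trend.items():
    print(year, freqs)
```

Normalizing by the number of words per year matters because the amount of text per year is rarely constant in a real archive.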

IdeasBox

IdeasBox packed

A very impressive and sympathetic library project that deserves everybody’s support was not an official programme item, but a bunch of crates, seats, tables and cushions spread across the central conference venue square. The whole set of furniture and equipment, which comes on two industrial pallets, constitutes a self supporting mobile library/information centre to be deployed in emergency areas, refugee camps etc. It is called IdeasBox, provided by Libraries without Borders. It contains mobile internet, servers, power supplies, ereaders, laptops, board games, books, etc., based on the circumstances, culture and needs of the target users and regions. The first IdeasBoxes are now used in Burundi in camps for refugees from Congo. Others will soon go to Lebanon for Syrian refugees. If librarians can make a difference, it’s here. You can support Libraries without Borders and IdeasBox in all kinds of ways: http://www.ideas-box.org/en/support-us.html.

IdeasBox unpacked

Conclusion

The questions about data management in libraries that I brought with me to the conference were only partly addressed, and actual practical answers and solutions were very rare. The management and mapping of heterogeneous and redundant types of data from all types of sources across all domains that libraries cover, in a flexible, efficient and system independent way apparently is not a mainstream topic yet. For things like that you have to attend Satellite Meetings. Legal issues, privacy, copyright, text and data mining, cloud based data sharing and management on the other hand are topics that were discussed. It turns out that attending an IFLA meeting is a good way to find out what is discussed, and more importantly what is NOT discussed, among librarians, library managers and vendors.

The quality and content of the talks vary a lot. As always the value of informal contacts and meetings cannot be overrated. All in all, looking back I can say that my first IFLA has been a positive experience, not in the least because of the positive spirit and enthusiasm of all organizers, volunteers and delegates.

(Special thanks to Beate Rusch for sharing IFLA experiences)

Open Knowledge Foundation: OKFestival 2014: we made it! A write-up & Thank You note

Fri, 2014-09-05 09:12

Open Knowledge Festival 2014! We built it, made it and ran it – it was a blast, thank you!

  • 1056 participants from 60 countries
  • 215 facilitators and moderators
  • 17 Programme Team members
  • 70 volunteers

made it all happen. Who says that numbers are dry? Just by writing them down, our hearts are melting.

Group work! – Pic by Gregor Fischer

Six weeks have passed since the end of OKFestival 2014. Many of you participated in our feedback survey, we have all caught up on lost sleep, and we are now hard at work on the public post-event report, which will be shared on the festival website in the next few weeks (keep your eyes peeled!).

At the festival, we tried a lot of experiments, and experimenting is both risky and thrilling – and you were up for the challenge! So we thought it was time to take a moment to have a look at what we built together and celebrate the challenges we bravely took on and the outcomes that came out of them (and, yes, there are also learnings from things which could have gone better – is there any event with bullet-proof WiFi? can a country not known to be tropical and not used to air conditioning experience a heat wave on the 2 days out of 365 when you’ll run an event?)

Rocking selfies! – Pic by Burt Lum

Summing it up:

  • an event for the whole open movement: we were keen to be the convenor of a global gathering, welcoming participants from all around the world and a multitude of folks from open communities, organisations, small and big NGOs, governments, grassroots initiatives as well as people new to the topic and willing to dive in. We wanted to create an environment connecting diverse audiences, thus enabling diverse groups of thinkers, makers and activists to come together and collaborate to effect change.

Ory Okolloh & Rufus Pollock fireside chat – Pic by Gregor Fischer

  • hands-on and outcome-driven approach: we wanted the event to be an opportunity to get together, make, share and learn with – and from – each other and get ready to make plans for what comes next. We didn’t want the event to be simply wonderful, we also wanted it to be useful – for you, your work and the future of the open movement. We’ve just started sharing a selection of your stories on our blog and more is yet to come this month, with the launch of our public post-OKFestival report, filled out with outcome stories you told us in the weeks after the event – who you met, what did you start to plan, what’s the new project coming out of the festival you’re already working on as we speak!

Meeting, talking, connecting! – Pic by Gregor Fischer

  • narrative streams: We made a bold choice – no streams-by-topic, but streams following a narrative. The event was fuelled by the theory that change happens when you bring together knowledge – which informs change – tools – which enable change – and society – which effects change. The Knowledge, Tools and Society streams aimed to explore the work we do and want to develop further beyond the usual silos which streams-by-topic could have created. Open hardware and open science, open government and open sustainability, open culture and open source, arts and privacy and surveillance.

Your vote, your voice! – Pic by Gregor Fischer

  • crowd sourced programme and participatory formats and tools (and powerpoints discouraged): We encouraged you to leave the comfort zone with no written presentations to read in sync with slides, but instead to create action-packed sessions in which all participants were contributing with their knowledge to work to be done together. We shared tips and tricks about creation and facilitation of such formats and hosted hangouts to help you propose your ideas for our open call – and hundreds of community members sent their proposals! In the most participatory of spirits, OKFestival also had its own unconference, the unFestival run by the great DATA Uruguay Team, who complemented our busy core programme with a great space where anyone could pitch and run her/his own emerging session on the spot, to give room and time to great newborn ideas and plans. And a shout out also goes to a couple of special tools: our etherpads – according to the OKFestival Pad of Pads 85 pads have been co-written and worked with – and our first code of collaboration which we hope will accompany us also in future ventures!

Green volunteering power – always on! – Pic by Gregor Fischer

  • diversity of backgrounds, experiences, cultures, domains: months before we started producing the festival, we started to get in touch with people from all around the world who were running projects we admired, and with whom we’d never worked together before. This guided us in building a diverse Programme Team first, and receiving proposals and financial aid applications from many new folks and countries later on. This surely contributed to the most exciting outcome of all – having a really international crowd at the event, people from 60 countries, speaking dozens of different languages. Different backgrounds enriched everybody’s learning and networking and nurtured new collaborations and relationships.

Wow, that was a journey. And it’s just the beginning! As we said, OKFestival aimed to be the fuel, the kick-off, the inspiration for terrific actions and initiatives to come and now it’s time to hear some of the most promising stories and projects started there!

You can get a first taste by following the ever-growing OKFestival Stories article series on our blog, and be ready for more: in the next weeks we’ll publish more outcomes, interviews, quotes and reports from you, the protagonists of it all.

Thank you again, and see you very soon!

Your OKFestival Team

Riley Childs: The Universal Library Search and Then Some, Part One

Fri, 2014-09-05 05:35

Let’s talk about universal searches… I recently had the pleasure of partaking in a focus group (actually more of a user group) at the Charlotte-Mecklenburg Library, here we talked about the upcoming plans to upgrade their ILS and what the users (patrons) wanted, one of the things we talked about in particular was the library’s […]

The post The Universal Library Search and Then Some, Part One appeared first on Riley Childs.

Tim Ribaric: Grad School Round Two: This time it's personal (Sabbatical Part 5)

Fri, 2014-09-05 01:32

 

Tomorrow I start grad school again. This time for serious & as an old man.

Karen Coyle: WP:NOTABILITY (and Women)

Fri, 2014-09-05 01:04
I've been spending quite a bit of time lately following the Wikipedia pages of "Articles for Deletion" or WP:AfD in Wikipedia parlance. This is a fascinating way to learn about the Wikipedia world. The articles for deletion fall mostly into a few categories:
  1. Brief mentions of something that someone once thought interesting (a favorite game character, a dearly loved soap opera star, a heartfelt local organization) but that has not been considered important by anyone else. In Wikipedian, it lacks WP:NOTABILITY.
  2. Highly polished P.R. intended to make someone or something look more important than it is, knowing that Wikipedia shows up high on search engine results, and that any site linked to from Wikipedia also gets its ranking boosted.
Some of #2 is actually created by companies that are paid to get their clients into Wikipedia along with promoting them in other places online. Another good example is that of authors of self-published books, some of whom appear to be more skilled in P.R. than they are in the literary arts.

In working through a few of the fifty or more articles proposed for deletion each day, you get to do some interesting sleuthing. You can see who has edited the article, and what else they have edited; any account that has only edited one article could be seen as a suspected bogus account created just for that purpose. Or you could assume that only one person in the English-speaking world has any interest in this topic at all.

Most of the work, though, is in seeing if you can establish notability. Notability is not a precise measure, and there are many pages of policy and discussion on the topic. The short form is that for something or someone to be notable, it has to be written about in respected, neutral, third-party publications. Thus a New York Times book review is good evidence of notability for a book, while a listing in the Amazon book department is not. The grey area is wide, however. Publishers Weekly may or may not indicate notability, since they publish only short paragraphs, and cover about 7,000 books a year. That's not very discriminating.

Notability can be tricky. I recently came across an article for deletion pointing to Elsie Finnimore Buckley, a person I had never heard of before. I discovered that her dates were 1882-1959, and she was primarily a translator of works from French into English. She did, though, write what appears to have been a popular book of Greek tales for young people.

As a translator, her works were listed under "E. F. Buckley." I can well imagine that if she had used her full name it would not have been welcome on the title page of the books she translated. Some of the works she translated appear to have a certain stature, such as works by Franz Funck-Brentano. She has an LC name authority file under "Buckley, E. F." although her full name is added in parentheses: "(Elsie Finnimore)".

To understand what it was like for women writers, one can turn to Linda Peterson's book "Becoming a Woman of Letters: Myths of Authorship and Facts of the Victorian Market." In that, she quotes a male reviewer of Buckley's Greek tales, which she did publish under her full name. His comments are enough to chill the aspirations of any woman writer. He said that writing on such serious topics is "not women's work" and that "a woman has neither the knowledge nor the literary tact necessary for it." (Peterson, p. 58) Obviously, her work as a translator is proof otherwise, but he probably did not know of that work.

Given this attitude toward women as writers (of anything other than embroidery patterns and luncheon menus) it isn't all that surprising that it's not easy to establish WP:NOTABILITY for women writers of that era. As Dale Spender says in "Mothers of the Novel: 100 good women writers before Jane Austen":
"If the laws of literary criticism were to be made explicit they would require as their first entry that the sex of the author is the single most important factor in any test of greatness and in any preservation for posterity." (p. 137)

That may be a bit harsh, but it illustrates the problem that one faces when trying to rectify the prejudices against women, especially from centuries past, while still wishing to provide valid proof that this woman's accomplishments are worthy of an encyclopedia entry.

We know well that many women writers had to use male names in order to be able to publish at all. Others, like E.F. Buckley, hid behind initials. Had her real identity been revealed to the reading public, she might have lost her work as a translator. Of late, J.K. Rowling has used both techniques, so this is not a problem that we left behind with the Victorian era. As I said in the discussion on Wikipedia:
"It's hard to achieve notability when you have to keep your head down."

Cherry Hill Company: Cherry Hill to present at DrupalCamp LA this weekend

Thu, 2014-09-04 22:03

Cherry Hill is looking forward to DrupalCamp LA this weekend! Come join us for some of our sessions to expand your Drupal knowledge. Whether you are a seasoned Drupal ninja, or a green newbie, LA Drupal community members, including the crew at Cherry Hill, will be on hand to show you some ins and outs of the Drupal world. 

Check out our sessions below:

Saturday: Morning InstallFest: Get PHP & Drupal running in under 15 minutes with Tommy Keswick

8:30am Pacific Ballroom AB 
InstallFest volunteers will help guide and verify the installation of PHP and/or Drupal on your personal laptop.

Drupal Camp Intro for Newbies with John Romine and Ashok Modi

8:40am Pacific Ballroom C
Pre-camp cup of coffee and a quick introduction to how to get...

HangingTogether: 939,594,891 library users worldwide — Prove me wrong!

Thu, 2014-09-04 20:19

That crunching you hear is the sound of the numbers available from OCLC’s Global Library Statistics page.

Over the past several years, the OCLC Library has been compiling data for the total number of libraries, librarians, volumes, expenditures, and users for every country and territory in the world, broken down into the major library types: academic, public, school, special and national.  The goal was to provide statistics on all libraries—not just OCLC libraries—that could be accessed and used by anyone.

A while back Dr. Frank Seeliger, Director of the Library at the Technical University of Applied Sciences in Wildau, Germany, contacted me about the statistics.  He asked if I could send him the actual data behind the site so that he could total up all the libraries, librarians, books, etc.  (At the time the information was only accessible country-by-country.)  I was happy to oblige, and here’s what he came up with.

Global library statistics summary

His request created the impetus for us to make the data available under an Open Data Commons Attribution License. Two spreadsheets provide information for countries and for U.S. states and Canadian provinces.  A third gives information on the over 80 sources that contribute data.

See the data for yourself!

The staff of the OCLC Library extracted data from respected third-party sources, both electronic and print, that in their judgment are the most current and accurate sources to which they have access. For many countries, data were either unavailable (indicated in the charts as NA) or sporadic. For a lot of the world, the data were not as current as we would have liked.
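If you want to reproduce this kind of worldwide total from per-country data, the main pitfall is the NA cells. The Python sketch below shows one way to handle them. It makes several assumptions that are not taken from the published spreadsheets: the column names, the CSV layout and the sample rows are invented, and the `column_totals` function is only an illustration.

```python
import csv
import io


def column_totals(csv_text, columns):
    """Sum numeric columns, counting 'NA' or empty cells as missing rather than zero.

    Returns (totals, missing), each a dict keyed by column name.
    """
    totals = {c: 0 for c in columns}
    missing = {c: 0 for c in columns}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for c in columns:
            cell = (row.get(c) or "").strip().replace(",", "")  # drop thousands separators
            if cell in ("", "NA"):
                missing[c] += 1
            else:
                totals[c] += int(cell)
    return totals, missing


# Made-up sample rows; the real spreadsheets may use different headers and formats.
SAMPLE = 'country,libraries,users\nA,10,"1,200"\nB,NA,300\nC,5,NA\n'

totals, missing = column_totals(SAMPLE, ["libraries", "users"])
print(totals)   # sums over the countries that reported a value
print(missing)  # how many countries had no value per column
```

Reporting the number of missing cells next to each total makes it clear how much of a ballpark figure the sum really is.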

We want to make these statistics as accurate as we can.  Once you’ve taken a look at the Global Library Statistics, take a look at the Sources and send me your suggestions or leave a comment below.  While $51 billion in library expenditures is nothing to sneeze at, it is, as Dr. Seeliger put it, a Hausnummer.  A ballpark figure.  And it’s not even adjusted for inflation!

Thanks for your help.

About Tam Dalrymple

Tam Dalrymple is Senior Information Specialist (reference librarian) at the OCLC Library in Dublin, Ohio. Prior to joining OCLC as a product manager some years back, Tam managed reference services at Ohio State and at the Columbus Metropolitan Library.

Jodi Schneider: Rating the evidence, citation by citation?

Thu, 2014-09-04 17:21

Publishers from HighWire Press are experimenting with a plugin called SocialCite. This is intended to rate the evidence, citation by citation. Like this:

SocialCite at PNAS, HighWire Press from http://www.pnas.org/content/108/14/5488.full#ref-list-1:

So far a few publishers (including PNAS) have implemented it as a pilot. The Journal of Bone and Joint Surgery is apparently leading this effort; I’d be really interested in speaking with them further.

Find out more about SocialCite from their website or the slidedeck from their debut at the HighWire Press meeting.

SocialCite makes its debut at the HighWire Press meeting from Kent Anderson

I’m *very* curious to hear what people think of this; it really surprised me.

LITA: LITA Updates

Thu, 2014-09-04 16:45

This is one of our periodic messages sent to all LITA members. This update includes information about:

  • LITA Forum Opportunities
  • New LITA Guides available

LITA Forum in Albuquerque

Two workshops, three keynotes, more than 30 concurrent sessions, poster sessions, and multiple networking opportunities await you.

The two preconference workshops begin on Wednesday, November 5, 1:00-5:00pm and run through Thursday, November 6, 8am to noon.

1) Learn Python by Playing with Library Data with Francis Kayiwa. Learn the basics of how to set up your Python environment, install useful packages, and write programs.

2) LinkedData for Libraries: How libraries can make use of Linked Open Data to share information about library resources and to improve discovery, access, and understanding for library users with Dean Krafft and Jon Corson-Rikert from Cornell University Library.

The three keynote speakers are:

AnnMarie Thomas, Engineering Professor at the University of St. Thomas. AnnMarie is the director of the UST Design Laboratory. Dr. Thomas co-founded and co-directs the University of St. Thomas Center for Pre-Collegiate Engineering Education. She served as the Founding Executive Director of the Maker Education Initiative. AnnMarie has also worked on robotics design, creation, and propulsion.

Lorcan Dempsey, Vice President, OCLC Research and Chief Strategist, oversees the research division and participates in planning at OCLC. Lorcan has policy, research and service development experience, mostly in the area of networked information and digital libraries.

Kortney Ryan Ziegler, Founder of Trans*H4CK, is an award winning artist, writer, and the first person to hold a PhD in African American Studies from Northwestern University. Trans*H4CK is the only tech event of its kind that spotlights trans* created technology, trans* entrepreneurs and trans* led startups.

Networking opportunities

All Forum sessions are in a single hotel, which facilitates networking opportunities. These include a first night reception, two nights of networking dinners (gather on site and then move off site to various restaurants), all conference meals on site (breakfasts, lunch) and lengthy breaks. Not to mention conversations in the hotel hallways and elevators. The first night reception launches the Sponsor Showcase where participants will have ample opportunities to meet with representatives from EBSCO, Springshare, and @MIRE both that evening and the next day. Our thanks go to all the Forum sponsors including Innovative and OCLC. Rachel Vacek, LITA President, and Thomas Dowling, LITA President-elect, have plans to lead two networking dinners focused on LITA specific Kitchen Conversations. LITA and the LITA Forum fully support the Statement of Appropriate Conduct at ALA Conferences.

Hope to see you in Albuquerque!

New LITA Guides

Two LITA Guides were published this summer: The Top Technologies Every Librarian Needs to Know, Kenneth Varnum, editor and contributor, and Using Massive Digital Libraries by Andrew Weiss with Ryan James.

The Top Technologies guide is focused on the impact a technology could have on staff, services, and patrons. An expert on each emerging technology talks about the technology within the near-term future of three to five years. In the introduction, Ken Varnum says, “Each chapter includes a thorough description of a particular technology: what it is, where it came from, and why it matters. We will look at early adopters or prototypes for the technology to see how it could be used more broadly. And then, having described a trajectory, we will paint a picture of how the library of the not-so-distant future could be changed by adopting and embracing that particular technology.”

Using Massive Digital Libraries examines “what Ryan James and (Andrew Weiss) in previous studies have together defined as massive digital libraries (MDLs). … A massive digital library is a collection of organized information large enough to rival the size of the world’s largest bricks-and-mortar libraries in terms of book collections. The examples examined in this book range from hundreds of thousands of books to tens of millions. This basic definition … is a starting point for discussion. As the book progresses this definition is refined further to make it more usable and relevant. This book will introduce more characteristics of MDLs and examine how they affect the current traditional library.”

I encourage you to connect with LITA by:

  1. Exploring our web site.
  2. Subscribing to LITA-L email discussion list. E-mail to sympa@ala.org with the subject line “subscribe lita-l”.
  3. Visiting the LITA blog and LITA Division page on ALA Connect.
  4. Connecting with us on Facebook and Twitter.
  5. Reaching out to the LITA leadership at any time.

Please note: the Information Technology and Libraries (ITAL) journal is available to you and to the entire profession. ITAL features high-quality articles that undergo rigorous peer-review as well as case studies, commentary, and information about topics and trends of interest to the LITA community and beyond. Be sure to sign up for notifications when new issues are posted (March, June, September, and December).

If you have any questions or wish to discuss any of these items, please do let me know.

All the best,

Mary

Mary Taylor, Executive Director
Library and Information Technology Association (LITA)
50 E. Huron, Chicago, IL 60611
800-545-2433 x4267
312-280-4267 (direct line)
312-280-3257 (fax)
mtaylor (at) ala.org
www.lita.org

Join us in Albuquerque, November 5-8, 2014 for the LITA Forum. The theme is “Transformation: From Node to Network”

District Dispatch: Free webinar: Understanding Social Security

Thu, 2014-09-04 16:26

Photo by the Knight Foundation

Do you know how to help your patrons locate information on Supplemental Security Income or Social Security? The American Library Association (ALA) is encouraging librarians to participate in “My SSA,” a free webinar that will teach participants how to use My Social Security (MySSA), the online Social Security resource.

Presented by leaders and members of the development team of MySSA, this session will provide attendees with an overview of MySSA. The Social Security Administration is encouraging librarians, in addition to receiving benefits information in print, to create an online MySSA account to view and track benefits.

Attendees will learn about viewing earnings records and receiving instant estimates of their future Social Security benefits. Those already receiving benefits can check benefit and payment information and manage their benefits.

Speakers include:

  • Maria Artista-Cuchna, Acting Associate Commissioner, External Affairs
  • Kia Anderson, Supervisory Social Insurance Specialist
  • Arnoldo Moore, Social Insurance Specialist
  • Alfredo Padilia Jr., Social Insurance Specialist
  • Diandra Taylor, Management Analyst

Date: Wednesday, September 17, 2014
Time: 2:00 PM – 3:00 PM EDT
Register for the free event

If you cannot attend this live session, a recorded archive will be available. To view past webinars also hosted collaboratively with iPAC, please visit Lib2Gov.org.

The post Free webinar: Understanding Social Security appeared first on District Dispatch.

Library of Congress: The Signal: DPOE Working Group Moves Forward on Curriculum

Thu, 2014-09-04 13:03

The working group at their recent meeting. Photo by Julio Diaz.

For many organizations that are just starting to tackle digital preservation, it can be a daunting challenge – and particularly difficult to figure out the first steps to take.  Education and training may be the best starting point, creating and expanding the expertise available to handle this kind of challenge.  The Digital Preservation Outreach and Education  program here at the Library aims to do just that, by providing the materials as well as the hands-on instruction to help build the expertise needed for current and future professionals working on digital preservation.

Recently, the Library was host to a meeting of the DPOE Working Group, consisting of a core group of experts and educators in the field of digital preservation.  The Working Group participants were Robin Dale (Institute of Museum and Library Services), Sam Meister (University of Montana-Missoula), Mary Molinaro (University of Kentucky), and Jacob “Jake” Nadal (Princeton University).  The meeting was chaired by George Coulbourne of the Library of Congress, and Library staffers Barrie Howard and Kris Nelson also participated.

The main goal of the meeting was to update the existing DPOE Curriculum, which is used as the basis for the Program’s training workshops and subsequently by the trainees themselves.  A survey is being conducted to gather even more information, and will help inform this curriculum as well (see a related blog post).  The Working Group reviewed and edited all six substantive modules, which are based on terms from the OAIS Reference Model framework:

  • Identify   (What digital content do you have?)
  • Select   (What portion of your digital content will be preserved?)
  • Store   (What issues are there for long-term storage?)
  • Protect  (What steps are needed to protect your digital content?)
  • Manage   (What provisions are needed for long-term management?)
  • Provide   (What considerations are there for long-term access?)

The group also discussed adding a seventh module on implementation.  Each of these existing modules contains a description, goals, concepts and resources designed to be used by current and/or aspiring digital preservation practitioners.

Mary Molinaro, Director, Research Data Center at the University of Kentucky Libraries, noted that “as we worked through the various modules it became apparent how flexible this curriculum is for a wide range of institutions.  It can be adapted for small, one-person cultural heritage institutions and still be relevant for large archives and libraries.”

Mary also spoke to the advantages of having a focused, group effort to work through these changes: “Digital preservation has some core principles, but it’s also a discipline subject to rapid technological change.  Focusing on the curriculum together as an instructor group allowed us to emphasize those things that have not changed while at the same time enhancing the materials to reflect the current technologies and thinking.”

These curriculum modules are currently in the process of further refinement and revision, including an updated list of resources. The updated version of the curriculum will be available later this month. The Working Group also recommended some strategies for extending the curriculum to address executive audiences, and how to manage the process of updating the curriculum going forward.

Peter Murray: Thursday Threads: History of the Future, Kuali change-of-focus, 2018 Mindset List

Thu, 2014-09-04 10:22

This week’s threads are a mixture of the future, the present and the past. Starting things off is A History of the Future in 100 Objects, a revealing look at what technology and society have in store for us. Parts of this resource are available freely on the website with the rest available as a $5 e-book. Next, in the present, is the decision by the Kuali Foundation to shift to a for-profit model and what it means for open source in the academic domain. And finally, a look at the past with the mindset list for the class of 2018 from Beloit College.

Feel free to send this to others you think might be interested in the topics. If you find these threads interesting and useful, you might want to add the Thursday Threads RSS Feed to your feed reader or subscribe to e-mail delivery using the form to the right. If you would like a more raw and immediate version of these types of stories, watch my Pinboard bookmarks (or subscribe to its feed in your feed reader). Items posted to Pinboard are also sent out as tweets; you can follow me on Twitter. Comments and tips, as always, are welcome.

A History of the Future in 100 Objects

What are the 100 objects that future historians will pick to define our 21st century? A javelin thrown by an ‘enhanced’ Paralympian, far further than any normal human? Virtual reality interrogation equipment used by police forces? The world’s most expensive glass of water, mined from the moons of Mars? Or desire modification drugs that fuel a brand new religion?
A History of the Future in 100 Objects describes a hundred slices of the future of everything, spanning politics, technology, art, religion, and entertainment. Some of the objects are described by future historians; others through found materials, short stories, or dialogues. All come from a very real future.

- About A History of the Future, by Adrian Hon

I was turned on to this book-slash-website-slash-resource by a tweet from Herbert Van de Sompel:

I'm assuming @apple doesn't believe in the future – "A history of the Future in 100 objects" not in iBooks / @cni_org http://t.co/dK5OI4JuIr

— Herbert (@hvdsomp) August 21, 2014


The name is intriguing, right? I mean, A History of the Future in 100 Objects? What does it mean to have a “History of the Future”?

The answer is an intriguing book that places the reader in the year 2082 looking back at the previous 68 years. (Yes, if you are doing the math, the book starts with objects from 2014.) Whether it is high-tech gizmos or the impact of world events, the author makes a projection of what might happen by telling the brief story of an artifact. For those in the library arena, you want to read about the reading rooms of 2030, but I really suggest starting at the beginning and working your way through the vignettes from the book that the author has published on the website. There is a link in the header of each page that points to e-book purchasing options.

Kuali Reboots Itself into a Commercial Entity

Despite the positioning that this change is about innovating into the next decade, there is much more to this change than might be apparent on the surface. The creation of a for-profit entity to “lead the development and ongoing support” and to enable “an additional path for investment to accelerate existing and create new Kuali products” fundamentally moves Kuali away from the community source model. Member institutions will no longer have voting rights for Kuali projects but will instead be able to “sit on customer councils and will give feedback about design and priority”. Given such a transformative change to the underlying model, there are some big questions to address.

- Kuali For-Profit: Change is an indicator of bigger issues, by Phil Hill, e-Literate

As Phil noted in yesterday’s post, Kuali is moving to a for-profit model, and it looks like it is motivated more by sustainability pressures than by some grand affirmative vision for the organization. There has been a long-term debate in higher education about the value of “community source,” which is a particular governance and funding model for open source projects. This debate is arguably one of the reasons why Indiana University left the Sakai Foundation (as I will get into later in this post). At the moment, Kuali is easily the most high-profile and well-funded project that still identifies itself as Community Source. The fact that this project, led by the single most vocal proponent for the Community Source model, is moving to a different model strongly suggests that Community Source has failed.
It’s worth taking some time to talk about why it has failed, because the story has implications for a wide range of open-licensed educational projects. For example, it is very relevant to my recent post on business models for Open Educational Resources (OER).

- Community Source Is Dead, by Michael Feldstein, e-Literate blog

I touched on the cosmic shift in the direction of Kuali on DLTJ last week, but these two pieces from Phil Hill and Michael Feldstein on the e-Literate blog cover it in more depth. I have certainly been a proponent of the open source method of building software and the need for sustainable open source software to develop a community around that software. But I can’t help but think there is more to this story than meets the eye: that there is something about a lack of faith by senior university administrators in having their own staff own the needs and issues of their institutions. Or maybe it has something to do with the high levels of fiscal commitment to elaborate “community source” governance structures. In thinking about what happened with Kuali, I can’t help but compare it to the reality of Project Hydra, where libraries participate with in-kind donations of staff time, travel expenses and good will to a self-governing organization that has only as much structure as it needs.

The 2018 Mindset List

Students heading into their first year of college this year were generally born in 1996.

Among those who have never been alive in their lifetime are Tupac Shakur, JonBenet Ramsey, Carl Sagan, and Tiny Tim.

On Parents’ Weekend, they may want to watch out in case Madonna shows up to see daughter Lourdes Maria Ciccone Leon or Sylvester Stallone comes to see daughter Sophia.

For students entering college this fall in the Class of 2018…

- 2018 List, by Tom McBride and Ron Nief, Beloit College Mindset List

So begins the annual “mindset list” — a tool originally developed to help the Beloit College instructors use cultural references that were relevant to the students entering their classrooms. I didn’t see as much buzz about it this year in my social circles, so I wanted to call it out (if for no other reason than to make you feel just a little older…).

Peter Murray: Blocking /xmlrpc.php Scans in the Apache .htaccess File

Thu, 2014-09-04 02:41

Someone out there on the internet is repeatedly hitting this blog’s /xmlrpc.php service, probably looking to enumerate the user accounts on the blog as a precursor to a password scan (as described in Huge increase in WordPress xmlrpc.php POST requests at Sysadmins of the North). My access logs look like this:

176.227.196.86 - - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 195.154.136.19 - - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 176.227.196.86 - - [04/Sep/2014:02:18:19 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 176.227.196.86 - - [04/Sep/2014:02:18:21 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 176.227.196.86 - - [04/Sep/2014:02:18:22 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 176.227.196.86 - - [04/Sep/2014:02:18:24 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 195.154.136.19 - - [04/Sep/2014:02:18:24 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)" 176.227.196.86 - - [04/Sep/2014:02:18:26 +0000] "POST /xmlrpc.php HTTP/1.0" 200 291 "-" "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0)"

By itself, this is just annoying — but the real problem is that the PHP stack is getting invoked each time to deal with the request, and at several requests per second from different hosts this was putting quite a load on the server. I decided to fix the problem with a slight variation from what is suggested in the Sysadmins of the North blog post. This addition to the .htaccess file at the root level of my WordPress instance rejects the connection attempt at the Apache level rather than the PHP level:

RewriteCond %{REQUEST_URI} =/xmlrpc.php [NC]
RewriteCond %{HTTP_USER_AGENT} .*Mozilla\/4.0\ \(compatible:\ MSIE\ 7.0;\ Windows\ NT\ 6.0.*
RewriteRule .* - [F,L]

Which means:

  1. If the requested path is /xmlrpc.php, and
  2. you are sending this particular agent string, then
  3. send back a 403 error message and don’t bother processing any more Apache rewrite rules.
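The three steps above can be sketched in Python to make the decision logic explicit. This is only an illustration of what the rewrite rule does, not something Apache runs: the function name `response_status` is made up for this sketch, and the user agent check is simplified to a plain substring test, whereas the real RewriteCond is a regular expression.

```python
# Illustrative model of the rewrite rule; Apache does the real work.
# SCANNER_AGENT is the literal text the RewriteCond pattern looks for.
SCANNER_AGENT = "Mozilla/4.0 (compatible: MSIE 7.0; Windows NT 6.0"


def response_status(path, user_agent):
    """Return 403 when both conditions match, else None (rule not applied)."""
    # Condition 1: the requested path is /xmlrpc.php ([NC] = ignore case).
    path_matches = path.lower() == "/xmlrpc.php"
    # Condition 2: the client sends the scanner's agent string
    # (simplified here to a substring test).
    agent_matches = SCANNER_AGENT in user_agent
    if path_matches and agent_matches:
        # [F] sends 403 Forbidden; [L] stops further rewrite processing.
        return 403
    # Otherwise the request carries on to WordPress as usual.
    return None
```

Because both conditions must hold, a legitimate XML-RPC client with a different agent string, or the scanner's agent string on any other path, is left alone.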

If you need to use this rule yourself, you might find that the HTTP_USER_AGENT string has changed. You can copy the user agent string from your Apache access logs, but remember to preface each space and each parenthesis with a backslash.
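A small Python helper can do that escaping for you. This is a minimal sketch: the function names `escape_user_agent` and `rewrite_cond` are hypothetical, and it only escapes spaces and parentheses as described above, so any other regular expression characters in the agent string (such as the dots) are left as they are.

```python
def escape_user_agent(user_agent):
    """Put a backslash in front of each space and each parenthesis."""
    escaped = []
    for ch in user_agent:
        if ch in " ()":
            escaped.append("\\" + ch)
        else:
            escaped.append(ch)
    return "".join(escaped)


def rewrite_cond(user_agent):
    """Build a RewriteCond line that matches the given agent string anywhere."""
    return "RewriteCond %{HTTP_USER_AGENT} .*" + escape_user_agent(user_agent) + ".*"
```

Passing in the agent string copied from the access log gives a line that can be pasted into the .htaccess file in place of the second RewriteCond.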

Peter Murray: 2nd Workshop on Sustainable Software for Science: Practice and Experiences — Accepted Papers and Travel Support

Thu, 2014-09-04 02:08

The conference organizers for WSSSPE2 have posted the list of accepted papers and the application for travel support. I was on the program committee for this year’s conference, and I can point to some papers that I think are particularly useful to libraries and the cultural heritage community in general:

William Denton: Moodie's Tale

Thu, 2014-09-04 01:19

Somebody said we need a Moo for libraries. We still do. But I just read Moodie’s Tale by Eric Wright and I think it’s the Moo of Canadian academia. I don’t know Susanna Moodie or The Canterbury Tales so I think I’m missing a fair bit, but I still enjoyed it very much.

There are a few mentions of libraries, like this:

“Here’s an example,” the president continued. “I propose that henceforth you fellows be called ‘deans.’ Most places have deans nowadays. Sound the others out to see if there’s a problem. Now what else? What else does a college have? A proper college.”

“A library?”

“We’ve got one of sorts, haven’t we? In the corner room of the Drug Mart.”

“Just a few shelves, Gravely. Not many of the faculty know about it. It ought to have some standard reference works. Encyclopedias, that kind of thing.”

“We can afford a couple of thousand from the cleaning budget. Draw up a list. But now you’ve mentioned it, what is the real mark of a library?”

“Other than books?”

“Yes. What else?”

“A copying machine?”

“What else?”

It was important to guess right. Cunningham was getting impatient. “I am not sure of your emphasis, Gravely,” he hedged.

“Emphasis? How do you know it is a library?”

“The sign on the door?”

“Exactly. The label, William, the label. Get a sign made. And what do people find inside the door?”

“The librarian?”

“Now you’re on to it. Apart from the sign, the cheapest thing in the library is the librarian, especially since they aren’t unionized. We could put anyone in and call him the librarian. Now who have we got?”

“Beckett?”

Beckett was a religious maniac, a clerk in the maintenance department who spent his hours walking the streets with a billboard, warning of the end. His fellow workers complained constantly of his proselytizing in the storeroom.

“Perfect. He’s a bit more eccentric than most librarians, I suppose, but he’ll do. Is he conscientious?”

“It’s the other thing his colleagues dislike about him.”

“Done, then.”

Islandora: Varnish, Islandora, and Islandnewspapers.ca

Thu, 2014-09-04 00:24
Varnish and Islandora

Below you will find some information on how UPEI's Robertson Library configured Varnish for use with Islandora. Currently we have Varnish running on our Newspaper site and it is working well with the OpenSeadragon viewer, but we have not tested with the IA Bookviewer yet.

Why use Varnish?

At Robertson Library we have been digitizing the Guardian newspaper for a while now. We expected there would be a good amount of traffic to this site when it went live so prior to launch we wanted to do some benchmarks. We also noticed with the stock Islandora Newspaper solution pack that loading the Guardian newspaper page was very slow and we expected we would have to try to optimize things to handle load.

The benchmarks we used were pretty simple and were really just a way to help us determine whether or not an optimization was worth keeping. We used The Grinder, a Java based load testing framework.

We loaded Grinder with a simple scenario: hit the homepage, the main Guardian newspaper page, a newspaper page (in the OpenSeadragon viewer), and the main Guardian page again. The main Guardian page is the one that lists all the issues of the Guardian; we have almost 20,000 issues so far. Grinder was configured to hit these pages 250 times with 50 threads.
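For readers who have not used Grinder, the shape of such a scenario can be sketched with the Python standard library. This is not the Grinder script that was used: the paths in `SCENARIO` are hypothetical placeholders for the four pages, the function names are made up, and it assumes "250 times with 50 threads" means 250 scenario runs in total shared across 50 worker threads.

```python
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor

# Hypothetical paths standing in for: homepage, main Guardian page,
# one newspaper page in the viewer, and the main Guardian page again.
SCENARIO = ["/", "/guardian", "/guardian/page/1", "/guardian"]


def fetch(url):
    """Fetch one URL and return the number of bytes received."""
    with urllib.request.urlopen(url, timeout=60) as response:
        return len(response.read())


def run_scenario(base_url, fetch_page=fetch):
    """Request every page in the scenario once, in order."""
    return sum(fetch_page(base_url + path) for path in SCENARIO)


def run_load_test(base_url, runs=250, threads=50, fetch_page=fetch):
    """Run the scenario `runs` times across `threads` worker threads."""
    started = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        sizes = list(pool.map(lambda _: run_scenario(base_url, fetch_page), range(runs)))
    elapsed = time.perf_counter() - started
    return {
        "requests": runs * len(SCENARIO),
        "bytes": sum(sizes),
        "seconds": elapsed,
    }
```

Dividing `requests` and `bytes` by `seconds` gives rough requests-per-second and throughput figures of the kind quoted in the results that follow.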

Our first run at it was with the stock Islandora Newspaper solution pack.

The numbers were not great with the stock Islandora Newspaper solution pack: we could handle about 1 request per second and we were starting to receive some errors. Total throughput was 1106.59 KB/sec. CPU usage on the server was very high, with all cores pretty steady at or near 100%.

The biggest problem seemed to be hitting the resource index over and over again and manipulating the resulting array. So to try and speed things up a little we modified the code to query Solr instead of the Resource Index.

Test results with Solr query.

By querying Solr we were able to speed things up quite a bit. We were now getting close to 5 requests per second, no errors and a throughput of 4874.92 KB/sec. Our CPU usage was still very high, all cores at or near 100%.

We couldn’t see other ways to make the main Guardian page load faster without significantly changing how the Newspaper solution pack worked. Dynamically listing almost 20,000 issues on one page was going to take time no matter how we did it, unless we broke the page up into several requests. Breaking the page up into several requests would not be ideal either, as we would have to make roundtrips to the server to get the list of years available as well as all issues for a selected year. Instead of breaking this page up into several requests we discussed caching it.

So our next step was to install and configure Varnish so that this page would be cached. With Varnish installed and configured we ran the same Grinder tests.

Test with Varnish enabled

By using Varnish our numbers improved again. We were now handling 10 requests per second, with no errors and a throughput of 9808.21 KB/sec. Our CPU usage was way down, with all our cores between 3% and 20% usage (most were closer to 3%). By using Varnish we got a speed boost, but I think the biggest advantage will be in the number of users we can handle, as our most expensive requests now come from the cache with little server overhead.

Of course, using Grinder to test with Varnish makes Varnish look even better, as we are hitting the same URLs over and over, but the results, especially the low CPU usage, lead us to believe Varnish is worth using on the Islandnewspapers.ca site.
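To put the three runs side by side, the throughput figures reported above can be turned into ratios against the stock solution pack. This is a small worked calculation on the published numbers only; the names `THROUGHPUT_KB_PER_SEC` and `speedups` are made up for the sketch.

```python
# Throughput in KB/sec as reported for each benchmark run.
THROUGHPUT_KB_PER_SEC = {
    "stock solution pack": 1106.59,
    "Solr query": 4874.92,
    "Varnish": 9808.21,
}


def speedups(runs, baseline):
    """Express each run's throughput as a multiple of the baseline run."""
    base = runs[baseline]
    return {name: value / base for name, value in runs.items()}


ratios = speedups(THROUGHPUT_KB_PER_SEC, "stock solution pack")
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.1f}x")
```

On these figures the Solr change is roughly a 4.4x improvement over the stock setup and Varnish roughly 8.9x, which is about double the Solr-only result. As noted, the repeated URLs in the test favour the cache, so real traffic may not see the same gain.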

Since we launched we have had as many as 75 concurrent users, and response times are great even under load.

Configuring Drupal and Islandora for Varnish

Configure Drupal Performance

On the Drupal Performance admin page (admin/config/development/performance) we configured Drupal to cache and compress pages. We also aggregate and compress CSS and JavaScript.

Configure Islandora

On the Islandora config page (admin/islandora/configure) we disabled setting the cache headers.

If we enable the Generate/parse datastream HTTP cache headers option, Varnish doesn’t serve the page thumbnail images from its cache; on the plus side, we may get better browser caching of thumbnails.

We seemed to get better performance with Generate/parse datastream HTTP headers unchecked so we have left it off for now.

Installing and configuring Varnish

We installed Varnish on Ubuntu with sudo apt-get install varnish. We are currently using Varnish 3.0.2.

Varnish Configuration

We modified the default.vcl in /etc/varnish.

Our vcl file looks like this:

# This is a basic VCL configuration file for varnish. See the vcl(7)
# man page for details on VCL syntax and semantics.
#
# Default backend definition. Set this to point to your content
# server.
#
backend default {
    .host = "127.0.0.1";
    .port = "8090";
    .connect_timeout = 30s;
    .first_byte_timeout = 30s;
    .between_bytes_timeout = 30s;
}

sub vcl_recv {
    // Remove has_js and Google Analytics __* cookies.
    set req.http.Cookie = regsuball(req.http.Cookie, "(^|;\s*)(__[a-z]+|has_js)=[^;]*", "");
    // Remove a ";" prefix, if present.
    set req.http.Cookie = regsub(req.http.Cookie, "^;\s*", "");
    // Remove empty cookies.
    if (req.http.Cookie ~ "^\s*$") {
        unset req.http.Cookie;
    }

    // In testing, pipe seemed to give us better results than pass.
    if (req.url ~ "^/adore-djatoka") {
        unset req.http.Cookie;
        return (pipe);
    }

    if (req.url ~ "\.(png|gif|jpg|js|css)$") {
        unset req.http.Cookie;
        return (lookup);
    }

    if (req.url ~ "^/search") {
        unset req.http.Cookie;
        return (pass);
    }

    if (req.request == "GET" || req.request == "HEAD") {
        return (lookup);
    }
}

sub vcl_pipe {
    # http://www.varnish-cache.org/ticket/451
    # This forces every pipe request to be the first one.
    set bereq.http.connection = "close";
}
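For readers less familiar with VCL, the request handling in vcl_recv above can be approximated in Python. This is only an illustrative model and not part of the Varnish setup: the function names `strip_tracking_cookies` and `route` are made up, and the value "default" stands for falling through to the built-in vcl_recv logic that Varnish applies when no return statement is reached.

```python
import re


def strip_tracking_cookies(cookie):
    """Mirror the three cookie clean-up steps at the top of vcl_recv."""
    # Remove has_js and Google Analytics __* cookies.
    cookie = re.sub(r"(^|;\s*)(__[a-z]+|has_js)=[^;]*", "", cookie)
    # Remove a ";" prefix, if present.
    cookie = re.sub(r"^;\s*", "", cookie, count=1)
    # An empty cookie header is dropped entirely.
    if re.search(r"^\s*$", cookie):
        return None
    return cookie


def route(method, url):
    """Return the action vcl_recv would choose, checking rules in order."""
    if re.search(r"^/adore-djatoka", url):
        return "pipe"      # Djatoka does its own caching
    if re.search(r"\.(png|gif|jpg|js|css)$", url):
        return "lookup"    # static assets are served from the cache
    if re.search(r"^/search", url):
        return "pass"      # search results always go to the backend
    if method in ("GET", "HEAD"):
        return "lookup"    # everything else readable is cacheable
    return "default"       # assumption: built-in vcl_recv takes over
```

The order of the checks matters: a Djatoka URL that happens to end in .jpg is still piped, because the Djatoka rule is tested first.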

In /etc/default/varnish (Ubuntu/Debian) or /etc/sysconfig/varnish (Centos/Fedora) you will have to change your DAEMON_OPTS. Ours look like this:

DAEMON_OPTS="-a :80 \
             -T localhost:6082 \
             -f /etc/varnish/default.vcl \
             -S /etc/varnish/secret \
             -s malloc,5g"

You can see from the two config files that we have Varnish listening on port 80 and looking for the backend on port 8090.

Our Apache server is configured to listen on port 8090; other than that, Apache is using a standard Islandora-type setup.

The timeouts in our VCL are pretty high and could probably be set a lot lower. With an earlier version of Varnish we were having some inconsistencies with loading times when using the OpenSeadragon viewer. The higher timeouts were left over from testing with that older version, and we will adjust them.

We have Varnish configured to use RAM (malloc) for its cache, but this could be set to a file.

One thing we decided to do is pipe requests to Djatoka. Since Djatoka is already caching images we decided not to cache them twice.

We have also made some optimizations to Djatoka’s configs. Basically, we increased the number of tiles and images Djatoka would keep in its cache.

Note: We are not using the Varnish Drupal module.

There are many great resources for Varnish on the web. Pantheon has a great page regarding Varnish and Drupal.
