planet code4lib

Patrick Hochstenbach: Homework assignment #1 Sketchbookskool #BootKamp

Sat, 2015-06-13 12:23
Filed under: Sketchbook Tagged: apple, copic, moleskine, sketch, sketchbook, sketchbookskool, watercolor

District Dispatch: Library youth and tech leaders on the Hill: We build a bridge between knowledge and passion

Fri, 2015-06-12 20:44

Participants in ALA’s June 2015 Capitol Hill event on youth and technology. Left to right: Nicola McDonald, Jesse Sanders, Mega Subramaniam, Sari Feldman.

“Libraries are often the sole or primary technological access point for their communities,” said U.S. Representative Marcia Fudge (D-Ohio) yesterday, kicking off a Capitol Hill event exploring the role of libraries in preparing children and teens for higher education and the workforce. In my childhood, “I went to the library every day. It opened up worlds I didn’t even know existed.”

Co-hosted by the American Library Association’s Washington Office and Rep. Fudge’s office, the event, entitled “Kids, Learning, and Technology: Libraries as 21st Century Creative Spaces,” convened library leaders in the youth and technology space to discuss strategies for advancing digital literacy, critical thinking and creative expression through technology-driven programming and services.

Congresswoman Fudge’s powerful words at the outset of the event set the tone for the panel discussion to follow. Moderated by ALA President-elect and Cuyahoga County (Ohio) Public Library (CCPL) Executive Director Sari Feldman, the discussants included Nicola McDonald of the New York Public Library (NYPL), Jesse Sanders of CCPL’s Warrensville Branch, and Professor Mega Subramaniam of the University of Maryland’s College of Information Studies.

Together, the panelists’ remarks painted a picture of the library as an equal-opportunity on-ramp to the technology economy. Nicola McDonald outlined how NYPL helps young people build science and tech skills through gaming and hands-on community activities; Jesse Sanders explained how his library offers young people digital tools that foster creative, collaborative learning; and Mega Subramaniam described school libraries as unmatched in their ability to help young people build digital skills through personalized learning opportunities.

ALA President-elect Sari Feldman, Congresswoman Marcia Fudge (D-OH) embrace.

The panelists all made clear that the library’s role in preparing young people for the future extends beyond providing access to digital technologies and critical STEM information. Access is only part of the picture. At the University of Maryland’s College of Information Studies, a program that simultaneously fosters literacy and scientific understanding among inner city Washington D.C. middle school students – known as Sci-Dentity – goes beyond simply presenting students with information in a textbook. The program encourages students to read science fiction books, watch science fiction movies, play interactive science games, and then write their own stories based upon their experiences. Similarly, a photography program at Brooklyn Public Library does more than provide patrons with informational resources and access to a camera – it also encourages patrons to take pictures in the community, and then present their work at a local art gallery.

The key is that in their efforts to help young people build critical skills for education and the workforce, school and public libraries don’t just provide learning opportunities, they provide connected learning opportunities. Other learning institutions may have informational resources on STEM topics; they may have a 3D printer or a CNC router. But, unlike libraries, they don’t provide environments in which young people can build skills through the use of these resources and technologies by pursuing their own personal interests.

Event participants in front of the U.S. Capitol.

What yesterday’s event made clear was that libraries help young people build a mental bridge between knowledge and passion. Libraries and librarians don’t just connect children and teens with important information. We help them understand how they can use that information to become exactly who they want to be. For libraries to continue to be youth education leaders, we must continue to help children and teens arrive at this understanding. As the ALA Washington Office ramps up its work on youth and technology, our members and staff will advocate for policies, programs, and initiatives that help libraries play this important role.

ALA thanks Venicia Gray of Rep. Fudge’s office for her hard work leading to this successful program, including a Congressional meeting room packed to capacity. Jessica McGilvray and Marijke Visser of the Washington Office orchestrated the event for ALA. The ALA Washington Office thanks YALSA for its assistance in enabling Nicola McDonald’s appearance at this program.

The post Library youth and tech leaders on the Hill: We build a bridge between knowledge and passion appeared first on District Dispatch.

Tim Ribaric: Code4Lib North 2015 St. Catharines: And so can you!

Fri, 2015-06-12 20:23

Just last week I had the opportunity to co-host the Code4Lib North 2015 meetup.

Roy Tennant: MARC Speaks: “When I Die”

Fri, 2015-06-12 20:09

A guest column by Marc Record.

I know that some people have been all too quick to call for my death, but we must look beyond such short-sighted little people toward the greater good, as we have always done.

In doing so, I must acknowledge that I am probably not long for this world. Not because I am not doing useful work, mind you, as I completely am, but because there are so many who call for my demise, and at some point you must realize your time is near at hand.

Thus your thoughts must betimes turn to the future, and the legacy you wish to leave to those coming after. And the legacy I wish to leave is one of description. To be specific, a description most OWL. Or Turtle. Or whatever. The point is to leave behind descriptions of useful resources that don’t require the old ways of doing things — frankly, our ways of doing things — as a requirement.

I mean, it’s all about the data. If the data that I have carried so faithfully for so many decades can be taken forward into the future, then great.

But I want you to promise me something. Back when I was born, I represented the cutting edge in technology. Sure, now I seem long in the tooth, and I am, but for many years I was a technological marvel. Most professions that relied on data (for example, doctors) were decades behind me. In a lot of ways, I was the poster child for capturing structured data controlled by rules that could be parsed by computers. It’s remarkable, really.

So that’s my ask. I will go quietly into that dark night but on one condition — that you don’t set your sights too low. That you really examine both the modern requirements for bibliographic data and the amazing opportunities that exist today.

And having reviewed all of this, you then make a bold, cutting edge, almost astonishing move. Like what happened when Henriette Avram gave birth to me.

Now that would be a legacy worthy of my name.


Picture by Schu, CC BY-NC-SA 2.0.

LITA: LITA Updates, June 2015

Fri, 2015-06-12 18:23

This is one of our periodic messages sent to all LITA members. This update includes information about the following:

  • Election Results
  • Learning Opportunities at Annual Conference
  • Annual Conference Highlights
  • LITA Executive Director Plans to Retire

Election Results

Please join in congratulating the newly elected LITA Board Members:

Aimee Fifarek, Vice-President/President-Elect,
Ken Varnum and Susan Sharpless Smith, Directors-at-Large for three-year terms.

Thanks go to the Nominating Committee, which included Karen G. Schneider (chair) and members Pat Ensor, Adriene Lim, and Chris Evjy.

LITA members elected to the ALA Council include: Eric Suess and Joan Weeks, Councilors-at-Large.

Congratulations to all, and thank you to every candidate who was willing to stand for office.

Learning Opportunities at Annual Conference

Three full-day workshops are being offered in San Francisco on Friday, June 26th. Two of the sessions are in the Moscone Convention Center; the third preconference is off site in a maker/hacker space. These are your choices:

  1. Creating Better Tutorials Through User-Centered Instructional Design. Hands-on workshop with experts from the University of Arizona. Moscone Convention Center 2008 (W)
  2. Learn to Teach Coding and Mentor Technology Newbies – in Your Library or Anywhere! Work with experts from Black Girls CODE to become master technology teachers. Moscone Convention Center 2010 (W)
  3. Build a Circuit & Learn to Program an Arduino in a Silicon Valley Hackerspace. This workshop will convene at Noisebridge, 2169 Mission Street, a hacker space in San Francisco. Clearly, it will be hands on.

To register for one of these three LITA workshops, simply go to the ALA Annual Conference registration site and sign up. If you are already registered for the conference, the workshop will be added to your registration. If you can’t attend the Annual Conference but a full-day workshop on Friday, June 26th, from 8:30 am to 4:00 pm would be perfect for you, you can still sign up through the same site: you do not have to register for the entire conference in order to register for a workshop. Registration will also be accepted on site, outside the classrooms, for the two workshops in the Moscone Center.

  • Register online through June 19
  • Call ALA Registration at 1-800-974-3084
  • Onsite registration will also be accepted in San Francisco.

Be sure to watch the LITA web sites for announcements about online learning opportunities that are being developed for July and August.

Annual Conference Highlights

The Open House on Friday, June 26, from 3:00 to 4:00pm, MCC-2005 (W), provides members and non-members alike an opportunity to explore with the LITA leadership the many opportunities within LITA. If there is a Committee or an Interest Group that might provide you with the leadership experience you are seeking, this is the perfect time to get some f2f advice. If you have ideas about how LITA might serve you better, this is the perfect time to share those ideas. If you are interested in programming or publications, if you are looking for people who share your interests in various aspects of technology, and/or if you are seeking a good conversation with engaged members, then you will want to attend the Open House.

“Sunday Afternoon with LITA” is scheduled for the Moscone Convention Center, 3014-3016 (W). The Afternoon starts with the popular Top Technology Trends program on June 28th from 1:00 to 2:00pm. This program features our ongoing roundtable discussion about trends and advances in library technology. The panel of experts includes: Carson Block, Andrea Davis, Grace Dunbar, Bonnie Tijerina, and Sarah Houghton, moderator.

A brief awards program at 3:00 will be followed by the LITA President’s Program. The award winners include:

  • Ed Summers, Frederick G. Kilgour Award for Research in Library and Information Technology,
  • David Walker, LITA/Library Hi Tech Award for Outstanding Communication in Library and Information Technology,
  • Heather Terrell, LITA/Ex Libris Student Writing Award for her paper “Reference is dead, long live reference: electronic collections in the digital age.”

Following the awards ceremony, you will want to stay for Rachel Vacek’s President’s Program with Lou Rosenfeld, Rosenfeld Media, which publishes some of the best-loved books in user experience, produces UX events, and equips UX teams with coaching and training.

The Top Technology Trends program, LITA awards ceremony, and LITA President’s Program are all in the same room.

At 5:30, we transition from afternoon to evening at the LITA Happy Hour at DaDa Bar, 86 2nd Street.

LITA provides 20 programs at Annual Conference. Be sure to review the LITA Highlights page for detailed information on all LITA programs and activities planned for Annual Conference.

LITA Executive Director plans to retire

I have good news to share. After 24 years with ALA (14 of those with LITA), over 10 years with OCLC, and various other employment, I plan to retire. My last day will be July 31, 2015. I’m very excited. I’ve had a number of recommendations on what to do, including “spend the first day in your PJs” and “really enjoy not working.” I do plan to enjoy not working. I have a number of projects and plans I’ll be exploring, plus people and places I hope to visit.

If you are in the San Francisco area on Sunday, June 28th, please come to the LITA Happy Hour to celebrate with me and the Membership Development Committee and other LITA leaders and members. The Happy Hour/Party is at the DaDa Bar, 86 2nd Street.

Hope to see you in San Francisco.

I encourage you to connect with LITA by:

  1. Exploring our web site.
  2. Subscribing to LITA-L email discussion list.
  3. Visiting the LITA blog and LITA Division page on ALA Connect.
  4. Connecting with us on Facebook and Twitter.
  5. Reaching out to the LITA leadership at any time.

Please note: the Information Technology and Libraries (ITAL) journal is available to you and to the entire profession. ITAL features high-quality articles that undergo rigorous peer-review as well as case studies, commentary, and information about topics and trends of interest to the LITA community and beyond. Be sure to sign up for notifications when new issues are posted (March, June, September, and December).

If you have any questions or wish to discuss any of these items, please do let me know.

All the best,


Mary Taylor, Executive Director
Library and Information Technology Association (LITA)
50 E. Huron, Chicago, IL 60611
800-545-2433 x4267
312-280-4267 (direct line)
312-280-3257 (fax)
mtaylor (at)

Join us in Minneapolis, November 12-15, 2015 for the LITA Forum.

Peter Murray: Top Tech Trends, ALA Annual 2015 edition: Local and Unique; New metrics and citation tools

Fri, 2015-06-12 17:50

I threw my hat into the ring to be on the LITA Top Tech Trends panel at the ALA annual conference later this month in San Francisco, and never could I say that I was more excited not to be selected. (You can find more info on this year’s Top Tech Trends session in the ALA Conference Scheduler.) There is a great lineup of panelists this year:

  • Sarah Houghton (Moderator), Director of the San Rafael Public Library – @TheLiB
  • Carson Block, 20-year veteran of libraries and now a library technology consultant – @CarsonBlock
  • Andrea Davis, who has her fingers in so many pies that you should really check out her LinkedIn profile – @detailmatters
  • Grace Dunbar, Vice President of Equinox Software, Inc
  • Bonnie Tijerina, Fellow at the Data & Society Institute in New York City – @bonlth

As part of the process, the committee asks potential panelists to explain two trends, why they are important, and how they will affect libraries. Although I’m not on the panel this year, I thought it would be useful to post my two trends here and see what others thought.

Making the local and the unique available to everyone and everywhere

A core part of libraries and other cultural heritage organizations has been to collect, preserve, and make available the resources that are unique to their users, their location, and their specialization. Up until recent years, awareness of these resources spread by word-of-mouth, by citation in published literature, by large national union catalog volumes, and by short bibliographic records made available first through OCLC, then by online catalogs. With the conception and execution of projects like DPLA and Europeana, broad audiences are finding digital surrogates (or digital copies) of these resources. And most recently IMLS highlighted its intent to build up capabilities through its focus on a National Digital Platform and its funding of the Hydra-in-a-Box initiative. How does that change the collection development mission? Or the research and reference mission? It isn’t so much a matter of remaining relevant as it is of serving different audiences and different needs. And I think public libraries are impacted the most. How can the experiences of large (typically academic) libraries be applied on a large scale to cultural heritage organizations of all types?

New metrics and new citation tools

Last year’s Library Horizon Report from the New Media Consortium listed “Bibliometrics and Citation Technologies” with a Time-to-Adoption horizon of three years. “Alt Metrics” has taken off to the point where NISO’s Todd Carpenter suggests we should simply call them “metrics”, since there is nothing alternative about them. If the tools of the day allow us to measure the impact of our scholars’ work to the article and dataset level, how will that impact the library’s mission to collect, offer, and preserve materials of local interest? If annotation frameworks like take off, what is the role of libraries in preserving and contextualizing those additions to the scholarly conversation?

So! If you were able to sit on the panel, what would your two top technology trends be for this year?

Eric Lease Morgan: Some automated analysis of Ralph Waldo Emerson’s works

Fri, 2015-06-12 17:26

This page describes a corpus named emerson, and it was programmatically created with a program called the HathiTrust Research Center Workset Browser.

General statistics

An analysis of the corpus’s metadata provides an overview of what and how many things it contains, when things were published, and the sizes of its items:

  • Number of items – 62
  • Publication date range – 1838 to 1956 (histogram : boxplot)
  • Sizes in pages – 20 to 660 (histogram : boxplot)
  • Total number of pages – 11866
  • Average number of pages per item – 191

Possible correlations between numeric characteristics of records in the catalog can be illustrated through a matrix of scatter plots. As you would expect, there is almost always a correlation between pages and number of words. Do others exist? For more detail, browse the catalog.
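For readers who want to check such a correlation numerically rather than by eye, here is a minimal sketch in Python. It is not the Workset Browser's own code; the `pearson` helper and the page/word figures are made up for illustration and are not taken from the corpus.

```python
import math

def pearson(xs, ys):
    """Return Pearson's correlation coefficient for two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Hypothetical catalog rows of (pages, words); NOT real corpus data.
catalog = [(20, 5_000), (191, 60_000), (350, 110_000), (660, 200_000)]
pages = [p for p, _ in catalog]
words = [w for _, w in catalog]

r = pearson(pages, words)
print(f"pages vs. words: r = {r:.3f}")
```

A value of r near 1 suggests that longer items contain proportionally more words; running the same function over other pairs of columns is one way to look for less obvious relationships.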

Notes on word usage

By counting and tabulating the words in each item of the corpus, it is possible to measure additional characteristics:

Perusing the list of all words in the corpus (and their frequencies) as well as all unique words can prove to be quite insightful. Are there one or more words in these lists connoting an idea of interest to you, and if so, then to what degree do these words occur in the corpus?
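The counting and tabulating described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the program that built this page: the tokenizer is deliberately naive, and the sample sentence is invented rather than drawn from the corpus.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and return runs of letters (with an optional inner apostrophe)."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())

def word_frequencies(text):
    """Tabulate how many times each word occurs in the text."""
    return Counter(tokenize(text))

# Invented sample sentence, standing in for the plain text of one item.
sample = "Nature is a language and every new fact one learns is a new word"

freqs = word_frequencies(sample)
vocabulary = sorted(freqs)  # the list of all unique (distinct) words

print(freqs.most_common(3))
print(len(vocabulary), "unique words out of", sum(freqs.values()), "total")
```

Summing such counters across every item gives corpus-wide frequencies, and looking up a single word of interest in `freqs` answers the "to what degree" question for one item.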

To begin to see how words of your choosing occur in specific items, search the collection.

Through the creation of locally defined “dictionaries” or “lexicons”, it is possible to count and tabulate how specific sets of words are used across a corpus. This particular corpus employs three such dictionaries — sets of: 1) “big” names, 2) “great” ideas, and 3) colors. Their frequencies are listed below:

The distribution of words (histograms and boxplots) and the frequency of words (wordclouds), and how these frequencies “cluster” together can be illustrated:

Items of interest

Based on the information above, the following items (and their associated links) are of possible interest:

  • Shortest item (20 p.) – The wisest words ever written on war / by R.W. Emerson … Preface by Henry Ford. (HathiTrust : WorldCat : plain text)
  • Longest item (660 p.) – Representative men : nature, addresses and lectures. (HathiTrust : WorldCat : plain text)
  • Oldest item (1838) – An address delivered before the senior class in Divinity College, Cambridge, Sunday evening, 15 July, 1838 / by Ralph Waldo Emerson. (HathiTrust : WorldCat : plain text)
  • Most recent (1956) – Emerson at Dartmouth; a reprint of his oration, Literary ethics. With an introd. by Herbert Faulkner West. (HathiTrust : WorldCat : plain text)
  • Most thoughtful item – Transcendentalism : and other addresses / by Ralph Waldo Emerson. (HathiTrust : WorldCat : plain text)
  • Least thoughtful item – Emerson-Clough letters, edited by Howard F. Lowry and Ralph Leslie Rusk. (HathiTrust : WorldCat : plain text)
  • Biggest name dropper – A letter of Emerson : being the first publication of the reply of Ralph Waldo Emerson to Solomon Corner of Baltimore in 1842 ; With analysis and notes by Willard Reed. (HathiTrust : WorldCat : plain text)
  • Fewest quotations – The wisest words ever written on war / by R.W. Emerson … Preface by Henry Ford. (HathiTrust : WorldCat : plain text)
  • Most colorful – Excursions. Illustrated by Clifton Johnson. (HathiTrust : WorldCat : plain text)
  • Ugliest – An address delivered before the senior class in Divinity College, Cambridge, Sunday evening, 15 July, 1838 / by Ralph Waldo Emerson. (HathiTrust : WorldCat : plain text)

Eric Lease Morgan: Some automated analysis of Henry David Thoreau’s works

Fri, 2015-06-12 17:24

This page describes a corpus named thoreau, and it was programmatically created with a program called the HathiTrust Research Center Workset Browser.

General statistics

An analysis of the corpus’s metadata provides an overview of what and how many things it contains, when things were published, and the sizes of its items:

  • Number of items – 32
  • Publication date range – 1866 to 1953 (histogram : boxplot)
  • Sizes in pages – 38 to 556 (histogram : boxplot)
  • Total number of pages – 7918
  • Average number of pages per item – 247

Possible correlations between numeric characteristics of records in the catalog can be illustrated through a matrix of scatter plots. As you would expect, there is almost always a correlation between pages and number of words. Do others exist? For more detail, browse the catalog.

Notes on word usage

By counting and tabulating the words in each item of the corpus, it is possible to measure additional characteristics:

Perusing the list of all words in the corpus (and their frequencies) as well as all unique words can prove to be quite insightful. Are there one or more words in these lists connoting an idea of interest to you, and if so, then to what degree do these words occur in the corpus?

To begin to see how words of your choosing occur in specific items, search the collection.

Through the creation of locally defined “dictionaries” or “lexicons”, it is possible to count and tabulate how specific sets of words are used across a corpus. This particular corpus employs three such dictionaries — sets of: 1) “big” names, 2) “great” ideas, and 3) colors. Their frequencies are listed below:

The distribution of words (histograms and boxplots) and the frequency of words (wordclouds), and how these frequencies “cluster” together can be illustrated:

Items of interest

Based on the information above, the following items (and their associated links) are of possible interest:

Harvard Library Innovation Lab: Link roundup June 12, 2015

Fri, 2015-06-12 16:59

This is the good stuff.

Paul Ford: What is Code? | Bloomberg

well worth the very long read

How 77 Metro Agencies Design the Letter ‘M’ for Their Transit Logo – CityLab

77 different versions of the letter ‘M’ in mass transit signs around the world

Go To Hellman: Protect Reader Privacy with Referrer Meta Tags

The HTML5 referrer meta element is a new and easy way to not overshare.

Can the Swiss Watchmaker Survive the Digital Age?

The clock itself was a first step toward the “quantified self,”

‘Passports’ To Vermont Libraries Encourage Literary Exploration

Take a tour of Vermont libraries. Be sure to get your passport stamped.

District Dispatch: Congressional Republicans open new attack on FCC net neutrality rules

Fri, 2015-06-12 16:25

Yesterday the House Appropriations Subcommittee released – and today approved – its FY 2016 Financial Services Appropriations Bill providing funding for the Federal Communications Commission and other agencies. But House Republicans included a “net neutrality surprise” in their funding bill.

Tucked in this $20.2 billion funding bill is language that would prohibit the FCC from implementing the net neutrality order, issued February 26, 2015, until three specific legal challenges are fully resolved, including any available appeals. This provision could likely delay implementation of the net neutrality order for several years. (The three challenges specifically noted in the bill were Alamo Broadband Inc. v. FCC, United States Telecom Association v. FCC, and CenturyLink v. FCC.)

Overall funding for the FCC would be dramatically cut under the Republicans’ austere budget. Republicans are recommending a $25 million reduction for the FCC below FY 2015 levels and $73 million below the Obama Administration’s requested level. The bill contains $315 million for the FCC. While appropriations bills, by tradition, generally eschew legislative language, the House appears to be making an exception this year by including non-expenditure-related language that would require the FCC to make proposed regulations publicly available for 21 days before a vote, and prohibit the agency from regulating rates for either wireline or wireless Internet service.

The House Appropriations Subcommittee on Financial Services approved the funding measure on Thursday. The full Committee has not announced a timetable for consideration although the committee is seeking to pass all 12 appropriations bills in the coming weeks. The Senate Appropriations Committee has not released its funding measure.

Efforts to bring any appropriations measures to the floor in the Senate will be a challenge for the Republican Majority. Numerous sources and media reports indicate that Senate Democrats intend to filibuster all Appropriations bills on the Floor unless funding levels are increased. The White House has also indicated the President will likely veto funding bills. It is expected that FY 2016 appropriations bills will not be finalized for several months and may drag on through the Fall, well past the October 1 start of the Fiscal Year.

ALA has worked tirelessly to support the FCC net neutrality order and is greatly concerned that House Republicans are using back-door methods to thwart the implementation of open Internet protections. While this initial flurry of appropriations activity is a strong statement by Republicans against the FCC Order, ALA will continue to urge that this language not be included in any final funding bill in the coming months.

The post Congressional Republicans open new attack on FCC net neutrality rules appeared first on District Dispatch.

Open Library Data Additions: Amazon Crawl: part im

Fri, 2015-06-12 10:02

Part im of Amazon crawl..

This item belongs to: data/ol_data.

This item has files of the following types: Data, Data, Metadata, Text

Cynthia Ng: Accessible Format Production Part 3: Making Accessible PDF

Fri, 2015-06-12 06:19
Once again, there are numerous programs that can edit PDFs. Unfortunately, I have yet to find a free (or very cheap) one that allows you to edit even the basic pieces I talk about below. Would love to hear if anyone has recommendations. Anyway, that means I will discuss what needs to be done but …

DPLA: Apply to be a new DPLA Service Hub!

Thu, 2015-06-11 16:43


The Digital Public Library of America seeks applicants to serve as Service Hubs in our growing national network.  The application and corresponding instructions are available from the link below.

Service Hub Application

A Service Hub represents a community of institutions to DPLA, provides their partners’ aggregated metadata to DPLA through a single source, and offers tiered services to create a local community of practice.  Service Hubs are geographically based, and DPLA seeks to grow the map of covered states and/or regions by inviting applications through this and future calls for applicants.

The deadline for submission is Monday, July 20, 2015.  Applicants will be notified of their selection on or before August 28, 2015, and will be expected to begin working with DPLA staff immediately following the August announcement to formalize the partnership and begin the process of harvesting metadata.

To answer general inquiries about the application process and to provide information about the hubs network and activities, an open information and Q&A session will be held with key DPLA staff members on June 22 at 4pm eastern.  If you would like to join this webinar, please register ahead of time using the link at

After registering, you will receive a confirmation email containing information about joining the webinar.

LibraryThing (Thingology): LibraryThing Recommends in BiblioCommons

Thu, 2015-06-11 16:34

Does your library use BiblioCommons as its catalog? LibraryThing and BiblioCommons now work together to give you high-quality reading recommendations in your BiblioCommons catalog.

You can see some examples here. Look for “LibraryThing Recommends” on the right side.

Quick facts:

  • As with all LibraryThing for Libraries products, LibraryThing Recommends only recommends other books within a library’s catalog.
  • LibraryThing Recommends stretches across media, providing recommendations not just for print titles, but also for ebooks, audiobooks, and other media.
  • LibraryThing Recommends shows up to two titles up front, with up to three displayed under “Show more.”
  • Recommendations come from LibraryThing’s recommendations system, which draws on hundreds of millions of data points in readership patterns, tags, series, popularity, and other data.

Not using BiblioCommons? Well, you can get LibraryThing recommendations (and much more) integrated in almost every catalog (OPAC and ILS) on earth, with all the same basic functionality, like recommending only books in your catalog, as well as other LibraryThing for Libraries features, like reviews, series and tags.

Check out some examples on different systems here.


BiblioCommons: email or visit See the full specifics here.
Other Systems: email or visit

District Dispatch: Senate leaders rush end-run on personal privacy

Thu, 2015-06-11 16:22

Apparently Senate Majority Leader Mitch McConnell (R-KY) and Senate Intelligence Committee Chairman Richard Burr (R-NC) learned nothing from the overwhelming outpouring of opposition to the worst of the USA PATRIOT Act earlier this month. They’re using the Senate Rules to try to ram through – as early as tomorrow – privacy-hostile legislation that’s received no public hearing or Senate floor debate as an amendment to a Department of Defense funding measure. The bill, innocuously dubbed the Cybersecurity Information Sharing Act (“CISA”), would expose enormous amounts of your personal data to federal, state, and even local law enforcement … without a warrant. Along with many coalition partners, ALA strongly opposed CISA in a letter to Senate leaders last March.

Photo by Yuri Samoilov

Sometimes, you just can’t explain anything better than a colleague, and Gabe Rottman of the ACLU’s Washington Office spoke up yesterday (with many links to detailed background information) about this issue and the immediate procedural threat in the Senate:

“So, what does the bill . . . do? It’s a surveillance bill, pure and simple. It says that any and all privacy laws, including laws requiring a warrant for electronic communications, and those that protect financial, health or even video rental records, do not apply when companies share ‘cybersecurity’ information, broadly defined, with the government.”

Please, take a minute to read his “Playing Politics With Cybersecurity and Privacy” now. Then, take just 60 more seconds to contact your Senators immediately with a simple message:

“VOTE NO on the ‘Burr Amendment’ to the defense bill – or on any such “information sharing” bill without full public hearings and Senate debate.”

The post Senate leaders rush end-run on personal privacy appeared first on District Dispatch.

Eric Lease Morgan: EEBO-TCP Workset Browser

Thu, 2015-06-11 15:25

I have begun creating a “browser” against content from EEBO-TCP in the same way I have created a browser against worksets from the HathiTrust. The goal is to provide “distant reading” services against subsets of the Early English poetry and prose. You can see these fledgling efforts against a complete set of Richard Baxter’s works. Baxter was an English Puritan church leader, poet, and hymn-writer. [1, 2, 3]

EEBO is an acronym for Early English Books Online. It is intended to be a complete collection of English literature from 1475 through 1700. TCP is an acronym for Text Creation Partnership, a consortium of libraries dedicated to making EEBO freely available in the form of XML called TEI (Text Encoding Initiative). [4, 5]

The EEBO-TCP initiative is releasing their efforts in stages. The content of Stage I is available from a number of (rather hidden) venues. I found the content on a University of Michigan Box site to be the easiest to use, albeit not necessarily the most current. [6] Once the content is cached, in the fullest of TEI glory, it is possible to search and browse the collection. I created a local, terminal-only interface to the cache and was able to exploit authority lists, controlled vocabularies, and free text searching of metadata to create subsets of the cache. [7] The subsets are akin to HathiTrust “worksets” – items of particular interest to me.

Once a subset was identified, I was able to mirror (against myself) the necessary XML files and begin to do deeper analysis. For example, I am able to create a dictionary of all the words in the “workset” and tabulate their frequencies. Baxter used the word “god” more than any other, specifically, 65,230 times. [8] I am able to pull out sets of unique words, and I am able to count how many times Baxter used words from three sets of locally defined “lexicons” of colors, “big” names, and “great” ideas. Furthermore, I am able to chart and graph trends of the works, such as when they were written and how they cluster together in terms of word usage or lexicons. [9, 10]
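The word tabulation and lexicon counting described above can be sketched in a few lines of Python. This is only an illustration, not the browser's actual code (which has not yet been published): the function names, the sample sentence, and the tiny "colors" lexicon are all assumptions made up for the example.

```python
import re
from collections import Counter


def tabulate(text):
    """Return a Counter mapping each lowercased word to its frequency."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(words)


def lexicon_count(frequencies, lexicon):
    """Sum the frequencies of every word that appears in a lexicon."""
    return sum(frequencies[word] for word in lexicon)


# Hypothetical sample text and lexicon, for illustration only
sample = "God is great and God is good; the sky is blue and the grass is green."
colors = {"red", "green", "blue"}

frequencies = tabulate(sample)
print(frequencies.most_common(3))          # the most frequent words
print(lexicon_count(frequencies, colors))  # how often color words occur
```

A real workset would first need the plain text extracted from the TEI/XML files; that step is left out here.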

I was then able to repeat the process for other subsets, items about: lutes, astronomy, Unitarians, and of course, Shakespeare. [11, 12, 13, 14]

The EEBO-TCP Workset Browser is not as mature as my HathiTrust Workset Browser, but it is coming along. [15] Next steps include: calculating an integer denoting the number of pages in an item, implementing a Web-based search interface to a subset’s full text as well as metadata, putting the source code (written in Python and Bash) on GitHub. After that I need to: identify more robust ways to create subsets from the whole of EEBO, provide links to the raw TEI/XML as well as HTML versions of items, implement quite a number of cosmetic enhancements, and most importantly, support the means to compare & contrast items of interest in each subset. Wish me luck?

More fun with well-structured data, open access content, and the definition of librarianship.

  1. Richard Baxter (the person) –
  2. Richard Baxter (works) –
  3. Richard Baxter (analysis of works) –
  4. EEBO-TCP –
  5. TEI –
  6. University of Michigan Box site –
  7. local cache of EEBO-TCP –
  8. dictionary of all Baxter words –
  9. histogram of dates –
  10. clusters of “great” ideas –
  11. lute –
  12. astronomy –
  13. Unitarians –
  14. Shakespeare –
  15. HathiTrust Workset Browser –

LITA: Congratulations to the LITA UX Contest Winners

Thu, 2015-06-11 14:55

The results are in for LITA’s Contest: Great Library UX Ideas Under $100. Congratulations to winner Conny Liegl, Designer for Web, Graphics and UX at the Robert E. Kennedy Library at California Polytechnic State University for her submission entitled Guerilla Sketch-A-Thon. The LITA President’s Program Planning Team who ran the contest and reviewed the submissions loved how creative the project was and how it engaged users. From the sketches that accompanied the submission, and from looking at the before and after screenshots of the library website, it was clear the designers incorporated ideas from the student sketches.

Conny won a personal one-year, online subscription to Library Technology Reports, generously donated by ALA Tech Source. She gets to have lunch with LITA President Rachel Vacek and the LITA President’s Program speaker and UX expert Lou Rosenfeld at ALA in San Francisco. She gets a free book generously donated from Rosenfeld Media. And finally, her winning submission will be published in Weave, an open-access, peer-reviewed journal for Library User Experience professionals published by Michigan Publishing.

There were so many entries submitted for the contest, picking a single winner was difficult. The Planning Team unanimously agreed to recognize first and second runner-up entries.

The First Runner-Up was the team at the University of Arizona Libraries who submitted their project Wayfinding in the Library. The team included people from multiple departments in their library including the User Experience department, Access & Information Services, and Library Communications. Congrats to Rebecca Blakiston, User Experience Librarian, Shoshana Mayden, Content Strategist, Nattawan Wood, Administrative Associate, Aungelique Rodriguez, Library Communications Student Assistant, and Beau Smith, Usability Testing Student Assistant. Each team member gets a book from Rosenfeld Media.

The Second Runner-Up was the team from Purdue University Libraries who submitted their project Applying Hierarchal Task Analysis Method to Discovery Tool Evaluation. The team consisted of Tao Zhang, Digital User Experiences Specialist and Marlen Promann, Graduate Research Assistant. Each team member gets a book from Rosenfeld Media.

In the coming months, interviews with the winners from each institution will be posted to the blog.

Brown University Library Digital Technologies Projects: Search relevancy tests

Thu, 2015-06-11 13:37

We are creating a set of relevancy tests for the library’s Blacklight implementation.  These tests use predetermined phrases to search Solr, Blacklight’s backend, mimicking the results a user would retrieve.  This provides useful data that can be systematically analyzed.  We use the results of these tests to verify that users will get the results we, as application managers and librarians, expect.  It also will help us protect against regressions, or new, unexpected problems, when we make changes over time to Solr indexing schema or term weighting.
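As a rough sketch of what such a relevancy test can look like: each case pairs a search phrase with a record that should appear at or above a given rank. The code below is an assumption-laden illustration in Python, not our actual test suite; the record ids, the canned index, and the `fake_search` function (standing in for a real Solr query) are all hypothetical.

```python
def rank_of(doc_id, results):
    """Return the 1-based rank of doc_id in results, or None if absent."""
    for position, found_id in enumerate(results, start=1):
        if found_id == doc_id:
            return position
    return None


def check_relevancy(search, cases):
    """Run each (query, expected_id, max_rank) case; return the failures."""
    failures = []
    for query, expected_id, max_rank in cases:
        rank = rank_of(expected_id, search(query))
        if rank is None or rank > max_rank:
            failures.append((query, expected_id, rank))
    return failures


# Hypothetical canned results standing in for a live Solr index
FAKE_INDEX = {
    "moby dick": ["b100", "b200", "b300"],
    "identifier-123": ["b900", "b100"],
}


def fake_search(query):
    """Stand-in for a Solr request; returns an ordered list of record ids."""
    return FAKE_INDEX.get(query, [])


cases = [
    ("moby dick", "b100", 1),       # title search: expect the first hit
    ("identifier-123", "b100", 1),  # identifier search: expect the first hit
]
failures = check_relevancy(fake_search, cases)
print(failures)
```

Running the same cases before and after a schema or weighting change is what turns them into a regression check.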

This work is heavily influenced by colleagues at Stanford who have both written about their (much more thorough at this point) relevancy tests and developed a Ruby Gem to assist others with doing similar work.

We are still working to identify common and troublesome searches but have already seen benefits of this approach and used it to identify (and resolve) deficiencies in title weighting and searching by common identifiers, among other issues.  Our test code and test searches are available on Github for others to use as an example or to fork and apply to their own project.

Brown library staff who have examples of searches not producing expected results, please pass them on to Jeanette Norris or Ted Lawless.

— Jeanette Norris and Ted Lawless

Hydra Project: Booking for Hydra Connect 2015 open!

Thu, 2015-06-11 12:04

We are delighted to announce that booking for Hydra Connect 2015 is now open.  This year’s Connect takes place in Minneapolis, MN, from Monday September 21st to Thursday September 24th.  Details at   It is intended to publish a draft program in the first week of July.

Hydra Connect meetings are intended to be the major annual event in the Hydra year.  Hydra advertises them with the slogan “as a Hydra Partner or user, if you can only make it to one Hydra meeting this academic year, this is the one to attend!”

The three-day meetings are preceded by an optional day of workshops.  The meeting proper is a mix of plenary sessions, lightning talks, show and tell sessions, and unconference breakouts.  The evenings are given over to a mix of conference-arranged activities and opportunities for private meetings over dinner and/or drinks!  The meeting program is aimed at existing users, managers and developers and at new folks who may be just “kicking the tires” on Hydra and who want to know more.

We hope to see you there!


Peter Sefton: Ozmeka: extending the Omeka repository to make linked-data research data collections for (any and) all research disciplines

Thu, 2015-06-11 11:10

Ozmeka: extending the Omeka repository to make linked-data research data collections for (any and) all research disciplines by Peter Sefton, Sharyn Wise, Katrina Trewin is licensed under a Creative Commons Attribution 4.0 International License.

[Update 2015-06-11, fixing typos]

Ozmeka: extending the Omeka repository to make linked-data research data collections for (any and) all research disciplines. Peter Sefton, University of Technology, Sydney; Sharyn Wise, University of Technology, Sydney; Peter Bugeia, Intersect Australia Ltd, Sydney; Katrina Trewin, University of Western Sydney

There have been some adjustments to the authorship on this presentation. Peter Bugeia was on the abstract but didn’t end up contributing to the presentation, whereas Katrina Trewin withdrew her name from the proposal for a while, but then produced the Farms to Freeways collection and decided to come back into the fold. The notes here are written in the first person, to be delivered in this instance by Peter, but they come from all of the authors.

Abstract as submitted

The Ozmeka project is an Australian open source project to extend the Omeka repository system. Our aim is to support Open Scholarship, Open Science, and Cultural Heritage via repository software that can manage a wide range of Research (and Open) Data, both Open and access-restricted, providing rich repository services for the gathering, curation and publishing of diverse data sets. The Ozmeka project places a great deal of importance on integrating with external systems, to ensure that research data is linked to its context, and high quality identifiers are used for as much metadata as possible. This will include links to the ‘traditional’ staples of the Open Repositories conference series, publications repositories, and to the growing number of institutional and discipline research data repositories.

In this presentation we will take a critical look at how the Omeka system, extended with Ozmeka plugins and themes, can be used to manage (a) a large cross-disciplinary archive of research data about water resources, (b) an ethno-historiography built around a published book and (c) large research data sets in a scientific institute, and talk about how this work paves the way for eResearch and repository support teams to supply similar services to researchers in a wide variety of fields. This work is intended to reduce the cost and complexity of creating new research data repository systems.

Slightly different scope now

I will be talking about Dharmae, the database of water-resources-themed research data. The project to put the book data into Omeka took a different turn, and the scientific data repository is still being developed.

How does this presentation fit in to the conference?

Which Conference Themes are we touching on?

  • Supporting Open Scholarship, Open Science, and Cultural Heritage

  • Managing Research (and Open) Data

  • Building the Perfect Repository

  • Integrating with External Systems

  • Re-using Repository Content

Things we want to cover:
  • A bit about the research data projects we’ve worked on.

  • How we’ve implemented Linked Data for metadata (stamping out strings!)

  • What about this Omeka thing?

(The picture is one I took of the conference hotel)

What’s Omeka?

We like to call Omeka the “Wordpress of repositories”

It’s a PHP application which is easy to install and get up and running and yes – it is a ‘repository’, it lets you upload digital objects, describe them with Dublin Core Metadata, and no, it’s not perfect.

The Perfect Repository?

So let’s talk about this phrase “the perfect repository”. I have been following Jason Scott at the Internet Archive (who would make a great keynote speaker for this conference, by the way) and his work on rescuing and making available cultural heritage such as computer-related ephemera and programs for obsolete computing and gaming platforms. He uses the phrase “Perfect is the enemy of done” and talks about how making some tradeoffs and compromises and then just doing it means that stuff, you know, actually gets done that otherwise wouldn’t.

No, we’re not calling Omeka “third best”, but one of the points of this talk is that instead of waiting for or trying to build the ‘perfect’ research data repository Omeka is a low-barrier-to-entry, cheap way to build some kinds of working-data-repositories or data-publishing websites. I have talked to quite a few people who say they have looked at Omeka and decided that it is too simple, too limited for whatever project they were doing. Indeed, it does have some limitations; the two big ones are that it does not handle access control at all and it has no approval workflow, at least not in this version.

The quote on the slide is via the Wikipedia page Perfect is the Enemy of Good.

The Portland Common Data Model

Omeka more-or-less implements a subset of the Portland Common Data Model, which I was introduced to yesterday in the Fedora workshop, although as I just mentioned it is not strong on Access control, having only a published/unpublished flag on items.

Why Omeka? We’ll come back to this – but the ability to do Linked Data was one of the main attractions of Omeka. We had to add some code to make the relations appear like this, and easier to apply than in the ‘trunk’ version of Omeka 2.x, but that development was not hard or expensive compared to what it might have cost on top of other repository systems with more complex application stacks. Another

(Note – if you look at the current version of Dharmae, the item relations will appear a little differently, as not all the Ozmeka enhanced code has been rolled out).

Australian national data service (ANDS) funded project … to kick-start a major open data collection

I’m going to give you a quick backgrounder on our project by way of introduction: ANDS approached us with a funding opportunity to create an open data collection. Many of you will be familiar with the frustrations of funding rules: our constraint was that we were not allowed to develop software, although we could adapt it.

The UTS team put the word out for publishable research data collections but got little response. Then, thanks to the library eScholarship team, Sharyn met Professor Heather Goodall and Jodi Frawley, who had data from a large Oral History project on the impacts of water management on stakeholders in the Murray Darling Basin – called Talking Fish.

And they had had the amazing foresight – the foresight of the historian – to obtain informed consent to publicly archive the interview data.

Field science in MDB (from Dharmae)

In the image above MDB means the Murray Darling Basin, a big, long river system with hardly any water in it.

First up I’ll talk about Dharmae. Dharmae was conceived as a multi-disciplinary data hub themed around water-related research, with the “ecocultures” concept intended to flag that we welcome scientific data contributors (ecological or otherwise) as well as cultural data, because they are equally crucial if we want research to have an impact on the world.

This position is also supported in the literature of the intergovernmental science policy community and environmental sustainability and resilience research.

One paper expressed it this way – for research to have a transformative impact, it’s not simply more knowledge that we need, but different types of knowledge.

The literature emphasizes the need for improved connectivity between knowledge systems: those applied to researching the natural world, such as science, and those that investigate socio-cultural practices such as social sciences, history and particularly also indigenous knowledge.

But because these different knowledge systems each come with their own practices and terminologies, we have an interesting information science problem:

How to support data deposit and discovery by users from all disciplines?

Linked data & disambiguation

Essentially by using linked data. We extended the open source repository Omeka by allowing all named entities (like places, people, species) to be linked to an authoritative source of truth.

Let’s take location – it is one of the obvious correspondences between scientific and cultural data.

That still doesn’t mean it’s an easy thing to link on. Place names are rarely unique, as we see Kendell noticing above.

But by using authoritative sources, like Geonames, we can disambiguate place names, and better still we can derive their coordinates.
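To make the disambiguation idea concrete, here is a minimal Python sketch. It does not call Geonames and is not Ozmeka code: the gazetteer entries, the `example.org` URIs, the region labels and the coordinates are all made-up placeholders, and the function names are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Place:
    uri: str      # authority identifier (placeholder URIs here)
    name: str     # human-readable name, not necessarily unique
    region: str   # extra context used to tell namesakes apart
    lat: float
    lon: float


# Hypothetical gazetteer entries; URIs and coordinates are invented
GAZETTEER = [
    Place("https://example.org/place/1", "Springfield", "Region A", -10.0, 120.0),
    Place("https://example.org/place/2", "Springfield", "Region B", -30.0, 140.0),
]


def candidates(name):
    """Return every authority record sharing the given place name."""
    return [p for p in GAZETTEER if p.name.lower() == name.lower()]


def disambiguate(name, region):
    """Pick the single record matching both name and region, or fail loudly."""
    matches = [p for p in candidates(name) if p.region == region]
    if len(matches) != 1:
        raise LookupError(f"{len(matches)} matches for {name!r} in {region!r}")
    return matches[0]


place = disambiguate("Springfield", "Region B")
print(place.uri, place.lat, place.lon)
```

Once an item stores the authority URI rather than the bare string, the coordinates come along with it, which is what makes map-based discovery possible.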

Now we want users of Dharmae who are interested in finding data by location to access it in the way that makes sense to them – and that may not be name.

Lower Darling/Anabranch

In Dharmae readers can search by place name or they can use a map.

Here is one of 12 study regions from the Talking Fish data, showing the Lower Darling and Anabranch above Murray Sunset National Park.

We georeferenced these regions using a Geonode map server, but we have superimposed the researchers’ hand-drawn map as a layer on top to preserve the sense of human-scale interaction.

You can click through from here to read or listen to the oral histories completed in this region, look at photos or investigate the species identified by participants.

You can also search by Indigenous language Community if you prefer.

How else could this be useful?

Lower Darling/Anabranch:

It just so happens that we also have a satellite remote sensing dataset that corresponds reasonably well to this region above the national park.

It shows the Normalized Difference Vegetation Index for the region or the vegetation change over the decade 1996-2006.

Relative increase in vegetation shows as green and relative decrease as pink.
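For readers unfamiliar with the index: NDVI is conventionally computed from near-infrared (NIR) and red reflectance as (NIR − Red) / (NIR + Red), giving a value between −1 and 1. The Python sketch below shows that formula and a simple before/after difference. How the dataset shown here actually derived its decade-long change is not described in this talk, so the differencing step, the function names and the sample values are assumptions.

```python
def ndvi(nir, red):
    """Normalized Difference Vegetation Index for one pixel."""
    total = nir + red
    if total == 0:
        return 0.0  # avoid dividing by zero where both bands are empty
    return (nir - red) / total


def ndvi_change(before, after):
    """Difference in NDVI between two (nir, red) observations of a pixel."""
    return ndvi(*after) - ndvi(*before)


# Hypothetical reflectance values for a single pixel at two dates
earlier = (2, 2)   # NDVI 0.0
later = (3, 1)     # NDVI 0.5
change = ndvi_change(earlier, later)
print(change)      # positive means relatively more vegetation
```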

Could the interviews with participants from that region provide any clues as to why?

I can’t tell you that, but the point is that the more we enrich and link data, the more possible hypotheses we can generate.

The Graph

Here’s the graph of our solution: We created local records, so that the Dharmae hub could maintain its own set of ‘world-views’ while still interfacing with the semantic web knowledge graph.

This design pattern is something we want to explore more: having a local record for an entity or concept, with links to external authorities. So, for example we might use a Dbpedia URI for a person, and quote a particular ‘approved’ version of the wikipedia page about them so there is a local, stable proxy for an external URI, but the local record is still part of the global graph. With the species data, this will allow researchers to explore the way the participants in Talking Fish talked about fish and compare this to what the Atlas of Living Australia says about nomenclature and distribution.
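A minimal sketch of this local-record pattern in Python, under stated assumptions: the class and field names are invented for illustration, the URIs are `example.org` placeholders, and the choice of `rdfs:label` and `owl:sameAs` as predicates is ours for the example rather than a description of how Ozmeka stores relations.

```python
from dataclasses import dataclass, field


@dataclass
class LocalRecord:
    local_uri: str     # stable identifier minted by the local hub
    label: str         # the name used locally
    description: str   # locally 'approved' text about the entity
    same_as: list = field(default_factory=list)  # external authority URIs

    def to_triples(self):
        """Express the record as (subject, predicate, object) triples."""
        triples = [(self.local_uri, "rdfs:label", self.label)]
        triples += [(self.local_uri, "owl:sameAs", uri) for uri in self.same_as]
        return triples


# Hypothetical record; all URIs are placeholders
record = LocalRecord(
    local_uri="https://example.org/local/species/1",
    label="Murray cod",
    description="How interview participants talked about this fish.",
    same_as=["https://example.org/authority/species/abc"],
)

for triple in record.to_triples():
    print(triple)
```

The local URI stays stable even if the external authority changes its page, while the link keeps the record part of the wider graph.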

From the Journey to Horseshoe Bend Website at the University of Western Sydney:

TGH Strehlow’s biographical memoir, Journey to Horseshoe Bend, is a vivid ethno-historiographic account of the Aboriginal, settler and Lutheran communities of Central Australia in the 1920’s. The ‘Journey to Horseshoe Bend’ project elaborates on Strehlow’s book in the form of an extensive digital hub – a database and website – that seeks to ‘visualise’ the key textual thematics of Arrernte* identity and sense of “place”, combined with a re-mapping of European and Aboriginal archival objects related to the book’s social and cultural histories.

Thus far the project has produced a valuable collection of unique historical and contemporary materials developed to encourage knowledge sharing and to initiate knowledge creation. By bringing together a wide variety of media – including photographs, letters, journals, Government files, audio recordings, moving images, newspaper, newsletters, interviews, manuscripts, an electronic version of the text and annotations – the researchers hope to ‘open out’ the histories of Central Australia’s Aboriginal, settler and missionary communities.

JTHB research work entailed creating annotations relating to sections of the book text. The existing book text, marked up with TEI, was converted to HTML and the annotations were anchored within the HTML. The plan was to create an Omeka plugin to display the text and co-display or footnote the annotations relating to each part of the text.


  • The existing annotations were incomplete and the research team wished to continue adding annotations and material. This meant that the HTML would need to be continuously edited (outside Omeka), giving rise to issues around workflow, researcher skills, and version control.
  • Cultural sensitivities were also a barrier to open publication (not an Omeka issue but a MODC one)

Katrina Trewin is a data librarian, working at the University of Western Sydney. While the Journey to Horseshoe Bend project could not be completed using Omeka due to resource constraints, another project could be. Using Omeka, Katrina was able to build a web site around an oral-history data set without needing any development. This work took place in parallel with the work on Dharmae at UTS, so it was not able to make use of some of the innovations introduced in that project, such as enhancements to the Item Relations plugin to allow rich interlinking between resources.

Katrina’s notes:

Material had been in care of researcher for 20+ years.

  • Audio interviews on cassette, photographs, transcripts (some electronic)
  • Digitised all the material
  • Created csv files for upload of item metadata into Omeka
  • Once collections of items were created, then used exhibit plugin to bring material relating to each interviewee together.
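The csv step above might look something like the following Python sketch. It is not the script used for Farms to Freeways: the column headings, the file names and the sample rows are assumptions, and whatever import tool is used would still need each column mapped to the matching Dublin Core element.

```python
import csv
import io

# Assumed column headings; an importer would map these to Dublin Core elements
FIELDS = ["Dublin Core:Title", "Dublin Core:Creator", "Dublin Core:Date", "file"]


def items_to_csv(items):
    """Serialize a list of item-metadata dicts to CSV text, one row per item."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    for item in items:
        writer.writerow(item)  # missing fields are written as empty cells
    return buffer.getvalue()


# Hypothetical items, for illustration only
items = [
    {"Dublin Core:Title": "Interview, part 1",
     "Dublin Core:Creator": "Example Interviewer",
     "Dublin Core:Date": "1990",
     "file": "interview-01.mp3"},
    {"Dublin Core:Title": "Photograph of a farm",
     "file": "photo-01.jpg"},
]

output = items_to_csv(items)
print(output)
```

Keeping the metadata in csv also means the same files can be deposited alongside the items for long-term reuse.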

Worked well because collection was complete – fine to edit metadata in Omeka but items themselves need to be stable (unlike the JTHB text)

Omeka allows item-level description which is not possible via institutional repository. This could have been done in Omeka interface but was more efficient via csv upload. csv files, bundled item files, readme and Omeka xml output made available from institutional repository record for longer term availability as hosting arrangement is not in place. Chambers, Deborah; Liston, Carol; Wieneke, Christine (2015): Interview material from Western Sydney women’s oral history project: ‘From farms to freeways: Women’s memories of Western Sydney’. University of Western Sydney.

Katrina and team have published all the data as a set of files, with a link to the website, in the institutional research data repository. This screenshot shows the data files available for download for re-use. My team at UTS is doing a similar thing with the Dharmae data.

At UTS we are constructing a growing ‘grid’ of research data services. This diagram is a sketch of how Omeka fits into this bigger picture, showing the Geonode mapping service which supplies map display services and can harvest maps from Omeka as well. In this architecture, all items ultimately end up in an archival repository with a catalogue description, as I showed earlier for the Farms to Freeways data.

Interested? Check out and clone our Ozmeka GitHub repositories.


Omeka is a very simple seeming repository solution which is easy to dismiss for projects that demand the ‘perfect’ repository, but looking beyond its limitations it has some strengths that make it attractive for creating ‘micro repository services’ (Field & McSweeney 2014). Our work has made it easier to set up new research-data repositories that adhere to linked-data principles and create rich semantic web interfaces to data collections. This paves the way for a new generation of micro or workgroup-level research data repositories which link-to and re-use a wide range of data sources.


Johnson, Ian. “Heurist Scholar,” 2014.

Kucsma, Jason, Kevin Reiss, and Angela Sidman. “Using Omeka to Build Digital Collections: The METRO Case Study.” D-Lib Magazine 16, no. 3/4 (March 2010). doi:10.1045/march2010-kucsma.

Nahl, Diane. “A Discourse Analysis Technique for Charting the Flow of Micro-Information Behavior.” Journal of Documentation 63, no. 3 (2007): 323–39. doi:

Palmer, Carole L., and Melissa H. Cragin. “Scholarship and Disciplinary Practices.” Annual Review of Information Science and Technology 42, no. 1 (2008): 163–212. doi:10.1002/aris.2008.1440420112.

Palmer, Carole L. “Thematic Research Collections”, Chapter 24 in Schreibman, Susan, Ray Siemens, and John Unsworth. Companion to Digital Humanities (Blackwell Companions to Literature and Culture). Hardcover. Blackwell Companions to Literature and Culture. Oxford: Blackwell Publishing Professional, 2004.

Simon, Herbert. “Rational Choice and the Structure of the Environment.” Psychological Review 63, no. 2 (1956): 129–38.

Strehlow, Theodor George Henry. Journey to Horseshoe Bend. [Sydney]: Angus and Robertson, 1969.

  • Researchers:
    • Prof. Heather Goodall
    • Dr Michelle Voyer
    • Associate professor Carol Liston
    • Dr Jodi Frawley
    • Dr Kevin Davies
  • eResearch: Sharyn Wise, Peter Sefton, Mike Lynch, Paul Nguyen, Mike Lake, Carmi Cronje, Thom McIntyre and Kevin Davies, Kim Heckenberg, Andrew Leahy, Lloyd Harischandra
  • Library: Duncan Loxton (eScholarship) & Kendell Powell (Aboriginal & Torres Strait Islander Data Archive Officer), Katrina Trewin, Michael Gonzalez
  • Thanks to: State Library of NSW Indigenous Unit, Atlas of Living Australia, Terrestrial Ecosystems Research Network and our funder, ANDS.

I didn’t have this slide when I presented, and forgot to acknowledge the contribution of all of the above, and anyone who’s been left off by accident.