Feed aggregator

Open Knowledge Foundation: New Initiative: Open Data for Tax Justice #OD4TJ

planet code4lib - Wed, 2016-03-02 13:07

Every year countries lose billions of dollars to tax avoidance, tax evasion and more generally to illicit financial flows. According to a recent IMF estimate around $700 billion of tax revenues is lost each year due to profit-shifting. In developing countries the loss is estimated to be around $200 billion, which as a share of GDP represents nearly three times the loss suffered by OECD countries. Meanwhile, economist Gabriel Zucman estimates that certain components of undeclared offshore wealth total above $7 trillion, implying tax losses of $200 billion annually; Jim Henry’s work for TJN suggests the full total of offshore assets may range between $21 trillion and $32 trillion.

We want to transform the way that data is used for advocacy, journalism and public policy to address this urgent challenge by creating a global network of civil society groups, investigative reporters, data journalists, civic hackers, researchers, public servants and others.

Today, Open Knowledge and the Tax Justice Network are delighted to announce the launch of a new initiative in this area: Open Data for Tax Justice. We want to initiate a global network of people and organisations working to create, use and share data to improve advocacy and journalism around tax justice. You can follow the initiative using the hashtag #od4tj.

The network will work to rally campaigners, civil society groups, investigative reporters, data journalists, civic hackers, researchers, public servants and others; it will aim to catalyse collaborations and forge lasting alliances between the tax justice movement and the open data movement. We have received a huge level of support and encouragement from preliminary discussions with our initial members, and look forward to expanding the network and its activities over the coming months.

What is on the cards? We’re working on a white paper on what a global data infrastructure for tax justice might look like. We also want to generate more practical guidance materials for data projects – as well as to build momentum with online and offline events. We will kick off with some preliminary activities at this year’s global Open Data Day on Saturday 5th March. Tax justice will be one of the main themes of the London Open Data Day, and if you’d like to have a go at doing something tax related at an event that you’re going to, you can join the discussion here.

DuraSpace News: DuraSpace at the SPARC MORE Meeting

planet code4lib - Wed, 2016-03-02 00:00

Austin, TX  DuraSpace is a proud sponsor of the upcoming SPARC MORE Meeting, March 7-8, 2016 in San Antonio, Texas. The gathering of librarians, educators, and researchers will focus on the 2014 “Convergence” meeting theme and will explore the increasingly central role libraries play in the growing shift toward Open Access, Open Education and Open Data.

DuraSpace News: German DSpace User Group Meeting to be Held in Hamburg, Sept. 27, 2016

planet code4lib - Wed, 2016-03-02 00:00

From Jan Weiland, ZBW - Deutsche Zentralbibliothek für Wirtschaftswissenschaften

Hamburg, Germany  The ZBW - German National Library of Economics gladly invites you to join the next German DSpace User Group meeting in Hamburg:

Date: Tuesday, 27th September 2016, from 11 a.m. to 5 p.m.
Venue: ZBW, Neuer Jungfernstieg 21, 20354 Hamburg, Germany, fifth floor, Room 519

DuraSpace News: Introducing the first Open Peer Review Module for DSpace Repositories

planet code4lib - Wed, 2016-03-02 00:00

From Emilio Lorenzo, ARVO Consultores

Asturias, Spain  With the support of OpenAIRE, Open Scholar has coordinated a consortium of five partners to develop the first Open Peer Review Module (OPRM) for DSpace.

DuraSpace News: VIVO Updates for February 28–Anniversary, Upcoming Events, Open VIVO

planet code4lib - Wed, 2016-03-02 00:00

From Mike Conlon, VIVO Project Director

Richard Wallis: Evolving Schema.org in Practice Pt3: Choosing Where to Extend

planet code4lib - Tue, 2016-03-01 16:24

In this third part of the series I am going to concentrate less on the science of working with the technology of Schema.org and more on what you might call the art of extension.

It builds on the previous two posts: The Bits and Pieces, which introduces you to the mechanics of working with the repository in GitHub and your own local version; and Working Within the Vocabulary, which takes you through the anatomy of the major controlling files for the terms and their examples that you find in the repository.

Art may be an over-ambitious word for the process that I am going to try to describe. However, it is not about rules, required patterns, syntaxes, and file formats – the science; it is about general guidelines, emerging styles & practices, and what feels right. So art it is.

OK. You have read the previous posts in this series. You have said to yourself, “I only wish that I could describe [insert your favourite issue here] in Schema.org.” You are now inspired to do something about it, or to get together with a community of colleagues to address the usefulness of Schema.org for your area of interest. Then comes the inevitable question…

Where do I focus my efforts – the core vocabulary or a Hosted Extension or an External Extension?

Firstly a bit of background to help answer that question.

The core of the vocabulary has evolved since its launch by Google, Bing, and Yahoo! (soon joined by Yandex) in June 2011. By the end of 2015 its term definitions had reached 642 types and 992 properties. They cover many sectors, commercial and not, including sport, media, retail, libraries, local businesses, health, audio, video, TV, movies, reviews, ratings, products, services, offers and actions. Its generic nature has facilitated its spread and adoption across well over 10 million sites. For more background I recommend the December 2015 article Evolution of Structured Data on the Web – Big data makes common schemas even more necessary, by Guha, Brickley and Macbeth.

That generic nature, however, does introduce issues for those in specific sectors wishing to focus in more detail on the entities and relationships specific to their domain whilst still being part of, or closely related to, Schema.org. In the spring of 2015 an Extension Mechanism, consisting of Hosted and External extensions, was introduced to address this.

Reviewed/Hosted Extensions are domain focused extensions hosted on the Schema.org site. They will have been reviewed and discussed by the broad community as to style, compatibility with the core vocabulary, and potential adoption. An extension is allocated its own part of the Schema.org namespace.

External Extensions are created and hosted separately from Schema.org, in their own namespace. Although related to and building upon [extending] the Schema.org vocabulary, these extensions are not part of it. I am editor for an early example of such an external extension that predates the launch of the extension mechanism. Much more recently GS1 (The Global Language of Business) have published their External Extension – the GS1 Web Vocabulary.

An example of how the GS1 Web Vocabulary extends Schema.org can be seen from inspecting the class gs1:WearableProduct, which is a subclass of gs1:Product, which in turn is defined as an exact match to schema:Product. Looking at an example property of gs1:Product, gs1:brand, we can see that it is defined as a subproperty of schema:brand. This demonstrates how Schema.org is foundational to the GS1 Web Vocabulary.
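The chain described above can be sketched in a few lines of Python. This is only an illustration: the three dictionaries are hand-written from the relationships named in the text, not loaded from the published GS1 vocabulary, and the helper name resolve_to_core is hypothetical.

```python
# Hand-written from the relationships named in the post; an assumption for
# illustration, not data read from the published GS1 Web Vocabulary.
SUBCLASS_OF = {"gs1:WearableProduct": "gs1:Product"}
EXACT_MATCH = {"gs1:Product": "schema:Product"}
SUBPROPERTY_OF = {"gs1:brand": "schema:brand"}


def resolve_to_core(term):
    """Follow subclass, subproperty and exact-match links from a term.

    Returns the list of terms visited, starting with the term itself.
    (Hypothetical helper, not part of any Schema.org or GS1 tooling.)
    """
    path = [term]
    seen = {term}
    while True:
        current = path[-1]
        nxt = (
            SUBCLASS_OF.get(current)
            or SUBPROPERTY_OF.get(current)
            or EXACT_MATCH.get(current)
        )
        # Stop when there is no further link, or when a link would loop back.
        if nxt is None or nxt in seen:
            return path
        path.append(nxt)
        seen.add(nxt)


print(" -> ".join(resolve_to_core("gs1:WearableProduct")))
print(" -> ".join(resolve_to_core("gs1:brand")))
```

Both the class and the property end on a schema: term, which is the sense in which the external extension builds upon the core vocabulary.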

Choosing Where to Extend

This initially depends on what and how much you are wanting to extend.

If all you are thinking of is adding the odd property to an existing type, adding another type to the domain and/or range of a property, or improving the description of a type or property, you probably do not need to create an extension. Raise an issue and, after some thought and discussion, go for it – create the relevant code and associated Pull Request for the GitHub repository.

More substantial extensions require a bit of thought.

When proposing an extension to the vocabulary, the above-described structure provides the extender/developer with three options: extend the core, propose a hosted extension, or develop an external extension. Potentially a proposal could result in a combination of all three.

For example a proposal could be for a new Type (class) to be added to the core, with few or no additional properties other than those inherited from its super type.  In addition more domain focused properties, or subtypes, for that new type could be proposed as part of a hosted extension, and yet more very domain specific ones only being part of an external extension.

Although not an exact science, there are some basic principles behind such choices. These principles are based upon the broad context and use of Schema.org across the web; the consuming audience for the data that would be marked up; the domain specific knowledge of those that would do the marking up and review the proposal; and the domain specific need for the proposed terms.

Guiding Questions
A decision as to whether a proposed term should be in the core, a hosted extension or an external extension can be aided by the answers to some basic questions:

  • Public or not public? Will the data that would be marked up using the term be normally shared on the web? Would you expect to find that information on a publicly accessible web page today? If the answer is not public, there is no point in proposing the term for the core or a hosted extension. It would be defined in an external extension.
  • General or Specific? Is the level of information to be marked up, or the thing being described, of interest or relevant to non-domain specific consumers? If the answer is general, the term could be a candidate for a core term. For example, Train could be considered as a potential new subtype of Vehicle to describe that mode of transport that is relevant for general travel discovery needs, whereas SteamTrain and its associated specific properties about driving wheel configuration etc. would be more appropriate to a railway extension.
  • Popularity? How many sites on the web would potentially be expected to make use of these term(s)? How many webmasters would find them useful? If the answer is lots, you probably have a candidate for the core. If it is only a few hundred, especially if they would all be in a particular focus of interest, it would be more likely a candidate for a hosted extension. If it is a small number, it might be more appropriate in an external extension.
  • Detailed or Technical? Is the information, or the detailed nature of proposed properties, too technical for general consumption? If yes, the term should be proposed for a hosted or external extension. In the train example above, the fact that a steam train is being referenced could be contained in the text based description property of a Train type, whereas the type of steam engine configuration could be a defined value for a property in an external extension.
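
As a rough illustration, the four questions above can be read as a small decision function. This is a sketch under stated assumptions, not an official procedure: the function name suggest_home is hypothetical, and the numeric thresholds are invented here because the text only says "lots", "a few hundred" and "a small number".

```python
# Illustrative thresholds only: the post gives no figures, so both values
# are assumptions made for this sketch.
MANY_SITES = 10_000
FEW_SITES = 100


def suggest_home(is_public, is_general, expected_sites, too_technical):
    """Suggest where a proposed term might live: core, hosted or external.

    Hypothetical helper that encodes one reading of the guiding questions.
    """
    # Data not normally shared on the public web points to an external extension.
    if not is_public:
        return "external extension"
    # Only a small number of likely adopters also points to an external extension.
    if expected_sites < FEW_SITES:
        return "external extension"
    # General interest, not too technical, widely useful: a candidate for the core.
    if is_general and not too_technical and expected_sites >= MANY_SITES:
        return "core"
    # Everything in between is treated here as a hosted extension candidate.
    return "hosted extension"


# The Train / SteamTrain example from the list, with made-up site counts.
print("Train:", suggest_home(True, True, 50_000, False))
print("SteamTrain:", suggest_home(True, False, 500, True))
```

In practice the answers interact and are matters of judgement, so the output of a function like this is at most a starting point for discussion with the community.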
Evolutionary Steps

When defining and then proposing enhancements to the core of Schema.org, or for hosted extensions, there is a temptation to take an area of concern, analyse it in detail and then produce a fully complete proposal. Experience has demonstrated that it is beneficial to gain feedback on the use and adoption of terms before building upon them to extend and add more detailed capability.

Based on that experience, the way of extending should be by steps that build upon each other in stages: for example, introducing a new subtype with few if any new specific properties. Initial implementers can use textual description properties to qualify its values in this initial form. In later releases more specific properties can be proposed, their need being justified by the take-up, visibility, and use of the subtype on sites across the web.

Closing Summary

Several screenfuls and a few weeks ago, this started out as a simple post in an attempt to cover off some of the questions I am often asked about how Schema.org is structured, and how it can be made more appropriate for this project or that domain. Hopefully you find the distillation of my experience and my personal approach, across these three resulting posts on Evolving Schema.org in Practice, enlightening and helpful, especially if you are considering proposing a change, enhancement or extension to Schema.org.

My association with Schema.org – applying the vocabulary; making personal proposals; chairing W3C Community groups (Schema Bib Extend, Schema Architypes, The Tourism Structured Web Data Community Group); participating in others (Schema Course extension Community Group, Sport Schema Community Group, Financial Industry Business Ontology Community Group, Schema.org Community Group); being editor of an external extension vocabulary; working with various organisations such as OCLC, Google’s team, and the Financial Industry Business Ontology (FIBO); and preparing & presenting workshops & keynotes at general data and industry specific events – has taught me that there is much similarity between industries and sectors that are, on the surface, disparate when it comes to preparing structured data to be broadly shared and understood.

Often that similarity is hidden behind sector specific views, understanding, and issues in dealing with the open wide web of structured data where they are just another interested group, looking to benefit from shared recognition of common schemas by the major search engine organisations.  But that is all part of the joy and challenge I relish when entering a new domain and meeting new interested and motivated people.

Of course enhancing, extending and evolving the vocabulary is only one part of the story.  Actually applying it for benefit to aid the discovery of your organisation, your resources and the web sites that reference them is the main goal for most.

I get the feeling that there may be another blog post series I should be considering!

FOSS4Lib Recent Releases: veraPDF - 0.10

planet code4lib - Tue, 2016-03-01 14:45

Last updated March 1, 2016. Created by Peter Murray on March 1, 2016.

Package: veraPDF
Release Date: Monday, February 29, 2016

LITA: I’m a Librarian. Of tech, not books.

planet code4lib - Tue, 2016-03-01 03:38

When someone finds out I’m a librarian, they automatically think I know everything there is to know about, well, books. The thing is, I don’t. I got into libraries because of the technology. My career in libraries started with the take off, a supposed library replacement, of ebooks. Factor in the Google “scare” and librar*s  were going to be done forever. Librar*s were frantic to debunk that they were no longer going to be useful, insert perfect time and opportunity to join libraries and technology.

I am a Systems Librarian and the most common and loaded question I get from non-librarians is (in 2 parts), “What does that mean? and What do you do?” Usually this resorts to a very simple response:
I maintain the system the library sits on, the one that gives you access to the collection from your computer in the comfort of your home. This tool, that lets you view the collection online and borrow books and access databases and all sorts of resources from your pajamas, my job is to make sure that keeps running the way we need it to so you have the access you want.
My response aims to give a physical picture of a technical thing. There is so much we do as systems librarians that if I were to get in deep with what I do, we’d be there for a while. Between you and me, I don’t care to talk *that* much, but maybe I should.

There’s a lot more to being a Systems Librarian, much of which is unspoken and you don’t know about it until you’re in the throes of being a systems librarian. There was a Twitter conversation prompted when a Twitter’er asked for recommendations on things to teach or include in on-the-job training for someone who is interested in library systems. It got me thinking, because I knew little to nothing about being a Systems Librarian and just happened upon it (Systems Librarianship) because the job description sounded really interesting and I was already a little bit qualified. It also allowed me to build a skill set that provided me a gateway out of libraries if and when the time arrived. Looking back, I wonder what I would have wanted to know before going into Systems and, most importantly, whether it would have changed my decision to do so, or rather, to stay. So what is it to be a Systems Librarian?

The unique breed: A Systems Librarian:

  • makes sure users can virtually access a comprehensive list of the library’s collection
  • makes sure library staff can continue to maintain that ever-growing collection
  • makes sure that when things in the library system break, everything possible is done to repair it
  • needs to be able to accurately assess the problem presented by the frantic library staff member that cannot log into their ILS account
  • needs to be approachable while still being the person that may often say no
  • is an imperfect person that maintains an imperfect system so that multiple departments doing multiple tasks can do their daily work.
  • must combine the principles of librarianship with the abilities of computing technology
  • must be able to communicate the concerns and needs of the library to IT and communicate the concerns and needs of IT to the library

Things I would have wanted to know about Systems Librarianship: When you’re interested but naive about what it takes.

  • You need to be able to see the big and small pictures at once and how every piece fits into the puzzle
  • Systems Librarianship requires you to communicate, often and on difficult to explain topics. Take time to master this. You will be doing a lot of it and you want everyone involved to understand, because all parties will most likely be affected by the decision.
  • You don’t actually get to sit behind a computer all day every day just doing your thing.
  • You are the person to bridge the gap between IT and librarians. Take the time to understand the inner workings of both groups, especially as they relate to the library.
  • You’ll be expected to communicate between IT staff and Library staff why their request, no matter the intention, will or will not work AND if it will work, but would make things worse – why.
  • You will have a new problem to tackle almost every day. This is what makes the job so great
  • You need to understand the tasks of every department in the library. Take the time to get to know the staff of those departments as well – it will give insight to how people work.
  • You need to be able to say no to a request that should not or cannot be done, yes even to administration.
  • No one really knows all you do, so it’s important to take the time to explain your process when the time calls for it.
  • You’ll most likely inherit a system setup that is confusing at best. It’s your job to keep it going, make it better even.
  • You’ll be expected to make the “magic” happen, so you’ll need to be able to explain why things take time and don’t appear like a rabbit out of a hat.
  • You’ll benefit greatly from being open about how the system works and how one department’s requests can dramatically, or not so dramatically, affect another part of the system.
  • Be honest when you give timelines. If you think the job will take 2 weeks, give yourself 3.
  • You will spend a lot of time working with vendors. Don’t take their word for  “it,” whatever “it” happens to be.
  • This is important: you’re not alone. Ask questions on the email lists, chat groups, Twitter, etc.
  • You will be tempted to work on that problem after work. Schedule time after work to work on it, but do not let it take over your life; make sure you find your home/work life balance.

Being a systems librarian is hard work. It’s not always an appreciated job but it’s necessary and in the end, knowing everything I do,  I’d choose it again. Being a tech librarian is awesome and you don’t have to know everything about books to be good at it. I finally accepted this after months of ridicule from my trivia team for “failing” at librarianship because I didn’t know the answer to that obscure book reference from an author 65 years ago.

Also, those lists are not, by any means, complete — I’m curious, what would you add?

Possibly of interest, a bit dated (2011) but a comprehensive list of posts on systems librarianship:

HangingTogether: The end of an era — goodbye to Jim Michalko

planet code4lib - Tue, 2016-03-01 00:19

Today is the day when we say goodbye to our leader and colleague Jim Michalko. Rather than wallowing in our loss, we’d like this post to celebrate Jim’s accomplishments and acknowledge his many wonderful qualities.

Jim Michalko February 2016

Before OCLC, Jim was the president of the Research Libraries Group. He came to RLG from the administration team at the University of Pennsylvania Libraries in 1980. In those relatively early days of library automation, RLG was very much a chaotic startup. Jim, with both an MLS and an MBA, came on as the business manager and, as part of the senior administrative team, helped to get the organization on more stable footing. He was named RLG president in 1989.

In 2006, Jim once again played a key role in a time of uncertainty, helping to bring RLG into the OCLC fold. This included both integrating RLG data assets into OCLC services and bringing forward programmatic activities into OCLC Research. A key part of those programmatic activities is collaboration with the research library community, and the OCLC Research Library Partnership is a key component in driving our work agenda. Under Jim’s leadership, the Partnership has grown from 110 in 2006 to over 170 institutions now, including libraries at 25 of the top 30 universities in the Times Higher Education World University rankings.

Jim is a wise and gentle leader with a sardonic sense of humor. We’ve appreciated his ability to foster experimentation (and his patience while those experiments played out), his willingness to get obstacles out of our way so that we can get our work done, his tolerance of our quirks and other personal qualities, and his ability to maximize our strengths.

Jim’s retirement is part of a larger story that is playing out in the research library community as those who have overseen generations of change in technology, education, and policy are moving on. We will honor these leaders by following in their footsteps, while reminding ourselves that the path they set was marked by innovation.


About Merrilee Proffitt


LibUX: 034 – How “UX as a Measurement” Leads to “Service Design”

planet code4lib - Tue, 2016-03-01 00:07

You might remember that in our 2016 Design Predictions episode, my number one was that we are going to see an explosion of “Service Design” in writeups, job descriptions, and the like. I hadn’t really heard about Service Design until winter 2015, but as I was editing this episode — a recut of a talk from June prior — my spiel about conceptualizing the user experience as a measurement led into a totally unintended talk about service design. This makes sense, because when we think about UX as a measurement we are thinking about holistic experiences that transcend the screen which reflect back at us the quality of the services we provide.

Every service design decision you make has a performance pay off.

On the user experience as a measurement

Also, the slides from the above.

If you like you can download the MP3.

As usual, you support us by helping us get the word out: share a link and take a moment to leave a nice review. Thanks!

You can subscribe to LibUX on Stitcher, iTunes, or plug our feed right into your podcatcher of choice.


DuraSpace News: Osmania University Offers "Live DVD" for DSpace and Joomla Installation

planet code4lib - Tue, 2016-03-01 00:00

From P. Ramesh, Senior Technical Officer and Asst. Professor, Department of Library and information Science, Osmania University

Hyderabad, India  A team at Osmania University has developed a live DVD for installation of DSpace 5.2 and Joomla 3.4.5, which was formally released in 2015. This live DVD is very useful for Library and Information Science students, teachers and professionals. More than 650 people from 40 countries have downloaded and are using this file.

District Dispatch: CopyTalk cancelled for March

planet code4lib - Mon, 2016-02-29 23:22

CopyTalk webinar is cancelled for March.


Due to circumstances beyond any normal human’s control, we will not have a CopyTalk webinar for the month of March. We will be back on schedule April 7th 2016, the first Thursday of the month!

Upcoming webinars will focus on music, video and best practices for fair use regarding art resources.

Stay tuned!


District Dispatch: Free webinar: LibERate the Telecommunications Act of 1996! — Making E-Rate Make Sense

planet code4lib - Mon, 2016-02-29 21:15

Patrons using Wi-Fi at the MLK Digital Commons in Washington D.C.

WHAT:  Free PLA webinar! Presented in partnership with the ALA Office for Information Technology Policy (OITP).

WHEN:  Thursday, 3/3/2016

  • 2:00 PM-3:00 PM (Eastern)
  • 1:00 PM-2:00 PM (Central)
  • 12:00 PM-1:00 PM (Mountain)
  • 11:00 AM-12:00 PM (Pacific)

We’ve all heard about the massive changes to E-rate over the last couple of years. As we’re in the midst of filing for the 2016–2017 year, there are some changes that you don’t want to miss. You have probably heard there is less money for telephone services this year. Don’t let that get you down! It just means there’s even more money to support broadband access and connectivity. Since we have a legacy of providing access to information to the public that dates back to the earliest days of our national independence, we also know that we often have historical, beautiful buildings that had no way to predict the cabling and WiFi needs of today. The FCC wants to help us get past those and other obstacles, and improve our ability to keep our citizenry informed. In this free webinar, OITP staff and guests will touch on E-rate as a program, but really delve into some tools you want to have handy when you’re filing this year—and in years to come!

Learning Outcomes

At the conclusion of this webinar, participants will:

  • Know about changes to the E-rate program that can help you improve internet access and WiFi connectivity in your library;
  • Have discovered new resources to support you as you navigate the E-rate application process to set yourself up for success; and
  • Understand what you need to have on hand to start filing for FY16.

Who Should Attend

Representatives from any public library planning to file for E-rate funding. Appropriate for those new to E-rate as well as those with previous E-rate experience.


Emily Almond is Director of IT for the Georgia Public Library Service. After starting her career at CNN, she worked at Emory University as a systems librarian and then at the Atlanta Journal-Constitution as an archive manager and a project manager. She has experienced the ways in which technology can transform an organization and, further, the ways in which quality leadership and smart management can use the right technology in the right instances to achieve strategic goals. Emily holds a B.S. in Journalism from Kennesaw State University and an MLIS from Florida State University.

Amber Gregory has worked with the E-rate program since 2010 as the coordinator of E-rate Services at the Arkansas State Library where she helps public libraries navigate the program. Amber is currently a member of the American Library Association’s E-Rate Task Force.

Wendy Knapp is the associate director of Statewide Services at the Indiana State Library.

Marijke Visser is associate director of ALA’s OITP where she is responsible for broadband adoption and all of ALA’s work on E-rate issues. She came to OITP in 2009 to support a grant project funded by the Bill & Melinda Gates Foundation looking at broadband capacity in public libraries. She is also program director for OITP’s emerging portfolio on children, youth, and technology.  She co-chairs the Edlinc Coalition, which promotes E-rate policy for libraries and schools at the national level. In addition to E-rate, Marijke supports the Program on Networks focusing on broadband adoption issues for diverse populations.



You can register for this webinar until it begins, or until space is no longer available, whichever comes first. Please do not register unless you are sincere about attending the live webinar. Space is limited, and signing up and not attending may deprive someone else of the opportunity. Thank you for your cooperation.

How to Register

REGISTER NOW!  Click Register to continue the online registration process.


If you have a physical or communication need that may affect your participation in this webinar, please contact us at 800-545-2433 ext. 5PLA (5752) at least one week prior to the registration deadline above. Without prior notification of need, we cannot attempt to provide appropriate accommodations.


Tech Requirements

This webinar will be presented using the WebEx platform. You may listen to the audio portion of the webinar via your computer’s speakers, headphones plugged into your computer’s audio jack or USB port; or by dialing in with your telephone (your carrier’s charges may apply) or Skype (by following the process outlined by Skype to place calls to land lines). We suggest that groups, especially larger groups, plan ahead to use an LCD/LED projector in the room to project the webinar. Groups will also want to have speakers or a sound system capable of amplifying the webinar audio for the entire room. No microphone is required.

PLEASE NOTE: PLA provides its webinar audio through voice over IP (VoIP), which means the sound comes through speakers or headphones plugged into your computer. PLA works with its webinar platform provider to assure the highest quality audio is being delivered to attendees. However, variables over which PLA has no control—such as the speed of your Internet connection or traffic on your local network—can affect the end quality of the webinar audio delivered by your computer. Each webinar’s audio is also available by teleconference via a toll number, so we recommend you have access to a long-distance enabled phone as a backup in case you experience audio issues with VoIP. If you do encounter any problems during the webinar, you will receive a link to its archived recording within a week of the live event and can review anything you missed.


Questions about this webinar? Please contact us at 800-545-2433 ext. 5PLA (5752). For questions about webinar registration, please call 800-545-2433 ext. 5.


District Dispatch: Libraries recognized at House Energy and Commerce hearing on 3D printing

planet code4lib - Mon, 2016-02-29 15:18

U.S. Capitol. photo by Jonathon Colman via Flickr

Rep. Cardenas credits libraries as leaders of the maker movement

On Friday, the House Energy and Commerce Committee’s Subcommittee on Commerce, Manufacturing and Trade held a hearing exploring the implications of the rapid takeoff of 3D printing in this country and beyond. Witnesses included Alan Amling of UPS, Edward Herderick of General Electric, Ed Morris of the National Additive Manufacturing Innovation Institute (NAMII) – also known as America Makes – and Neal Orringer of 3D Systems.

The hearing touched on a myriad of topics, including the emerging field of bioprinting (the printing of human organs), the impact of 3D printing on the supply chain, and the consequences of the rise of 3D printed prosthetics for the public as well as the medical device industry. However, it didn’t start getting good for libraries until the issue of public access to 3D printing and its benefits to students and the workforce came to the fore. Rep. Yvette Clarke (NY-9) raised the topic. Not long after, the Chairman handed the floor to Rep. Tony Cardenas (CA-29).

Rep. Cardenas led with a nod to libraries as leaders of the maker movement, followed by an inquiry into the witnesses’ commitment to supporting the learning, innovation and workforce development the library community facilitates through 3D printing:

We’ve noticed that in America’s libraries, we’ve had an increase of opportunities…Libraries are investing in 3D printers – now to the tune of over 400 libraries, at little-to-no cost to individuals going to the library. For me, this is a very important issue for making sure we [provide] access to as many minds, as many inquisitive folks [as possible], so that they can get turned on to how wonderful it is, and to the potential of getting a job in the industry. How committed is the industry to advancing that kind of effort?

Neal Orringer of 3D Systems responded by trumpeting his company’s recent partnership with the Young Adult Library Services Association (YALSA) on the MakerLab Club initiative. “We need to do more like this (the MakerLab Club); it’s going to pay back dividends,” Orringer said.

Orringer also underscored the importance of helping libraries answer practical set-up and management questions so that they can connect their patrons to all of the benefits their 3D printing services have to offer. Ed Morris echoed this sentiment, emphasizing the need for organizations like his to ensure library professionals have the knowledge and the training they need to keep their 3D printers operating over the long-term. Rep. Cardenas concluded the thread on libraries by exhorting the industry leaders in attendance to view partnerships with libraries and other anchor institutions around 3D printing as “an investment in human capital.”

ALA is deeply grateful to Rep. Cardenas for his eloquent acknowledgement of the library community’s efforts to democratize and build skills through 3D printing technology. We hope that the discussion his questions sparked yields fruitful collaboration between libraries and 3D printing leaders across the public, private and non-profit sectors. For a video of Rep. Cardenas’ comments on libraries, click here. For a full video of the hearing, click here.

The post Libraries recognized at House Energy and Commerce hearing on 3D printing appeared first on District Dispatch.

Open Knowledge Foundation: Sloan Foundation Funds Frictionless Data Tooling and Engagement at Open Knowledge

planet code4lib - Mon, 2016-02-29 12:58

We are excited to announce that Open Knowledge International has received $700,000 in funding from The Alfred P. Sloan Foundation over two years to work on a broad range of activities to enable better research and more effective civic tech through our Frictionless Data initiative. The funding will target standards work, tooling, and infrastructure around “data packages” as well as piloting and outreach activities to support researchers and civic technologists in addressing real problems encountered when working with data.

The Alfred P. Sloan Foundation is a philanthropic, not-for-profit grant-making institution based in New York City. Established in 1934 by Alfred Pritchard Sloan Jr., then-President and Chief Executive Officer of the General Motors Corporation, the Foundation makes grants in support of original research and education in science, technology, engineering, mathematics and economic performance.  

“Analyzing and working with data is a significant (and growing) source of pain for researchers of all types”, says Josh Greenberg, Program Director at the Alfred P. Sloan Foundation. “We are excited to support Open Knowledge International in this critical area. This support will help data-intensive researchers to be more efficient and effective.”

What is being funded?

The funding will support three key streams of work around data packages: (a) the further development of the data package suite of standards, (b) the creation and enhancement of a suite of tools and integrations around these standards, and (c) broad outreach and engagement to educate researchers about the benefits of this approach.


The Data Package standard is a simple, lightweight specification for packaging all types of data, but we have a special emphasis on tabular (e.g. CSV) data. As the sources of useful data grow, effective data-driven research is becoming more and more critical. Such research often depends on cleaning and validating data, as well as combining such data from multiple sources, processes that are still frequently manual, tedious, and error-prone.  Data packages allow for the greater automation of these processes, thereby eliminating the “friction” involved.  
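To make the idea concrete, here is a minimal sketch of a data package descriptor and a schema check for tabular data. It is an illustration only, not the initiative's own tooling: the package name, file path, field names and the `validate_rows` helper are all hypothetical, and the type handling is far simpler than the real specifications.

```python
import csv
import io
import json

# Hypothetical descriptor for illustration; the name, path and fields are made up.
descriptor = {
    "name": "example-package",
    "resources": [
        {
            "path": "data.csv",
            "schema": {
                "fields": [
                    {"name": "id", "type": "integer"},
                    {"name": "amount", "type": "number"},
                    {"name": "label", "type": "string"},
                ]
            },
        }
    ],
}

# Simplified casters for three field types (an assumption, not the full spec).
CASTERS = {"integer": int, "number": float, "string": str}


def validate_rows(csv_text, schema):
    """Return (row_number, field_name, bad_value) for every cell that fails to cast."""
    errors = []
    reader = csv.DictReader(io.StringIO(csv_text))
    for row_number, row in enumerate(reader, start=2):  # row 1 is the header
        for field in schema["fields"]:
            value = row.get(field["name"])
            try:
                CASTERS[field["type"]](value)
            except (TypeError, ValueError):
                errors.append((row_number, field["name"], value))
    return errors


sample = "id,amount,label\n1,2.5,ok\nx,3,bad id\n"
schema = descriptor["resources"][0]["schema"]
print(json.dumps(descriptor, indent=2))  # what a datapackage.json might contain
print(validate_rows(sample, schema))
```

Because the schema travels with the data, a check like this can run automatically whenever the CSV changes, which is the kind of manual cleaning step the paragraph above describes removing.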

Tooling and Integration

A key aspect of this work is that it aligns with researchers’ usual tools and will require few or no changes to existing data and data structures.  To do this, we are seeking to build and support integrations with popular tools for research, for example, R, STATA, LibreOffice, etc.  In addition, we are looking to define ways of seamlessly translating datasets to and from typical file formats used across various research communities such as HDF5, NetCDF, etc.

Community Outreach

While our core mission is to design a well-defined set of specifications and build a rich and vibrant ecosystem of tooling around them, none of this is possible without also building a broad awareness of data packages, where to use them and their utility, and a sustainable group of engaged users to support this.  To make our work in this area as effective as possible, we are building partnerships with organizations in research, civic tech, and government.

Be a part of the Frictionless Data future

We are looking to discover much more about the needs of different research groups and to identify the problems they might currently have.  To do this, we are running targeted pilots to trial these tools and specifications on real data.

Are you a researcher looking for better tooling to manage your data?  

Do you work at or represent an organization working on issues related to research data, such as DataCite, DataONE, RDA, or CODATA, and would you like to work with us on complementary issues for which data packages are suited?

Are you a developer with an idea for something we can build together?

Are you a student looking to learn more about data wrangling, managing research data, or open data in general?

If any of the above apply to you, email us at  We’d love to hear from you.  If you have any other questions or comments about this initiative, please visit this topic in our forum: or hashtag #frictionlessdata. 

Stuart Yeates: Prep notes for NDF2011 demonstration

planet code4lib - Mon, 2016-02-29 06:56
I didn't really have a presentation for my demonstration at the NDF, but the event team have asked for presentations, so here are the notes for my practice demonstration that I did within the library. The notes served as an advert to attract punters to the demo; as a conversation starter in the actual demo and as a set of bookmarks of the URLs I wanted to open.

Depending on what people are interested in, I'll be doing three things

*) Demonstrating basic editing, perhaps by creating a page from the requested articles at

*) Discussing some of the quality control processes I've been involved with ( and

*) Discussing how wikipedia handles authority control issues using redirects ( ) and disambiguation ( )

I'm also open to suggestions of other things to talk about.

Stuart Yeates: Thoughts on the NDFNZ wikipedia panel

planet code4lib - Mon, 2016-02-29 06:55

Last week I was on an NDFNZ wikipedia panel with Courtney Johnston, Sara Barham and Mike Dickison. Having reflected a little and watched the youtube at I've got some comments to make (or to repeat, as the case may be).

Many people, apparently including Courtney, seemed to get the most enjoyment out of writing the ‘body text’ of articles. This is fine, because the body text (the core textual content of the article) is the core of what the encyclopaedia is about. If you can’t be bothered with wikiprojects, categories, infoboxes, common names and wikidata, you’re not alone and there’s no reason you need to delve into them to any extent. If you start an article with body text and references that’s fine; other people will to a greater or lesser extent do that work for you over time. If you’re starting a non-trivial number of similar articles, get yourself a prototype which does most of the stuff for you (I still use which I wrote for doing New Zealand women academics). If you need a prototype like this, feel free to ask me.

If you have a list of things (people, public art works, exhibitions) in some machine readable format (Excel, CSV, etc) it’s pretty straightforward to turn them into a table like or Send me your data and what kind of direction you want to take it.
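As a rough sketch of that conversion, the following turns CSV text into MediaWiki table markup. The function name `csv_to_wikitable` and the sample row are made up for illustration; it assumes the first CSV row holds the column headers and that no cell contains a pipe character, which would need escaping in real wikitext.

```python
import csv
import io


def csv_to_wikitable(csv_text):
    """Convert CSV text (first row = headers) into MediaWiki table markup."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ""
    header, body = rows[0], rows[1:]
    lines = ['{| class="wikitable sortable"']       # open a sortable table
    lines.append("! " + " !! ".join(header))        # header cells
    for row in body:
        lines.append("|-")                          # row separator
        lines.append("| " + " || ".join(row))       # data cells
    lines.append("|}")                              # close the table
    return "\n".join(lines)


# Placeholder data, not a real artwork or artist.
print(csv_to_wikitable("Name,Artist,Year\nExample Work,Example Artist,2001\n"))
```

The output can be pasted straight into a wiki page; references, thumbnails and links would still be added by hand or by a fuller script.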

If you have a random thing that you think needs a Wikipedia article, add to  if you have a hundred things that you think need articles, start a subpage, a la and both completed projects of mine.

Sara mentioned that they were thinking of getting subject matter experts to contribute to relevant wikipedia articles. In theory this is a great idea and some famous subject matter experts contributed to Britannica, so this is well-established ground. However, there have been some recent wikipedia failures particularly in the sciences. People used to ground-breaking writing may have difficulty switching to a genre where no original ideas are permitted and everything needs to be balanced and referenced.

Preparing for the event, I created a list of things the awesome Dowse team could do as follow-ups to their craft artists work, but we never got to that in the session, so I've listed them here:
  1. [[List of public art in Lower Hutt]] Since public art is out of copyright, someone could spend a couple of weeks taking photos of all the public art and creating a table with clickable thumbnail, name, artist, date, notes and GPS coordinates. Could probably steal some logic from somewhere to make the table convertible to a set of points inside a GPS for a tour.
  2. Publish from their archives a complete list of every exhibition ever held at the Dowse since founding. Each exhibition is a shout-out to the artists involved and the list can be used to check for potentially missing wikipedia articles.
  3. Digitise and release photos taken at exhibition openings, capturing the people, fashion and feeling of those eras. The hard part of this, of course, is labelling the people.
  4. Reach out to their broader community to use the Dowse blog to publish community-written obituaries and similar content (i.e. encourage the generation of quality secondary sources).
  5. Engage with local artists and politicians by taking pictures at Dowse events, uploading them to commons and adding them to the subjects’ wikipedia articles; make attending a Dowse exhibition opening the easiest way for locals to get a new wikipedia image.
I've not listed the 'digitise the collections' option, since at the end of the day, the value of this (to wikipedia) declines over time (because there are more and more alternative sources) and the price of putting them online declines. I'd much rather people tried new innovative things when they had the agility and leadership that lets them do it, because that's how the community as a whole moves forward.
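Item 1 in the list above mentions turning a table of public art into points for a GPS tour. One possible way to do that, sketched here under assumptions, is to write the rows out as GPX waypoints. The helper name `rows_to_gpx`, the creator string and the coordinates are placeholders, not real artwork locations.

```python
import xml.etree.ElementTree as ET

GPX_NS = "http://www.topografix.com/GPX/1/1"


def rows_to_gpx(rows):
    """Build a GPX 1.1 document with one waypoint per (name, lat, lon) row."""
    gpx = ET.Element(
        "gpx",
        {"version": "1.1", "creator": "example-script", "xmlns": GPX_NS},
    )
    for name, lat, lon in rows:
        # GPX stores coordinates as decimal degrees in lat/lon attributes.
        wpt = ET.SubElement(gpx, "wpt", {"lat": f"{lat:.6f}", "lon": f"{lon:.6f}"})
        ET.SubElement(wpt, "name").text = name
    return ET.tostring(gpx, encoding="unicode")


# Placeholder rows; names and coordinates are illustrative only.
artworks = [("Example Sculpture", -41.2, 174.9), ("Example Mural", -41.21, 174.91)]
print(rows_to_gpx(artworks))
```

Many GPS units and phone mapping apps can import a GPX file, so the same table could feed both the article and a walking tour.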

Journal of Web Librarianship: A Review of "The Complete Guide to Using Google in Libraries: Research, User Applications, and Networking, Vol. 2"

planet code4lib - Mon, 2016-02-29 05:37
Volume 10, Issue 1, January-March 2016, pages 45-45
Dena L. Luce

Terry Reese: MarcEdit Update

planet code4lib - Sun, 2016-02-28 14:49

An update was posted Feb. 27 for all versions. It contains the following changes:


  • Enhancement: Character Set Detection: MarcEdit is including a tool that will provide a heuristic analysis of a file to provide best-guess character set detection.
  • Enhancement: Build New Tool Function: Adding a find macro to the function so that users can now identify specific fields when building new fields from data in a MARC record.
  • Update: Build Links: improved handling of MESH data
  • Update: Build Links: improved handling of AAT data
  • Update: Build Links — improved handling of ULAN data
  • Update: Build Links: added a workaround for character escaping issues found in .NET 4.0. The issue impacts URIs with trailing periods and slashes (/). Apparently, the URI encoding tool doesn’t escape them properly because of how Windows handles file paths.
  • Update: Build Links — Rules file updated to include refined definitions for the 6xx fields.
  • Update: MarcEdit Command-Line: program updated to include new build links functional updates
  • Update: COM object: Updated character encoding switching to simplify streaming functions.
  • Update: Validate Headings: Integrated rules file into checking.
  • Bug Fix: Validate Headings: headings validation was being tripped by the URI escaping issue in .NET 4.0. This has been corrected.
  • Update: RDA Helper: Finished code refinements
  • Update: Build Links — tool is now asynchronous
  • Enhancement: Build Links — Users can now select and build their own rules files
  • Enhancement: Build Links — Tool now includes a function that will track resolution speed from linked services and attempt to provide notification when services are performing poorly. First version won’t identify particular services — just that data isn’t being processed in a timely manner.
  • Bug Fix: Character Conversion: in UTF-8 to MARC-8 conversion, the {dollar} literal wasn’t being converted back to a literal dollar sign. This was related to removing the fallback entity checking in the last update. This has been corrected.

Updates can be picked up through the automated update tools in MarcEdit or via the downloads page:



