Mita Williams: Hackerspaces, Makerspaces, Fab Labs, TechShops, Incubators, Accelerators... Where do libraries fit in?
Today’s session is going to start out as a field guide but it’s going to end with a history lesson.
We’re going to start here - with a space station called c-base that was founded in Berlin in 1995.
And then we are going to travel through time and space to the present day, where business start-up incubator innovation labs are everywhere, including CBASE, the College of Business and Economics at the University of Guelph.
But before we figure out where library makerspaces fit in, we’re going to use the c-base space station to go back in time, to just before the very first public libraries were established around the world, so we can figure out how to go back to the future we want. It is 2015, after all.
But before we can talk about library makerspaces, we need to talk about hackerspaces.
This is the inside of c-base.
c-base is considered one of the very first hackerspaces - perhaps even the first. It was established in 1995 by self-proclaimed nerds, sci-fi fans, and digital activists, who tell us that c-base was built from a reconstructed space station that fell to earth, somehow became buried, and when it was uncovered was found to bear the inscription: be future compatible.
The c-base is described as a system of seven concentric rings that can move in relation to each other. These rings are called core, com, culture, creative, cience, carbon and clamp.
Beyond its own many activities, c-base has become the meeting place for German Wikipedians and it’s where the German Pirate Party was first established.
Members of c-base have been known to present at events hosted by the Chaos Computer Club, which is Europe's largest association of hackers that's been around for 30 years now.
So c-base is a hackerspace that is actually inhabited by what we commonly think of as hackers.
Some of the earliest hackerspaces were directly inspired by c-base. The story goes that in August 2007, a group of North American hackers visited Germany for the Chaos Communication Camp and were so impressed that, when they came back, they formed the first hackerspaces in the United States, including NYC Resistor (2007), HacDC (2007), and Noisebridge (San Francisco, 2008).
Since then, many, many more hackerspaces have been developed - there are at least a thousand - but behind these new spaces are organizations that are much less counter-cultural in their orientation than the mothership of c-base. In fact, at this moment, you could say there isn’t a clear delineation between hackerspaces and makerspaces at all.
But before we can start talking about makerspaces, I think it’s necessary to pay a visit to two branches of the hackerspace evolutionary tree: TechShops and Fab Labs.
TechShop is a business that started in 2006 which provides - in return for a monthly membership - access to a space that contains over half a million dollars of equipment, generally including an electronics lab, a machine shop, a wood shop, a metal working shop, etc. There are only 8 of these TechShops across the US, despite earlier predictions that there would be about 20 of them by now. They have been slow to open because the owner has stated that the business requires at least 800 people willing to pay over $100 a month in order for a TechShop to be viable.
The motto of TechShop is Build Your Dreams Here. But TechShops have largely been understood as places where members dream up prototypes for their future Kickstarter projects. And such dreams have already come true: the prototype of the Square credit card processing reader, for example, was built in a TechShop. I think it's telling that the Detroit TechShop has a bright red phone in the space that connects you directly to the United States Patent and Trademark Office in case of a patent emergency.
Three of the eight TechShops have backing from other organizations. TechShop's Detroit center opened in 2012 in partnership with Ford, which gives its employees free membership for three months. Ford employees can claim patents for themselves or they can give them to Ford in exchange for a share in revenue generated. Ford claims that this partnership with TechShop has led to a 50% rise in the number of patentable ideas put forward by the carmaker's employees in one year.
TechShop's offices in Washington DC and Pittsburgh are being sponsored by DARPA, an agency of the Defense Department. DARPA is reported to have invested $3.5 million into TechShop as part of its “broad mission to see if regular citizens can outinvent military contractors on some of its weirder projects.” But DARPA is not just helping pay for the space, they supposedly use the space themselves. According to the Bloomberg Business Week story I read, DARPA employees arrive at midnight to work when the TechShop is closed to its regular members.
You might be surprised, but we're going to be talking about DARPA again during this talk. But before that, we need to visit another franchise-like type of makerspace called the Fab Lab.
In 1998, Neil Gershenfeld started a class at MIT called "How to make (almost) anything". Gershenfeld wanted to introduce students to industrial-size machines they would not normally have access to. However, he found his class also attracted a lot of students from various backgrounds, including artists, architects, and designers. This led to a larger collaboration which eventually resulted in the Fab Lab Project, which began in 2001. Fab Lab began as an educational outreach program from MIT, but the idea has since developed into an ambitious network of labs located around the world.
The idea behind Fab Lab is that the space should provide a core set of tools powered by open source software that allow novice makers to make almost anything, given a brief introduction to engineering and design education. Anyone can create a recognized Fab Lab as long as it makes a strong effort to uphold the criteria of a Fab Lab, the most important being that Fab Labs are required to be regularly open to the public for little or no cost. While it's not required, a Fab Lab is also strongly encouraged to communicate and collaborate with the 350 or so other Fab Labs around the world. The idea is that, for example, if you design and make something using Fab Lab equipment in Boston, you could send the files and documents to someone in the Cape Town Fab Lab, who could make the same thing using their equipment.
The first library makerspace was a Fab Lab. It was established in 2011 in the Fayetteville Free Library in the state of New York. That's Lauren Britton pictured on screen who was a driving force that helped make that happen.
Now we don't tend to talk about Fab Labs in libraries; we talk about makerspaces. I think this is for several reasons, one of the main ones being that - as admirable as I personally find the goals of international collaboration through open source and standardization - the established minimum baseline for a Fab Lab generally costs between $25,000 and $65,000 in capital costs alone. This means that a proper Fab Lab is out of reach for many communities and smaller organizations.
I think there's another reason why we think of makerspaces before we think of Fab Labs, TechShops or hackerspaces. And that's because of Make Magazine.
Started in 2005 by O'Reilly Publishing, the influential source of so many essential computer books, Make Magazine was originally going to be called Hack. But then the daughter of founder Dale Dougherty told him that hacking didn’t sound good, and she didn’t like it. She suggested he call the magazine MAKE instead, because ‘everyone likes making things’.
And there is something to be said for having a more inclusive name, and something less threatening than hackerspace. But I think there's more to it as well. There is a freedom that comes with the name of makerspace.
One of my favourite things about makerspaces is that most of them are open to everyone - artists, scientists, educators, hobbyists, hackers and entrepreneurs - and it is this possibility of cross-pollination of ideas that is one of the espoused benefits of these spaces for their members. In a world with so much specialization, makerspaces are a force trying to bring different groups of people together.
Here's such an example. This is i3Detroit, which calls itself a DIY co-working space that is "a collision of art, technology and collaboration".
There are also makerspaces that are more heavily arts-based. Miss Despoinas is a salon for experimental research and radical aesthetics that hosts workshops using code in contemporary art practice. It is physically located in Hobart, Tasmania.
There are presumably makerspaces designed primarily for the launching of new companies, although the only one I could find was Haxlr8r. Haxlr8r is a hardware business accelerator that combines workshop space with mentorship and venture capital opportunities, with official bases in San Francisco and Shenzhen, China.
That being said, I can't help but note that most of the makerspaces I've found that are designed specifically to support start-ups have been in universities. Pictured here is the "Industrial Courtyard", where students and recent graduates of the university where I work can have access to space for prototype or product development.
In some ways, this brings us full circle, because it's been said that the originators of the first hackerspaces set them up deliberately outside of universities, governments, and businesses because they wanted a form of political independence, and even a place for resistance to the bad actors of those organizations.
As Willow Brugh describes this transition from the earliest hackerspaces and hacklabs:
The commercialization of the space means more people have access to the ideals of these spaces - but just as when "Open Source" opened up the door to more participants, the blatant political statement of "Free Software" was lost - hacklabs have turned from a political statement on use of space and voice into a place for production and participation in mainstream culture.
For as neutral and benign as makerspaces seemingly are ("everyone likes to make things"), there are reasons to be mindful of the organizations behind them. For one, in 2012 Make Magazine received a grant from DARPA to establish makerspaces in 1,000 U.S. high schools over the next four years.
Now it's one thing if makerspaces simply exist as a place where friends and hobbyists can meet, work and learn from each other. It's quite another if the makerspace becomes the basis of a model to address STEM anxieties in education.
As much as I appreciate how the Maker Movement is trying to bring a playful approach to learning through building, it's important to recognize that makerspaces tend to collect successful makers rather than produce them. The community that participates in hackerspaces and makerspaces skews pronouncedly white and male. In 2012, Make Magazine reported that of its 300,000 total readership, 81% are male, the median age is 44, and the median household income is $106,000.
Lauren Britton, the librarian who was responsible for the very first Library Fab Lab/Makerspace, is now a doctoral student in Information Science and Technology at Syracuse University and a researcher for their Information Institute. She's been doing discourse analysis on the maker movement, and last year she informally published some of her findings so far. She's already tackled STEM anxiety, and I'm particularly looking forward to what she has to say about gender and the makerspace movement.
But there's no time to get into all of that now, because it is time to hop into c-base and travel through time and space to the time before public libraries. We are going to travel up the makerspace evolutionary tree to what I like to consider the proto-species of the makerspace: the Mechanics' Institute.
The world's first Mechanics' Institute was established in Edinburgh, Scotland in October 1821. Mechanics' Institutes were formed to provide libraries and forms of adult education, particularly in technical subjects, to working men. As such, they were often funded by local industrialists on the grounds that they would ultimately benefit from having more knowledgeable and skilled employees. Mechanics' Institutes as an institution did not last very long - the movement lasted only fifty years or so - although at their peak there were 700 of them worldwide.
What I think is particularly poetic is that many of the buildings and core book collections of these Mechanics' Institutes - especially where I'm from, the province of Ontario in Canada - became the foundation for the very first public libraries.
There are still some Mechanics' Institutes among us - like coelacanths, evolutionarily speaking - most notably Montreal's Atwater Library and San Francisco's beautiful Mechanics' Institute and Chess Room.
Now, I have to admit, when I see some makerspaces, they remind me of Mechanics' Institutes: subsidized spaces that exist to provide access to technologies to be used for potential start-ups. And if that remains their primary focus, I think their moment will pass, just like the Mechanics' Institutes'. The forces that made industrial technology accessible to small groups will presumably continue to develop it into consumer technology. To live by disruption is to die by disruption.
This is one reason why I'm so happy and proud of the way so many libraries have embraced makerspaces and have made them their own. Because by and large, libraries keep people at the centre of the space - not technology.
Librarians - by and large - have opted for accessible materials and activities in their spaces and host activities that emphasize creativity, personal expression and learning through play.
This is The Bubbler, a visual-arts-based makerspace at the Madison Public Library. I have never been, but from what I can see, they are doing many wonderful things. They host events that involve bike hacking, audio engineering, board game making, and media creation projects. I was particularly impressed by how they are working with juvenile justice programs to bring these activities and workshops to justice-involved youth.
As long as libraries can continue to focus on building a better future for all of us, then we can continue to be a space where that future can be built.
This concludes our tour through time and space. Thank you kindly for your attention.
May your libraries and your makerspaces be future compatible.
Today I found the following resources and bookmarked them:
- Coggle: Coggle is about redefining the way documents work: the way we share and store knowledge. It’s a space for thoughts that works the way that people do — not in the rigid ways of computers.
- Irony of Ironies
- ATO2014: Open Source Schools: More Soup, Less Nuts
- NFAIS: Innovation for Today’s Chemical Researchers
President Barack Obama today transmitted to Congress the Obama Administration’s nearly $4 trillion budget request to fund the federal government for fiscal year 2016, which starts October 1, 2015. The President’s budget reflected many of the ideas and proposals outlined in his January 20th State of the Union speech.
Highlights for the library community include $186.5 million in assistance to libraries through the Library Services and Technology Act (LSTA). This important program provides funding to states through the Institute of Museum and Library Services (IMLS).
“We applaud the President for recognizing the tremendous contributions libraries make to our communities, ” said American Library Association (ALA) President Courtney Young in a statement. “The American Library Association appreciates the importance of federal support for library services around the country, and we look forward to working with the Congress as they draft a budget for the nation.
“The biggest news for the library community is the announcement of $8.8 million in funding for a national digital platform for library and museum services, which will give more Americans free and electronic access to the resources of libraries, archives, and museums by promoting the use of technology to expand access to their holdings. This new program will be funded through the IMLS National Leadership Grant programs for Libraries ($5.3 million) and Museums ($3.5 million).

| Statutory Authority | FY 2010 | FY 2011 | FY 2012 | FY 2013 | FY 2014 Request | FY 2014 Enacted | FY 2015 Request |
| --- | --- | --- | --- | --- | --- | --- | --- |
| Grants to States | 172,561 | 160,032 | 156,365 | 150,000 | 150,000 | 154,848 | 152,501 |
| Native Am/Haw. Libraries | 4,000 | 3,960 | 3,869 | 3,667 | 3,869 | 3,861 | 3,869 |
| Nat. Leadership / Libraries | 12,437 | 12,225 | 11,946 | 11,377 | 13,200 | 12,200 | 12,232 |
| Laura Bush 21st Century | 24,525 | 12,818 | 12,524 | 10,000 | 10,000 | 10,000 | 10,000 |
| Subtotal, LSTA | 213,523 | 189,035 | 184,704 | 175,044 | 177,069 | 180,909 | 178,602 |
“With the appropriations process beginning, we look forward to working for continued support of key programs, including early childhood learning, digital literacy, and the Library Services and Technology Act.”
The post President Obama’s budget increases library funding appeared first on District Dispatch.
If you would like to listen in to the LITA Board meeting at ALA Midwinter 2015, it is streaming (in audio) below:
The Islandora 7.x/Fedora 4.x integration that we announced in December has officially begun. Work began on January 19th, our first team meeting was Friday, January 30th, and we will be meeting on the 4th Friday of every month at 1:00 PM Eastern time. Here's what's going on so far:

Project Updates
The new, Fedora 4 friendly version of Islandora is being built under the working designation of Islandora 7.x-2.x (as opposed to the 7.x-1.x series that encompasses current Fedora 3.x updates to Islandora, which are not going away any time soon). A new GitHub organization is in place for development and testing, and the Islandora Fedora 4 Interest Group has been reconvened under new Terms of Reference to act as a project group for the Fedora 4 integration. If you want to participate, please sign up as part of this group. If you don't have time to participate in regular meetings, we would still love to hear your use case. You can submit it for discussion in the issue queue of the interest group. Need help getting into the GitHub of it all? Contact us and we'll get you there.
There is also a new chef recipe in the works to quickly spin up development and testing environments with the latest for 7.x-2.x. Special thanks to MJ Suhonos and the team at Ryerson University for Islandora Chef!
- The University of Toronto Scarborough
- The University of Oklahoma
- The University of Manitoba
- The University of Virginia
- The University of Prince Edward Island
- The University of Limerick
- Simon Fraser University
- Common Media
- The Colorado Alliance
If you would like to talk to Nick and Danny about the project, or even offer up some help while they code away on an unofficial 'sprint,' you can meet up with them at discoverygarden's table at Code4Lib 2015 in Portland, OR, February 9 - 12.

Technical Planning
Danny Lamb has kicked off the design of the next stage of Islandora with a Technical Design Doc that you should definitely read and comment on if you have any plans to use Islandora with Fedora 4 in the future. We are still at the stage of hearing use cases and making plans, so now is the time to get your needs into the mix. The opening line sums up the basic approach: Islandora version 7.x-2.x is middleware built using Apache Camel to orchestrate distributed data processing and to provide web services required by institutions who would like to use Drupal as a frontend to a Fedora 4 JCR repository.
Some preliminary Big Ideas:
- No more Tuque. No more GSearch. No more xml forms. The Java middleware layer will handle many things that were previously done in PHP and Drupal.
- It will treat Drupal like any other component of the stack. There will be indexing in Drupal for display using nodes, fields, and other parts of the Drupal ecosystem.
- It will use persistent queues, so the middleware layer can exist on separate servers.
- The Fedora-Drupal connection comes first. An admin interface will be developed later.
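The queue-based decoupling described in the points above can be sketched in a few lines. This is only an illustration of the idea, not Islandora code: an in-memory queue stands in for the persistent broker the middleware would actually use, and the message fields are hypothetical.

```python
import json
import queue

# In-memory stand-in for a persistent broker (illustration only).
event_queue = queue.Queue()

def publish_event(pid, action):
    """Repository side: emit a message describing what changed."""
    event_queue.put(json.dumps({"pid": pid, "action": action}))

def consume_events(handler):
    """Middleware side: drain the queue and hand each event to an
    indexing handler. Because only messages cross the boundary, this
    consumer could run on a completely separate server."""
    processed = []
    while not event_queue.empty():
        msg = json.loads(event_queue.get())
        processed.append(handler(msg))
    return processed

publish_event("islandora:1", "ingest")
publish_event("islandora:2", "modify")
results = consume_events(lambda m: f"{m['action']}:{m['pid']}")
```

The point is that the producer never calls the consumer directly, which is what lets the middleware layer live on separate servers.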
And some preliminary Wild Ideas (we'd love to hear your opinions):
- Headless Drupal 7.x
- Make the REST API endpoints the same for Drupal 7 and Drupal 8 so migration is easier.
- Dropbox-style ingest.
Migration

Or rather, upgration (a portmanteau of upgrade and migration, and our new favourite word). Nick Ruest and York University are working through a Fedora 3.x -> 4.x upgration path. Because York's Islandora stack is as close to generic as you can reasonably get in production, this should provide a model for a generic upgration path that others can follow - as well as keeping the needs of the Islandora community on the radar for the Fedora 4 development team, so that all of the pieces evolve to work together.

Funding
We launched the project with a funding goal of $100,000 to get a functioning prototype and Fedora 3.x -> 4.x migration path. We are very pleased to announce that we have achieved more than half of that funding goal and are well set to see things through to the end.
Many, many thanks to our supporters, all of whom are now members of the Islandora Foundation as Partners:
- York University
- McMaster University
- University of Prince Edward Island
- University of Manitoba
- University of Limerick
If your institution would like to join up, whether as a $10,000 Partner or at some other level of support, please contact us.
For a change of pace: A not too technical tale of my recent visit to England.
The people behind the IIPC Technical Training Workshop – London 2015 had invited yours truly as a speaker and participant in the technical training. IIPC stands for International Internet Preservation Consortium, and I was to talk about using Solr for indexing and searching preserved Internet resources. That sounded interesting, and Statsbiblioteket encourages inter-institutional collaboration, so the invitation was gladly accepted. Some time passed, and the British Library asked if I might consider arriving a few days early and visiting their IT development department? Well played, BL, well played.
I kid. For those not in the know, the British Library made the core software we use for our Net Archive indexing project, and we are very thankful for that. Unfortunately, they do have some performance problems. Spending a few days, primarily talking about how to get their setup to work better, was just reciprocal altruism at work. Besides, it turned out to be a learning experience for both sides.

At British Library, Boston Spa
The current net-archive-oriented Solr setups at the British Library use SolrCloud with live indexes on machines with spinning drives (aka harddisks) and a - relative to index size - low amount of RAM. At Statsbiblioteket, our experience tells us that such setups generally have very poor performance. Gil Hoggarth and I discussed Solr performance at length, and he was tenacious in exploring every option available. Andy Jackson partook in most of the debates. Log file inspections and previous measurements from the Statsbiblioteket setups seemed to sway them in favour of different base hardware, or to be specific: Solid State Drives. The open question is how much such a switch would help, or whether it would be a better investment to increase the amount of free memory for caching.
- A comparative analysis of performance with spinning drives vs. SSDs for multi-TB Solr indexes on machines with low memory would help other institutions tremendously, when planning and designing indexing solutions for net archives.
- A comparative analysis of performance with different amounts of free memory for caching, as a fraction of index size, for both spinning drives and SSDs, would be beneficial on a broader level; this would give an idea of how to optimize bang-for-the-buck.
Logistically, the indexes at the British Library are quite different from the index at Statsbiblioteket: they follow the standard Solr recommendation and treat all shards as a single index, both for indexing and search. At Statsbiblioteket, shards are built separately and only treated as a whole index at search time. The live indexes at the British Library have some downsides, namely re-indexing challenges, distributed indexing logistics overhead and higher hardware requirements. They also have positive features, primarily homogeneous shards and the ability to update individual documents. The updating of individual documents is very useful for tracking metadata for resources that are harvested at different times but have unchanged content. Tracking of such content, also called duplicate handling, is a problem we have not yet considered in depth at Statsbiblioteket. One of the challenges of switching to static indexes is thus:
- When a resource is harvested multiple times without the content changing, it should be indexed in such a way that all retrieval dates can be extracted and such that the latest (and/or the earliest?) harvest date can be used for sorting, grouping and/or faceting.
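One way the requirement above could map onto Solr is to index one document per harvest and group on a content hash, so each unchanged resource appears once with its newest harvest first. A sketch of the query parameters only; the field names (content_hash, harvest_date) and the collection URL are hypothetical placeholders for whatever the actual schema uses.

```python
from urllib.parse import urlencode

# Hypothetical schema: content_hash identifies unchanged payloads,
# harvest_date records each capture. Standard Solr result-grouping
# parameters collapse the duplicates at query time.
params = urlencode({
    "q": "content:example",
    "group": "true",
    "group.field": "content_hash",      # one group per unchanged payload
    "group.sort": "harvest_date desc",  # latest harvest heads each group
    "group.limit": "1",
    "wt": "json",
})
query_url = "http://localhost:8983/solr/netarchive/select?" + params
```

Raising `group.limit` (or faceting on `harvest_date` within a group) would expose all retrieval dates, which is exactly what the requirement asks for.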
One discussed solution is to add a document for each harvest date and use Solr's grouping and faceting features to deliver the required results. The details are a bit fluffy, as the requirements are not strictly defined.

At the IIPC Technical Training Workshop, London 2015
The three pillars of the workshop were harvesting, presentation and discovery, with the prevalent tools being Heritrix, Wayback and Solr. I am a newbie in two thirds of this world, so my outsider thoughts will focus on discovery. Day one was filled with presentations, with my Scaling Net Archive Indexing and Search as the last one. Days two and three were hands-on with a lot of discussions.
As opposed to the web archive specific tools Heritrix and Wayback, Solr is a general purpose search engine: There is not yet a firmly established way of using Solr to index and search net archive material, although the work from UKWA is a very promising candidate. Judging by the questions asked at the workshop, large scale full-text search is relatively new in the net archive world and as such the community lacks collective experience.
Two large problems of indexing net archive material are analysis and scaling. As stated, UKWA has the analysis part well in hand. Scaling is another matter: net archives typically contain billions of documents, many of them with a non-trivial amount of indexable data (webpages, PDFs, DOCs etc.). Search responses ideally involve grouping or faceting, which requires markedly more resources than simple search. Fortunately, at least from a resource viewpoint, most countries do not allow harvested material to be made available to the general public: the number of users, and thus concurrent requests, tends to be very low.
General recommendations for performant Solr systems tend to be geared towards small indexes or high throughput, minimizing latency and maximizing the number of requests that can be processed by each instance. Down to earth, the bottleneck tends to be random reads from the underlying storage, easily remedied by adding copious amounts of RAM for caching. While the advice arguably scales to net archive indexes in the multi-TB range, the cost of terabytes of RAM, as well as the number of machines needed to hold them, is often prohibitive. Bearing in mind that the typical user group of a net archive consists of very few people, the part about maximizing the number of supported requests is overkill. With net archives as outliers in the Solr world, there is very little existing shared experience from which to provide general recommendations.
- As hardware cost is a large fraction of the overall cost of doing net archive search, in-depth descriptions of setups are very valuable to the community.
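To make the hardware trade-off concrete, here is a back-of-envelope calculation. Every number is illustrative, not a measurement from either institution.

```python
import math

# Illustrative figures only: a 20 TB index, servers with 256 GB of RAM.
index_tb = 20
ram_per_machine_gb = 256

# Caching even 10% of the index in RAM:
needed_gb = index_tb * 1024 // 10
machines = math.ceil(needed_gb / ram_per_machine_gb)
print(needed_gb, machines)  # 2048 GB of cache, spread over 8 machines
```

Eight machines just to hold a tenth of the index in page cache is why SSDs, which make the uncached random reads cheap instead, look attractive for this workload.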
Measurements from the British Library as well as Statsbiblioteket show that faceting on high-cardinality fields is a resource hog when using SolrCloud. This is problematic for exploratory use of the index. While it can be mitigated with more hardware or software optimization, switching to heuristic counting holds the promise of very large speed-ups.
- The performance benefits and the cost in precision of approximate search results should be investigated further. This area is not well-explored in Solr and mostly relies on custom implementations.
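The idea behind heuristic counting can be shown in a few lines: facet over a systematic sample of the values and scale the counts up. This is a toy demonstration of the principle, not Solr's implementation.

```python
from collections import Counter

def approximate_facet_counts(values, step=10):
    """Count facet terms on every `step`-th value and multiply by
    `step`: roughly a step-fold reduction in work, at the cost of
    exactness on rare terms."""
    sample = values[::step]
    return {term: count * step for term, count in Counter(sample).items()}

# Toy corpus: one field value per document.
docs = ["dk"] * 700 + ["uk"] * 200 + ["se"] * 100
approx = approximate_facet_counts(docs, step=10)
```

On this neatly ordered toy data the scaled counts happen to be exact; on real interleaved data they would only be approximate, which is precisely the precision-versus-speed trade-off worth investigating.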
On the flipside of fast exploratory access is the extraction of large result sets for further analysis. SolrCloud does not scale for certain operations, such as deep paging within facets and counting of unique groups. Other operations, such as percentiles in the AnalyticsComponent, are not currently possible at all. As the alternative to using the index tends to be very heavy Hadoop processing of the raw corpus, this is an area worth investing in.
- The limits of result set extractions should be expanded and alternative strategies, such as heuristic approximation and per-shard processing with external aggregation, should be attempted.
Visiting British Library and attending the IIPC workshop was a blast. Being embedded in tech talk with intelligent people for 5 days was exhausting and very fulfilling. Thank you all for the hospitality and for pushing back when my claims sounded outrageous.
Bringing together over 15,000 photographs of football, from its origins after the Civil War to the Super Bowl era, and from over a thousand collections around the United States, presents an opportunity to see in one place how this uniquely American sport has been played—and imagined. Photography itself evolved in concert with the sport, from lantern slides of players to aerial shots of stadiums.
From the very beginning, however, one constant has been the tension between picturing football as balletic and gentlemanly, or chaotic and brutish.
Eadweard Muybridge’s 1887 collotype of a nude man punting a football put the sport squarely into the graceful category, showing the wide range of motion involved in a kick.
[Eadweard Muybridge. Animal locomotion: an electro-photographic investigation of consecutive phases of animal movements. 1872-1885 / published under the auspices of the University of Pennsylvania. Plates. The plates printed by the Photo-Gravure Company. Philadelphia, 1887. Image courtesy of the University of Southern California Libraries]
[Image courtesy of the California Historical Society Collection via the University of Southern California Libraries]
[Image courtesy of the University of Virginia Special Collections]
Catching the football also presented the photographer with an opportunity to depict football as ballet:
[Image courtesy of the Boston Public Library via Digital Commonwealth]
Early photographs often showed football players in suits and tuxedos, as the 1869 Rutgers team wore in their team photograph after beating Princeton in the very first college game:
[Image courtesy of the New York Public Library]
For photographs of football formations, ties and jackets were sometimes worn.
[Image courtesy of the Archives and Special Collections at the University of Montana via the Mountain West Digital Library]
But the fact that football, unlike baseball, was based on contact—in many cases, extreme contact—made it clearly open to other interpretations. Faster film, which required less exposure to light, could not only capture the punter and wide receiver at work; it could capture the moment of impact, leading to distinctly different images of football.
[Images courtesy Springfield College Archives and Special Collections via Digital Commonwealth]
Many of these photographs effectively create freeze-frame sculpture, heightened with the painful knowledge of what is about to be felt by the player under assault.
[Image courtesy of the Austin History Center at the Austin Public Library, via the Portal to Texas History]
[Image courtesy of the Boston Public Library via Digital Commonwealth]
The cameras may have changed radically and film is now virtually obsolete, but you’ll undoubtedly see these two photographic styles in the coverage of today’s Super Bowl. Football: still balletic, still brutal.
As has become traditional, I’m posting again in February after a long break in the second half of last year. Hopefully in 2015 I can break my bad habit and actually continue with regular blog content all the way through the year.
I’ve spent much of the last few months obsessing over stats and analytics from the Boroondara library websites, as I developed a brief for developers to help us with a major overhaul. The experience has reinforced the advice from Matthew Reidsma to regularly analyse the way people use your website, and to test and make changes immediately and incrementally. A lot of the recommendations I’ve made at Boroondara are as much about the way we produce website content as they are about the design of the sites. For example, I’ve discovered that visitors using mobile devices are most likely to visit on weekends, whilst visitors on desktop are most likely to visit on Monday and least likely to visit on Sunday. Do we need to change our posting schedules? Does this difference reflect different users, or just the same visitors using different devices across the week? These are questions we would not have even thought to ask until we saw the data - and this is just one simple example. Something more intriguing (and obvious in hindsight) was my discovery that visitors to our Storytimes page were more than 50% more likely to come from mobiles (about 46% compared to 30% for visits to all pages). It’s pretty easy to construct a story of busy parents at the local park checking their phone to see if the library has a storytime today - but we hadn’t really considered this behaviour until now (and of course, there could be any number of alternative explanations for the difference).
What became quite clear is that we should have been doing more than simply looking at total hits and visits each month, and should have looked deeply into the analytics for both our catalogue and our general website. I won’t be doing this at Boroondara, because I finished up there last week, but if anyone at Brimbank Libraries is reading this - be prepared to become obsessed with user tracking and analytics!
Coincidentally, I recently read John O’Nolan’s post about onboarding stats at Ghost. I’ve read lots of stuff from UX experts and library website experts emphasising the usefulness of things like A/B testing and ongoing analysis of usage data, but until now I’ve never fully appreciated what they’re saying. Perhaps it’s because the Ghost Foundation is a non-profit, but I found O’Nolan’s post helped me to see how we can (carefully) use usage data to help library members get more from us. That is, libraries have the ability to actually use analytics to ‘improve the user experience’. Using data to manipulate users to act against their own best interests, as too many commercial services seem to do, isn’t the only possibility.

A couple of simple examples

Email notices
Pretty much every library sends notifications to members in one form or another. Mostly these are emails. Whilst I am stunned by the fact that several major library management systems are still only capable of sending plaintext emails and not HTML formatted emails, at this point I am going to assume you are sending HTML formatted email notices.
Ever wondered whether the wording of your notices is effective? Perhaps if you used a different subject heading or made your email text more friendly, members would have fewer overdue loans. Wouldn’t it be great to test your theory scientifically? A/B testing is the way to do this. Web companies do this all the time. True A/B testing is random - on a given day a website might randomly show different users different configurations on the front page, for example. They can then test which configuration (‘configuration A’ or ‘configuration B’) resulted in more sales, or newsletter sign-ups, or whatever.
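The mechanics are lighter than they sound. As a hedged sketch: tag the renewal link in each batch of notices with a campaign code (Piwik recognises the `pk_campaign` URL parameter) and compare click-through rates. The domain, batch sizes and click counts below are all invented for illustration:

```python
# A minimal sketch of A/B testing email notices with campaign-tagged URLs.
# Piwik recognises the pk_campaign parameter; everything else here
# (the domain, the numbers) is invented for illustration.
from urllib.parse import urlencode

def tagged_renewal_link(base_url, variant):
    """Build a 'click here to renew' link tagged with a campaign code."""
    return base_url + "?" + urlencode({"pk_campaign": "overdue-notice-" + variant})

link_a = tagged_renewal_link("https://library.example.org/login", "friendly")
link_b = tagged_renewal_link("https://library.example.org/login", "formal")

# After both batches have gone out, compare hits per notice sent:
sent = {"friendly": 500, "formal": 500}     # notices emailed per variant
clicks = {"friendly": 85, "formal": 60}     # logins seen with each campaign code

rates = {v: clicks[v] / sent[v] for v in sent}
print(link_a)
print(rates)   # {'friendly': 0.17, 'formal': 0.12}
```

Whichever variant produces the higher rate wins, and you can iterate from there.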
It all sounds very hard and complicated, but you can fairly easily use an analytics program like Piwik to create ‘campaigns’ and associated tracking codes. All this does is add some extra code to the URLs you use, which is identified by your analytics system when visitors use a URL with that code. You could use campaign tracking codes by sending out two batches of email notices (perhaps on two consecutive Tuesdays, for example) with a link to ‘click here to renew these items’. By comparing the number of hits on your login page from each tracking code to the number of notices sent out using it, you can measure the effectiveness of different approaches to subject headings, wording and layout.

What do mobile visitors want to do?
An even simpler example comes from some of the analysis I’ve recently been doing. I had a feeling that visitors on mobile devices might show different browsing behaviour to those on desktops, but I didn’t really know. Because browsers tend to broadcast what type of browser they are, what device they are installed on, and the size of their screen, it’s pretty easy to track what type of device visitors are using. By creating a segment (about 15 seconds in your favourite analytics software), you can determine if visitors from mobile (or tablets, for that matter) behave differently from desktop users.
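The classification your analytics software does behind that segment is nothing mysterious; a deliberately crude toy version (real tools match user-agent strings far more thoroughly than this):

```python
# Crude device segmentation from user-agent strings - a toy version of
# the 'segment' feature in analytics software, for illustration only.
from collections import Counter

def device_type(user_agent):
    ua = user_agent.lower()
    if "mobile" in ua or "android" in ua or "iphone" in ua:
        return "mobile"
    return "desktop"

visits = [
    "Mozilla/5.0 (iPhone; CPU iPhone OS 8_0 like Mac OS X) ...",
    "Mozilla/5.0 (Windows NT 6.1; WOW64) ...",
    "Mozilla/5.0 (Linux; Android 4.4.2) ...",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10) ...",
]

segments = Counter(device_type(ua) for ua in visits)
print(segments)   # Counter({'mobile': 2, 'desktop': 2})
```

With each visit bucketed this way, comparing which pages each segment lands on is just a matter of grouping.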
What I discovered was that nearly half of all mobile visitors to our website visited the Opening Hours page - making mobile users about three times more likely than desktop users to be looking for our opening hours. This has obvious ramifications for any mobile optimisation of our website - clearly opening hours need to be pretty close to the first thing they see. Of course, by claiming your branches’ Google Maps pages you can ensure that your opening hours are available right there in Google before users even hit your site. Since we’re in the business of providing information and experiences, rather than selling stuff through our websites, we’re in the fortunate position that it doesn’t actually matter if people get the information they need (in this case “Is the library open?”) without visiting our website at all.
It might strike you as obvious that people visiting a library website using a smartphone probably want to know whether the library is open, but with hard data you can actually test such intuitions. There were plenty of other ‘obvious’ assumptions that I found to be false when checking our website analytics properly. None of the things I have just described are difficult or even particularly clever. There are smart librarians who use and understand these tools in much more sophisticated ways than I ever have. Given the state of most library websites, however, it seems doubtful that these sorts of techniques are anywhere close to mainstream in libraries today.

Privacy
At this point, some of you are probably yelling at your screen “I thought you were supposed to be interested in user privacy, you hypocrite!” Indeed, I am very interested in user privacy. Whilst working on our website project I have also been busy tightening up the privacy and security of our existing catalogue. The conclusion I have come to, however, is that we can genuinely protect the privacy of library members and visitors whilst still collecting a lot of useful aggregate data. The important thing is to always consider the consequences of tracking, collecting and storing any particular piece of data before you do anything, and to let those consequences, rather than how useful or interesting the data might be, decide whether you collect it.
There are a couple of practices we need to be particularly careful to avoid:

Linking web and search analytics to identified library members
Whilst it may be possible to make a link between a tracked website user and a registered member through data matching things like their IP address, this still takes time and requires a targeted effort aimed at a specific person. If, on the other hand, you set up your web analytics in a way that can easily identify the search terms used by a specific user (and, therefore, vice-versa), you make it possible to produce lists of search terms associated with a specific person, or lists of specific people associated with particular search terms. It would be so easy to track actual members’ search terms and general website use that you could probably do it accidentally.
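On the defensive side, the IP-truncation approach (which Piwik offers as a setting) is simple to picture. A minimal sketch, with an invented function name:

```python
# Mask the last bytes of an IPv4 address before it is logged -
# the same idea as Piwik's IP anonymisation setting. Sketch only.
def anonymise_ip(ip, keep_bytes=2):
    """Zero out trailing bytes: '203.0.113.42' -> '203.0.0.0' with keep_bytes=2."""
    parts = ip.split(".")
    masked = parts[:keep_bytes] + ["0"] * (4 - keep_bytes)
    return ".".join(masked)

print(anonymise_ip("203.0.113.42"))                # 203.0.0.0
print(anonymise_ip("203.0.113.42", keep_bytes=3))  # 203.0.113.0
```

Keeping two or three bytes preserves coarse geography for aggregate reports while making the stored address useless for identifying an individual.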
This is also worth thinking about with regard to how you track individual website users. Piwik, for example, includes ‘Visitor profiles’, which track users over time based on their IP address. This makes me very uncomfortable, especially coming from software that prides itself on being great for privacy. There are a couple of ways to reduce the privacy problems this causes. Firstly, Piwik can be set up to simply ignore the last one, two or three bytes of an IP address. This makes it impossible to track usage geographically to particular suburbs or cities, but usually you won’t care much about that. The other feature Piwik recommends administrators use is archiving. The archive function stores usage data in aggregate tables, then deletes the actual logs. This means you get to use old data for aggregate reports, but when the men in dark suits come knocking you don’t have any personally identifiable data to give them.

Using third parties who can see your data
It’s all very well to have policies and statements about the freedom to read and how you protect member loan records, but the world has moved on. The library user who doesn’t use online services at all is almost extinct. Privacy statements are one thing, but privacy practice is another entirely. As a general rule, if the data isn’t stored on-site, someone else probably has access to it. If you didn’t pay anything for the service, you can guarantee that. Eric Hellman provided a stark illustration last year of how many people and organisations have access to your users’ data if you don’t pay attention to security and privacy. Following on the heels of the Adobe Digital Editions debacle in October, it should be obvious to even the most obstinately clueless that libraries need to ask a lot more questions when third parties are providing services on our behalf.

The future
I’d like to see libraries take more action to protect user privacy and collect more and better data. I truly think it is possible for us to do both - but only if we are careful and thoughtful about how we go about it. Jason Griffey announced an exciting new project over the weekend, called ‘Measure the Future’. Led by Griffey and other library stars Gretchen Caserotti and Jenica Rogers, along with educator Jeff Branson, the project seeks to build a ‘Google Analytics for your library building’, tracking physical use of libraries just as we can track digital use. Built on open hardware and software by librarians, this has huge promise - but we need to be mindful of the same privacy concerns we have always expressed with regard to reading habits, and have started to neglect as reading moves increasingly to digital environments.
Currently most libraries seem to be (accidentally) providing a huge hoard of private user data to virtually anyone who wants it, but not actually using any of it themselves. If we are to credibly claim to be defenders of intellectual freedom and responsive to our communities, we need to use data more cleverly - and protect member privacy while we do so.
Lila is a cognitive technology that extends reading and analysis capabilities for a writing project. Author content is used to generate “slips”, short units of text drawn from unread content. Slips are visualized to allow embedded reading. Embedded means “to fix firmly and deeply in a surrounding mass.” Embedded reading is reading content in the context of other closely related content. Context is meaning. Embedded reading gives new insight and ensures completeness. In Lila it is visualized as a web of associated, clickable slips. View the video.
If you are an IfThisThenThat user and are interested in archives maybe you’ll be interested in this recipe that will email you when a new item is added to the Documenting Ferguson repository. Let me know if you give it a try! I just created the recipe and it hasn’t emailed me yet. But the RSS Feed from Washington University’s Omeka instance reports that the last item was added on January 30th, 2015. So the collection is still being added to.
I thought about having it tweet, but that would involve creating a Twitter account for the project and that isn’t my place. Plus, RSS and Email are still fun Web 1.0 technologies that don’t get enough love. Well I guess Email predates the Web entirely heh, but you get my drift.
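For the curious, the recipe’s trigger boils down to polling the feed and spotting items it hasn’t seen before. A rough sketch of that check in Python, run against a made-up sample feed rather than the real Omeka one:

```python
# Detect new items in an RSS feed by remembering which GUIDs we've seen -
# roughly what IfThisThenThat's RSS trigger does before firing the email.
import xml.etree.ElementTree as ET

SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Documenting Ferguson (sample)</title>
  <item><title>New photograph</title><guid>item-101</guid></item>
  <item><title>Oral history interview</title><guid>item-100</guid></item>
</channel></rss>"""

def new_items(feed_xml, seen_guids):
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        guid = item.findtext("guid")
        if guid not in seen_guids:
            fresh.append(item.findtext("title"))
    return fresh

# On a previous poll we had already seen item-100:
print(new_items(SAMPLE_FEED, {"item-100"}))   # ['New photograph']
```

Email, RSS and twenty lines of the standard library: Web 1.0 still holds up.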
One thing that tends to be hard in the digital library world is understanding how a given program is doing in relation to other programs throughout the country. This information can help justify funds spent locally on digital library initiatives, and the same information can be used within a department to understand if workflows are on par with others throughout the country or region.

Most often the numbers that are reported are those required by membership groups such as ARL, ACRL and others, who have a token question or two about digital library statistics, but most people involved with those numbers know that they are often… unclear at best.
Some of the dimensions available to look at include traffic to the digital library system: visitors, page views, time on site, and referral traffic. Locally we use Google Analytics for this data at the repository level. How a digital library’s items get used is another metric that is helpful in knowing the impact of these resources. This can be measured in a wide range of ways, and there are initiatives such as COUNTER that provide some guidance for this sort of work, but it feels more focused on “Electronic Resources” and doesn’t really handle the range of cases we run into in digital library/repository land. The University of Florida Digital Collections makes the usage data for each item in the collection easily obtainable, and many modern DSpace instances also have great reporting on usage of items. I’ve talked a little about how UNT Libraries calculates “uses” for our digital library collections here and here. The final area that is often reported on is the collection growth of the repository, either in the number of items added, number of bytes (or GB, TB) added, or number of files added in a given year.
I think walking through some of these metrics in a series of posts will be helpful for me to articulate some of the opportunities that are available if the digital libraries/repository community openly shared more of this data. There are of course organizations such as Hathi Trust, the Digital Public Library of America, and others who make growth data available front and center, but for most of our repositories it is pretty hidden.
The data that I’m showing in this post is from the UNT Libraries Digital Collections, which contains three separate digital library interfaces: The Portal to Texas History, the UNT Digital Library, and the Gateway to Oklahoma History. All three of these interfaces are powered by the same repository infrastructure on the backend and are made searchable by a unified Solr index. The datasets here are from that Solr instance directly.

Items added per month
From Jan 1 to Dec 31, 2014 the UNT Libraries Digital Collections added 417,645 unique digital resources to its holdings. The breakdown of the monthly additions looks like this:

Month       Items Added
January          32,074
February          9,220
March             7,758
April            11,161
May              11,475
June             32,549
July             18,503
August           67,769
September        83,916
October          25,537
November         73,404
December         44,279
A better way to look at this might be a simple chart.
Or looked at a different way.
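The summary figures that follow fall straight out of the monthly counts; a few lines are enough to sanity-check them:

```python
# Totals and averages derived from the 2014 monthly addition counts.
monthly = [32074, 9220, 7758, 11161, 11475, 32549,
           18503, 67769, 83916, 25537, 73404, 44279]

total = sum(monthly)
print(total)          # 417645 items for the year
print(total // 12)    # 34803 items in an average month
print(total // 365)   # 1144 items in an average day
```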
The average number of items added to the system in 2014 by month is 34,803.

Wait, What is an Item, Object, Resource?
A little side trip is needed so that we are on the same page. For us a “digital object” or “digital item” or “digital resource” is the intellectual unit that a descriptive metadata record is assigned to. This may be a scan of a photographic negative, front and back scans of a physical photographic print, a book, letter, pamphlet, map, or issue of a newspaper. In most instances there are multiple files/images/pages per item in our system, but we are just talking about those larger units and not the files that make up the items themselves. Just wanted to make sure we were on the same page about that.

Items added per day
In looking at the daily data for the year, there were 215 days on which new content was processed and added to the collection, with no processing being done on the other 150 days. The average number of items added per day during the year was 1,144 items. If we think about a ten-hour work day (roughly when the library is open for normal folks), that’s 114 items per hour, or 1.9 new items created per minute during the work week last year.

Items by Type
I thought it might be interesting to see how the 417,645 items were distributed among the various resource types that we categorize records into. Here is that table.

Resource Type          Items
image_photo          197,133
text_newspaper       109,456
image_map             66,637
text_report           12,569
text                   9,517
text_patent            7,052
text_etd               4,449
physical-object        3,573
text_leg               1,660
text_book              1,171
text_journal           1,063
video                    804
text_article             494
image_postcard           366
collection               347
text_pamphlet            346
text_letter              235
text_legal               216
text_yearbook            180
image_presentation        96
image_artwork             44
text_clipping             44
dataset                   36
image_poster              30
image                     26
text_paper                23
image_score               22
sound                     17
website                   13
text_review               12
text_chapter               8
text_prose                 5
text_poem                  1
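A quick check of how top-heavy that distribution is (figures copied from the table, total from earlier in the post):

```python
# Share of 2014 additions accounted for by the two biggest resource types.
total = 417645
photos, newspapers = 197133, 109456

share = (photos + newspapers) / total
print(round(share * 100))   # 73 - percent of all new items
```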
As you can see, the majority of the items added were in the category of image_photo (Photographs) or text_newspaper (Newspapers), with those two types accounting for 73% of the new additions to the system.

Closing
As I mentioned at the beginning of this post, I think knowing the metrics of other digital library programs is helpful for local initiatives in a number of ways. The UNT Libraries had a very successful year for adding new content; over the past few years we’ve been able to double the number of items each year. I don’t think that’s a rate of growth we can keep up, but it is always fun to try. How do the repository systems at your institution look in relation to this? Sharing that data more broadly would be helpful to the digital library community overall, and I encourage others to take some time and make this data available.
If you have any specific questions for me let me know on twitter.
If you would like to listen in to the LITA Board meeting at ALA Midwinter 2015, it is streaming (in audio) below:
Code4Lib seeks to provide a welcoming, fun, and safe community and
conference experience as well as an ongoing community for everyone. We do not
tolerate harassment in any form. Discriminatory language and imagery
(including sexual) is not appropriate for any event venue, including talks,
or any community channel such as the chatroom or mailing list.
Harassment is understood as any behavior that threatens another person or
group, or produces an unsafe environment. It includes offensive verbal
comments or non-verbal expressions related to gender, gender identity,
gender expression, sexual orientation, disability, physical appearance,
body size, race, age, religious beliefs, sexual or discriminatory images
in public spaces (including online), deliberate intimidation, stalking,
following, harassing photography or recording, sustained disruption of
talks or other events, inappropriate physical contact, and unwelcome sexual
attention.
- Initial Incident
If you are being harassed, notice that someone else is being harassed,
or have any other concerns, and you feel comfortable speaking with
the offender, please inform the offender that he/she/ze has affected you
negatively. Oftentimes, the offending behavior is unintentional, and the
accidental offender and offended will resolve the incident by having
that initial discussion.
Code4Lib recognizes that there are many reasons speaking directly to
the offender may not be workable for you (including but not limited to
unfamiliarity with the conference or its participants, lack of spoons,
and concerns for personal safety). If you don't feel comfortable
speaking directly with the offender for any reason, skip straight to step 2.
If the offender insists that he/she/ze did not offend, if offender is
actively harassing you, or if direct engagement is not a good option
for you at this time, then you will need a third party to step in.
If you are at a conference or other event, find an event organizer or
staff person, who should be listed on the wiki.
If you can't find an event organizer, there will be other staff
available to help if the situation calls for immediate action.
If you are in the #code4lib IRC, the zoia command to list people
designated as channel helpers is @helpers. At most times, there is at least one helper in the channel.
For the listserv, you have a free-for-all for public messages; however,
the listserv does have a maintainer, Eric Lease Morgan.
- Wider community response to Incident:
If the incident doesn't get past the first step (discussion reveals the
offense was unintentional, apologies are made, and a public note or the
community is informed of the resolution), then there's not much for the
community to do at this point, since the incident was resolved without
outside intervention.
If incident results in corrective action, the community should support
the decision made by the Help in Step 2 if they choose corrective action,
like ending a talk early or banning from the listserv, as well as
support those harmed by the incident, either publicly or privately
(whatever individuals are comfortable with).
If the Help in Step 2 runs into issues implementing the CoC, then the
Help should come to the community with these issues and the community
should revise the CoC as they see fit.
In Real Life people will have opinions about how the CoC is enforced.
People will argue that a particular decision was unfair, and others will
say that it didn't go far enough. We can't stop people having
opinions, but what we could do is have constructive discussions
that lead to something tangible (affirmation of decision, change in CoC,
modify decision, etc.).
Participants asked to stop any harassing behavior are expected to comply
immediately. If a participant engages in harassing behavior, organizers may
take any action they deem appropriate, including warning the offender,
expulsion from the Code4Lib event, or banning the offender from a chatroom
or mailing list.
Specific sanctions may include but are not limited to:
- warning the harasser to cease their behavior and that any further reports
will result in other sanctions
- requiring that the harasser avoid any interaction with, and physical
proximity to, their victim for the remainder of the event
- early termination of a talk that violates the policy
- not publishing the video or slides of a talk that violated the policy
- not allowing a speaker who violated the policy to give (further) talks at Code4Lib events
- immediately ending any event volunteer responsibilities and privileges the
harasser holds
- requiring that the harasser not volunteer for future Code4Lib
events (either indefinitely or for a certain time period)
- requiring that the harasser immediately leave the event and not return
- banning the harasser from future events (either indefinitely or for a
certain time period)
- publishing an account of the harassment
Code4Lib event organizers can be identified by their name badges, and will
help participants contact hotel/venue security or local law enforcement,
provide escorts, or otherwise assist those experiencing harassment to feel
safe for the duration of the event. Code4Lib IRC volunteers can be identified
by issuing the @helpers command to the #code4lib bot named "zoia".
If an incident occurs, please use the following contact information:
- Conference organizers: Tom Johnson, 360-961-7721 or Evviva Weinraub, 617-909-2913
- Hilton Portland & Executive Tower: 503-226-1611
- Portland Police Department: 503-823-0000
- Portland Women's Crisis Line (24/7): 503-235-5333 (or toll-free: 888-235-5333)
- Radio Cab: 503-227-1212
- IRC channel administrators: anarchivist, mistym, mjgiarlo, ruebot; or enter @helpers in the IRC channel
We expect participants to follow these rules at all conference venues,
conference-related social events, community gatherings, and online communication channels.
We value your participation in the Code4Lib community, and your efforts to
keep Code4Lib a safe and friendly space for all participants!
In the 20th century, mass media redistributed much of this organizational power. In politics, charismatic individuals could motivate millions of people independently of the hierarchies that maintain command and control. But for the most part, one hierarchy got swapped for another. In business, production innovations such as Henry Ford's assembly line needed the hierarchy to support the capital investments.
I think the history of the 21st century will be the story of non-hierarchical systems of human organization enabled by the Internet. From this point of view, Wikipedia is particularly important not only for its organization of knowledge, but because it demonstrated that thousands of people can be organized with extremely small amounts of hierarchy. Anyone can contribute, anyone can edit, and many do. Bitcoin, or whatever cryptocurrency wins out, won't be successful because of a hierarchy but rather because of a framework of incentives for a self-interested network of entities to work together. Crowdfunding will enable resources to coalesce around needs without large hierarchical foundations or financial institutions.
So let's think a bit about book publishing. Through the 20th century, publishing required a significant amount of investment in capital: printing presses, warehouses, delivery trucks, bookstores, libraries, and people with specialized skills and abilities. A few large publishing companies emerged along with big-box retailers that together comprised an efficient machine for producing, distributing and monetizing books of all kinds. The transition from print to digital has eliminated the need for the physical aspects of the book publishing machine, but the human components of that machine remain essential. It's no longer clear that the hierarchical organization of publishing is necessary for the organization of publishing's human effort.
I've already mentioned Wikipedia's conquest of encyclopedia publishing, by dint of its large scale and wide reach. But equally important to its success has been a set of codes and customs bound together in a suite of collaboration and workflow tools. Version tracking allows for easy reversion of edits. "Talk pages" and notifications facilitate communication and collaboration. (And edit-wars and page locking, but that's another bucket of fish.)
Most publishing projects have audiences that are too small or requirements too specific to support Wikipedia's anyone-can-edit-or-revert model of collaboration. A more appropriate model for collaboration in publishing is one widely used for software development.
Modern software development requires people with different skills to work together. Book publishing is the same. Designers, engineers, testers, product managers, writers, and subject domain experts may each have an important role in creating a software application; authors, editors, proofreaders, illustrators, designers, subject experts, agents, and publicists may all work together on a book. Book publishing and software can be either open or proprietary. The team producing a book or a piece of software might number from one to a hundred. Books and programs can go into maintenance mode or be revised in new editions or versions. Translation into new languages happens for both. Assets from one project can be reused in other projects.
Open source software has been hugely successful over the past few decades. Along the way, an ecosystem of collaboration tools and practices has evolved to support both open source development and software development in general. Many aspects of this ecosystem have been captured in GitHub.
The "Git" in GitHub comes from git, an open source distributed version control system initially written by Linus Torvalds, the Linus behind Linux. It's fast, and it lets you work on a local code repository and then merge your changes with a repository stored somewhere else.
In just two sentences, I've touched on several concepts that may be foreign to many book publishing professionals. Microsoft Word's "track changes" is probably the closest that most authors get to a version control system. The big difference is that "track changes" is designed to facilitate collaboration between a maximum of two people. Git works easily with many contributors. A code "repository" holds more than just code, it can contain all the assets, documentation, and licenses associated with a project. And unlike "track changes", Git remembers the entire history of your project. Many book publishers still don't keep together all the assets that go into a book. And I'm guessing that publishers are still working on centralizing their asset stores instead of distributing them!
Git is just one of the useful aspects of GitHub. I think the workflow tools are perhaps more important. Developers talk about the workflow variants such as "git-flow" and "GitHub-flow", but the differences are immaterial to this discussion. Here's what it boils down to: Someone working on a project will first create a "feature branch", a copy of the repository that adds a feature or fixes a bug. When the new feature has been tested and is working, the changes will be "committed". Each set of changes is given an identifier and a message explaining what has been changed. The branch's developer then sends a "pull request" to the maintainers of the repository. A well crafted pull request will provide tests and documentation for the new feature. If the maintainers like the changes, they "pull" the changes into the main branch of the repository. Each of these steps is a push of a button on GitHub, and GitHub provides annotation, visualization and commenting tools that support discussions around each pull request, as well as issue lists and wiki pages.
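For publishing folks who have never touched git, the branch/commit/merge bookkeeping can be sketched as a toy in-memory model. This is nothing like git's real storage, just the shape of the workflow:

```python
# A toy, in-memory model of the branch-and-merge flow described above -
# illustration only, not how git actually stores data.
class Repo:
    def __init__(self):
        self.commits = {}             # commit id -> (message, parent id)
        self.branches = {"main": None}
        self._next = 0

    def commit(self, branch, message):
        cid = self._next
        self._next += 1
        self.commits[cid] = (message, self.branches[branch])
        self.branches[branch] = cid   # branch now points at the new commit
        return cid

    def branch(self, name, from_branch="main"):
        # a feature branch starts as a copy of the source branch's tip
        self.branches[name] = self.branches[from_branch]

    def merge(self, source, into="main"):
        # accepting a "pull request": fast-forward the target to the source tip
        self.branches[into] = self.branches[source]

    def log(self, branch):
        out, cid = [], self.branches[branch]
        while cid is not None:
            message, parent = self.commits[cid]
            out.append(message)
            cid = parent
        return out                    # newest first

repo = Repo()
repo.commit("main", "Add chapter 1")
repo.branch("fix-typo")                  # the contributor's feature branch
repo.commit("fix-typo", "Fix typo in chapter 1")
repo.merge("fix-typo")                   # maintainer accepts the pull request
print(repo.log("main"))
```

A "fork" in GitHub terms is just another copy of the whole structure, which is why forks of a book repository are so cheap to make.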
The reason the workflow tools and the customs surrounding their use are so important is that anyone who has used them already knows how to participate in another project. For an excellent non-programming example, take a look at the free-programming-books repository, which is a basic list of programming books available online for free. As of today, 512 different people have contributed a total of 2,854 sets of changes to the repository, have expanded it to books in 23 languages, and have added free courses, screencasts and interactive tutorials. The maintainers enforce some basic standards and make sure that the list is free of pirated books and the like.
It's also interesting that there are 7,229 "forks" of free-programming-books. Each of these could be different. If the main free-programming-books repo disappears, or if the maintainers go AWOL, one of these forks could become the main fork. Or if one group of contributors want to move the project in a different direction from the maintainers, it's easy to do.
Forking a book is a lot more common than you might think. Consider the book Robinson Crusoe by Daniel Defoe. OCLC's WorldCat lists 7,459 editions of this book, each one representing significantly more effort than a button push in a workflow system. It's common to have many editions of out-of-copyright books of course, but it's also becoming common for books developed with open processes. As an example, look at the repository for Amy Brown and Greg Wilson's Architecture of Open Source Applications. It has 5 contributors, and has been forked 58 times. For another example of using GitHub to write a book, read Scott Chacon's description of how he produced the second edition of Pro Git. (Are you surprised that a founder of GitHub is using GitHub to revise his book about Git?)
There's another aspect of modern software engineering with GitHub support that could be very useful for book publishing and distribution. "Continuous integration" is essential for development of complex software systems because changes in one component can have unintended effects on other components. For that reason, when a set of changes is committed to a project, the entire project needs to be rebuilt and retested. GitHub supports this via "hooks". For example, a "post-commit" hook can trigger a build-test apparatus; hooks can even be used to automatically deploy the new software version into production environments. In the making of a book, the insertion of a sentence might necessitate re-pagination and re-indexing. With continuous integration, you can imagine the correction of a typo immediately resulting in changes in all the copies of a textbook for sale. (or even the copies that had already been purchased!)
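Wiring a hook up is simple: a Git `post-commit` hook is just an executable script in the repository's `.git/hooks` directory. The sketch below (POSIX only, `git` assumed installed) uses a stand-in echo where a real project would invoke its ebook build toolchain; the file names are invented for the illustration.

```python
import os
import stat
import subprocess
import tempfile

repo = tempfile.mkdtemp()

def run(*args):
    subprocess.run(
        ["git", "-c", "user.email=demo@example.org",
         "-c", "user.name=demo", *args],
        cwd=repo, check=True, capture_output=True)

run("init", "-q")

# The hook is an ordinary executable script; git runs it from the root of
# the working tree after every commit. Here the "rebuild" just writes a
# log file where a real project would regenerate the ebook.
hook = os.path.join(repo, ".git", "hooks", "post-commit")
with open(hook, "w") as f:
    f.write("#!/bin/sh\necho 'rebuilding ebook' > build.log\n")
os.chmod(hook, os.stat(hook).st_mode | stat.S_IXUSR)

with open(os.path.join(repo, "book.txt"), "w") as f:
    f.write("A corrected sentence.\n")
run("add", "book.txt")
run("commit", "-q", "-m", "Fix a typo")  # the hook fires after this commit
```

Services like GitHub expose the same idea over HTTP ("webhooks"), so the rebuild can happen on a server rather than on the committer's machine.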
A number of startups have recognized the applicability of Git and GitHub to book publishing. Leanpub, GitBook, and Penflip are supporting GitHub backends for open publishing models; so far adoption has been most rapid in author communities that already "get" GitHub, for example, software developers. The company that is best able to teach a GitHub-like toolset to non-programmers will have a good and worthy business, I think.
As more people learn and exercise the collaboration culture of GitHub, new things will become possible. Last year, I became annoyed that I couldn't fix a problem I found with an ebook from Project Gutenberg. It seemed obvious to me that I should put my contributions into a GitHub repo so that others could easily make use of my work. I created a GitHub organization for "Project GitenHub". In the course of creating my third GitenHub book, I discovered that someone named Seth Woodward had done the same thing a year before me, and he had moved over a thousand Project Gutenberg texts onto GitHub, in the "GITenberg" organization. Since I knew how to contribute to a GitHub project, I knew that I could start sending pull requests to GITenberg to add my changes to its repositories. And so Seth and I started working together on GITenberg.
Seth has now loaded over 50,000 books from Project Gutenberg onto GitHub. (The folks at Project Gutenberg are happy to see this happening, by the way.) Seth and I are planning out how to make improved quality ebooks and metadata for all of these books, which would be impossible without a way to get people to work together. We put in a funding proposal to the Knight Foundation's NewsChallenge competition. And we were excited to learn that (as of Jan 1, 2015) the Text Creation Partnership has added 25,000 texts from EEBO (Early English Books Online) on GitHub. So it's an exciting time for books on GitHub.
There's quite a bit of work to do. Having 50,000 repositories in an organization strains some GitHub tools. We need to figure out how to explain the GitHub workflow to potential contributors who aren't software developers. We need to make bibliographic metadata more git-friendly. And we need to create a "continuous integration system" for building ebooks.
Who knows, it might work.
Update January 30: Our NewsChallenge proposal is being funded!!!
Shortly after it came to light that Adobe Digital Editions was transmitting information about ebook reading activity in the clear, for anybody to snoop upon, I asked a loaded question: does ALA have a role in helping to verify that the software libraries use protect the privacy of readers?
As with any loaded question, I had an answer in mind: I do think that ALA and LITA, by virtue of their institutional heft and influence with librarians, can provide significant assistance in securing library software.
I waited a bit, wondering how the powers that be at ALA would respond. Then I remembered something: an institution like ALA is not, in fact, a faceless, inscrutable organism. Like Soylent Green, ALA is people!
Well, maybe not so much like Soylent Green. My point is that despite ALA’s reputation for being a heavily bureaucratic, procedure-bound organization, it does offer ways for members to take up an idea and run with it.
And that’s what I did — I floated a petition to form a new interest group within LITA, the Patron Privacy Technologies IG. Quite a few people signed it… and it now lives!
Here’s the charge of the IG:
The LITA Patron Privacy Technologies Interest Group will promote the design and implementation of library software and hardware that protects the privacy of library users and maximizes user ability to make informed decisions about the use of personally identifiable information by the library and its vendors.
Under this remit, activities of the Interest Group would include, but are not necessarily limited to:
- Publishing recommendations on data security practices for library software.
- Publishing tutorials on tools for libraries to use to check that library software is handling patron information responsibly.
- Organizing efforts to test commercially available software that handle patron information.
- Providing a conduit for responsible disclosure of defects in software that could lead to exposure of library patron information.
- Providing sample publicity materials for libraries to use with their patrons in explaining the library’s privacy practices.
I am fortunate to have two great co-chairs, Emily Morton-Owens of the Seattle Public Library and Matt Beckstrom of the Lewis and Clark Library, and I’m happy to announce that the IG’s first face-to-face meeting will be at ALA Midwinter 2015 — specifically tomorrow, at 8:30 a.m. Central Time in Ballroom 1 of the Sheraton in Chicago.
We have two great speakers lined up — Alison Macrina of the Library Freedom Project and Gary Price of INFODocket, and I’m very much looking forward to it.
But I’m also looking forward to the rest of the meeting: this is when the IG will, as a whole, decide how far to reach. We have a lot of interest and the ability to do things that will teach library staff and our patrons how to better protect privacy, teach library programmers how to design and code for privacy, and verify that our tools match our ideals.
Despite the title of this blog post… it’s by no means my effort alone that will get us anywhere. Many people are already engaging in issues of privacy and technology in libraries, but I do hope that the IG will provide one more point of focus for our efforts.
I look forward to the conversation tomorrow.
Updated January 26, 2015
Escola Bahiana de Medicine e Saude Publica
Escola Superior de Educacao de Paula Frassinetti
Lundh Research Foundation
ABRACICON: Academia Brasileira de Ciencias Contabeis
Canakkale Arastirmalari Turk Yilligi
Chinese Journal of Plant Ecology
Eskisehir Osmangazi University Journal of Social Sciences
Geological Society of India
Instituto do Zootecnia
Journal of Social Studies Education Research
Journal Press India
Kahramanmaras Sutcu Imam Universitesi Tip Fakultesi Dergisi
Nitte Management Review
Sanat Tasarim Dergisi
Sociedade Brasileira de Virologia
The Apicultural Society of Korea
The East Asian Society of Dietary Life
The Korea Society of Aesthetics and Science of Art
Turkish History Education Journal
Last Updated January 20, 2015
All-Russia Petroleum Research Exploration Institute (VNIGRI)
Barbara Budrich Publishers
Botanical Research Institute of Texas
Faculty of Humanities and Social Sciences, University of Zagreb
Graduate Program of Management and Business, Bogor Agricultural University
IJSS Group of Journals
Innovative Pedagogical Technologies LLC
International Network for Social Network Analysts
Slovenian Chemical Society
Subsea Diving Contractor di Stefano Di Cagno Publisher
The National Academies Press
Wisconsin Space Grant Consortium
Artvin Coruh Universitesi Orman Fakultesi Dergisi
Canadian Association of Schools of Nursing
Indian Society for Education and Environment
Journal for the Education of the Young Scientist and Giftedness
Kastamonu University Journal of Forestry Faculty
Korean Society for Metabolic and Bariatric Surgery
Korean Society of Acute Care Surgery
The Korean Ophthalmological Society
The Pharmaceutical Society of Korea
Uludag University Journal of the Faculty of Engineering
YEDI: Journal of Art, Design and Science
Following on the heels of BPL President Amy Ryan’s appointment as its next chair, the DPLA Board of Directors announced today the appointment of a slate of officer positions and committee chairs. Jamie Hollier will serve as the Treasurer of the Board and chair of the Finance Committee (and the associated Audit Subcommittee). Laura DeBonis, who served previously as Board Secretary, will continue in this role, while Robert Darnton will continue as the chair of the Governance Committee (and the associated Nomination Subcommittee). These appointments will take effect as of July 1, 2015.
The DPLA Board committees are responsible for carrying out the formal governance functions of the Board. Descriptions of these committees, included below, can also be found on the DPLA Committees page.
Finance Committee
The purpose of the Board Finance Committee is to oversee the financial doings of the DPLA; to review and evaluate the DPLA’s fiscal operation and its managers; to report to the Board and/or Executive Director on the DPLA’s finances, and/or any irregularities or issues; to provide advice and recommendations to the Board of Directors, Executive Director, Director for Content, and staff on how the DPLA’s financial operations align with its mission, vision, and strategic goals. An Audit subcommittee may be formed.
The members of the Board Finance Committee are:
- Cathy Casserly, Creative Commons (current chair)
- Paul Courant, University of Michigan
- Jamie Hollier, Anneal, Inc.; Commerce Kitchen (new chair effective July 1, 2015)
- Siva Vaidhyanathan, University of Virginia
- Ex Officio members: Amy Ryan, Chair of the Board (effective July 1, 2015), and Dan Cohen, Executive Director, serve as ex officio members on all formal Board committees
Governance Committee
The purpose of the Board Governance Committee is to provide advice and recommendations to the Board of Directors, Executive Director, Director for Content, and staff on the governance practices to which the DPLA should adhere, and to promote Board development. A Nomination subcommittee may be established when appropriate.
The members of the Board Governance Committee are:
- Robert Darnton, Harvard University (current chair; will continue on in second term)
- Laura DeBonis
- Luis Herrera, San Francisco Public Library
- Ex Officio members: Amy Ryan, Chair of the Board (effective July 1, 2015), and Dan Cohen, Executive Director, serve as ex officio members on all formal Board committees
Updated January 26, 2015
- Total no. participating publishers & societies: 5,760
- Total no. voting members: 3,046
- % of non-profit publishers: 57%
- Total no. participating libraries: 1,926
- No. journals covered: 37,564
- No. DOIs registered to date: 71,934,134
- No. DOIs deposited in previous month: 648,271
- No. DOIs retrieved (matched references) in previous month: 46,260,320
- DOI resolutions (end-user clicks) in previous month: 134,057,984
Publishers have been including FundRef information in their article deposits for over a year now, and many have been able to use the Registry to include funding organization identifiers. In some cases an appropriate Registry identifier did not exist or a match could not be made at the time of deposit, so the metadata deposited with CrossRef contained just a funding organization name. As the Registry grows, these past deposits should be updated to reflect new identifiers and better Registry metadata.
The getFunders API returns the funder information deposited for a DOI, as well as any additional funder information that may not have been established at the time of deposit; matching for the getFunders API has also been improved.
For example, http://doi.crossref.org/getFunders?q=10.1109/JPHOT.2014.2331233 displays funder information in JSON format for DOI 10.1109/JPHOT.2014.2331233.
The funder name 'NSF' was deposited for this DOI. Because this name is too ambiguous to match on its own, the getFunders API displays all available funders related to 'NSF', and the publisher can use this list to update the metadata.
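For illustration, here is a minimal sketch of calling this endpoint from Python using only the standard library. The helper names are mine, and since the JSON structure of the response isn't documented in this post, the fetch simply returns the parsed payload without assuming any particular fields.

```python
import json
from urllib.request import urlopen

BASE = "http://doi.crossref.org/getFunders"

def funders_url(doi):
    # The example DOIs above contain only URL-safe characters,
    # so this sketch skips explicit percent-encoding.
    return BASE + "?q=" + doi

def get_funders(doi):
    # Returns the parsed JSON payload as-is; no response schema
    # is assumed beyond "it is JSON".
    with urlopen(funders_url(doi)) as resp:
        return json.load(resp)
```

Calling `get_funders("10.1109/JPHOT.2014.2331233")` would retrieve the funder records for the DOI discussed above, including the candidate matches for the ambiguous name 'NSF'.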
District Dispatch: Senator Richard Durbin to open Washington Update session at 2015 ALA Midwinter Meeting
Senator Richard Durbin (D-IL), the senior senator from Springfield, Ill., will open a discussion on the implications of the November mid-term Congressional elections for America, libraries and library advocacy at the 2015 American Library Association (ALA) Midwinter Meeting in Chicago. The session, titled “Whither Washington?: The 2014 Election and What it Means for Libraries,” takes place from 8:30–10:00 a.m. on Saturday, January 31, 2015, in room W183A.
Sen. Durbin will explore the implications of the recent national election and the ways that librarians can help shape the coming debates on privacy, library funding and other key library issues. With critical bills to reauthorize federal library funding, efforts to reform key privacy and surveillance statutes, and changes to copyright law all likely to be on legislators’ plates, libraries will engage heavily with the newly-elected 114th Congress.
Additionally, Roger Goldblatt, associate bureau chief of the Federal Communications Commission’s Consumer and Government Affairs Bureau, will speak at the session and focus on the Commission’s new consumer education initiative and its digital literacy agenda.
- Sen. Richard Durbin, Assistant Senate Minority Leader, Illinois Senator
- J. Mark Hansen, professor, Department of Political Science, University of Chicago
- Thomas Susman, director, Government Affairs, American Bar Association
- Roger Goldblatt, associate bureau chief, Consumer and Government Affairs Bureau, Federal Communications Commission
The post Senator Richard Durbin to open Washington Update session at 2015 ALA Midwinter Meeting appeared first on District Dispatch.