news aggregator

Metro Distortion Map

unalog - Sun, 2014-01-05 03:18

Murray, Peter: Join the Community: Open Source is Nothing Without You

planet code4lib - Sat, 2014-01-04 21:17

During the American Library Association meeting in Chicago in 2013 I gave an “ignite” talk on open source software in libraries. (The “ignite talk” format, if you’re not familiar, is one in which “each speaker is allocated five minutes of presentation time and is accompanied by 20 presentation slides. During presentations, each slide is displayed for 15 seconds and then automatically advanced.” [1]) The talk was geared toward inspiring community involvement and commitment in open source projects. The abstract:
The open source method for developing software works best when everyone contributes a little bit to the process. Do you benefit from open source? Do you wish the open source you use was a little better? Don’t know why the community nature of open source is important? Hear what you can do to make the world a better place by nudging your favorite open source project along a path to perfection.

The slides synchronized to audio are embedded below and are on Slideshare.

Image Credits
Attributed photographs from Flickr and Wikimedia Commons; used under Creative Commons derivatives-okay licenses.

Footnotes
  1. Definition from Wikipedia.

Rosenthal, David: Threat Model for Archives

planet code4lib - Sat, 2014-01-04 13:44
Discussing the recent vulnerability in the Bitcoin protocol, I pointed out that:
One of the key ideas of the LOCKSS system was to decentralize custody of content to protect against powerful adversaries' attempts to modify it. Governments and other powerful actors have a long history of censorship and suppression of inconvenient content. A centralized archive system allows them to focus their legal, technical or economic power on a single target.

Today Boing-Boing points us to a current example of government suppression of inconvenient content that drives home the point.

Scientists say the closure of some of the world's finest fishery, ocean and environmental libraries by the Harper government has been so chaotic that irreplaceable collections of intellectual capital built by Canadian taxpayers for future generations has been lost forever.

Many collections such as the Maurice Lamontagne Institute Library in Mont-Joli, Quebec ended up in dumpsters while others such as Winnipeg's historic Freshwater Institute library were scavenged by citizens, scientists and local environmental consultants. Others were burned or went to landfills, say scientists.

Read the whole piece, especially if you think single, government-funded archives are a solution to anything.

ALA Equitable Access to Electronic Content: What’s next for surveillance reform?

planet code4lib - Fri, 2014-01-03 23:21

Photo by vpickering via flickr.

Library advocates can expect 2014 to be a year full of privacy and surveillance issues. The potential for reform of our country’s surveillance laws and programs is serious business, and contributing to reform efforts will take dedicated association and grassroots resources.

As noted in the first part of our blog series on surveillance, one of the considerations for the American Library Association (ALA) is how our association can best contribute to reform efforts, and how these issues relate to ALA’s other legislative and policy priorities. What separates the current surveillance situation from previous years, when little or no reform came to fruition? Equally important: Are ALA members, and other library supporters, willing to do the heavy grassroots lifting to make an impact?

Will 2014 be any different?

First: The public now knows what we couldn’t know before: We have confirmation of the tremendous breadth and degree of National Security Agency (NSA) surveillance programs because of the Edward Snowden revelations. The NSA surveillance programs were secret. Many other law enforcement activities weren’t public knowledge because the U.S.A. PATRIOT Act and the Foreign Intelligence Surveillance Act (FISA) include standing gag order provisions, meaning that no individual or institution could reveal that they had received a FISA or PATRIOT Act subpoena/order or national security letter. Such approaches by law enforcement permanently kept government information secret.

The “Connecticut Four” library consortium case was the first, and one of only a few, examples of a successful court-ordered lifting of a gag order imposed under these laws. This permitted the four librarians involved to speak publicly about the government’s attempt to search the consortium’s patron internet records.

Second: All three branches of the federal government are now examining government surveillance activities. Congress has over two dozen bills introduced; there are two contradictory judicial decisions, not just by the secret Foreign Intelligence Surveillance Court (FISC); and there is now a White House task force report that does not fully accept the premises held by other parts of the executive branch, such as the U.S. Department of Justice (DOJ).

Third: The public has more possible avenues to push for reforms and to articulate why library supporters – and the general public – should and must be involved in the debates. I am personally concerned that too many in the general public, including some library supporters, are numbed by these challenges to privacy. For ALA’s grassroots advocacy on these federal issues, there will have to be more activity and soul-searching to increase advocate involvement in the coming year.

Fourth: There are more issues to address now, not “just” the surveillance of phone records and the government’s collection of personal information. Other issues include open government and transparency, secret interpretations of laws, and reform of the FISA court and its procedures. The debate must also address requiring more reporting and public information about surveillance activities, and how to establish effective and comprehensive oversight by Congress. Add in concerns about whistleblowers and potential threats to journalists, and it is a very full public agenda.

What does this all mean for ALA?

All of these issues, in some way or another, are part of ALA’s legislative and policy agenda based upon the library community’s long-standing commitment to First Amendment and privacy principles. The surveillance issues alone make for a very full agenda. How should ALA address these issues and how deeply? Are ALA members and other library supporters willing to get involved and to support community and public engagement?

In my next article, I’ll discuss: What are ALA’s resources to promote community engagement, advocacy and privacy education around surveillance issues? How will these debates move forward during Midwinter 2014 in Philadelphia and beyond? Finally, what will advocates learn from a Guardian reporter who will speak at Midwinter?


OCLC Dev Network: WorldCat Metadata API Temporarily Unavailable

planet code4lib - Fri, 2014-01-03 17:37
Related Web Service(s):  WorldCat Metadata API

We are experiencing a problem with the WorldCat Metadata API web service and it is temporarily unavailable. Our investigation so far points to a communication breakdown between the web service and one of its dependencies. We are analyzing options for resolving this as quickly as possible and will provide an update to the Developer Network community on Wednesday, January 8th. We apologize for any inconvenience this may cause, as we know that there is a lot of interest in this API.

Ribaric, Tim: Progress on Sabbatical (Part 2)

planet code4lib - Fri, 2014-01-03 16:30

 *

(* Officially Now)

Since we last met the wheels have been turning. 


Rochkind, Jonathan: CC 4.0 updated for use with data?

planet code4lib - Fri, 2014-01-03 14:53

I had a previous post on how Creative Commons licenses aren’t (weren’t?) suitable for licensing data, for several reasons. 

Charles Nepote helpfully commented on that post linking to a December announcement on changes in CC 4.0 related to data licensing.

It appears that CC agreed that past CC licenses were problematic for data use, and is attempting to address that, mainly by explicitly addressing ‘database rights’ in addition to copyright. CC licenses’ legality was previously based on the licensor having copyright in the thing licensed, and the rights to grant licenses and enforce restrictions through copyright. But in many jurisdictions (including the U.S.), there may not be any copyright over ‘data’, while in some jurisdictions (but not the U.S.) there may (instead or in addition) be certain legal ‘database rights’.

So it looks like CC 4.0 tries to invoke database rights as the basis for licensing in contexts where database rights may exist but copyright does not.

Additionally, according to the announcement, CC 4.0 tries to be more flexible about how ‘attribution’ requirements can be complied with, in ways that will make them more reasonable for data uses. I’m not sure if this is represented in the actual license legal language, or just in the FAQs and other supporting documents.

I haven’t spent a lot of time looking over the changes myself and have no opinion on how effective or suitable they are.  I continue to have some concerns about data ‘licensing’ in the U.S. where some things we think of as ‘data’ will not be copyrightable (those that are considered by the courts to be mainly ‘factual’ information — which may or may not be what you or I would consider to be ‘mainly factual information’).  And in the U.S., there is no such thing as distinct ‘database rights’ at all.

If you have neither copyright nor ‘database rights’ over data, then you really have no legal ability to enforce restrictions on its use at all, and trying to convince people you do anyway is really a form of copyfraud: over-reaching by content ‘owners’ (or controllers) trying to restrict the rights of the public beyond what is legally intended. I think we should be encouraging more widespread recognition of existing public rights to use certain things (like data which is not copyrightable) without permission, rather than encouraging content controllers to try and convince people they need permission when the law doesn’t support that.

The CC data FAQ does try to recognize this, with statements like “If you are not exercising an exclusive right held by the database maker, then you do not need to rely on the license to mine.” This is great. As is their stated effort to make sure that “CC license terms and conditions are not triggered by uses permitted under any applicable exceptions and limitations to copyright, nor do license terms and conditions apply to elements of a licensed work that are in the public domain. This also means that CC licenses do not contractually impose restrictions on uses of a work where there is no underlying copyright.” This is great too!

But CC doesn’t offer much practical guidance on figuring out when this is the case.  Nor is it probably feasible to offer such guidance, as it’s a complicated legal question which can differ by jurisdiction. But I’d rather we were encouraging and supporting people to expand their use of legally unencumbered data, rather than providing tools which encourage treating possibly unencumbered data as if it were legally controlled.

For that reason, I continue to support strongly considering CC0 or equivalent releases or dedications (rather than licenses strictly speaking) for data: they simply release the data into the public domain in jurisdictions where that’s required, while acknowledging that in some jurisdictions it may not have been required at all. I think it’s better for all of us.

However, I’m glad that CC is at least recognizing some of the issues and attempting to address them. Previous warnings about previous versions of CC being unsuitable for data didn’t seem to impede its widespread use for data anyway! It’s definitely worth reading the post about CC 4.0 and open data, as well as the Creative Commons Data Guidance page linked to.

I’m also quite pleased to see in that guidance page that “CC does not recommend use of its NonCommercial (NC) or NoDerivatives (ND) licenses on databases intended for scholarly or scientific use.” — I guess that leaves attribution (BY) and “share alike” (SA)?  As well as recommending against trying to license “some rather than all of the rights they have in a database.” If people stick to these and similar recommendations, using only a “BY” license and expecting the more liberal conceptions of attribution compliance, my concerns will be ameliorated. (“SA” is still very tricky, and could create license incompatibilities where you can’t combine data that at some point came from a CC-SA licensed database with data from other sources with other licenses – if the data is controlled by copyright or database rights in the first place, which it may not be even if originators try to tell you it is!)

In general, I appreciate the intentions of CC towards data with the CC 4.0 licenses, and consider it a step forward, although I’d still strongly advise considering whether you can simply allow your data to be used unencumbered rather than attempting to impose restrictions on its use — recognizing it may have no legal protection anyway, and it may be difficult to determine whether it does or not, especially in the U.S. But if you do decide you need to try and impose restrictions, following CC’s advice on how to do this, and using CC 4.0 to do so (preferably just “BY”), seems probably a good way to go.



Williams, Mita: A Place for Place

planet code4lib - Fri, 2014-01-03 09:37
There has only been one department in the 375+ year history of Harvard that has ever been dismantled, and that was the Geography Department. Since then many other Geography Departments have been dealt a similar fate, including the one at My Own Place of Work, which disappeared some years before I started my employment there. Some of its faculty remain at the university, either exiled to Sociology or Political Science or regrouped as Earth Sciences, depending on which of The Two Cultures they pledged allegiance to.

I have an undergraduate degree in Geography and Environmental Science and as such I sometimes feel that I'm part of an academic diaspora.

So after almost 20 years of librarianship I've made one of my sabbatical goals to ‘re-find my inner geographer.’ My hope is that through my readings I will be able to find and apply some of the theories and tools that geographers use in my own practice.

I think I have already found a good example to use as a starting point as I try to explain in this post what sort of ground I'm hoping to explore and how it may apply to librarianship.



It came to me as I was browsing through the most recent issue of Antipode: The Radical Journal of Geography, when my eyes immediately fell on an article whose topic was literally close to home: migrant worker experiences in “South-Western Ontario”.

I had to download and scan most of the article before I could learn that what was being referred to as ‘South-Western Ontario’ was actually east of where I live. And that’s when I noticed that the official keywords associated with the article (migrant workers; agriculture; labour control; Seasonal Agricultural Workers Program) made no mention of place. And this struck me as a curious practice for a journal dedicated to *geography*.

But I know better than to blame the editors of Antipode for this oversight. The journal is on the Wiley publishing platform (which they call the “Wiley Online Library”, huh), which provides a largely standardized reading experience across the disciplines. On one hand, it’s understandable that location isn't a standardized metadata field for academic articles, as many papers in many disciplines aren't concerned with a particular place. On the other hand, I do think it is telling that within academia much more care and effort is dedicated to clarifying the location of the author than that of the subject at hand.

(I will, however, blame the editors for using the phrase ‘South-Western Ontario’ when the entire world uses ‘Southwestern Ontario’ in reference to these parts. Their choice of spelling means that if you search the “Wiley Online Library” for Southwestern Ontario, the article in question does not even show up.)

There is another reason why I'm concerned that the article at hand doesn't have a metadata field to express location, and it is this: without a given location, the work cannot be found on a map. And that’s increasingly going to be a problem, because the map is increasingly where we will live.

Let me explain what I mean by that.

You may know that Google became the pre-eminent search engine on the strength of its PageRank algorithm which, unlike its peers at the time, created relevance rankings that take into account the number of incoming links to a page as a means of establishing authority and popularity, making results more resistant to spam.
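As a toy illustration (and only that; Google's production ranking has never been public in detail), the heart of PageRank fits in a few lines: rank flows along incoming links and settles after repeated passes:

    import numpy as np

    # Toy PageRank by power iteration: a page's rank is fed by the pages
    # linking to it. links[i] lists the pages that page i links out to.
    links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
    n, d = 4, 0.85                      # d is the customary damping factor

    # Column-stochastic matrix: M[j, i] = 1/outdegree(i) when i links to j.
    M = np.zeros((n, n))
    for i, outs in links.items():
        for j in outs:
            M[j, i] = 1.0 / len(outs)

    ranks = np.full(n, 1.0 / n)         # start with rank spread evenly
    for _ in range(50):                 # iterate until the ranks settle
        ranks = (1 - d) / n + d * M.dot(ranks)

    print(ranks)  # page 2, with three incoming links, comes out on top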

In those heady, early days of the Internet finding news and more from around the world was deliriously easy. Oddly enough one of the challenges of using the Internet back then was that it was hard to find info about the features of your small town. The Internet was wonderfully global but not very good at the local.

But now, in 2014, when I search for the word ‘library’ using Google, I receive my local library system as the first result.



This is because Google is now thought to incorporate some 200 factors into its page ranking.

And one of the most important factors is location.

In fact, I would go so far as to say that, just like real estate, the three most important factors for search are location, location, location.

It's location because if you search for political information while in Beijing, your experience of the Internet is going to be significantly different from that in Berlin, because of government-enforced filtering and geofencing.

It's location because if you search for Notre Dame in the United States you are probably going to get something related to football rather than a cathedral in Paris.

And it's location because so much of our information seeking is context-based. If I'm searching for information about a particular chemical additive while at a drug store, it’s probably because I'm about to make a consumer choice about a particular shampoo and not because I need to know that chemical's melting point.

(An aside: imagine if, by the very act of entering a library space, your searches automatically returned more scholarly results. Imagine if, as you travelled to different spaces on a campus, your search results were automatically weighted by the context of a particular scholarly discipline?)

While it’s difficult to imagine navigating a map of research papers, it is much easier to understand and appreciate how a geographical facet could prove useful in other interfaces. For example, if I'm looking for articles about whether a particular social work practice conforms to a particular provincial law in Canada, then the ability to either pre-select articles from that province or filter a list of results down to those pertaining to that province could prove quite useful.

It's surprising how few of our library interfaces have this ability to limit by region. Summon doesn't. Neither does Primo. But Evergreen does and so does Blacklight.
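In Blacklight, for instance, exposing a region limit is mostly a matter of declaring a facet on a geographic field in the catalog controller. A minimal sketch, assuming your Solr index has such a field (the name subject_geo_facet is an assumption; use whatever your records actually store):

    # app/controllers/catalog_controller.rb
    class CatalogController < ApplicationController
      include Blacklight::Catalog

      configure_blacklight do |config|
        # Expose a sidebar facet so users can limit results by region.
        # 'subject_geo_facet' is a hypothetical Solr field name.
        config.add_facet_field 'subject_geo_facet', label: 'Region', limit: 10
      end
    end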





There are other examples of using maps to discover texts. OCLC has been experimenting with placing books on a map. They did this by parsing and geocoding the geographic subdivisions of Library of Congress Subject Headings, so that books can be found on a map from a desktop, or near wherever you happen to be standing with a mobile phone.
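The parsing half of that job is simple enough to sketch: an LCSH geographic subdivision reads from the general to the specific, so reversing its parts yields a string most geocoding services will accept. A rough sketch, where geocode() is a hypothetical stand-in for whatever service is used:

    # Turn an LCSH geographic subdivision into a geocodable place name.
    def lcsh_geo_to_place(subdivision):
        # 'United States -- California -- San Francisco'
        #   becomes 'San Francisco, California, United States'
        parts = [p.strip() for p in subdivision.split('--')]
        return ', '.join(reversed(parts))

    place = lcsh_geo_to_place('United States -- California -- San Francisco')
    # lat, lon = geocode(place)   # geocode() is a hypothetical stand-in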





And there are many, many projects that seek to place digitized objects on a map, such as the delightful HistoryPin, which allows you to find old photos of a particular place but of a different time, visible only when you look at the world through the magical lens of your computer or your mobile phone.

Less common are those projects which seek to make actual texts (or, as we say in the profession, the full text) accessible in particular places outside of the library. One of my favourites among such projects is the work of Peter Rukavina, who has placed a PirateBox near a park bench in Charlottetown, PEI, that makes available a wide variety of texts: works of fiction (yes, about that red-headed girl), a set of city bylaws, and a complete set of community histories from the IslandLives repository.

When you think about embedding the world with hidden layers of images and text that can only be unlocked if you know its secrets, well, that sounds to me like a gateway to a whole other world of experience: games, and ARGs or alternate reality games in particular. Artists, museums, and historians have created alternate reality games that merge the physical exploration of place with narrative, and in doing so have created new forms of story writing and storytelling.

Personally, I think it's very important that libraries become more aware of the possibilities of in situ searching and discovery in the field, and there are many fields worth considering. Over the holiday break, I bought the Audubon Birding App, which acts as a field guide and reference work, includes a set of vocal tracks for each bird to help with identification, stores my personal birding life list, and provides a means to report geocoded bird sightings to eBird -- all while being half the price of a comparable print work. We, the people of print, have a tendency to dismiss and scoff at talk of the end of the print book, but I don't see any of the reference works on our shelves providing this degree of value to our readers.

In my opinion, there’s not enough understanding of this potential future of works that take into account the context of place. Otherwise, why would our institutions force our users to visit a physical library in order to access a digitized copy of historical material that we might already have in our collection on microfilm?

So, as you can see, there’s a lot of territory for me to explore during the next 12 months, and I think I'm going to start by going madly off in all directions.

I do hope that by the end of this time I will have made a convincing argument to my peers that we have an opportunity here to do better. I hope that one day the article that started this train of thought - the one about migrant agricultural workers in South-Western Ontario - will, when and if it's included in a library-maintained institutional repository, have a filled-out location field.

And then perhaps one day, those in the future who will work those fields in South-Western Ontario can discover it where they work.

Farkas, Meredith: Creating safe spaces vs. freedom of expression

planet code4lib - Thu, 2014-01-02 17:00

I don’t really think of myself as a victim. I feel exceptionally lucky when I think of the life I’ve had so far. I grew up in an upper middle class family, got a fantastic private liberal arts college education, and am now living the American dream with a family of my own, a house, and a good career. And yet, when I read about the ALA Conference Code of Conduct (or, to be precise, the Statement of Appropriate Conduct at ALA Conferences), I started to think back to how many times I have been a victim of things that violate it.

In middle school, I was bullied. In one case that lasted for a year and was very painful, it was about my religion, but there were all sorts of other reasons. In my teens and twenties, I experienced plenty of sexual harassment and unwanted advances including things I’d never mention on this blog. Like most women, I’ve had men yell lewd things from car windows, construction sites and the like. I had a creepy stalker at my first professional social work job. I stopped using instant messaging many years ago because a librarian I’d never actually met professed his love for me and kept contacting me every time I showed up online (did he ever go offline???). Someone at a previous job sexually harassed me and a number of other women and got away with it (a colleague finally reported him, but he simply left for another job at which he probably continued his inappropriate behavior). At an ALA Annual Conference way back when, I was repeatedly sexually propositioned by someone I had respected who invaded my space and made me feel deeply uncomfortable. And that doesn’t include all of the subtle incidents of discrimination, which are far more numerous. Many of my friends in the profession have been through far worse, including things that made their work-lives hell.

And yet, with all that history, I’ll admit that my very first reaction to the ALA Conference Code of Conduct bore more resemblance to a better-informed Will Manley (cached here if it’s still inaccessible) than an Andromeda Yelton. It rankled the non-conformist intellectual-freedom-loving part of me. I’m fairly convinced it’s because of the title. Were it called ALA Policy on Harassment and Discrimination at Conferences, I probably wouldn’t have batted an eye, but Code of Conduct or Statement of Appropriate Conduct just feels very, I hate to say, big brother.

When I think about some of the best talks I’ve been to, the speakers provoked. They cursed. They pissed people off. They made us think. I despise political correctness, maybe a reaction to graduating from the University that was the model for the movie PCU, and believe that people can have respectful discussions without being politically correct. Then again, I don’t think a Code of Conduct is going to make people like Jenica Rogers and Sarah Houghton be any less awesome and provocative in conference talks than they are (if you look, the Code just says to consider how others may perceive what you’re going to say). I hope librarians are not actually going to report other librarians who use the F-word or say something they don’t like (but isn’t discriminatory or harassing), but I know that we all have different sensibilities and different triggers based on our backgrounds. I recently read a post where a blogger argued that a decidedly sophomoric discussion on Facebook about consensual sex at the ALA Conference was consistent with rape culture. When I checked out the link she supplied, I thought it was stupid rather than offensive, but it’s a great example of how people’s freedom to be sex-positive comes into conflict with other people’s need to feel safe in a space. The same goes for people’s freedom of speech and other people’s more traditional values. I’m pretty sure that the people at ALA Conference Services are smart enough to handle complaints in an intelligent manner, gathering evidence from a variety of people involved to come to a constructive conclusion, but I do hope they’ve had conversations internally about their own planned responses to complaints. Then again, I’m sure they’ve dealt with complaints before this new policy.

I initially thought when I read the Code of Conduct that it won’t really make a difference, because it’s just a policy and no one actually pays attention to those. But then my husband told me about how the online community he manages changed after he instituted a code of conduct. Before the code, there were lots of offensive posts and personal attacks on the site. After the code, things actually changed. And it didn’t actually require much enforcement from moderators. Some people left the site because of it. Others changed their tone. And a lot of new people started contributing. Many women have told my husband that they didn’t feel comfortable in the community before the code of conduct was in place and now they do. It’s really encouraging to hear that a code of conduct can truly change behavior without requiring a lot of actual moderation and punishment.

But in an online community, everything happens in public. Everyone can see the horrible behavior. At ALA conferences, the victim of the harassment will have to report it. And, as Coral Sheldon-Hess describes, it’s easier said than done:

 I don’t know if I can convey the feeling of powerlessness that comes from being harassed, or even from unintentional neglect/mistreatment due to (for instance) gender or disability, or from comments people might make about a group they don’t know you’re a member of; I hope that even if you haven’t had these experiences, you will at least believe me that the perceived lack of power in those situations can be completely stifling. When I’ve experienced these kinds of mistreatment, even though there was usually no direct physical threat, I have nonetheless felt very unsafe, totally out on a limb, and completely alone. Looking back, some of my biggest regrets are my failures to respond to those situations, but it is so difficult. Something about having a CoC really helps, because it’s a public, visible statement that the whole organization has your back, and it’s OK to respond, either directly, or by asking for help from someone attached to the event/organization.

I hate how weak I’ve felt every time I’ve been harassed, and how I’ve second-guessed whether what the other person did was harassment. I’m very happy that they’ve explicitly provided a way to report harassment in this policy, but I think for this to really be successful there need to be real and visible consequences that make people believe it’s worth reporting an offender. Otherwise, many people who’ve been victimized will not report it, especially if they have a history of not getting justice. A policy is great, but the organization needs to do more to encourage people to come forward. Again, I’ll reiterate that I hope ALA has internally discussed how they should respond to allegations.

In the end, I think it’s a very good thing that our professional organization is saying that harassment and hate speech are not acceptable at our conferences and giving people a way to report offenders. I understand people’s concerns about intellectual freedom, but I think it’s critically important that we create safe spaces at conferences where everyone feels comfortable attending and participating. Because conferences are better when they encourage the sharing of diverse viewpoints, and people won’t feel safe doing that if certain groups of people feel shamed or silenced. Creating safe spaces, in my opinion, encourages freedom of speech, because it encourages everyone to speak.

At the same time, I’m glad people are bringing up intellectual freedom concerns because it is a cornerstone of our profession and one we should never treat as a minor concern. Truly hearing and responding respectfully to those concerns can only strengthen the arguments for a Code of Conduct (or whatever people want to call it). 

Photo credit: Freedom of Expression by Harald Groven on Flickr (cc license: Attribution ShareAlike 2.0)

Bisson, Casey: Where on earth can I get a woetype list?

planet code4lib - Thu, 2014-01-02 05:12

It’s not like these aren’t documented, but I keep forgetting where.

WOEID place types:

$woetype = array(
    '7'  => 'town',
    '8'  => 'state-province',
    '9'  => 'county-parish',
    '10' => 'district-ward',
    '11' => 'postcode',
    '12' => 'country',
    '19' => 'region',
    '22' => 'neighborhood-suburb',
    '24' => 'colloquial',
    '29' => 'continent',
    '31' => 'timezone',
);

They can be queried via YQL:

<?xml version="1.0" encoding="UTF-8"?>
<placeTypes xmlns="http://where.yahooapis.com/v1/schema.rng"
            xmlns:yahoo="http://www.yahooapis.com/v1/base.rng"
            yahoo:start="0" yahoo:count="1" yahoo:total="1">
  <placeType yahoo:uri="http://where.yahooapis.com/v1/placetype/35" xml:lang="en-us">
    <placeTypeName code="35">Historical Town</placeTypeName>
    <placeTypeDescription>A historical populated settlement that is no longer known by its original name</placeTypeDescription>
  </placeType>
</placeTypes>
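For reference, a response like the one above can be fetched straight from the GeoPlanet URI it carries, or via YQL. The appid parameter and the geo.placetypes table name below are from memory, so treat them as assumptions:

    # Direct REST lookup (the URI comes from yahoo:uri above)
    curl 'http://where.yahooapis.com/v1/placetype/35?appid=YOUR_APP_ID'

    -- YQL equivalent, assuming the geo.placetypes table
    SELECT * FROM geo.placetypes WHERE placetype = 35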

Mark Ockerbloom, John: Public Domain Day 2014: The fight for the public domain is on now

planet code4lib - Wed, 2014-01-01 19:42

New Year’s Day is upon us again, and with it, the return of Public Domain Day, which I’m happy to see has become a regular celebration in many places over the last few years.  (I’ve observed it here since 2008.)  In Europe, the Open Knowledge Foundation gives us a “class picture” of authors who died in 1943, and whose works are now entering the public domain there and in other “life+70 years” countries.  Meanwhile, countries that still hold to the Berne Convention’s “life+50 years” copyright term, including Canada, Japan, New Zealand, and many others, get the works of authors who died in 1963.  (The Open Knowledge Foundation also has highlights for those countries, where Narnia/Brave-New-World/purloined-plums crossover fanfic is now completely legal.)  And Duke’s Center for the Study of the Public Domain laments that, for the 16th straight year, the US gets no more published works entering the public domain, and highlights the works that would have gone into the public domain here were it not for later copyright extensions.

It all starts to look a bit familiar after a few years, and while we may lament the delays in works entering the public domain, it may seem like there’s not much to do about it right now.  After all, most of the world is getting another year’s worth of public domain again on schedule, and many commentators on the US’s frozen public domain don’t see much changing until we approach 2019, when remaining copyrights on works published in 1923 are scheduled to finally expire.  By then, writers like Timothy Lee speculate, public domain supporters will be ready to fight the passage of another copyright term extension bill in Congress like the one that froze the public domain here back in 1998.

We can’t afford that sense of complacency.  In fact, the fight to further extend copyright is raging now, and the most significant campaigns aren’t happening in Congress or other now-closely-watched legislative chambers.  Instead, they’re happening in the more secretive world of international trade negotiations, where major intellectual property hoarders have better access than the general public, and where treaties can be used to later force extensions of the length and impact of copyright laws at the national level, in the name of “harmonization”.   Here’s what we currently have to deal with:

Remaining Berne holdouts are being pushed to add 20 more years of copyright.  Remember how I said that Canada, Japan, and New Zealand were all enjoying another year of “life+50 years” copyright expirations?  Quite possibly not for long.  All of those countries are also involved in the Trans-Pacific Partnership (TPP) negotiations, which include a strong push for more extensive copyright control.  The exact terms are being kept secret, but a leaked draft of the intellectual property chapter from August 2013 shows agreement by many of the countries’ trade negotiators to mandate “life+70 years” terms across the partnership.  That would mean a loss of 20 years of public domain for many TPP countries, and ultimately increased pressure on other countries to match the longer terms of major trade partners.  Public pressure from citizens of those countries can prevent this from happening– indeed, a leak from December hints that some countries that had favored extensions back in August are reconsidering.  So now is an excellent time to do as Gutenberg Canada suggests and let legislators and trade representatives know that you value the public domain and oppose further extensions of copyright.

Life+70 years countries still get further copyright extensions.   The push to extend copyrights further doesn’t end when a country abandons the “life+50 years” standard.  Indeed, just this past year the European Union saw another 20 years added on to the terms of sound recordings (which previously had a 50-year term of their own in addition to the underlying life+70 years copyrights on the material being recorded.)  This extension is actually less than the 95 years that US lobbyists had pushed for, and are still pushing for in the Trans-Pacific Partnership, to match terms in the US.

(Why does the US have a 95-year term in the first place that it wants Europe to harmonize with?  Because of the 20-year copyright extension that was enacted in 1998 in the name of harmonizing with Europe.  As with climbers going from handhold to handhold and foothold to foothold higher in a cliff, you can always find a way to “harmonize” copyright ever upward if you’re determined to do so.)

The next major plateau for international copyright terms, life+100 years, is now in sight.  The leaked TPP draft from August also includes a proposal from Mexico to add yet another 30 years onto copyright terms, to life+100 years, which that country adopted not many years ago.  It doesn’t have much chance of passage in the TPP negotiations, where to my knowledge only Mexico has favored the measure.  But it makes “life+70” seem reasonable in comparison, and sets a precedent for future, smaller-scale trade deals that could eventually establish longer terms.  It’s worth remembering, for instance, that Europe’s “life+70” terms started out in only a couple of countries, spread to the rest of Europe in European Union trade deals, and then to the US and much of the rest of the world.  Likewise, Mexico’s “life+100” proposal might be more influential in smaller-scale Latin American trade deals, and once established there, spread to the US and other countries.  With 5 years to go before US copyrights are scheduled to expire again in significant numbers, there’s time for copyright maximalists to get momentum going for more international “harmonization”.

What’s in the public domain now isn’t guaranteed to stay there.  That’s been the case for a while in Europe, where the public domain is only now getting back to where it was 20 years ago.  (The European Union’s 1990s extension directive rolled back the public domain in many European countries, so in places like the United Kingdom, where the new terms went into effect in 1996, the public domain is only now getting to where it was in 1994.)  But now in the US as well, where “what enters the public domain stays in the public domain” has been a long-standing custom, the Supreme Court has ruled that Congress can in fact remove works from the public domain in certain circumstances.  The circumstances at issue in the case they ruled on?  An international trade agreement, which as we’ve seen above is now the prevailing way of getting copyrights extended in the first place.  Even an agreement that just establishes life+70 years as a universal requirement, but doesn’t include the usual grandfathered exception for older works, could put the public domain status of works going back as far as the 1870s into question, as we’ve seen with HathiTrust international copyright determinations.

But we can help turn the tide.  It’s also possible to cooperate internationally to improve access to creative works, and not just lock it up further.  We saw that start to happen this past year, for instance, with the signing of the Marrakesh Treaty on copyright exceptions and limitations, intended to ensure that those with disabilities that make it hard to read books normally can access the wealth of literature and learning available to the rest of the world.  The treaty still needs to be ratified before it can go into effect, so we need to make sure ratification goes through in our various countries.  It’s a hopeful first step in international cooperation increasing access instead of raising barriers to access.

Another improvement now being discussed is to require rightsholders to register ongoing interest in a work if they want to keep it under copyright past a certain point.  That idea, which reintroduces the concept of “formalities”, has been floated by some prominent figures like US Copyright Register Maria Pallante.  Such formalities would alleviate the problem of “orphan works” no longer being exploited by their owners but not available for free use.  (And a sensible, uniform formalities system could be simpler and more straightforward than the old country-by-country formalities that Berne got rid of, or the formalities people already accept for property like motor vehicles and real estate.)  Pallante’s initial proposal represents a fairly small step; for compatibility with the Berne Convention, formalities would not be required until the last 20 years of a full copyright term.  But with enough public support, it could help move copyright away from a “one size fits all” approach to one that more sensibly balances the interests of various kinds of creators and readers.

We can also make our own work more freely available.  For the last several years, I’ve been applying my own personal “formalities” program, in which I release into the public domain works I’ve created that I don’t need to further limit.  So in keeping with the original 14-year renewable terms of US copyright law, I now declare that all work that I published in 1999, and that I have sole control of rights over, is hereby dedicated to the public domain via a CC0 grant.  (They join other works from the 1900s that I’ve also dedicated to the public domain in previous years.)  For 1999, this mostly consists of material I put online, including all versions of  Catholic Resources on the Net, one of the first websites of its kind, which I edited from 1993 to 1999.  It also includes another year’s history of The Online Books Page.

Not that you have to wait 14 years to free your work.  Earlier this year, I released much of the catalog data from the Online Books Page into the public domain.  The metadata in that site’s “curated collection” continues to be released as open data under a CC0 grant as soon as it is published, so other library catalogs, aggregators, and other sites can freely reuse, analyze, and republish it as they see fit.

We can do more with work that’s under copyright, or that seems to be.  Sometimes we let worries about copyright keep us from taking full advantage of what copyright law actually allows us to do with works.  In the past couple of years, we saw court rulings supporting the rights of Google and HathiTrust to use digitized, but not publicly readable, copies of in-copyright books for indexing, search, and preservation purposes.  (Both cases are currently being appealed by the Authors Guild.)  HathiTrust has also researched hundreds of thousands of book copyrights, and as of a month ago they’d enabled access to nearly 200,000 volumes that were classified as in-copyright under simple precautionary guidelines, but determined to be actually in the public domain after closer examination.

In the coming year, I’d like to see if we can do similar work to open up access to historical journals and other serials as well.  For instance, Duke’s survey of the lost public domain mentions that articles from the 1957 issues of major science journals like Nature, Science, and JAMA are behind paywalls, but as far as I’ve been able to tell, none of those three journals renewed copyrights for their 1957 issues.  Scientists are also increasingly making current work openly available through open access journals, open access repositories, and even discipline-wide initiatives like SCOAP3, which also debuts today.

There are also some potentially useful copyright exemptions for libraries in Section 108 of US copyright law that we could use to provide more access to brittle materials, materials nearing the end of their copyright term, and materials used by print-impaired users.

Supporters of the public domain that sit around and wait for the next copyright extension to get introduced into their legislatures are like generals expecting victory by fighting the last war. There’s a lot that public domain supporters can do, and need to do, now.  That includes countering the ongoing extension of copyright through international trade agreements, promoting initiatives to restore a proper balance of interest between rightsholders and readers, improving access to copyrighted work where allowed, making work available that’s new to the public domain (or that we haven’t yet figured out is out of copyright), and looking for opportunities to share our own work more widely with the world.

So enjoy the New Year and the Public Domain Day holiday.  And then let’s get to work.


Hellman, Eric: In 2013, eBook Sales Collapsed... in My Household.

planet code4lib - Wed, 2014-01-01 14:54
2013 was not the year the ebook industry was expecting. We hoped that ebooks would continue their explosive year-on-year revenue growth, and that the replacement of print by digital would proceed apace. We suspected that the growth of ebook sales might moderate, because, as HarperCollins CEO Brian Murray told Publishers Weekly, "Nothing grows by triple digits for too long." But just as CDs replaced vinyl and digital downloads replaced CDs, it seemed obvious that the age of the printed book was nearing its end; the century of the ebook was dawning.

We got a few things right. Internationally, ebook sales growth was strong. Print continued its slow decline. Bookstores continued to close. But for some reason, ebook sales in the US stopped increasing. And even started declining!

There are many possible explanations for this turn of events. There are technicalities with the data collection, particularly with publishers such as Amazon's imprints that don't report their sales numbers. Young Adult sales dropped steeply, as there was no smash hit to follow on the huge success of Hunger Games. 50 Shades of Grey didn't turn out to be a lasting relationship. And there's been a downward trend in prices, particularly as publishers start to use dynamic pricing to stimulate sales. But it seems to me that something in the environment is changing, more than just a market maturation.

Amazon probably has enough data to understand what's happening, but they're notoriously opaque about reporting numbers. On the other hand, they're quite good about reporting to customers what they've bought. So I decided to analyze my own household's Amazon data. I had the impression that my family was spending less on ebooks, but I wasn't sure, because they still seem to spend all hours of the day reading. The results were kind of shocking.
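Amazon lets you export order history as a spreadsheet, so the tally itself is only a few lines of Python. A minimal sketch, assuming a CSV export with order_date and item_total columns; the real export's column names may differ:

    import csv
    from collections import defaultdict

    # Sum ebook spending per year from an exported order-history CSV.
    totals = defaultdict(float)
    with open('kindle_orders.csv', newline='') as f:
        for row in csv.DictReader(f):
            year = row['order_date'][:4]              # ISO dates: 'YYYY-...'
            totals[year] += float(row['item_total'].lstrip('$'))

    for year in sorted(totals):
        print(year, round(totals[year], 2))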

The graph shows my household Kindle ebook purchases from 2009-2013. As you can see, 2013 marked a steep drop from the 2009-2011 peak years of about $1000 per year.

I don't buy Kindle ebooks myself (I buy ePub only, so I can hack on them) but other members of my household have bought quite a lot. The average price paid is about $7, and this has held quite steady. But in 2013, Kindle purchases stopped almost completely, and they were not replaced by purchases on other platforms.

Based on detailed "interviews" with the subject ebook purchasers, here are some non-factors in this collapse:
  1. "Netflix-for-Books" services. Nobody but me has heard of them.
  2. Kindle Owner's Lending Library. Despite a well-used Amazon Prime subscription, they haven't figured out how to use it for ebooks.
  3. Our public library. Nobody but me has used it for ebooks.
  4. Piracy. As if!
The two main reasons for this spending collapse turn out to be:
  1. The Kindle acquired in early 2009 reached end-of-life due to a cheaply made power cord, and was replaced by an iPad. The lack of in-app purchase in the Kindle App has been a significant impediment to Kindle purchases. The iBookStore has not attracted a single ebook purchase.
  2. The iPad owner now spends the vast majority of her reading time in fan-fiction websites, mostly fanfiction.net and ArchiveOfOurOwn.org. Same for the iPad borrower, but a different mix of websites.
I find it worrying that the Justice Department pursued a big antitrust suit against Apple and 5 of the big 6 publishers, won, and despite making an issue of Apple's in-app purchase ban in iOS, seems to have lost that argument with Judge Cote. We'll see how that turns out.

It's worth paying close attention to the fan fiction sites. After all, 2012's biggest revenue engine for the book industry, 50 Shades, was a repackaged fanfic. On an iPad with a decent internet connection, the fanfic sites work better than ePubs. They link and they script. Just try making a link from one ePub to another and you'll get my point.  They deliver content in smaller, more addictive chunks, and they integrate popular culture MUCH more effectively than books do, for reasons relating primarily to copyright. The authors are responsive and deeply connected to readers; they often ARE the readers!

There's a fanfic site to appeal to every reader; I highlighted Wattpad earlier this year. ArchiveOfOurOwn.org ("AO3"), a project of the Organization for Transformative Works, a non-profit, experienced the growth in 2013 that was missing from the ebook sector. The number of works hosted by AO3 doubled to just under a million works, covering almost 14,000 "fandoms". (A good example of a fandom is the "Dragonriders of Pern" fandom, which currently hosts 534 works.) Fanfiction.net, an advertising-supported site, hosts almost 2000 fandoms and over 1.3 million works, more than half of which are in the Harry Potter or Twilight fandoms. Game-oriented discussion forums also engage in fanfiction. (Popular in my house is spacebattles.com.)

My anecdata might be completely anomalous, although Amazon, a very data-driven company, seems to be aware of the same phenomena. They've been making the Kindle into a full-featured tablet to go head-to-head with the iPad. They've also launched a fanfic site called Kindle Worlds, which has 15 worlds and 341 works.

Early stage venture capitalist Josh Kopelman says that many of the best opportunities for startups are not those in expanding markets. "We love investing in technologies and business models that are able to shrink existing markets. If your company can take $5 of revenue from a competitor for every $1 you earn – let's talk!",  he has written on his firm's website. Kopelman founded Half.com in the early days of the internet, a company which shrank the book market by getting people to resell the books they had just bought for a fraction of the price of a new book. Microsoft's Encarta shrank the Encyclopedia business from $1.2B to $600M before Wikipedia shrank the business by another 90%.

In 2014, I'm guessing it's the book publishing industry's time to shrink. A convergence of tech startups, tech monsters, and tech non profits seems to be ready for the assault. The fanfic sites, the Wattpads, the Project Gutenbergs and the Manybooks, the Readmills, the Leanpubs and the Smashwords (and I hope the Unglue.its); these are people building the foundations of a creative industry that will flourish even if the ebook sales collapse that I see around me spreads to your house as well.

Happy New Year!

Rosenthal, David: Implementing DAWN?

planet code4lib - Wed, 2014-01-01 08:00
In a 2009 paper "FAWN: A Fast Array of Wimpy Nodes" David Andersen and his co-authors from CMU showed that a network of large numbers of small CPUs coupled with modest amounts of flash memory could process key-value queries at the same speed as the networks of beefy servers used by, for example, Google, but using 2 orders of magnitude less power. In 2011, Ian Adams, Ethan Miller and I proposed extending this concept to long-term storage in a paper called “Using Storage Class Memory for Archives with DAWN, a Durable Array of Wimpy Nodes”. DAWN was just a concept; we never built a system.

Now, in a fascinating talk at the Chaos Communication Congress called "On Hacking MicroSD Cards", the amazing Bunnie Huang and his colleague xobs revealed that much of the hardware for a DAWN system may already be on the shelves at your local computer store. Below the fold, details of the double-edged sword that is extremely low-cost hardware, to encourage you to read the whole post and watch the video of their talk.

Bunnie points out that the complexity of the algorithms needed to manage flash memory, especially at the low end where the vendors need to cover up the miserable quality of the actual chips, is increasing:

Flash memory is really cheap. So cheap, in fact, that it’s too good to be true. In reality, all flash memory is riddled with defects — without exception. The illusion of a contiguous, reliable storage media is crafted through sophisticated error correction and bad block management functions. This is the result of a constant arms race between the engineers and mother nature; with every fabrication process shrink, memory becomes cheaper but more unreliable. Likewise, with every generation, the engineers come up with more sophisticated and complicated algorithms to compensate for mother nature’s propensity for entropy and randomness at the atomic scale.

The low-cost way to implement the "sophisticated and complicated algorithms" turns out to be a micro-controller chip. So your SD card is actually some flash memory plus:

The embedded microcontroller is typically a heavily modified 8051 or ARM CPU. In modern implementations, the microcontroller will approach 100 MHz performance levels, and also have several hardware accelerators on-die.

But:

The inevitable firmware bugs are now a reality of the flash memory business, and as a result it’s not feasible, particularly for third party controllers, to indelibly burn a static body of code into on-chip ROM. The crux is that a firmware loading and update mechanism is virtually mandatory, especially for third-party controllers.

xobs and Bunnie, who among his other amazing achievements reverse-engineered the encryption for the original Xbox, reverse-engineered the firmware loading protocol for one particular micro-controller and were able to insert code that ran on the only two SD cards they found that used it. (Later, they were able to do the same for a more modern micro-controller.) Especially in the light of other talks at the conference, the obvious implication is that:

code execution on the memory card enables a class of MITM (man-in-the-middle) attacks, where the card seems to be behaving one way, but in fact it does something else.

On the other hand, the SD card implements everything needed for the DAWN concept except a network interface. One can easily envisage a box with something like a Raspberry Pi interfacing between a lot of SD cards, a network, and power. As Bunnie points out in the talk while handing out cards, the real difficulty would be actually sourcing SD cards with a known micro-controller.

PS - check out bunnie and Jie Qi's Circuit Stickers!

Tennant, Roy: A Tale of Two Lives, Well Lived

planet code4lib - Tue, 2013-12-31 20:29

I can’t say that 2013 was a great year for me and those close to me. And a couple of the low points were the passing of two great colleagues whom I have long admired. Both in the last several weeks of the year.

Photo by Barry Wheeler

Steve Puglia died on December 10, 2013 after a year-long battle with pancreatic cancer. For a much more thorough and knowledgeable description of Steve’s life and contributions, I refer you to this Library of Congress post. From me you will get a personal memoir.

I first met Steve at the famous School for Scanning run for many years by the Northeast Document Conservation Center (NEDCC). I remember him as being someone who personified the term “consummate professional”. His knowledge of digital capture was encyclopedic. And yet he also knew how to explain complicated topics simply and well. At my first School for Scanning I was in awe of all that I learned about the science and technology of imaging.

For many people, such knowledge would have made them arrogant, but not Steve. He remained approachable, generous, kind, and, to repeat, a consummate professional.

In the late 1980s and early 1990s I was learning about the Internet and how it could be used in a library context. One of the first really useful tools to come along was HyTelnet, developed by Peter Scott, using software developed by Earl Fogel. Since an early use of the Internet was connecting via Telnet to distant library catalogs, HyTelnet brought together directory information for those catalogs in one easy-to-use tool. The fact that this came out of the distant prairie lands of Saskatchewan seemed particularly appropriate given the world-shrinking effect of the Internet.

After long admiring his work from afar, I finally got the chance to meet Peter and even traveled to Saskatoon at his invitation to keynote the 1998 Access Conference. He continued to make his mark on the profession with directory services and an informative Twitter feed. Peter was one of those professionals who never met a tool he couldn’t figure out how to use, and after having done so, would use it to help others. Peter sadly passed away yesterday, December 30.

Two great and good digital library pioneers have left us. And we are vastly poorer for it.

ALA Equitable Access to Electronic Content: What does a New Year bring to our privacy and surveillance?

planet code4lib - Tue, 2013-12-31 20:25

The article below is the first in a two-part series about surveillance.

Thinking about resolutions or intentions for 2014

On this last day of 2013, like most of us, I’ve been thinking about new opportunities and challenges for American Library Association (ALA) work in 2014. There are all the usual important issues the library community must face: library funding at every level of government. Those battles are never-ending and continue to be especially challenging in recent years. Ask the Washington Office about sequestration and federal funds for libraries…

Then there are copyright, broadband and other access issues for all types of libraries, which are ongoing challenges that, at a basic level, are also about funding in terms of “who pays what?” and “who controls the content and access?” More importantly, the question is continually “how can libraries best serve the public?” in various types of libraries.

Few of our library issues are in the news these days like privacy and surveillance issues, at least at the federal level. Splashed across the news are weekly, sometimes daily, reports about the breadth and depth of the National Security Agency’s (NSA) continued collection of massive amounts of individualized information. Can you even believe that our government hacked into Chancellor Angela Merkel’s phone in Germany!

The year 2014 is, for the first time since passage of the USA PATRIOT Act in October 2001, an opportunity to push for real surveillance reforms. However, do not underestimate the strength of powerful forces opposing any NSA reform. ALA and our related coalition partners have pushed back against the opponents of reform for all these years. Despite over two dozen reform bills being introduced in Congress in the last six months, reform will not be a quick or easy task.

2013 was the year…..

New surveillance bills were introduced in Congress after contractor Edward Snowden leaked government information in June 2013. Whatever one’s thoughts are about Snowden as a whistleblower/hero or a law-breaker/traitor, the revelations have shown the world what we suspected, but did not know for sure: the NSA and other entities and other governments collected personal information on Americans for many years.

It is far more massive than we ever thought. The revelations have provoked world debate, congressional hearings and even some small changes to what the government will reveal about its surveillance activities. It also means that in the last six months of 2013, there have been many more opportunities for ALA to demand reforms and seek changes to many surveillance policies including reform to the Foreign Intelligence Surveillance Act (FISA), increased transparency on government activities and the end to the massive collection of personal data, because now we all know more about what the government, especially the NSA, has been doing.

The District Dispatch has reported the many public letters and policy statements ALA has signed with various coalitions. Our allies include the American Civil Liberties Union (ACLU), OpentheGovernment.org, the Center for Democracy and Technology, the Electronic Frontier Foundation and many other diverse organizations. Lobbying and related activities continue on the issues – especially pushing for support of the USA Freedom Act, S. 1599 and H.R. 3361, one of the more likely bills to be addressed. But in 2014, ALA will have to do far more work and seek far more grassroots involvement if any reforms are to be realized.

As in the past, ALA will continue to argue that the balance is dangerously “out-of-whack” between the protection of our civil liberties and the need to protect people against terrorism and other bad acts. ALA has consistently pushed for reforms based upon the library community’s widespread and longstanding commitment to First Amendment principles and patron privacy during reauthorizations of or amendments to the PATRIOT Act, FISA or other laws, bills and policies. This work continues, and so must grassroots advocacy.

We now know what we didn’t know…

Moving forward, I’m thinking about the privacy and surveillance issues and how and why we, as library supporters, should be involved. 2013 brought us the needed information. To be continued…….

Happy New Year everyone!


Open Knowledge Foundation: Extended: Open Data Scoping Terms of Reference

planet code4lib - Tue, 2013-12-31 14:00

The Open Data Partnership for Development Scoping Terms of Reference deadline has been extended until January 13, 2014. We have received some great submissions and want to give more people the best opportunity to tackle the project. Truly, we recognize that the holiday season is a busy time.

The Open Data Partnership for Development Scoping Terms of Reference opened on December 11, 2013 and will close on January 13, 2014 at 17:00 GMT.

Updated Open Data Partnership for Development – Scoping Terms of Reference

Help us get a current state Open Data Activity snapshot to guide our decisions for the Open Data Partnership for Development programmes. Proposals for a Scoping Analysis will address two objectives:

  • (i) identify potential funders and the key delivery partners in the Open Data ecosystem, and
  • (ii) map the existing efforts to support open data in developing countries and their status.

More about Open Data Partnership for Development

Happy New Year.

Ng, Cynthia: 2013 in review

planet code4lib - Tue, 2013-12-31 11:34

This year’s annual report was not nearly as exciting as last year’s. While the busiest day’s top post was unrelated, February 13th was once again the top view date thanks to Code4Lib.

The top posts almost all being related to WordPress just goes to show what people seem to search the internet for (and perhaps what they’re not finding in the WordPress documentation).

Here’s an excerpt:

The concert hall at the Sydney Opera House holds 2,700 people. This blog was viewed about 30,000 times in 2013. If it were a concert at Sydney Opera House, it would take about 11 sold-out performances for that many people to see it.

Click here to see the complete report.

