Feed aggregator

Hydra Project: Duoc UC, Chile, becomes a Hydra Partner

planet code4lib - Tue, 2015-01-13 09:30

We are delighted to announce that Duoc UC (http://www.duoc.cl), in Santiago, Chile, has become the latest formal Hydra partner, and our first partner institution in Latin America. Duoc has been working with Hydra to build the “Heritage Digital Library” (http://loncofilu.cl), a digital repository of architectural drawings, photographs, restoration plans and historical documents related to the most precious historic buildings in Chile, and representing work produced by the students of Duoc’s Faculty of Construction. In 2015 they are planning to develop two additional repositories based on Hydra that will focus on the collection of student thesis projects and audio and visual productions from their Faculty of Communication.

In their letter of intent, Duoc says they are committed not only to building more projects with Hydra, but also to building a Hydra community in Latin America through the translation of documentation into Spanish and offering workshops to other Latin American institutions interested in building Hydra repositories.

Welcome, Duoc UC!

Ed Summers: Bowie

planet code4lib - Tue, 2015-01-13 02:18

Bowie by Simon Critchley
My rating: 5 of 5 stars

If you are a Bowie fan, you will definitely enjoy this. If you are curious why other people are so into Bowie you will enjoy this. If you’ve never read any Critchley and are interested in something quick and accessible by him you will enjoy this. I fell into the first and third categories so I guess I’m guessing about the second. But I suspect it’s true.

I finished the book feeling like I understand the why and how of my own fascination with Bowie’s work much better. I also want to revisit some of his albums like Diamond Dogs, Heathen and Outside which I didn’t quite connect with at first. I would’ve enjoyed a continued discussion of Bowie’s use of the cutup technique, but I guess that fell out of the scope of the book.

I want to read some more Critchley too — so if you have any recommendations please let me know. The sketches at the beginning of each chapter are wonderful. OR Books continues to impress.

William Denton: Clapping Music on Sonic Pi

planet code4lib - Tue, 2015-01-13 02:10

A while ago I bought a Raspberry Pi, a very small and cheap computer, and I never did much with it. Then a few days ago I installed Sonic Pi on it and I’ve been having a lot of fun. (You don’t need to run it on a Pi, you can run it on Linux, Mac OS X or Windows, but I’m running it on my Pi and displaying it on my Ubuntu laptop.)

My Raspberry Pi.

Sonic Pi is a friendly and easy-to-use GUI front end that puts Ruby on top of SuperCollider, “a programming language for real time audio synthesis and algorithmic composition.” SuperCollider is a bit daunting, but Sonic Pi makes it pretty easy to write programs that make music.

I’ve written before about “Clapping Music” by Steve Reich, who I count as one of my favourite composers: I enjoy his music enormously and listen to it every week. “Clapping Music” is written for two performers who begin by clapping out the same 12-beat rhythm eight times, then go out of phase: the first performer keeps clapping the same rhythm, but the second one claps a variation where the first beat is moved to the end of the 12 beats, so the second beat becomes the first. That phasing keeps on until it wraps around on the 13th repetition and they are back in phase.

Here’s one animated version showing how the patterns shift:

And here’s another:

Here’s the code to have your Pi perform a rather mechanical version of the piece. The clapping array defines when a clap should be made. There are 13 cycles that run through the clapping array 4 times each. The first time through, cycle is 0, and the two tom sounds are the same. The second time through, cycle is 1, so the second tom is playing one beat ahead. The third time through, cycle is 2, so the second tom is two beats ahead. The index is taken modulo 12 so it can wrap around: if the second tom is on the fifth cycle and ten beats in, there’s no 15th beat, so it needs to play the third beat.

use_bpm 300
load_sample :drum_tom_lo_soft
load_sample :drum_tom_mid_soft
clapping = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
13.times do |cycle|
  puts "Cycle: #{cycle}"
  4.times do |reps|
    12.times do |beat|
      sample :drum_tom_lo_soft, pan: -0.5 if clapping[beat] == 1
      sample :drum_tom_mid_soft, attack: 0.05, pan: 0.5 if clapping[(cycle + beat) % 12] == 1
      sleep 1
    end
  end
end

If you’re running Sonic Pi, just paste that in and it will work. It sounds like this (Ogg format):

It only does four repetitions of each cycle because my Pi is old and not very powerful and for some reason eight made it go wonky. It’s not perfect even now, but the mistakes are minimal. I think a more recent and more powerful Pi would be all right, as would running Sonic Pi on a laptop or desktop.

It’s lacking all the excitement of a performance by real humans (some of which could be faked with a bit of randomization and effects), but it’s very cool to be able to do this. Algorithmic music turned into code!
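
As a rough sketch of that faking (this is not from the original post; the jitter amounts and the reverb setting are guesses to play with), you can nudge the volume and timing of each hit with rrand and wrap the whole thing in an effect:

use_bpm 300
clapping = [1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 0]
with_fx :reverb, room: 0.5 do
  13.times do |cycle|
    4.times do
      12.times do |beat|
        # first performer: steady pattern, slightly varied volume
        sample :drum_tom_lo_soft, pan: -0.5, amp: rrand(0.8, 1.2) if clapping[beat] == 1
        # second performer: shifted pattern, its own volume wobble
        sample :drum_tom_mid_soft, attack: 0.05, pan: 0.5, amp: rrand(0.8, 1.2) if clapping[(cycle + beat) % 12] == 1
        # stretch or shrink each beat a tiny amount so it isn't metronome-perfect
        sleep 1 + rrand(-0.05, 0.05)
      end
    end
  end
end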

DuraSpace News: Get the Scoop on How Institutions Research and Select Hosted Repository Solutions

planet code4lib - Tue, 2015-01-13 00:00

Winchester, MA – Find out first-hand how institutions research and select hosted repository solutions at the January 22 Hot Topics webinar “Doing It: How Non-ARL Institutions are Managing Digital Collections”.

DuraSpace News: CALL DSpace Interest Group Proposals for OR2015

planet code4lib - Tue, 2015-01-13 00:00

From Maureen Walsh, Institutional Repository Services Librarian, The Ohio State University Libraries

Conference Theme: LOOKING BACK, MOVING FORWARD: OPEN REPOSITORIES AT THE CROSSROADS

Conference Dates: June 8-11, 2015

Conference Location: Indianapolis, Indiana

Conference Website: http://www.or2015.net/

Important dates

DuraSpace News: CALL Fedora Interest Group Proposals for OR2015

planet code4lib - Tue, 2015-01-13 00:00

From David Wilcox, Fedora Product Manager, DuraSpace; Co-chair, OR2015 Fedora Interest Group

Evergreen ILS: 2015 conference registration open

planet code4lib - Mon, 2015-01-12 20:50

Registration now is open for the 2015 Evergreen International Conference, to be held on May 13-16 in Hood River, Oregon, USA.
https://www.eventbrite.com/e/evergreen-2015-international-conference-tickets-15029293020

The conference venue is the Best Western Plus Hood River Inn. Booking details are available on the venue page of the conference website. The website will be updated as information becomes available.

Stay tuned for information about submitting proposals, sponsoring the conference, and exhibiting.

Questions? Contact conference chair Buzzy Nielsen, buzzy@hoodriverlibrary.org, 541-387-7062.

Nick Ruest: Preliminary look at 3,893,553 #JeSuisCharlie tweets

planet code4lib - Mon, 2015-01-12 19:45

Background

Last Friday (January 9, 2015) I started capturing #JeSuisAhmed, #JeSuisCharlie, #JeSuisJuif, and #CharlieHebdo with Ed Summers' twarc. I have about 12 million tweets at the time of writing this, and plan on writing up something a little bit more in-depth in the coming weeks. But for now, some preliminary analysis of #JeSuisCharlie, and if you haven't seen these two posts ("A Ferguson Twitter Archive", "On Forgetting and hydration") by Ed Summers, please do check them out.

How fast were the tweets coming in? Just to try and get a sense of this, I did a quick recording of tailing the twarc log for the #JeSuisCharlie capture.

Hydration

If you checked out both of Ed's posts, you'll have noticed that the Twitter ToS forbid the distribution of tweets, but we can distribute the tweet ids, and based on those we can "rehydrate" the data set locally. The tweet ids for each hashtag will be/are available here. I'll update and release the tweet ids files as I can.

We're looking at just around 12 million tweets (un-deduped) at the time of writing, so the hydration process will take some time. I'd highly suggest using GNU Screen or tmux.

Hydrate

  • #JeSuisCharlie: % twarc.py --hydrate %23JeSuisCharlie-ids-20150112.txt > %23JeSuisCharlie-tweets-20150112.json
  • #JeSuisAhmed: % twarc.py --hydrate %23JeSuisAhmed-ids-20150112.txt > %23JeSuisAhmed-tweets-20150112.json
  • #JeSuisJuif: % twarc.py --hydrate %23JeSuisJuif-ids-20150112.txt > %23JeSuisJuif-tweets-20150112.json
  • #CharlieHebdo: % twarc.py --hydrate %23CharlieHebdo-ids-20150112.txt > %23CharlieHebdo-tweets-20150112.json
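
The later commands work on "-deduped" files, so duplicate tweet ids need to be dropped from the hydrated output first. As a minimal sketch of that step (this is not one of the twarc utilities, and the filenames are only examples):

require 'json'
require 'set'

seen = Set.new
File.open('%23JeSuisCharlie-tweets-deduped.json', 'w') do |out|
  File.foreach('%23JeSuisCharlie-tweets-20150112.json') do |line|
    tweet = JSON.parse(line)
    # keep the first occurrence of each tweet id, skip the rest
    out.write(line) if seen.add?(tweet['id_str'])
  end
end
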
Map

#JeSuisCharlie tweets with geo coordinates.

In this data set, we have 51,942 tweets with geo coordinates available. This represents about 1.33% of the entire data set (3,893,553 tweets).

How do you make this?

  • Create the geojson: % ~/git/twarc/utils/geojson.py %23JeSuisCharlie-cat-20150115-tweets-deduped.json > %23JeSuisCharlie-cat-20150115-tweets-deduped.geojson

  • Give the geojson a variable name.

  • Use Leaflet.js to put all the tweets with geo coordinates on a map like this.
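
Step two above, giving the geojson a variable name, just means wrapping the geojson.py output in a JavaScript assignment so the map page can pull it in with a plain script tag and hand it to Leaflet. A minimal sketch, assuming the filenames from the previous step (the tweets variable name and tweets.js output file are arbitrary choices):

require 'json'

# read the geojson produced by geojson.py and write it out as a JavaScript variable
geojson = JSON.parse(File.read('%23JeSuisCharlie-cat-20150115-tweets-deduped.geojson'))
File.write('tweets.js', "var tweets = #{JSON.generate(geojson)};")

The map page can then include tweets.js and add the layer with something like L.geoJson(tweets).addTo(map).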

Top URLs

Top 10 URLs tweeted from #JeSuisCharlie.

  1. (11220) http://www.newyorker.com/culture/culture-desk/cover-story-2015-01-19?mbid=social_twitter
  2. (2278) http://www.europe1.fr/direct-video
  3. (1615) https://www.youtube.com/watch?v=4KBdnOrTdMI&feature=youtu.be
  4. (1347) https://www.youtube.com/watch?v=-bjbUg9d64g&feature=youtu.be
  5. (1333) http://www.amazon.com/Charlie-Hebdo/dp/B00007LMFU/
  6. (977) http://www.clubic.com/internet/actualite-748637-opcharliehebdo-anonymous-vengeance.html
  7. (934) http://www.maryam-rajavi.com/en/index.php?option=com_content&view=article&id=1735&catid=159&Itemid=506
  8. (810) http://www.lequipe.fr/eStore/Offres/Achat/271918
  9. (771) http://srogers.cartodb.com/viz/123be814-96bb-11e4-aec1-0e9d821ea90d/embed_map
  10. (605) https://www.youtube.com/watch?v=et4fYWKjP_o

The full list of URLs can be found here.

How do you get the list?

  • % cat %23JeSuisCharlie-cat-20150115-tweets-deduped.json | ~/git/twarc/utils/unshorten.py > %23JeSuisCharlie-cat-20150115-tweets-deduped-unshortened.json
  • % cat %23JeSuisCharlie-cat-20150115-tweets-deduped-unshortened.json | ~/git/twarc/utils/urls.py| sort | uniq -c | sort -n > %23JeSuisCharlie-cat-20150115-urls.txt
Twitter Clients

Top 10 Twitter clients used from #JeSuisCharlie.

  1. (1283521) Twitter for iPhone
  2. (951925) Twitter Web Client
  3. (847308) Twitter for Android
  4. (231713) Twitter for iPad
  5. (86209) TweetDeck
  6. (82616) Twitter for Windows Phone
  7. (70286) Twitter for Android Tablets
  8. (44189) Twitter for Websites
  9. (39174) Instagram
  10. (21424) Mobile Web (M5)

The full list of clients can be found here.

How do you get this?

  • % ~/git/twarc/utils/source.py %23JeSuisCharlie-cat-20150115-tweets-deduped.json > %23JeSuisCharlie-cat-20150115-tweets-deduped-source.html
Word cloud

Word cloud from #JeSuisCharlie tweets.

I couldn't get the word cloud to embed nicely, so you'll have to check it out here.

How do you create the word cloud?

  • % git/twarc/utils/wordcloud.py %23JeSuisCharlie-cat-20150115-tweets.json > %23JeSuisCharlie-wordcloud.html
tags: twarc, #JeSuisCharlie, #JeSuisAhmed, #JeSuisJuif, #CharlieHebdo

FOSS4Lib Recent Releases: Avalon Media System - 3.2

planet code4lib - Mon, 2015-01-12 16:40
Package: Avalon Media System
Release Date: Friday, December 19, 2014

Last updated January 12, 2015. Created by Peter Murray on January 12, 2015.

Indiana University and Northwestern University are delighted to announce Avalon Media System 3.2, completed and released on December 19, 2014. As part of a series of minor 3.x releases, Avalon 3.2 provides support for important content management efficiencies and other improvements.

Release 3.2 adds the following capabilities:

District Dispatch: Afterschool funding available through states

planet code4lib - Mon, 2015-01-12 16:09

Photo by the San Jose Library

As discussed in previous District Dispatch entries, in late December Congress passed its massive $1.01 trillion CROmnibus bill providing FY15 funding for much of the Federal government. With the return of the new Congress on January 6, discussion of the FY16 budget begins anew, and ALA will be fighting for library funding.

For FY15 programs of interest to the library community, the CROmnibus package provided level funding for most programs while a small number of programs received slight increases or decreases. It is safe to say that the appropriations package presents no major new library or educational initiatives.

One example of a library program receiving a slight increase is the 21st Century Community Learning Centers, which received an increase of $2.3 million (0.2% of its budget). As with many Federal education programs, funding for 21STCCLC is awarded directly to state educational agencies that control how the grants are apportioned. Libraries have opportunities to apply for many of the grants.

The way this program works is that funds are sent to states, which then make competitive grants to “local educational agencies (LEAs), community-based organizations, faith-based organizations, or other public or private entities that can demonstrate experience, or the promise of success, in providing educational and related activities. In making awards, States give priority to applications that: (1) propose to target services to students who attend schools identified as in need of improvement under Title I; and (2) are submitted jointly by at least one LEA that receives funds under Part A of Title I and at least one community-based organization or other public or private entity. States must make awards of at least $50,000 per year for a period of 3 to 5 years.”

Background on 21STCCLC can be viewed here. The Department of Education Guidance answers everything libraries need to know about the program with a helpful table of contents.

A good resource for libraries participating in these grant programs is the Afterschool Alliance, which provides information and knowledge on all things related to 21STCCLC. The Afterschool Alliance is the main national organization advocating for after school programs.

State Education Agency offices are also a good resource for these grants, since they are awarded at the state level. A list of contacts for 21STCCLC in each state is available here and State Educational Agencies here.

The post Afterschool funding available through states appeared first on District Dispatch.

Open Knowledge Foundation: Open Data Handbook 2015 comeback – and you want to be a part of it!

planet code4lib - Mon, 2015-01-12 14:12

There is a famous saying that outside of a dog, a book is man's best friend. We at Open Knowledge tend to agree. This is why we decided to take one of Open Knowledge's key resources, the Open Data Handbook, and give it a bit of a facelift in the coming year.

The Open Data Handbook has been an important resource for the open knowledge community for years. The handbook introduces and discusses legal, social and technical aspects of open data. It has been used by a wide range of stakeholders, from open data beginners to data wizards, from government officials to journalists and civil society activists. It examines the following questions, which are relevant to all: what is “open”, why open up data, and how to ‘open’ data?

Since it was first written, the handbook has been read by thousands of users each month and has been translated into 18 languages (making it the most widely translated open data resource out there). However, open data is both a fast-moving and a relatively young field. Since the last version, open data initiatives have been launched, open data policies have been approved, and we, as a community, have learned a lot about the opportunities and the pitfalls of open data. The last version of the book is from 2011, and at the time government open data portals were few and far between and the Open Government Partnership had only just launched. The book represents what we knew and thought then, but as the open data movement has expanded both in numbers and in geographical spread, we have decided that it is high time we incorporate what we have learned into a new version. This version of the Open Data Handbook will focus mainly on one type of open data, open government data, but a number of the sections can be applied to other types of open data. This project is supported by the Partnership for Open Data – a collaboration between Open Knowledge, the Open Data Institute and the World Bank.

So much of this knowledge, these stories and the brilliant ideas about what works and what doesn't is in this community. Therefore, we believe that the process of creating the updated version of the handbook should be, as it's always been, a community project. This process can not only strengthen the community through a joint project, but also help us learn from peers, listen to members who usually do not participate in daily channels, and create a handbook rich in content, experience and a wide spectrum of knowledge.

There are a number of ways you can get involved! You can submit your stories or comment on the "alpha" version we are planning to launch in February. The handbook will be part of a larger community-owned resource platform.

How can you help?

  • Contribute a short open data story – We are looking for stories about open government data in various fields. It can be a success story or even a failure that you think we should all learn about. If you want to contribute a story, please fill in this form and we will get back in touch with you.

  • Revise the first draft of the book – The current chapters in the Open Data Handbook are being reviewed by Open Knowledge staff – we are updating existing chapters and producing new ones. Our goal is to release an ‘alpha’ version of the book the week before Open Data Day, so it can be revised, commented on and added to by the community.

  • Propose a resource – We are putting together a list of open data resources – If you know of other resources about open data, in any language, please give us a shout. At the end of each section, we will have a “further reading” section and we’d love to share as many resources as possible.

  • Send us a short video about open data – In the internet world, a handbook doesn’t have to be text only. Send us a video of you / your organization and answer the following questions:

    Tell us an example of open data having a social and/or economic impact in your city/country/region.
    What is your main obstacle in dealing with Open Data?
    How do you / your community engage with open data?
    What do you think is the next big thing for Open Data in 2015?

The videos will be embedded in the handbook and on our YouTube channel!

Who can write for the book? Everyone! While we are editing the book, we want your input. Unfortunately, we can't promise that every story / idea will ultimately be part of the book. If you think that we are missing something, please let us know! We will try to include as much as possible!

If you have any comments or suggestions, please email us at handbook [at] okfn [dot] org

2015 is going to be great for open data, let’s write about it together.

Hydra Project: Announcing Avalon 3.2

planet code4lib - Mon, 2015-01-12 13:33

Indiana University and Northwestern University are delighted to announce Avalon Media System 3.2, completed and released on December 19, 2014. As part of a series of minor 3.x releases, Avalon 3.2 provides support for important content management efficiencies and other improvements.

Release 3.2 adds the following capabilities:

  • Bulk item management actions, including publish, un-publish, change collection, delete, and assign access
  • Avalon dropbox subdirectories are accessible to collection managers using the web interface
  • Upgrade to Hydra 7 framework
  • Numerous interface improvements and bug fixes

For more details on each of these new features, visit the What’s New in Avalon 3.2 wiki page: https://wiki.dlib.indiana.edu/display/VarVideo/What%27s+New+in+Avalon+3.2

LibUX: “Social” the Right Way is a Timesuck

planet code4lib - Mon, 2015-01-12 12:16

Pew Research Center’s 2014 Social Media Update, published Friday, validates pretty much any argument libraries have to make for actively reaching out through social media. Your audience is there.

58% of ALL [U.S.] adults are on Facebook. 31% of ALL seniors are on Facebook. #libweb #libux http://t.co/laYJyW1ffg

— Library UX Data (@libuxdata) January 9, 2015

This is a numbers game. Whether libraries should be there is sort of head-scratchingly moot, but brand decisions about which social bandwagon to jump on1 should be made only when libraries are prepared to commit real resources to their upkeep. When I say “resources,” I mostly mean time – but marketing dollars are not misspent on Facebook ads.

Crafting good content is not an insubstantial timesuck. Knowing your audience, time spent analyzing metrics, helps mitigate people’s capacity to detect bullshit. And this is important. Poor content not only reflects poorly on your library, but for channels like Facebook that highlight popular or relevant content, posts that bomb negatively impact the overall visibility of your brand.

A basic level of engagement requires just the right amount of content, too. Part of this just has to do with currency, right? Old content tends to roll off. It’s too many thumb swipes down. Wisemetrics finds that, on average, the half-life of a Facebook post is about 90 minutes. Hell, a tweet is lost to the void in just 18 minutes. The point is that you have to post regularly to stay on people’s radar – and, for Facebook especially, if you’re off-radar long enough the algorithm [allegedly] guarantees subsequent posts will reach fewer people.

I think, here, it is also important to mention that users expect brands to actively monitor their channels. By having an account, you wade into the pool. It ain’t a billboard over the highway. You and your audience are on the same level. You’re sharing fluids. You’re friends. If they contact you and you don’t answer, that’s not just the passive neglect of strangers on the sidewalk: it’s a dis; it hits home. On Twitter, specifically, 42% expect a response within the hour. Outright ignoring someone is like a punch in the gut.

How Much to Post

We have to be a little honest about the realities of library social media. We’re on board, sure, but we most likely haven’t the benefit of a marketing team. The social accounts are managed on the side and probably aren’t part of anyone’s actual job description. Roger. So, where do we get the most bang for our buck?

Post to Twitter at least 5 times a day. If you can swing up to 20 posts, you might be even better off. Post to Facebook five to 10 times per week. (Fast Company)

Several studies show that posting too little risks losing “connection with your audience,” and that brands should shoot for ten (10) posts per week. Posting more often is annoying.

For Twitter, it depends how you measure success. If you’re looking at retweets, replies, and clicks per tweet, “if you want to wring the most value out of every tweet you send, tweet about five times each day.” If you measure overall response per day, well, 30 tweets a day ought to do it.

This Fast Company article shares optimal post frequency for other platforms, if you’re interested.

2 Hours Per Channel

The timesuck to do social right is determined by the time required to

  • inspect your account’s metrics to understand who your followers are, when they’re on, and what they like
  • analyze your history of content so you know what works, what doesn’t
  • craft enough good content specific to your audience and the medium2
  • schedule that content for optimum reach
  • monitor and respond

Really, this is unique to you. For me, according to my Harvest account, last week I spent two hours scheduling just six tweets (for the entire week! I suck) and ten Facebook posts. This is a little short of previous weeks where I posted more and spent about 4 hours. I include time spent looking at analytics, corresponding about what needs to be posted, and optimizing content I’m sharing in our WordPress network (we use this plugin, which lets you tweak OpenGraph metadata and Twitter Cards).

So, my gut-checked suggestion is that it’s reasonable to expect to spend at least two hours per channel – minimum. Real content marketers for brands-with-budgets certainly devote a lot more, but I think it’s important to recognize the timesuck for what it is and reconcile decisions to go whole hog on a new channel with the human resources required to maintain it.

  1. If your library is on any other social platform except Facebook – wtf are you doing?
  2. People can tell what is auto-tweeted and cross-posted.

The post “Social” the Right Way is a Timesuck appeared first on LibUX.

District Dispatch: Speakers to explore library funding options at 2015 ALA Midwinter Meeting

planet code4lib - Mon, 2015-01-12 06:14

Thinking about new funding sources for your library? Join leaders from the Institute of Museum and Library Services (IMLS) when they discuss federal library funding resources at the 2015 American Library Association (ALA) Midwinter Meeting in Chicago. The session, titled “All Eyes on IMLS: Funding Priorities and Reauthorization,” takes place from 10:30 to 11:30 a.m. on Saturday, January 31, 2015, in the McCormick Convention Center, room W183A.

MLK Digital Commons in Washington, D.C. Photo by Phil Freelon

During the session, speakers will shed new light on the Library Services and Technology Act (LSTA), the primary source of annual funding for libraries in the federal budget. Library staff are encouraged to attend the conference session to learn more about the Institute of Museum and Library Services' priorities for the next two years, which will shape the agency's discretionary and Library Services and Technology Act Grants to States programs. Additionally, participants will learn more about how they can support the Museum and Library Services Act while the law undergoes reauthorization in 2016.

Speakers include Maura Marx, acting director of the Institute of Museum and Library Services, and Robin Dale, associate deputy director for state programs for the Institute of Museum and Library Services.

View other ALA Washington Office Midwinter Meeting conference sessions

The post Speakers to explore library funding options at 2015 ALA Midwinter Meeting appeared first on District Dispatch.

Alf Eaton, Alf: Searching for mergeable tables

planet code4lib - Mon, 2015-01-12 01:10

Among CartoDB’s many useful features is the ability to merge tables together, via an interface which lets you choose which column from each to use as the shared key, and which columns to import to the final merged table.

Google's Fusion Tables similarly encourages merging of tabular data. Fusion Tables creates a virtual merged table, allowing updates to the source tables to be replicated to the final merged table as they occur.

CartoDB can also merge tables using location columns, counting items from one table (with latitude and longitude, or addresses) that are positioned within the areas defined in another table (with polygons).

I've found that UK parliamentary constituencies are useful for visualising data, as they have a similar population in each constituency and they have at least two identifiers in published ontologies which can be used to merge in data from other sources*. The UK parliamentary constituency shapefiles published by the Ordnance Survey as part of the Boundary-Line dataset contain polygons, names and two identifiers for each area: one is the Ordnance Survey’s own “unit id” and one is the Office for National Statistics’ “GSS code”.

Once the parliamentary constituency shapefile has been imported to a base table, any CSV table that contains either of those identifiers can easily be merged with the base table to create a new, merged table and associated visualisation.

So, the task is to find other data sets that contain either the OS “unit id” or the ONS “GSS code”.

The URLs for the data types of these codes are defined in the Ordnance Survey’s “administrative geography and civil voting area” ontology:

The values themselves can also be expressed as URLs:

GSS: E14000929
GSS URL: http://statistics.data.gov.uk/doc/statistical-geography/E14000929
Unit ID: 24896
Unit ID URL: http://data.ordnancesurvey.co.uk/id/7000000000024896

However, unlike the Linked Data/SPARQL interfaces, most CSV or Excel files that are currently published (such as those produced by the Office for National Statistics as a result of census analysis) don’t define the data type of each column using URLs. Although there’s usually a property name in the first row, there’s rarely a datapackage.json file defining a basic data type (number, string, date, etc), and practically never a JSON-LD context file to map those names to URLs.

Given an index of CSV files, like those in CKAN-based stores such as data.gov.uk, how can we identify those which contain either unit IDs or GSS codes?

As Thomas Levine's commasearch project demonstrated at csvconf last year, if you have a list of all (or even just some) of the known members of a collection of typed entities (e.g. a list of all the countries in the world), it’s easy enough to find other datasets that contain them: as long as at least a certain proportion of the distinct values of a column match those in the known collection, the data type can be guessed, and can be assigned a URL.
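
As a rough sketch of that matching idea (this is not commasearch itself; the filenames, the list of known codes and the 0.8 threshold are all assumptions), you could flag CSV columns whose distinct values mostly appear in a known list of GSS codes:

require 'csv'
require 'set'

# one known GSS code per line, e.g. extracted from the Boundary-Line data
known_gss = Set.new(File.readlines('gss-codes.txt', chomp: true))

table = CSV.read('candidate.csv', headers: true)
table.headers.each do |column|
  values = table[column].compact.uniq
  next if values.empty?
  share = values.count { |v| known_gss.include?(v) } / values.size.to_f
  # if most of the distinct values are known GSS codes, guess the column's type
  puts "#{column}: probably GSS codes (#{(share * 100).round}% match)" if share >= 0.8
end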

TODO: compile lists of values for known data types, particularly from Wikidata. For example: country names (a list of names that changes slowly), members of parliament (a list of names that changes regularly), years (a range of numbers that grows gradually), gene identifiers (a list of strings that grows over time), postcodes (a list of known values, or values matching a regular expression).

Related tools

Footnotes

* On the downside, parliamentary constituencies can be changed (currently every 5 years), as population density shifts around the country and the current government decides to increase or decrease the number of constituencies. This makes it difficult to use the constituencies for long-term comparisons.

Mark E. Phillips: Digital Preservation System Interfaces: UNT Libraries Coda Repository

planet code4lib - Mon, 2015-01-12 01:04

I mentioned to a colleague that I would be happy to do a short writeup of some of the interfaces that we have for our digital preservation system.  This post is trying to move forward that conversation a bit.

System 1, System 2

At UNT we manage our digital objects in a consistent and unified way. What this means in practice is that there is one way to do everything: items are digitized, collected, or created, staged for ingest into the repository, and everything moves into the system in the same way. We have two software stacks that we use for managing our digital items, Aubrey and Coda.

Aubrey is our front-end interface, which provides end-user access to resources: search, browsing, and display. For managers it provides a framework for defining collections and partners, and most importantly it has a framework for creating and managing metadata for the digital objects. Most (99.9%) of the daily interaction with the UNT Libraries Digital Collections is through Aubrey, via one of its front-end user interfaces: The Portal to Texas History, the UNT Digital Library, or The Gateway to Oklahoma History.

Aubrey manages the presentation versions of a digital object; locally we refer to this package of files as an Access Content Package, or ACP. The other system in this pair is a system we call Coda. Coda is responsible for managing the Archival Information Packages (AIPs) in our infrastructure. Coda was designed to manage a collection of BagIt Bags, help with the replication of these Bags, and allow curators and managers to access the master digital objects if needed.

What does it look like though?

The conversation I had with a colleague was about user interfaces to the preservation archive: how much or how little we provide, and our general thinking about that system's user interfaces. Typically these interfaces are "back-end" and are never seen by a larger audience because of layers of authentication and restriction. I wanted to take a few screenshots and talk about some of the interactions that users have with these systems.

Main Views

The primary views for the system include a dashboard view which gives you an overview of the happenings within the Coda Repository.

UNT Libraries’ Coda Dashboard

From this page you can navigate to lists for the various sub-areas within the repository.  If you want to view a list of all of the Bags in the system you are able to get there by clicking on the Bags tile.

Bag List View – UNT Libraries’ Coda Repository

The storage nodes that are currently registered with the system are available via the Nodes button.  This view is especially helpful in gauging the available storage resources and deciding which storage node to write new objects to.  Typically we use one storage node until it is completely filled and then move onto another storage node.

Nodes List View – UNT Libraries’ Coda Repository

For events in the Coda system, including ingest, replication, migration, and fixity checks, we create and store a PREMIS Event. These are aggregated using the PREMIS Event Service.

PREMIS Event List View – UNT Libraries’ Coda Repository

The primary Coda instance is considered the Coda instance of record and additional Coda instances will poll the primary for new items to replicate.  They do this using ResourceSync to broadcast available resources and their constituent files.  Because the primary Coda system does not have queued items this list is empty.

Replication Queue List View – UNT Libraries’ Coda Repository

To manage information about what piece of software is responsible for an event on an object we have a simple interface to list PREMIS Agents that are known to the system.

PREMIS Agents List View – UNT Libraries’ Coda Repository

Secondary Views

With the primary views out of the way, the next level we have screens for is the detail views. There are detail views for most of the previous screens once you've clicked on a link.

Below is the detail view of a Bag in the Coda system. You will see the parsed bag-info.txt fields as well as the PREMIS Events that are associated with this resource. The buttons at the top will get you to a list of URLs that, when downloaded, will re-constitute a given Bag of content, and to the Atom feed for the object.

Bag Detail View – UNT Libraries’ Coda Repository

Here is a URLs list; if you download all of these files and keep the hierarchy of the folders, you can validate the Bag and have a validated version of the item plus additional metadata. This is effectively the Dissemination Information Package for the system.

Coda URLs List – UNT Libraries’ Coda Repository

An Atom feed is created for each document as well, which can be used by the AtomPub interface for the system. Or just to look at and bask in the glory of angle brackets.

Atom Feed for Bag – UNT Libraries’ Coda Repository

Below is the detail view of a PREMIS Event in the repository.  You can view the Atom Feed for this document or navigate to the Bag in the system that is associated with this event.

PREMIS Event Detail View – UNT Libraries’ Coda Repository

The detail of a storage node in the system.  These nodes are updated to reflect the current storage statistics for the storage nodes in the system.

Node Detail View – UNT Libraries’ Coda Repository

The detail view of a PREMIS Agent is not too exciting but is included for completeness.

Agent Detail View – UNT Libraries’ Coda Repository

Interacting with Coda

When there is a request for the master/archival/preservation files for a given resource, we find the local identifier for the resource, put that into the Coda repository, and do a quick search.

Dashboard with Search – UNT Libraries’ Coda Repository

You will end up with search results for one or more Bags in the repository.  If there is more than one for that identifier select the one you want (based on the date, size, or number of files) and go grab the files.

Search Result – UNT Libraries’ Coda Repository

Statistics

The following screens show some of the statistics views for the system.  They include the Bags added per month and over time,  number of files added per month and over time, and finally the number of bytes added per month and over time.

Stats: Monthly Bags Added – UNT Libraries’ Coda Repository

Stats: Running Bags Added Total – UNT Libraries’ Coda Repository

Stats: Monthly Files Added – UNT Libraries’ Coda Repository

Stats: Running Total of Files Added – UNT Libraries’ Coda Repository

Stats: Monthly Size Added – UNT Libraries’ Coda Repository

Stats: Running Total Sizes – UNT Libraries’ Coda Repository

What's missing

There are a few things missing from this system that one might notice.  First of all is the process of authentication to the system.  At this time the system is restricted to a small list of IPs in the library that have access to the system.  We are toying around with how we want to handle this access as we begin to have more and more users of the system and direct IP based authentication becomes a bit unwieldy.

Secondly, there is a full set of AtomPub interfaces for each of the Bag, Node, PREMIS Event, PREMIS Agent, and Queue sections. This is how new items are added to the system. But that is a little bit out of scope for this post.

If you have any specific questions for me let me know on twitter.

DuraSpace News: CALL for Proposals for Open Apereo 2015

planet code4lib - Mon, 2015-01-12 00:00
From Ian Dolphin, Executive Director, Apereo Foundation; Laura McCord, Open Apereo 2015 Planning Committee Chair; and Reba-Anna Lee, Open Apereo 2015 Program Committee Co-chair

Access Conference: Details on AccessYYZ

planet code4lib - Sun, 2015-01-11 20:00

Access is headed to Toronto on September 8th-11th, 2015, so mark those calendars! We know that it’s a bit earlier than usual this year, but we hope that giving advance notice will allow attendees to plan accordingly.

Hackfest will be happening on September 8th at Ryerson University’s Heaslip House, while the remainder of the conference (September 9th-11th) will unfold at the beautiful Bram & Bluma Appel Salon on the second floor of the Reference Library (789 Yonge St.) in downtown Toronto.

Keep your eyes on the website in the coming weeks–we’ll announce more details as we have them!

Karen Coyle: This is what sexism looks like #2

planet code4lib - Sun, 2015-01-11 13:38
Libraries, it seems, are in crisis, and many people are searching for answers. Someone I know posted a blog post pointing to community systems like Stack Overflow and Reddit as examples of how libraries could create "community." He especially pointed out the value of "gamification" - the ranking of responses by the community - as something libraries should consider. His approach was that it is "human nature" to want to gain points. "We are made this way: give us a contest and we all want to win." (The rest of the post and the comments went beyond this to the questions of what libraries should be today, etc.)

There were many comments (about 4 dozen, almost all from men) on his blog (which I am not linking to, because I don't want this to be a "call out"). He emailed me asking for my opinion.

I responded only to his point about gamification, which was all I had time for, saying that in that area his post ignored an important gender issue. The competitive aspect was part of what makes those sites unfriendly to women.

I told him that there have been many studies of how children play, and they reveal some distinct differences between genders. Boys begin play by determining a set of rules that they will follow, and during play they may stop to enforce or discuss the rules. Girls begin to play with an unstructured understanding of the game, and, if problems arise during play, they work on a consensus. Boys' games usually have points and winners. Girls' games are often without winners and are "taking turns" games. Turning libraries into a "winning" game could result in something like Reddit, where few women go, or if they do, they are reluctant to participate.

And I said: "As a woman, I avoid the win/lose situations because, based on general social status (and definitely in the online society) I am already designated a loser. My position is pre-determined by my sex, so the game is not appealing."

I didn't post this to the site, just emailed it to the owner. It's good that I did not. The response from the blog owner was:
This is very interesting. But I need to see some proof.

Some proof. This is truly amazing. Search on Google Scholar for "games children gender differences" and you are overwhelmed with studies.

But it's even more amazing because none of the men who posted their ideas to the site were asked for proof. Their ideas are taken at face value. Of course, they didn't bring up issues of gender, class, or race in their responses, as if these are outside of the discussion of what libraries should be. And to bring them up is an "inconvenience" in the conversation, because the others do not want to hear it.

He also pointed me to a site that is "friendly to women." To that I replied that women decide what is "friendly to women."

I was invited to comment on the blog post, but it is now clear that my comments will not be welcome. In fact, I'd probably only get inundated with posts like "prove it." This does seem to be the response whenever a woman or minority points out an inconvenient truth.

Welcome to my world.

District Dispatch: Big shoes to fill

planet code4lib - Sun, 2015-01-11 05:32

Those who worked with Linda know hers are big shoes to fill

E-rate Orders aside, the library community is starting the New Year with one less champion. Linda Lord, now former Maine State Librarian, is officially retired and has turned the keys over to her successor, Jaimie Ritter.

No one who knows Linda is at all reticent in talking about her dedication to her home state libraries—nor are those of us who work with her as a national spokesperson for libraries. Her work for ALA’s Office for Information Technology Policy (OITP) could be an encyclopedic list covering at least of decade of advocacy. In her most recent role as Chair of the E-rate Task Force, Linda has been invaluable to advancing library interests at the Federal Communications Commission (FCC), in Congress, and with her colleagues. At the height of the recent E-rate activity at the FCC, we joked with Linda that she should have special frequent flier miles for all the flights from Bangor (ME) to Washington D.C. That, and the fact that Linda’s email was first to pop up under the “Ls” and her phone number was always under “recents” on my phone list are testament to our reliance on her experience, her dogged support, and her willingness to work well beyond her role as a member-leader (a volunteer).

No one who knows Linda is at all reticent in talking about her dedication to her home state's libraries—nor are those of us who work with her as a national spokesperson for libraries. Her work for ALA's Office for Information Technology Policy (OITP) could fill an encyclopedic list covering at least a decade of advocacy. In her most recent role as Chair of the E-rate Task Force, Linda has been invaluable to advancing library interests at the Federal Communications Commission (FCC), in Congress, and with her colleagues. At the height of the recent E-rate activity at the FCC, we joked with Linda that she should have special frequent flier miles for all the flights from Bangor (ME) to Washington D.C. That, and the fact that Linda's email was first to pop up under the "Ls" and her phone number was always under "recents" on my phone, are testament to our reliance on her experience, her dogged support, and her willingness to work well beyond her role as a member-leader (a volunteer).

Of course Linda's work is well respected in her home state, as is evidenced by a number of articles and even a television interview as her retirement approached. These stories make it clear Linda builds strong, collaborative relationships with her colleagues, whether staff at the state library, librarians across Maine, or people as far away as the Senate in Washington, D.C.

“Linda has done an amazing job making information accessible through libraries and schools across Maine,” said Senator Angus King. “She has the essential leadership qualities of vision, perseverance, willingness to work on the details, and a personality that enables her to collaborate and bring out the best in people. Her leadership at the national level on the E-rate program and other issues has been a huge benefit to Maine. She will always have my profound respect and appreciation for all that she’s accomplished for Maine and for the country.”

I can testify firsthand to the difference Linda's work has made for Maine libraries from my (wonderful) summer trips to Maine. In recent years we have noticed a marked improvement in library WiFi. While my kids love to hike when we travel in rural Maine, they are now also dedicated texters and need to know the next time we will be near a library so they can update friends between dry periods of no connectivity. While passing through a town I will point out the universal library sign and one child will ask, "Is that one of Linda's libraries? Can we stop?" (knowing that there will be plenty of WiFi to go around).

We are proud to be able to share our own remembrances of Linda’s long tenure working with ALA. While I have long considered Linda “my ALA member,” many others have similar sentiments when asked to share anecdotes about working with Linda. I have included a few here.

Emily Sheketoff, executive director of ALA’s Washington Office reminds us all of Linda’s strong leadership qualities that have won her a respected place on the national stage:

“Linda has always been a strong voice for libraries, so OITP recognized and took great advantage of that. Coming from Maine, she had a soft spot for rural libraries and she became our “go-to” person when we needed an example of the difference a well-connected library can make for small towns or rural communities. When ALA staff use a Maine library as an exemplar the response is something along the lines of “Oh we know Linda Lord” and the point is immediately legitimized. She will be missed as a voice for libraries on the national stage.

As Chair of the ALA E-rate Task Force, Linda has spent countless hours on the phone, on email, in person making sure issues get covered—often asking the hard questions of how a policy course could impact the daily life of the librarian who has to implement or live with a policy. This ability has been invaluable as a gentle (and sometimes like a hurricane) reminder that what we do in D.C. has a very real impact locally. She is quite a leader.”

Linda Schatz, an E-rate consultant who worked with Linda and ALA for many years, describes Linda’s dedication to garnering support for the E-rate program:

“As I think about the many ways in which Linda has impacted the E-rate program, perhaps the most long-lasting has been her diligence in working with Members [of Congress] and their staff. Not only did she take the time to meet with and inform Senators Snowe and Collins about the impact of the E-rate program on Maine libraries, she continued to point out the benefits to all libraries and helped with last minute negotiations through the night to prevent legislation that would have had a negative impact on the program. She didn’t stop her communications when Senator Snowe left the Senate but took the time to meet with Senator King and his staff as well to ensure that they, too, understood the importance of the program to libraries. These communications about the E-rate program as well as the general needs of libraries will long be felt by the library community.”

Linda has the respect she does across ALA staff and members who have had the privilege of seeing her in action in large part because of her warm and sincere manner. “Not many people can bring the same passion for network technology as for early childhood learning, but Linda did. Not only was she an incredibly effective advocate, but I have admired and enjoyed her generous and collaborative spirit for years,” said Larra Clark, deputy director for ALA OITP. Linda easily wins over her audience.

Kathi Peiffer, current Chair of the E-rate Task Force, and Pat Ball, member of the joint ALA Committee on Legislation and OITP Telecommunications Subcommittee, both highlight these qualities in their recollections of Linda. "She is always gracious and has a wonderful sense of humor. She is the Queen of E-rate!" (Kathi) "She is always smiling and always gracious, and I am glad that I had the opportunity to meet and work with her. I salute a great librarian and lady." (Pat)

Alan S. Inouye, director of OITP, puts it well when he says, “Saying “thank you” to Linda Lord is just so inadequate. Her contributions to national policy on E-rate are extensive and range from testifying at the U.S. Senate Commerce Committee and participating on FCC expert panels to chairing innumerable E-rate Task Force meetings (at their notorious Sunday 8:00 am times!). As Maine State Librarian, she has greatly advanced library services and visibility in her state in many ways. I hope that the library community, ALA, and OITP can find a way to continue to avail ourselves of Linda’s expertise and experience—retirement notwithstanding!”

So Alan leaves me with a little hope that I can continue to dream up ways we can call on Linda. As we often tell members who get involved with OITP, it’s very difficult to cut the ties once you join us.

And Linda was worried she might lose touch with library issues. I doubt it.

The post Big shoes to fill appeared first on District Dispatch.
