Feed aggregator

Mark E. Phillips: Metadata Edit Events: Part 2 – Who

planet code4lib - Sat, 2015-03-28 15:53

In the previous post I started to explore the metadata edit events dataset generated from 94,222 edit events from 2014 for the UNT Libraries’ Digital Collections.  I focused on some of the information about when these edits were performed.

This post focuses on the “who” of the dataset.

Altogether we had 193 unique users edit metadata for one of the systems that comprise the UNT Libraries’ Digital Collections.  This includes The Portal to Texas History, UNT Digital Library, and the Gateway to Oklahoma History.

The top ten most frequent editors of metadata in the system are responsible for 57% of the overall edits.

Username      Edit Events
htarver       15,451
aseitsinger   10,105
twarner       4,655
mjohnston     4,143
atraxinger    3,905
cwilliams     3,490
sfisher       3,466
thuang        3,327
mphillips     2,669
sdillard      2,518
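The 57% figure can be checked from the table above. This is a minimal sketch using the ten counts listed and the 94,222 total; the variable names are my own and are not taken from the original analysis.

```python
# Edit counts for the ten most frequent editors, copied from the table above.
top_ten = {
    "htarver": 15451,
    "aseitsinger": 10105,
    "twarner": 4655,
    "mjohnston": 4143,
    "atraxinger": 3905,
    "cwilliams": 3490,
    "sfisher": 3466,
    "thuang": 3327,
    "mphillips": 2669,
    "sdillard": 2518,
}
TOTAL_EDITS = 94222  # total edit events in the 2014 dataset

top_ten_edits = sum(top_ten.values())
share = top_ten_edits / TOTAL_EDITS

# prints "53,729 of 94,222 edits (57%)"
print(f"{top_ten_edits:,} of {TOTAL_EDITS:,} edits ({share:.0%})")
```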

The overall distribution of edits per user looks like this.

Distribution of edits per user for the Edit Event Dataset

As you can see it shows the primary users of the system and then very quickly tapers down to the “long tail” of users who have a lower number of edit events.

Here is a quick look at the number of unique users active on each day of the week across the entire dataset.

Sun  Mon  Tue  Wed  Thu  Fri  Sat
40   95   122  122  123  97   39

There is a swell for Tue, Wed, and Thu in the table above.  The counts fall into three consistent bands: 39-40 unique users on weekend days, 95-97 on Monday and Friday, and 122-123 on Tuesday through Thursday.
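One way counts like these could be produced is to collect the set of usernames seen on each weekday and take the size of each set. The sketch below assumes each edit event is a (username, timestamp) pair; the sample events, usernames, and variable names are hypothetical, since the post does not show the dataset's actual layout.

```python
from collections import defaultdict
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

# Hypothetical sample events: (username, ISO 8601 timestamp).
events = [
    ("user_a", "2014-03-03T09:15:00"),  # a Monday
    ("user_a", "2014-03-10T14:02:00"),  # another Monday, same user
    ("user_b", "2014-03-03T11:30:00"),  # a Monday
    ("user_c", "2014-03-08T20:45:00"),  # a Saturday
]

# A set per weekday means each user is counted once per weekday,
# no matter how many edits they made on that day of the week.
users_by_day = defaultdict(set)
for username, timestamp in events:
    weekday = datetime.fromisoformat(timestamp).weekday()  # Monday == 0
    users_by_day[DAY_NAMES[weekday]].add(username)

unique_counts = {day: len(users) for day, users in users_by_day.items()}
print(unique_counts)
```

Using sets rather than counters is what makes this a count of unique users instead of a count of edit events.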

In looking at how unique users were spread across the year, grouped into months,  we got the following table and then graph.

Month      Unique Users
January    54
February   73
March      64
April      61
May        44
June       40
July       48
August     50
September  50
October    84
November   49
December   36

Unique Editors Per Month

There were some spikes throughout the year, most likely related to a metadata class in the UNT College of Information that uses the Edit system as part of its teaching.  These are the October and February spikes in the number of unique users.  Other than that, we consistently have 40 or more unique users per month, with a small dip in December for the holiday season when school is not in session.

In the previous post we had a heatmap with the number of edit events distributed over the hours of the day and the days of the week.  I’ve included that graph below.

94,222 edit events plotted to the time and day they were performed

I was curious to see how the unique number of editors mapped to this same type of graph,  so that is included below.

Unique editors distribution across day of the week and hour of the day.

User Status

Of the 193 unique metadata editors in the dataset, 135 (70%) of the users were classified as Non-UNT-Employee and  58 (30%) were classified as UNT-Employee. For the edit events themselves, 75,968 (81%) were completed by users classified with a status of UNT-Employee  and 18,254 (19%) by users classified with the status of Non-UNT-Employee.

User Rank

Rank       Edit Events  Percentage of Total Edits (n=94,222)  Unique Users  Percentage of Total Users (n=193)
Librarian  22,466       24%                                   16            8%
Staff      12,837       14%                                   13            7%
Student    41,800       44%                                   92            48%
Unknown    17,119       18%                                   72            37%

You can see that 44% of all of the edits in the dataset were completed by users who were students. Librarians and Staff members together accounted for about 38% of the edits (24% plus 14% in the table; 37.5% before rounding).
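The percentages in the rank table can be recomputed from the raw counts. This is a small sketch with the counts copied from the table; the names are my own. Note that adding the rounded table percentages (24% + 14%) gives 38, while the unrounded counts give about 37.5%.

```python
# (edit events, unique users) per rank, copied from the table above.
ranks = {
    "Librarian": (22466, 16),
    "Staff": (12837, 13),
    "Student": (41800, 92),
    "Unknown": (17119, 72),
}

total_edits = sum(edits for edits, _ in ranks.values())  # 94,222
total_users = sum(users for _, users in ranks.values())  # 193

for rank, (edits, users) in ranks.items():
    edit_share = edits / total_edits
    user_share = users / total_users
    print(f"{rank:<10}{edit_share:>5.0%} of edits{user_share:>5.0%} of users")

# Combined Librarian + Staff share, computed before any rounding.
lib_staff = (ranks["Librarian"][0] + ranks["Staff"][0]) / total_edits
print(f"Librarian + Staff: {lib_staff:.1%}")
```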

This is the second in a series of posts related to metadata edit events in the UNT Libraries’ Digital Collections.  Check back for the next installment.

As always feel free to contact me via Twitter if you have questions or comments.

Ed Summers: The Adventure of Experiment

planet code4lib - Sat, 2015-03-28 11:50

Love of certainty is a demand for guarantees in advance of action. Ignoring the fact that truth can be bought only by the adventure of experiment, dogmatism turns truth into an insurance company. Fixed ends upon one side and fixed “principles” — that is authoritative rules — on the other, are props for a feeling of safety, the refuge of the timid, and the means by which the bold prey upon the timid.

John Dewey in Human Nature and Conduct (p. 237)

Nicole Engard: Bookmarks for March 27, 2015

planet code4lib - Fri, 2015-03-27 20:30

Today I found the following resources and bookmarked them on Delicious.

Digest powered by RSS Digest

The post Bookmarks for March 27, 2015 appeared first on What I Learned Today....

Related posts:

  1. Herding Cattle
  2. Google Floor Plans
  3. Planning to Travel?

DPLA: DPLAfest in Light of SEA 101

planet code4lib - Fri, 2015-03-27 19:18

In my social media feeds yesterday, I saw some friends and acquaintances say that they were reconsidering their attendance at DPLAfest, scheduled to be held in Indianapolis, IN, April 17-18, in light of the recent signing of SEA 101, or the “Religious Freedom Restoration Act,” into law by Governor Pence of Indiana.  I must admit that as an openly gay employee at DPLA, I had an immediate and strong negative reaction.  I was unhappy about my organization spending money in a place that would allow businesses not to serve me simply because I am gay.

However, after more thought and a night of sleep, I have come to a different conclusion.  The passing of this law should make us all want to attend DPLAfest even more than we might have before.  We should want to support our hosts and the businesses in Indianapolis who are standing up against this law, and we should make it clear that our money will only be spent in places that welcome all.

At DPLA, we have already begun to diligently ensure that all the venues we are supporting welcome all of the DPLA staff and community.  Messages like these have already helped put our mind at ease about a number of our scheduled activities:

Stickers like the one below are going to help us know which businesses to support while we are in Indianapolis:

At DPLAfest, we will also have visible ways to show that we are against this kind of discrimination, including enshrining our values in our Code of Conduct.  We encourage you to use this as an opportunity to let your voice and your dollars speak.  Let’s use this as a time to support those businesses and venues that support true freedom, all while enjoying each other’s company and a great conference lineup!

Best,

Emily Gore

DPLA Director for Content

HangingTogether: Round of 16: The plot thickens … and so do the books

planet code4lib - Fri, 2015-03-27 17:29

OCLC Research Collective Collections Tournament

#oclctourney

Our second round of competition is complete, and only eight conferences remain standing! And yes, our tournament Cinderella, Big South, is still with us! Details below, but here are the Round of 16 results:

Competition in this round was on book length – which conference has the thickest books?* Big South, continuing its magical tournament run, ended up with the thickest books of all the conferences, averaging about 292 pages and ousting the powerful Big Ten from the tournament! West Coast also continues on to the next round, with a convincing victory over the Ivy Leaguers! Summit League, Ohio Valley, Atlantic 10, Missouri Valley, and Big Sky will also move on to the Round of 8. Conference USA and American Athletic had the tightest battle, with Conference USA coming out on top by less than 10 pages!

While Big South had the thickest books of all the conferences competing in this round (averaging about 292 pages), the Ivy League had the thinnest books, averaging about 225 pages. Does this surprise you? It turns out that the larger the size of the collective collection, the thinner the books. Take a look at this:

Big South had the smallest collective collection among the conferences competing in this round; the Ivy League had the largest. As the chart shows, there is a pretty strong correlation between collection size and the percentage of the collection accounted for by books with fewer than 100 pages. Got any ideas why? Put them in the comments!

By the way, in case you were wondering, the average length of a print book in WorldCat is about 255 pages.

Bracket competition participants: Remember, if the conference you chose has been ousted from the tournament, do not despair! If no one picked the tournament Champion, all entrants will be part of a random drawing for the big prize!

The Round of 8 is next, where the tournament field will be reduced to just four conferences! Results will be posted March 31.

 

*Average number of pages per print book in conference collective collection. Data is current as of January 2015.

More information:

Introducing the 2015 OCLC Research Collective Collections Tournament! Madness!

OCLC Research Collective Collections Tournament: Round of 32 Bracket Revealed!

Round of 32: Blow-outs, buzzer-beaters, and upsets!

About Brian Lavoie

Brian Lavoie is a Research Scientist in OCLC Research. Brian's research interests include collective collections, the system-wide organization of library resources, and digital preservation.

Sean Chen: Waving a Dead Fish

planet code4lib - Fri, 2015-03-27 16:05

I’ve been using Vagrant & Virtualbox for development on my OS X machines for my solo projects. But in an effort to get an intern started up on developing a front-end to a project I started a while ago I ran into a really strange problem getting Vagrant working on Windows.

So as a tale of caution for whatever robot wants to pick up this bleg.

Bootcamp partition on a Mid-2010 MacBook Pro. Running a dormant OS X and a full Windows 7. The Windows 7 is the main environment:

Use the git bash shell since it has SSH to stand up the boxes with vagrant init, vagrant up.

And then stuck (similar to Vagrant stuck connection timeout retrying):

==> default: Clearing any previously set network interfaces...
==> default: Preparing network interfaces based on configuration...
    default: Adapter 1: nat
    default: Adapter 2: hostonly
==> default: Forwarding ports...
    default: 22 => 2222 (adapter 1)
==> default: Booting VM...
==> default: Waiting for machine to boot. This may take a few minutes...
    default: SSH address: 127.0.0.1:2222
    default: SSH username: vagrant
    default: SSH auth method: private key
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...
    default: Error: Connection timeout. Retrying...

Well, we booted into the VM with a head and it looked like the boot got interrupted by some sort of kernel panic due to:

Spurious ACK on isa0060/serio0. Some program might be trying to access hardware directly.

Ok makes sense…the machine isn’t booting up and there has to be a reason why.

Long story short. The Windows 7 partition didn’t have virtualization enabled, and there is no BIOS setting or switch somewhere to do it. So what do you do:

How to enable hardware virtualization on a MacBook?

Like waving a dead fish in front of your computer.

  • Boot into OSX.
  • System Preferences > Select the Startup Disk preference pane
  • Select the Boot Camp partition with Windows
  • Restart into the Boot Camp partition
  • Magic

Go figure


FOSS4Lib Recent Releases: Goobi - 1.11.0

planet code4lib - Fri, 2015-03-27 14:10

Last updated March 27, 2015. Created by Peter Murray on March 27, 2015.

Package: Goobi
Release Date: Wednesday, March 25, 2015

LITA: Making LibGuides Into Library Websites

planet code4lib - Fri, 2015-03-27 12:00

Welcome to Part 2 of my two-part series introducing LibGuides CMS for use as a website. Read Part 1 (with comments from Springshare!). This companion piece was released February 27.

Why LibGuides?

LibGuides logo (© Springshare)

We can design surprisingly good websites with LibGuides 2.0 CMS. WordPress and Drupal are free and open source, but Springshare, the maker of LibGuides, also delivers reliable hosting and support for two grand a year. Moreover, even folks clueless about coding can quickly learn to maintain a LibGuides-based website because (1) the interface is drag-and-drop, fill-in-the-box intuitive, and (2) many academic librarians create research guides as part of their liaison duties and are already familiar with the system. Most importantly, libraries can customize LibGuides-based websites as extensively or minimally as available talent and time permit, without sacrificing visual appeal or usability–or control of the library’s own site.

LibGuides-Based Websites

There are some great LibGuides-based websites out there. Springshare has compiled exemplars across various library sectors here and here. Below are screenshots showing what you can do.

Albuquerque and Bernalillo County (ABC) Library homepage

The Albuquerque and Bernalillo County (ABC) Library is that rare public library that uses LibGuides. The homepage is beautifully laid out, with tons of neat customizations and a carousel that actually enhances UX, despite the load time. One of my favorite LibGuides sites!

World Maritime University Library homepage

The World Maritime University Library, run by the United Nations, has a beautifully minimalist blue-and-white look – classic Scandinavian. Like Google, the logo and search box are front and center; everything else is placed discreetly in tabs at the top and bottom of the homepage.

John S. Bailey Library, American College of Greece

The American College of Greece’s John S. Bailey Library is text-heavy, but its navigation is as clear as the Aegean Sea. Note the absence of a federated search box, which, unless the algorithms are of search-engine caliber, tends to produce results that undergraduates find bewildering.

Even if you have other priorities or skills, you can still create a quality LibGuides-based website without major customizations to the stylesheets. Hillsborough Community College Library and Harrison College both do nice jobs, albeit with LibGuides 1.0. Walters State Community College did hardly any deep customizing of LibGuides 2.0, but its site is perfectly functional.

Walters State Community College Library homepage

My Library’s Website

Moving the Hodges University Library to LibGuides has followed a three-stage agile process.

1. September 2014. We upgraded the existing LibGuides CMS to LibGuides 2.0 and reorganized and enhanced existing content. Review my February 27 post for more on this first stage.

Hodges University Library’s faculty support page

2. January 2015. We rolled out the new library homepage and associated pages, which unified the library’s entire web presence under LibGuides. Previously our homepage was designed and run by the university’s IT department using Microsoft SharePoint (ugh), so students could only access the homepage by signing into the university intranet–dreadful for accessibility. We also shuffled DNS records and redirects so that the homepage has a much cleaner URL (library.hodges.edu) than previously (https://myhugo.hodges.edu/organizations/org-libr/Pages/Home.aspx). The new site can be accessed by anyone from anywhere without logging into anything. #librarianwin

3. June 2015. We will roll out the next major iteration of our website, integrating OCLC’s new and improved WorldCat discovery layer, our new LibAnswers virtual reference service, and our revamped website to build better UX. The page header and federated search box will be optimized for mobile devices, as the rest of the site already is. Our motto? Continual improvement!

Have you used LibGuides as a website? What is your experience?

Nicole Engard: Bookmarks for March 26, 2015

planet code4lib - Thu, 2015-03-26 20:30

Today I found the following resources and bookmarked them on Delicious.

  • Booktype Lets you produce beautiful, engaging books in minutes. Booktype is free and open source software that helps you write and publish print and digital books.

The post Bookmarks for March 26, 2015 appeared first on What I Learned Today....

Related posts:

  1. CA Law to Produce Open Source Textbooks
  2. Espresso Book Machine
  3. E-book reading on the rise

FOSS4Lib Upcoming Events: Northeast Fedora User Group Meeting

planet code4lib - Thu, 2015-03-26 19:59
Date: Monday, May 11, 2015 - 08:00 to Tuesday, May 12, 2015 - 17:00
Supports: Fedora Repository

Last updated March 26, 2015. Created by Peter Murray on March 26, 2015.

From the announcement:

A Northeast Fedora User Group meeting will be held at Yale University on May 11-12. Monday May 11 will be an unconference style format with a lightning round in the afternoon. Tuesday May 12 will focus on Fedora 4 training led by Andrew Woods.

Please register for this event by April 3 here: https://docs.google.com/forms/d/1b4ntNkhRuJvtNEfi0vXSjk7C9uuR3bwa2A8e3U-6w08/viewform

DPLA: Girl Scout super stars

planet code4lib - Thu, 2015-03-26 19:22

Unless you haven’t been out of your house for the past month, you know that it’s Girl Scout cookie season. The girls out tugging boxes of cookies around the neighborhood are learning all sorts of skills they’ll use later in life as political leaders, entertainers, astronauts, and athletes. Literally. For proof, check out this list of 25 of the most famous Girl Scouts while enjoying the last of your Thin Mints and Caramel Delights…until next year.

Madeleine Albright, former US Secretary of State

Marian Anderson, singer

Lucille Ball, comedian and film studio executive

Lynda Carter, actress and star of “Wonder Woman.”

Rosalynn Carter, former First Lady

Chelsea (and Hillary) Clinton

Katie Couric, journalist

Sandra Day O’Connor, former Supreme Court Justice

Queen Elizabeth II

Carrie Fisher, actress

Dorothy Hamill, figure skater

Jackie Joyner-Kersee, Olympic athlete

Dorothy Lamour, actress and singer

Shari Lewis, puppeteer and children’s entertainer

Christa McAuliffe, teacher aboard the Space Shuttle Challenger

Michelle Obama, First Lady

Nancy Reagan, former First Lady

Sally Ride, astronaut

Chita Rivera, actress, dancer, and singer

Gloria Steinem, political activist

Martha Stewart, businesswoman

Shirley Temple, actress

Mary Tyler Moore, actress

Dionne Warwick, singer

Venus Williams, tennis player

 

Banner image from Digital Commonwealth, Boston Public Library.

FOSS4Lib Recent Releases: Jpylyzer - 1.14.1

planet code4lib - Thu, 2015-03-26 15:50

Last updated March 26, 2015. Created by Peter Murray on March 26, 2015.

Package: Jpylyzer
Release Date: Wednesday, March 25, 2015

FOSS4Lib Recent Releases: Siegfried - 1.0

planet code4lib - Thu, 2015-03-26 15:45

Last updated March 26, 2015. Created by Peter Murray on March 26, 2015.

Package: Siegfried
Release Date: Sunday, March 22, 2015

FOSS4Lib Updated Packages: Siegfried

planet code4lib - Thu, 2015-03-26 15:44

Last updated March 26, 2015. Created by Peter Murray on March 26, 2015.

Siegfried is a PRONOM-based file format identification tool.

Key features are:

  • complete implementation of PRONOM (byte and container signatures)
  • reliable results (siegfried is tested against Ross Spencer’s skeleton suite and QA tested against DROID and FIDO output using http://github.com/richardlehane/comparator)
  • fast matching without limiting the number of bytes scanned
  • detailed information about the basis for format matches
  • simple command line interface with a choice of outputs (YAML, JSON, CSV)
  • a built-in server for integrating with workflows and language inter-op
  • power options including debug mode, signature modification, and multiple identifiers.
Package Type: Data Preservation and Management
Operating System: Linux, Mac, Windows
Programming Language: Go
Open Hub Link: https://openhub.net/p/siegfried-pronom

CrossRef: CrossRef Extends Management Team, Appoints Ginny Hendricks To Focus on Member and Community Outreach

planet code4lib - Thu, 2015-03-26 15:39

26 March 2015, Lynnfield, MA - CrossRef, the global not-for-profit digital hub for scholarly communications, is pleased to announce the addition of Ginny Hendricks to its management team in the newly-created role of Director of Member and Community Outreach, where she will be responsible for marketing, business development, member services, and product support. The appointment reflects CrossRef's mission to innovate for the future of scholarly content, and to foster collaboration among an increasingly diverse community of publishers, researchers, authors, libraries, funders, and beyond.

Executive Director, Ed Pentz, says "I'm very pleased Ginny is joining the CrossRef team; her international experience, background in scholarly publishing, and digital marketing expertise, make her the perfect person to spearhead the CrossRef brand, lead outreach around the world, and contribute to our ongoing success."

Ginny Hendricks says: "CrossRef is indispensible to the reliable running and progression of scholarly communications and its scope is broadening to accommodate changing publisher needs and serve the wider communities. I'm excited to work with some great people and to be able to contribute to such a central part of scholarly publishing."

Ginny has run Ardent Marketing for nine years where she consulted with publishers to develop multichannel marketing plans, brand and launch online products, and build engaged communities. Prior to consulting she managed the launch of Scopus at Elsevier, where she established advisory boards and outreach programs with library and scientific communities. In 1998 Ginny started an early e-resources help desk for Blackwell's Information Services and later led training and communication programs for Swets' digital portfolio in Asia Pacific, Middle East, and Africa. She's lived and worked in many parts of the world and has managed globally dispersed creative, technical, and commercial teams. She co-hosts the Scholarly Social networking events in London, and is considering finishing her Master's of Science in Digital Marketing Communications. Ginny will start on Monday 30th March and can be reached via twitter @GinnyLDN or email ginny@crossref.org.

About CrossRef
CrossRef (www.crossref.org) serves as a digital hub for the scholarly communications community. A global not-for-profit membership organization of scholarly publishers, CrossRef's innovations shape the future of scholarly communications by fostering collaboration among multiple stakeholders. CrossRef provides a wide spectrum of services for identifying, locating, linking to, and assessing the reliability and provenance of scholarly content.

Contact: Ed Pentz at info@crossref.org.

View this news release on the CrossRef website.

Library of Congress: The Signal: Checking in with NGAC and the National Spatial Data Infrastructure

planet code4lib - Thu, 2015-03-26 14:36

Satellite data, January 1, 2014. Photo courtesy of NCDC/NOAA.

Several times a year I attend meetings of the National Geospatial Advisory Committee, a federal advisory committee that reports to the chair of the Federal Geographic Data Committee. The NGAC pulls together participants from across academia, the private sector and all levels of government to advise the Federal government on geospatial policy and ways to advance the vision of a National Spatial Data Infrastructure. They held two days of meetings in DC on March 17 and 18, 2015 and I was happy to have the opportunity to attend.

We originally got involved with the group when two members of the GeoMAPP project team (Zsolt Nagy and Dennis Goreham) were named founding NGAC members (PDF) and we’ve kept up with it because of the wealth of information that comes out of the meetings about national geospatial policy initiatives.

The group’s membership changes over time, but in the past has included Jack Dangermond, the founder of Esri, and currently includes both Michael Jones of Google (one of the inventors of Google Earth) and Steve Coast, the founder of OpenStreetMap.

Julie Sweetkind-Singer, the Assistant Director of Geospatial, Cartographic and Scientific Data & Services at Stanford University libraries and a former principal investigator on the NDIIPP National Geospatial Digital Archive project, is now the Vice Chair of the group.

As usual, the committee covered a number of topic areas that have ramifications for the library, archive and museum digital stewardship communities.

FGDC Report/GAO Report

A chief area of discussion in the FGDC’s report to the attendees was the March 16 release of the Government Accountability Office report “Geospatial Data: Progress Needed on Identifying Expenditures, Building and Utilizing a Data Infrastructure, and Reducing Duplicative Efforts.” This is the second GAO report in the past 3 years on geospatial information, with the first, “Geospatial Information: OMB and Agencies Need to Make Coordination a Priority to Reduce Duplication,” having been released on November 26, 2012.

GAO’s objectives with the report were to

(1) describe the geospatial data that selected federal agencies and states use and how much is spent on geospatial data; (2) assess progress in establishing the National Spatial Data Infrastructure; and (3) determine whether selected federal agencies and states invest in duplicative geospatial data.

The report urged Congressional input towards a national addressing database, while also recommending that the Office of Management and Budget and associated federal agencies fully implement national spatial data infrastructure activities.

Crowd-Sourced Geospatial Data

Next came an interesting presentation on the concepts of crowd-sourced data, citizen science and volunteered geographic information, as well as crowd-sourced data initiatives happening inside the Federal government. It featured Sophia Liu, a Mendenhall Postdoc Fellow at the U.S. Geological Survey; Denice Ross, a Presidential Innovation Fellow at the Department of Energy; and Sean Gorman from Timbr.io.

Key questions that crossed each of the presentations included the challenges with integrating crowd-sourced data with agency-originated data while validating its integrity, as well as potential legal consequences when agencies rely on crowd-sourced data for action. One suggested way to address the validity question is to incorporate a “human-in-the-loop” to vet, edit or “massage” crowd-sourced data to ensure its accuracy and usability. See http://radar.oreilly.com/2015/02/human-in-the-loop-machine-learning.html for further info.

There was also a bit of discussion on the difference between “ambient” crowd-sourced data (think traffic data compiled from the location reports of cell phones) and volunteered geographic information such as that found in citizen-mapping initiatives such as OpenStreetMap.

Geospatial Privacy Subcommittee Report

The Geospatial Privacy Subcommittee of the NGAC is largely exploring the privacy challenges presented by Unmanned Aircraft Systems and as such is somewhat out of our purview. An important recent document on this front is “Presidential Memorandum: Promoting Economic Competitiveness While Safeguarding Privacy, Civil Rights, and Civil Liberties in Domestic Use of Unmanned Aircraft Systems” released on Feb. 15, 2015.

COGO Report card

Geospatial: application by user dleithinger on Flickr

COGO is the Coalition of Geospatial Organizations, a grouping of private sector geospatial organizations such as the American Society of Civil Engineers (ASCE), American Society for Photogrammetry and Remote Sensing (ASPRS), Association of American Geographers (AAG), National States Geographic Information Council (NSGIC) and a number of others.

On February 16, 2015 they published their first “Report Card on the U.S. National Spatial Data Infrastructure” (PDF). The report was written by an expert panel led by former Wyoming governor James E. Geringer (who presented the findings at the meeting). The focus of the initial report card is on the status of the seven FGDC “framework” data layers and how they are being maintained and accentuated to meet the needs of a national spatial data infrastructure. As the report says, “by evaluating the Federal government’s efforts to lead and coordinate the creation and maintenance of these data, this report reflects on how well the NSDI is meeting its goals.” According to COGO the student is not doing too well.

There was ample discussion on whether COGO was measuring the right thing (is it a measure of what’s actually getting done in a somewhat hostile budgetary environment, or are agencies being measured against an abstract standard of what should be done based on the original goals of the NSDI?) and whether this report could do more harm than good for acquiring future resources across the federal geospatial community.

During the discussion on the report it was noted that the 2016 President’s budget includes an increase of nearly $150 million for the USGS, including “an increase of $11 million for the USGS to support the community resilience toolkit, which is a web-based clearinghouse of data, tools, shared applications, and best practices for resource managers, decision-makers, and the public,” so at least there’s recognition that work does need to get done.

Geospatial Data Act of 2015

Finally, not on the meeting agenda but hanging over all the discussions was the “Geospatial Data Act of 2015,” introduced by Senators Hatch and Warner on March 16, 2015, the day prior to the start of the meeting. The text of the legislation is at https://www.congress.gov/bill/114th-congress/senate-bill/740/text, and my initial reading (note: I am not a lawyer!) is that it codifies in law things that agencies are already attempting to implement in current practice. Several important items in the proposed bill:

  • Each covered agency shall include geospatial data as a capital asset for purposes of preparing the budget submission of the President.
  • Each covered agency shall disclose each contract, cooperative agreement, grant or other transaction that deals with geospatial data on USAspending.gov.
  • Greater OMB oversight, and a limitation on receiving future funds for data that does not conform to FGDC standards.

The next NGAC meeting is June 9-10, 2015. As always, they are open to the public.

Thom Hickey: Moving to Wikidata

planet code4lib - Thu, 2015-03-26 13:43

VIAF has long interchanged data with Wikipedia, and the resulting links between library authorities and Wikipedia are widely used.  Unfortunately we only harvested data from the English Wikipedia (en.wikipedia.org), so we missed names, identifiers and other information in non-English Wikipedia pages.

Fortunately the problem VIAF had with Wikipedia was similar to the problems that Wikipedia itself had in sharing data across language versions.  Wikidata is Wikimedia's solution to the problem, and over the last year or two has grown from promising to useful.  In fact, from VIAF's point of view Wikidata now looks substantially better than just working with the English pages.  In addition to picking up many more titles for names, we are finding a million names that do not occur in the English pages, and the number of names that match those in other VIAF sources has nearly doubled, to 800 thousand from 440 thousand.

Since we (i.e. Jenny Toves) were reexamining the process, we took the opportunity to harvest corporate/organization names as well, something we have wanted for some time, so some 300K of the increase comes from those.

We expect to have the new data in VIAF in mid to late April 2015 and it is visible now in our test system at http://test.viaf.org.

The advantages we see:

  • Much less bias towards English
  • More entities (people and organizations)
  • More coded information about the entities
  • More non-Latin forms of names
  • More links into Wikipedia

This will cause some changes in the data that are visible in the VIAF interface.  One of these is that VIAF will link to the Wikidata pages rather than the English Wikipedia pages, and we are changing the WKP icon to reflect that.  This means that Jane Austen's WKP identifier (WKP is VIAF's abbreviation for Wikipedia) will change from WKP|Jane_Austen to WKP|Q36322.  Links to the WKP source page will change from

http://en.wikipedia.org/wiki/Jane_Austen

to

http://www.wikidata.org/entity/Q36322
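
As a small illustration of the identifier change, here is a sketch that maps a WKP identifier to its source page URL. The helper name `wkp_source_url` is hypothetical and not part of VIAF. It assumes that new-style identifiers are a `Q` followed by digits, and it treats anything else as an old-style English Wikipedia page title.

```python
import re

# Hypothetical helper for illustration only; not part of any VIAF API.
# Assumption: Wikidata item identifiers look like "Q" followed by digits.
WIKIDATA_ID = re.compile(r"^Q\d+$")


def wkp_source_url(identifier: str) -> str:
    """Return the source page URL for a VIAF 'WKP|...' identifier."""
    source, _, local_id = identifier.partition("|")
    if source != "WKP" or not local_id:
        raise ValueError(f"not a WKP identifier: {identifier!r}")
    if WIKIDATA_ID.match(local_id):
        # New style: link to the Wikidata entity page.
        return f"http://www.wikidata.org/entity/{local_id}"
    # Old style: the local part is an English Wikipedia page title.
    return f"http://en.wikipedia.org/wiki/{local_id}"


print(wkp_source_url("WKP|Jane_Austen"))
print(wkp_source_url("WKP|Q36322"))
```

A page title that itself happens to look like `Q123` would be misread as a Wikidata identifier, so this is only a rough heuristic.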

Although it is possible to jump from the Wikidata pages to Wikipedia pages in specific languages, we feel these links are important enough that we will be importing all the language-specific Wikipedia page links we find in Wikidata.  These will show up as 'external links' in the interface in the 'About' section of the display.

A commonly used bulk file from VIAF is the 'links' file that shows all the links made between VIAF identifiers and source file identifiers (pointers to the bulk files can be found here).  The links file includes external links, so the individual Wikipedia pages will show up in the file along with the Wikidata WKP IDs.  Here are some of the current links in the file for Lorcan Dempsey:

http://viaf.org/viaf/36978042   BAV|ADV11117013

http://viaf.org/viaf/36978042   BNF|12276780

. . .

http://viaf.org/viaf/36978042   SUDOC|031580661

http://viaf.org/viaf/36978042   WKP|Lorcan_Dempsey

http://viaf.org/viaf/36978042   XA|2219

 

The new file will change to:

http://viaf.org/viaf/36978042   BAV|ADV11117013

http://viaf.org/viaf/36978042   BNF|12276780

. . .

http://viaf.org/viaf/36978042   WKP|Q6678817

http://viaf.org/viaf/36978042   WKP|http://en.wikipedia.org/wiki/Lorcan_Dempsey

http://viaf.org/viaf/36978042   XA|2219

 

Lorcan only has one Wikipedia page, the English language one.  Jane Austen has more than a hundred, and all those links will be there.
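
For anyone who processes the links file, the sketch below shows one way the new WKP entries could be read. It assumes each line is a VIAF URI, some whitespace, and then a `SOURCE|value` pair, as in the examples above; the function names `parse_links` and `split_wkp` are made up for this illustration and are not part of any VIAF tooling.

```python
from collections import defaultdict


def parse_links(lines):
    """Group link values by VIAF URI and then by source code."""
    links = defaultdict(lambda: defaultdict(list))
    for line in lines:
        line = line.strip()
        if not line.startswith("http://viaf.org/viaf/"):
            continue  # skip blank lines and elisions such as '. . .'
        parts = line.split(None, 1)  # URI, then the rest of the line
        if len(parts) != 2:
            continue  # no link target on this line
        viaf_uri, target = parts
        source, _, value = target.partition("|")
        links[viaf_uri][source].append(value)
    return links


def split_wkp(values):
    """Separate Wikidata IDs from language-specific Wikipedia page links."""
    wikidata_ids = [v for v in values if not v.startswith("http")]
    wikipedia_pages = [v for v in values if v.startswith("http")]
    return wikidata_ids, wikipedia_pages


sample = [
    "http://viaf.org/viaf/36978042   BNF|12276780",
    ". . .",
    "http://viaf.org/viaf/36978042   WKP|Q6678817",
    "http://viaf.org/viaf/36978042   WKP|http://en.wikipedia.org/wiki/Lorcan_Dempsey",
    "http://viaf.org/viaf/36978042   XA|2219",
]

links = parse_links(sample)
wikidata_ids, wikipedia_pages = split_wkp(links["http://viaf.org/viaf/36978042"]["WKP"])
print(wikidata_ids)
print(wikipedia_pages)
```

Code that previously expected a single WKP value per VIAF identifier would need to handle a list, since a well-covered name can carry one Wikidata ID plus many Wikipedia page links.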

Of course, this also means some changes to the RDF view of the data.  We're still working on that and will post more information when we get it closer to its final form.

--Th

Galen Charlton: Unsolved problems

planet code4lib - Thu, 2015-03-26 11:50

I saw a lot of pain yesterday. I will see more pain today.

Pain from women saying that it’s back to the whisper network for them. Pain from women acknowledging the many faults of whisper networks.

Pain from women who do not want to be chilled — and who yet find themselves in the far north, with the wolves circling.

Pain from women who have seen their colleagues fail them before, and before, and before — and who have less hope now that the future of libraries will be any better.

Pain from women who fear that licenses were issued yesterday — licenses to maintain the status quo, licenses to grind away the hopes and dreams of those women in libraries who want to change the world (or who simply want to catalog books in peace and go home at the end of the day).

Above all, pain from women whose words are now constrained by the full force of the law — and who are now the target of every passerby who has much time and little empathy.

I will speak plainly: Lisa Rabey and nina de jesus did a brave thing, a thing that could never have redounded to their personal advantage no matter the outcome of the lawsuit. I respect them, and I wish them whatever peace they can find after this.

I will speak bluntly to men in the library profession: regardless of what you think of the case that ended yesterday — regardless of what you think of Joe Murphy’s actions or of the actions of Team Harpy — sexual harassment in our profession is real; the pain our colleagues experience due to it is real.

It remains an unsolved problem.

It remains our unsolved problem.

We must do our part to fix it.

Not sure how? Neither am I. But at least as librarians and library workers, we have access to plenty of tools to learn, to listen.

Time to roll up our sleeves.

DuraSpace News: UPDATE: Toward a Strategic Vision and Technical Roadmap for DSpace

planet code4lib - Thu, 2015-03-26 00:00

From Jonathan Markow, DuraSpace CSO

Winchester, MA  The DSpace Steering Group, working with the DSpace Community Advisory Team (DCAT) and DuraSpace, has organized an initiative to present a strategic vision and technical roadmap for DSpace, representing significant community contributions during the past two years. 

FOSS4Lib Upcoming Events: Fedora 4 Training Workshop at TCDL

planet code4lib - Wed, 2015-03-25 20:48
Date: Tuesday, April 28, 2015 - 13:30 to 17:30
Supports: Fedora Repository

Last updated March 25, 2015. Created by Peter Murray on March 25, 2015.

Fedora team members Andrew Woods and David Wilcox will be presenting a Fedora 4 Training Workshop at the 2015 Texas Conference on Digital Libraries (TCDL), to be held April 27-28 in Austin, Texas. The Fedora 4 Training Workshop will be held on April 28 from 1:30 PM to 5:30 PM and is free for conference attendees.
