
Feed aggregator

CrossRef: CrossRef Staff at the FORCE2015 Conference

planet code4lib - Fri, 2015-01-09 21:31

Ed Pentz, Karl Ward, Geoffrey Bilder and Joe Wass will be attending the FORCE2015 Conference in Oxford, UK. They'll be available to answer any CrossRef-related questions. The conference runs 12-13 January. Learn more.

Jonathan Rochkind: Control of information is power

planet code4lib - Fri, 2015-01-09 21:04

And the map is not the territory.

From the Guardian, Cracks in the digital map: what the ‘geoweb’ gets wrong about real streets

“There’s no such thing as a true map,” says Mark Graham, a senior research fellow at Oxford Internet Institute. “Every single map is a misrepresentation of the world, every single map is partial, every single map is selective. And every single map tells a particular story from a particular perspective.”

Because online maps are in constant flux, though, it’s hard to plumb the bias in the cartography. Graham has found that the language of a Google search shapes the results, producing different interpretations of Bangkok and Tel Aviv for different residents. “The biggest problem is that we don’t know,” he says. “Everything we’re getting is filtered through Google’s black box, and it’s having a huge impact not just on what we know, but where we go, and how we move through a city.”

As an example of the mapmaker’s authority, Matt Zook, a collaborator of Graham’s who teaches at the University of Kentucky, demonstrated what happens when you perform a Google search for abortion: you’re led not just to abortion clinics and services but to organisations that campaign against it. “There’s a huge power within Google Maps to just make some things visible and some things less visible,” he notes.

From Gizmodo, Why People Keep Trying To Erase The Hollywood Sign From Google Maps

But the sign is both tempting and elusive. That’s why you’ll find so many tourists taking photos on dead-end streets at the base of the Hollywood Hills. For many years, the urban design of the neighbourhood actually served as the sign’s best protection: Due to the confusingly named, corkscrewing streets, it’s actually not that easy to tell someone how to get to the Hollywood Sign.

That all changed about five years ago, thanks to our suddenly sentient devices. Phones and GPS were now able to aid the tourists immensely in their quests to access the sign, sending them confidently through the neighbourhoods, all the way up to the access gate, where they’d park and wander along the narrow residential streets. This, the neighbours complained, created gridlock, but even worse, it represented a fire hazard in the dry hills — fire trucks would not be able to squeeze by the parked cars in case of an emergency.

Even though Google Maps clearly marks the actual location of the sign, something funny happens when you request driving directions from any place in the city. The directions lead you to Griffith Observatory, a beautiful 1920s building located one mountain east from the sign, then — in something I’ve never seen before, anywhere on Google Maps — a dashed grey line arcs from Griffith Observatory, over Mt. Lee, to the sign’s site. Walking directions show the same thing.

Even though you can very clearly walk to the sign via the extensive trail network in Griffith Park, the map won’t allow you to try.

When I tried to get walking directions to the sign from the small park I suggest parking at in my article, Google Maps does an even crazier thing. It tells you to walk an hour and a half out of the way, all the way to Griffith Observatory, and look at the sign from there.

No matter how you try to get directions — Google Maps, Apple Maps, Bing — they all tell you the same thing. Go to Griffith Observatory. Gaze in the direction of the dashed grey line. Do not proceed to the sign.

Don’t get me wrong, the view of the sign from Griffith Observatory is quite nice. And that sure does make it easier to explain to tourists. But how could the private interests of a handful of Angelenos have persuaded mapping services to make it the primary route?

(h/t Nate Larson)


Filed under: General

Open Library: Open Library heads to the stars

planet code4lib - Fri, 2015-01-09 20:29

We are excited to announce that the Open Library metadata, pointing to the growing collection of content housed by the Internet Archive, has been selected for inclusion in the core archive of Outernet. If you are not familiar with Outernet, they’re calling themselves Humanity’s Public Library and they want to increase access to information for people around the world. Read more here (they’ve got a funding thing happening as well). In their own words:

Currently, 2/3 of humanity lacks Internet access. Outernet wants to broadcast humanity’s best work to the entire world from space. For free. They believe that no one should be denied a basic level of information due to wealth, geography, political environment, or infrastructure. Furthermore, every person should be able to participate in the global marketplace of ideas. They are currently live on four continents with more to come. Users can build their own receiver or purchase one.

Inclusion of Open Library metadata will help Outernet users understand the breadth of content that is available. We’re happy to help get more information to more people.

OCLC Dev Network: OCLC LC Name Authority File (LCNAF) Temporarily Unavailable

planet code4lib - Fri, 2015-01-09 20:00

The LC Name Authority File (LCNAF) is temporarily unavailable due to a problem at the data source level. While users will find that this experimental service is up and running, no data is currently available. The good news for users particularly interested in this data is that you can now access it from the Library of Congress directly at http://id.loc.gov/.

Jenny Rose Halperin: Reading Highlights 2014

planet code4lib - Fri, 2015-01-09 18:06

I did this last year too, but here are some of the best books that I read this year. I tend to read a bit haphazardly and mostly fiction, but here’s the list of books that surprised or excited me most in 2014. I can honestly say that this year I only read a few duds and that most of my reading life was very rich!

Fiction: I read a lot of Angela Carter this year, including Burning Your Boats (her collection of short stories), Wise Children, which is so wildly inventive, and Nights at the Circus, which many consider to be her best. She remains my favorite author and I am glad she has such a large catalog. Each book is like a really delicious fruit.

Perhaps the most surprising book I read this year was The Name of the Rose by Umberto Eco. I picked it up in a used bookstore in London and found it thrilling. I would love to read more monastery murder mysteries.

In the British romances category, standouts include The Enchanted April, Persuasion, Emma, Sense and Sensibility, and Far From the Madding Crowd.
British romances are my comfort food, and I always turn to them when I don’t know what to read next. I find most through browsing Project Gutenberg and seeing what I haven’t read yet. I love Project Gutenberg and think that the work they’re doing is incredibly important.

I devoured Mavis Gallant’s Paris Stories collection from the NYRB back in January and was very sad when she passed.

In German, I read only one book, which was Schachnovelle by Stefan Zweig. I read it because of the Grand Budapest Hotel connection and it was as good as promised.

I finished off my year with Snow by Orhan Pamuk, which I highly recommend! It is particularly prescient now and asks important questions about Western hegemony, art, and religion.

Memoir: I had somehow missed Heartburn by Nora Ephron and have recommended it to everyone, though it’s halfway between memoir and fiction. It is so smart, so funny, and so bitchy, like the best romcom.

Because a bunch of people have asked me: I had very mixed feelings about Not that Kind of Girl by Lena Dunham. The stories in the collection weren’t novel or exciting; the narratives had appeared in her work repeatedly and seemed like a rehashing of the most boring parts of Girls or Tiny Furniture. By the time she got to the section about her food diary, I honestly wondered if anyone had even thought to edit this work. In all, I found it smug and poorly written.

My Berlin Kitchen: A Love Story by Luisa Weiss was a lovely book about remembrance, identity, and food.

Non-fiction: My team read Cultivating Communities of Practice by Etienne Wenger, and it made a massive impression on me and my work. It is a very brilliant book!

I am cheating a bit here because I just finished it this week, but Don’t Make Me Think by Steve Krug was also fantastic and asked all the important questions about usability, testing, and the Web.

Reinventing Organizations by Frederic Laloux made some interesting claims, and I am still not quite sure what to make of it, but it definitely gave me food for thought.

If you don’t yet use it, Safari Books Online is the best tool for discovering literature in your field, both in terms of platform and content.

Historical fiction: I didn’t read much in this category this year, but what I did read was amazing. In the Garden of Beasts by Erik Larson was so well-researched and engrossing. I am officially a Larson convert! The Orientalist by Tom Reiss was incredibly exciting as well.

Honorable Mentions: In the field of community management, Jono Bacon’s The Art of Community is a classic. I liked it very much, but found its emphasis on “meritocracy” deeply problematic.

I picked up Good Poems, an anthology by Garrison Keillor at a library sale last month and it is a delight! I leave it on my kitchen table to read while hanging around.

Nicholson Baker is such a good writer, so The Way the World Works was enjoyable, though not my favorite of his.

Feel free to share your favorites as well! Here’s to a 2015 full of even more books!

 

Eric Lease Morgan: Hands-on text analysis workshop

planet code4lib - Fri, 2015-01-09 16:42

I have all but finished writing a hands-on text analysis workshop. From the syllabus:

The purpose of this 5-week workshop is to increase the knowledge of text mining principles among participants. By the end of the workshop, students will be able to describe the range of basic text mining techniques (everything from the creation of a corpus, to the counting/tabulating of words, to classification & clustering, and visualizing the results of text analysis) and have garnered hands-on experience with all of them. All the materials for this workshop are available online. There are no prerequisites except for two things: 1) a sincere willingness to learn, and 2) a willingness to work at a computer’s command line interface. Students are really encouraged to bring their own computers to class.

The workshop is divided into the following five, 90-minute sessions, one per week:

  1. Overview of text mining and working from the command line
  2. Building a corpus
  3. Word and phrase frequencies
  4. Extracting meaning with dictionaries, parts-of-speech analysis, and named entity recognition
  5. Classification and topic modeling

For better or for worse, the workshop’s computing environment will be the Linux command line. Besides the usual command-line suspects, participants will get their hands dirty with wget, Tika, a bit of Perl, a lot of Python, WordNet, TreeTagger, Stanford’s Named Entity Recognizer, and Mallet.
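
As a small taste of the week-three material (word and phrase frequencies), here is a minimal word-frequency sketch in Python. It is my own illustration, not part of the workshop’s sample code, and the file name passed on the command line is whatever plain-text document you want to count.

    #!/usr/bin/env python
    # naive word-frequency counter: a rough sketch of the week-three idea,
    # not taken from the workshop's own sample code
    import re
    import sys
    from collections import Counter

    def frequencies(path, n=25):
        """Return the n most common word tokens in the file at path."""
        with open(path, encoding="utf-8", errors="ignore") as handle:
            text = handle.read().lower()
        words = re.findall(r"[a-z]+", text)  # crude tokenization: runs of letters
        return Counter(words).most_common(n)

    if __name__ == "__main__":
        for word, count in frequencies(sys.argv[1]):
            print(f"{count}\t{word}")

Run against any plain-text file, it prints a tab-delimited count/word list that can be piped to the other command-line tools covered in the workshop.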

For more detail, see the syllabus, sample code, and corpus.

Eric Lease Morgan: distance.cgi – My first Python-based CGI script

planet code4lib - Fri, 2015-01-09 16:10

Yesterday I finished writing my first Python-based CGI script — distance.cgi. Given two words, it first lets the reader disambiguate between the various definitions of each word, and then uses WordNet’s network to display various relationships (distances) between the resulting “synsets”. (Source code is here.)

(Screenshots in the original post illustrate the three steps: reader input, disambiguation, and the displayed result.)

The script relies on Python’s Natural Language Toolkit (NLTK) which provides an enormous amount of functionality when it comes to natural language processing. I’m impressed. On the other hand, the script is not zippy, and I am not sure how performance can be improved. Any hints?
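
The linked source code is the authoritative version; purely to illustrate the kind of WordNet lookup such a script performs, a stripped-down NLTK sketch (my own, not Morgan’s code) might look like this:

    # stripped-down illustration of WordNet "distance" between two words with NLTK;
    # this is not the actual distance.cgi source
    from nltk.corpus import wordnet as wn  # requires nltk plus the 'wordnet' corpus download

    def compare(word_one, word_two):
        """Print the path similarity for every pair of senses of the two words."""
        for s1 in wn.synsets(word_one):
            for s2 in wn.synsets(word_two):
                similarity = s1.path_similarity(s2)
                if similarity is not None:  # None when no path connects the senses
                    print(f"{s1.name():20} {s2.name():20} {similarity:.3f}")

    if __name__ == "__main__":
        compare("dog", "cat")

A CGI wrapper like distance.cgi adds the extra step of letting the reader choose which sense (synset) of each word to compare before reporting the relationships.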

Jonathan Rochkind: Fraud in scholarly publishing

planet code4lib - Fri, 2015-01-09 15:38

Should librarianship be a field that studies academic publishing as an endeavor, and works to educate scholars and students to take a critical perspective? Some librarians are expected or required to publish for career promotion; are investigations in this area something anyone does?

From Scientific American, For Sale: “Your Name Here” in a Prestigious Science Journal:

Klaus Kayser has been publishing electronic journals for so long he can remember mailing them to subscribers on floppy disks. His 19 years of experience have made him keenly aware of the problem of scientific fraud. In his view, he takes extraordinary measures to protect the journal he currently edits, Diagnostic Pathology. For instance, to prevent authors from trying to pass off microscope images from the Internet as their own, he requires them to send along the original glass slides.

Despite his vigilance, however, signs of possible research misconduct have crept into some articles published in Diagnostic Pathology. Six of the 14 articles in the May 2014 issue, for instance, contain suspicious repetitions of phrases and other irregularities. When Scientific American informed Kayser, he was apparently unaware of the problem. “Nobody told this to me,” he says. “I’m very grateful to you.”

[…]

The dubious papers aren’t easy to spot. Taken individually each research article seems legitimate. But in an investigation by Scientific American that analyzed the language used in more than 100 scientific articles we found evidence of some worrisome patterns—signs of what appears to be an attempt to game the peer-review system on an industrial scale.

[…]

A quick Internet search uncovers outfits that offer to arrange, for a fee, authorship of papers to be published in peer-reviewed outlets. They seem to cater to researchers looking for a quick and dirty way of getting a publication in a prestigious international scientific journal.

This particular form of the for-pay mad-libs-style research paper appears to be prominent  mainly among researchers in China. How can we talk about this without accidentally stooping to or encouraging anti-Chinese racism or xenophobia?   There are other forms of research fraud and quality issues which are prominent in the U.S. and English-speaking research world too.  If you follow this theme of scholarly quality issues, as I’ve been trying to do casually, you start to suspect the entire scholarly publishing system, really.

We know, for instance, that ghost-written scholarly pharmaceutical articles are not uncommon in the U.S. too. Perhaps in the U.S. scholarly fraud is more likely to come for ‘free’ from interested commercial entities than from researchers paying ‘paper salesmen’ for poor quality papers. To me, a paper written by a pharmaceutical company employee but published under the name of an ‘independent’ researcher is arguably a worse ethical violation, even if everyone involved can think “Well, the science is good anyway.” It also wouldn’t shock me if very similar systems to China’s paper-for-sale industry exist in the U.S., on a much smaller scale, but are more adept at avoiding reuse of nonsense boilerplate, making them harder to detect. Presumably the Chinese industry will get better at avoiding detection too, or perhaps already is at the higher end of the market.

In both cases, the context is extreme career pressure to ‘publish or perish’, into a system that lacks the ability to actually ascertain research quality sufficiently, but which the scholarly community believes has that ability.

Problems with research quality don’t end here; they go on and on, and are starting to get more attention.

  • An article from the LA Times from Oct 2013,
    “Science has lost its way, at a big cost to humanity: Researchers are rewarded for splashy findings, not for double-checking accuracy. So many scientists looking for cures to diseases have been building on ideas that aren’t even true.” (and the HN thread on it).
  • From the Economist, also from last year, “Trouble at the lab: Scientists like to think of science as self-correcting. To an alarming degree, it is not.”
  • From Nature, August 2013 (was 2013 the year of discovering scientific publishing ain’t what we thought?), “US behavioural research studies skew positive:
    Scientists speculate ‘US effect’ is a result of publish-or-perish mentality.”

There are also individual research papers investigating particular issues, especially statistical methodology problems, in scientific publishing.  I’m not sure if there are any scholarly papers or monographs which take a big picture overview of the crisis in scientific publishing quality/reliability — anyone know of any?

To change the system, we need to understand the system — and start by lowering confidence in the capabilities of existing ‘gatekeeping’.  And the ‘we’ is the entire cross-disciplinary community of scholars and researchers. We need an academic discipline and community devoted to a critical examination of scholarly research and publishing as a social and scientific phenomenon, using social science and history/philosophy of science research methods; a research community (of research on research) which is also devoted to education of all scholars, scientists, and students into a critical perspective.   Librarians seem well situated to engage in this project in some ways, although in others it may be unrealistic to expect.


Filed under: General

Islandora: Announcing the Islandora GIS Interest Group

planet code4lib - Fri, 2015-01-09 14:34

A new Islandora Interest Group has been convened by James Griffin of Lafayette College Libraries. The GIS IG is looking for interested members to join in discussions about how to handle geospatial data sets in Islandora. As with our other Interest Groups, group documents and membership details are handled through GitHub.

  • The primary objective of this interest group is to release a set of functionality that gives members of the Islandora Community the ability to ingest, browse, and discover geospatial data sets
    • Essential to this solution is the ability to visualize geospatial vector and raster data sets (features and coverages), as well as the ability to index Fedora Commons objects within Apache Solr using key geographic metadata fields
  • In doing so, this interest group addresses the preservation, access, and discovery of geospatial data sets (these data sets including, but not limited to, Esri Shapefiles [vector data sets] and images in the GeoTIFF format [raster data sets]).
  • It enables and structures descriptive and technical metadata for these data sets (including the use of ISO 19139-compliant XML documents referred to as "Federal Geographic Data Committee" (FGDC) documents, and MODS documents)
  • It explores and defines a series of best practices for the generation of common and open standards for the serialization of entities within geospatial data sets (such as Keyhole Markup Language documents and GeoJSON objects)
  • It captures and manages user stories involving the management of geospatial metadata, and refactors these into feature or improvement requests for the solution being implemented by the Islandora Community

If you are interested in joining up, please reply on this listserv thread or contact the convenor.

Library of Congress: The Signal: Web Archive Management at NYARC: An NDSR Project Update

planet code4lib - Fri, 2015-01-09 14:21

The following is a guest post by Karl-Rainer Blumenthal, National Digital Stewardship Resident at the New York Art Resources Consortium (NYARC).

A tipping point from traditional to emergent digital technologies in the regular conduct of art historical scholarship threatens to leave unprepared institutions and their researchers alike in a “digital black hole.” NYARC–the partnership of the Frick Art Reference Library, the Museum of Modern Art Library and the Brooklyn Museum Library & Archives–seeks to institute permanent and precedent-setting collecting programs for born-digital primary source materials that make this black hole significantly more gray.

Since the 2013 grant from the Andrew W. Mellon Foundation, for instance, NYARC has archived the web presences of its partner museums and those of prominent galleries, auction houses, artists, provenance researchers and others within their traditional collecting scopes. While working to define description standards and integrate access points with those of traditional resources, NYARC has further leveraged this leadership opportunity by designing the current National Digital Stewardship Residency project, which will concurrently prepare these nascent collections for long-term management and preservation.

Archiving MoMA’s many exhibit sites preserves them for future art historians, but only if critical elements aren’t lost in the process.

Stewarding web archives to the future generations that will learn from them requires careful planning and policymaking. Sensitive preservation description and reliable storage and backup routines will ultimately determine the accessibility of these benchmarks of our online culture for future librarians, archivists, researchers and students. Before we can plan and prepare for the long term, however, it is incumbent upon those of us with responsibility to steward especially visually rich and complex cultural artifacts to assure their integrity at the point of collection–to assure their faithful rendition of the extent, behavior and appearance of visual information transmitted over this uniquely visual medium.

Quality assurance (QA)–the process of verifying and/or making the interventions necessary to improve the accuracy and integrity of archived web-based resources at the point of their collection–was therefore the logical place to begin defining long term stewardship needs.  As I quickly discovered, though, it also happens to be one of the slipperiest issues for even experienced web archivists. Like putting together a jigsaw puzzle, its success begins with having all of the right pieces, then requires fitting those pieces together in the correct order and sequence, and ultimately hinges on the degree to which our final product’s ‘look and feel’ resembles that of our original vision.

Unless and until the technologies that we use to crawl and capture content from the live web can simply replicate every conceivable experience that any human browser may have online, we are compelled to decide which specific properties of equally sprawling and ephemeral web presences are of primary significance to our respective missions and patrons, and which therefore demand our most assiduous and resource-intensive pursuit.

Determining those priority areas and then finding the requisite time and manpower to do them justice is challenging enough for any web archiving operation. To a multi-institutional partnership sharing responsibility for aesthetically diverse but equally rich and complex web designs, it’s enough to stop you right in your tracks. To keep NYARC’s small army of graduate student QA technicians all moving in the same direction as efficiently as possible, and to sustain a model of their work beyond the end of their grant-funded terms, I’ve therefore spent the bulk of this first phase of my NDSR project building towards the following procedural reference guide. I now welcome the broader web archiving community to review, discuss and adapt this to their own use:

This living document will be updated to reflect technical and practical developments throughout and beyond the remainder of my residency. In the meantime, it will provide NYARC’s decision-makers, and others who are designing permanent web archiving programs, an executive summary of the principles and technologies that influence the potential scopes of QA work. Its procedural guidelines walk our QA technicians through their regular assessment and documentation process. Perhaps most importantly, this roadmap directs them to the areas where they may make meaningful interventions, indicates where they alternatively must rely on help from our software service providers, Archive-It, and flags where future technical development still precludes any potential for improvement. Finally, it inventories the major problem areas and improvement strategies presently known to NYARC to make or break the whole process.

This iteration of NYARC’s documentation is the product of expansive literature review, hands-on QA work, regular consultation and problem solving with interns and professional staff, and the generous advice of colleagues throughout the community. As such, it has prepared me not only for upcoming NDSR project phases focused on preservation metadata and archival storage, but also for a much longer career in digital preservation.

As any such project must, it hinges the success of any rapidly acquired technical knowledge or expertise on equally effective project management, communication and open documentation–skill sets that every emergent professional must cultivate in order to have a permanent role in the stewardship of our always tumultuous digital culture. I’m sure that this small documentation effort will provide NYARC, and similar partners in the field, with the tools to improve the quality of their web archives. I also sincerely hope that it provides a model of practice to sustain such improvements through radical and unforeseen technological changes–that it makes the digital black hole just a little more gray.

Shelley Gullikson: Web Rewriting Sprint

planet code4lib - Fri, 2015-01-09 14:08

At the end of October, I was watching tweets coming out of a UX webinar and saw this:

I thought it sounded great, so ran it by Web Committee that same week and we scheduled a sprint for the end of term. Boom. I love it when an idea turns into a plan so quickly!

We agreed that we needed common guidelines for editing the pages. I planned to point to an existing writing guide, but decided to draft one using examples from our own site.

I put together a spreadsheet of all the pages linked directly from the home page or navigation menus, plus all the pages owned by admin or by me. Subject guides and course guides were left out. The committee decided to start with content owned by committee members, rather than asking permission to edit other staff members’ content. We prioritized the resulting list of 57 pages (well, 57 chunks of content – some of those were Drupal “books” with multiple pages).

Seven of us got together on an early December afternoon (six in the room, one online from the East Coast). Armed with snacks, we spent 90 minutes editing and got through most of our top and mid-priority pages.

It was a very positive experience. We got a second set of eyes on content that may have only ever been looked at by one person. We were able to talk to each other to get feedback on clear and concise wording. And we saw pages that were already pretty good, which was a nice feeling too.

We’ve organized another sprint for reading week in February. We’re going to look at the top priority pages again, to see if we can make them even clearer and more concise.

 


DPLA: From Book Patrol: Happy New Calendar!

planet code4lib - Fri, 2015-01-09 14:00

Now that we have our new calendar in place to help track the year ahead, let’s have a look back at some of the thousands of calendars available for your perusal at the DPLA. The word “calendar” derives from the Latin kalendae, the name of the first day of every month, and there are as many varieties of calendars as there are days of the month.

From a 12th century Book of Hours to a 16th century perpetual calendar to a Native American calendar on buckskin to a handwritten calendar by Lee Harvey Oswald, there is no shortage of creative ways to track time and, in many cases, to advertise one’s business.

Enjoy!

LITA: Tech Tools in Traditional Positions

planet code4lib - Fri, 2015-01-09 12:00

During this winter break, I’ve had a slight lull in library work and time to reflect on my first semester of library school, aside from reading for pleasure and beginning Black Mirror on Netflix (anybody?). Overall, I’m ready to dive in to the new semester, but one tidbit from fall semester keeps floating in my thoughts, and I’m curious what LITA Blog readers have to say.

Throughout my undergraduate education at the University of Nebraska-Lincoln, I was mainly exposed to two different sets of digital humanities practices: encoding and digital archive practices, and text analysis for literature. With my decision to attend library school, I assumed I would focus on the former for the next two to three years.

Last semester, in my User Services and Tools course, we had a guest speaker from User Needs Assessment in the Indiana University Libraries. As the title suggests, he spoke about developing physical user spaces in the libraries and facilitating assessments of current spaces.

For one portion of his assessments, he used text analysis, more specifically topic modeling with MALLET, a Java-based natural language processing toolkit, to gain a better understanding of written survey results. This post by Shawn Graham, Scott Weingart, and Ian Milligan explains topic modeling, when/how/why to use it, and various tools to make it happen, focusing on MALLET.

If you didn’t follow the links: topic modeling works by aggregating the many texts a user feeds into the algorithm and returning sets of related words drawn from those texts. The user then attempts to understand the theme presented by each set of words and explain why it appears. Many times, this practice can reveal themes the user may not have noticed through traditional reading across multiple texts.

Image courtesy of Library Technology Consultants.
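
MALLET itself is driven from the Java command line, but the underlying idea can be sketched in a few lines of Python using scikit-learn’s LDA implementation. This is my own illustration, with made-up survey responses; it is not the tool or data the guest speaker used.

    # topic-modeling sketch with scikit-learn rather than MALLET; the survey
    # responses below are invented examples, not real assessment data
    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn.decomposition import LatentDirichletAllocation

    survey_responses = [
        "The quiet study rooms are always full during finals",
        "More group study rooms and whiteboards would help",
        "Printing is confusing and the printers are often broken",
        "I wish printing cost less and the printers actually worked",
    ]

    vectorizer = CountVectorizer(stop_words="english")
    doc_term_matrix = vectorizer.fit_transform(survey_responses)

    lda = LatentDirichletAllocation(n_components=2, random_state=0)
    lda.fit(doc_term_matrix)

    terms = vectorizer.get_feature_names_out()
    for topic_index, weights in enumerate(lda.components_):
        top_terms = [terms[i] for i in weights.argsort()[::-1][:5]]
        print(f"Topic {topic_index}: {', '.join(top_terms)}")

Each “topic” comes back as a ranked bag of words; it is up to the analyst to decide, for example, that one set is about study space and another about printing.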

From a digital humanities perspective, we love it when computers show us things we missed or help make a task more efficient. Thus, using topic modeling seems an intuitive step for analyzing survey results, as the guest speaker presented. Yet it was also unexpected, considering his more traditional position.

I’m curious where you have used some sort of technology, coding, or digital tool to solve a problem or expedite a process in a more traditional library position. Librarians working with digital objects use these technologies and practices daily, but as digital processes, such as topic modeling and text analysis, become more widely used, I’m interested to see where else they crop up and for which reasons.

Feel free to respond with an example of when you unexpectedly used text analysis or another tech tool in your library to complete a task that didn’t necessarily involve digital objects! How did you discover the tool? How did you learn it? Would you use it again?

LITA: Jobs in Information Technology: January 8

planet code4lib - Thu, 2015-01-08 21:09

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

Associate Dean for Technology and Digital Strategies, The Pennsylvania State University Libraries, University Park, PA

Visit the LITA Job Site for more available jobs and for information on submitting a  job posting.

 

DPLA: Metadata Aggregation Webinar: January 22, 2015 at 2 PM Eastern

planet code4lib - Thu, 2015-01-08 19:14

Metadata is the basis of DPLA’s work. We rely on a growing network of Content Hubs, large repositories of digital content, and Service Hubs that aggregate metadata from partners. We, in turn, aggregate the Hubs’ metadata into the DPLA datastore.

With new Hubs, we often work together to identify organizational and governance structures that make the most sense for their local situation. Once an administrative model is established, the practical matter of how to aggregate their partners’ metadata, and how to deal with quality control over the resulting aggregated set, plays a larger role.

DPLA’s Hub network does not rely on a single metadata aggregation workflow or tool, and our own aggregation practices are quite a bit different from our partners’. While diversity in approaches is good in that each Hub can create a process that works best for them, it also means that our community hasn’t decided on a set of standard practices or tools.

We’ve recently implemented an application process for new Hubs, so it seems timely to start a conversation about metadata aggregation practices among our current and potential Hubs, their partners, and really, anyone else interested in sharing and enhancing metadata. It seems that there’s always something to learn about metadata aggregation, and we’re hopeful that DPLA can be a conduit for a discussion about some of the fundamental concepts and requirements for local practice and aggregation at scale.

To that end, on January 22, at 2 pm Eastern, we will be hosting a webinar about metadata aggregation. We’ll be taking an inside look at aggregation best practices at two of our DPLA Service Hubs, in North Carolina and South Carolina. In addition, DPLA has been working on improving our existing tools as well as creating some new ones for metadata aggregation and quality control. We’d like to share what’s in place and preview some of our plans, and we hope to get feedback on future directions.

Speakers:

This webinar will be offered to the public. Since we’ll be limited to 100 seats, please limit registration to no more than two seats per organization. Please get in touch with Gretchen with any questions.

Hydra Project: Indiana University and WGBH Boston share major NEH grant

planet code4lib - Thu, 2015-01-08 09:31

Hydra Partners Indiana University and WGBH Boston have jointly been awarded nearly $400,000 by the National Endowment for the Humanities to develop HydraDAM2, a Hydra-based software tool that will assist in the long-term preservation of audio and video collections.

HydraDAM2 will primarily address challenges posed by long-term preservation of digital audio and video files. Because these “time-based media” files are significantly larger than many other digital files managed by libraries and archives, they potentially require special solutions and workflows.

An important feature of HydraDAM2 is that it will be open source and can be used and shared freely among cultural institutions, including libraries, archives, universities and public broadcasters.

HydraDAM2 is also scalable to both small and large organizations, having the ability to interact with massive digital storage systems as well as with smaller digital tape storage systems.

The full press release can be found at http://news.indiana.edu/releases/iu/2014/12/neh-grants-digital-preservation.shtml

Congratulations to Jon Dunn at IU, Karen Cariani at WGBH and to their teams.

District Dispatch: Key ALA Offices to team up with “Advocacy Guru” at 2015 Midwinter Meeting

planet code4lib - Thu, 2015-01-08 06:06

In her inaugural column for American Libraries, titled “Advocate. Today.,” American Library Association (ALA) President Courtney Young challenged librarians of all types, and friends of libraries, to commit to spending just an hour a week advocating for libraries. To take the mystery out of just what “advocacy” means, how to do it and how to have fun along the way, ALA’s Offices of Intellectual Freedom (OIF), Library Advocacy (OLA), Public Information and the Washington Office will partner with all ALA divisions to present “An Hour a Week: Library Advocacy is Easy!!!” during the 2015 ALA Midwinter Meeting in Chicago. The program is being cosponsored and will be co-promoted by all of ALA’s twelve divisions.

Grassroots advocacy guru Stephanie Vance leads a session at a previous ALA conference.

The session, which will be held on Saturday, January 31, 2015, from 10:30–11:30 a.m., will be led by the ever-popular “Advocacy Guru,” Stephanie Vance, who will walk “newbies” and “old pros” alike through just what advocacy means today–from engaging with the local PTAs and library boards to lobbying the White House. With the help of panelists from OIF and OLA, Vance will share easy advocacy strategies and lead a lightning tour of the many terrific ALA advocacy resources available to give “ALAdvocates” everything they need to answer Courtney’s call.

For fully half the program, Vance also will “dive” into the audience to “get the story” of what works in the real world from real librarians and library supporters from all parts of the profession. These everyday advocates will share their own hands-on experiences at bringing libraries’ messages to allies, policy-makers and elected officials at every level and explain why every librarian’s (and library lover’s) involvement in advocacy is so critical to the future of libraries of every kind everywhere.

Additional speakers include Marci Merola, director of the ALA Office for Library Advocacy and Barbara Jones, director of the ALA Office for Intellectual Freedom.

View other ALA Washington Office Midwinter Meeting conference sessions

The post Key ALA Offices to team up with “Advocacy Guru” at 2015 Midwinter Meeting appeared first on District Dispatch.

Library of Congress: The Signal: Digital Preservation in Mid-Michigan: An Interview with Ed Busch

planet code4lib - Wed, 2015-01-07 18:12

Conferences, meetings and meet-ups are important networking and collaboration events that allow librarians and archivists to share digital stewardship experiences. While national conferences and meetings offer strong professional development opportunities, regional and local meetings offer opportunities for practitioners to connect and network with a local community of practice. In a previous blog post, Kim Schroeder, a lecturer at the Wayne State University School of Library and Information Science, shared her experiences planning and holding Regional Digital Preservation Practitioners (RDPP) in Detroit. In this post, part of our Insights Interview series, I’m excited to talk to Ed Busch, Electronic Records Archivist at Michigan State University, about his experiences spearheading the Mid-Michigan Digital Practitioners Group.

Erin: Please tell us a little bit about yourself and what you do at Michigan State University.

Ed Busch with MSU archives collection of civil war documents. Photo courtesy of Communications and Brand Strategy.

Ed: I come from what I suspect is a unique background for an archivist. I have an undergraduate B.S. in Fisheries from Humboldt State University in California, took coursework in programming (BASIC, FORTRAN, APL), worked as a computer operator (loading punch cards and hanging tapes), performed software testing as well as requirements writing, and was a stay-at-home dad for a period of time.

It was during this period that I looked into librarianship; I thought I could bring an IT background along with my love of history and genealogy to the field. After I completed my MLIS and Archives Administration certificate at Wayne State in 2007, I began a processing archivist position at the MSU Archives that led to my current position as the Electronic Records Archivist.

As an archivist here, I work on a lot of different projects. This includes “digital projects” such as web crawling (via Archive-It), adding content to our website, managing our Archivists’ Toolkit installation, managing a couple of listservs (Michigan Archival Association and Big10 Archivists), working on our Trusted Digital Repository workflows and identifying useful tools to aid processing digital records. I also continue to do some paper processing, manage our Civil War Letters and Diaries Digitization project and the development of an AV Digitization Lab at the archives. I’m also the first person staff consults for PC or network issues at the archives.

Erin: How are you involved in digital preservation at your organization?

Ed: I supported my fellow electronic records archivist Lisa Schmidt on an NHPRC grant to create the Spartan Archive, a repository for Michigan State University’s born-digital institutional administrative records. For the grant, we focused on digital records from MSU’s Office of the Registrar.

As a follow-on to the grant we are working on creating a Trusted Digital Repository for MSU. We are currently ingesting digital records using Archivematica into a preservation environment. Lisa and an intern do most of the actual ingesting while I provide technical advice, create workflows for unique items and identify useful tools. We are also evaluating applications that can help manage our digital assets and to provide access to them.

One area that has been on the “To Do” list is processing the digital assets from our university photographers and videographers. The challenges include selecting what to keep and what not to, how to provide access, and how to fund the storage for this large amount of data. I’ve also explored some facial recognition applications but haven’t found a good way to integrate them into our TDR yet.

I’m also the person doing all the web archiving for the University, and I’m testing out a migration to ArchivesSpace so that we can schedule a transition to it. Besides the Mid-Michigan Digital Practitioners (MMDP) meeting planning, I also attend meetings of web developers here at MSU (WebDev CAFE) and am a volunteer on the ArchivesSpace Technical Advisory Council.

Erin: Could you talk about Mid-Michigan Digital Practitioners Group. You have had some very successful regional meetings over the past couple of years. Can you tell us more about these?

Presentation during the first MMDP. Photo credit: Courtesy of MSU Archives.

Ed: In February of 2013, I heard about a new group for Digital Preservation Practitioners in the Detroit/Ann Arbor/Toledo/Windsor area. I recall thinking that this sounded neat and wanting to explore whether there was interest in holding a session for Mid-Michigan Digital Preservation Practitioners, with the purpose of getting together to talk about what the various institutions are doing: projects, technologies, partners, etc.

When I contacted some of my colleagues about this, the answer was a resounding yes! Portia Vescio (Assistant Director of the Archives) and I contacted Digital Curation Librarian Aaron Collie, and we created Mid-Michigan Digital Practitioners. Systems Librarian Ranti Junus joined the three of us to form the Mid-Michigan Digital Practitioners planning group. We’ve had great support from the MSU Archives and MSU Libraries leadership for this effort.

We held our first meeting at MSU in August of 2013. From the beginning, we’ve been big on using email and surveys to get ideas and help from the Mid-Michigan professionals working with digital materials. For this first meeting, we came up with a rough agenda and started soliciting presenters to talk about what they were working on. We also communicated with Kim and Lance [Stuchell]’s group to keep them in the loop. There was some concern that there were two groups but we really wanted to serve the needs of the Mid-Michigan area. Many smaller shops don’t have the resources to go far. At that first meeting, we had over 50 attendees from around 15 different institutions. What most people kept saying they liked best was the chance to talk to other people trying to solve the same problems.

We held the second meeting in March 2014 at Grand Valley State University with over 50 attendees from 24 different institutions. We repeated the process and held the third meeting at Central Michigan University this past September with 50 attendees from over 20 institutions.

We’re now just starting the planning for the 4th meeting, set for March 27, 2015 at the University of Michigan in Ann Arbor. We have high hopes for a great meeting and hopefully some student involvement from the U of Michigan School of Information and Wayne State University School of Library Science. We’ve also set up a listserv (mmdp@list.msu.edu) to aid communication.

Erin: What did you feel was most successful about your meetings?

Participants at the first MMDP. Photo credit: Courtesy of MSU Archives.

Ed:  I think what’s been most successful is creating a chance for archivists, librarians and museum curators from all types and sizes of institutions to share experiences, what’s worked, what hasn’t, nifty tools, cool projects, etc. about their digital materials. Feedback from the meetings has this as the thing most people liked best. We also really do try to use the feedback we get to improve each meeting, try out new things and talk about what people are interested in learning more about.

Erin: What kind of impact do you think these meetings have had on your community and the organizations in your region?

Ed:  I think our greatest contribution to the region has been creating a place for professionals from large and small institutions to see what’s happening in the area of digital materials and to share experiences. Digital materials have the same issues/problems/situations for all of us; the main difference being what resources we can use to deal with them. By providing a forum for people to meet, hopefully everyone can get ideas to take back with them and to have information they can share with their leadership on the importance of this work.

Erin: What one piece of advice would you offer others who may be interested in starting up a regional practitioners group?

Ed:  One thing that I believe has made our group able to keep going is that the core planning group is all located at MSU. We can meet every few weeks to work on the next meeting, assign tasks and share information with the host institution. Saying that, for the next MMDP meeting, we are expanding our planning group to include a few other people to call in to the planning meetings. We’ll see how that works and regroup if needed or possibly add some more. Flexibility is important.

I do sincerely believe though that what really makes a difference is the interest and commitment of the planning team and its leadership at the Archives and Libraries to keep this going even though we each have a lot on our plates. We feel this is vital to the community of archivists, librarians and curators in the area.

Erin White: Easier access for databases and research guides at VCU Libraries

planet code4lib - Wed, 2015-01-07 15:00

Today VCU Libraries launched a couple of new web tools that should make it easier for people to find or discover our library’s databases and research guides.

This project’s goal was to help connect “hunters” to known databases and help “gatherers” explore new topic areas in databases and research guides1. Our web redesign task force identified these issues in 2012 user research.

1. New look for the databases list

Since the dawn of library-web time, visitors to our databases landing page were presented with an A to Z list of hundreds of databases with a list of subject categories tucked away in the sidebar.

The new design for the databases list presents a few ways to get at databases, in this order:

For the hunters:

  • Search by title with autocomplete (new functionality)
  • A to Z links

For the gatherers:

  • Popular databases (new functionality)
  • Databases by subject

And, on database subject pages and database search results, there are links to related research guides.

2. Suggested results for search

Building on the search feature in the new database list, we created an AJAX, Google AdWords-esque add-on to our search engine (Ex Libris’ Primo) that recommends database or research guide results based on the search query. For longer, more complex queries, no suggestions are shown.
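
The production service is the Slim/Ember application described in the credits below, querying a Primo index. Purely to illustrate the behavior described here (short queries get title-matched suggestions, long queries get none), a toy version of the logic could look like the following, with the database list and word-count cutoff both invented for the example:

    # toy sketch of the "suggested results" behavior described above; the real
    # add-on is a Slim/Ember app searching a Primo index, not this code
    DATABASES = {
        "JSTOR": "/databases/jstor",
        "PubMed": "/databases/pubmed",
        "Web of Science": "/databases/web-of-science",
    }

    MAX_QUERY_WORDS = 3  # hypothetical cutoff: longer queries get no suggestions

    def suggest(query):
        """Return (title, url) pairs whose titles contain every word of a short query."""
        words = query.strip().lower().split()
        if not words or len(words) > MAX_QUERY_WORDS:
            return []
        return [(title, url) for title, url in DATABASES.items()
                if all(word in title.lower() for word in words)]

    print(suggest("web of science"))                 # [('Web of Science', '/databases/web-of-science')]
    print(suggest("effects of caffeine on memory"))  # [] -- too long, treated as a topic search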

Try these queries:

Included in the suggested results:

3. Updates to link pathways for databases

To highlight the changes to the databases page, we also made some changes to how we are linking to it. Previously, our homepage search box linked to popular databases, the alphabet characters A through Z, our subject list, and “all”.

The intent of the new design is to surface the new databases list landing page and wean users off the A-Z interaction pattern in favor of search.

The top three databases are still on the list both for easy access and to provide “information scent” to clue beginner researchers in on what a database might be.

Dropping the A-Z links will require advanced researchers to make a change in their interaction patterns, but it could also mean that they’re able to get to their favorite databases more easily (and possibly unearth new databases they didn’t know about).

Remaining questions/issues
  • Research guides search is just okay. The results are helpful a majority of the time and wildly nonsensical the rest of the time. And, this search is slowing down the overall load time for suggested results. The jury is still out on whether we’ll keep this search around.
  • Our database subject categories need work, and we need to figure out how research guides and database categories should relate to each other. They don’t connect right now.
  • We don’t know if people will actually use the suggested search results and are not sure how to define success. We are tracking the number of clicks on these links using Google Analytics event tracking – but what’s good? How do we know to keep this system around?
  • The change away from the A-Z link list will be disruptive for many and was not universally popular among our librarians. Ultimately it should be faster for “hunters”, but we will likely hear groans.
  • The database title search doesn’t yet account for common and understandable misspellings2 of database names, which we hope to rectify in the future with alternate titles in the metadata (a rough sketch of one possible approach follows this list).
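
Purely as an illustration of the alternate-titles idea (these variant spellings and this matching code are hypothetical, not anything in VCU’s metadata), the standard library’s difflib can map a misspelled query onto a canonical database name:

    # hypothetical sketch of resolving misspelled database names via alternate titles;
    # the alternate-title list here is invented for the example
    import difflib

    ALTERNATE_TITLES = {
        "jstor": "JSTOR",
        "jstore": "JSTOR",
        "j-store": "JSTOR",
        "pubmed": "PubMed",
        "pub med": "PubMed",
    }

    def resolve(query):
        """Map a possibly misspelled title to a canonical database name, or None."""
        matches = difflib.get_close_matches(query.strip().lower(),
                                            ALTERNATE_TITLES.keys(), n=1, cutoff=0.8)
        return ALTERNATE_TITLES[matches[0]] if matches else None

    print(resolve("jstore"))  # JSTOR
    print(resolve("pubmd"))   # PubMed
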
Necessary credits

Shariq Torres, our web engineer, provided the programming brawn behind this project, completely rearchitecting the database list in Slim/Ember and writing an AJAX frontend for the suggested results. Shariq worked with systems librarians Emily Owens and Tom McNulty to get a Dublin Core XML file of the databases indexed and searchable in Primo. Web designer Alison Tinker consulted on look and feel and responsified the design for smaller-screen devices. A slew of VCU librarians provided valuable feedback and QA testing.

  1. I believe this hunter-gatherer analogy for information-seeking behaviors came from Sandstrom’s An Optimal Foraging Approach to Information Seeking and Use (1994) and have heard it in multiple forms from smart librarians over the years.
  2. Great info from Ken Varnum’s Database Names are Hard to Learn (2014)
