planet code4lib

Planet Code4Lib - http://planet.code4lib.org
PeerLibrary: Towards Open Access to Research and Knowledge for Development

Tue, 2014-10-14 04:00

Beyond establishing an online database of publicly accessible academic articles, PeerLibrary has committed itself to providing an open space where people are encouraged to collaborate and communicate with one another. This week we want to highlight a fascinating article in our database, titled “Towards Open and Equitable Access to Research and Knowledge for Development”, by Professor Leslie Chan et al. at the University of Toronto. Professor Chan’s team focuses on the importance of research and the necessity for anyone, not just academic scholars, to be able to engage in and conduct research as well as share and critique one another’s ideas. He claims that only open collaboration will effectively promote human development and allow the merging of different identities. With the words of Professor Chan in mind, PeerLibrary hopes to aid the development of free collaboration under the guiding principle of the universal right to education.

Dan Scott: DCMI 2014: schema.org holdings in open source library systems

Tue, 2014-10-14 01:07

My slides from DCMI 2014: schema.org in the wild: open source libraries++.

Last week I was at the Dublin Core Metadata Initiative 2014 conference, where Richard Wallis, Charles McCathie Nevile and I were slated to present on schema.org and the work of the W3C Schema.org Bibliographic Extension Community Group (#schemabibex). As a first-timer at DCMI, I wasn't sure what kind of an audience to expect: there is a peer-reviewed papers track, and a series of sessions on a truly intimidating topic (RDF Application Profiles), but on the other hand our own topic was fairly basic. As it turned out, there was an invigoratingly mixed set of backgrounds present, and Eric Miller's opening keynote, which gave an oral history of the origins of DCMI and a look towards the future challenges for the organization, reassured me that I wasn't going to be out of my depth.

Special kudos to Eric for his analogy of the Web to a credit card, which offers both human-readable and machine-readable data. A nice, clean image!

Richard, Charles and I opted to structure our 1.5 hour session as a series of short talks followed by a long period of discussion. However, as often happens, the excitement of speaking in front of a room that drew so many attendees that we had to jam in more chairs led to that plan breaking down. I cut my own materials back to illustrating how one of my primary contributions to the #schemabibex effort--representing library holdings using schema.org's GoodRelations-based Product/Offer model--had been implemented in free software library systems, including Evergreen, Koha, and VuFind. I walked from a basic bibliographic record (represented as a Product), through to the associated borrowable items (represented as Offers with a price of $0.00, call numbers as SKUs, and barcodes as serialNumbers), offered by a specific Library with its own set of operating hours, address, and contact information... all published out of the box as RDFa in modern Evergreen systems.
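For readers who want to see the shape of that mapping, here is a minimal sketch in Python. It is not Evergreen's, Koha's, or VuFind's code, and it emits JSON-LD rather than the RDFa those systems publish. The function names, the sample title, call number, barcode, and library name are all invented for illustration, and the `priceCurrency` value is an assumption; only the Product/Offer/Library types and the `price`, `sku`, and `serialNumber` properties come from the description above.

```python
import json


def holding_to_offer(call_number, barcode, library_name):
    """Map one borrowable item (hypothetical input shape) to a schema.org Offer."""
    return {
        "@type": "Offer",
        "price": "0.00",            # borrowing is free, per the model described above
        "priceCurrency": "USD",     # assumption: currency is not stated in the post
        "sku": call_number,         # call number expressed as a SKU
        "serialNumber": barcode,    # barcode expressed as a serial number
        "seller": {"@type": "Library", "name": library_name},
    }


def record_to_product(title, holdings):
    """Map a basic bibliographic record plus its holdings to a schema.org Product."""
    return {
        "@context": "http://schema.org",
        "@type": "Product",
        "name": title,
        "offers": [holding_to_offer(**h) for h in holdings],
    }


# Invented sample data, for illustration only.
example = record_to_product(
    "Example Title",
    [
        {
            "call_number": "QA76.9 .E93",
            "barcode": "30007001234567",
            "library_name": "Example Branch Library",
        }
    ],
)

print(json.dumps(example, indent=2))
```

A fuller version would also attach the library's operating hours, address, and contact information to the `Library` node, as the talk described.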

I did stray a little to posit that the use case for schema.org is not and should not be limited to "search engine optimization", but that this very simple level of structured data could fairly easily form the basis of an API. In the rather limited discussion that we were able to hold at the end of the session (and encroaching on break time), Charles counselled that libraries shouldn't really bother with dumbing down their beautiful metadata simply to publish schema.org... while I countered that the pursuit of publishing beautiful metadata in the past has generally led librarians to publish no metadata at all, and that schema.org was a great first step towards building a web of cultural heritage metadata meant for machine consumption.

I wish I could have stayed longer at DCMI, but it was Thanksgiving in Canada and there were families to visit and feast with--not to mention children to help take care of--so I had to depart after just a day and a half. I'm encouraged by the steps the organization is taking to renew itself, and I hope to be able to participate again in the future.

DPLA: DPLA and DigitalNZ present GIF IT UP, an international GIF-making competition: October 13 – December 1, 2014

Mon, 2014-10-13 14:28

It’s a public domain celebration! The Digital Public Library of America and DigitalNZ are very excited to announce the launch of GIF IT UP, an international competition over the next six weeks to find the best GIFs reusing public domain and openly licensed digital video, images, text, and other material available via our search portals. The winners will have their work featured and celebrated online at the Public Domain Review and Smithsonian.com. Pretty sweet, huh?

About GIF IT UP

Cat Galloping (1887). The still images used in this GIF come from Eadweard Muybridge’s “Animal locomotion: an electro-photographic investigation of consecutive phases of animal movements” (1872-1885). Courtesy USC Digital Library, 2010. View original record (item is in the public domain). GIF available under a CC-BY license.

How it works. The GIF IT UP competition has six categories:

  1. Animals
  2. Planes, trains, and other transport
  3. Nature and the environment
  4. Your hometown, state, or province
  5. WWI, 1914-1918
  6. GIF using a stereoscopic image

A winner will be selected in each of these categories and, if necessary, a winner will be awarded in two fields: use of an animated still public domain image, and use of video material.

To view the competition’s official homepage, visit http://dp.la/info/gif-it-up/.

Judging. GIF IT UP will be co-judged by Adam Green, Editor of the Public Domain Review and by Brian Wolly, Digital Editor of Smithsonian.com. Entries will be judged on coherence with category theme, thoroughness of entry (correct link to source material and contextual information), creativity, and originality.

Gallery. All entries that meet the criteria outlined below in the Guidelines and Rules will be posted to the GIF IT UP Tumblr Gallery. The gallery entries with the most Tumblr “notes” will receive the people’s choice award and will appear online at the Public Domain Review and Smithsonian.com alongside the category winners.

Submit. To participate, please first take a moment to read “How it Works” and the guidelines and rules on the GIF IT UP homepage, and then submit your entry by clicking here.

Deadline. The competition deadline is December 1, 2014 at 5:00 PM EST / December 2, 2014 at 10:00 AM GMT+13.

GIFtastic Resources. You can find more information about GIF IT UP–including select DPLA and DigitalNZ collections available for re-use and a list of handy GIF-making tips and tools–over on the GIF IT UP homepage.

Questions. For questions or other inquiries, email us at info@digitalnz.org or info@dp.la, or tweet us @digitalnz or @dpla. Good luck and happy GIFing!

 All written content on this blog is made available under a Creative Commons Attribution 4.0 International License. All images found on this blog are available under the specific license(s) attributed to them, unless otherwise noted.

LITA: ADE in the Library eBook Data Lifecycle

Mon, 2014-10-13 13:53

Reader: “Hey, I heard there is some sort of problem with those ebooks I checked out from the library?”

Librarian: “There are technical problems, potential legal problems, and philosophical problems – but not with the book itself nor your choice to read it.”

As mentioned, there are (at least) three sides to the problem. Nate Hoffelder* discovered the technical problem with the way the current version (4) of Adobe Digital Editions (ADE) manages the ebook experience, which was confirmed by security researcher Benjamin Daniel Mussler, and later reviewed by Eric Hellman. The technical problem, that arguably private data is sent in plain text from a reader’s device to a central data-store, seemed pretty obvious once it was discovered. The potential legal problem stems from laws in every state that protect reader privacy and set expectations for data security, plus other laws which may apply. The philosophical problem has several facets, which could be simplified down to the tension between privacy and convenience.

When a widely-used software platform is found to be logging data unexpectedly and transmitting it for some unknown use it causes great unease among users. When that transmission is happening in plain text over easily-intercepted channels, it causes anger among technologists who think a leading software developer should know better. When this is all happening in the context of the library world where privacy is highly valued, there is outrage as expressed by LITA Board member Andromeda Yelton.

Here are the library profession’s basic positions:

  1. Each individual’s reading choices and behavior should be private (i.e. anonymized or, better, not tracked)
  2. Data gathered for user-desired functionality across devices should be private (i.e. anonymized)
  3. Insofar as there is any tracking of reading choices and behavior, there should be an opt-out option readily available to individuals (i.e., not buried in the fine print)

In his October 9th post from The Digital Shift, Matt Enis reports that Adobe is working to correct the problem of data being transmitted in clear text but “maintains that its collection of this data is covered under its user agreement.” The data that corporations transmit should be limited to the data and data elements necessary to provide desired functionality yet also restricted enough for an individual’s activity to remain private.

To join the conversation, begin to educate yourself using our ADE Primer, below, plus the following resources:

A Primer on how Adobe Digital Editions (ADE) works with library ebooks

I’m a reader and I go to use a library ebook
(via Overdrive or other downloading service offered):

  1. what will I need to install on my device(s)?
    (laptop, tablet, phone, & iPod let’s assume)

    • laptop/computer: Adobe Digital Editions (ADE), activated with an Adobe ID
    • tablet, phone, iPod, etc.: Bluefire Reader (or compatible) app, activated with an Adobe ID
  2. how do the various devices know which page to show me next when I switch between them?
    • access and synchronization across devices are managed using the Adobe ID and the information associated with the ebook and by data tracked with ADE
  3. what technologies are behind the scenes?
    • the ADE managed digital rights management (DRM) required by the ebook publisher
    • the ebook reader software/app
    • the internet
  4. what data is needed to be able to do the sync?
    • the minimum required data is arguably the UserID, BookID, and a page-accessed timestamp
    • the current ADE version, ADE4, tracks significantly more data than the minimums above
  5. how is that data shared between devices?
    • Users can access their ADE account from up to 6 different devices. When accessing the ID/account from a new device the user must “activate” the device by logging into the Adobe ID/Account to prove that the user is the legitimate account holder.
    • ADE4 shares all ebook data it tracks in plain-text in an unsecured channel over the internet
  6. what functionality would not work if this were suddenly not provided?
    • if ADE did not provide reader tracking data, each time a reader opened an ebook on a different device the reader would have to remember the page s/he was on and then navigate to that page to continue reading from where they left off
    • A computer can be anonymously activated using ADE, however this will prevent the items from being accessible from more than one computer/device. The ebooks would then be considered to be “owned” by that computer and would not be available to be accessed from other devices.
    • if ADE were completely withdrawn from availability, ebook DRM would prevent use of ADE-managed DRM-protected ebooks
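To make item 4 of the primer concrete, here is a minimal sketch of a sync record holding only the fields named there as the arguable minimum. It is not Adobe's data format or protocol; the class name, field names, sample IDs, and the explicit page number are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class PageAccess:
    """One hypothetical sync event: who, which book, which page, and when."""
    user_id: str
    book_id: str
    page: int
    accessed_at: datetime


def last_position(events, user_id, book_id):
    """Return the most recently accessed page for this reader and book, or None."""
    matching = [
        e for e in events if e.user_id == user_id and e.book_id == book_id
    ]
    if not matching:
        return None
    return max(matching, key=lambda e: e.accessed_at).page


# Invented sample events from two devices belonging to the same reader.
events = [
    PageAccess("reader-1", "book-42", 12, datetime(2014, 10, 12, 9, 0, tzinfo=timezone.utc)),
    PageAccess("reader-1", "book-42", 57, datetime(2014, 10, 12, 21, 30, tzinfo=timezone.utc)),
    PageAccess("reader-1", "book-99", 3, datetime(2014, 10, 13, 8, 15, tzinfo=timezone.utc)),
]

print(last_position(events, "reader-1", "book-42"))
```

Nothing beyond these fields is needed to resume reading on a second device, which is why the primer describes anything more as exceeding the minimum.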

From a technology point of view, the clear-text data transmitted suggests the data may be for synchronization, but it seems, first and foremost, to support various licensing business models. Because Adobe might in the future have customers who want to use Adobe DRM to expire a book after a certain number of hours or pages read, they may feel the need to collect that data. Adobe’s data collection seems to be working as intended here. Clear-text transmission is clearly a bug, but that this data about patron reading habits is being transmitted to Adobe is a feature of the software.

The philosophical discussion which needs to happen around ebooks and DRM should include:

  • what data elements enable user-desired functionality
  • what data elements enable digital rights management
  • what data elements above are/are not within ALA’s stated professional ethics
  • whether tracking ebook user behavior is acceptable *at all*

From libraryland conversations around the issue so far, opinions have ranged from ‘tracking is not the problem, the clear-text transmission is‘ to ‘tracking is very much a problem, it’s unacceptable.’

Issues like this highlight the need to revisit stated positions and evaluate where the balance point is between accommodating user functionality and protecting against collection of personally identifiable data, or metadata.

*Post updated to correctly credit Nate Hoffelder as the original discoverer (my apologies!)

John Miedema: How Watson Works in Four Steps

Mon, 2014-10-13 13:34

A good overview of how IBM’s Watson works. When humans seek to understand something and to make a decision we go through four steps.

  1. Observe visible phenomena and bodies of evidence;
  2. Draw on what we know to interpret evidence and to generate hypotheses;
  3. Evaluate which hypotheses are right or wrong; and
  4. Decide the best option and act accordingly.

So does Watson. Key to the success is the ability to process unstructured inputs using Natural Language Processing.

District Dispatch: Free webinar: Helping patrons understand Ebola

Mon, 2014-10-13 06:48

Photo by Phil Moyer

Reminder: On Tuesday, October 14, 2014, library leaders from the U.S. National Library of Medicine will host the free webinar “Fighting Ebola and Infectious Diseases with Information: Resources and Search Skills can Arm Librarians.” The webinar will teach participants how to find and share reliable health information.

Recent outbreaks across the globe and in the U.S. have increased public awareness of the potential public health impacts of infectious diseases. As a result, many librarians are assisting their patrons in finding credible information sources on topics such as Ebola, Chikungunya and pandemic influenza.

Speakers include:

Siobhan Champ-Blackwell
Siobhan Champ-Blackwell is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center. She selects material to be added to the NLM disaster medicine grey literature data base and is responsible for the Center’s social media efforts. She has over 10 years of experience in providing training on NLM products and resources.

Elizabeth Norton
Elizabeth Norton is a librarian with the U.S. National Library of Medicine Disaster Information Management Research Center where she has been working to improve online access to disaster health information for the disaster medicine and public health workforce. She has presented on this topic at national and international association meetings and has provided training on disaster health information resources to first responders, educators, and librarians working with the disaster response and public health preparedness communities.

Date: Tuesday, October 14, 2014
Time: 2:00 PM – 3:00 PM Eastern
Register for the free event

If you cannot attend this live session, a recorded archive will be available to view at your convenience. To view past webinars also done in collaboration with iPAC, please visit Lib2Gov.org.

The post Free webinar: Helping patrons understand Ebola appeared first on District Dispatch.

Patrick Hochstenbach: Homework assignment #3 Sketchbookskool

Sun, 2014-10-12 07:07
This week we were asked to go to a park and draw people using line art. It was raining, so I decided to go to the Church of Our Lady, which is always a tourist attraction.

Cynthia Ng: Batch Appending a Single PDF to multiple PDFs

Sun, 2014-10-12 03:59
So recently, I came up against the problem of having to add a page at the end of multiple PDFs. A couple of years ago, I’d done some work with GhostScript to merge a bunch of PDFs, so I thought I’d start there. Use Case I have a bunch of PDFs, and what I have is […]
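The excerpt above is cut off before the author's solution, so what follows is not her method. It is a minimal sketch of one way to do the batch append, assuming Ghostscript's `gs` executable is on the PATH. The function names and directory layout are hypothetical.

```python
import subprocess
from pathlib import Path


def build_append_command(source, extra_page, output):
    """Build a Ghostscript command that writes source followed by extra_page to output."""
    return [
        "gs",
        "-dBATCH",             # exit after processing instead of waiting for input
        "-dNOPAUSE",           # do not pause between pages
        "-q",                  # suppress routine messages
        "-sDEVICE=pdfwrite",   # write a PDF
        f"-sOutputFile={output}",
        str(source),
        str(extra_page),
    ]


def append_to_all(input_dir, extra_page, output_dir):
    """Append extra_page to every PDF in input_dir, writing results to output_dir.

    Keep extra_page outside input_dir, otherwise it will be appended to itself.
    """
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    for source in sorted(Path(input_dir).glob("*.pdf")):
        cmd = build_append_command(source, extra_page, output_dir / source.name)
        subprocess.run(cmd, check=True)  # raises if Ghostscript reports an error
```

Calling `append_to_all("originals", "last-page.pdf", "combined")` would leave the originals untouched and write the extended copies to a separate directory, which makes a failed run easy to redo.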

DuraSpace News: The Archivematica + DuraCloud “Soup-to-Nuts” Preservation Service Launches a Beta Test

Sun, 2014-10-12 00:00

Winchester, MA  The Archivematica + DuraCloud hosted service has launched a beta test with pilot partners that will be ongoing from October 2014 to January 2015.

Ensuring that robust Archivematica Archival Information Packages (AIPs) have a secure long-term home is the idea behind the new Archivematica + DuraCloud hosted service. The new integrated service is designed to provide users with a robust preservation workflow plus long-term archiving in a single hosted solution.

John Miedema: Orlando: the lives and works of British women writers. Digital resources working together in unexpected and insightful ways.

Sat, 2014-10-11 19:55

Orlando is a digital resource, indexing the lives and works of British women writers.

The full name of the project is Orlando: Women’s Writing in the British Isles from the Beginnings to the Present. It is the work of scholars Susan Brown, Patricia Clements, and Isobel Grundy. The name of the work was inspired by Virginia Woolf’s 1928 novel, Orlando: A Biography. The project, like the novel, is an important resource in the history of women’s writing. It grew out of the limitations of a print-based publication, The Feminist Companion to Literature in English. The Companion presented a great deal of research on women writers but lacked an adequate index. The researchers decided to compile a digital index.

I have the good fortune to work with Susan Brown and the Orlando resource. I have extracted bibliographic and literary data from Orlando, and intend to integrate it with unstructured literary content using Natural Language Processing. The aim is a first demonstration of how digital resources like Orlando can provide new ways of reading and understanding literature. In particular I hope to show how digital resources can work together in unexpected and insightful ways.

More information:

The Orlando Project

Bigold, Melanie (2013) “Orlando: Women’s Writing in the British Isles from the Beginnings to the Present, edited by Susan Brown, Patricia Clements, and Isobel Grundy,” ABO: Interactive Journal for Women in the Arts, 1640-1830: Vol. 3: Iss. 1, Article 8.
DOI: http://dx.doi.org/10.5038/2157-7129.3.1.8
Available at: http://scholarcommons.usf.edu/abo/vol3/iss1/8

Orlando: A Biography. Wikipedia

Patrick Hochstenbach: My first VideoScribe project

Sat, 2014-10-11 07:11
Trying out a little animation with VideoScribe to give an introduction to the services of Ghent University Library. The illustrations were created on paper using a fineliner. I scanned them and vector traced them in Adobe Illustrator (VideoScribe need to

FOSS4Lib Upcoming Events: Code3cme

Sat, 2014-10-11 04:53
Date: Saturday, October 11, 2014 - 00:45 to Sunday, October 11, 2015 - 00:45
Supports: DMP Online

Last updated October 11, 2014. Created by bunnychris on October 11, 2014.

Get enrolled for the refresher courses for a great medical career. To know more about the site click here .

LITA: Shifting & Merging

Sat, 2014-10-11 00:39
McKenzie Pass, Ore. Courtesy of Ryan Shattuck. Task Easy Blog 2013.

It has been exactly seven weeks since I moved to Bloomington, Indiana, yet I finally feel like I have arrived. Let me rewind, quick, and tell you a little about my background. During my last two years of undergrad at the University of Nebraska-Lincoln (UNL), I spent my time working on as many Digital Humanities (DH) projects and jobs as I possibly could in the Center for Digital Research in the Humanities.

[DH is a difficult concept to define because everyone does it through various means, for various reasons. To me, it means using computational tools to analyze or build humanities projects. This way, we can find patterns we wouldn't see with the naked eye, or display physical objects digitally for greater access.]

By day, I studied English and Computer Science, and by night, my fingers scurried over my keyboard encoding poems, letters, and aphorisms. I worked at the Walt Whitman Archive, on an image analysis project with two brilliant professors, on text analysis and digital archives projects with leading professors in the fields, and on my own little project analyzing a historical newspaper. My classmates and I, both undergraduate and graduate, constantly talked about DH, what it is, who does it, how it is done, the technologies we use to do it and how that differs from others.

Discovering an existing group of people already doing the same work you do is like merging onto a packed interstate where everyone is travelling at 80 miles per hour in the same direction. The thrill, the overwhelming “I know I am in the right place” feeling.

I chose Indiana University (IU) for my Library and Information Science degrees because I knew it was a hub for DH projects. I have an unparalleled opportunity working with Dr. John Walsh and Dr. Noriko Hara, both prominent DH and Information Science scholars.

However, I am impatient. After travelling on the DH interstate, I expected every classmate I met at IU to wear a button proclaiming, “I heart DH, let’s collaborate.” I half expected my courses to start from where I left off in my previous education. The beginning of the semester forced me to take a step back, to realize that I was shifting to a new discipline, and that I needed the basics first. My classes are satisfying my library love, but I was still missing that extra-curricular technology aspect, outside of my work for Dr. Walsh.

Then, one random, serendipitous meeting in the library and I was “zero to eighty” instantly. I met those DH students and learned about projects, initiatives, and IU networking. They reaffirmed that the community for which I was searching existed.

Since then, I have found others in the community and continue those same DH who, what, how, why conversations. While individual research is important, we can reach a higher potential through collaboration, especially in the digital disciplines. I am continuing to learn the importance of reaching out and learning from others, which I don’t believe will cease once I graduate. (Will it?)

I assure you that my future posts will be more closely related to library technology and digital humanities tools, but frankly, I’m new here. While I could talk about the library and information theory I’m learning, I will spare you those library school memories, and keep you updated on new technologies as I learn them.

In the meantime, I’ll ask you to reflect and share your experience transitioning to library school or into a library career. How were you first introduced to library technology or digital humanities? Any nuggets of advice for us beginners?

LITA: 2014 LITA Forum: 3 Amazing Keynotes

Fri, 2014-10-10 17:10

Join your LITA colleagues in Albuquerque, Nov 5-8, 2014 for the 2014 LITA Forum.

This year’s Forum has three amazing keynotes you won’t want to miss:

AnnMarie Thomas, Engineering Professor, University of St. Thomas

AnnMarie is an engineering professor who spends her time trying to encourage the next generation of makers and engineers. Among a host of other activities she is the director of the Playful Learning Lab and leads a team of students looking at both the playful side of engineering (squishy circuits for students, the science of circus, toy design) and ways to use engineering design to help others. AnnMarie and her students developed Squishy Circuits.

Check out AnnMarie’s fun TED Talk on Play-Doh based squishy circuits.

Lorcan Dempsey, Vice President, OCLC Research and Chief Strategist

Lorcan Dempsey oversees the research division and participates in planning at OCLC. He is a librarian who has worked for library and educational organizations in Ireland, England and the US.

Lorcan has policy, research and service development experience, mostly in the area of networked information and digital libraries. He writes and speaks extensively, and can be followed on the web at Lorcan Dempsey’s weblog and on twitter.

Kortney Ryan Ziegler, Founder Trans*h4ck

Kortney Ryan Ziegler is an Oakland-based, award-winning artist, writer, and the first person to hold the Ph.D. of African American Studies from Northwestern University.

He is the director of the multiple award winning documentary, STILL BLACK: a portrait of black transmen, runs the GLAAD Media Award nominated blog, blac (k) ademic, and was recently named one of the Top 40 Under 40 LGBT activists by The Advocate Magazine and one of the most influential African Americans by TheRoot100.

Dr. Ziegler is also the founder of Trans*H4CK–the only tech event of its kind that spotlights trans* created technology, trans* entrepreneurs and trans* led startups.

See all the keynoters’ full bios at the LITA Forum Keynote Sessions web page

More than 30 concurrent colleague inspired sessions and a dozen poster sessions will provide a wealth of practical information on a wide range of topics. Networking opportunities, a major advantage of a smaller conference, are an important part of the Forum. Take advantage of the Thursday evening reception and sponsor showcase, the Friday networking dinners or Kitchen Table Conversations, plus meals and breaks throughout the Forum to get to know LITA leaders, Forum speakers, sponsors, and peers.

This year two preconference workshops will also be offered.

Linked Data for Libraries: How libraries can make use of Linked Open Data to share information about library resources and to improve discovery, access, and understanding for library users
Led by: Dean B. Krafft and Jon Corson-Rikert, Cornell University Library

Learn Python by Playing with Library Data
Led by: Francis Kayiwa, Kayiwa Consulting

2014 LITA Forum sponsors include EBSCO, Springshare, @mire, Innovative and OCLC.

Visit the LITA website for more information.

Library and Information Technology Association (LITA) members are information technology professionals dedicated to educating, serving, and reaching out to the entire library and information community.   LITA is a division of the American Library Association.

LITA and the LITA Forum fully support the Statement of Appropriate Conduct at ALA Conferences

OCLC Dev Network: WorldCat Discovery API and Linked Data

Fri, 2014-10-10 14:00

This is the second post in our series introducing the WorldCat Discovery API. In our introductory remarks on the API, we told you about how the API can be used to power all aspects of resource discovery in your library. We also introduced some of the reasons why we chose entity-based bibliographic description for the API’s data serializations over more traditional API outputs. In this post we want to explore this topic even further and take a closer look at the Linked Data available in the WorldCat Discovery API.

Library of Congress: The Signal: Archiving from the Bottom Up: A Conversation with Howard Besser

Fri, 2014-10-10 13:54

Howard Besser, Professor of Cinema Studies and Director of New York University’s Moving Image Archiving & Preservation Program and Senior Scientist for Digital Library Initiatives for NYU’s Library.

The following is a guest post from Julia Fernandez, this year’s NDIIPP Junior Fellow. Julia has a background in American studies and working with folklife institutions and worked on a range of projects leading up to CurateCamp Digital Culture in July. This is part of a series of interviews Julia conducted to better understand the kinds of born-digital primary sources folklorists, and others interested in studying digital culture, are making use of for their scholarship.

Continuing our NDSA Insights interview series, I’m delighted to interview Howard Besser, Director of New York University’s Moving Image Archiving & Preservation Program (MIAP) and Professor of Cinema Studies at NYU. He is also one of the founders of Activist Archivists, a group created in the fall of 2011 to coordinate the collection of digital media relating to the Occupy Wall Street political movement.

Julia: Could you tell us a bit about Activist Archivists?  What are the group’s objectives? What kinds of digital media are you exploring?

Howard: Activist Archivists began with the question of how archivists could help assure that digital media documenting the “Occupy” movement could be made to persist. This led us into a variety of interesting sub-areas: getting individuals making recordings to follow practices that are more archivable; documenting the corruption of metadata on YouTube and Vimeo; evangelizing for the adoption of Creative Commons licenses that would allow libraries and archives to collect and make available content created by an individual; making documenters aware that the material they create could be used against their friends; and a host of other sub-areas.

We focused mainly on moving images and sound, and to a lesser degree on still images. As the Occupy movement began to dissipate, Activist Archivists morphed into a focus on community archiving that might be analog, digital or a hybrid. We worked with Third World Newsreel and Interference Archive and in 2014 produced the first Home Video Day in association with members of the NYC Asian American community and Downtown Community Television. And several Activist Archivists members are on the planning committee for the 2015 Personal Digital Archiving Conference.

Julia: Could you tell us a bit about the digital materials you are working from? What made them an interesting source for you?

Peoples Library Occupy Wall Street 2011 Shankbone, shared by user David Shankbone on Flickr.

Howard: Working with Occupy, we were mainly dealing with sound and images recorded on cellphones. This was particularly interesting because of the lack of prior knowledge in the library/archiving community about how to employ the wealth of metadata that cellphones captured while recording images and sound. For example, it’s very easy to set a cellphone to capture geolocation information as part of the metadata coupled to every image or sound that is recorded. And this, of course, can raise privacy issues because a corpus of photos one takes creates an exact path of places that one has been. The other thing that made this project particularly interesting to me was how social media sites such as YouTube strip away so much metadata (including much that could be useful to archives and scholars).
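The privacy point about geolocation can be illustrated with a minimal sketch. Given per-photo metadata, sorting the geotagged records by capture time is enough to recover a path of places visited. The record layout, file names, timestamps, and coordinates below are invented for illustration and do not come from any Occupy collection.

```python
from datetime import datetime

# Invented sample metadata; the third record has had its geotag stripped.
photos = [
    {"file": "IMG_0003.jpg", "taken": "2011-10-01T15:20:00", "lat": 40.7127, "lon": -74.0059},
    {"file": "IMG_0001.jpg", "taken": "2011-10-01T12:05:00", "lat": 40.7093, "lon": -74.0110},
    {"file": "IMG_0002.jpg", "taken": "2011-10-01T13:40:00", "lat": None, "lon": None},
]


def reconstruct_path(records):
    """Return (timestamp, lat, lon) tuples for geotagged records, in time order."""
    located = [
        r for r in records
        if r.get("lat") is not None and r.get("lon") is not None
    ]
    located.sort(key=lambda r: datetime.fromisoformat(r["taken"]))
    return [(r["taken"], r["lat"], r["lon"]) for r in located]


for stop in reconstruct_path(photos):
    print(stop)
```

The record with no coordinates drops out of the path entirely, which is also what happens to an archive's view of a photo once a hosting site has stripped that metadata.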

Julia: What are some of the challenges of working with a “leaderless” and anti-establishment movement like Occupy?

Howard: It’s always difficult for people who have spent most of their lives in hierarchical environments to adapt to a bottom-up (instead of a top-down) structure. It means that each individual needs to take on more initiative and responsibility, and usually ends up with individuals becoming more intensively involved, and feeling like they have more of a stake in the issues and the work. I think that the toughest challenge that we experienced was that each time we met with an Occupy Committee or Group, we needed to start re-explaining things from scratch. Because each new meeting always included people who had not attended the previous meeting, we always had to start from the beginning. Other major problems we faced would always be true in all but the most severe hierarchical organizations: how do you get everyone in the organization to adopt standards or follow guidelines? This is an age-old problem that is seldom solved merely by orders from above.

Julia: Activist Archivists has printed a “Why Archive?” informational card that spells out the importance of groups taking responsibility for the record of their activity.  If libraries and archives wanted to encourage a more participatory mode of object and metadata gathering, what would you suggest? What would you want to see in how libraries and archives provide access to them?

Howard: One of the earliest issues we encountered with Occupy was the prevalent notion that history is documented in book-length essays about famous people. Many people in Occupy could not see that someone in the future might be interested in the actions of an ordinary person like them. Now, a third of a century after Howard Zinn’s “A People’s History Of The United States,” most progressive historians believe that history is made by ordinary individuals coming together to conduct acts in groups. And they believe that we can read history through archival collections of letters, post-cards and snapshots. Librarians, archivists and historians need to make the case to ordinary people that their emails, blogs and Flickr and Facebook postings are indeed important representations of early 21st century life that people in the future will want to access. And as librarians and archivists, we need to be aggressive about collecting these types of material and make concrete plans for access.

Julia: In a recent NDSA talk (PDF) you identified some of the challenges of archiving correspondence in the digital age. For one, “digital info requires a whole infrastructure to view it” and “each piece of that infrastructure is changing at an incredibly rapid rate”; and also “people no longer store their digital works in places over which they have absolute control,” opting instead for email services, cloud storage or social network services. What are some effective approaches you’ve seen to dealing with these challenges?

Howard: Only institutions that themselves are sustainable across centuries can commit to the types of continuous refreshing and either migration or emulation that are necessary to preserve born-digital works over time. Libraries, archives and museums are about the only long-term organizations that have preservation as one of their core missions, so effective long-term digital preservation is likely to only happen in these types of institutions. The critical issue is for these cultural institutions to get the born-digital personal collections of individuals into preservable shape (through file formats and metadata) early in the life-cycle of these works.

As we found in both the InterPARES II Project and the NDIIPP Preserving Digital Public Television Project (PDF), waiting until a digital collection is turned over to an archive (usually near the end of its life-cycle) is often too late to do adequate preservation (and even more difficult if the creator is dead). We either need to get creators to follow good practices (file formats, metadata, file-naming conventions, no compression, executing Creative Commons licenses, …) at the point of creation, or we need to get the creators to turn over their content to us shortly after creation. So we need to be aggressive about both offering training and guidelines and about collection development.

Updated 10/10/14 for typos.

Open Knowledge Foundation: Open Humanities Hack: 28 November 2014, London

Fri, 2014-10-10 13:42

This is a cross-post from the DM2E-blog, see the original here

On Friday 28 November 2014 the second Open Humanities Hack event will take place at King’s College, London. This is the second in a series of events organised jointly by the King’s College London Department of Digital Humanities, the Digitised Manuscripts to Europeana (DM2E) project, the Open Knowledge Foundation and the Open Humanities Working Group.

The event is focused on digital humanists and intended to target research-driven experimentation with existing humanities data sets. One of the most exciting recent developments in digital humanities is the investigation and analysis of complex data sets that require close collaboration between humanities and computing researchers. The aim of the hack day is not to produce complete applications but to experiment with methods and technologies to investigate these data sets so that at the end we can have an understanding of the types of novel techniques that are emerging.

Possible themes include but are not limited to:

  • Research in textual annotation has been a particular strength of digital humanities. Where are the next frontiers? How can we bring together insights from other fields and digital humanities?

  • How do we link and share humanities data in a way that makes sense of its complex structure, with many internal relationships both structural and semantic? In particular, distributed humanities research data often includes digital material combining objects in multiple media, and in addition there is a diversity of standards for describing the data.

  • Visualisation. How do we develop reasonable visualisations that are practical and help build an overall intuition for the underlying humanities data set?

  • How can we advance the novel humanities technique of network analysis to describe complex relationships of ‘things’ in social-historical systems: people, places, etc.?

With this hack day we seek to form groups of computing and humanities researchers that will work together to come up with small-scale prototypes that showcase new and novel ways of working with humanities data.

Date: Friday 28 November 2014
Time: 9.00 – 21.00
Location: King’s College, Strand, London
Sign up: Attendance is free but places are limited: please fill in the sign-up form to register.

For an impression of the first Humanities Hack event, please check this blog report.

Open Knowledge Foundation: This Index is yours!

Thu, 2014-10-09 20:23

How is your country doing with open data? You can make a difference in 5 easy steps to track 10 different datasets. Or, you can help us spread the word on how to contribute to the Open Data Index. This includes the very important translation of some key items into your local language. We’ll keep providing you week-by-week updates on the status of the community-driven project.

We’ve got a demo and some shareable slides to help you on your Index path.

Priority country help wanted

The amazing community provided content for over 70 countries last year. This year we set the bar higher with a goal of 100 countries. If you added details for your country last year, please be sure to add any updates this year. Also, we need some help. Are you from one of these countries? Do you have someone in your network who could potentially help? Please do put them in touch with the index team – index at okfn dot org.

DATASETS WANTED: Armenia, Bolivia, Georgia, Guyana, Haiti, Kosovo, Moldova, Morocco, Nicaragua, Ukraine, and Yemen.

Video: Demo and Tips for contributing to the Open Data Index

This is a 40-minute video with details about the Open Data Index, including a demo that shows you how to add datasets.

Text: Tutorial on How to help build the Open Data Index

We would encourage you to download this, make changes (add country specific details), translate and share back. Please simply share on the Open Data Census Mailing List or Tweet us @okfn.

How to Global Open Data Index – Overview from School of Data

Thanks again for sharing widely!
