Feed aggregator

FOSS4Lib Recent Releases: Archivematica - 1.5.0

planet code4lib - Thu, 2016-06-09 22:36

Last updated June 9, 2016. Created by Peter Murray on June 9, 2016.

Package: Archivematica
Release Date: Thursday, June 9, 2016

SearchHub: Lucidworks Fusion 2.4 Ready For Download

planet code4lib - Thu, 2016-06-09 19:35

Lucidworks is pleased to announce the release of Fusion 2.4 (download, release notes, press release). This new release features several key enhancements allowing for the rapid building and deployment of data-driven experiences.

Index Pipeline Simulator

The Index Pipeline Simulator provides a powerful interface for configuring index pipelines and previewing pipeline output with a sample data set before they are applied to the entire data source. This allows for easy debugging of index pipeline output in a sandbox environment.

Time Series Partitioning

To allow for easy management and querying of time-series data, Fusion collections can now be configured by time window. Using a configurable set of Solr collections, each time series collection stores data for that given time window. Time based queries are automatically directed to the appropriate partition.
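The announcement doesn't spell out the mechanics, so here is a minimal, generic sketch of how time-window routing works in principle. It is not Fusion's implementation; the daily window and collection-naming scheme are hypothetical.

```python
from datetime import datetime, timedelta, timezone

def partitions_for_range(base, start, end):
    """Return the daily partition collections a time-bounded query must hit.

    Hypothetical naming scheme: one Solr collection per day, e.g. "logs_2016_06_09".
    A real Fusion deployment configures its own window size and naming.
    """
    names = []
    day = start.date()
    while day <= end.date():
        names.append("{}_{:%Y_%m_%d}".format(base, day))
        day += timedelta(days=1)
    return names

# A query spanning two days fans out to exactly two partitions; every other
# collection can be skipped entirely.
start = datetime(2016, 6, 8, 12, 0, tzinfo=timezone.utc)
end = datetime(2016, 6, 9, 18, 0, tzinfo=timezone.utc)
print(partitions_for_range("logs", start, end))
# ['logs_2016_06_08', 'logs_2016_06_09']
```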

SAML Support

Fusion now supports version 2.0 of the Security Assertion Markup Language (SAML), allowing businesses to use existing authentication identities for more finely-tuned and flexible access control.

Spark Integration Updates

Our new Spark Jobs API allows for the management and configuration of Spark jobs from Fusion, as well as retrieving cluster information.

Connector Enhancements

The Box.com connector now indexes metadata and supports authentication using OAuth 2, via the JWT Auth App Service.

The Jive connector now supports indexing of Jive groups and places.

All of the above comes along with faster pipeline stage processing and improved diagnostics for investigating deployment issues. Fusion 2.4 ships with Apache Solr 5.5.1 and Apache Spark 1.6.1, and is fully supported for production deployments.

Lucidworks Fusion 2.4 is available today. For more information and to download the product, please visit https://lucidworks.com/products/fusion/.

Release notes and documentation are also available.

The post Lucidworks Fusion 2.4 Ready For Download appeared first on Lucidworks.com.

District Dispatch: Good news for library funding

planet code4lib - Thu, 2016-06-09 18:35

Credit: Rocky Lubbers

The Senate Appropriations Committee today delivered good news for libraries by increasing funding for LSTA Grants to States and National Leadership Grants to Libraries, while also providing level funding for Innovative Approaches to Literacy (IAL). The Labor, Health and Human Services, Education and Related Agencies Appropriations Subcommittee approved the bill just two days ago with no amendments or controversial policy riders.

The Grants to States program, which the President’s budget proposed cutting by $950,000, will instead be increased in the Senate bill by $314,000, raising its total funding to $156.1 million for FY2017. That reflects an increase of over $1.25 million from the President’s request. National Leadership Grants will also receive a $314,000 increase, bringing its total to $13.4 million. Overall, the Institute of Museum and Library Services (IMLS) will receive a $1 million increase to $231 million for FY2017.

Innovative Approaches to Literacy, just authorized in last year’s Every Student Succeeds Act (ESSA), will receive level funding in the Senate bill of $27 million for FY2017. One half of IAL funding is reserved for school library grants with the remaining reserved for non-profits.

ALA acknowledges the leadership of Senator Jack Reed (D-RI), and the deep commitment to library funding of many other key Senators, including Appropriations Committee Chairman Thad Cochran (R-MS), Subcommittee Chairman Roy Blunt (R-MO), Subcommittee Ranking Member Patty Murray (D-WA) and Senator Susan Collins (R-ME). ALA members from Maine, Mississippi, Missouri, Rhode Island and Washington are urged to send messages of thanks to these Senate offices.

The House Appropriations Committee has not yet announced a timetable for moving its Labor, Health and Human Services, Education and Related Agencies FY2017 funding bill. Despite the “no drama” Senate Subcommittee’s markup earlier this week, the overall Appropriations outlook remains very much in doubt. Few Washington insiders are expecting all 12 appropriations bills to pass the House and Senate. Rather, many are expecting one or more “Continuing Resolutions” to keep the government open beyond the October 1 start of the Fiscal Year. A messy “omnibus” spending package providing funding for numerous agencies also is expected to be considered later this year. A government shutdown, however, is not anticipated.

The post Good news for library funding appeared first on District Dispatch.

District Dispatch: What’s working to connect all Americans to the digital world?

planet code4lib - Thu, 2016-06-09 18:34

Libraries are working with government agencies and nonprofits to connect people to the digital world. From the U.S. Department of Housing & Urban Development’s ConnectHome effort to the Federal Communications Commission’s Lifeline Program to citywide digital inclusion initiatives, libraries are playing leadership roles in connecting low-income Americans online. Policy and library leaders will discuss public policy options and share exemplars of how libraries and allies are expanding digital opportunities at the 2016 American Library Association (ALA) Annual Conference.

Photo by: GSCSNJ

During the conference session “Addressing Digital Disconnect for Low-Income Americans,” leaders will explore efforts to connect disadvantaged Americans to the digital world. The session takes place on Saturday, June 25, 2016, 4:30-5:30 p.m. in the Orange County Convention Center in Room W103A.

Session speakers include Veronica Creech, chief programs officer of EveryoneOn; Felton Thomas, director of the Cleveland Public Library and president-elect of the Public Library Association (PLA); and Lauren Wilson, legal advisor to the Chief of the Consumer and Governmental Affairs Bureau at the Federal Communications Commission (FCC). Larra Clark, deputy director of the ALA Office for Information Technology Policy, will moderate the program.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post What’s working to connect all Americans to the digital world? appeared first on District Dispatch.

District Dispatch: Dr. Hayden one vote away from confirmation

planet code4lib - Thu, 2016-06-09 18:20

The Senate Rules Committee voted unanimously this afternoon to recommend that the full Senate approve the nomination of Dr. Carla Hayden to serve as the nation’s next Librarian of Congress; she would be the first woman, the first African American and just the fourteenth person in history to hold the post. As the Committee’s vote was announced, ALA launched a large-scale grassroots and social media campaign to encourage all Senators to support her confirmation, and to urge Senate Majority Leader Mitch McConnell (R-KY) to schedule a Senate vote on her nomination immediately.

Photo Credit: Dave Munch/Baltimore Sun Media Group, via Associated Press

In a statement released immediately after the Committee’s vote, ALA president Sari Feldman said: “Once confirmed, she will be the perfect Librarian to pilot the Library of Congress fully into the 21st century, transforming it again into the social and cultural engine of progress and democracy for all Americans that it was meant to be.” Feldman then called upon Dr. Hayden’s supporters “in every corner of the nation” to use “ALA’s Legislative Action Center to contact every Senator — whether by email, tweet or phone — with this simple message:  Please confirm Dr. Carla Hayden now!”  Given the Rules Committee’s strong endorsement, and the absence of any public opposition to her nomination, that vote easily could come before the Senate takes its extended summer recess in mid-July, and quite possibly before the fast-approaching Independence Day recess beginning July 1st. That means there’s no time to lose to show your support for librarianship and Dr. Hayden.

Fortunately, contacting your two U.S. Senators by emailing, tweeting or phoning them couldn’t be easier. Just access ALA’s Legislative Action Center, choose your preferred method of communicating, and follow the few easy prompts. (You’ll also find more background on Dr. Hayden and the history of the Librarian’s position at the Action Center, and here, if you like.)

Once confirmed, Dr. Hayden — a past-President of ALA — will be the first professional librarian to be named Librarian of Congress in over 60 years. Don’t hide your pride!  Please, take action now — and encourage your friends and colleagues to do the same — to make that a historic reality very soon.

The post Dr. Hayden one vote away from confirmation appeared first on District Dispatch.

Islandora: New Islandora CLAW Sprint - Sign Up Now!

planet code4lib - Thu, 2016-06-09 17:50

Islandora CLAW Community Sprint 08 is coming up at the end of the month, and we want you to join in. New(ish) sprinter Ben Rosner gave us an inside look at what it's like to start working on CLAW as a developer and we have plenty of tasks that an interested newcomer can tackle for Sprint 08. Not a developer? No problem. We've got an extensive documentation ticket to build up written docs from our existing CLAW lessons videos - a great opportunity to learn more while you're creating something that will help other Islandora users.

The sign-up sheet is here. We'll have a sprint kick-off meeting on June 20 to sort out who is going to do what, and find everyone a job that fits their skills and interests.

Harvard Library Innovation Lab: Adam talks LIL on the Lawyerist Podcast

planet code4lib - Thu, 2016-06-09 16:40

This May, Managing Director Adam Ziegler was a guest on the Lawyerist podcast, discussing recent goings-on at the Library Innovation Lab.

Sam Glover and Adam discuss the future of law, its challenges and how the Innovation Lab endeavors to address these. Perma.cc is chiefly discussed, along with H2O and the Free the Law project.

Listen here!

The Lawyerist Podcast is a weekly show about lawyering and law practice hosted by Sam Glover and Aaron Street.


Andromeda Yelton: what I learned about leadership from the Emerging Leaders

planet code4lib - Thu, 2016-06-09 15:26

About five and a half years ago, I was sitting in a big room in conventionland (San Diego, but who’s counting) with my class of Emerging Leaders, as we brainstormed about the qualities of an excellent leader.

Someone was writing those qualities up on a flip chart and, gosh, would I have liked to work for flip chart lady. She was so perceptive and thoughtful and strategic and empathetic and not bad at anything and just great. Way cooler than me. Everyone would like to work for flip chart lady.

And then one of my brainstorming colleagues said, you know, there’s one quality we haven’t put up there, because it’s not actually a core competency for leaders, and that’s intelligence. And the room nodded in agreement, because she was right. You probably can’t be an effective leader if you’re genuinely dumb, but all other things being equal, being smarter doesn’t actually make you a better leader. And we’ve all met really smart people who were disastrous leaders; intelligence alone simply does not confer the needed skills. Fundamentally, if “leader” were a D&D class, its prime requisite would not be INT.

The whole room nodded along with her while I thought, well crap, that’s the only thing I’ve always been good at.

So I was in a funk for a while, mulling that over. And eventually decided, well, people I respect put me in this room; I’m not going to tell them they’re wrong. I’m going to find a way to make it work. I’m going to look for the situations where the skills I have can make a difference, where my weaknesses don’t count against me too much. There’s not a shortage of situations in the world that need more leadership; I’ll just have to look for the ones where the leader that’s needed can be me. They won’t be the same situations where the people to my left and right will shine, and that’s okay. And if I’m not flip chart lady, if I’m missing half her strengths and I’m littered with weaknesses she doesn’t have (because she doesn’t have any)…well, as it turns out, no one is flip chart lady. We all have weaknesses. We are all somehow, if we’re leading interesting lives at all, inadequate to the tasks we set ourselves, and perhaps leadership consists largely in rising to those tasks nonetheless.

So here I am, five and a half years later, awed and humbled to be the LITA Vice-President elect. With a spreadsheet open where I’m sketching out at the Board’s request a two-year plan for the whole association, because if intelligence is the one thing you’ve always been good at, and the thing that’s needed is assimilating years’ worth of data about people and budgets and goals and strengths and weaknesses and opportunities, and transmuting that into something coherent and actionable…

Well hey. Maybe that’ll do.

Thanks for giving me the chance, everybody. I couldn’t possibly be more excited to serve such a thoughtful, creative, smart, motivated, fun, kind bunch of people. To figure out how LITA can honor your efforts and magnify your work as, together, we take a national association with near fifty years of history into its next fifty years. I can’t be flip chart lady for you (no one can), but I am spreadsheet lady, and I’m here for you. Let’s rock.


Galen Charlton: Code4Lib and the “open source way”

planet code4lib - Thu, 2016-06-09 02:21

The question of what Code4Lib wants to be when it grows up seems to be perennial, and the latest iteration of the discussion is upon us. Quoting Christina Salazar:

… I really do think it’s time to reopen the question of formalizing Code4Lib IF ONLY FOR THE PURPOSES OF BEING THE FIDUCIARY AGENT for the annual conference.

I agree — we need to discuss this. The annual main conference has grown from a hundred or so in 2006 to 440 in 2016. Given the notorious rush of folks racing to register to attend each fall, it is not unreasonable to think that a conference in the right location that offered 750 seats — or even 1,000 — would still sell out. There are also over a dozen regional Code4Lib groups that have held events over the years.

With more attendees comes greater responsibilities — and greater financial commitments. Furthermore, over the years the bar has (appropriately) been raised on what is counted as the minimum responsibilities of the conference organizers. It is no longer enough to arrange to keep the bandwidth high, the latency low, and the beer flowing. A conference host that does not consider accessibility and representation is not living up to what Code4Lib qua group of thoughtful GLAM tech people should be; a host that does not take attendee safety and the code of conduct seriously is being dangerously irresponsible.

Running a conference or meetup that’s larger than what can fit in your employer’s conference room takes money — and the costs scale faster than linearly. For recent Code4Lib conferences, the budgets have been in the low-to-middle six figures.

That’s a lot of money — and a lot of antacids consumed until the hotel and/or convention center minimums are met. The Code4Lib community has been incredibly lucky that a number of people have voluntarily chosen to take this stress on — and that a number of institutions have chosen to act as fiscal hosts and incur the risk of large payouts if a conference were to collapse.

To disclose: I am a member of the committee that worked on the erstwhile bid to host the 2017 conference in Chattanooga. I think we made the right decision to suspend our work; circumstances are such that many attendees would be faced with the prospect of traveling to a state whose legislature is actively trying to make it more dangerous to be there.

However, the question of building or finding a long-term fiscal host for the annual Code4Lib conference must be considered separately from the fate of the 2017 Chattanooga bid. Indeed, it should have been discussed before conference hosts found themselves transferring five-figure sums to the next year’s host.

Of course, one option is to scale back and cease attempting to organize a big international conference unless some big-enough institution happens to have the itch to backstop one. There is a lot of life in the regional meetings, and, of course, many, many people who will never get funding to attend a national conference but who could attend a regional one.

But I find stepping back like that unsatisfying. Collectively, the Code4Lib community has built an annual tradition of excellent conferences. Furthermore, those conferences have gotten better (and bigger) over the years without losing one of the essences of Code4Lib: that any person who cares to share something neat about GLAM technology can have the respectful attention of their peers. In fact, the Code4Lib community has gotten better — by doing a lot of hard work — about truly meaning “any person.”

Is Code4Lib a “do-ocracy”? Loaded question, that. But this go around, there seems to be a number of people who are interested in doing something to keep the conference going in the long run. I feel we should not let vague concerns about “too much formality” or (gasp! horrors!) “too much library organization” stop the folks who are interested from making a serious go of it.

We may find out that forming a new non-profit is too much uncompensated effort. We may find out that we can’t find a suitable umbrella organization to join. Or we may find out that we can keep the conference going on a sounder fiscal basis by doing the leg-work — and thereby free up some people’s time to hack on cool stuff without having to pop a bunch of Maalox every winter.

But there’s one argument against “formalizing” in particular that I object to. Quoting Eric Lease Morgan:

In the spirit of open source software and open access publishing, I suggest we earnestly try to practice DIY — do it yourself — before other types of formalization be put into place.

In the spirit of open source? OK, clearly that means that we should immediately form a non-profit foundation that can sustain nearly USD 16 million in annual expenses. Too ambitious?  Let’s settle for just about a million in annual expenses.

I’m not, of course, seriously suggesting that Code4Lib aim to form a foundation that’s remotely in the same league as the Apache Software Foundation or the Mozilla Foundation. Nor do I think Code4Lib needs to become another LITA — we’ve already got one of those (though I am proud, and privileged, to count myself a member of both).  For that matter, I do think it is possible for a project or group effort to prematurely spend too much time adopting the trappings of formal organizational structure and thus forget to actually do something.

But the sort of “DIY” (and have fun unpacking that!) mode that Morgan is suggesting is not the only viable method of “open source” organization. Sometimes open source projects get bigger. When that happens, the organizational structure always changes; it’s better if that change is done openly.

The Code4Lib community doesn’t have to grow larger; it doesn’t have to keep running a big annual conference. But if we do choose to do that — let’s do it right.

Library of Congress: The Signal: The Workflow of the American Folklife Center Digital Collections

planet code4lib - Wed, 2016-06-08 19:43

This is a guest post by Julia Kim, archivist in the American Folklife Center at the Library of Congress.

Julia Kim. Photo by Alan Barnett.

The American Folklife Center just celebrated 40 years since it was founded by Congressional mandate. But its origins far predate 1976; its earlier incarnation was the Archive of Folk Song, which was founded in 1928 and was part of the Library’s Music Division.

Its collections included many early analog audio recordings, like the Alan Lomax Collection and the Federal Cylinder Project’s Native and Indigenous American recordings. [See also the CulturalSurvival.org story.]

While the Library is well known for its work with different tools, guidelines and recommendations, less is known about its systems and workflows. I’ve been asked about my work in these areas and though I’ve only been on staff a relatively short while, I’d like to share a little about digital preservation at AFC.

As part of the Nation’s Library, AFC has a mandate to collect in the areas of “traditional expressive culture.” Of its digital collections, AFC maintains ongoing preservation of 200 TB of content but we project a 50% increase of approximately 100 TB of newly digitized or born-digital content this year. In our last fiscal year, the department’s acquisitions were 96% digital, spanning over 45 collections. StoryCorps’ 2015 accessions alone amounted to approximately 50,000 files (8 TB).

It has been a tremendous challenge to understand AFC’s past strategies with an already mature — but largely dark — repository, as well as how to refine them with incoming content. We have not yet had to systemically migrate large quantities of born-digital files but preserving the previously accessioned collections is a major challenge. More often than not, AFC processors apply the terms migration and remediation  to older servers and databases rather than to files. This is an inevitable result of the growing maturity of our digital collections as well as others within the field of Digital Preservation.

The increasing amount of digital content also means that instead of relegating workflows to a single technical person (me), digital content is now handled by most of the archival processors in the division. AFC staff now regularly use a command line interface and understand how to navigate our digital repository. This is no small feat.

Similarly, staff training in core concepts is also ongoing. A common misconception is that ingest is a singular action when, in its fullest definition, it’s an abstraction that encompasses many actions, actors and systems. Ingest is one of the core functions in the OAIS framework. The Digital Preservation Coalition defines ingest as “the process of turning a Submission Information Package into an Archival Information Package, i.e. putting data into a digital archive.” Ingest, especially in this latter definition, can be contingent on relationships and agreements with external vendors, as well as arrangements with developers, project managers, processing staff and curators.

Transferring content is a major function of ingest and it is crucial to ensure that the many preservation actions down the line are done on authentic files. While transferring content involves taking an object into a digital repository and may seem to be a singular, discrete process, the transfer can involve many processes taking place over multiple systems by many different actors.

The flexibility inherent throughout the OAIS model requires systematic and clear definitions and documentation to be of any real use. This underscores the need for file verification and creating hash values at the earliest opportunity, as there is no technical ability to guarantee authenticity without receiving a checksum at production.
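As a minimal illustration of that principle (standard-library Python, not AFC's actual tooling), a manifest of hashes can be recorded the moment content arrives and re-verified after every later copy; the accession path is hypothetical.

```python
import hashlib
from pathlib import Path

def md5_of(path, chunk_size=1024 * 1024):
    """Stream a file through MD5 so large audiovisual files never sit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def make_manifest(root):
    """Hash every file under a directory at the moment of receipt."""
    root = Path(root)
    return {str(p.relative_to(root)): md5_of(p)
            for p in sorted(root.rglob("*")) if p.is_file()}

def verify(root, manifest):
    """Return relative paths whose current hash no longer matches the manifest."""
    root = Path(root)
    return [name for name, expected in manifest.items()
            if md5_of(root / name) != expected]

# Hypothetical accession directory.
manifest = make_manifest("incoming/afc_accession_001")    # at receipt
damaged = verify("incoming/afc_accession_001", manifest)  # after a copy or move
print("fixity failures:", damaged)
```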

Ingest can then include validating the SIP, implementing quality assurance measures, extracting the metadata, inputting descriptive administrative metadata, creating and validating hash values and scanning for viruses. In our case, after establishing some intellectual control, AFC copies to linear tape before doing any significant processing and then re-copies again after any necessary renaming, reorganizing and processing.

Our digital preservation ecosystem relies on many commonly used open-source tools (bwfmetaedit, mediainfo, exiftool, Bagit, JHOVE, Tesseract), but one key tool is our modular home-grown repository, our Content Transfer Services (see more about the development of CTS in this 2011 slide deck), which supports all of the Library of Congress, including the Copyright division and the Congressional Research Service.
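Since BagIt is on that list, here is a minimal sketch of the package-then-verify pattern, assuming the bagit-python library; the transfer directory and bag-info values are hypothetical, and this is illustrative rather than AFC's production code.

```python
import bagit  # assumes the bagit-python library (pip install bagit)

# At production: wrap a transfer directory in a bag. Payload checksums are
# written into manifest files that travel with the content.
bag = bagit.make_bag(
    "incoming/afc_accession_002",                      # hypothetical transfer directory
    {"Source-Organization": "Example Field Project"},  # illustrative bag-info metadata
)

# On receipt, and again after any later copy: re-open the bag and confirm that
# every payload file still matches the checksums recorded at production.
bag = bagit.Bag("incoming/afc_accession_002")
try:
    bag.validate()
    print("bag is complete and all fixity checks pass")
except bagit.BagValidationError as err:
    print("bag failed validation:", err)
```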

Screenshot of the Library of Congress Content Transfer System.

CTS is primarily an inventory and transfer system but it continues to grow in capacity and it performs many ingest procedures, including validating bags upon transfer and copy, file-type validations (JHOVE2) and — with non-Mac filesystems — virus scanning. CTS allows users to track and relate copies of content across both long-term digital linear tape as well as disk-based servers used for processing and access. It is used to inventory and control access copies on other servers and spinning disks, as well as copies on ingest-specific servers and processing servers. CTS also supports workflow specifications for online access, such as optical character recognition, assessing and verifying digitization specifications and specifying sample rates for verifying quality.

Each grouping in CTS can be tracked through a chronology of PREMIS events, its metadata and its multiple copies and types. Any PREMIS event, such as a “copy,” will automatically validate the md5 hash value associated with each file, but CTS does not automatically or cyclically re-inventory and check hash values across all collections. Curators and archivists can use CTS for single files or large nested directories of files: CTS is totally agnostic. Its only requirement is that the job must have a unique name.

CTS’s content is handled by file systems. Historically, AFC files are arranged by AFC collections in hierarchical and highly descriptive directories. These structures can indicate quality, file/content types, collection groupings and accessioning groupings. It’s not unusual, for example, for an ingested SIP directory to include as much as five directory levels with divisions based on content types. This requires specific space projections for the creation and allocation of directory structures.

Similarly, AFC relies on descriptive file-naming practices with pre-pended indications of a collection identifier — as well as other identifiers — to create, in most cases, unique IDs. CTS does not, however, require unique file names, just a unique naming of the grouping of files. CTS, then, accepts highly hierarchical sets of files and directories but is unable to work readily at the file level. It works within curated groupings of files with a reasonable limitation of no more than 5,000 files and 1 TB for each grouping.
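A minimal sketch of a pre-submission sanity check against those grouping limits; it is illustrative only, not part of CTS, and the staging path is hypothetical.

```python
from pathlib import Path

MAX_FILES = 5_000
MAX_BYTES = 1 * 1000**4  # 1 TB

def check_grouping(root):
    """Count the files and bytes under a candidate grouping and flag limit violations."""
    files = [p for p in Path(root).rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    problems = []
    if len(files) > MAX_FILES:
        problems.append("{} files exceeds the {}-file limit".format(len(files), MAX_FILES))
    if total_bytes > MAX_BYTES:
        problems.append("{:.2f} TB exceeds the 1 TB limit".format(total_bytes / 1000**4))
    return len(files), total_bytes, problems

count, size, problems = check_grouping("staging/afc2016_001")  # hypothetical grouping
print(count, "files,", round(size / 1000**3, 1), "GB")
for problem in problems:
    print("will not fit in a single CTS job:", problem)
```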

AFC plans to regularly ingest off-site to tape storage at the National Audio Visual Conservation Center in Culpeper, Virginia (see the PowerPoint overviews by James Snyder and Scott Rife). While most of our collections are audio and audiovisual, we don’t currently send any digital content to NAVCC servers except when we request physical media to be digitized for patrons to access. We’re in the midst of a year-long project to explore automating ingest to NAVCC in a way that integrates with our CTS repository systems on Capitol Hill.

This AFC-led project should support other divisions looking for similar integration and will also help pave the way to support on-site digitization and then ingest to NAVCC. The project has been fruitful in engaging conversations on different ingest requirements for NAVCC and its reliance, for example, on Merged AudioVisual Information System (MAVIS) xml records, previously used to track AFC’s movement of analog physical media to cold storage at NAVCC. AFC also relies heavily on department-created Oracle APEX databases and Access databases.

One pivotal aspect of ingest is data transfer. We receive content from providers in a variety of ways: hard drives and thumb drives sent through the mail, network transfer over cloud services like Signiant Exchange and Dropbox, and API harvesting of the StoryCorps.Me collection. Each method carries some form of risk, from network failures and outages to hard drive failures. And, of course, human error.

AFC also produces and sponsors lots of content, including in-house lectures and concerts and its Occupational Folklife Collections, which involve many non-archival processing staff members and individuals.

Another aspect that determines our workflows is the division between born-digital collections and accessions on the one hand and our digitized collections meant for online access on the other. As part of my introduction to the Library, I jumped into AFC’s push to digitize and provide online access to 25 ethnographic field projects collected from 1977-1997 (20 TB multi-format). AFC has just completed and published a digitized collection, the Chicago Ethnic Arts project.

These workflows can be quite distinct but in both the concept of “processing” is interpreted widely. In the online access digitization workflows, which have involved the majority of our staff’s processing time over the past six months, we must assess and perform quality control measures on different digital specifications as well as create and inventory derivatives at a mass scale across multiple servers. These collections, which we will continue to process over the years, test the limits of existing systems.

The department quickly maxed out the server space set aside for reorganizing content, creating derivatives and running optical character recognition software. Our highly descriptive directory structures were incompatible with our online access tools and required extensive reorganization. We also realized that working with a command line interface involved a very steep learning curve for many staff, and many ongoing mistakes were not found until much later. Also later in the project, we determined that our initial vendor specifications were unsupported by some of the tools we relied on for online display. The list goes on, but the processing of these collections served as an intensive continuous orientation to historical institutional practices.

There are many reasons for the roadblocks we encountered, and some were inevitable. At the time that some of the older AFC practices had been established, CTS and other Library systems could not support our current needs. However, like many new workflows, the field project digitization workflows are ongoing. Each of these issues required extensive meetings with stakeholders across different departments that will continue over the coming months. These experiences have been essential in refining stakeholder roles and responsibilities as well as expectations around the remaining unprocessed 24 ethnographic field projects. Not least of all, there is a newer shared understanding of the time, training and space needed to move, copy and transform large quantities of digitized files. Like much of digital preservation itself, this is an iterative process.

As the year winds down, priorities can soon shift to revisiting our department’s digital preservation guidelines for amendment, inventorying unclearly documented content on tape and normalizing and sorting through the primarily descriptive metadata of our digital holdings.

Additionally, AFC is focusing on re-organizing and understanding complex, inaccessible collections that are on tape. In doing so, we’ll be pushing our department to focus on areas of our self-audit from last year that are most lacking, specifically in metadata. Another summer venture for us is to test and create workflows for identifying fugitive media left mixed in with paper in hybrid collections.

This summer, I’ll work with interns to develop a workflow to label, catalog, migrate and copy to tape, using the Alliance of American Quilts Collection as our initial pilot collection. AFC has also accumulated a digital backlog of collections that has not been processed or ingested in any meaningful way during our focus on digitization workflows. These need to be attended to in the next several months.

While this is just a sampling of our current priorities, workflows, systems and tools, it should paint a picture of some of the work being done in AFC’s processing room. AFC was an early adopter of digital preservation at the Library of Congress and as its scope has expanded over the past few decades, its computer systems and workflows have matured to keep up with its needs. The American Folklife Center continues to pioneer and improve digital preservation and access to the traditional expressive culture in the United States.

District Dispatch: What’s new in the library ebook lending market?

planet code4lib - Wed, 2016-06-08 06:29

A young boy enjoys reading an ebook on his tablet. Courtesy: Milltown Public Library.

What has changed in the library ebook lending environment in the past year? A panel of library and publishing experts will provide an update on the library ebook lending market and discuss best ways for libraries to advance library access to digital content at the 2016 American Library Association’s (ALA) Annual Conference in Orlando, Fla. The session, “Digital Content Working Group—Update and Future Directions,” takes place from 8:30 to 10:00 a.m. on Sunday, June 26, 2016, in room W205 of the Orange County Convention Center.

Library leaders from ALA’s Digital Content Working Group (DCWG) will provide an update on the DCWG’s activities. The event features an expert panel that focuses on future directions. The ALA Digital Content Working Group was established by ALA leadership to address the greatest digital opportunities and challenges for libraries.

During the session, participants will hear from a number of library ebook lending experts, including Carolyn Anthony, director of the Skokie Public Library, and co-chair of the American Library Association’s Digital Content Working Group; Michael Blackwell, director of St. Mary’s County Library in Leonardtown, Md.; Erika Linke, associate dean of the University Libraries and director of Research & Academic Services for Carnegie Mellon University and co-chair, American Library Association Digital Content Working Group; and Trevor Owens, senior program officer of the Institute of Museum and Library Services’ National Digital Platform.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post What’s new in the library ebook lending market? appeared first on District Dispatch.

District Dispatch: Revolutionary ways to offer e-books to the print-disabled

planet code4lib - Wed, 2016-06-08 06:06

There has been a shift in the way people access information: E-books and the widespread use of graphics to convey information have created a “new normal” for how we read and learn. While these resources are readily available, too many of them are not accessible for the print-disabled. As a result, people with disabilities such as vision impairments, physical limitations and severe learning disabilities, often face barriers to information.

Photo by Elio-Rojano via Flickr

During the session “Accessible Books for All” at the 2016 American Library Association (ALA) Annual Conference, a panel of e-books and accessibility experts will discuss the successful partnership between Benetech/Bookshare, the New York Public Library and others to provide free access to over 400,000 books, periodicals and more to qualified library patrons.

The conference session takes place on Monday, June 27, 2016, 10:30-11:30 a.m., in room W105A of the Orange County Convention Center. Session speakers include Jill Rothstein, managing librarian, Andrew Heiskell Braille and Talking Book Library, New York Public Library (NYPL); and Lisa Wadors Verne, program manager, Education, Research and Partnerships, Benetech.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post Revolutionary ways to offer e-books to the print-disabled appeared first on District Dispatch.

Ed Summers: Baltimore Stories

planet code4lib - Wed, 2016-06-08 04:00

I’m in Baltimore today to participate in a public event in the Baltimore Stories series sponsored by the University of Maryland and the Maryland Humanities Council with support from the National Endowment for the Humanities. Check out the #bmorestories hashtag and the schedule for information about other events in the series. I’m particularly honored and excited to take part because of MITH’s work in building a research archive of tweets related to the protests in Baltimore last year, and a little bit of collaboration with Denise Meringolo and Joe Tropea to connect their BaltimoreUprising archive with Twitter. And of course there is my involvement in the Documenting the Now project, where the role of narrative and its place in public history is so key.

Since it’s a public event I’m not really sure who is going to show up tonight. The event isn’t about talking heads and powerpoints. It’s an opportunity to build conversations about the role of narrative in community and cultural/historical production. This blog post isn’t the text of a presentation, it’s just a way for me to organize some thoughts and itemize a few things about narrative construction in social media that I hope to have a chance to talk about with others who show up.

Voice

To give you a little bit more of an idea about what Baltimore Stories is aiming to do, here’s a quote from the NEH proposal:

The work we propose for this project will capitalize on the public awareness and use of the humanities by bringing humanities scholars and practitioners into conversation with the public on questions that are so present in the hearts and minds of Baltimoreans today: Who are we, Baltimore? Who owns our story? Who tells our stories? What stories have been left out? How can we change the narrative? We imagine the exploration of narratives to be of particular interest to Baltimore, and to other cities and communities affected by violence and by singular narratives that perpetuate violence and impede understanding and cooperation. We believe that the levels of transdisciplinarity and collaboration with public organizations in this project are unprecedented in a city-wide collaboration of the humanities and communities.

In one of the previous Baltimore Stories events, film maker and multi-media artist Ralph Crowder talked about his work documenting the events in Ferguson, Missouri last year. I was particularly struck by his ability to connect with the experience of the high school students in the audience and relate it to his work as an artist. One part of this conversation really stuck with me:

Don’t ever be in a situation where you feel like you don’t have a voice. Don’t ever be in a situation where someone else is talking for your experience. You be the person who does that for you. Because if you can’t do that someone else is going to come around and they’re going to talk for what you are going through, and many times they are going to get paid some money off of your struggle.

As someone who received a grant from a large foundation to help build tools and community around social media archiving and events like those in Ferguson, Missouri and those in Baltimore, I can’t help but feel implicated by these words–and to recognize their truth. I grew up in the middle class suburbs of New Jersey, where I was raised by both my Mom and Dad. I shouldn’t have been, but I was shocked by how many kids raised their hands when Crowder asked how many were being raised by only one parent, and how many were being raised by no parent at all. I know I am coming from a position of privilege, and that the mechanics of this privilege afford yet more privilege. But getting past myself for a moment I can see what Crowder is saying here is important. What he is saying is in fact fundamental to the work we are doing on Documenting the Now, and the value proposition of the World Wide Web itself.

For all its breathless techno solutionist promises and landscape fraught with abuse and oppression, social media undeniably expands individuals’ ability to tell their own stories in their own voice to a (potentially) world wide audience. This expansion is of course relative to an individual’s ability to speak to that same audience in earlier mediums like letters, books, magazines, journals, television and radio. The Web itself made it much easier to publish information to a large audience. But really the ability to turn the crank of Web publishing was initially limited to people with a very narrow set of technical skills. What social media did was greatly expand the number of people who could publish on the Web, and share their voices with each other, and with the world.

Stories

The various social media platforms (Twitter, Instagram, YouTube, Google+, Facebook, Tumblr, etc.) condition what can be said, who it can be said to, and who can say things back. This is an incredibly deep topic that people have written books about. But to get back to Crowder’s advice, consider the examples of DeRay McKesson, Devin Allen, Johnetta Elzie, Alicia Garza and countless others who used social media platforms like Twitter, Facebook and Instagram to document the BlackLivesMatter movement, and raise awareness about institutionalized racism. Check out DeRay’s Ferguson Beginnings, where he has been curating his own tweets from Ferguson in 2014. Would people like me know about Ferguson and BlackLivesMatter if people like DeRay, Devin, Johnetta and Alicia weren’t using social media to document their experience?

It’s not my place to tell this story. But as someone who studies the Web and builds things for the Web, here are a few things to consider when deciding how and where to tell your story on the Web.

Hashtags

Hashtags matter. Bergis and I saw this when we were collecting the #BaltimoreRiots and #BaltimoreUprising tweets. We saw the narrative tension between the two hashtags, and the attempts to reframe the protests that were going on here in Baltimore.

Take a look at a sampling of random tweets with media using the #BaltimoreRiots and #BaltimoreUprising. Do you get a sense of how they are being used differently? #BlackLivesMatter itself is emblematic of the power of naming in social media. Choosing a hashtag is choosing your audience.
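For the curious, hashtag collections like these are typically gathered with a tool such as twarc. Here’s a minimal sketch, assuming twarc’s Python Twarc client and its search generator; the API credentials are placeholders you would get from a registered Twitter application.

```python
from twarc import Twarc  # assumes the twarc library: pip install twarc

# Placeholder credentials; real values come from a registered Twitter application.
t = Twarc("CONSUMER_KEY", "CONSUMER_SECRET", "ACCESS_TOKEN", "ACCESS_TOKEN_SECRET")

# Collect recent tweets for each hashtag, tagging each tweet with the hashtag it
# came from so the two framings can later be compared side by side.
for tag in ("#BaltimoreRiots", "#BaltimoreUprising"):
    for tweet in t.search(tag):
        print(tag, tweet["id_str"], tweet["user"]["screen_name"], tweet["text"][:80])
```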

Visibility

When you post to a social media platform be aware of who gets to see it and the trade offs associated with that decision. For example, by default when you publish a tweet you are publishing a message for the world. Only people who follow you get an update about it. But anyone in the world can see it if they have the URL for your tweet. You can delete a tweet which will remove the tweet from the Web, but it won’t pull it back from the clients and other places that may have stored it. You can choose to make your account protected which means it is only viewable by people you grant access to. The different social media platforms have different controls for who gets to see your content. Try to understand the controls that are available. When you share things publicly they obviously can be seen by a wider audience which gives it more reach. But publishing publicly also means your content will be seen by all sorts of actors, which may have other ramifications.

Surveillance

One of the ramifications is that social media is being watched by all sorts of actors, including law enforcement. We know from Edward Snowden and Glenn Greenwald that the National Security Agency is collecting social media content. The full dimensions of this work are hidden, but just google for police Facebook evidence or read this Wikipedia article and you’ll find lots of stories of how local law enforcement are using social media. Even if you aren’t doing anything wrong, be aware of how your content might be used by these actors. It can be difficult, but don’t let this surveillance activity have a chilling effect on you exercising your right to freedom of speech, and using your voice to tell your story.

Harassment

When you publish your content publicly on the Web you open yourself up to harassment from individuals who may disagree with you. For someone like DeRay and many other public figures on social media this can mean having to block thousands of users because some of them send death threats and other offensive material. This is no joke. Learn how to leverage your social media tools to ignore or avoid these actors. Learn strategies for working with this content. Perhaps you want to have a friend look at your messages for you, to get some distance. Perhaps you can experiment with BlockBots and other collaborative ways of ignoring bad actors on the Web. If there are controls for reporting spammers and haters, use them.

Ownership

If you look closely at the Terms of Service for major social media platforms like Twitter, Instagram and Facebook you will notice that they very clearly state that you are the owner of the content you put on there. You also often grant the social media platform a right to redistribute and share that content as well. But ultimately it is yours. You can take your content and put it in multiple platforms such as YouTube, Vine and Facebook. You may want to use a site like Wikimedia Commons or Flickr that allows you to attach a Creative Commons license to your work. Creative Commons provides a set of licenses that let you define how your content can be shared on the Web. If you are using a social media platform like Tumblr that lets you change the layout of the website you can add a Creative Commons license to your site. Ultimately this is the advantage of creating a blog on WordPress or Medium, or hosting it yourself, since you can claim copyright and license your material as you see fit. However it is a trade-off since it may be more difficult to get the exposure that you will see in platforms like Instagram, Vine or Facebook. If you want you could give low resolution versions to social media outlets with links to high resolution versions you publish yourself with your own license.

Archive

Many social media platforms like Facebook, Twitter, Medium and Instagram allow you to download an archive of all the content you’ve put there. When you are selecting a social media platform, make sure you can see how to get your content out again. This could be useful if you decide to terminate your account for whatever reason, but retain a record of your work. Perhaps you want to move your content to another social media platform. Or perhaps you are creating a backup in case the platform goes away. Maybe, just maybe, you are donating your work to a local library or archive. Being able to download your content is key.

District Dispatch: Emily Sheketoff to join advocacy panel at ALA Annual

planet code4lib - Tue, 2016-06-07 18:51

With years of hard-won experience under their belts, retired librarians and library workers are well-positioned to put their advocacy skills to use for libraries. Their years in the field have given them a wealth of knowledge and stories that need to be shared with legislators, and in retirement they have the added advantage of being able to speak up as private citizens without job-induced time constraints. This year, attendees at ALA Annual will have the opportunity to learn how they can leverage their time and experience to protect the libraries they love.

Photo credit: Howard Lake

Join the Retired Members Round Table and the Federal Legislation Advocacy Group (FLAG) to learn a few simple ways you can promote libraries on a federal, state, and local level. Emily Sheketoff, Executive Director of the ALA Washington office, will cover ways to have an impact on federal elected officials. She will be joined by Marci Merola, Director of the ALA Office for Library Advocacy, who will cover matters of a more local nature. The third panelist, Jan Sanders, Director of the Pasadena Public Library, will relate several successful advocacy projects she implemented at her library and share the insights she gained.

Program details:

Fast and Easy: Advocacy That YOU Can Do!

Sunday, June 26, 2016

1:00 – 2:30pm

OCCC, Room W106

The post Emily Sheketoff to join advocacy panel at ALA Annual appeared first on District Dispatch.

David Rosenthal: The Need For Black Hats

planet code4lib - Tue, 2016-06-07 15:00
I was asked to provide some background for a panel on “Security” at the Decentralized Web Summit held at the Internet Archive. Below the fold is a somewhat expanded version.

Nearly 13 years ago my co-authors and I won Best Paper at SOSP for the peer-to-peer anti-entropy protocol that nodes in a LOCKSS network use to detect and repair damage to their contents. The award was for showing a P2P network that failed gradually and gracefully under attack from a very powerful adversary. Its use of proof-of-work under time constraints is related to ideas underlying blockchains.

The paper was based on a series of simulations of 1000-node networks, so we had to implement both sides, defence and attack. In our design discussions we explicitly switched between wearing white and black hats; we probably spent more time on the dark side. This meant that we ended up with a very explicit and very pessimistic threat model, which was very helpful in driving the design.

The decentralized Web will be attacked, in non-obvious ways. Who would have thought that IP's strength, the end-to-end model, would also bring one of its biggest problems, pervasive surveillance? Or that advertising would be the death of Tim Berners-Lee's Web?

I'd like to challenge the panelists to follow our example, and to role-play wearing black hats in two scenarios:
  • Scenario 1. We are the NSA. We have an enormous budget, no effective oversight, taps into all the major fiber links, and a good supply of zero-days. How do we collect everyone's history of browsing the decentralized Web? (I guarantee there is a team at NSA/GCHQ asking this question).
  • Scenario 2. We are the Chinese government. We have an enormous budget, an enormous workforce, a good supply of zero-days, total control over our country's servers and its connections to the outside world. How do we upgrade the Great Firewall of China to handle the decentralized Web, and how do we censor our citizens use of it? (I guarantee there is a team in China asking these questions).
I'll kick things off by pointing out one common factor between the two scenarios, that the adversaries have massive resources. Massive resources are an inescapable problem for decentralized systems, and the cause is increasing returns to scale or network effects. Increasing returns are the reason why the initially decentralized Web is now dominated by a few huge companies like Google and Facebook. They are the reason that Bitcoin's initially decentralized blockchain recently caused Mike Hearn to write this:
the block chain is controlled by Chinese miners, just two of whom control more than 50% of the hash power. At a recent conference over 95% of hashing power was controlled by a handful of guys sitting on a single stage.

One necessary design goal for networks such as Bitcoin is that the protocol be incentive-compatible, or as Ittay Eyal and Emin Gun Sirer express it:
the best strategy of a rational minority pool is to be honest, and a minority of colluding miners cannot earn disproportionate benefits by deviating from the protocol

They show that the Bitcoin protocol was, and still is, not incentive-compatible. More recently, Sirer and others have shown that the Distributed Autonomous Organization based on Ethereum isn't incentive-compatible either. Even if these protocols were, increasing returns to scale would drive centralization and thus ensure attacks with massive resources, whether from governments or large corporations. And let's not forget that attacks can be mounted using botnets.

Massive resources enable Sybil attacks. The $1M attack CMU mounted in 2014 against the Tor network used both traffic confirmation and Sybil attacks:
The particular confirmation attack they used was an active attack where the relay on one end injects a signal into the Tor protocol headers, and then the relay on the other end reads the signal. These attacking relays were stable enough to get the HSDir ("suitable for hidden service directory") and Guard ("suitable for being an entry guard") consensus flags. Then they injected the signal whenever they were used as a hidden service directory, and looked for an injected signal whenever they were used as an entry guard.

Traffic confirmation attacks don't need to inject signals, they can be based on statistical correlation. Correlations in the time domain are particularly hard for interactive services, such as Tor and the decentralized Web, to disguise.
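To make the point concrete, here is a toy, self-contained simulation (not code for attacking any real network) showing how timing correlation alone can link an exit-side flow back to an entry-side user; the Tor project's account of the second half of the attack continues below.

```python
import numpy as np

rng = np.random.default_rng(42)

def session(n_seconds=600):
    """Simulate per-second packet counts for a bursty interactive session."""
    active = rng.random(n_seconds) < 0.3
    return active * rng.poisson(20, n_seconds)

# Traffic observed at the entry side for three users.
entry_side = {user: session() for user in ("alice", "bob", "carol")}

def observed_at_exit(counts):
    """The exit side sees the same flow, shifted by latency and mixed with noise."""
    delayed = np.roll(counts, 2)                  # roughly two seconds of latency
    return delayed + rng.poisson(1, counts.size)  # background noise

exit_flow = observed_at_exit(entry_side["bob"])   # an unlabelled flow seen at the exit

# Correlate the unlabelled exit flow against each candidate entry flow over a few lags.
for user, counts in entry_side.items():
    best = max(np.corrcoef(np.roll(counts, lag), exit_flow)[0, 1] for lag in range(5))
    print("{}: peak correlation {:.2f}".format(user, best))
# The true sender stands out with a correlation near 1.0; the others stay near zero.
```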
Then the second class of attack they used, in conjunction with their traffic confirmation attack, was a standard Sybil attack — they signed up around 115 fast non-exit relays, all running on 50.7.0.0/16 or 204.45.0.0/16. Together these relays summed to about 6.4% of the Guard capacity in the network. Then, in part because of our current guard rotation parameters, these relays became entry guards for a significant chunk of users over their five months of operation.

Sybil attacks are very hard for truly decentralized networks to defend against, since no-one is in a position to do what the Tor project did to CMU's Sybils:
1) Removed the attacking relays from the network.

Richard Chirgwin at The Register reports on Philip Winter et al's Identifying and characterizing Sybils in the Tor network. Their sybilhunter program found the following kinds of Sybils:
  • Rewrite Sybils – these hijacked Bitcoin transactions by rewriting their Bitcoin addresses;
  • Redirect Sybils – these also attacked Bitcoin users, by redirecting them to an impersonation site;
  • FDCservers Sybils – associated with the CMU deanonymisation research later subpoenaed by the FBI;
  • Botnets of Sybils – possibly misguided attempts to help drive up usage;
  • Academic Sybils – they observed the Amazon EC2-hosted nodes operated by Biryukov, Pustogarov, and Weinmann for this 2013 paper; and
  • The LizardNSA attack on Tor.
The Yale/UT-Austin Dissent project is an attempt to use cryptographic techniques to provide anonymity while defending against both Sybil and traffic analysis attacks, but they believe there are costs in doing so:
We believe the vulnerabilities and measurability limitations of onion routing may stem from an attempt to achieve an impossible set of goals and to defend an ultimately indefensible position. Current tools offer a general-purpose, unconstrained, and individualistic form of anonymous Internet access. However, there are many ways for unconstrained, individualistic uses of the Internet to be fingerprinted and tied to individual users. We suspect that the only way to achieve measurable and provable levels of anonymity, and to stake out a position defensible in the long term, is to develop more collective anonymity protocols and tools. It may be necessary to constrain the normally individualistic behaviors of participating nodes, the expectations of users, and possibly the set of applications and usage models to which these protocols and tools apply. They note:
Because anonymity protocols alone cannot address risks such as software exploits or accidental self-identification, the Dissent project also includes Nymix, a prototype operating system that hardens the user’s computing platform against such attacks.

Getting to a shared view of the threats the decentralized Web is intended to combat before implementations are widely deployed is vital. The lack of such a view in the design of TCP/IP and the Web is the reason we're in the mess we're in. Unless the decentralized Web does a significantly better job handling the threats than the current one, there's no point in doing it. Without a "black hat" view during the design, there's no chance that it will do a better job.

District Dispatch: Putting libraries front and center during the presidential election

planet code4lib - Tue, 2016-06-07 05:42

Photo by Sebastiaan ter Burg via Flickr

The presidential election is right around the corner, with the presidency, Congress, and the U.S. Supreme Court in the balance, and a new Librarian of Congress imminent. Learn about actions that the American Library Association (ALA) is taking to prepare for the coming opportunities and challenges at the 2016 ALA Annual Conference in Orlando, Fla. Join political and library leaders at the conference session “Taking Libraries Transform and the Policy Revolution! to the New Presidential Administration,” where experts will discuss strategic efforts to influence federal policy initiatives in Washington, D.C., and how these efforts transfer to the state and local levels. The session takes place on Saturday, June 25, 2016, 10:30-11:30 a.m., in the Orange County Convention Center in room W105B.

Speakers include Susan Hildreth, former director, Institute of Museum and Library Services (IMLS); ALA Treasurer-elect; and executive director of the Peninsula (Calif.) Library System; Anthony Sarmiento, executive director of Senior Service America, Inc., member of the ALA Public Policy Advisory Council and past senior official with AFL-CIO; Alan S. Inouye, director of the American Library Association Office for Information Technology Policy (OITP); and Mark Smith, director and Librarian of the Texas State Library and Archives Commission. This conference session is sponsored by ALA’s Office for Information Technology Policy and United for Libraries.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post Putting libraries front and center during the presidential election appeared first on District Dispatch.

DuraSpace News: OpenVIVO: Connect, Share and Discover the VIVO Community at the 2016 VIVO Conference

planet code4lib - Tue, 2016-06-07 00:00

OpenVIVO (OpenVIVO.org) is for anyone who's interested in VIVO or the VIVO community - take a look!
