
Feed aggregator

DuraSpace News: DuraCloud Services Presentations Set for PASIG–Early Bird Registration Fast Approaching!

planet code4lib - Mon, 2015-01-26 00:00

San Diego, CA: The upcoming 2015 Preservation and Archiving Special Interest Group (PASIG) event will be held March 11-13 on the campus of UC San Diego. The organizers are bringing together an international group of experts in a wide range of fields, dedicated to providing timely, useful information.

Roy Tennant: Wikipedia’s Waterloo?

planet code4lib - Sun, 2015-01-25 23:50

If you are involved in technology at all, you no doubt have heard about GamerGate. Normally at this point I would say that if you hadn’t heard about it, go read about it and come back.

But that would be foolish.

You would likely never come back. Perhaps it would be from disgust at how women have been treated by many male gamers. Perhaps it would be because you can’t believe you have just wasted hours of your life that you are never getting back. Or perhaps it is because you disappeared down the rat hole of controversy and won’t emerge until either hunger or your spouse drags you out. Whatever. You aren’t coming back. So don’t go before I explain why I am writing about this.

Wikipedia has a lot to offer. Sure, it has some gaping holes you could drive a truck through, just about any controversial subject can end up with a sketchy page as warring factions battle it out, and the lack of pages on women worthy of them is striking.

You see, it is well known that Wikipedia has a problem with female representation — both with the percentage of pages devoted to deserving women as well as the number of editors building the encyclopedia.

So perhaps it shouldn’t come as a surprise that Wikipedia has now sanctioned the editors trying to keep a GamerGate Wikipedia page focused on what it is really all about — the misogynistic actions of a number of male gamers. But the shocking part to me is that it even extends beyond that one controversy into really dangerous muzzling territory. According to The Guardian, these women editors* have been banned from editing “any other article about ‘gender or sexuality, broadly construed'”.

I find that astonishingly brutal. Especially for an endeavor that tries to pride itself on an egalitarian process.

Get your act together, Wikipedia.

 

* My bad. Editors were banned. They are not necessarily women. Or even feminists.

Nicole Engard: Bookmarks for January 25, 2015

planet code4lib - Sun, 2015-01-25 20:30

Today I found the following resources and bookmarked them:

  • Krita Open Source Software for Concept Artists, Digital Painters, and Illustrators

Digest powered by RSS Digest

The post Bookmarks for January 25, 2015 appeared first on What I Learned Today....

Related posts:

  1. Governments Urging the use of Open Source
  2. eXtensible Catalog (XC) gets more funding
  3. Evaluating Open Source

Casey Bisson: Photo hipster: playing with 110 cameras

planet code4lib - Sun, 2015-01-25 18:44

After playing with Fuji Instax and Polaroid (with The Impossible Project film) cameras, I realized I had to do something with Kodak. My grandfather worked for Kodak for years, and I have many memories of the stories he shared of that work. He retired in the late 70s, just as the final seeds of Kodak’s coming downfall were being sown, but well before anybody could see them for what they were.

The most emblematic Kodak camera and film I could think of was the 110 cartridge film type, and that’s what I used to capture this picture of Cliff Pearson and Millicent Prancypants.

I bought two cameras and a small bundle of film from various eBay sellers. They look small in the following photo, but they’re significantly larger and less pocketable than even my iPhone 6 plus.

Developing is $4 per cartridge at Adolph Gasser’s, but they can’t print or scan the film there, so that had me looking for other solutions. I couldn’t find a transparency scanner that had film holders for 110 film. That isn’t surprising, but it did leave me wondering and hesitant long enough to look for other ways to capture this film. For these shots I re-photographed them with my EOS M:

John Miedema: Writing has changed with digital technology, but much is the same. Pirsig’s slip-based writing system was inspired by information technology.

planet code4lib - Sun, 2015-01-25 16:41

Writing has changed with digital technology, but much is the same. The Lila writing technology builds on both the dynamic and static features.

Writers traditionally spend considerable time reading individual works closely and carefully. The emergence of big data and analytic technologies causes a shift toward distant reading, the ability to analyze a large volume of text in terms of statistical patterns. Lila uses these technologies to select relevant content for deeper reading.

Writing, as always, occurs in many locations, from a car seat to a coffee shop to a desk. Digital technology makes it easier to aggregate text from these different locations. Existing technologies like Evernote and Google Drive can gather these pieces for Lila to perform its cognitive functions.

Writing is performed on a variety of media. In the past it might have been napkins, stickies and binder sheets. Today it includes a greater variety, from cell phone notes to email and word processor documents. Lila can only analyze digital media. It is understood that there is still much text in the world that is not digital. Going forward, text will likely always be digital.

Writing tends to be more fragmented today, occurring in smaller units of text. Letter length is replaced with cell phone texts, tweets, and short emails. The phrase “too long; didn’t read” is used on the internet for overly long statements. Digital books are shorter than print books. Lila is expressly designed around a “slip” length unit of text, from at least a tweet length for a subject line, up to a few paragraphs. It would be okay to call a slip a note. Unlike tweets, there will be no hard limit on the number of characters.
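To make the slip unit concrete, here is a minimal Python sketch of how a slip might be represented in code; the field names and the single validity rule are illustrative assumptions, not part of Lila’s actual design.

    from dataclasses import dataclass, field
    from typing import List

    @dataclass
    class Slip:
        """One slip: a short subject line plus up to a few paragraphs of body text."""
        subject: str                                    # roughly tweet-length, used for sorting
        body: str = ""                                  # no hard character limit, unlike a tweet
        tags: List[str] = field(default_factory=list)   # optional category labels
        source: str = ""                                # where the text came from (file, email, note)

        def is_valid(self) -> bool:
            # A slip needs at least a subject line; the body may be empty.
            return len(self.subject) > 0

    note = Slip(subject="Slips vs. pages",
                body="Pirsig chose slips over pages because they allow more random access.",
                source="notebook")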

A work is written by one or many authors. Print magazines and newspapers are compilations of multiple authors, as are many websites. Books still tend to be written by a single author, but Lila’s function of compiling content into views will make it easier for authors to collaborate on a work with the complexity and coherence of a book.

In the past, the act of writing was more isolated. There was a clear separation between authors and readers. Today, writing is more social. Authors blog their way through books and get immediate feedback. Readers talk with authors during their readings. Fans publish their own spin on book endings. Lila extends reading and writing capabilities. I have considered additional capabilities with regard to publishing drafts to the web for feedback and iteration. A WordPress integration perhaps.

Pirsig’s book, Lila, was published in 1991, not long after the advent of the personal computer and just at the dawn of the web. His slip-based writing system used print index cards, but he deliberately chose that unit of text over pages because it allowed for “more random access.” He also categorized some slips as “program” cards, instructions for organizing other slips. As cards about cards, they were powerful, he said, in the way that John von Neumann explained the power of computers: “the program is data and can be treated like any other data.” Pirsig’s slip-based writing system was no doubt inspired by the developments in information technology.

Alf Eaton, Alf: Exploring a personal Twitter network

planet code4lib - Sun, 2015-01-25 13:59
PDF version
  1. Fetch the IDs of users I follow on Twitter, using vege-table:

    var url = 'https://api.twitter.com/1.1/friends/ids.json';

    var params = {
      screen_name: 'invisiblecomma',
      stringify_ids: true,
      count: 5000
    };

    var collection = new Collection(url, params);

    collection.items = function(data) {
      return data.ids;
    };

    collection.next = function(data) {
      if (!data.next_cursor) {
        return null;
      }

      params.cursor = data.next_cursor_str;

      return [url, params];
    };

    return collection.get('json');
  2. Using similar code, fetch the list of users that each of those users follows.

  3. Export the 10,000 user IDs with the highest intra-network follower counts.

  4. Fetch the details of each Twitter user:

    return Resource('https://api.twitter.com/1.1/users/lookup.json', {
      user_id: user_id
    }).get('json').then(function(data) {
      return data[0];
    });
  5. Process those two CSV files into a list of pairs of connected identifiers suitable for import into Gephi.

  6. In Gephi, drag the “Topology > In Degree Range” filter into the Queries section, and adjust the range until a small enough number of users with the most followers is visible:

  7. Set the label size to be larger for users with more incoming links:

  8. Set the label colour to be darker for users with more incoming links:

  9. Apply the ForceAtlas 2 layout, then the Expansion layout a few times, then the Label Adjust layout:

  10. Switch to the Preview window and adjust the colour and opacity of the edges and labels appropriately. Hide the nodes, set the label font to Roboto, then export to PDF.

  11. Use ImageMagick to convert the PDF to JPEG: convert -density 200 twitter-foaf.pdf twitter-foaf.jpg

It would probably be possible to automate this whole sequence - perhaps in a Jupyter Notebook. The part that takes the longest is fetching the data from Twitter, due to the low API rate limits.
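For anyone who wants to script the sequence end to end rather than use vege-table, here is a rough Python sketch of step 1 against the same 1.1 friends/ids endpoint; the bearer token and the 15-minute back-off are assumptions about how you handle auth and rate limiting.

    import time
    import requests

    BEARER_TOKEN = "..."  # assumed: an app-only OAuth 2.0 bearer token
    URL = "https://api.twitter.com/1.1/friends/ids.json"

    def friend_ids(screen_name):
        """Page through friends/ids with cursoring, pausing when rate-limited."""
        ids, cursor = [], -1
        while cursor != 0:
            resp = requests.get(URL, params={
                "screen_name": screen_name,
                "stringify_ids": "true",
                "count": 5000,
                "cursor": cursor,
            }, headers={"Authorization": "Bearer " + BEARER_TOKEN})
            if resp.status_code == 429:      # rate limited: wait out the 15-minute window
                time.sleep(15 * 60)
                continue
            resp.raise_for_status()
            data = resp.json()
            ids.extend(data["ids"])
            cursor = data["next_cursor"]
        return ids

    print(len(friend_ids("invisiblecomma")))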

Mark E. Phillips: What do we put in our BagIt bag-info.txt files?

planet code4lib - Sat, 2015-01-24 23:03

The UNT Libraries makes heavy use of the BagIt packaging format throughout our digital repository infrastructure. I’m of the opinion that BagIt has done more to move digital preservation forward in the last ten years than any other single technology, service, or specification. The UNT Libraries uses BagIt for our Submission Information Packages (SIP), our Archival Information Packages (AIP), our Dissemination Information Packages (DIP), and our local Access Content Packages (ACP).

For those that don’t know BagIt,  it is a set of conventions for packaging content into a directory structure in a consistent and repeatable way.  There are a number of other descriptions of BagIt that do a very good job of describing the conventions and some of the more specific bits of the specification.

There are a number of great tools for creating, modifying and validating BagIt bags, and my favorite for a long time has been bagit-python from the Library of Congress. (To be honest, I usually use Ed Summers’ fork, which I grab from here.)
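For anyone new to bagit-python, a minimal sketch of creating a bag with some bag-info.txt metadata and validating it later looks roughly like this (the path and metadata values are just illustrations):

    import bagit

    # Turn an existing directory of files into a bag in place,
    # writing bag-info.txt with whatever metadata we pass in.
    bag = bagit.make_bag("/data/UNTA_AR0749-002-0016-0017", {
        "Source-Organization": "University of North Texas Libraries",
        "Contact-Name": "Mark Phillips",
        "External-Identifier": "ark:/67531/metadc488207",
    })

    # Later, re-open the bag and confirm the payload still matches its manifests.
    bag = bagit.Bag("/data/UNTA_AR0749-002-0016-0017")
    print(bag.info["External-Identifier"])
    print(bag.is_valid())   # True if the checksums and Payload-Oxum still agree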

The BagIt specification defines a metadata file stored in the root of a bag; this file is called bag-info.txt. The specification defines a number of fields for this file, which are stored as key-value pairs in the format:

key: value

I thought it might be helpful for those new to using BagIt bags to see what kinds of information we are putting into these bag-info.txt files, and also to explain some of the unique fields that we are adding to the file for managing items in our system. Below is a typical bag-info.txt file from one of our AIPs in the Coda Repository.

Bag-Size: 28.32M
Bagging-Date: 2015-01-23
CODA-Ingest-Batch-Identifier: f2dbfd7e-9dc5-43fd-975a-8a47e665e09f
CODA-Ingest-Timestamp: 2015-01-22T21:43:33-0600
Contact-Email: mark.phillips@unt.edu
Contact-Name: Mark Phillips
Contact-Phone: 940-369-7809
External-Description: Collection of photographs held by the University of North Texas Archives that were taken by Junebug Clark or other family members. Master files are tiff images.
External-Identifier: ark:/67531/metadc488207
Internal-Sender-Identifier: UNTA_AR0749-002-0016-0017
Organization-Address: P. O. Box 305190, Denton, TX 76203-5190
Payload-Oxum: 29666559.4
Source-Organization: University of North Texas Libraries

In the example above, several of the fields are boilerplate, and others are machine generated.

Field                           How we create the value
Bag-Size                        Machine
Bagging-Date                    Machine
CODA-Ingest-Batch-Identifier    Machine
CODA-Ingest-Timestamp           Machine
Contact-Email                   Boilerplate
Contact-Name                    Boilerplate
Contact-Phone                   Boilerplate
External-Description            Changes per “collection”
External-Identifier             Machine
Internal-Sender-Identifier      Machine
Organization-Address            Boilerplate
Payload-Oxum                    Machine
Source-Organization             Boilerplate

You can tell from looking at the example bag-info.txt file above that some of the fields are very self-explanatory. I’m going to run through a few of the fields that are either non-standard or that we made explicit decisions about as we were implementing BagIt.

CODA-Ingest-Batch-Identifier is a UUID assigned to each batch of content added to our Coda Repository; it helps us identify other items that may have been added during a specific run of our ingest process, which is helpful for troubleshooting.

CODA-Ingest-Timestamp is the timestamp when the AIP was added to the Coda Repository.

External-Description changes for each collection that gets processed; it has just enough information about the collection to help jog someone’s memory about where this item came from and why it was created.

External-Identifier is the ARK identifier assigned to the item on ingest into one of the Aubrey systems where we access the items or manage the descriptive metadata.

Internal-Sender-Identifier is the locally important (often not unique) identifier for the item as it is being digitized or collected. It often takes the shape of an accession number from our University Special Collections, or the folder name of an issue of a newspaper.

We currently have 1,070,180 BagIt bags in our Coda Repository, and they have been instrumental in allowing us to scale our digital library infrastructure and verify that each item is just the same as when we added it to our collection.

If you have any specific questions for me, let me know on Twitter.

John Miedema: Writing non-fiction is mostly reading, thinking, and sorting; the rest is just keystrokes. Lila is for writing non-fiction; poetry, not so much.

planet code4lib - Sat, 2015-01-24 16:31

Writing non-fiction is mostly reading, thinking, and sorting; the rest is just keystrokes. And style. Think clearly and the rest comes easy. Lila is designed to extend human writing capabilities by performing cognitive work:

  1. The work of reading, especially during the early research phase. Writers can simply drop unread digital content onto disk, and Lila will convert it into manageable chunks — slips. These slips are shorter than the full-length originals, making them quicker to evaluate. More important, these slips are embedded in the context of relevant content written by the author; context is meaning, so unread content will be easier to evaluate.
  2. The work of analyzing content and sorting it into the best view, using visualization. As Pirsig said, “Instead of asking ‘Where does this metaphysics of the universe begin?’ – which was a virtually impossible question – all he had to do was just hold up two slips and ask, ‘Which comes first?'” This work builds a table of contents, a hierarchical view of the content. Lila will show multiple views so the author can choose the best one.
  3. The ability to uncover bias and ensure completeness of thought. Author bias may filter out content when reading, but Lila will compel a writer to notice relevant content.

Lila’s cognitive abilities depend on the author’s engagement in a writing project, generating content that guides the above work. Lila is designed expressly for the writing of non-fiction; poetry, not so much. The cognitive work is performed in most kinds of writing, and so Lila will aid with other kinds of writing as well. Both fiction and creative non-fiction still require substantial stylistic work after Lila has done her part.

CrossRef: CrossRef Indicators

planet code4lib - Fri, 2015-01-23 21:13

Updated January 20, 2015

Total no. participating publishers & societies 5736
Total no. voting members 3022
% of non-profit publishers 57%
Total no. participating libraries 1926
No. journals covered 37,469
No. DOIs registered to date 71,820,143
No. DOIs deposited in previous month 648,271
No. DOIs retrieved (matched references) in previous month 46,260,320
DOI resolutions (end-user clicks) in previous month 134,057,984

CrossRef: New CrossRef Members

planet code4lib - Fri, 2015-01-23 21:06

Updated January 20, 2015

Voting Members
All-Russia Petroleum Research Exploration Institute (VNIGRI)
Barbara Budrich Publishers
Botanical Research Institute of Texas
Faculty of Humanities and Social Sciences, University of Zagreb
Graduate Program of Management and Business, Bogor Agricultural University
IJSS Group of Journals
IndorSoft, LLC
Innovative Pedagogical Technologies LLC
International Network for Social Network Analysts
Peertechz.com
Slovenian Chemical Society
Subsea Diving Contractor di Stefano Di Cagno Publisher
The National Academies Press
Wisconsin Space Grant Consortium

Represented Members
Artvin Coruh Universitesi Orman Fakultesi Dergisi
Canadian Association of Schools of Nursing
GuvenGrup
Indian Society for Education and Environment
Journal for the Education of the Young Scientist and Giftedness
Kastamonu University Journal of Forestry Faculty
Korean Society for Metabolic and Bariatric Surgery
Korean Society of Acute Care Surgery
The Korean Ophthalmological Society
The Pharmaceutical Society of Korea
Uludag University Journal of the Faculty of Engineering
YEDI: Journal of Art, Design and Science

Last updated January 12, 2015

Voting Members
Association of Basic Medical Sciences of FBIH
Emergent Publications
Infotech
Kinga - Service Agency Ltd.
Particapatory Educational Research (Per)
Robotics: Science and Systems Foundation
University of Lincoln, School of Film and Media and Changer Agency
Uniwersytet Przyrodniczy w Poznaniu (Poznan University of Life Sciences)
Voronezh State University
Wyzsza Szkola Logistyki (Poznan School of Logistics)

Represented Members
EJOVOC
Journal of the Faculty of Engineering and Architecture of Gazi University
Korean Insurance Academic Society
Korean Neurological Association
Medical Journal of Suleyman Demirel University

CrossRef: Upcoming CrossRef Webinars

planet code4lib - Fri, 2015-01-23 20:38

Introduction to CrossCheck
Date: Tuesday, Jan 27, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossRef Text and Data Mining
Date: Thursday, Jan 29, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossRef Technical Basics
Date: Wednesday, Feb 11, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Patricia Feeney
Registration

CrossCheck: iThenticate Admin Webinar
Date: Thursday, Feb 19, 2015
Time: 7:00 am (San Francisco), 10:00 am (New York), 3:00 pm (London)
Moderator: iThenticate
Registration

Introduction to CrossRef
Date: Wednesday, Mar 4, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Patricia Feeney
Registration

Introduction to CrossCheck
Date: Tuesday, Mar 17, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 3:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossRef Technical Basics
Date: Wednesday, Mar 18, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 3:00 pm (London)
Moderator: Patricia Feeney
Registration

Introduction to CrossRef Text and Data Mining
Date: Thursday, Mar 19, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 3:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossCheck
Date: Tuesday, May 5, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossRef Text and Data Mining
Date: Thursday, May 7, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossCheck
Date: Tuesday, July 21, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Introduction to CrossRef Text and Data Mining
Date: Thursday, July 23, 2015
Time: 8:00 am (San Francisco), 11:00 am (New York), 4:00 pm (London)
Moderator: Rachael Lammey
Registration

Ed Summers: Library of Alexandria v2.0

planet code4lib - Fri, 2015-01-23 19:16

In case you missed it, Jill Lepore has written a superb article for the New Yorker about the Internet Archive and archiving the Web in general. The story of the Internet Archive is largely the story of its creator, Brewster Kahle. If you’ve heard Kahle speak you’ve probably heard the Library of Alexandria v2.0 metaphor before. As a historian, Lepore is particularly attuned to this dimension of the story of the Internet Archive:

When Kahle started the Internet Archive, in 1996, in his attic, he gave everyone working with him a book called “The Vanished Library,” about the burning of the Library of Alexandria. “The idea is to build the Library of Alexandria Two,” he told me. (The Hellenism goes further: there’s a partial backup of the Internet Archive in Alexandria, Egypt.)

I’m kind of embarrassed to admit that until reading Lepore’s article I never quite understood the metaphor…but now I think I do. The Web is on fire and the Internet Archive is helping save it, one HTTP request and response at a time. Previously I couldn’t see this vast collection of Web content that the Internet Archive is building as anything but yet another centralized collection of valuable material that, as with v1.0, is vulnerable to disaster or, more likely, as Heather Phillips writes, creeping neglect:

Though it seems fitting that the destruction of so mythic an institution as the Great Library of Alexandria must have required some cataclysmic event like those described above – and while some of them certainly took their toll on the Library – in reality, the fortunes of the Great Library waxed and waned with those of Alexandria itself. Much of its downfall was gradual, often bureaucratic, and by comparison to our cultural imaginings, somewhat petty.

I don’t think it can be overstated: like the Library of Alexandria before it, the Internet Archive is an amazingly bold and priceless resource for human civilization. I’ve visited the Internet Archive on multiple occasions, and each time I’ve been struck by how unlikely it is that such a small and talented team have been able to build and sustain a service with such impact. It’s almost as if it’s too good to be true. I’m nagged by the thought that perhaps it is.

Herbert van de Sompel is quoted by Lepore:

A world with one archive is a really bad idea.

Van de Sompel and his collaborator Michael Nelson have repeatedly pointed out just how important it is for there to be multiple archives of Web content, and for there to be a way for them to be discoverable, and work together. Another thing I learned from Lepore’s article is that Brewster’s initial vision for the Internet Archive was much more collaborative, which gave birth to the International Internet Preservation Consortium, which is made up of 32 member organizations who do Web archiving.

A couple of weeks ago one prominent IIPC member, the California Digital Library, announced that it was retiring its in-house archiving infrastructure and outsourcing its operation to Archive-It, the subscription web archiving service from the Internet Archive.

The CDL and the UC Libraries are partnering with Internet Archive’s Archive-It Service. In the coming year, CDL’s Web Archiving Service (WAS) collections and all core infrastructure activities, i.e., crawling, indexing, search, display, and storage, will be transferred to Archive-It. The CDL remains committed to web archiving as a fundamental component of its mission to support the acquisition, preservation and dissemination of content. This new partnership will allow the CDL to meet its mission and goals more efficiently and effectively and provide a robust solution for our stakeholders.

I happened to tweet this at the time:

good news for ArchiveIt and CDL, but probably bad news for web archiving in general http://t.co/mV3xvqyzi8

— Ed Summers (@edsu)

January 14, 2015

Which at least inspired some mirth from Jason Scott, who is an Internet Archive employee, and also a noted Internet historian and documentarian.

@edsu bwa ha ha

— Jason Scott (@textfiles)

January 14, 2015

Jason is also well known for his work with ArchiveTeam, which quickly mobilizes volunteers to save content on websites that are being shutdown. This content is often then transferred to the Internet Archive. He gets his hands dirty doing the work, and inspires others to do the same. So I deserved a bit of derisive laughter for my hand-wringing.

But here’s the thing. What does it mean if one of the pre-eminent digital library organizations needs to outsource its Web archiving operation? And what if, as the announcement indicates, Harvard, MIT, Stanford, UCLA, and others are not far behind? Should we be concerned that the technical expertise and infrastructure for doing this work is becoming consolidated in a single organization? What does it say about our Web archiving tools that it is more cost-effective for CDL to outsource this work?

The situation isn’t as dire as it might sound, since Archive-It subscribers retain the right to download their content and store it themselves. How many institutions do that with regularity isn’t well known (at least to me). But Web content isn’t like paper that you can put in a box, in a climate-controlled room, and return to years hence. As Matt Kirschenbaum has pointed out:

the preservation of digital objects is logically inseparable from the act of their creation — the lag between creation and preservation collapses completely, since a digital object may only ever be said to be preserved if it is accessible, and each individual access creates the object anew

Can an organization download their WARC content, not provide any meaningful access to it, and say that it is being preserved? I don’t think so. You can’t do digital preservation without thinking about some kind of access, to make sure things are working and people can use the stuff. If the content you are accessing is on a platform somewhere else that you have no control over, you should probably be concerned.
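As one concrete notion of what that access could look like, here is a minimal Python sketch that walks a downloaded WARC file and reports what it holds, using the warcio library; the file name is a placeholder, and this is an illustration rather than anything CDL or Archive-It actually does.

    from warcio.archiveiterator import ArchiveIterator

    # Walk a downloaded WARC file and report what it actually contains --
    # the kind of spot check that turns "stored" into something closer to "preserved".
    with open("example-crawl.warc.gz", "rb") as stream:   # placeholder file name
        for record in ArchiveIterator(stream):
            if record.rec_type == "response":
                uri = record.rec_headers.get_header("WARC-Target-URI")
                status = record.http_headers.get_statuscode() if record.http_headers else "-"
                print(status, uri)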

I’m hopeful that this partnership between CDL, Archive-It, and other organizations will lead to fruitful collaboration and improved tools. But I’m worried that it will mean organizations can simply outsource the expertise and infrastructure of web archiving, while helping reinforce what is already a huge single point of failure. David Rosenthal of Stanford University notes that diversity is a vital component of digital preservation:

Media, software and hardware must flow through the system over time as they fail or become obsolete, and are replaced. The system must support diversity among its components to avoid monoculture vulnerabilities, to allow for incremental replacement, and to avoid vendor lock-in.

I’d like to see more Web archiving classes in iSchools and computer science departments. I’d like to see improved and simplified tools for doing the work of Web archiving. Ideally I’d like to see more in house crawling and access of web archives, not less. I’d like to see more organizations like the Internet Archive that are not just technically able to do this work, but are also bold enough to collect what they think is important to save on the Web and make it available. If we can’t do this together I think the Library of Alexandria metaphor will be all too literal.

Islandora: Islandora Conference: Registration Now Open

planet code4lib - Fri, 2015-01-23 18:52

The Islandora Foundation is thrilled to invite you to the first-ever Islandora Conference, taking place August 3 - 7, 2015 in the birthplace of Islandora: Charlottetown, PEI.

This full-week event will consist of sessions from the Islandora Foundation and interest groups, community presentations, and two full days of hands-on Islandora training, and will end with a Hackfest where we invite you to make your mark on the Islandora code and work together with your fellow Islandorians to complete projects selected by the community.

Our theme for the conference is Community - the Islandora community, the community of people our institutions serve, the community of researchers and librarians and developers who work together to curate digital assets, and the community of open source projects that work together and in parallel.

Registration is now open, with an Early Bird rate available until the end of March. Institutional rates are also available for groups of three or more.

For more information or to sign up for the conference, please visit our conference website: http://islandora.ca/camps/conference2015.

Thank you,

The Islandora Team
community@islandora.ca

M. Ryan Hess: Your Job Has Been Robot-sourced

planet code4lib - Fri, 2015-01-23 18:15

“People are racing against the machine, and many of them are losing that race…Instead of racing against the machine, we need to learn to race with the machine.”

- Erik Brynjolfsson, Innovation Researcher

Libraries are busy making lots of metadata and data networks. But who are we making this for anyway? Answer: The Machines

I spent the last week catching up on what the TED Conference has to say on robots, artificial intelligence and what these portend for the future of humans…all with an eye on the impact on my own profession: librarians.

A digest of the various talks would go as follows:

    • Machine learning and AI capabilities are advancing at an exponential rate, just as forecast
    • Robots are getting smarter and more ubiquitous by the year (Roomba, Siri, Google self-driving cars, drone strikes)
    • Machines are replacing humans at an increasing rate and impacting unemployment rates

The experts are personally torn on the rise of the machines, noting that there are huge benefits to society, but that we are facing a future where almost every job will be at risk of being taken by a machine. Jeremy Howard used words like “wonderful” and “terrifying” in his talk about how quickly machines are getting smarter (quicker than you think!). Erik Brynjolfsson (quoted above) shared a mixed optimism about the prospects this robotification holds for us, saying that a major retooling of the workforce and even the way society shares wealth is inevitable.

Personally, I’m thinking this is going to be more disruptive than the Industrial Revolution, which stirred up some serious feelings as you may recall: Unionization, Urbanization, Anarchism, Bolshevikism…but also some nice stuff (once we got through the riots, revolutions and Pinkertons): like the majority of the world not having to shovel animal manure and live in sod houses on the prairie. But what a ride!

This got me thinking about the end game the speakers were loosely describing and how it relates to libraries. In their estimation, we will see many, many jobs disappear in our lifetimes, including lots of knowledge worker jobs. Brynjolfsson says the way we need to react is to integrate new human roles into the work of the machines. For example, having AI partners that act as consultants to human workers. In this scenario (already happening in healthcare with IBM Watson), machines scour huge datasets and then give their advice/prognosis to a human, who still gets to make the final call. That might work for some jobs, but I don’t think it’s hard to imagine that being a little redundant at some point, especially when you’re talking about machines that may even be smarter than their human partner.

But still, let’s take the typical public-facing librarian, already under threat by the likes of an ever-improving Google. As I discussed briefly in Rise of the Machines, services like Google, IBM Watson, Siri and the like are only getting better and will likely, and possibly very soon, put the reference aspect of librarianship out of business altogether. In fact, because these automated information services exist on mobile/online environments with no library required, they will likely exacerbate the library relevance issue, at least as far as traditional library models are concerned.

Of course, we’re quickly re-inventing ourselves (read how in my post Tomorrow’s Tool Library on Steroids), but one thing is clear: the library as the community’s warehouse and service center for information will be replaced by machines. In fact, a more likely model would be one where libraries pool community resources to provide access to cutting-edge AI services and the expensive data resources they depend on, if proprietary data even exists in the future (a big if, IMO).

What is ironic is that technical service librarians are actually laying the groundwork for this transformation of the library profession. Every time technical service librarians work out a new metadata schema, mark up digital content with microdata, write a line of RDF, enhance the SEO of their collections or connect a record to linked data, they are really setting the stage for machines to not only index knowledge, but understand its semantic and ontological relationships. That is, they’re building the infrastructure for the robot-infused future. Funny that.
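To make that groundwork concrete, here is a small illustrative sketch, written with rdflib, of the kind of linked-data statements catalogers are effectively producing; the record URI and author URI are invented for the example.

    from rdflib import Graph, Literal, Namespace, URIRef
    from rdflib.namespace import RDF

    SCHEMA = Namespace("http://schema.org/")

    g = Graph()
    book = URIRef("http://example.org/catalog/record/12345")   # invented record URI

    # Three triples: a type, a title, and a link to an external identity --
    # exactly the sort of data a machine can index and reason over.
    g.add((book, RDF.type, SCHEMA.Book))
    g.add((book, SCHEMA.name, Literal("Lila: An Inquiry into Morals")))
    g.add((book, SCHEMA.author, URIRef("http://example.org/authority/person/pirsig")))  # invented

    print(g.serialize(format="turtle"))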

As Brynjolfsson suggests, we will have to create new roles where we work side-by-side with the machines, if we are to stay employed.

On this point, I’d add that we very well could see that human creativity still trumps machine logic. It might be that this particular aspect of humanity doesn’t translate into code all that well. So maybe the robots will be a great liberation and we all get to be artists and designers!

Or maybe we’ll all lose our jobs, unite in anguish with the rest of the unemployed 99% and decide it’s time the other 1% share the wealth so we can all live off the work of our robots, bliss out in virtual reality and plan our next vacations to Mars.

Or, as Ray Kurzweil would say, we’ll just merge with the machines and trump the whole question of unemployment, let alone mortality.

Or we could just outlaw AI altogether and hold back the tide permanently, like they did in Dune. Somehow that doesn’t seem likely…and the machines probably won’t allow it. LOL

Anyway, food for thought. As Yoda said: “Difficult to see. Always in motion is the future.”

Meanwhile, speaking of movies…

If this subject intrigues you, Hollywood is also jumping into this intellectual meme, pushing out several robot and AI films over the last couple years. If you’re interested, here’s my list of the ones I’ve watched, ordered by my rating (good to less good).

  1. Her: Wow! Spike Jonze gives his quirky, moody, emotion-driven interpretation of the AI question. Thought provoking and compelling in every regard.
  2. Black Mirror, S02E01 – Be Right Back: Creepy to the max and coming to a bedroom near you soon!
  3. Automata: Bleak but interesting. Be sure NOT to read the expository intro text at the beginning. I kept thinking this was unnecessary to the film and ruined the mystery of the story. But still pretty good.
  4. Transcendence: A play on Ray Kurzweil’s singularity concept, but done with explosions and Hollywood formulas.
  5. The Machine: You can skip it.

Two more are on my must watch list: Chappie and Ex Machina, both of which look like they’ll be quality films that explore human-robot relations. They may be machines, but I love when we dress them up with emotions…I guess that’s what you should expect from a human being. :)


FOSS4Lib Updated Packages: Repox

planet code4lib - Fri, 2015-01-23 15:45

Last updated January 23, 2015. Created by Peter Murray on January 23, 2015.

REPOX is a framework to manage data spaces. It comprises several channels to
import data from data providers, services to transform data between schemas
according to user-specified rules, and services to expose the results to the
outside world.
This tailored version of REPOX aims to provide all TEL and Europeana partners with a
simple solution to import, convert and expose their bibliographic data via
OAI-PMH, by the following means:

  • Cross platform
    It is developed in Java, so it can be deployed in any
    operating system that has an available Java virtual machine.
  • Easy deployment
    It is available with an easy installer, which includes
    all the required software.
  • Support for several data formats and encodings
    It supports UNIMARC and MARC21 schemas, and encodings in ISO 2709 (including several variants),
    MarcXchange or MARCXML. During the course of the TELplus project, support
    will be added for other possible encodings required by the partners.
  • Data crosswalks
    It offers crosswalks for converting UNIMARC and MARC21 records to simple
    Dublin Core as well as to TEL-AP (TEL Application Profile). A simple user
    interface makes it possible to customize these crosswalks and create new
    ones for other formats.
Package Type: Metadata Manipulation
Development Status: Production/Stable
Operating System: Browser/Cross-Platform
Technologies Used: Dublin Core, MARC21, MARCXML, OAI, Tomcat
Programming Language: Java
Database: MySQL, PostgreSQL
Open Hub Link: https://www.openhub.net/p/repox
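Since REPOX exposes records over OAI-PMH, any standard harvester can pull them; a minimal Python sketch of a ListRecords request (the endpoint URL is a placeholder, not a real REPOX installation) looks like this:

    import requests
    import xml.etree.ElementTree as ET

    OAI = "{http://www.openarchives.org/OAI/2.0/}"
    endpoint = "http://example.org/repox/OAIHandler"   # placeholder REPOX base URL

    # Ask for simple Dublin Core records and print each record identifier.
    resp = requests.get(endpoint, params={"verb": "ListRecords", "metadataPrefix": "oai_dc"})
    resp.raise_for_status()

    root = ET.fromstring(resp.content)
    for header in root.iter(OAI + "header"):
        print(header.findtext(OAI + "identifier"))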

FOSS4Lib Upcoming Events: 2015 VIVO implementation Fest

planet code4lib - Fri, 2015-01-23 15:31
Date: Monday, March 16, 2015 - 08:00 to Wednesday, March 18, 2015 - 17:00
Supports: VIVO

Last updated January 23, 2015. Created by Peter Murray on January 23, 2015.

The i-Fest will be held March 16-18 and is being hosted by the Oregon Health and Science University Library in Portland, Oregon.

For further details about the i-Fest program, registration, travel, and accommodations, visit the blog post on vivoweb.org at http://goo.gl/wdOwMf

OCLC Dev Network: Developer House Project: "Today in History" Style Holdings Showcase

planet code4lib - Fri, 2015-01-23 14:00

We’re excited to start sharing more information about the projects created at Developer House in December. This first project comes to you from: Bilal Khalid, Emily Flynn, Francis Kayiwa, Rachel Maderik, Scott Hanrath, and Shawn Denny.

LITA: What Do You Do With a 3D Printer?

planet code4lib - Fri, 2015-01-23 13:00

“Big mac, 3D printer, 3D scanner” by John Klima is licensed under CC BY 2.0

This is the first in a series of posts about some technology I’ve introduced or will be introducing to my library. In my mind, the library is a place where the public can learn about new and emerging technologies without needing to invest in them. To that end, I’ve formed a technology committee at our library that will meet quarterly to talk about how we’re using the existing technology in the building and what type of technology we could introduce to the building.

The next two paragraphs have some demographic information so that you have an idea of whom I’m trying to serve (i.e., you can skip them if you want to get to the meat of the technology discussion).

I work at the Waukesha Public Library in the city of Waukesha, the 7th largest municipality in WI at around 72,000 people. We have a service population of almost 100,000. The building itself is about 73,000 square feet with a collection of around 350,000 items.

Waukesha has a Hispanic population of about 10% with the remainder of our population being predominantly Caucasian. Our public is a pretty even mix across age groups and incomes. Technological interest also runs pretty evenly from early adopters to neophytes.

I’ve wanted a 3D printer forever. OK, only a few years, but in the world of technology a few years is almost forever. I didn’t bring up the idea to our executive director initially because I wasn’t sure I could justify the expense.

As assistant director in charge of technology at the library, I can justify spending up to a few hundred dollars on new technology. Try out a Raspberry Pi? Sure. Pick up a Surface? Go ahead. But spending a few thousand dollars? That felt like it needed more than my whim.

But after those few years went by and 3D printers were still a topic of discussion and I didn’t have one yet, I approached the executive director and our Friends group and got the money to buy a MakerBot Replicator 2 and a MakerBot Digitizer (it was the Digitizer that finally pushed me over the precipice to buy 3D equipment; more on that later).

So we bought the machine, set it up, and started printing a bunch of objects. At first it was just things on an SD card in the printer: a nut-and-bolt set, a shark, chain links, a comb, and a bracelet.

People loved watching the machine work. Particularly when it was making the chain links. They couldn’t understand how it could print interconnected chain links. I tried to explain that it printed in 100-micron-thick layers (slightly thinner than a sheet of paper) and built the objects up one layer at a time, which let it make interconnected objects.

It made more sense if you could watch it.

Our young adult librarian started making plans for her teen patrons. This past October we read Edgar Allan Poe as a community read, and she had her teens make story jars of different Edgar Allan Poe stories using objects we printed: hearts, ravens, bones, coffins, etc.

One of our children’s librarians used the printer to enhance a board-game design program he ran. He printed out dice, figures, and markers that the kids could use when designing a game. Then they got to take their game home when they finished it. More recently he printed out a chess set that assembles into a robot for the winner of our upcoming chess tournament.

I printed out hollow jack o’ lanterns that showed a spooky face when you placed a small electric light inside them. When I realized I needed a desk organizer for the 3D printer I printed one instead of buying one.

“Mushroom candy tin and friend” by John Klima is licensed under CC BY 2.0

Now, as for the Digitizer. We’ve tried digitizing objects. To me that was the coolest thing we could do: make copies of physical objects. Unfortunately, the digitizer has worked poorly at best. It cannot handle small objects—things larger than an egg work best—and it cannot scan complicated or dull objects very well.

Our failures include a kaiju wind-up toy, a LEGO Eiffel Tower, and a squishy stressball brain. Our only success was a Mario Bros. mushroom candy tin. That scanned perfectly, but it’s round, shiny, and the perfect size. If you’re considering buying a digitizer, I would think twice about it (honestly, I’d recommend not getting one at this time).

Now the question I ask is: what’s next? The Replicator 2 isn’t the best machine to put out for public use as it would require quite a bit of staff oversight. There are some 3D printers—the Cube printer from 3D Systems for example—that are better suited for public use in my opinion. It’s currently a moot point as we don’t have space in our public area for one at this time, but I think offering one for public use is in our future plans somewhere down the line.

I’d like to use it more for programming in the library. I want to showcase it to the public more. Our technology committee will make plans so that we can do both of those things.

More importantly, what about the rest of you? Who has a 3D printer in their building? Do you use it for staff or the public? Do you want to get a 3D printer for your library? What sorts of questions do you have about them?

DuraSpace News: Announcing the 2015 VIVO Implementation Fest

planet code4lib - Fri, 2015-01-23 00:00

From the VIVO i-Fest Planning Team

We're excited to invite you to the 2015 VIVO implementation Fest (i-Fest) where it doesn't matter if you're a seasoned VIVO aficionado or someone who's just begun to learn about VIVO! 

The i-Fest will be held March 16-18 and is being hosted by the Oregon Health and Science University Library in Portland, Oregon.

District Dispatch: Where the heck did all of these librarians come from?

planet code4lib - Thu, 2015-01-22 22:38

We’re taking part in Copyright Week, a series of actions and discussions supporting key principles that should guide copyright policy. Every day this week, various groups are taking on different elements of the law, and addressing what’s at stake, and what we need to do to make sure that copyright promotes creativity and innovation.

Today’s topic is transparency, but I chose to write about librarians.

We have a good number of librarians who, beyond a doubt, are copyright geeks, like me. In fact, we call ourselves copyright geeks, especially now that the term “geek” has gained such popularity. These are librarians—a few with JDs—who attend conferences like the Berkeley Center for Law and Technology symposium on copyright formalities. Really, who would find “Constraints and flexibilities in the Berne Convention” an attention-grabbing program? (I loved it!!)

What do we do? Crazy things like studying Congressional hearings from the 1970s, citing eBay v. MercExchange at CopyNight, and reading the entire 130-page Hargreaves Digital Economy Report. You can find our horde at any American Library Association (ALA) conference program, meeting or discussion group that has anything to do with copyright. We make ourselves available to the profession, teaching other librarians about copyright, social responsibility and, of course, the four factors of fair use. Of course, we do not give legal advice, but we often know more about copyright law than the typical counsel retained by libraries or educational institutions. Yet we are not snobbish. We have our copyright scholar heroes, and we pester them, prizing any new gem of copyright knowledge that they might utter.

The increased interest in copyright is often interconnected with technological advancement and innovation (what else?), and the desire to use technology to the fullest extent – so we can preserve, lend, data mine, and rely on fair use. But way back in the day—yes, the time before the internet—there were librarians with copyright expertise formidable enough to represent library communities across this great nation in U.S. Congressional copyright policy-making since before the Copyright Act of 1976. These librarians were primarily ALA and Association of Research Libraries (ARL) staff. Current staff at these same associations, along with staff at the Association of College and Research Libraries (ACRL), formed a coalition more than 15 years ago called the Library Copyright Alliance (LCA). We were plodding along before the cooler kids (EFF and Public Knowledge) moved into the copyright neighborhood.

What sets librarian copyright geeks and their associations apart from the cool kids? We have continuing contact with the public, and we talk to them. If a member of the public has a copyright need, we help them. And if this member of the public has an issue with government copyright policy, we tell them how to contact their Member of Congress.

Plus we have thousands of association members who believe in civil society and are probably more likely to vote in an election. We might be stuck with the librarian stereotype, but on the other hand, our library communities have great trust in us. While it’s true that we don’t have the lobbying resources that large corporations have, and we can’t introduce folks to Angelina Jolie, we hold our own.

So in honor of Copyright Week, all hail the copyright librarians!! (Did you see – we even have a television show!!)

The post Where the heck did all of these librarians come from? appeared first on District Dispatch.
