Feed aggregator

DPLA: Reflections on Community Currents at #DPLAfest

planet code4lib - Wed, 2016-06-15 14:33

This guest post was written by T-Kay Sangwand, Librarian for Digital Collection Development, Digital Library Program, UCLA and DPLA + DLF ‘Cross-Pollinator.’ (Twitter: @tttkay)

As an information professional committed to social justice and to employing a critical lens to examine the impact of our work, I always look forward to seeing how these principles, and issues of diversity and representation in the profession and the historical record, are discussed more widely in national forums. In my new role as Librarian for Digital Collection Development at UCLA’s Digital Library Program, I grapple with how our work as a digital library can serve our predominantly people of color campus community within the larger Los Angeles context, a city also predominantly composed of people of color. As a first-time attendee of DPLAfest, I was particularly interested in how DPLA frames itself as a national digital library for a country projected to have a majority person of color population by 2060. I observed that the DPLAfest leadership did not yet reflect the country’s changing demographics: the opening panel featured eight speakers, yet only one was a woman and only two were people of color.

The opening panel of DPLAfest was filled with many impressive statistics – over 13 million items in DPLA, over 1900 contributors, over 30 partners, over 100 primary source sets, with all 50 states represented by the collections. While these accomplishments merit celebration, I appreciated Dr. Kim Christen Withey’s Twitter comment that encourages us to consider alternate frameworks of success:

#DPLAfest lots of talk of numbers–presumably the bigger the better–how else can we think about success? esp in the digital content realm?

— Kim Christen Withey (@mukurtu) April 14, 2016

“Tech Trends in Libraries” panelists Carson Block, Alison Macrina, and John Resig discuss ‘big data’ and libraries. Photo by Jason Dixson

While the amount of material or information we have access to is frequently used as a measure of success, several panels, such as The People’s Archives: Communities and Documentation Strategy; Wax Works in the Age of Digital Reproduction: The Futures of Sharing Native/First Nations Cultural Heritage; and Technology Trends in Libraries, encouraged more nuanced discussions of success by examining the complexities of access. The conversation between Alison Macrina of the Library Freedom Project and John Resig of Khan Academy critically interrogated the celebration of big data. Macrina reminds libraries to ask: Who owns big data? What is the potential for exploitation? Who has access? How do we negotiate questions of privacy for individuals while not allowing institutions to escape accountability?

The complexities of access and privacy were further explored in the community archives sessions. Community archivists Carol Steiner and Keith Wilson from the People’s Archive of Police Violence in Cleveland spoke on storytelling as a form of justice in the face of impunity, but also on the real concerns of retribution for archiving citizen stories of police abuse. Dr. Kim Christen Withey spoke on traditional knowledge labels and the Mukurtu content management system, which privileges indigenous communities’ knowledge about themselves and enables a continuum of access instead of a binary open/closed model of access. In both of these cases, exercising control over one’s self and community representation constitutes a form of agency in the face of the symbolic annihilation that traditional archives and record keeping have historically wreaked on marginalized communities. Additionally, community investment in these documentation projects outside traditional library and archive spaces has been key to their sustainability. In light of this, Bergis Jules raised the important question of “what is or should be the role of large scale digital libraries, such as DPLA, in relation to community archives?” First and foremost, I think our role as information professionals is to listen to communities’ vision(s) for their historical materials; it’s only then that we may be able to contribute to and support communities’ agency in documentation and representation. I’m grateful that participants created space within DPLA to have these nuanced discussions and I’m hopeful that community-driven development can be a guiding principle in DPLA’s mission.

For a closer read of the aforementioned panels, see my Storify: Community Archives @ DPLAfest.

Special thanks to the Digital Library Federation for making the DPLAfest Cross-Pollinator grant possible.

Open Knowledge Foundation: Introducing The New Proposed Global Open Data Index Survey

planet code4lib - Wed, 2016-06-15 11:00

The Global Open Data Index (GODI) is one of the core projects of Open Knowledge International. Originally launched in 2013, it has quickly grown and now measures open data publication in 122 countries. GODI is a community tool, and throughout the years the open data community has taken an active role in shaping it by reporting problems, discussing issues on GitHub and in our forums, and sharing success stories. We welcome this feedback with open arms, and in 2016 it has proved invaluable in helping us produce an updated set of survey questions.

In this blogpost we are sharing the first draft of the revised GODI survey. Our main objective in updating the survey this year has been to improve the clarity of the questions and provide better guidance to submitters in order to ensure that contributors understand what datasets they should be evaluating and what they should be looking for in those datasets. Furthermore, we hope the updated survey will help us to highlight some of the tangible challenges to data publication and reuse by paying closer attention to the contents of datasets.

Our aim is to adopt this new survey structure for future editions of GODI as well as the Local Open Data Index and we would love to hear your feedback! We are aware that some changes might affect the comparability with older editions of GODI and it’s for this reason that your feedback is critical. We are especially curious to hear the opinion of the Local Open Data Index community. What do you find positive? Where do you see issues with your local index? Where could we improve?

Below, we present the ideas behind the new survey. You will find a detailed comparison of old and new questions in this table.

A brief overview of the proposed changes:

  • Better measure and document how easy it is to find government data online
  • Enhance our understanding of the data we measure
  • Improve the robustness of our analysis


  1. Better measure and document how easy or difficult it is to find government data online

Even if governments are publishing data, if potential users cannot find it, then it goes without saying that they will not be able to use it. In our revised version of the survey, we ask submitters to document where they found a given dataset as well as how much time they needed to find it. We recognise this to be an imperfect measure, as different users are likely to vary in their capacity to find government data online. However, we hope that this question will help us to extract critical information about the usability challenges that are not easily captured by a legal and technical analysis of a given dataset, even if it will be difficult to quantify the results and therefore use them in the scoring.

  2. Enhance our understanding of the data we measure

It is common for governments to publish datasets in separate files and places. Contributors might find department spending data scattered across different department websites or, even when it is made available in one place such as a portal, split across multiple files. Some of this data might be openly licensed, some machine-readable, and some available only as PDFs. Sometimes non-machine-readable data is available without charge, while machine-readable files are available for a fee. In the past, this has proven to be an enormous challenge for the Index, as submitters are forced to decide what data should be evaluated (see this discussion in our forum).

The inconsistent publication of government data leads to confusion among our submitters and negatively impacts the reliability of the Index as an assessment tool. Furthermore, we think it is safe to say that if open data experts are struggling to find or evaluate datasets, potential users will face similar challenges, and as such the inconsistent and sporadic data publication policies of governments are likely to affect data uptake and reuse. In order to ensure that we are comparing like with like, GODI assesses the openness of clearly defined datasets. These dataset definitions are what we have determined, in collaboration with experts in the field, to be essential government data – data that contains crucial information for society at large. If a submitter only finds parts of this information in a file, or scattered across different files, then rather than assessing the openness of key datasets we end up assessing a partial snapshot that is unlikely to be representative. There is more at stake than our ability to assess the “right” datasets – incoherent data publication significantly limits the capacity of civil society to tap into the full value of government data.

  3. Improve the robustness of our analysis

In the updated survey, we will determine whether datasets are available from one URL by asking “Are all the data downloadable from one URL at once?” (formerly “Available in bulk?”). To respond in the affirmative, submitters will have to demonstrate that all required data characteristics are made available in one file. If the data cannot be downloaded from one URL, or if submitters find multiple files at one URL, they will be asked to select one dataset, from one URL, which meets the greatest number of requirements and is available free of charge. Submitters will document why they’ve chosen this dataset and data source in order to help reviewers understand the rationale for choosing a given dataset and to aid in verifying sources.

The subsequent question, “Which of these characteristics are included in the downloadable file?”, will help us verify that the dataset submitted does indeed contain all the requisite characteristics. Submitters will assess the dataset by selecting each individual characteristic contained within it. Not only will this prompt contributors to really verify that all the established characteristics are met, it will also allow us to gain a better understanding of the components commonly missing when governments publish data, thus giving civil society a better foundation to advocate for publishing the crucial data. In our results we will more explicitly flag which elements are missing and declare fully open only those datasets that match all of our dataset requirements.


This year, we are committed to improving the clarity of the survey questions: 

  1. “Does the data exist?” – The first question in previous versions of the Index was often confusing for submitters and has been reformulated to ask: “Is the data published by government (or a third party related to government)?” If the response is no, contributors will be asked to justify their response. For example, does the collection, and subsequent publication, of this data fall under the remit of a different level of government? Or perhaps the data is collected and published (or not) by a private company? There are a number of legal, social, technical and political reasons that might mean that the data we are assessing simply does not exist, and the aim of this question is to help open data activists advocate for coherent policies around data production and publication (see past issues with this question here and here).
  2. “Is data in digital form?” – The objective of this question was to cover cases where governments provided large data on DVDs, for example. However, users have commented that we should not ask for features that do not make data more open. Ultimately, we have concluded that if data is going to be usable for everyone, it should be online. We have therefore deleted this question.
  3. “Publicly Available?” – We merged “Publicly available?” with “Is the data available online?”. The reason is that we only want to reward data that is publicly accessible online without mandatory registration (see for instance discussions here and here).
  4. “Is the data machine-readable?” – There have been a number of illuminating discussions in regards to what counts as a machine-readable format (see for example discussions here and here). We found that the question “Is the data machine-readable?” was overly technical. Now we simply ask users “In which file formats are the data?”. When submitters enter the format, our system automatically recognises whether the format is machine-readable and in an open format (a minimal sketch of such a lookup appears after this list).
  5. “Openly licensed” – Some people argued that the question “Openly licensed?” does not adequately take into account the fact that some government data are in the public domain and not under the protection of copyright. As such, we have expanded the question to “Is the data openly licensed/in the public domain?”. If data are not under the protection of copyright, they do not necessarily need to be openly licensed; however, a clear disclaimer must be provided informing users about their copyright status (which can be in the form of an open licence). This change is in line with the Open Definition 2.1. (See discussions here and here.)
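To make the format-lookup idea concrete, here is a minimal sketch of how such a check could work. This is an illustration only, not GODI’s actual implementation; the table entries and function name are invented for this example.

    from typing import Tuple

    # Illustrative lookup table: format -> (machine_readable, open_format).
    # These entries are examples, not GODI's actual registry.
    FORMAT_PROPERTIES = {
        "csv":  (True, True),
        "json": (True, True),
        "xls":  (True, False),   # machine-readable, but a proprietary format
        "pdf":  (False, False),  # neither machine-readable nor open in this sense
    }

    def assess_format(fmt: str) -> Tuple[bool, bool]:
        """Return (machine_readable, open_format) for a submitter-entered format."""
        # Unknown formats fail safe: treated as neither machine-readable nor open.
        return FORMAT_PROPERTIES.get(fmt.strip().lower(), (False, False))

    print(assess_format("CSV"))  # (True, True)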

Looking forward to hearing your thoughts on the forum or in the comments on this post!

Islandora: The Islandora Long Tail is now Awesome

planet code4lib - Wed, 2016-06-15 10:03

I've been posting about the Long Tail of Islandora for a while now, putting a spotlight on Islandora modules developed and shared by members of our community. It's a good way to find new tools and modules that might answer a need you have on your site (so you don't have to build your own from scratch). We've also kept an annotated list of community developed modules in our Resources section, but it had a tendency to get a little stale and sometimes miss great work that wasn't happening in places we expect.

Enter the concept of the Awesome List, a curated list of awesome lists, complete with helpful guidelines and policies that we could crib from to make our own list of all that is awesome for Islandora. It now lives in our Islandora Labs GitHub organization, and new contributions are very welcome. You can share your own work, your colleagues' work, or any public Islandora resource that you think other Islandorians might find useful. If you have something to add, please put in a pull request or email me.

Awesome Islandora

pinboard: Google Groups

planet code4lib - Wed, 2016-06-15 02:48
Hey #Code4Lib Southeastern folk - #C4LSE is reopening a regional dialogue. Join us?

DuraSpace News: Running Effective Institutional Repositories: A Look at Best Practices

planet code4lib - Wed, 2016-06-15 00:00

From Sarah Tanksalvala, Thomson Reuters. Institutional repositories are an increasingly common feature of universities, creating a database of scholarly and educational work produced by a university’s faculty and students. Done right, they can create a showcase for researchers and students hoping to demonstrate their scholarship, while at the same time showcasing the university’s achievements as a whole.

Peter Sefton: Open Repositories 2016: Demo: A repository before breakfast

planet code4lib - Tue, 2016-06-14 22:00

I have just returned from the Open Repositories 2016 conference in Dublin where I did a demo in the Developer Track, curated by my colleagues Claire Knowles and Adam Field. The demo went OK, despite being interrupted by a fire alarm.

Here’s my abstract:

Presented by Peter Sefton, University of Technology, Sydney peter.sefton@uts.edu.au

In this session I’d like to show off the technical side of the open source platform Ozmeka (based on Omeka), which was presented at OR2015.

In the demo I will:

  • Spin up a fresh instance of a repository using a vagrant script my team prepared earlier.

  • Show how to populate the repository via a CSV file, complete with multiple different item types (people, creative works, that sort of thing) with relations between them.

  • Demonstrate that this is a Linked-data-ish system, with relations between items in the repo, and external authorities and talk about why this is better than using string-based metadata which is still the default in most repository systems.

  • Talk about why it is worth considering Omeka/Ozmeka for small-to-medium repository and website development.

To which I added:

Demo loading the same data into a Fedora 4 repository.

The spreadsheet format I demoed is still a work in progress, which I will document in the GitHub project; I think it shows promise as a way of creating simple websites from data, including multiple types of object and nested collections. I took the First Fleet maps data and munged it a little to create a linked-data set for this demo. As downloaded, the data is a list of map images. I added a couple of extra rows:

  • Two collections, one for the maps and one for people
  • Entries for the people mentioned in the metadata as the creators of the maps

And extra columns for relationships (a sketch of the resulting CSV follows these lists):

  • Collection membership via a pcdm:Collection column.
  • A REL:dc:creator for the dublin core creator relationship.
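To give a feel for the shape of that spreadsheet, here is a minimal sketch that writes a few rows in roughly this format. The row identifiers, titles, and exact column headers are assumptions for illustration; the real conventions are documented in the Ozmeka project.

    import csv

    # Illustrative rows: a collection, a person, and a map that references both.
    # Column names follow the pcdm:Collection / REL:dc:creator conventions
    # described above; the exact headers used by Ozmeka may differ.
    rows = [
        {"id": "maps", "dcterms:title": "First Fleet maps",
         "item_type": "Collection", "pcdm:Collection": "", "REL:dc:creator": ""},
        {"id": "person-1", "dcterms:title": "William Bradley",
         "item_type": "Person", "pcdm:Collection": "people", "REL:dc:creator": ""},
        {"id": "map-1", "dcterms:title": "Chart of Sydney Cove",
         "item_type": "Still Image", "pcdm:Collection": "maps",
         "REL:dc:creator": "person-1"},  # relation expressed by row id, not a string
    ]

    with open("demo-data.csv", "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)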
A sample Omeka page

What is this?

I presented a paper last year, co-authored with Sharyn Wise, about an Omeka-based project we did at UTS, building a cross-disciplinary research repository, Dharmae. This time I just wanted to do a quick demo for the developer track showing how easy it is to get started with a dev version of Omeka, and also show some early work on a Python API for Fedora 4.

Audience

This is for developers who can run Python and understand virtual environments. NOTE: These instructions have not been independently tested; you will probably need to do some problem solving to get this to run, including, but not limited to, running both Python 3 and Python 2.

Get the dependencies up and running
  1. Get & run Omeka via this vagrant script, put together by Thom McIntyre.
  • Get an API Key via http://localhost:8080/admin/users/api-keys/1

  • Install the item relations plugin (it’s there, you just need to activate it via the install button) http://localhost:8080/admin/plugins

  2. Get the One-click-run Fedora Application from the Fedora downloads page.
Import some data into Ozmeka

Assuming Omeka is running, as per the instructions above.

NOTE: This is a Python 2 script.

  1. Check out the Ozmeka Python Utils.

  2. Follow the instructions on how to upload some sample data to Omeka from a CSV file.

Remember your API key, and remember to install the Item Relations plugin.

Import the same data into Fedora 4

NOTE: this is a Python 3 Script.

Also, note that Fedora 4 doesn’t come with a web interface - you’ll just be putting data into it in a raw form, like this (a rough sketch of the underlying REST calls follows the numbered steps below):

Data in Fedora 4
  1. Start Fedora by running the Jar file (try double-clicking it).
  2. Select port 8081
  3. Click Start
  4. Install our experimental Fedora api client for Python 3.
  5. Follow the instructions to import csv data into Fedora.
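For a sense of what an import like this does under the hood, here is a minimal sketch against Fedora 4’s LDP-style REST API, assuming Fedora is running on port 8081 as configured above and that the requests library is installed. The resource path and Turtle body are invented for illustration; this is not the actual client code from the experimental API.

    import requests

    FEDORA = "http://localhost:8081/rest/"  # Fedora 4 REST endpoint on port 8081

    # Create a container resource with a title triple (the slug "maps" is invented).
    resp = requests.put(
        FEDORA + "maps",
        headers={"Content-Type": "text/turtle"},
        data='<> <http://purl.org/dc/elements/1.1/title> "First Fleet maps" .',
    )
    resp.raise_for_status()  # expect 201 Created

    # Read the resource back as Turtle to confirm the data landed.
    print(requests.get(FEDORA + "maps", headers={"Accept": "text/turtle"}).text)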

Thanks to Mike Lynch for the Fedora API code.

District Dispatch: A “Library for All” around the world and volunteer opportunity

planet code4lib - Tue, 2016-06-14 19:45

June 8, 2016 meeting of the Library for All Board of Directors, Advisory Board, and Young Professional Board in New York City.

Last week, I was in New York City for a board meeting for Library For All (LFA), a nonprofit organization that has built a digital library platform to address the lack of access to quality educational materials in developing countries. Among other things, I learned about the latest LFA success—in Cambodia, where the kids’ demand for ebooks came to exceed the supply, at least temporarily.

Designed for low-bandwidth environments, the LFA digital library is a customizable, user-friendly digital platform that delivers ebooks to low cost devices such as mobile phones, tablets and computers. The collection is filled with content that is culturally relevant and available in local and international languages. The Library currently reaches readers in Haiti, Democratic Republic of Congo, Rwanda, Cambodia, and Mongolia.

The Volunteer Opportunity:  Country Curators

LFA has a particular need for curators of specialized collections for their country libraries. Some of the topical areas include girls, early grade literacy, adult literacy, and health—but other topics are of interest as well.

Responsibilities of the volunteer curator may include:

  • Identify titles that will make up a specific collection
  • Research and develop a broad understanding of the collection you’re curating
  • Research existing open source content and evaluate what publishers/NGOs already have available
  • Work with the Content Manager to reach out to existing publishers who may have suitable content
  • Work with the Content Manager and take ownership of the curation and implementation of the collection
  • By the end of your time, have a specialized collection uploaded and being read in the digital library
  • Reach out to publishers and NGOs to see if LFA can use their content on the library platform
  • Add metadata to the content and upload books into the Digital Assets Management system

The specific tasks and timetable for a given volunteer will vary and are flexible, though generally LFA seeks those who can provide a block of time over a couple of months rather than a lesser engagement over many months. Fluency in English is required; fluency in French or one of the current LFA local languages (Haitian Creole, Khmer, Mongolian) is a plus but not essential.

For Further Information about the Opportunity or LFA

Those who are interested in learning more should contact Georgia Tyndale at Georgia@libraryforall.org. Also, note that Rebecca McDonald, CEO of Library For All, will be at the upcoming ALA Annual Conference in Orlando. Those interested in learning more about this volunteer opportunity or about LFA generally are invited to meet with her there. To arrange a meeting, contact Rebecca at rebeccam@libraryforall.org.

The post A “Library for All” around the world and volunteer opportunity appeared first on District Dispatch.

Jonathan Rochkind: Handy introspection for debugging Rails routes

planet code4lib - Tue, 2016-06-14 15:44

I always forget how to do this, so leave this here partly as a note to myself. From Zobie’s Blog and Mike Blyth’s Stack Overflow answer

 

    routes = Rails.application.routes

    # figure out what route a path maps to:
    routes.recognize_path "/station/index/42.html"
    # => {:controller=>"station", :action=>"index", :format=>"html", :id=>"42"}
    # or get an ActionController::RoutingError

    # figure out what url is generated for params -- what url corresponds
    # to certain controller/action/parameters...
    routes.generate :controller => :station, :action => :index, :id => 42

If you have an isolated Rails engine mounted, its paths seem not to be accessible from the `Rails.application.routes` router. You may need to try that specific engine’s router, like `Spree::Core::Engine.routes`.

It seems to me there’s got to be a way to get the actual ‘master’ router that’s actually used for recognizing incoming URLs, since there’s got to be one that sends to the mounted engine routes as appropriate based on paths. But I haven’t figured out how to do that.



David Rosenthal: Decentralized Web Summit

planet code4lib - Tue, 2016-06-14 15:00
Photo: Brad Shirakawa/Internet Archive

This is a quick report from the Decentralized Web Summit. First, Brewster Kahle, Wendy Hanamura and the Internet Archive staff deserve great praise for assembling an amazing group of people and running an inspiring and informative meeting. It was great to see so many different but related efforts to build a less centralized Web.

Pictures and videos are up here. You should definitely take the time to watch at least the talks on the second day, and the panel moderated by Kevin Marks, in particular this contribution from Zooko Wilcox. He provides an alternative view on my concerns about Economies of Scale in Peer-to-Peer Networks.

I am working on a post about my reactions to the first two days (I couldn't attend the third) but it requires a good deal of thought, so it'll take a while.

Mark E. Phillips: Comparing Web Archives: EOT2008 and EOT2012 – What

planet code4lib - Tue, 2016-06-14 14:30

This post carries on from where the previous post in this series ended.

A very quick recap: this series is trying to better understand the EOT2008 and EOT2012 web archives. The goal is to see how they are similar, how they are different, and whether there is anything to be learned that will help us with the upcoming EOT2016 project.

What

The CDX files we are using have a column that contains the Media Type (MIME Type) for the different URIs in the WARC files. A list of the assigned Media Types is available at the Internet Assigned Numbers Authority (IANA) in their Media Type Registry.

This is a field that is inherently “dirty” for a few reasons. It is populated from a field in the WARC record that comes directly from the web server that responded to the initial request. Usually these values are fairly accurate, but there are many cases where they are wrong or at least confusing. Oftentimes this is caused by a server administrator, programmer, or system architect who was trying to be clever, or just misconfigured something.

I looked at the Media Types for the two EOT collections to see if there are any major differences between what we collected in the two EOT archives.

In the EOT2008 archive there are a total of 831 unique Mime/Media Types; in the EOT2012 archive there are 1,208 unique type values.
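As a rough sketch of how such a tally can be computed (assuming a space-delimited CDX file in the common “N b a m s k r M S V g” layout, where the MIME type is the fourth field; the file names and field index are assumptions, not details from the EOT data):

    from collections import Counter

    def media_type_counts(cdx_path, mime_field=3):
        """Tally MIME types from a space-delimited CDX file, one capture per line."""
        counts = Counter()
        with open(cdx_path, encoding="utf-8", errors="replace") as f:
            for line in f:
                if line.startswith(" CDX"):  # skip the CDX header line if present
                    continue
                fields = line.split()
                if len(fields) > mime_field:
                    counts[fields[mime_field]] += 1
        return counts

    eot2008_counts = media_type_counts("eot2008.cdx")
    eot2012_counts = media_type_counts("eot2012.cdx")
    print(len(eot2008_counts), len(eot2012_counts))  # unique types per archive
    print(eot2008_counts.most_common(20))            # top 20, as in the table below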

I took the top 20 Mime/Media Types for each of the archives and pushed them together to see if there was any noticeable change in what we captured between the two archives.  In addition to just the raw counts I also looked at what percentage of the archive a given Media Type represented.  Finally I noted the overall change in those two percentages.

Media Type | 2008 Count | % of Archive | 2012 Count | % of Archive | % Change | Change in % of Archive
---|---|---|---|---|---|---
text/html | 105,592,852 | 65.9% | 116,238,952 | 59.9% | 10.1% | -6.0%
image/jpeg | 13,667,545 | 8.5% | 24,339,398 | 12.5% | 78.1% | 4.0%
image/gif | 13,033,116 | 8.1% | 8,408,906 | 4.3% | -35.5% | -3.8%
application/pdf | 10,281,663 | 6.4% | 7,097,717 | 3.7% | -31.0% | -2.8%
– | 4,494,674 | 2.8% | 613,187 | 0.3% | -86.4% | -2.5%
text/plain | 3,907,202 | 2.4% | 3,899,652 | 2.0% | -0.2% | -0.4%
image/png | 2,067,480 | 1.3% | 7,356,407 | 3.8% | 255.8% | 2.5%
text/css | 841,105 | 0.5% | 1,973,508 | 1.0% | 134.6% | 0.5%

Because I like pictures, here is a chart of the percent change.

Change in Media Type

If we compare the Media Types between the two archives we find that the two archives share 527 Media Types.  The EOT2008 archive has 304 Media Types that aren’t present in EOT2012 and EOT2012 has 681 Media Types that aren’t present in EOT2008.
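Continuing the sketch above (and assuming the `eot2008_counts` and `eot2012_counts` Counters built earlier), the overlap figures and the percentage changes reported below reduce to set operations and a small helper:

    # Continuing the earlier sketch: compare the two archives' media types.
    shared    = set(eot2008_counts) & set(eot2012_counts)  # 527 shared types
    only_2008 = set(eot2008_counts) - set(eot2012_counts)  # 304 types
    only_2012 = set(eot2012_counts) - set(eot2008_counts)  # 681 types

    def pct_change(old, new):
        """Percentage change from old to new, as shown in the tables below."""
        return (new - old) / old * 100.0

    print(round(pct_change(13667545, 24339398), 1))  # image/jpeg: 78.1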

The ten most frequent Media Types by count found only in the EOT2008 archive are presented below.

Media Type | Count
---|---
no-type | 405,188
text/x-vcal | 17,368
.wk1 | 8,761
x-text/tabular | 5,312
application/x-wp | 5,158
* | 4,318
x-application/pdf | 3,660
application/x-gunzip | 3,374
image/x-fits | 3,340
WINDOWS-1252 | 2,304

The ten most frequent Media Types by count found only in the EOT2012 archive are presented below.

Media Type | Count
---|---
warc/revisit | 12,190,512
application/http | 1,050,895
application/x-mpegURL | 23,793
img/jpeg | 10,466
audio/x-flac | 7,251
application/x-font-ttf | 7,015
application/x-font-woff | 6,852
application/docx | 3,473
font/ttf | 3,323
application/calendar | 2,419

In the EOT2012 archive, the team that captured content had fully moved to the WARC format for storing Web archive content. The warc/revisit records are records for URLs whose content had not changed across more than one crawl. Instead of storing the content again, the warc/revisit record holds a reference to the previously captured version. That’s why there are so many of these Media Types.

Below is a table showing the thirty most changed Media Types that are present in both the EOT2008 and EOT2012 archives.  You can see both the change in overall numbers as well as the percentage change between the two archives.

Media Type | EOT2008 | EOT2012 | Change | % Change
---|---|---|---|---
image/jpeg | 13,667,545 | 24,339,398 | 10,671,853 | 78.1%
text/html | 105,592,852 | 116,238,952 | 10,646,100 | 10.1%
image/png | 2,067,480 | 7,356,407 | 5,288,927 | 255.8%
image/gif | 13,033,116 | 8,408,906 | -4,624,210 | -35.5%
– | 4,494,674 | 613,187 | -3,881,487 | -86.4%
application/pdf | 10,281,663 | 7,097,717 | -3,183,946 | -31.0%
application/javascript | 39,019 | 1,511,594 | 1,472,575 | 3774.0%
text/css | 841,105 | 1,973,508 | 1,132,403 | 134.6%
text/xml | 344,748 | 1,433,159 | 1,088,411 | 315.7%
unk | 4,326 | 818,619 | 814,293 | 18823.2%
application/rss+xml | 64,280 | 731,253 | 666,973 | 1037.6%
application/x-javascript | 622,958 | 1,232,306 | 609,348 | 97.8%
application/vnd.ms-excel | 734,077 | 212,605 | -521,472 | -71.0%
text/javascript | 69,340 | 481,701 | 412,361 | 594.7%
video/x-ms-asf | 26,978 | 372,565 | 345,587 | 1281.0%
application/msword | 563,161 | 236,716 | -326,445 | -58.0%
application/x-shockwave-flash | 192,018 | 479,011 | 286,993 | 149.5%
application/octet-stream | 419,187 | 191,421 | -227,766 | -54.3%
application/zip | 312,872 | 92,318 | -220,554 | -70.5%
application/json | 1,268 | 217,742 | 216,474 | 17072.1%
video/x-flv | 1,448 | 180,222 | 178,774 | 12346.3%
image/jpg | 26,421 | 172,863 | 146,442 | 554.3%
application/postscript | 181,795 | 39,832 | -141,963 | -78.1%
image/x-icon | 45,294 | 164,673 | 119,379 | 263.6%
chemical/x-mopac-input | 110,324 | 1,035 | -109,289 | -99.1%
application/atom+xml | 165,821 | 269,219 | 103,398 | 62.4%
application/xml | 145,141 | 246,857 | 101,716 | 70.1%
application/x-cgi | 100,813 | 51 | -100,762 | -99.9%
audio/mpeg | 95,613 | 179,045 | 83,432 | 87.3%
video/mp4 | 1,887 | 73,475 | 71,588 | 3793.7%

Presented as a set of graphs, the first shows the change in the number of instances of a given Media Type between the two archives.

30 Media Types that changed the most

The second graph is the percentage change between the two archives.

% Change in top 30 media types shared between archives

Things that stand out are the growth of application/javascript between 2008 and 2012, up 3,774%, and of application/json, up over 17,000%. Two formats used to deliver video grew as well, with video/x-flv and video/mp4 increasing 12,346% and 3,794% respectively.

There were also a number of Media Types whose counts and percentages fell, though the declines are not as dramatic as the increases identified above. Of note is that between 2008 and 2012 there was a decline of nearly 100% in content with a Media Type of application/x-cgi and a 78% decrease in files that were application/postscript.

Working with the Media Types found in large web archives is a bit messy. While there are standard ways of presenting Media Types to browsers, non-standard, experimental and inaccurate instances also exist in these archives. Even so, we can see the introduction of some newer technologies between the two archives, such as the adoption of JSON and JavaScript-based sites as well as new formats of video on the web.

If you have questions or comments about this post,  please let me know via Twitter.

District Dispatch: Don’t miss NASA astronaut talk about exciting girls about science

planet code4lib - Tue, 2016-06-14 14:23

Astronaut Dr. Yvonne Cagle

What are the best ways that libraries can excite young learners about science and math? How can leaders facilitate informal learning through libraries and entertainment? Join the “Coding in Tomorrowland: Inspiring Girls in STEM” session at the 2016 American Library Association Annual Conference, featuring an astronaut from the National Aeronautics and Space Administration (NASA) and Disney television executives and producers as they discuss the creation of Disney Junior’s acclaimed animated series “Miles from Tomorrowland,” which weaves science, technology, engineering and mathematics (STEM) concepts geared towards kids ages 2-7 into its storylines. The conference session takes place on Sunday, June 26, 2016, 1:00-2:30 p.m. in the Orange County Convention Center, room W303.

As part of the session, Disney will provide one hundred books; roundtable participants will receive copies of two “Miles from Tomorrowland” books (“Journey to the Frozen Planet,” a chapter book, and “How I Saved My Summer Vacation”).

Session speakers include series consultant and NASA astronaut Dr. Yvonne Cagle; “Miles from Tomorrowland” Emmy-nominated writer and producer Sascha Paladino; and Disney Junior executive Diane Ikemiyashiro. The panelists will discuss the relationship between science and entertainment and detail ways that the show imparts scientific concepts and principles to young viewers, particularly girls. The session will be moderated by Christopher Harris, fellow of the ALA Office for Information Technology Policy’s Youth & Technology Program.

“Miles from Tomorrowland” charts the outer space missions of young adventurer Miles Callisto and his family as they work together to help connect the galaxy on behalf of the Tomorrowland Transit Authority. Space, science and technology experts from NASA and NASA’s Jet Propulsion Laboratory, the Space Tourism Society and Google serve as consultants on the series, which is designed to inspire young kids’ interest in the exhilarating world of space exploration, science and technology.

Paladino is an Emmy-nominated writer and producer whose writing credits include “Blue’s Clues,” “Doc McStuffins,” and “Sid the Science Kid.” Dr. Cagle’s extensive career boasts many accomplishments in the space, science and technology fields. Selected by NASA in 1996, Cagle reported to NASA’s Johnson Space Center, where she qualified for flight assignment as a mission specialist and was initially assigned to support the Space Shuttle Program and International Space Station. Ikemiyashiro is Director, Original Programming for Disney Junior, overseeing production and creative development on “Miles from Tomorrowland.” Prior to joining Disney in 2014, Ikemiyashiro worked in production and creative development for DreamWorks Animation and served as a staff writer in the White House Office of Correspondence from 1995-1999.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post Don’t miss NASA astronaut talk about exciting girls about science appeared first on District Dispatch.

Access Conference: Access 2016 Program Released

planet code4lib - Tue, 2016-06-14 14:18

Registration is now open for Access 2016, which will be held in the beautiful city of Fredericton, New Brunswick from October 4-7.

Access is Canada’s premier library technology conference that brings librarians, technicians, developers, programmers, and managers together to discuss cutting-edge library technologies.

This year’s program features some of the coolest people on the planet and includes:

  • two amazing keynote talks, one by Director of MIT Libraries, Chris Bourg, about “Libraries, Technology, and Social Justice,” and the other by Geoffrey Rockwell, “On the Ethics of Digitization.”
  • presentations on a range of topics including institutional repositories, Raspberry Pi, AtoM, 3D models, user-centred taxonomies, and a lot more
  • several fast and fun Ignite talks,
  • panel discussions on the current state of the merging of Libraries and Information Technology and on the Future of Access,
  • a full-day hackfest,
  • two half-day workshops,
  • and lots more

Check out the full program.

And then register!

Register by July 13 to take advantage of the early bird rate. Register before July 1st and miss the 2% jump in New Brunswick HST!

We can’t wait to see you at the conference!

District Dispatch: What makes a library entrepreneurship program great?

planet code4lib - Tue, 2016-06-14 06:07

The library community does more to promote entrepreneurship than many realize. Libraries provide assistance at every stage of the effort to launch and operate a new venture—from writing a business plan, to raising capital, to managing workflow. Learn about best practices for supporting entrepreneurs in libraries at the 2016 American Library Association (ALA) Annual Conference.

Photo by Reynermedia via Flickr

During the session “The People’s Incubator: Libraries Propel Entrepreneurship,” a panel of experts will elucidate the value of this assistance to the entrepreneurship ecosystem, and discuss ways in which libraries might make an even greater impact on the innovation economy moving forward. The session takes place on Monday, June 27, 2016, from 10:30-11:30 a.m. in the Orange County Convention Center, room W105B.

Speakers include Vanessa Neblett, assistant manager in Reference Central in the Orange County Library System (Fla.); Thomas J. O’Neal, associate vice president for research and commercialization at the University of Central Florida; Jerry Ross, president of the National Entrepreneur Center; and Charlie Wapner, senior information policy analyst for the American Library Association’s Office for Information Technology Policy (OITP).

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post What makes a library entrepreneurship program great? appeared first on District Dispatch.

Evergreen ILS: Meet Up with Other Evergreeners at ALA

planet code4lib - Mon, 2016-06-13 20:44

Join us at the American Library Association Conference in Orlando for the Evergreen Community Meetup. The meetup is an opportunity for Evergreen users, enthusiasts, and potential future users to learn about Evergreen, see what’s up and coming in the software, hear how open source software empowers libraries, and find out about the vibrant community supporting Evergreen.

The meetup is scheduled for 4:30 to 5:30 p.m. in Room W414B of the Orange County Convention Center. All ALA attendees interested in learning about Evergreen are invited to attend.

District Dispatch: Digital first sale paper available

planet code4lib - Mon, 2016-06-13 18:51

Dr. Yoonmo Sang

In October 2015, Yoonmo Sang joined OITP as a research associate upon completion of his doctorate at the University of Texas at Austin. While in residence at OITP, Dr. Sang completed a working paper, “Examining the interconnections between copyright law and the mission of the library: Focusing on digital first sale,” drawn in part from his PhD dissertation. Dr. Sang is a prolific author, featured in numerous journals that include the International Journal of Communication; Telematics and Informatics; American Behavioral Scientist; Speech & Communication; Computers in Human Behavior; Journal of Media Law, Ethics, and Policy; Journal of Medical Systems; and the Korean Journal of Broadcasting & Telecommunications Research.

We would like to thank Dr. Sang for his counsel and multiple contributions to OITP and congratulate him on his appointment as Assistant Professor at Howard University in the Department of Strategic, Legal, and Management Communication. We look forward to continued collaborations.

The post Digital first sale paper available appeared first on District Dispatch.

District Dispatch: Good times in Colorado Springs

planet code4lib - Mon, 2016-06-13 18:41

The Science and Engineering Building at the University of Colorado – Colorado Springs

Last week, I had the pleasure of being part of a wonderful conference—the fourth annual Kraemer Copyright Conference in Colorado Springs. The conference was impeccably organized and facilitated by Carla Myers (a former winner of the Robert A. Oakley Scholarship). It is always oversold; I was lucky to get in because I had a speaking gig. In my presentation about library advocacy, I talked about how, in today’s hyper-partisan policy ecosystem, “action on the ground” is often the best way to advance our information policy agenda.

I suggested that we congratulate ourselves (and others) for meeting the information needs of the public by doing. To drive my point home, I offered some cases in point:

The Supreme Court found in Eldred v. Ashcroft that Congress had the constitutional authority, in the Sonny Bono Copyright Term Extension Act, to extend the copyright term to life of the author plus 70 years. While a tremendous disappointment, there was a silver lining. The Center for the Study of the Public Domain launched the Creative Commons, spawning a new era of sharing creativity and knowledge by placing more works in the public domain or at least making them accessible without the authorization of the rights holder. Librarians didn’t create the Creative Commons, but man, did we promote and use it. Today, over 1 billion works are governed by CC licenses in more than 50 jurisdictions.

To address concerns that literacy educators had about the lawfulness of using media in the classroom that implicated the exclusive rights of copyright, Renee Hobbs, Peter Jaszi, and Pat Aufderheide created the first “best practices” document. The publication, funded by the John D. and Catherine T. MacArthur Foundation, “is the first step in an effort to develop standards for educators who continue to experience uncertainty, and often fear, when making decisions about what media is ‘safe’ to use in their classrooms.” Let the teaching continue, and go ahead and use that clip!

Another example: HathiTrust used digital files to preserve works and make them accessible to people with print disabilities.  Nearly all of these works were never available before to college students with print disabilities.  Nobody told HathiTrust they could do it, they just did it. Then they developed the Copyright Review Management System (CRMS) that identified and made available 323,334 public domain documents.  The CRMS is the recipient of this year’s L. Ray Patterson Award.

The conference was a blend of the practical and the unique, and included educational workshops along with research papers and poster sessions. Most of the papers from the conference are available online. Attendees were itching to volunteer for some new copyright thing; some asked to join the OITP Copyright Education Subcommittee! (We always need new committee members to exploit.)

Researchers say that people who live at higher altitudes live longer. Colorado Springs is one mile above sea level. Perhaps that is what infected all of us. We felt pleased, and more alive.

The post Good times in Colorado Springs appeared first on District Dispatch.

LITA: Transmission #6

planet code4lib - Mon, 2016-06-13 17:20

Jacob Shelby, intrepid metadata librarian (formerly at Iowa State, now at NCSU), enters the thunderdome, er, joins us for a lively conversation about the importance of coding/tech literacy for librarians. Read his LITA Blog posts, and join the conversation on Twitter @ALA_lita #litavlogs.

Begin Transmission will return June 27th.

David Rosenthal: Eric Kaltman on Game Preservation

planet code4lib - Mon, 2016-06-13 15:00
At How They Got Game, Eric Kaltman's Current Game Preservation is Not Enough is a detailed discussion of why game preservation has become extraordinarily difficult. Eric expands on points made briefly in my report on emulation. His TL;DR sums it up:
The current preservation practices we use for games and software need to be significantly reconsidered when taking into account the current conditions of modern computer games. Below I elaborate on the standard model of game preservation, and what I’m referring to as “network-contingent” experiences. These network-contingent games are now the predominant form of the medium and add significant complexity to the task of preserving the “playable” historical record. Unless there is a general awareness of this problem with the future of history, we might lose a lot more than anyone is expecting. Furthermore, we are already in the midst of this issue, and I think we need to stop pushing off a larger discussion of it.

Well worth reading.

DPLA: DPLA Joins Teaching with Primary Sources Unconference at SAA

planet code4lib - Mon, 2016-06-13 13:23

We are pleased to announce that DPLA will be participating in the second annual Teaching with Primary Sources Unconference and Workshops, which will be held on Wednesday, August 3 in Atlanta, Georgia, at the Auburn Avenue Research Library on African American Culture and History of the Atlanta-Fulton Public Library system ahead of the Society of American Archivists (SAA) Annual Meeting. The event will be free of charge and open to the public by registration, which begins on June 13.

An “unconference” is a collaborative, non-hierarchical program in which all participants actively inhabit the roles of teacher-learner-conference planner. The Teaching with Primary Sources Unconference organizers seek to create a forum of exchange and foster participation from the wider community of individuals who employ primary sources in teaching and learning activities. Educators, librarians, museum professionals, public historians, artists and designers, scientists, and archivists are encouraged to attend. Individuals employed in or volunteering with K-12, higher education, and community-based programs are all welcome. The unconference is a full day of activities, but participants may come and go as they please depending on their schedules, needs, and interests. While workshops will be organized in advance, unconference sessions will be spontaneous.

We are pleased to have the opportunity to lead one of the workshop discussions, “Building Primary Source Sets for Students and Teachers,” sharing some of the ideas, goals, questions, and challenges behind the Digital Public Library of America’s Primary Source Sets. We are excited to use this unconference opportunity to explore key questions raised during the research and development of this project with colleagues from the fields of education, libraries and archives, museums, and more!

In addition to our workshop, we look forward to participating in what promises to be a rich and engaging program of discussion and collaboration. At the inaugural Teaching with Primary Sources Unconference in 2015, participants chose from a selection of workshops on such topics as creating effective exhibitions, teaching with visual materials (artwork, photographs and other non-textual formats), and learning assessment, followed by an afternoon dedicated to sixteen sessions selected and facilitated by participants. A sample of the topics discussed includes racial and social justice theory, National History Day, materials handling, pedagogy, and using primary sources to teach subjects beyond the humanities.

Keep up with the latest Teaching with Primary Sources Unconference developments by checking the following URL, which will provide the most current source of information about the unconference: bitly.com/SAA16TPS. Registration opens on June 13.

Learn More and Register


ABOUT THE TEACHING WITH PRIMARY SOURCES COMMITTEE

The Teaching with Primary Sources Unconference Team is comprised of members of the Teaching with/about Primary Sources (TPS) Committee of the Society of American Archivists’ Reference, Access and Outreach Section. The purpose of the TPS Committee is to advocate for the active and interactive use of primary sources in teaching and learning as a core component of archival work. The TPS Committee seeks collaborative partnerships with all types of institutions (academic, cultural heritage, etc.) and all levels of learners: K-12, college and university, and lifelong learners. After a year of planning, the TPS Committee co-sponsored the first Teaching with Primary Sources Unconference in 2015 with the support of the Cleveland Public Library and plans to hold future unconferences in different regions of the United States. For more information, contact TeachWithStuff@gmail.com and follow the conversation online using #SAATPS16.

Please do not contact the Auburn Avenue Research Library on African American Culture and History regarding this press release.


District Dispatch: What kinds of coding classes are offered in libraries?

planet code4lib - Mon, 2016-06-13 07:58

Coding in libraries? Learn about the variety of programming in school and public libraries at the 2016 American Library Association’s (ALA) Annual Conference in Orlando, Fla. During the conference session “Libraries Ready to Code: Increasing CS Opportunities for Young People,” a panel of library experts will share experiences gained through a yearlong look at what’s behind the scenes in coding programs for youth—especially for underrepresented groups in the science, technology, engineering and mathematics (STEM) and computer science fields. Panelists will also discuss “computational thinking” and the unique library perspective on successful learning models based on coding concepts.

Children using library computers.

During the session, coding and library leaders will discuss “Libraries Ready to Code,” a joint partnership between the American Library Association and Google that will investigate the current status of computer programming activities in U.S. public and K–12 libraries. The session takes place on Sunday, June 26, 2016, 10:30-11:30 a.m. in the Orange County Convention Center, room W105B.

Speakers include Linda Braun, learning consultant for LEO: Librarians & Educators Online, Seattle, Wash.; Joanna Fabicon, children’s librarian for the Los Angeles Public Library in Los Angeles, Calif.; Crystle Martin, postdoctoral research fellow at the University of California, Irvine; Hai Hong, K-12 Education Outreach for Google Inc.; and Roger Rosen, senior advisor for the Office for Information Technology Policy of the American Library Association and CEO and president of Rosen Publishing.

Want to attend other policy sessions at the 2016 ALA Annual Conference? View all ALA Washington Office sessions

The post What kinds of coding classes are offered in libraries? appeared first on District Dispatch.
