
Feed aggregator

LibraryThing (Thingology): New “More Like This” for LibraryThing for Libraries

planet code4lib - Mon, 2015-02-09 18:04

We’ve just released “More Like This,” a major upgrade to LibraryThing for Libraries’ “Similar items” recommendations. The upgrade is free and automatic for all current subscribers to LibraryThing for Libraries Catalog Enhancement Package. It adds several new categories of recommendations, as well as new features.

We’ve got text about it below, but here’s a short (1:28) video:

What’s New

Similar items now has a See more link, which opens More Like This. Browse through different types of recommendations, including:

  • Similar items
  • More by author
  • Similar authors
  • By readers
  • Same series
  • By tags
  • By genre

You can also choose to show one or several of the new categories directly on the catalog page.

Click a book in the lightbox to learn more about it—a summary when available, and a link to go directly to that item in the catalog.

Rate the usefulness of each recommended item right in your catalog—hovering over a cover gives you buttons that let you mark whether it’s a good or bad recommendation.

Try it Out!

Click “See more” to open the More Like This browser in one of these libraries:

Find out more

Current customers can find more details on what’s changing and what customizations are available on our help pages.

For more information on LibraryThing for Libraries or if you’re interested in a free trial, email abby@librarything.com, visit http://www.librarything.com/forlibraries, or register for a webinar.

Library of Congress: The Signal: DPOE Interview: Three Trainers Launch Virtual Courses

planet code4lib - Mon, 2015-02-09 15:56

The following is a guest post by Barrie Howard, IT Project Manager at the Library of Congress.

This is the first post in a series about digital preservation training inspired by the Library’s Digital Preservation Outreach & Education (DPOE) Program. Today I’ll focus on some exceptional individuals who, among other things, have completed one of the DPOE Train-the-Trainer workshops and delivered digital preservation training. I am interviewing Stephanie Kom, North Dakota State Library; Carol Kussmann, University of Minnesota Libraries; and Sara Ring, Minitex (a library network providing continuing education and other services to MN, ND and SD), who recently led an introductory virtual course on digital preservation.

Barrie: Carol, you attended the inaugural DPOE Train-the-Trainer Workshop in Washington, and Stephanie and Sara, you attended the first regional event at the Indiana State Archives during the summer of 2012, correct? Can you tell the readers about your experiences and how you and others have benefited as a result?

Carol Kussmann

Carol: In addition to learning about the DPOE curriculum itself, the most valuable aspect of these Train-the-Trainer workshops was meeting new people and building relationships. In the inaugural workshop, we met people from across the country, many of whom I have looked to for advice or worked with on other projects. Because of the Indiana regional training, we now have a sizable group of trainers in the Midwest that I feel comfortable with in talking about DPOE and other electronic record issues. We work with each other and provide feedback and assistance when we go out and train others or work on digital preservation issues in our own roles.

Stephanie Kom

Stephanie: We were just starting a digital program at my institution so the DPOE training was beyond helpful in just informing me what needed to be done to preserve our future digital content. It gave me the tools to explain our needs to our IT department. I also echo Carol’s thoughts on the networking opportunities. It was a great way to meet people in the region that are working with the same issues.

Sara: As my colleagues mentioned, in addition to learning the DPOE curriculum, what was most valuable to me was meeting new colleagues and forming relationships to build upon after the workshop. Shortly after the training, about eight of us began meeting virtually on a regular basis to offer our first digital preservation course (using the DPOE curriculum). Our small upper Midwest collaborative included trainers from North Dakota, South Dakota, Minnesota and Wisconsin. We had trainers from libraries, state archives and a museum participating, and we found we all had different strengths to share with our audience. Our first virtual course, “Managing Digital Content Over Time: An Introduction to Digital Preservation,” reached about 35 organizations of all types, and our second virtual course reached about 20 organizations in the region.

Sara Ring

Barrie: Since becoming official DPOE trainers, you have developed a virtual course to provide an introduction to digital preservation. Can you provide a few details about the course, and have you developed any other training materials from the DPOE Curriculum?

Stephanie, Carol, Sara: The virtual course we offered was broken up into three sessions, scheduled every other week. Each session covered two of the DPOE modules. Using the DPOE workshop materials as a starting point, we added local examples from our own organizations and built in discussion questions and polls for the attendees so that we had plenty of interaction.

Evaluations from this first offering informed us that people wanted to know more about various tools used to manage and preserve digital content. In response, in our second offering of the course we built in more demonstrations of tools to help identify, manage and monitor digital content over time. Since we were discussing and demonstrating tools that dealt with metadata, we added more content about technical and preservation metadata standards. We also built in take-home exercises for attendees to complete between sessions. Attendees have responded well to these changes and find the take-home exercises that we have built in really useful.

We also created a Google Site for this course, with an up-to-date list of resources, best practices and class exercises. Carol created step-by-step guides that people can follow for understanding and using tools that can assist with managing and preserving their electronic records. These can be found on the University of Minnesota Libraries Digital Preservation Page.

Working through Minitex, we have developed three different classes related to digital preservation: An Introduction to Digital Preservation (webinar), the DPOE virtual course mentioned above, and a full-day in-person DPOE-based workshop. We have presented each of these at least twice.

Tools Quick Reference Guide, provided to attendees of “Managing Digital Content Over Time.”

Barrie: The DPOE curriculum, which is built upon the OAIS Reference Model, recently underwent a revision. Have you noticed any significant changes in the materials since you attended the workshop in 2011 or 2012? What improvements have you observed?

Carol: What I like about DPOE is that it provides a framework for people to talk about common issues related to digital preservation. The main concepts have not changed – which is good – but there has been a significant increase in the number of examples and resources. The “Digital Preservation Trends” slides were not available in the 2011 training. Keeping up to date on what people are doing, exploring new resources and tools, and following changing best practices is very important as digital preservation continues to be a moving target.

Sara, Stephanie: We found the “Digital Preservation Trends” slides, the final module covered in the DPOE workshop, to be a nice addition to the baseline curriculum. We don’t think they existed when we attended the DPOE train-the-trainer workshop back in 2012. We both especially like the “Engaging with the Digital Preservation Community” section, which lists some of the organizations, listservs and conferences that would be of interest to digital preservation practitioners. When you’re new to digital preservation (or the only one at your organization working with digital content), it can be overwhelming knowing where to start. Providing resources like this offers a way to get involved in the digital preservation community and to learn from each other. We always try to close our digital preservation classes by providing community resources like this.

Barrie: Regarding training opportunities, could you compare the strengths and challenges of traditional in-person learning environments to distance learning options?

Stephanie, Carol, Sara: Personally we all prefer in-person learning environments over virtual and believe that most people would agree. We saw this preference echoed in the DPOE 2014 Training Needs Assessment Survey (PDF).

The main strength of in-person is the interaction with the presenter and other participants; as a presenter you can adjust your presentation immediately based on audience reactions and their specific needs and understanding. As a participant you can meet and relate to other people in similar situations, and there are more opportunities at in-person workshops for having those types of discussions with colleagues during breaks or during lunch.

However, in-person learning is not always feasible given travel time and costs, and in this part of the country, weather often gets in the way (we have all had our share of driving through blizzard conditions in Minnesota and North Dakota). Convenience and timeliness are definite benefits of distance learning; more people from a single institution can often attend for little or no additional cost. As trainers we have worked really hard to build hands-on activities into our virtual digital preservation courses, but could probably do a lot more to encourage networking among the attendees.

Barrie: Are there plans to convene the “Managing Digital Content Over Time” series in 2015?

Stephanie, Carol, Sara: Yes, we plan on offering at least one virtual course this spring. We’ll be checking in with our upper Midwest collaborative of trainers to see who is interested in participating this time around. Minitex provides workshops on request, so we may do more virtual or in-person classes if there is demand.

One of the hands-on activities for the in-person “Managing Digital Content Over Time” course.


Barrie: How has the DPOE program influenced and/or affected the work that you do at your organization?

Carol: The inaugural DPOE Training (2011) took place while I was working on an NDIIPP project led by the Minnesota State Archives to preserve and provide access to government digital records, which gave me additional tools to work from during the project. After the project ended, I continued to use the information I learned during the project and the DPOE training to develop a workflow for processing and preserving digital records at the Minnesota State Archives.

Since then, I became a Digital Preservation Analyst at the University of Minnesota Libraries where I continue to focus on digital preservation workflows, education and training, and other related activities. Overall, the DPOE training helped to build a foundation from which to discuss digital preservation with others whether in a classroom setting, conference presentation or one-on-one conversations. I look forward to continuing to work with members of the DPOE community.

Sara: As a digitization and metadata training coordinator at Minitex, a large part of my job is developing and presenting workshops for library professionals in our region. Participating in the DPOE training (2012) was one of the first steps I took to build and expand our training program at Minitex to include digital preservation. The DPOE program has also given me the opportunity to build up our own small cohort of DPOE trainers in the region, so we can schedule regular workshops based on who is available to present at the time.

Stephanie: I started the digitization program at our institution in 2012. Digital preservation has become a main component of that program and I am still working to get a full-fledged plan moving. Our institution is responsible for preserving other digital content as well, and I would like our preservation plan to encompass all aspects of our work here at the library. I think one of the great things about the DPOE training is that the different pieces can be implemented before starting to produce digital content or retrofitted into an already-established digital program. It can be more work when you already have a lot of digital content, but the training materials make each step seem manageable.

Open Knowledge Foundation: Pakistan Data Portal

planet code4lib - Mon, 2015-02-09 11:29

December 2014 saw the Sustainable Development Policy Institute and Alif Ailaan launch the Pakistan Data Portal at the 30th Annual Sustainable Development Conference. The portal, built using CKAN by Open Knowledge, provides an access point for viewing and sharing data relating to all aspects of education in Pakistan.

A particular focus of this project was to design an open data portal that could be used to support advocacy efforts by Alif Ailaan, an organisation dedicated to improving education outcomes in Pakistan.

The Pakistan Data Portal (PDP) is the definitive collection of information on education in Pakistan and collates datasets from private and public research organisations on topics including infrastructure, finance, enrollment and performance, to name a few. The PDP is a single point of access against which change in Pakistani education can be tracked and analysed. Users, who include teachers, parents, politicians and policy makers, are able to browse historical data and compare and contrast it across regions and years to reveal a clear, customizable picture of the state of education in Pakistan. From this clear overview, the drivers and constraints of reform can be identified, allowing Alif Ailaan and others pushing for change in the country to focus their reform efforts.

Pakistan is facing an education emergency. It is a country where 25 million children are out of education and 50% of girls of school age do not attend classes. A census has not been completed since 1998 and there are problems with the data that is available: it is outdated, incomplete and error-ridden, and only a select few have access to much of it. An example that highlights this is a recent report from ASER, which estimates the number of children out of school at 16 million fewer than the number computed by Alif Ailaan in another report. NGOs and other advocacy groups have tended to be interested in data only when it can be used to confirm that the funds they are utilising are working. Whilst there is agreement on the overall problem, if people cannot agree on its scale, how can a consensus solution be hoped for?

Alif Ailaan believe that if you can’t measure the state of education in the country, you can’t hope to fix it. This forms the focus of their campaigning efforts. So whilst the quality of the data is a problem, some data is better than no data, and the PDP provides a focus for gathering quality information together and a platform from which to build change and promote policy change: policy makers can make accurate decisions that are backed up by the data.

The data accessible through the portal is supported by regular updates from the PDP team, who draw attention to timely key issues and analyse the data. A particular subject or dataset will be explored from time to time, and these general blog posts are supported by “The Week in Education,” which summarises the latest education news, data releases and publications.

CKAN was chosen as the portal best placed to meet the needs of the PDP. Open Knowledge were tasked with customising the portal and providing training and support to the team maintaining it. A custom dashboard system was developed for the platform in order to present data in an engaging visual format.

As explained by Asif Mermon, Associate Research Fellow at SDPI, the genius of the portal is the shell. As institutions start collecting data, or old data is uncovered, it can be added to the portal to continually improve the overall picture.

The PDP is in constant development to further promote the analysis of information in new ways and to improve the visualizations on offer. There are also plans to expand the scope of the portal so that areas beyond education can also reap its benefits. A further benefit is that the shell can then be exported around the world, so other countries will be able to benefit from the development.

The PDP initiative is part of the multi-year DFID-funded Transforming Education Pakistan (TEP) campaign aiming to increase political will to deliver education reform in Pakistan. Accadian, on behalf of HTSPE, appointed the Open Knowledge Foundation to build the data observatory platform and provide support in managing the upload of data including onsite visits to provide training in Pakistan.

 

Hydra Project: Announcing Hydra 9.0.0

planet code4lib - Mon, 2015-02-09 09:56

We’re pleased to announce the release of Hydra 9.0.0.  This Hydra gem brings together a set of compatible gems for working with Fedora 4. Amongst others it bundles Hydra-head 9.0.1 and Active-Fedora 9.0.0.  In addition to working with Fedora 4, Hydra 9 includes many improvements and bug fixes. Especially notable is the ability to add RDF properties on repository objects themselves (no need for datastreams) and large-file streaming support.

The new gem represents almost a year of effort – our thanks to all those who made it happen!

Release notes:
https://github.com/projecthydra/active_fedora/releases/tag/v9.0.0
https://github.com/projecthydra/hydra-head/releases/tag/v9.0.0

DuraSpace News: Fedora 4 Makes Islandora Even Better!

planet code4lib - Mon, 2015-02-09 00:00

There are key advantages for users and developers in combining Islandora 7 and Fedora 4.

Charlottetown, PEI, CA – Islandora is an open source software framework for managing and discovering digital assets utilizing a best-practices framework that includes Drupal, Fedora, and Solr. Islandora is implemented and built by an ever-growing international community.

CrossRef: Geoffrey Bilder will be at the 10th IDCC in London tomorrow

planet code4lib - Sun, 2015-02-08 21:58

Geoffrey Bilder @gbilder will be part of a panel entitled “Why is it taking so long?” The panel will explore why some types of change in curation practice take so long and why others happen quickly. The panel will be moderated by Carly Strasser @carlystrasser, Manager of Strategic Partnerships for DataCite. The panel will take place on Monday, February 9th at 16:30 at 30 Euston Square in London. Learn more. #idcc15

Patrick Hochstenbach: Homework assignment #7 Sketchbookskool

planet code4lib - Sun, 2015-02-08 16:43
Filed under: Doodles Tagged: Photoshop, sketchbookskool, staedtler, urbansketching

Patrick Hochstenbach: Homework assignment #6 Sketchbookskool

planet code4lib - Sun, 2015-02-08 16:41
Filed under: Doodles Tagged: brushpen, ostrich, sketchbookskool, toy

Patrick Hochstenbach: Homework assignment #5 Sketchbookskool

planet code4lib - Sun, 2015-02-08 16:40
Filed under: Doodles Tagged: brushpen, copic, fudensuke, moleskine, pencil, sketchbookskool

David Rosenthal: It takes longer than it takes

planet code4lib - Sun, 2015-02-08 03:24
I hope it is permissible to blow my own horn on my own blog. Two concepts recently received official blessing after a good long while, for one of which I'm responsible, and for the other of which I'm partly responsible. The mysteries are revealed below the fold.



The British Parliament is celebrating the 800th anniversary of Magna Carta:
On Thursday 5 February 2015, the four surviving original copies of Magna Carta were displayed in the Houses of Parliament – bringing together the documents that established the principle of the rule of law in the place where law is made in the UK today.

The closing speech of the ceremony in the House of Lords was given by Sir Tim Berners-Lee, who is reported to have said:

I invented the acronym LOCKSS more than a decade and a half ago. Thank you, Sir Tim!

On October 24, 2014 Linus Torvalds added overlayfs to release 3.18 of the Linux kernel. Various Linux distributions have implemented various versions of overlayfs for some time, but now it is an official part of Linux. Overlayfs is a simplified implementation of union mounts, which allow a set of file systems to be superimposed on a single mount point. This is useful in many ways, for example to make a read-only file system such as a CD-ROM appear to be writable by mounting a read-write file system "on top" of it.

Other Unix-like systems have had union mounts for a long time. BSD systems first implemented them in 4.4BSD-Lite two decades ago. The concept traces back five years earlier to my paper for the Summer 1990 USENIX Conference, Evolving the Vnode Interface, which describes a prototype implementation of "stackable vnodes". Among other things, it could implement union mounts, as shown in the paper's Figure 10.
This use of stackable vnodes was in part inspired by work at Sun two years earlier on the Translucent File Service, a user-level NFS service by David Hendricks that implemented a restricted version of union mounts. All I did was prototype the concept, and like many of my prototypes it served mainly to discover that the problem was harder than I initially thought. It took others another five years to deploy it in SunOS and BSD. Because they weren't hamstrung by legacy code and semantics, by far the most elegant and sophisticated implementation was produced around the same time by Rob Pike and the Plan 9 team. Instead of being a bolt-on addition, union mounting was fundamental to the way Plan 9 worked.

About five years later Erez Zadok at Stony Brook led the FiST project, a major development of stackable file systems including two successive major releases of unionfs, a unioning file system for Linux.

About the same time I tried to use OpenBSD's implementation of union mounts early in the boot sequence to construct the root directory by mounting a RAM file system over a read-only root file system on a CD, but gave up on encountering deadlocks.

In 2009 Valerie Aurora published a truly excellent series of articles going into great detail about the difficult architectural and implementation issues that arise when implementing union mounts in Unix kernels. It includes the following statement, with which I concur:
The consensus at the 2009 Linux file systems workshop was that stackable file systems are conceptually elegant, but difficult or impossible to implement in a maintainable manner with the current VFS structure. My own experience writing a stacked file system (an in-kernel chunkfs prototype) leads me to agree with these criticisms.

Note that my original paper was only incidentally about union mounts; it was a critique of the then-current VFS structure, and a suggestion that stackable vnodes might be a better way to go. It was such a seductive suggestion that it took nearly two decades to refute it! My apologies for pointing down a blind alley.

The overlayfs implementation in 3.18 is minimal:
Overlayfs allows one, usually read-write, directory tree to be overlaid onto another, read-only directory tree. All modifications go to the upper, writable layer.

But given the architectural issues, doing one thing really well has a lot to recommend itself over doing many things fairly well. This is, after all, the use case from my paper.

It took a quarter of a century, but the idea has finally been accepted. And, even though I had to build a custom 3.18 kernel to do so, I am using it on a Raspberry Pi serving as part of the CLOCKSS Archive.

Thank you, Linus! And everyone else who worked on the idea during all that time!


Mark E. Phillips: How we assign unique identifiers

planet code4lib - Sun, 2015-02-08 03:01

The UNT Libraries have made use of the ARK identifier specification for a number of years and have used these identifiers throughout our infrastructure on a number of levels. This post gives a little background about where, when, why and a little about how we assign our ARK identifiers.

Terminology

The first thing we need to do is get some terminology out of the way so that we can talk about the parts consistently. This is taken from the ARK documentation:

http://example.org/ark:/12025/654xz321/s3/f8.05v.tiff

  • http://example.org/ – the Name Mapping Authority (NMA); this part is replaceable
  • ark:/ – the ARK Label
  • 12025 – the Name Assigning Authority Number (NAAN)
  • 654xz321 – the Name (NAA-assigned)
  • /s3/f8.05v.tiff – the Qualifier (NMA-supported)

The ARK syntax can be summarized as:

[http://NMA/]ark:/NAAN/Name[Qualifier]

For the UNT Libraries, we were assigned a Name Assigning Authority Number (NAAN) of 67531, so all of our identifiers will start like this: ark:/67531/

We mint Names for our ARKs locally with a home-grown system called a “Number Server.” This Python Web service receives a request for a new number, assigns that number a prefix based on which instance we pull from, and returns the new Name.
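To make that concrete, here is a minimal sketch of a counter-based minter in the same spirit. It is an illustration only, not UNT’s actual Number Server; the function names, the in-memory counters and the lack of persistence are my own simplifications.

import itertools

NAAN = "67531"  # UNT's Name Assigning Authority Number, per this post
# One independent counter per namespace; a real service would persist these
# and guard them against concurrent requests.
_counters = {"metapth": itertools.count(1), "metadc": itertools.count(1)}

def mint(namespace):
    """Return the next Name in the given namespace, e.g. 'metapth124'."""
    return f"{namespace}{next(_counters[namespace])}"

def ark(name):
    """Format a full ARK for a minted Name, e.g. 'ark:/67531/metapth124'."""
    return f"ark:/{NAAN}/{name}"

print(ark(mint("metapth")))  # ark:/67531/metapth1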

Namespaces

We have four different namespaces that we use for minting identifiers: metapth, metadc, metarkv, and coda. Additionally, we have a metatest namespace which we use when we need to test things out, but it isn’t used that often. Finally, we have a historic namespace, metacrs, that is no longer used. Here is the breakdown of how we use these namespaces.

We try to assign Names from the metapth namespace to all items that end up on The Portal to Texas History whenever possible. We assign all other public-facing digital objects Names from the metadc namespace. This means that the UNT Digital Library and The Gateway to Oklahoma History both share Names from the metadc namespace. The metarkv namespace is used for “archive only” objects that go directly into our archival repository system; these include large Web archiving datasets. The coda namespace is used within our archival repository called Coda. As was stated earlier, the metatest namespace is only used for testing, and these items are thrown away after processing.

Name assignment

We assign Names in our systems in programmatic ways; this is always done as part of our digital item ingest process. We tend to process items in batches: most often we process several hundred items at any given time, and sometimes we process several thousand. When we process items they are processed in parallel, and therefore there is no logical order to how the Names are assigned to objects. They are in the order that they were processed, but may have no logical order past that.

We also don’t assume that our Names are continuous. If you have identifiers metapth123 and metapth125, we don’t assume that there is an item metapth124; it may be there, but it also may never have been assigned. When we first started with these systems we would get worked up if we assigned several hundred or a few thousand identifiers and then had to delete those items; now this isn’t an issue at all, but it took some time to get over.

Another assumption that can’t be made in our systems is that identifiers follow item order: if Newspaper Vol. 1 Issue 2 has an identifier of metapth333, there is no guarantee that Newspaper Vol. 1 Issue 3 will have metapth334; it might, but it isn’t guaranteed. Items can also be shared between systems, and membership in the Portal, UNT Digital Library or Gateway is notated in the descriptive metadata. Therefore you can’t say that all metapth* identifiers are in the Portal or that all metadc* identifiers are not; you have to look them up based on the metadata.

Once a number is assigned, it is never assigned again. This sounds like a silly thing to say, but it is important to remember: we don’t try to save identifiers or reuse them as if we will run out of them.

Level of assignment

We currently assign an ARK identifier at the level of the intellectual object. So, for example, a newspaper issue gets an ARK, a photograph gets an ARK, and a book, a map, a report, an audio recording or a video recording each gets an ARK. The sub-parts of an item are not given further unique identifiers because we tend to interface with them through formatted URLs such as those described here, or through other URL-based patterns such as the URLs we use to retrieve items from Coda.

http://coda.library.unt.edu/bag/ark:/67531/codanaf8/manifest-md5.txt
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/coda_directives.py
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/bagit.txt
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/bag-info.txt
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/0=untl_aip_1.0
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/data/01_data/queries.xlsx
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/data/01_data/README.txt
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/metadata.xml
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/metadata/ba3ce7a1-0e3b-44cb-8b41-5d9d1b0438fe.jhove.xml
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/metadata/7fe68777-54a2-4c71-95b2-aa33204ae84b.jhove.xml
http://coda.library.unt.edu/bag/ark:/67531/codanaf8/data/metadc498968.aip.mets.xml

Lessons Learned

Things I would do again.
  • I would most likely use just an incrementing counter for assigning identifiers.  Name minters such as Noid are also an option but I like the numbers with a short prefix.
  • I would not use a prefix such as UNT, to stay away from branding as much as possible. Even metapth is way too branded (see below).
Things I would change in our implementation.
  • I would only have one namespace for non-archival items.  Two namespaces for production data just invite someone to screw up (usually me) and then suddenly the reason for having one namespace over the other is meaningless.  Just manage one namespace and move on.
  • I would not have a six- or seven-character prefix. metapth and metadc came as baggage from our first system; we decided that the 30k identifiers we had already minted had set our path. Now, after 1,077,975 identifiers in those namespaces, it seems a little silly that the first 3% of our items would have such an effect on us still today.
  • I would not brand our namespaces so closely to our systems’ names. With names such as metapth, metadc, and the legacy metacrs, people read too much into the naming convention. This is a big reason for opaque Names in the first place, and it is pretty important.
Things I might change in a future implementation.
  • I would probably pad my identifiers out to eight digits. While you can’t rely on the ARKs to be generated in a given order, once they are assigned it is helpful to be able to sort by them and have a consistent order: metapth1, metapth100, metapth100000 don’t always sort nicely the way metapth00000001, metapth00000100, metapth00100000 do (the short sketch after this list shows the difference). But then again, longer runs of zeros are harder to transcribe, and I had a tough time just writing this example. Maybe I wouldn’t do this.
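A hypothetical snippet illustrating the sorting point, using nothing but Python’s built-in sort:

# Unpadded Names sort lexicographically, not numerically.
unpadded = ["metapth1", "metapth100", "metapth100000", "metapth20"]
print(sorted(unpadded))
# ['metapth1', 'metapth100', 'metapth100000', 'metapth20']

# Zero-padding to a fixed width makes lexicographic and numeric order agree.
padded = [f"metapth{n:08d}" for n in (1, 100, 100000, 20)]
print(sorted(padded))
# ['metapth00000001', 'metapth00000020', 'metapth00000100', 'metapth00100000']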

I don’t think any of this post applies only to ARK identifiers as most identifier schemes at some level have to have a decision made about how you are going to mint unique names for things.   So hopefully this is useful to others.

If you have any specific questions for me let me know on twitter.

FOSS4Lib Recent Releases: Hydra - 9.0

planet code4lib - Sat, 2015-02-07 17:46
Package: Hydra
Release Date: Thursday, February 5, 2015

Last updated February 7, 2015. Created by Peter Murray on February 7, 2015.

From the release announcement

I'm pleased to announce the release of Hydra 9.0.0! This is the first release of the Hydra gem for Fedora 4 and represents almost a year of effort. In addition to working with Fedora 4, Hydra 9 includes many improvements and bug fixes. Especially notable is the ability to add RDF properties on repository objects themselves (no need for datastreams) and large-file streaming support.

CrossRef: Join us for the first CrossRef Taxonomies Webinar - March 3rd at 11:00 am ET

planet code4lib - Fri, 2015-02-06 21:15

Semantic enrichment is an active area of development for many publishers. Our enrichment processes are based on the use of different Knowledge Models (e.g., an ontology or thesaurus) which provide the terms required to describe different subject disciplines.

The CrossRef Taxonomy Interest Group is a collaboration among publishers, sponsored by CrossRef, to share the Knowledge Models they are using, creating opportunities for standardization, collaboration and interoperability. Please join the webinar for an introduction to the work this group is doing and the use cases for the information collected, and to learn how your organization can contribute to the project.

Christian Kohl - Director Information and Publishing Technology, De Gruyter
Graham McCann - Head of Content and Platform Management, IOP Publishing

The webinar will take place on Tuesday, March 3rd at 11 am ET.

Register today!

SearchHub: Enabling SSL on Fusion Admin UI

planet code4lib - Fri, 2015-02-06 20:40
Lucidworks Fusion can encrypt communications to and from clients with SSL. This section describes enabling SSL on the Fusion Admin UI with the Jetty server using a self-signed certificate.

Basic SSL Setup

Generate a self-signed certificate and a key

To generate a self-signed certificate and a single key that will be used to authenticate both the server and the client, we’ll use the JDK keytool command and create a separate keystore. This keystore will also be used as a truststore below. It’s possible to use the keystore that comes with the JDK for these purposes, and to use a separate truststore, but those options aren’t covered here.

Run the commands below in the $FUSION_HOME/jetty/ui/etc directory in the binary Fusion distribution. The “-ext SAN=…” keytool option allows you to specify all the DNS names and/or IP addresses that will be allowed during hostname verification.

keytool -genkeypair -alias fusion -keyalg RSA -keysize 2048 -keypass secret -storepass secret -validity 9999 -keystore fusion.keystore.jks -ext SAN=DNS:localhost,IP:127.0.0.1 -dname "CN=localhost, OU=Organizational Unit, O=Organization, L=Location, ST=State, C=Country"

The above command will create a keystore file named fusion.keystore.jks in the current directory.

Convert the certificate and key to PEM format for use with cURL

cURL isn’t capable of using JKS-formatted keystores, so the JKS keystore needs to be converted to PEM format, which cURL understands. First convert the JKS keystore into PKCS12 format using keytool:

keytool -importkeystore -srckeystore fusion.keystore.jks -destkeystore fusion.keystore.p12 -srcstoretype jks -deststoretype pkcs12

The keytool application will prompt you to create a destination keystore password and for the source keystore password, which was set when creating the keystore (“secret” in the example shown above).

Next convert the PKCS12-format keystore, including both the certificate and the key, into PEM format using the openssl command:

openssl pkcs12 -in fusion.keystore.p12 -out fusion.pem

Configure Fusion

First, copy jetty-https.xml and jetty-ssl.xml from $FUSION_HOME/jetty/home/etc to $FUSION_HOME/jetty/ui/etc.

Next, edit jetty-ssl.xml and change the keyStore values to point to the JKS keystore created above – the result should look like this:

Then edit the ui file (not ui.sh) under $FUSION_HOME/bin and add the following 3 lines:
  1. “https.port=$HTTP_PORT” \
  2. “$JETTY_BASE/etc/jetty-ssl.xml” \
  3. “$JETTY_BASE/etc/jetty-https.xml”
Run Fusion using SSL

To start all services, run $FUSION_HOME/bin/fusion start. This will start Solr, the Fusion API, the Admin UI, and Connectors, each of which runs in its own Jetty instance on its own port:

bin/fusion start

After that, trust the Fusion website in the browser (this is because we are using a self-signed certificate on a local machine). Finally, the Fusion Admin UI is served over SSL.
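As a quick sanity check that the Admin UI is really answering over SSL with the new certificate, something like the following can be used. This is a sketch only, not part of the original setup steps, and the host and port are assumptions; substitute your own Fusion Admin UI address.

# Fetch the certificate the server presents so it can be compared by eye
# with the certificate block inside the fusion.pem file generated above.
import ssl

pem = ssl.get_server_certificate(("localhost", 8764))  # assumed Admin UI host/port
print(pem)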

The post Enabling SSL on Fusion Admin UI appeared first on Lucidworks.

CrossRef: New CrossRef Members

planet code4lib - Fri, 2015-02-06 20:10

Updated February 3, 2015

Voting Members
Academy of Medical and Health Research
Agrivita, Journal of Agricultural Science (AJAS)
Eurasian Scientific and Industrial Chamber, Ltd.
Hitte Journal of Science and Education
Institute of Mathematical Problems of Biology of RAS (IMPB RAS)
MIM Research Group
Tomsk State University
Universitas Pendidikan Indonesia (UPI)

Represented Members
Amasya Universitesi Egitim Fakultesi Dergisi
Hikmet Yurdu Dusunce-Yorum Sosyal Bilimler Arastirma Dergisi
Necatibey Faculty of Education Electronics Journal of Science and Mathematics Education
Optimum Journal of Economics and Management Sciences

Last update January 26, 2015

Voting Members
Academy Publication
Escola Bahiana de Medicine e Saude Publica
Escola Superior de Educacao de Paula Frassinetti
Lundh Research Foundation
RFC Editor

Represented Members
ABRACICON: Academia Brasileira de Ciencias Contabeis
Biodiversity Science
Canakkale Arastirmalari Turk Yilligi
Chinese Journal of Plant Ecology
Dergi Karadeniz
Eskisehir Osmangazi University Journal of Social Sciences
Geological Society of India
Instituto do Zootecnia
Journal of Social Studies Education Research
Journal Press India
Kahramanmaras Sutcu Imam Universitesi Tip Fakultesi Dergisi
Nitte Management Review
Sanat Tasarim Dergisi
Sociedade Brasileira de Virologia
The Apicultural Society of Korea
The East Asian Society of Dietary Life
The Korea Society of Aesthetics and Science of Art
Turkish History Education Journal

CrossRef: CrossRef Indicators

planet code4lib - Fri, 2015-02-06 17:25

Updated February 3, 2015

Total no. participating publishers & societies 5772
Total no. voting members 3058
% of non-profit publishers 57%
Total no. participating libraries 1926
No. journals covered 37,687
No. DOIs registered to date 72,062,095
No. DOIs deposited in previous month 471,657
No. DOIs retrieved (matched references) in previous month 41,726,414
DOI resolutions (end-user clicks) in previous month 134,057,984

OCLC Dev Network: Hitting the Trail for Code4Lib

planet code4lib - Fri, 2015-02-06 17:00

We're all pretty excited about catching up with everyone at Code4Lib in Portland, Oregon next week. Karen Coombs, George Campbell and I will be going, along with Bruce Washburn and a couple of our other OCLC colleagues. Stop us and fill us in on what's new with you - we're anxious to hear about the projects you've been working on and what you'll be doing next. Or ask us about Developer House, our API Explorer, or whatever you'd like to know about OCLC Web services. 

District Dispatch: Sens. Reed and Cochran introduce school library bill

planet code4lib - Fri, 2015-02-06 16:37

Last week, U.S. Senator Jack Reed (D-RI) joined Senate Appropriations Committee Chairman Thad Cochran (R-MS) in introducing the SKILLS Act (S.312). Key improvements to the program include expanding professional development to include digital literacy, reading and writing instruction across all grade levels; focusing on coordination and shared planning time between teachers and librarians; and ensuring that books and materials are appropriate for students with special learning needs, including English learners.

The legislation would expand federal investment in school libraries so they can continue to offer students the tools they need to develop the critical thinking, digital, and research skills necessary for success in the twenty-first century.

“Effective school library programs are essential for educational success. Multiple education and library studies have produced clear evidence that school libraries staffed by qualified librarians have a positive impact on student academic achievement. Knowing how to find and use information are essential skills for college, careers, and life in general,” said Senator Reed, a member of the Senate Appropriations Committee, in a statement.

“Absent a clear federal investment, the libraries in some school districts will languish with outdated materials and technology, or cease to exist at all, cutting students off from a vital information hub that connects them to the tools they need to develop the critical thinking and research skills necessary for success,” Senator Reed continued. “This is a true equity issue, which is why I will continue to fight to sustain our federal investment in this area and why renewing and strengthening the school library program is so critical.”

“School libraries should be an integral part of our educational system,” said Chairman Cochran. “This bipartisan legislation is intended to ensure that school libraries are better equipped to offer students the reading, research and digital skills resources they need to succeed.”

The bipartisan SKILLS Act would further amend the Elementary and Secondary Education Act by requiring that states and school districts plan to address the development of effective school library programs to help students gain digital literacy skills, master the knowledge and skills in the challenging academic content standards adopted by the state, and graduate from high school ready for college and careers. Additionally, the legislation would broaden the focus of training, professional development and recruitment activities to include school librarians.

The American Library Association (ALA) last week sent comments (pdf) to U.S. Senate Committee on Health, Education, Labor, and Pensions (HELP) Chairman Sen. Lamar Alexander and committee member Sen. Patty Murray on the discussion draft to reauthorize the Elementary and Secondary Education Act.

The post Sens. Reed and Cochran introduce school library bill appeared first on District Dispatch.

Library of Congress: The Signal: Conservation Documentation Metadata at MoMA – An NDSR Project Update

planet code4lib - Fri, 2015-02-06 13:50

The following is a guest post by Peggy Griesinger, National Digital Stewardship Resident at the Museum of Modern Art.

The author in MoMA’s Media Conservation Lab with quality assessment tools for analog video. (Photo by Peggy)

As the National Digital Stewardship Resident at the Museum of Modern Art I have had the opportunity to work with MoMA’s newly launched digital repository for time-based media. Specifically, I have been tasked with updating and standardizing the Media Conservation department’s documentation practices. Their documentation needs are somewhat unique in the museum world, as they work with time-based media artworks that are transferred from physical formats such as VHS and U-matic tape to a variety of digital formats, each encoded in different ways. Recording these processes of digitization and migration is a huge concern for media conservators in order to ensure that the digital objects they store are authentic representations of the original works they processed.

It is my job to find a way of recording this information that adheres to standards and can be leveraged for indexing, searching and browsing. The main goal of this project is to integrate the metadata into the faceted browsing system that already exists in the repository. This would mean that, for example, a user could narrow down a results set to all artworks digitized using a particular make and model of a playback device. This would be hugely helpful in the event that an error were discovered with that playback device, making all objects digitized using it potentially invalid. We need the “process history metadata” (which records the technical details of tools used in the digitization or migration of digital objects) to be easily accessible and dynamic so that the conservators can make use of it in innovative and viable ways.

The first phase of this project involved doing in-depth research into existing standards that might be able to solve our documentation needs. Specifically, I needed to find a standardized way to describe – in technical detail – the process of digitizing and migrating various iterations of a time-based media work, or what we call the process history of an object. This work was complicated by the fact that I had little technical knowledge of time-based media. This meant that I not only had to research and understand a variety of metadata standards but I also had to simultaneously learn the technical language being used to express them.

Fortunately, my education in audiovisual technology developed naturally through my extensive interviews and collaborations with the media conservators at MoMA. In order to decide upon a metadata standard to use, I needed to learn very specifically the type of information the conservators wanted to express with this metadata, and how that information would be most effectively structured. This involved choosing artworks from the collection and going over, in great detail, how these objects were assessed, processed, and, if necessary, digitized. After selecting a few standards (namely PBCore, PREMIS, and reVTMD) I thought were worth pursuing in detail, I mapped this information into XML to see if the standards could, in fact, adequately express the information.

Before making a final decision on which standard or combination of standards to use, I organized a metadata experts meeting to get feedback on my project. The discussion at this meeting was immensely helpful in allowing me to understand my project in the wider scope of the metadata world. I also found it extremely helpful to get feedback from experts in the field who did not have much exposure to the project itself, so that they could catch any potential problems or errors that I might not be able to see from having worked so closely with the material for so long.

One important point that was brought up at the meeting was the need to develop detailed use cases for the process history metadata in the repository. I talked with the media conservators at MoMA to see what intended uses they had for this information. To get an idea of the specific types of uses they foresee for this metadata, we can look at the use case for accessing process history metadata. This seems simple on the surface, but we had a number of questions to answer: How do users navigate to this information? Is it accessed at the artwork level (including all copies and versions of an artwork) or at the file level? How is it displayed? Is every element displayed, or only select elements? Where is this information situated in our current system? The discussions I had with the media conservators and our digital repository manager allowed us to answer these questions and create clear and concise use cases.

Developing use cases was simplified by two things:

1) we already had a custom-designed digital repository into which this metadata would be ingested and

2) we had a very clear idea of the structure and content of this metadata.

This meant we were very aware of what we had to work with, and what our potential limitations were. It was therefore very simple for us to know which use cases would be simple fixes and which would require developing entirely new functionalities and user interfaces in the repository. Because we had a good idea of how simple or complex each use case would be, we could prescribe levels of desirability to each use case to ensure the most important and achievable use cases were implemented first.

The next stop for this project will be to bring these use cases, as well as wireframes we have developed to reflect them, to the company responsible for developing our digital repository system. Through conversation with them we will begin the process of integrating process history metadata into the existing repository system.

As I pass the halfway point of my residency, I can look back on the work I have done with pride and look forward to the work still to come with excitement. I cannot wait to see this metadata fully implemented into MoMA’s time-based media digital repository as a dynamic resource for conservators to use and explore. Hopefully the tools we are in the process of creating will be useful to other institutions looking to make their documentation more accessible and interactive.

LITA: Learning to Master XSLT

planet code4lib - Fri, 2015-02-06 12:00

This semester, I have the exciting opportunity to work as an intern among the hum of computers and maze of cubicles at Indiana University’s Digital Library Program! My main projects include migrating two existing digital collections from TEI P4 to TEI P5 using XSLT. If you are familiar with XML and TEI, feel free to skim a bit! Otherwise, I’ve included short explanations of each and links to follow for more information.

XML

Texts for digital archives and libraries are frequently marked up in a language called eXtensible Markup Language (XML), which looks and acts similarly to HTML. Marking up the texts allows them to be human- and machine-readable, displayed, and searched in different ways than if they were simply plain text.

TEI

The Text Encoding Initiative (TEI) Consortium “develops and maintains a standard for the representation of texts in digital form” (i.e. guidelines). Basically, if you wanted to encode a poem in XML, you would follow the TEI guidelines to mark up each line, stanza, etc. in order to make it machine-readable and cohesive with the collection and the standard. In 2007, the TEI Consortium unveiled an updated form of TEI called TEI P5, to replace the older P4 version.

However, many digital collections still operate under the TEI P4 guidelines and must be migrated over to P5 moving forward. Here is where XSLT and I come in.

XSLT

eXtensible Stylesheet Language (XSL) Transformations are used to convert an XML document to another text document, such as (new) XML, HTML or text. In my case, I’m migrating from one type of XML document to another type of XML document, and the tool in between, making it happen, is XSLT.

Many utilize custom XSLT to transform an XML representation of a text into HTML to be displayed on a webpage. The process is similar to using CSS to transform basic HTML into a stylized webpage. When working with digital collections, or even moving from XML to PDF, XSLT is an invaluable tool to have handy. Learning it can be a bit of an undertaking, though, especially adding to an already full work week.
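For a feel of what a transformation looks like in practice, here is a small, self-contained sketch (my own illustration, not code from the IU project) that applies an identity transform plus one renaming rule to a toy XML snippet using Python's lxml library; the element names are invented.

from lxml import etree

# A minimal XSLT stylesheet: copy everything unchanged (the "identity transform"),
# except rename <oldTag> elements to <newTag> -- the same pattern a P4-to-P5
# migration stylesheet uses, just on a much smaller scale.
XSLT = b"""
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="@*|node()">
    <xsl:copy><xsl:apply-templates select="@*|node()"/></xsl:copy>
  </xsl:template>
  <xsl:template match="oldTag">
    <newTag><xsl:apply-templates select="@*|node()"/></newTag>
  </xsl:template>
</xsl:stylesheet>
"""

XML = b"<doc><oldTag>Some encoded text</oldTag></doc>"

transform = etree.XSLT(etree.fromstring(XSLT))
result = transform(etree.fromstring(XML))
print(str(result))  # the document comes back with <oldTag> rewritten as <newTag>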

I have free time, sign me up!

Here are some helpful tips I have been given (and discovered) in the month I’ve been learning XSLT to get you started:

  1. Register for a tutorial.

Lynda.com, YouTube, and Oracle provide tutorials to get your feet wet and see what XSLT actually looks like. Before registering for anything with a price, first see if your institution offers free tutorials. Indiana University offers an IT Training Workshop on XSLT each semester.

  2. Keep W3Schools bookmarked.

Their XSLT page acts as a self-guided tutorial, providing examples, function lists, and function implementations. I access it nearly every day because it is clear and concise, especially for beginners.

  3. Google is your best friend.

If you don’t know how to do something, Google it! Odds are someone before you didn’t have your exact problem, but they did have one like it. Looking over another’s code on StackOverflow can give you hints to new functions and expose you to more use possibilities. (This goes for learning every coding and markup language!)

  4. Create or obtain a set of XML documents and practice!

A helpful aspect of using Oxygen Editor (the most common software used to encode in XML) for your transformations is that you can see the results instantly, or at least see your errors. If you have one or more XML documents, figure out how to transform them to HTML and view them in your browser. If you need to go from XML to XML, create a document with recipes and simply change the tags. The more you work with XSLT, the simpler it becomes, and you will feel confident moving on to larger projects.

  5. Find a guru at your institution.

Nick Homenda, Digital Projects Librarian, is mine at IU. For my internship, he has built a series of increasingly difficult exercises, where I can dabble in and get accustomed to XSLT before creating the migration documents. When I feel like I’m spinning in circles, he usually explains a simpler way to get the desired result. Google is an unmatched resource for lines of code, but sometimes talking it out can make learning less intimidating.

Note: If textbooks are more your style, Mastering XSLT by Chuck White lays a solid foundation for the language. This is a great resource for users who already know how to program, especially in Java and the C varieties. White makes many comparisons between them, which can help strengthen understanding.

 

If you have found another helpful resource for learning and applying XSLT, especially an online practice site, please share it! Tell us about projects you have done utilizing XSLT at your institution!
