Congratulations to Ed Pentz who celebrated 15 years in January as Executive Director of CrossRef.
In addition to Ed, Applications Developer Jon Stark celebrated 11 years, and Susan Collins, our Member Services Coordinator, her 7th anniversary.
Paula Dwyer, our Controller, and Vaishali Patel, our Technical Support Analyst, both celebrate 4 years at CrossRef.
Chris Cocci, our Staff Accountant, Amy Kelley, our Operations Administrator, and Penny Martin, our Part-Time UK Office Manager, have all been with us for 1 year.
Congratulations to all!
A new Library Explorers is out: Wearables in the Library.
It’s spring! Sit at the picnic table and read some rounded up links.
Icon-based labels are visually appealing, but often don’t clearly express their meaning: the power of text.
The diaspora of the 1×1 gif.
Botanical manufacturing: molds growing trees into furniture.
3D print a tiny planter.
Recipes developed by a supercomputer and its algorithm, judged for pleasantness, surprise, and synergy.
This is a guest blog post by Matt Smith, who is a learning technologist at UCL. He is interested in how technology can be used to empower communities.
Introduction
Fantasy Frontbench is a not-for-profit and openly licensed project aimed at providing the public with an engaging and accessible platform for directly comparing politicians.
A twist on the popular fantasy football concept, the site uses open voting history data from Public Whip and They Work For You. This allows users to create their own fantasy ‘cabinet’ by selecting and sorting politicians on how they have voted in Parliament on key policy issues such as EU integration, Updating Trident, Same-sex marriage and NHS reform.
Once created, users can see how their fantasy frontbench statistically breaks down by gender, educational background, age, experience and voting history. They can then share and debate their selection on social media.
The site is openly licensed and we hope to make datasets of user selections available via figshare for academic inquiry.
Aim of the project
Our aim is to present political data in a way that is engaging and accessible to those who may traditionally feel intimidated by political media. We wish to empower voters through information and provide them with the opportunity to compare politicians on the issues that most matter to them. We hope the tool will encourage political discourse and increase voter engagement.
The site features explanations of the electoral system and will hopefully help learners to easily understand how the cabinet is formed, the roles and responsibilities of cabinet ministers and the primary processes of government. Moreover, we hope as learners use the site, it will raise questions surrounding the way in which MPs vote in Parliament and the way in which bills are debated and amended. Finally, we host a gallery page which features a number of frontbenches curated by our team. This allows learners to see how different groups and demographics of politicians would work together. Such frontbenches include an All Female Frontbench, Youngest Frontbench, Most Experienced Frontbench, State Educated Frontbench, and a Pro Same-sex Marriage Frontbench, to name but a few.
Development
Over the coming weeks, we will continue to develop the site, introducing descriptions of the main political parties, adding graphs which will allow users to track or ‘follow’ how politicians are voting, as well as adding historical frontbenches to the gallery e.g. Tony Blair’s 1997 Frontbench, Margaret Thatcher’s 1979 Frontbench and Winston Churchill’s Wartime Frontbench.
For further information or if you would like to work with us, please contact firstname.lastname@example.org or tweet us at [@FantasyFbench](http://twitter.com/FantasyFbench).
Acknowledgements
Fantasy Frontbench is a not-for-profit organisation and is endorsed and funded by the Joseph Rowntree Reform Trust Ltd.
Javiera Atenas provided advice on open licensing and open data for the project.
This week, the Senate Committee on Health, Education, Labor and Pensions (aka “HELP Committee”) met to mark-up (debate, amend and vote on) the Every Child Achieves Act of 2015, a bill that would reauthorize the Elementary and Secondary Education Act (ESEA), formerly known as No Child Left Behind.
The American Library Association (ALA) sought amendments to require that every student have access to an “effective school library program,” defined in statute as one staffed by a certified librarian, equipped with up-to-date materials and technology, and enriched by a curriculum jointly developed by a grantee school’s librarians and classroom teachers, and to codify the currently funded Innovative Approaches to Literacy (IAL) program under ESEA.
While we did not get all we had hoped for, the Committee did adopt Sen. Sheldon Whitehouse’s (with co-sponsors: Sens. Bob Casey, Susan Collins, and Elizabeth Warren) amendment to amend Title V of ESEA establishing “effective school library programs” as an eligible use of funds under a program for literacy and arts education. Passed by unanimous consent as part of Chairman Sen. Lamar Alexander’s “manager’s amendment” package, this provision would allow grants to be awarded to low-income communities for “developing and enhancing effective school library programs, which may include providing professional development for school librarians, books, and up-to-date materials to low-income schools.”
The bill that the Committee marked up and passed will next be taken up by the full Senate, although we don’t yet know when. Our champion, Senator Jack Reed, intends to propose a stronger amendment on the Senate floor than the one adopted by the HELP Committee to broadly provide dedicated funding for school libraries and librarians in ESEA.
We would like to thank all of the library advocates who reached out to their senators and representatives to demand that Congress support effective school library programs. As we move forward in the advocacy process, there is more work to do. Stay tuned as we await further word!
The following is a guest post by Joey Heinen, National Digital Stewardship Resident at Harvard University Library.
As has been famously outlined by the Library of Congress on their website on sustainability factors for digital formats, digital material is just as susceptible to obsolescence as analog formats. Within digital preservation there are a number of strategies that can be employed in order to protect your data including refreshing, emulation or migration, to name a few. As the National Digital Stewardship Resident at Harvard Library, I am responsible for developing a format migration framework which can be continuously adapted for migration projects at Harvard.
In order to test the viability of this framework, I am also planning for migration of three obsolete formats within the Digital Repository Service (DRS) – Kodak PhotoCD, SMIL playlists and RealAudio. While each format will have its own challenges for a standard workflow, there are certain processes which will always be incorporated into the overall migration framework. In a sense I am helping to create a series of incantations that must be uttered in order to raise these much-cherished digital materials back from the dead. No sage-burning necessary.
Migration is the chosen digital preservation strategy for this project since the aim of migration is to move content from its previously tenuous origins to a format with much greater promise in terms of support and usage. Our overall goal is to continue to provide remote access on modern platforms in a way that best matches the original format.
A Framework Emerges – First Steps
I began my residency by performing a broad literature review on the status of migration projects across the library field. This was a great way to acquaint myself with the terrain, but greater depth would be needed by using some real examples and understanding the institutional context of Harvard – its staff structure, its resources, its policies and its digital repository. Bouncing back and forth between the broader framework and the individual format plans, some patterns began to emerge. After further processing, we arrived at some core attributes that will inform the overall framework. The specifics of this framework are still in development and are much too large to narrate here, but I’ll discuss some of the most distinct themes.
The mention of “stakeholder involvement” first is deliberate – without gaining a sense for the “who,” the project cannot commence. Depending on the type of content, the exact cast of characters may vary but the types of roles will stay somewhat consistent. For the framework, we identified the following key areas of responsibility and corresponding responsible parties:
- Project Management (that’s me!).
- Technical Guidance/Format Experts (those who understand the format best).
- Documentation (that’s me too! Though gathering provenance and creation of documentation throughout the migration may originate from other departments, depending).
- Quality Assurance/Plan Approval (that’s pretty much everyone but at different points in the process).
- Systems Conformance/Technical Infrastructure (this is almost always our friends in Library IT and Metadata, who inform us of how the plan does or does not comply with current technological procedures and infrastructure).
- Content Ownership (curators or collection managers, involvement is generally just to be informed of major decisions).
Defined Project Phases
In general, our migration plans can be broken down into these essential phases:
- Planning for the Test.
- Refining the Plan.
- Executing the Plan.
- Verifying Results and Project Wrap-Up.
From these project phases, we then defined the following within each phase:
- Workflow Activities – essential steps in the migration workflow.
- Workflow Components – ways of grouping the more granular activities.
- Project Deliverables – this could take on the form of: the migrated content itself; documentation or metadata generated along the way; diagrams of the workflow and the migration path (e.g. how the content in relation to the Harvard repository will change from pre- to post-migration); or new revelations in digital preservation policies e.g. storage and retention plans.
Last but not least, we want to consider how other projects within the library might impact the migration plan, whether in terms of timing and staff availability, as well as projects that might impact the infrastructure upon which migration is supported. For example, the metadata from Harvard’s DRS is being migrated to a new version of the DRS which includes changes to how relationships between files and objects are described. The relationship structure of still image objects will be completely different before and after this metadata migration so a plan to migrate the Kodak PhotoCD files will need to take this into consideration.
Format Specifics – Examples
In terms of how this framework has been used on the actual formats, we have made the most progress on Kodak PhotoCD, mostly because it’s less complex and less staff intensive than the SMIL/RealAudio formats. So far we have completed the analysis, creation of the test, and the testing itself, and are beginning to define how the old image objects will be changed relative to the inclusion of migrated content, additional artifacts (e.g. metadata) and the new content model structuring. The details of our decisions around successfully migrating PhotoCD content are too lengthy for this post (though more information can be found on the NDSR blog). However, the Migration Workflow and Migration Pathway diagrams shown here help to show “how the sausage is made.”
The Migration Workflow demonstrates every step of the process from gathering documentation for initial analysis to ingest of the migrated content into the repository. In the example at left, we see the first two components of Phase 1 of the Migration Workflow – Format/Tools Research and Confirming Migration Criteria. As is shown in the corresponding legend, stakeholder involvement is indicated by a colored box which names the stakeholder group within each component. These roles were designed based on the RACI Responsibility Assignment Matrix, which defines four levels of responsibility.
The Migration Pathway diagram (at right) shows how content will be transformed by a migration. A diagram is produced for each “bucket” of content for which the same tools, settings and outputs can be used uniformly based on shared technical characteristics. This example, from the Horblit Collection, a collection of daguerreotypes initially digitized in PhotoCD form, shows the ways in which the original PhotoCD content as found within the DRS will be converted and newly packaged and ingested into the repository. It considers how the image objects look now (DRS1), how they will look after the metadata migration (DRS2) and how the object will look after the content is migrated.
In the two months remaining for my residency I will be completing the overall framework, and working on the Kodak PhotoCD and SMIL/RealAudio plans (though execution of these plans will certainly fall outside of this timeline). After planning for the format-specific migration and going through several passes at the overall framework, we are getting closer to an actionable model for ongoing migration projects.
It has been fascinating to oscillate between deep analysis of the technical and infrastructural challenges faced with each format and finding ways to abstract these processes into a template that can be continuously adapted. The result will certainly be of use to Harvard, and our hope is that in sharing it with the larger digital preservation field that it will be useful to others as well. For the finalized spells and incantations, check the NDSR blog or Harvard website at the end of May. Presto Change-o!
After the passage of SEA 101 (the Indiana Religious Freedom Restoration Act), many scheduled attendees of DPLAFest were conflicted about its location in Indianapolis. Emily Gore, DPLA Director for Content, captured both this conflict and the opportunity the location provides when she wrote:
We should want to support our hosts and the businesses in Indianapolis who are standing up against this law… At DPLAfest, we will also have visible ways to show that we are against this kind of discrimination, including enshrining our values in our Code of Conduct. We encourage you to use this as an opportunity to let your voice and your dollars speak.
As DPLAFest attendees, patronizing businesses identifying themselves with Open for Service is an important start, but some of us wanted to do more. During our visit to Indianapolis, we are donating money to local charities supporting the communities and values that SEA 101 threatens.
One such local charity is the Indiana Youth Group (IYG). The IYG “provides safe places and confidential environments where self-identified lesbian, gay, bisexual, transgender, and questioning youth are empowered through programs, support services, social and leadership opportunities and community service. IYG advocates on their behalf in schools, in the community and through family support services.” IYG was written up as a direct-action donation option in the New Civil Rights Movement, and they provide services and support in parts of the state with a more hostile legal environment than Indianapolis.
This kind of local, direct action effort needs our support in Indiana right now. If you can, please consider donating to the Indiana Youth Group while in Indiana for DPLAFest. There is an existing GoFundMe campaign that IYG recommended linked below. If you choose to donate via GoFundMe, please consider tagging your donation with #DPLAFest so that we can communicate the goodwill of DPLAFest attendees as a group to the charity. The GoFundMe campaign sends money directly to IYG regardless of fundraising goals.
GoFundMe for Indiana Youth Group: http://www.gofundme.com/qpkabg
You can also donate via PayPal through IYG’s website. If you choose to donate through PayPal, please consider mentioning DPLAFest in the related forms on IYG site. IYG has offered to collate those responses with donations to again communicate the positive support DPLAFest attendees give to the charity and to LGBTQ youth in the state of Indiana.
Thank you for considering joining us and other DPLAFest attendees in supporting LGBTQ communities in Indiana. We look forward to seeing you in Indianapolis.
Open Knowledge Foundation: Honouring the memory of leading Open Knowledge community member Subhajit Ganguly
It is with great sadness that we have learned that Mr. Subhajit Ganguly, an Open Knowledge Ambassador in India and a leading community member in the entire region, has suddenly and tragically passed away.
Following a short period of illness Subhajit Ganguly, who was only 30 years old, passed away on the morning of April 7, local time, in the hospital in his hometown of Kolkata, India. His demise came as a shock to his family and loved ones, as well as to his colleagues and peers in the global open data and open knowledge community.
Subhajit was known as a relentless advocate for justice and equality, and a strong proponent and community builder around issues such as open data, open science and open education, which were all areas to which he devoted a large part of both his professional and personal time. Most recently he was the main catalyst and organiser of India Open Data Summit, and he successfully contributed as project lead for the Indian Local City Census as well as being a submitter and reviewer of datasets in the Global Open Data Index, a global community-driven project that compares the openness of datasets worldwide in service of another issue most pressing to him: political transparency and accountability.
Subhajit was also instrumental in building the Open Knowledge India Local Group over the past two years, alongside also volunteering his time to coordinate other groups and initiatives within the open data landscape. Just last summer he attended the Open Knowledge Festival in Berlin to join his fellow community leaders to plan the future of open knowledge and open data in India, regionally in AsiaPAC, and globally.
As the news has spread across the globe during the last few days, messages praising Subhajit’s character and work have been pouring in from community leaders and members from near and far. He will be tremendously missed, and we join the many voices across the world mourning his loss.
Our thoughts and condolences go out to his family and loved ones. We hope that his work and vision will continue to stand as a significant example to follow for people around the world. May Subhajit rest in peace.
In September, I wrote a post about new collaborative technology from Crestron. We installed AirMedia in our library, and we are now looking at AirTame as a possible next generation version of collaborative technology.
Airtame works on all mobile devices. AirMedia does this too, but its tablet features have been less than ideal. Airtame was able to raise more money than expected and is currently working to scale its production.
My university is also considering how collaborative technologies can be used in the classroom. This type of technology will allow for enhanced group work, enhanced presentations, and the instructor being able to move around the classroom to work with different students instead of being tied to the front of the classroom.
As technology continues to move toward mobile and wearable, the ability to show a group what is on a small screen will become more important in both education and the business world.
How is your library using collaborative technology?
How can libraries support new communication methods using collaborative technology?
DuraSpace News: Recordings Available: “Integrating ORCID Persistent Identifiers with DSpace, Fedora and VIVO.”
DuraSpace launched its 11th Hot Topics Community Webinar Series, “Integrating ORCID Persistent Identifiers with DSpace, Fedora and VIVO,” last month. Curated by ORCID’s European Regional Director, Josh Brown, this series provided detailed insights into how ORCID persistent digital identifiers can be integrated with DSpace and Fedora repositories and with the VIVO open source semantic web application.
The Web service maintenance scheduled for Friday, April 17 has been canceled and will be rescheduled. Stay tuned to Developer Network for updates on future maintenance windows.
We apologize for any inconvenience.
Tuesday May 20, 2015
1:00 pm – 2:00 pm Central Time
Register now for this webinar
A brand new LITA Webinar on youth and technology.
In this digital age it has become increasingly important for libraries to infuse technology into their programs and services. Youth services librarians are faced with many technology routes to consider and app options to evaluate and explore. Join Claire Moore from the Darien Public Library to discuss innovative and effective ways the library can create opportunities for children, parents and caregivers to explore new technologies.
Claire Moore is the Head of Children’s Services at Darien Library in Connecticut. She is a member of ALSC’s School Age Programs and Services Committee and the Digital Content Task Force. Claire earned her Master’s in Library and Information Science at Pratt Institute in New York and currently lives in Brooklyn, NY.
Can’t make the date but still want to join in? Registered participants will have access to the recorded webinar.
LITA Member: $45
Register Online page arranged by session date (login required)
Mail or fax form to ALA Registration
Call 1-800-545-2433 and press 5
Questions or Comments?
For all other questions or comments related to the webinar, contact LITA at (312) 280-4269 or Mark Beatty, email@example.com.
My Communications of the ACM came in the mail recently, and in an article about the future of scholarly publishing in computer science (in general, and what the ACM Publications Board is thinking about doing), there was this paragraph about the attitudes of a subset of ACM members toward open access publishing.
Open access models are an area of broad interest, and we could fill a dozen columns on different issues related to open access publishing. Based on actions taken by certain research funders (primarily governmental, but also foundations), we have been looking at whether and how to incorporate author-pays open access into ACM’s journals. We asked ACM Fellows about author-pays Gold OA journals, and specifically whether they preferred a Gold “umbrella” journal across computer science vs. Gold-only specialty journals vs. Gold editions of current journals. Gold-only specialty journals were preferred by 15% of Fellows; a Gold umbrella journal by 29%; and Gold editions of existing journals by 45%. Ten percent were against author-pays open access generally, preferring the current model or advocating for non-author-pays open access models.
Note that the ACM Fellows are “the top 1% of ACM members [recognized] for their outstanding accomplishments in computing and information technology and/or outstanding service to ACM and the larger computing community.” So it is hardly a representative sample of computer science professors. They do have a survey form for more broad, if yet unscientific, input on the topic.
I’ve tangled with the ACM editor-in-chief before about the cost of ACM Digital Library subscriptions cross-subsidizing other ACM activities. Others have taken the ACM to task for its open access policies. It is good to see the publications committee learning from past missteps, educating and then listening to its members, and being willing to consider change in this area.
The author advocates an approach to university curriculum that re-emphasizes the student's role in the search for truth and knowledge by providing essential critical thinking skills and treating undergraduate students as full participants in the academic discussion.
Preamble
The academy is a place to develop critical thinking skills, and a place to develop those skills by participating in discussions seeking truth and knowledge. These conversations may occur between students in informal spaces; they may be facilitated by a professor and take place during a single class session or over multiple sessions during a course; or they may take place over centuries (most commonly through the medium of the written word).
As a university, we recognize the value of all of these conversations in developing citizens with well-honed critical thinking skills. However, I would argue that our focus (at least at the undergraduate level) has been on the level of single and multiple class discussions. Students are often assigned course work for which the only intended audience is the professor or marking T.A.; the audience for presentations is normally just the rest of the class. A typical unit of work is the “essay” (from the French: essayer, meaning “to try”).
(Rhetorical question alert!) But what are the students trying for? Typically, they are trying for grades; some for an A, some simply to pass. But are they trying to contribute to the greater academic discussions? Where do those essays go in a month, or a year? Do students see their papers as parts of a greater continuum of the academic discussion, or do they see them as a means to an end? Are students exhorted to aspire to publishing their papers on any scale? What effect does the treatment of course work as an ephemeral entity, rather than a permanent contribution to the field of knowledge, have on students’ motivation to excel in the application of their critical thinking skills, to be creative, to write high quality papers? Does the knowledge that their days and nights of hard work will quickly be consigned to the trash bin cause students to treat the work of the intellectual giants that preceded them with a similar disregard?
Inspiration
I initially started worrying about this because of a third-year assignment that simply cited “Google” as its sole source. The sad confusion of search tool with source immediately raised my concern about the student's ability to evaluate alternative sources of information and opinion for authority. I doubted that this student had completed the Library's introductory tutorial on searching and citing sources, and that reinforced my desire to encourage programs to make this course a mandatory requirement. During a casual conversation with Dr. David Robinson, he disclosed that he assigned basic literature research tasks in every one of his courses because he could not guarantee that his students had learned those skills elsewhere. I continued to reflect on this problem in the attempt to develop an approach to motivating the student to want to participate in the overarching discussions – and that is where the idea of “research across the curriculum” came to mind.
I will credit Dr. Laurence Steven with the idea of motivating higher quality undergraduate work through the expectation of publication. In his fourth-year Literary Criticism course in 1996, he told students at the outset of the class that he planned to compile and publish the complete set of our final assignments. Even though the press run was undoubtedly under 100, the commitment to taking our work seriously positively influenced our efforts to produce high-quality assignments.
Emphasizing the academic discussion
The overarching message we can send to students is: “We take your effort seriously, and will help you contribute to your chosen discipline.”
Publishing offers the carrot of fame and the stick of exposure. I cannot help but think that the expectation of publishing your work will improve the quality of that work.
We obviously cannot expect a first year student to publish their work in a traditional academic journal. However, the Web has given us an alternative publishing method that can be controlled to meet the student's comfort level: publishing visibility could be limited to the author herself, to the professor, to the class, to the program, to the university, and to the world. If we created a simple Web-based repository, we could allow a student to first work on drafts of their assignment, then open it up to their professor or a TA for initial review, then open it up to the class to exchange their work with their classmates and participate in peer review. Outstanding work could be surfaced at wider levels of availability. Of course, given that the student retains copyright over their work, they would be free to republish their work as they see fit (on a personal Web log, on a discipline-related mailing list, to an academic journal, etc). This opens up an opportunity to discuss intellectual property issues and the characteristics of various publishing mechanisms.
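The escalating visibility levels proposed above can be made concrete. Here is a minimal sketch, assuming a simple ordered-tier model; the tier names and the access rule are illustrative inventions, not a description of any existing repository software:

```python
# A toy model of escalating publishing visibility for student work.
from enum import IntEnum

class Visibility(IntEnum):
    AUTHOR = 1      # drafts visible only to the student
    PROFESSOR = 2   # opened for initial review by professor or TA
    CLASS = 3       # shared with classmates for peer review
    PROGRAM = 4
    UNIVERSITY = 5
    WORLD = 6       # published to the open web

def can_view(viewer_level: Visibility, paper_visibility: Visibility) -> bool:
    """A viewer may read a paper if their relationship to it is at least
    as close as the paper's current visibility tier."""
    return viewer_level <= paper_visibility

# The professor can read a paper opened to the class, but not a private draft.
print(can_view(Visibility.PROFESSOR, Visibility.CLASS))   # True
print(can_view(Visibility.PROFESSOR, Visibility.AUTHOR))  # False
```

A single ordered comparison keeps the policy easy to audit: raising a paper's visibility never revokes access already granted at inner tiers.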
Through the course of a student's career, this Web-based publishing mechanism would serve as an electronic portfolio of their work. If a student chose to make their work visible outside of the class, they would be able to track citations to that work over time - particularly if professors chose to surface the work of previous students in a given class as optional or required references in addition to traditional sources. We know that one of the primary uses of the Laurentian University Archives today is by students seeking the fourth-year papers of previous students in their disciplines so that they can find work to build upon.
At the fourth-year level, we could strongly encourage (to the point of making it an unstated assumption) that fourth-year work should be published in some fashion. The publishing schedule of traditional journals makes it unlikely that a student could achieve publication within the normal class schedule, however we could commit some resources to assisting those alumni who want to polish their fourth-year papers for journal publication (without necessarily requiring a complete graduate program). Assuming that the J.N. Desmarais Library goes forward with the Laurentian University Institutional Repository, we could offer that as a venue for publishing fourth year work (or exceptional work from previous years).
If there are doubts that fourth-year work is of publishable quality, I would like to refer back to an evaluation (???) of the fourth-year papers that are held by the Laurentian University archives. Many of these papers were found to be of a quality comparable to Master's theses (the hypothesis was that the lack of graduate programs resulted in higher-quality undergraduate work).
Today I found the following resources and bookmarked them on Delicious.
- eval.in Paste and execute code online.
New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.
New This Week
Visit the LITA Job Site for more available jobs and for information on submitting a job posting.
Science uses the art of observation to unearth truth. Sometimes the observation is minutely focused on a small constituent of a much larger ecosystem. By doing this, it can be possible to detect larger truths from such minutely focused observation. This brings me to my latest metadata investigation, which is about as minutely focused within the library metadata world as it is possible to be.
I decided to look at the life of a single MARC subfield, in this case the lowly 034 $2. The 034 field is “Coded Cartographic Mathematical Data”, proposed and adopted in 2006. The $2 subfield is where one can record the source of the data in the 034, with values drawn from a specified list.
From my “MARC Usage in WorldCat” work, I already knew that as of last January there were about 2.4 million records with an 034 field. I also knew that the $2 subfield of the 034 only appeared 1,976 times. Of course a year had passed so that figure was likely low.
So the first thing I did was to grab all of the 034 $2 subfields and count how many times each source code had been used. Since the point of my exercise was not to show errors, I combined entries with typos with what they should have been, and only counted as “errors” entries that were clearly in the wrong place in the field:

3868 bound
2539 gooearth
1069 geoapn
215 geonet
157 geonames
129 pnosa2011
46 other
26 gnis
26 ERRORS
17 cga
5 local
3 gnrnsw
3 aadcg
1 wikiped
1 gettytgn
1 geoapn geonames
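The combine-and-count step can be sketched in Python. The sample values and the typo map below are hypothetical stand-ins for illustration; the real tally came from WorldCat records, which are not reproduced here.

```python
from collections import Counter

# Hypothetical stand-ins for extracted 034 $2 values.
raw_values = ["gooearth", "goerth", "bound", "bound", "geoapn", "gnis"]

# Fold obvious typos into the source code they were meant to be,
# mirroring the combining step described above ("goerth" is a made-up typo).
typo_fixes = {"goerth": "gooearth"}

counts = Counter(typo_fixes.get(v, v) for v in raw_values)
for code, n in counts.most_common():
    print(n, code)
```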
I then wanted to find out who was using this subfield, so I ran a job to extract the 040 $a, the “original cataloging agency” and totaled the occurrences. It turns out the vast majority come from five institutions:
2471 National Library of Israel (J9U)
1632 Libraries Australia (AU@)
1076 British Library (UKMGB)
885 Pennsylvania State University (UPM)
799 Cambridge University (UkCU)
Then it drops off rather precipitously from there:
213 Agency for the Legal Deposit Libraries (Scotland) (StEdALDL)
206 New York Public (NYP)
117 Commonwealth Libraries, Bureau of State Library, Pennsylvania (PHA)
101 Yale University, Beinecke Rare Book and Manuscript Library (CtY-BR)
Curious about how the main user of this element was using it, I contacted the National Library of Israel. They were kind enough to reply to my odd query:
We have added geographic coordinates to records that describe ketubot, Jewish marriage contracts. The contracts almost always include the geographic location where the wedding takes place.
Using Google Earth ($2 gooearth), we added the coordinates with the intention of enabling the display of a Google map in this website.
I don’t believe that the site is fully functional as to their intended goal, but you can at least start to get an idea of how this data is going to be used. So even a lowly subfield can have higher aspirations for impact than may seem warranted at first.

About Roy Tennant
Roy Tennant works on projects related to improving the technological infrastructure of libraries, museums, and archives.
Schema.org is basically a simple vocabulary for describing stuff on the web. Embed it in your HTML and the search engines will pick it up as they crawl, and add it to their structured data knowledge graphs. They even give you three formats to choose from (Microdata, RDFa, and JSON-LD) when doing the embedding. I’m assuming, for this post, that the benefits of being part of the knowledge graphs that underpin so-called semantic search, and hopefully triggering some Rich Snippet enhanced results display as a side benefit, are self-evident.
The vocabulary itself is comparatively easy to apply once you get your head around it: find the appropriate Type (Person, CreativeWork, Place, Organization, etc.) for the thing you are describing, check out the properties in the documentation, and code up the ones you have values for. Ideally, provide a URI (URL in Schema.org) for a property that references another thing; if you don’t have one, a simple string will do.
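As a concrete sketch of that workflow, here is a minimal description in JSON-LD (one of the three embedding formats), assembled as a Python dict. All names and URLs below are made up for illustration.

```python
import json

# Pick the Type, then fill in the properties you have values for.
# "affiliation" references another thing via a URI (@id); if no URI
# were available, a plain string name would do instead.
person = {
    "@context": "http://schema.org/",
    "@type": "Person",
    "name": "Jane Example",          # hypothetical values throughout
    "jobTitle": "Metadata Librarian",
    "affiliation": {
        "@type": "Organization",
        "@id": "http://example.org/org/123",
        "name": "Example University",
    },
}
print(json.dumps(person, indent=2))
```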
There are a few strangenesses that hit you when you first delve into the vocabulary. For example, there is no problem in describing something that is of multiple types: a LocalBusiness is both an Organization and a Place. This post is about another unusual, but very useful, aspect of the vocabulary: the Role type.
At first glance at the documentation, Role looks like a very simple type with a handful of properties. On closer inspection, however, it doesn’t seem to fit in with the rest of the vocabulary. That is because it is capable of fitting almost anywhere — anywhere there is a relationship between one type and another, that is. It is a special-case type that allows a relationship, say between a Person and an Organization, to be given extra attributes. Some might term this a form of annotation.
So what need is this satisfying, you may ask? It must be a significant need to justify the creation of a special case in the vocabulary. Let me walk through a case, taken from a Schema.org blog post, to explain a need scenario and how Role satisfies that need.

Starting With American Football
Say you are describing members of an American football team. Firstly you would describe the team using the SportsOrganization type, giving it a name, sport, etc. Using RDFa:

<div vocab="http://schema.org/" typeof="SportsOrganization" resource="http://example.com/teams/tlg">
  <span property="name">Touchline Gods</span>
  <span property="sport">American Football</span>
</div>
Then describe a player using a Person type, providing name, birth date, etc.:

<div vocab="http://schema.org/" typeof="Person" resource="http://example.com/folks/chucker">
  <span property="name">Chucker Roberts</span>
  <span property="birthDate">1989</span>
</div>
Now let’s relate them together by adding an athlete relationship to the SportsOrganization description:

<div vocab="http://schema.org/" typeof="SportsOrganization">
  <span property="name">Touchline Gods</span>
  <span property="sport">American Football</span>
  <span property="athlete" typeof="Person" src="http://example.com/folks/chucker">
    <span property="name">Chucker Roberts</span>
    <span property="birthDate">1989</span>
  </span>
</div>
Let’s take a look at the data structure we have created using Turtle — not an HTML markup syntax, but an excellent way to visualise the data structures isolated from the HTML:

@prefix schema: <http://schema.org/> .

<http://example.com/teams/tlg> a schema:SportsOrganization;
    schema:name "Touchline Gods";
    schema:sport "American Football";
    schema:athlete <http://example.com/folks/chucker> .

<http://example.com/folks/chucker> a schema:Person;
    schema:name "Chucker Roberts";
    schema:birthDate "1989" .
So we now have Chucker Roberts described as an athlete on the Touchline Gods team. The obvious question then is how to describe the position he plays on the team. We could have extended the SportsOrganization type with a property for every position, but scaling that across every position for every team sport would soon have produced far more properties than is sensible, and far beyond the maintenance scope of a generic vocabulary such as Schema.org.
This is where Role comes in handy. Regardless of the range defined for any property in Schema.org, it is acceptable to provide a Role as a value. The convention is then to use a property with the same name as the one the Role is a value for, to remake the connection to the referenced thing (in this case the Person). In simple terms, we have just inserted a Role type between the original two descriptions.
This indirection has not added much, you might initially think, but Role has some properties of its own (startDate, endDate, roleName) that help us qualify the relationship between the SportsOrganization and the athlete (Person). For organizations there is a subtype of Role, OrganizationRole, which allows the relationship to be qualified slightly more.
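The same indirection reads clearly in JSON-LD, the third embedding format. Here is a sketch (not from the original post; just the same data re-expressed), assembled as a Python dict:

```python
import json

# The same-named property ("athlete") appears twice: once on the team
# pointing at the Role, and once on the Role pointing back at the Person.
team = {
    "@context": "http://schema.org/",
    "@type": "SportsOrganization",
    "name": "Touchline Gods",
    "sport": "American Football",
    "athlete": {
        "@type": "OrganizationRole",
        "roleName": "Quarterback",
        "startDate": "01072014",
        "number": "11",
        "athlete": {
            "@type": "Person",
            "name": "Chucker Roberts",
            "birthDate": "1989",
        },
    },
}
print(json.dumps(team, indent=2))
```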
RDFa:

<div vocab="http://schema.org/" typeof="SportsOrganization" resource="http://example.com/teams/tlg">
  <span property="name">Touchline Gods</span>
  <span property="sport">American Football</span>
  <span property="athlete" typeof="OrganizationRole">
    <span property="startDate">01072014</span>
    <span property="roleName">Quarterback</span>
    <span property="number">11</span>
    <span property="athlete" typeof="Person" src="http://example.com/folks/chucker">
      <span property="name">Chucker Roberts</span>
      <span property="birthDate">1989</span>
    </span>
  </span>
</div>
and in Turtle:

@prefix schema: <http://schema.org/> .

<http://example.com/teams/tlg> a schema:SportsOrganization;
    schema:name "Touchline Gods";
    schema:sport "American Football";
    schema:athlete [
        a schema:OrganizationRole;
        schema:roleName "Quarterback";
        schema:startDate "01072014";
        schema:number "11";
        schema:athlete <http://example.com/folks/chucker>
    ] .

<http://example.com/folks/chucker> a schema:Person;
    schema:name "Chucker Roberts";
    schema:birthDate "1989" .

Beyond American Football
So far I have just been stepping through the example provided in the Schema.org blog post on this. Let’s take a look at an example from another domain – the one I spend my life immersed in – libraries.
There are many relationships between creative works that libraries curate and describe (books, articles, theses, manuscripts, etc.) and people & organisations that are not covered adequately by the properties available (author, illustrator, contributor, publisher, character, etc.) in CreativeWork and its subtypes. By using Role, in the same way as in the sports example above, we have the flexibility to describe what is needed.
Take a book (How to be Orange: an alternative Dutch assimilation course) authored by Gregory Scott Shapiro, which has a preface written by Floor de Goede. As there is no writerOfPreface property we can use, the best we could do is put Floor de Goede in as a contributor. However, by using Role we can qualify his contribution as that of writer of preface.
In Turtle:

@prefix schema: <http://schema.org/> .
@prefix relators: <http://id.loc.gov/vocabulary/relators/> .
@prefix viaf: <http://viaf.org/viaf/> .

<http://www.worldcat.org/oclc/859406554> a schema:Book;
    schema:name "How to be orange : an alternative Dutch assimilation course";
    schema:author viaf:305830120;    # Gregory Scott Shapiro
    schema:exampleOfWork <http://worldcat.org/entity/work/id/1404771725>;
    schema:contributor [
        a schema:Role;
        schema:roleName relators:wpr;      # Writer of preface
        schema:contributor viaf:283191359  # Floor de Goede
    ] .
and in RDFa:

<div vocab="http://schema.org/" typeof="Book" resource="http://www.worldcat.org/oclc/859406554">
  <span property="name">How to be orange : an alternative Dutch assimilation course</span>
  <span property="author" src="http://viaf.org/viaf/305830120">Gregory Scott Shapiro</span>
  <span property="exampleOfWork" src="http://worldcat.org/entity/work/id/1404771725"></span>
  <span property="contributor" typeof="Role">
    <span property="roleName" src="http://id.loc.gov/vocabulary/relators/wpr">Writer of preface</span>
    <span property="contributor" src="http://viaf.org/viaf/283191359">Floor de Goede</span>
  </span>
</div>
You will note that in this example I have made use of URLs to external resources (VIAF for identifying the people, and the Library of Congress relator codes) instead of defining them myself as strings. I have also linked the book to its Work definition, so that someone exploring the data can discover other editions of the same work.
Do I always use Role?
In the above example I relate a book to two people: the author and the writer of the preface. I could have linked to the author via another Role, with the roleName being ‘Author’ or <http://id.loc.gov/vocabulary/relators/aut>. Although possible, this is not a recommended approach: wherever possible, use the properties defined for a type, as that is what data consumers such as search engines will be looking for first.
To demonstrate the flexibility of the Role type, here is the markup that shows a small diversion in my early career:

@prefix schema: <http://schema.org/> .

<http://www.wikidata.org/entity/Q943241> a schema:PerformingGroup;
    schema:name "Gentle Giant";
    schema:employee [
        a schema:Role;
        schema:roleName "Keyboards Roadie";
        schema:startDate "1975";
        schema:endDate "1976";
        schema:employee [
            a schema:Person;
            schema:name "Richard Wallis"
        ]
    ] .
This demonstrates the ability of Role to provide added information about most relationships between entities, in this case the employee relationship. Often Role itself is sufficient, and the vocabulary can be extended with subtypes of Role to provide further use-case-specific properties.
Whenever possible use URLs for roleName
In the above example, it is exceedingly unlikely that there is a citable definition on the web that I could link to for the roleName, so it is perfectly acceptable to just use the string “Keyboards Roadie”. However, to help the search engines understand unambiguously what role you are describing, it is always better to use a URL. If you can’t find one, for example in the Library of Congress Relator Codes or in Wikidata, consider creating one yourself in Wikipedia or Wikidata for others to share. Another spin-off benefit of using URIs (URLs) is that they are language independent: whatever the language of the labels in the data, the URI always means the same thing. Sources like Wikidata often have names and descriptions for things defined in multiple languages, which can be useful in itself.
This very flexible mechanism has many potential uses when describing your resources in Schema.org, but there is always a danger in overusing useful techniques such as this. Before reaching for Role, be sure there is not already a way to express the relationship within Schema.org, or one worth proposing to those who look after the vocabulary.
Good luck in your role in describing your resources, and the relationships between them, using Schema.org.
Last week, the American Library Association (ALA) Washington Office hosted librarians from Thailand who are visiting the United States to learn about library practices and futures. Our visitors, Supawan Ardkhla and Nusila Yumaso, are participants in the U.S. State Department’s International Visitor Leadership Program. Through short-term visits to the U.S., foreign leaders in a variety of fields experience our country firsthand and cultivate professional relationships. They were accompanied by interpreter Montanee Anusas-amornkul.
The visitors’ agenda was wide-ranging. Topics included ebooks, digital literacy, libraries as place, employment and entrepreneurship, and many more. After Washington, the Thai librarians visited libraries in several other U.S. cities.
ALA Washington Office Executive Director Emily Sheketoff and I represented ALA. Hosting visitors from abroad is a regular responsibility of the Office, and we’ve met with librarians from many other countries around the world, from Lebanon to Colombia.
Those who have been paying attention to the cutting edge of digital libraries no doubt know about the Hydra project headed up by Stanford. Hydra is a digital repository system that is built using Ruby and is designed to accept the full range of digital object types that a large research library must manage. Built on top of Fedora and Solr, with Blacklight as the default front-end, one doesn’t normally associate ease of installation with a stack like that. Heck, you could spend a week just getting all of the dependencies installed, configured, and up and running.
So color me surprised when the Digital Public Library of America, Stanford University, and the Duraspace organization announced that IMLS had awarded them a $2 million National Leadership Grant to develop “Hydra-in-a-Box”. Just as it sounds, the goal is to “build, bundle, and promote a feature-complete, robust digital repository that is easy to install, configure, and maintain—in short, a next-generation digital repository that will work for institutions large and small, and is capable of running as a hosted service.”
That is no small goal, and a laudable one at that. But…gosh. What a distance there is to travel to get there. The project has it pegged at 30 months, so two and a half years. That sounds about right, and so far Tom Cramer has built one of the most broad-based coalitions I’ve seen in academic libraries around Hydra, so you won’t find me betting against him. Especially since he just landed $2 million to help him build out his pet project. So as much as it pains this Cal Bear to say it, Go Stanford!