Join a panel of Book Industry Study Group (BISG) and American Library Association (ALA) leaders at this year’s 2015 ALA Annual Conference in San Francisco when they discuss the results of a newly-released study on public library patrons’ use of digital content.
During the conference session “Digital Content in Public Libraries: What Do Patrons Think?” panelists will discuss the results of a new study by the BISG and ALA that was designed to provide invaluable insight into how readers interact with e-books in a library environment. The session takes place from 3:00 to 4:00 p.m. on Sunday, June 28, 2015, at the Moscone Convention Center in room 131 of the North Building.
The digital content survey was developed to understand the behavior of library patrons, including their use of digital resources and other services offered by public libraries. The study examined the impact of digital consumption behaviors, including the adoption of new business models, on library usage across America.

Speakers
- Kathy Rosa, director, Office for Research and Statistics, American Library Association
- Carrie Russell, program director, Public Access to Information, Office for Information Technology Policy, American Library Association
- Nadine Vassallo, project manager, Research & Information, Book Industry Study Group
The post How do library patrons feel about digital content? appeared first on District Dispatch.
A previous post I made reviewing the Ithaka report “Streamlining access to Scholarly Resources” got a lot of attention. Thanks!
The primary issue I’m interested in there: Getting our patrons from a paywalled scholarly citation on the open unauthenticated web, to an authenticated library-licensed copy, or other library services. “Bridging the gap”.
Here, we use Umlaut to turn our “link resolver” into a full-service landing page offering library services for both books and articles: Licensed online copies, local print copies, and other library services.
This means we’ve got the “receiving” end taken care of — here’s a book and an article example of an Umlaut landing page — the problem reduces to getting the user from the open unauthenticated web to an Umlaut page for the citation in question.
Which is still a tricky problem. In this post, brief discussion of two things: 1) The new “Google Scholar Button” browser extension from Google, which is interesting in this area, but I think ultimately not enough of a solution to keep me from looking for more, and 2) Possibilities of Zotero open source code toward our end.

The Google Scholar Button
This plugin will extract the title of an article from a page (either text you’ve selected on the page first, or it will try to scrape a title from HTML markup), and give you search results for that article title from Google Scholar, in a little popup window.
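As a rough sketch of what such title scraping can look like (Google's actual heuristics aren't published; this just illustrates the common Highwire-style `citation_*` meta tags that many scholarly publisher pages expose, using a made-up sample page):

```python
from html.parser import HTMLParser

class CitationMetaParser(HTMLParser):
    """Collect Highwire-style citation_* meta tags from an HTML page."""
    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag == "meta":
            d = dict(attrs)
            name, content = d.get("name"), d.get("content")
            if name and name.startswith("citation_") and content:
                self.meta[name] = content

# Hypothetical sample page; real publisher pages carry tags like these.
page = """<html><head>
<meta name="citation_title" content="Streamlining Access to Scholarly Resources">
<meta name="citation_date" content="2015">
</head><body></body></html>"""

parser = CitationMetaParser()
parser.feed(page)
title = parser.meta.get("citation_title")
```

A title-search tool like the Google Scholar Button only needs that one field; the Zotero approach keeps all the scraped fields.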
Interestingly, this is essentially the same thing a couple of third party software packages have done for a while: The LibX “Magic Button”, and Lazy Scholar. But now we get it in an official Google release, instead of hacky workarounds to Google’s lack of API from open source.
The Google Scholar Button is basically trying to bridge the same gap we are; it provides a condensed version of google scholar search results, with a link to an open access PDF if Google knows about one (I am still curious how many of these open access PDF’s are not-entirely-licensed copies put up by authors or professors without publisher permissions);
And it in some cases provides an OpenURL link to a library link resolver, which is just what we’re looking for.
However, it’s got some limitations that keep me from considering it a satisfactory ‘Bridging the Gap’ solution:
- In order to get the OpenURL link to your local library link resolver while you are off campus, you have to set your Google Scholar preferences in your browser, which is pretty confusing to do.
- The title has to match in Google Scholar’s index of course. Which is definitely extensive enough to still be hugely useful, as evidenced by the open source predecessors to Google Scholar Button trying to do the same thing.
- Most problematically of all, Google Scholar Button results will only show the local library link resolver link for some citations: the ones that have been registered as having institutional fulltext access in your institutional holdings registered with Google. I want to get users to the Umlaut landing page for any citation they want, even if we don’t have licensed fulltext (and we might, even if Google doesn’t think we do; the holdings registrations are not always entirely accurate). I want to show them local physical copies (especially for books), and ILL and other document delivery services.
- The full Google Scholar gives a hard-to-find (but at least it’s there) OpenURL link for “no local fulltext” under a ‘more’ link, but the Google Scholar Button version doesn’t offer even this.
- Books/monographs might not be the primary use case, but I really want a solution that works for books too — and books are something users may be especially interested in a physical copy instead of online fulltext for, and books are also something that our holdings registration with Google pretty much doesn’t include, even ebooks. And book titles are a lot less likely to return hits in Google Scholar at all.
I really want a solution that works all or almost all of the time to get the patron to our library landing page, not just some of the time, and my experiments with Google Scholar Button revealed more of a ‘sometimes’ experience.
I’m not sure if the LibX or Lazy Scholar solutions can provide an OpenURL link in all cases, regardless of Google institutional holdings registration. They are both worth further inquiry for sure. But Lazy Scholar isn’t open source, and I find its UI not great for our purposes. And I find LibX a bit too heavyweight for solving this problem, and have some other concerns about it.
So let’s consider another avenue for “Bridging the Gap”….

Zotero’s scraping logic
Instead of trying to take a title and find a hit in a mega-corpus of scholarly citations like the Google Scholar Button approach, another approach would be to try to extract the full citation details from the source page, and construct an OpenURL to send straight to our landing page.
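To make that concrete, here is a minimal sketch (the resolver base URL is hypothetical; substitute your institution's) of turning scraped citation fields into an OpenURL 1.0 KEV link of the kind a link resolver or Umlaut landing page accepts:

```python
from urllib.parse import urlencode

# Hypothetical resolver base URL; an Umlaut install would go here.
RESOLVER = "https://resolver.example.edu/umlaut"

def openurl(citation):
    """Build an OpenURL 1.0 (KEV format) link for an article citation."""
    params = {
        "url_ver": "Z39.88-2004",
        "ctx_ver": "Z39.88-2004",
        "rft_val_fmt": "info:ofi/fmt:kev:mtx:journal",
        "rft.genre": "article",
    }
    # Copy over whichever standard journal-article fields were scraped.
    for key in ("atitle", "jtitle", "date", "volume", "issue", "spage", "issn"):
        if key in citation:
            params["rft." + key] = citation[key]
    return RESOLVER + "?" + urlencode(params)

link = openurl({"atitle": "Example Article", "jtitle": "Journal of Examples",
                "date": "2015", "volume": "12", "spage": "34"})
```

The point is that no search index is needed at all: if the scrape yields good metadata, the OpenURL can be constructed directly.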
And, hey, it has occurred to me, there’s some software that already can scrape citation data elements from quite a long list of web sites our patrons might want to start from. Zotero. (And Mendeley too for that matter).
In fact, you could use Zotero as a method of ‘Bridging the Gap’ right now. Sign up for a Zotero account, install the Zotero extension. When you are on a paywalled citation page on the unauthenticated open web (or a search results page on Google Scholar, Amazon, or other places Zotero can scrape from), first import your citation into Zotero. Then go into your Zotero library, find the citation, and — if you’ve properly set up your OpenURL preferences in Zotero — it’ll give you a link to click on that will take you to your institutional OpenURL resolver. In our case, our Umlaut landing page.
We know from some faculty interviews that some faculty definitely use Zotero, hard to say if a majority do or not. I do not know how many have managed to set up their OpenURL preferences in Zotero, if this is part of their use of it.
Even among those who have, I wonder how many have figured out on their own that they can use Zotero to “bridge the gap” in this way. But even if we undertook an education campaign, it is a somewhat cumbersome process. You might not want to actually import into your Zotero library, you might want to take a look at the article first. And not everyone chooses to use Zotero, and we don’t want to require them to for a ‘bridging the gap’ solution.
But that logic is there in Zotero, the pretty tricky task of compiling and maintaining ‘scraping’ rules for a huge list of sites likely to be desirable as ‘Bridging the Gap’ sources. And Zotero is open source, hmm.
We could imagine adding a feature to Zotero that lets the user choose to go right to an institutional OpenURL link after scraping, instead of having to import and navigate to their Zotero library first. But I’m not sure such a feature would match the goals of the Zotero project, or how to integrate it into the UX in a clear way without detracting from Zotero’s core functionality.
But again, it’s open source. We could imagine ‘forking’ Zotero, or extracting just the parts of Zotero that matter for our goal, into our own product that did exactly what we wanted. I’m not sure I have the local resources to maintain a ‘forked’ version of plugins for several browsers.
But Zotero also offers a bookmarklet. It doesn’t have as good a UI as the browser plugins, and it doesn’t support all of the scrapers. But unlike a browser plugin, you can install it on iOS and Android mobile browsers (it’s a bit confusing to do so, but at least it’s possible). And it’s probably ‘less expensive’ for a developer to maintain a ‘fork’ of — we really just want to take Zotero’s scraping behavior, implemented via bookmarklet, and completely replace what you do with it after it’s scraped: send it to our institutional OpenURL resolver.
I am very intrigued by this possibility; it seems at least worth some investigatory prototypes to have patrons test. But I haven’t yet figured out where to actually find the bookmarklet code, and related code in Zotero that may be triggered by it, let alone the next step of figuring out if it can be extracted into a ‘fork’. I’ve tried looking around on the Zotero repo, but I can’t figure out what’s what. (I think all of Zotero is open source?)
Anyone know the Zotero devs, and want to see if they want to talk to me about it with any advice or suggestions? Or anyone familiar with the Zotero source code themselves and want to talk to me about it?
Filed under: General
Last updated May 14, 2015. Created by Peter Murray on May 14, 2015.
Binder is an open source digital repository management application, designed to meet the needs and complex digital preservation requirements of museum collections. Binder was created by Artefactual Systems and the Museum of Modern Art.

Binder aims to facilitate digital collections care, management, and preservation for time-based media and born-digital artworks and is built from integrating functionality of the
A presentation on Binder's functionality (Binder was formerly known as the DRMC during development) can be found here:
Slides from a presentation at Code4LibBC 2014, including screenshots from the application, can be found here:
Further resources

- Archival Record Manager and Editor
- License: GPLv3
- Package Links: In Development
- Operating System: Browser/Cross-Platform
- Technologies Used: XSLT
- Programming Language: PHP
- Database: MySQL
- Works well with: Archivematica
Tonight, the House of Representatives will vote on the USA FREEDOM Act of 2015, H.R. 2048 to finally ban the “bulk collection” of Americans’ personal communications records (library, telephone and otherwise) under Section 215. Critically, it also would preclude the use of other surveillance laws (related to “PEN registers”) and NSLs to get around that prohibition and would bring the “gag order” provisions of the USA PATRIOT Act into compliance with the First Amendment by permitting them to be meaningfully challenged in court.
The bill, not incidentally, also permits phone and internet companies to publish information (in a sufficiently specific form to be useful) about the number of requests they receive from the government to produce personal subscriber information. It also, for the first time, would create opportunities for specially cleared civil liberties advocates to appear before the secret Foreign Intelligence Surveillance Act (FISA) court that authorizes surveillance activities. The bill also makes important “first step” reforms to privacy-hostile provisions, including Section 702, of the FISA Amendments Act.
ALA and its many public and private sector coalition partners strongly support passage of H.R. 2048. That message was underscored by the more than 400 librarian lobbyists who took to Capitol Hill on May 5, during the American Library Association’s (ALA) National Library Legislative Day. They carried with them a stirring and emphatic OpEd urging real reform entitled “Long Lines for Freedom” by ALA President Courtney Young, which was published that morning in The Hill, a Congress-centric newspaper widely read by Members of Congress, their staffs and the national press.
While House passage of the USA FREEDOM Act is widely expected, its fate in the Senate is uncertain at best. Stay tuned for more on how you can help!
The post U.S. House poised to pass real reforms to USA PATRIOT Act appeared first on District Dispatch.
New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week
Visit the LITA Job Site for more available jobs and for information on submitting a job posting.
Today I got to attend a talk by the Pasco County Library System at the Florida Library Association conference on how they are building robots in the library. They work with a non-profit called First that helps get kids excited about STEM. Pasco is the only public library in the US doing this and has named their team Edgar Allan Ohms.
It’s important to not be scared of this. You don’t have to be an engineer to participate in this program, it’s about more than robot building. The students build these robots, compete with them and then can apply for scholarships through First. The students run the entire program. They build the website, design the logos and signs, build the robots, etc.
How did Pasco do this? They converted a space in their library to a makerspace with outlets, tools and even non-robotic tools like sewing machines and autoCAD tools. Of course they are trying to do this as cheaply and quickly as possible – this too teaches the kids how to use ‘found’ items to make these things happen. Another skill they’re teaching the kids is how to sell themselves, how to fund-raise, and how to talk to people to get funding and promotion. It’s so much more than kids just sitting around playing games all day – they are learning real life skills.
How do librarians (with no engineering background) do this? You go out in to your community and find people who want to help out! They are using family members, community members, library fans, and local businesses to help provide tools, supplies and services. People know about First and so everyone wants to help. In some cases people will come to you and offer to help if they hear about what you’re doing. If you can’t find anyone yourself First will help you.
They start each August, and this year there are so many interested that they will be interviewing kids to find those who will commit. They attend workshops weekly and bi-weekly August through December to talk about the rules and plans. In January they go in to competition mode – this is when they start to build the robot. This video shows the rules that the team had to use last year in order to build their robot – this was shown to all teams at the exact same time and from then they spend the next 6 weeks building.
They need to start with some planning based on the rules in the video. The kids will start designing on CAD, testing it in modeling software online, and go from there to building something that will run.
Everything these groups are doing is open and shared. This means that the kids are learning job skills not just in engineering but also in marketing, writing and other areas. The groups that will be competing go out on scouting missions where they see what other groups have done and learn from them.
So, if you want to do this in your library how do you get funding and approval from your lawyers? First off, explain that you will get some funding from the program itself. Next, show that this program is going to help the community members by offering scholarships to the kids, teaching them real skills and bringing the kids out into the community. Think about it this way – how much does a high school pay for a football team? For a fraction of that you can bring together 25 kids and teach them a skill for life, whereas most of those kids who play football in high school don’t end up in the NFL. For the lawyers the library basically said that this is a valuable program and went to bat to get it to go through. In the end the lawyers wrote up a disclaimer that all the kids have to sign in order to participate.
This is the kind of program that more libraries should be offering to encourage kids to learn about STEM and bring library awareness to the entire community – our libraries are about so much more than books and DVDs and this is a great way to show that.
- Keynote: Licensing Models and Building an Open Source Community
- How To Get More Kids To Code
- SxSW: Building the Open Source Society
That was the topic discussed several times recently by OCLC Research Library Partners metadata managers, initiated by Philip Schreur of Stanford, who is also involved in the Linked Data for Libraries (LD4L) project. Linked data may well be the next common infrastructure both for communicating library data and embedding it into the fabric of the semantic web. There have been a number of different models developed: Digital Public Library of America’s Metadata Application Profile, schema.org, BIBFRAME, etc. Much of a research library’s routine production is tied directly to its local system and makes use of MARC for internal and external data communication. Linked data offers an opportunity to go beyond the library domain and authority files to draw on information about entities from diverse sources.
Publishing metadata for digital collections as linked data directly, bypassing MARC record conversion, may offer more flexibility and accuracy. (An example of losing information when converting from one format to another is documented in Jean Godby’s 2012 report, A Crosswalk from ONIX 3.0 for Books to MARC 21.) Stanford is pulling together information about faculty members and publications in a way that they could never do without utilizing linked data.
Some of the issues raised in the focus group discussions included:
Critical components in linked data that could be started now: Including persistent identifiers in the MARC bibliographic and authority records created now will help in transitioning to a future linked data environment. The entities are more clearly identified in authority records than in bibliographic records where it’s not always clear which elements represent a work versus an expression of a work. OCLC is already adding FAST identifiers in the $0 subfield (the authority control number or standard number) in the subject fields of WorldCat records. The British Library expects to launch a pilot this summer to match the LC/NACO authority file against the ISNI database and add ISNI identifiers to the authority record’s 024 field. Adding $4 role codes in personal name added entries will help establish relationships among name entities in the future. Creating identifiers for entities that do not yet have them will build a larger pool of data to help disambiguate them later. The community could also consider a wider range of authorities beyond the LC/NACO authority file for re-using existing identifiers (e.g., VIAF, ISNI and identifiers in other national authority files) and “get us into the habit”.
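As a toy illustration of the "$0" enrichment practice described above (fields are modeled here as plain tuples rather than with a real MARC library, and the FAST URI is a made-up example):

```python
# Model a MARC field as (tag, indicators, [(subfield code, value), ...]);
# a production implementation would use a MARC library instead.
def add_identifier(field, uri):
    """Append a $0 (authority record control number or standard number)
    subfield, if the field does not already carry one."""
    tag, indicators, subfields = field
    if not any(code == "0" for code, _ in subfields):
        subfields = subfields + [("0", uri)]
    return (tag, indicators, subfields)

# A 650 subject field coded for FAST; the identifier URI below is hypothetical.
subject = ("650", " 7", [("a", "Linked data"), ("2", "fast")])
enriched = add_identifier(subject, "http://id.worldcat.org/fast/1234567")
```

The same pattern applies to adding 024 ISNI identifiers to authority records or $4 role codes to name added entries: each is a small, mechanical enrichment of today's records that pays off when the data is later converted to linked data statements.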
Provenance: How to resolve or reconcile conflicts between statements? We will likely see different types of inconsistencies than we see now with, for example, different birthdates. OCLC has been looking at the work of Google researchers on a “knowledge graph” (the basis of knowledge cards). As Google harvests the Web, it comes across incorrect or conflicting statements. Researchers have documented using algorithms based on frequency and the source of links to come up with a “confidence measure”. (Knowledge Vault: A Web-Scale Approach to Probabilistic Knowledge Fusion.) Aggregations such as WorldCat, VIAF and Wikidata may allow the library community to view statements from these sources with more confidence than others.
Importance of holdings data in a linked data environment: Metadata managers see the need to communicate both the availability and eligibility of the resource being described. A W3C document, Holdings via Offer, recommends mappings from bibliographic holdings data to schema.org.
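A hedged sketch of what that can look like: one holding expressed as a schema.org Offer in JSON-LD. The property choices below are illustrative assumptions, not a faithful rendering of the Holdings via Offer mapping.

```python
import json

# One library holding, described with schema.org vocabulary: the Offer says
# who holds the item (offeredBy) and whether it is available to borrow.
holding = {
    "@context": "http://schema.org",
    "@type": "Book",
    "name": "Example Title",  # hypothetical resource
    "offers": {
        "@type": "Offer",
        "offeredBy": {"@type": "Library", "name": "Example University Library"},
        "availability": "http://schema.org/InStock",
    },
}
doc = json.dumps(holding, indent=2)
```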
Impact on workflow: In the next phase of the Linked Data for Libraries project, six libraries (Columbia, Cornell, Harvard, Stanford, Princeton and the Library of Congress) hope to figure out how to use linked data in production using BIBFRAME. They will be looking at how to link into acquisitions and circulation as well as cataloging workflows, and hope to collaborate with cataloging and local system vendors. Metadata managers noted it’s important to collaborate with the book vendors that supply them with MARC records now – even if they cannot generate linked data themselves, perhaps they could enhance MARC records so that transforming them into BIBFRAME is cleaner. Linked data may also encourage more sharing of metadata via statements rather than copy-cataloging a record that is then maintained as a local copy that is not shared with others.
- During this transition period the environment and standards are a moving target.
- It’s unclear how libraries will share “statements” rather than records in a linked data environment.
- How to involve the many vendors which supply or process MARC records now? Working with others in the linked data environment involves people unfamiliar with the library environment, requiring metadata specialists to explain what their needs are in terms non-librarians can understand.
- Differing interpretations of what is a “work” may hamper the ability to re-use data created elsewhere.
Success metrics: Moving into a production linked data environment will take time, and each institution may well have a different timetable. Discussions indicated that linked data experiments could be considered successful if:
- The data is more integrated than it is now.
- Data created by different workflows are interoperable.
- Libraries can offer users new, valued services that current data models can’t support.
- The resource descriptions are more machine-actionable than current standards.
- Outside parties use library resource descriptions more.
- The data is better and richer because more parties share in its creation.
Graphic: Partial view of Linking Open Data cloud diagram 2014, by Max Schmachtenberg, Christian Bizer, Anja Jentzsch and Richard Cyganiak. http://lod-cloud.net/

About Karen Smith-Yoshimura
Karen Smith-Yoshimura, program officer, works on topics related to renovating descriptive and organizing practices with a focus on large research libraries and area studies requirements.
We are thrilled to announce that the Open Data Handbook, the premier guide for open data newcomers and veterans alike, has received a much-needed update! The Open Data Handbook, originally published in 2012, has become the go-to resource for the open data community. It was written by expert members of the open data community and has been translated into over 18 languages. Read it now »
The Open Data Handbook elaborates on the what, why & how of open data. In other words – what data should be open, what are the social and economic benefits of opening that data, and how to make effective use of it once it is opened.
The handbook is targeted at a broad audience, including civil servants, journalists, activists, developers, and researchers as well as open data publishers. Our aim is to ensure open data is widely available and applied in as many contexts as possible; we welcome your efforts to grow the open knowledge movement in this way!
The idea of open data is really catching on and we have learned many important lessons over the past three years. We believe it is time for the Open Data Handbook to reflect these lessons. The revised Open Data Handbook has a number of new features and plenty of ways to contribute your experience and knowledge, so please do!

Inspire Open Data Newcomers
The original open data guide discussed the theoretical reasons for opening up data – increasing transparency and accountability of government, improving public and commercial services, stimulating innovation etc. We have now reached a point where we are able to go beyond theoretical arguments — we have real stories that document the benefits open data has on our lives. The Open Data Value Stories are use cases from across the open knowledge network that highlight the social and economic value and the varied applications of open data in the world.
This is by no means an exhaustive list; in fact, it is just the beginning! If you have an open data value story that you would like to contribute, please get in touch.

Learn How to Publish & Use Open Data
The Open Data Guide remains the premier open data how-to resource and in the coming months we will be adding new sections and features! For the time being, we have moved the guide to Github to streamline contributions and facilitate translation. We will be reaching out to the community shortly to determine what new content we should be prioritising.
In 2012, when we originally published the open data guide, the open data community was still emerging and resources remained scarce. Today the global open data community is mature, international and diverse, and resources now exist that reflect this maturity and diversity. The Open Data Resource Library is a curated collection of resources, including articles, longer publications, how-to guides, presentations and videos, produced by the global open data community — now available all in one place! If you want to contribute a resource, you can do so here! We are particularly interested in expanding the number of resources we have in languages other than English, so please add them if you have them!
Finally, as we are probably all aware, the open data community likes its jargon! While the original open data guide had a glossary of terms, it was far from exhaustive — especially for newcomers to the open data movement. In the updated version we have added over 80 new terms and concepts with easy to understand definitions! Have we missed something out? Let us know what we are missing here.
The updated Open Data Handbook is a living resource! In the coming months, we will be adding new sections to the Open Data Guide and producing countless more value stories! We invite you to contribute your stories, your resources and your ideas! Thank you for your contributions past, present and future and your continued efforts in pushing this movement forward.
The post Query Autofiltering Revisited – Lets be more precise!!! appeared first on Lucidworks.
Okay, so we found it sort of tricky to explain, but the Kano Model really is awesome. In this episode, we try our best to tell you that the Kano Model is a sophisticated tool used to measure the impact of service features on the user experience. It is a way that you and your stakeholders can visualize the weight of a new feature, whether it will produce delight but require a huge investment, or that carousel will make you rue the day.
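For the curious, the classification step of the Kano Model can be sketched as a lookup in the standard Kano evaluation table: each user rates a feature twice, once assuming the feature is present ("functional") and once assuming it is absent ("dysfunctional"). A simplified Python sketch, with the Reverse answers folded into "Indifferent" for brevity:

```python
# Standard Kano evaluation table, keyed by (functional, dysfunctional)
# answers; each answer is one of: like, expect, neutral, tolerate, dislike.
KANO_TABLE = {
    ("like", "dislike"): "Performance",     # more is better
    ("like", "expect"): "Attractive",       # delighters
    ("like", "neutral"): "Attractive",
    ("like", "tolerate"): "Attractive",
    ("expect", "dislike"): "Must-be",       # basic expectations
    ("neutral", "dislike"): "Must-be",
    ("tolerate", "dislike"): "Must-be",
    ("like", "like"): "Questionable",       # contradictory answers
    ("dislike", "dislike"): "Questionable",
}

def classify(functional, dysfunctional):
    """Classify one survey response; everything else is treated as Indifferent."""
    return KANO_TABLE.get((functional, dysfunctional), "Indifferent")

category = classify("like", "dislike")
```

Aggregating these categories across respondents is what lets you and your stakeholders see whether a proposed feature is a must-have, a delighter, or not worth the investment.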
Winchester, MA: Save the dates now! The Fedora Project is pleased to announce that the first Fedora Camp will be offered November 16-18 (Monday-Wednesday) at Duke University (specifically The Edge: The Ruppert Commons for Research, Technology, and Collaboration).
I come to bury Caesar, not to praise him. – Antony, in The Tragedy of Julius Caesar, William Shakespeare
My esteemed colleague Thom Hickey, who knows the MARC format more intimately than I ever will, has penned a defense of that venerable metadata format. He was kind enough to cite a column I wrote in 2002 for Library Journal. But even back then, my opinion had changed enough that I wrote a much longer and more thorough piece laying out the bibliographic future I wished to see. The journal in which it was published thought highly enough of it to award it the paper of the year award. I think my bribe helped.
Thom’s post lays out a pretty compelling use case for MARC, and that’s awesome. Frankly, if MARC wasn’t as good as it is, it would not have lasted as long as it has. And let’s be clear, it’s far from dead.
But that is a fairly specific use case, and such specific use cases may still apply long after MARC is replaced with BIBFRAME (which is the intent of the Library of Congress). Or, perhaps, something else yet to be determined.
But I’m more concerned about the broader ecology of library bibliographic data, and how we fit within the even larger ecology of non-library bibliographic data. And there MARC is showing its age. I still think we will likely need to have a fairly complex metadata element set for library work, and a much simplified version for syndicating out in the world. And I think that a very good choice for that much simpler format for syndicating is Schema.org. At least that’s what we’re presently going with.
Meanwhile, we at OCLC will be consuming and offering MARC as well as other formats for some undetermined length of time to come. I come to neither praise nor bury MARC. I come to help create a bibliographic infrastructure that will take us into the future by accommodating many strategies, tools, and formats.

About Roy Tennant
Roy Tennant works on projects related to improving the technological infrastructure of libraries, museums, and archives.
Last updated May 12, 2015. Created by Peter Murray on May 12, 2015.
From the announcement:
Join us in beautiful Knoxville, Tennessee for an all-day workshop on Fedora, the open source digital content repository system.

When:
The workshop will occur from 9 AM to 5 PM on Friday, June 26, with a break for lunch.
The National Digital Stewardship Alliance Innovation Working Group is proud to open the nominations for the 2015 NDSA Innovation Awards. As a diverse membership group with a shared commitment to digital preservation, the NDSA understands the importance of innovation and risk-taking in developing and supporting a broad range of successful digital preservation activities. These awards are an example of the NDSA’s commitment to encourage and recognize innovation in the digital stewardship community.
This slate of annual awards highlights and commends creative individuals, projects, organizations and future stewards demonstrating originality and excellence in their contributions to the field of digital preservation. The program is administered by a committee drawn from members of the NDSA Innovation Working Group.
Last year’s winners are exemplars of the diversity and collaboration essential to supporting the digital stewardship community as it works to preserve and make available digital materials.
The NDSA Innovation Awards focus on recognizing excellence in one or more of the following areas:
- Individuals making a significant, innovative contribution to the field of digital preservation;
- Projects whose goals or outcomes represent an inventive, meaningful addition to the understanding or processes required for successful, sustainable digital preservation stewardship;
- Organizations taking an innovative approach to providing support and guidance to the digital preservation community;
- Future stewards, especially students, but including educators, trainers or curricular endeavors, taking a creative approach to advancing knowledge of digital preservation theory and practices.
Acknowledging that innovative digital stewardship can take many forms, eligibility for these awards has been left purposely broad. Nominations are open to anyone or anything that falls into the above categories and any entity can be nominated for one of the four awards. Nominees should be US-based people and projects or collaborative international projects that contain a US-based partner. This is your chance to help us highlight and reward novel, risk-taking and inventive approaches to the challenges of digital preservation.
Nominations are now being accepted and you can submit a nomination using this quick, easy online submission form. You can also submit a nomination by emailing a brief description, justification and the URL and/or contact information of your nominee to ndsa (at) loc.gov.
Nominations will be accepted until Tuesday, June 30 and winners announced in mid-July. Help us recognize and reward innovation in digital stewardship and submit a nomination!