Anyone who’s followed legislative efforts over the past ten-plus years to restore a fraction of the civil liberties Americans lost to the USA PATRIOT Act and other surveillance laws will understand the photo accompanying this post. With the revelations of the last several years in particular, first by the New York Times and then by Edward Snowden, many believed that real reform might be achieved in the last Congress by passing the USA FREEDOM Act of 2014. They were wrong.
In May 2014, the House passed a version of the USA FREEDOM Act (H.R. 3361) that was dramatically weakened from a civil liberties point of view in the House Judiciary Committee and then stripped of virtually all meaningful privacy-restoring reforms by the full House of Representatives. While strenuous efforts were made to bring a robust version of the bill (S. 2685) to the floor of the Senate, Republican members filibustered that bill and the 113th Congress ended without further action on any form of the USA FREEDOM Act of 2014.
Undeterred, the bill’s bipartisan sponsors in both chambers recently reintroduced the USA FREEDOM Act of 2015 (H.R. 2048 and S. 1123), a tenuously calibrated agreement that garnered the support of many civil liberties organizations, including the American Library Association (ALA), as well as congressional “surveillance hawks,” the nation’s intelligence agencies, and the Administration. On May 14, just one week after a federal appeals court ruled the NSA’s use of Section 215 to collect Americans’ telephone call records in bulk illegal, H.R. 2048 passed the House with a strongly bipartisan vote (338 yeas – 88 nays). At this writing, with effectively just one week remaining for Congress to consider expiring PATRIOT Act provisions before recessing for the Memorial Day holiday and the June 1 “sunset” of those provisions, the bill’s fate rests with the Senate and is highly uncertain.
Not all civil liberties advocates, however, are pushing for passage of this year’s version of the USA FREEDOM Act. The ACLU, for example, is calling on Congress to simply permit Section 215 and other expiring provisions of the PATRIOT Act to “sunset” as scheduled on June 1. The Electronic Frontier Foundation (EFF) also is urging Members of Congress to strengthen H.R. 2048 (rather than pass it in its current form) because, in EFF’s view, the reforms it makes will not sweep as broadly as the appeals court’s recent ruling could if upheld and broadened in its precedential effect by adoption in other courts (including eventually perhaps the U.S. Supreme Court). Neither group, however, is urging Members of Congress to vote against H.R. 2048.
These views by respected long-time ALA allies have, not unreasonably, caused some to ask (and no doubt many more to wonder) why ALA is actively urging its members and the public to work for passage of H.R. 2048. The answer is distillable to four words: policy, politics, permanence, and perseverance.
Since January of 2003, the Council of the American Library Association (the Association’s policy-setting body) has adopted at least eight Resolutions addressing the USA PATRIOT Act and the access to library patron reading, researching and internet usage records that it affords the government under Section 215 and through the use of National Security Letters (NSLs) and their associated “gag orders.” While somewhat different in individual focus based upon the legislative environments in which they were written, all make ALA’s position on Section 215 of the PATRIOT Act and related authorities consistently clear. Stated most recently in January of 2014, that position is that ALA “calls upon Congress to pass legislation supporting the reforms embodied in [the USA FREEDOM Act of 2014] (see ALA CD#20-1(A)).”
As detailed in this Open Technology Institute (OTI) section-by-section, side-by-side comparison of the current USA FREEDOM Act (H.R. 2048) with two versions introduced in the last Congress, the current bill is a long way from perfect (just as the “old” ones were). It does, however, achieve the principal objectives of last year’s legislation endorsed by ALA’s Council. Specifically, H.R. 2048:
- categorically ends the bulk collection not only of telephone call records but also of any “tangible things” (in the language of Section 215), library records included. Henceforth, any request for records must relate to a specific pending investigation and be based upon a narrowly defined “specific selection term” as defined in the law. Accordingly, no longer will the NSA or FBI be able to assert that the search histories of all public access computers are “tangible things” whose production they can lawfully and indefinitely compel as part of an essentially boundless fishing expedition. Nor will agencies be able to continue “bulk collection” under other legal authorities, including National Security Letters, or “PEN register” and “trap and trace” statutes;
- significantly strengthens judicial review of the non-disclosure (“gag”) orders that generally accompany NSLs by eliminating the current requirement in law that a court effectively accept without challenge mere certification by a high-level government official that disclosure of the order would endanger national security. H.R. 2048 also requires the government to initiate judicial review of nondisclosure orders and to bear the burden of proof in those proceedings that they are statutorily justified;
- permits more robust public reporting by companies and others who have received Section 215 orders or NSLs from the government of the number of such requests they’ve processed; and
- requires the secret “FISA Court” that issues surveillance authorities to designate a panel of fully “cleared” expert civil liberties counsel whom the court may appoint to advise it in cases involving significant or precedential legal issues, and to declassify its opinions or summarize them for public access when declassification is not possible. The bill also expands the opportunity for review of FISA Court opinions by federal appellate courts.
As OTI’s “side-by-side” also indicates, H.R. 2048 falls short of last year’s USA FREEDOM Act iteration in several important respects. Most significantly, records collected by the government on persons who ultimately are not relevant to an investigation may still be retained, and reforms effected in last year’s bill to Section 702 of the Foreign Intelligence Surveillance Act Amendments Act are decidedly weaker. The bill also extends expiring portions of the PATRIOT Act, as modified, for five years.
Determining whether ALA should support a particular piece of almost inevitably imperfect legislation turns not only on the content of the legislation (though that naturally receives disproportionate weight in an assessment), but also on the probability of achieving a better result and when such a result might conceivably be obtained. With the change in control of the Senate in 2014 and very high probability that control of the House will not shift for many elections to come, many groups including ALA believe that H.R. 2048 represents the “high water mark” in reform of Section 215 and related legal authorities achievable in the foreseeable future.
The recent landmark ruling by the U.S. Court of Appeals for the Second Circuit noted above was sweeping and clear in some respects, but limited and uncertainty-producing in others. Specifically, the Court firmly ruled that the bulk collection of telephone records under Section 215 is illegal. That ruling, however, addressed only the NSA’s bulk collection of “telephony metadata.” It did not directly speak to the bulk collection of any other information, including library records of any kind.
Further, while binding in the states that make up the Second Judicial Circuit (Connecticut, New York, and Vermont), the court’s decision has no precedential effect in any other part of the country. It is also unclear whether the Second Circuit’s decision will be appealed by the government and, if so, what the outcome will be.
Finally, similar decisions are pending in two other federal Courts of Appeal. Should one or both rulings differ materially from the Second Circuit’s, further uncertainty as to what the law is and should be nationally will result. Resolution of such a “split in the Circuits” can only be accomplished through a multi-year appeal process to the U.S. Supreme Court, which is not required to hear the case.
Enactment of the current version of the USA FREEDOM Act would “lock in” the reforms noted above immediately, permanently and nationwide. Accordingly, on balance, ALA and its many coalition allies are supporting the bill and affirmatively urging Members of Congress to do the same.
Finally, and crucially, ALA and its allies have long been and remain fully committed to working for the most profound reform of all of the nation’s privacy and surveillance laws possible. ALA thus regards the USA FREEDOM Act of 2015 as a critical step — the first possible in 14 years — to make real progress toward that much broader permanent goal, but as only a step.
Work in this Congress (and beyond) will continue aggressively to pass comprehensive reform of the badly outdated Electronic Communications Privacy Act and to restore Americans’ civil liberties still compromised by, for example, other portions of the USA PATRIOT Act, Section 702 of the Foreign Intelligence Surveillance Act, Executive Order 12333 and many other privacy-hostile legal authorities.
With our allies at our side, and librarians and their millions of patrons behind us, the fight goes on.
The post Supporting the USA FREEDOM Act of 2015: ALA’s Perspective appeared first on District Dispatch.
Today I found the following resources and bookmarked them on Delicious.
- Dashboards by Keen IO Charts on Grids – Responsive Dashboard Templates with Bootstrap
- ATO2014: Using Bootstrap to create a common UI across products
- Speeding up WordPress Dashboard
- Google Docs Templates

Digest powered by RSS Digest
Library of Congress: The Signal: Digital Forensics and Digital Preservation: An Interview with Kam Woods of BitCurator.
We’ve written about the BitCurator project a number of times, but the project has recently entered a new phase and it’s a great time to check in again. The BitCurator Access project began in October 2014 with funding through the Mellon Foundation. BitCurator Access is building on the original BitCurator project to develop open-source software that makes it easier to access disk images created as part of a forensic preservation process.
Kam Woods has been a part of BitCurator from the beginning as its Technical Lead, and he’s currently a Research Scientist in the School of Information and Library Science at the University of North Carolina at Chapel Hill. As part of our Insights Interview series we talked with Woods about the latest efforts to apply digital forensics to digital preservation.
Butch: How did you end up working on the BitCurator project?
Kam: In late 2010, I took a postdoc position in the School of Information and Library Science at UNC, sponsored by Cal Lee and funded by a subcontract from an NSF grant awarded to Simson Garfinkel (then at the Naval Postgraduate School). Over the following months I worked extensively with many of the open source digital forensics tools written by Simson and others, and it was immediately clear that there were natural applications to the issues faced by collecting organizations preserving born-digital materials. The postdoc position was only funded for one year, so – in early 2011 – Cal and I (along with eventual Co-PI Matthew Kirschenbaum) began putting together a grant proposal to the Andrew W. Mellon Foundation describing the work that would become the first BitCurator project.
Butch: If people have any understanding at all of digital forensics it’s probably from television or movies, but I suspect the actions you see there are pretty unrealistic. How would you describe digital forensics for the layperson? (And as an aside, what do people on television get “most right” about digital forensics?)
Kam: Digital forensics commonly refers to the process of recovering, analyzing, and reporting on data found on digital devices. The term is rooted in law enforcement and corporate security practices: tools and practices designed to identify items of interest (e.g. deleted files, web search histories, or emails) in a collection of data in order to support a specific position in a civil or criminal court case, to pinpoint a security breach, or to identify other kinds of suspected misconduct.
The goals differ when applying these tools and techniques within archives and data preservation institutions, but there are a lot of parallels in the process: providing an accurate record of chain of custody, documenting provenance, and storing the data in a manner that resists tampering, destruction, or loss. I would direct the interested reader to the excellent and freely available 2010 Council on Library and Information Resources report Digital Forensics and Born-Digital Content in Cultural Heritage Institutions (pdf) for additional detail.
You’ll occasionally see some semblance of a real-world tool or method in TV shows, but the presentation is often pretty bizarre. As far as day-to-day practices go, discussions I’ve had with law enforcement professionals often include phrases like “huge backlogs” and “overextended resources.” Sound familiar to any librarians and archivists?
Butch: Digital forensics has become a hot topic in the digital preservation community, but I suspect that it’s still outside the daily activity of most librarians and archivists. What should librarians and archivists know about digital forensics and how it can support digital preservation?
Kam: One of the things Cal Lee and I emphasize in workshops is the importance of avoiding unintentional or irreversible changes to source media. If someone brings you a device such as a hard disk or USB drive, a hardware write-blocker will ensure that if you plug that device into a modern machine, nothing can be written to it, either by you or some automatic process running on your operating system. Using a write-blocker is a baseline risk-reducing practice for anyone examining data that arrives on writeable media.
Creating a disk image – a sector-by-sector copy of a disk – can support high-quality preservation outcomes in several ways. A disk image retains the entirety of any file system contained within the media, including directory structures and timestamps associated with things like when particular files were created and modified. Retaining a disk image ensures that as your external tools (for example, those used to export files and file system metadata) improve over time, you can revisit a “gold standard” version of the source material to ensure you’re not losing something of value that might be of interest to future historians or researchers.
Disk imaging also mitigates the risk of hardware failure during an assessment. There’s no simple, universal way to know how many additional access events an older disk may withstand until you try to access it. If a hard disk begins to fail while you’re reading it, chances of preserving the data are often higher if you’re in the process of making a sector-by-sector copy in a forensic format with a forensic imaging utility. Forensic disk image formats embed capture metadata and redundancy checks to ensure a robust technical record of how and when that image was captured, and improve survivability over raw images if there is ever damage to your storage system. This can be especially useful if you’re placing a material in long-term offline storage.
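To make the “capture metadata and redundancy checks” idea concrete, here is a minimal Python sketch (illustrative only, not BitCurator code or any forensic tool’s actual implementation) of the kind of fixity digest that gets recorded alongside an acquired image:

```python
import hashlib

def image_digest(path, algorithm="sha256", chunk_size=1024 * 1024):
    """Compute a fixity digest for a disk image, reading in chunks
    so arbitrarily large images are hashed in constant memory."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Record the digest at capture time; recomputing it later confirms the
# stored image is bit-for-bit identical to what was originally acquired.
```

Forensic formats such as EWF go further, embedding per-chunk checksums inside the image file itself, so damage to a stored image can be localized rather than merely detected.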
There are many situations where it’s not practical, necessary, or appropriate to create a disk image, particularly if you receive a disk that is simply being used as an intermediary for data transfer, or if you’re working with files stored on a remote server or shared drive. Most digital forensics tools that actually analyze the data you’re acquiring (for example, Simson Garfinkel’s bulk extractor, which searches for potentially private and sensitive information and other items of interest) will just as easily process a directory of files as they would a disk image. Being aware of these options can help guide informed processing decisions.
Finally, collecting institutions spend a great deal of time and money assessing, hiring and training professionals to make complex decisions about what to preserve, how to preserve it and how to effectively provide and moderate access in ways that serve the public good. Digital forensics software can reduce the amount of manual triage required when assessing new or unprocessed materials, prioritizing items that are likely to be preservation targets or require additional attention.
Butch: How does BitCurator Access extend the work of the original phases of the BitCurator project?
Kam: One of the development goals for BitCurator Access is to provide archives and libraries with better mechanisms to interact with the contents of complex digital objects such as disk images. We’re developing software that runs as a web service and allows any user with a web browser to easily navigate collections of disk images in many different formats. This includes facilities to examine the contents of the file systems contained within those images, to interact with visualizations of file system metadata and organization (including timelines indicating changes to files and folders), and to download items of interest. There’s an early version and installation guide in the “Tools” section of http://access.bitcurator.net/.
We’re also working on software to automate the process of redacting potentially private and sensitive information – things like Social Security Numbers, dates of birth, bank account numbers and geolocation data – from these materials based on reports produced by digital forensics tools. Automatic redaction is a complex problem that often requires knowledge of specific file format structures to do correctly. We’re using some existing software libraries to automatically redact where we can, flag items that may require human attention and prepare clear reports describing those actions.
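As a rough illustration of the flag-for-review approach described above (my own sketch, not the project’s code; real scanners such as bulk_extractor are far more sophisticated and format-aware, with far fewer false positives):

```python
import re

# Illustrative patterns only: naive regexes for two kinds of
# potentially sensitive strings found in acquired materials.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}

def flag_sensitive(text):
    """Return (label, match, offset) tuples for human review rather
    than redacting automatically; context decides what is truly private."""
    hits = []
    for label, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((label, match.group(), match.start()))
    return hits
```

The important design point is in the last comment: where automatic redaction is uncertain, the tool’s job is to surface candidates and a clear report, leaving the judgment call to an archivist.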
Finally, we’re exploring ways in which we can incorporate emulation tools such as those developed at the University of Freiburg using the Emulation-as-a-Service model.
Butch: I’ve heard archivists and curators express ethical concerns about using digital forensics tools to uncover material that an author may not have wished be made available (such as earlier drafts of written works). Do you have any thoughts on the ethical considerations of using digital forensics tools for digital preservation and/or archival purposes?
Kam: There’s a great DPC Technology Watch report from 2012, Digital Forensics and Preservation (pdf), in which Jeremy Leighton John frames the issue directly: “Curators have always been in a privileged position due to the necessity for institutions to appraise material that is potentially being accepted for long-term preservation and access; and this continues with the essential and judicious use of forensic technologies.”
What constitutes “essential and judicious” is an area of active discussion. It has been noted elsewhere (see the CLIR report I mentioned earlier) that the increased use of tools with these capabilities may necessitate revisiting and refining the language in donor agreements and ethics guidelines.
As a practical aside, the Society of American Archivists Guide to Deeds of Gift includes language alerting donors to concerns regarding deleted content and sensitive information on digital media. Using the Wayback Machine, you can see that this language was added mid-2013, so that provides some context for the impact these discussions are having.
Butch: An area that the National Digital Stewardship Alliance has identified as important for digital preservation is the establishment of testbeds for digital preservation tools and processes. Do you have some insight into how http://digitalcorpora.org/ got established, and how valuable it is for the digital forensics and preservation communities?
Kam: DigitalCorpora.org was originally created by Simson Garfinkel to serve as a home for corpora he and others developed for use in digital forensics education and research. The set of materials on the site has evolved over time, but several of the currently available corpora were captured as part of scripted, simulated real-world scenarios in which researchers and students played out roles involving mock criminal activities using computers, USB drives, cell phones and network devices.
These corpora strike a balance between realism and complexity, allowing students in digital forensics courses to engage with problems similar to those they might encounter in their professional careers while limiting the volume of distractors and irrelevant content. They’re freely distributed, contain no actual confidential or sensitive information, and in certain cases have exercises and solution guides that can be distributed to instructors. There’s a great paper linked in the Bibliography section of that site entitled “Bringing science to digital forensics with standardized forensic corpora” (pdf) that goes into the need for such corpora in much greater detail.
We’ve used disk images from one corpus in particular – the “M57-Patents Scenario” – in courses taken by LIS students at UNC and in workshops run by the Society of American Archivists. They’re useful in illustrating various issues you might run into when working with a hard drive obtained from a donor, and in learning to work with various digital forensics tools. I’ve had talks with several people about the possibility of building a realistic corpus that simulated, say, a set of hard drives obtained from an artist or author. This would be expensive and require significant planning, for reasons that are most clearly described in the paper linked in the previous paragraph.
Butch: What are the next steps the digital preservation community should address when it comes to digital forensics?
Kam: Better workflow modeling, information sharing and standard vocabularies to describe actions taken using digital forensics tools are high up on the list. A number of institutions do currently document and publish workflows that involve digital forensics, but differences in factors like language and resolution make it difficult to compare them meaningfully. It’s important to be able to distinguish those ways in which workflows differ that are inherent to the process, rather than the way in which that process is described.
Improving community-driven resources that document and describe the functions of various digital forensics tools as they relate to preservation practices is another big one. Authors of these tools often provide comprehensive documentation, but it doesn’t necessarily emphasize those uses or features of the tools that are most relevant to collecting institutions. Of course, a really great tool tutorial doesn’t really help someone who doesn’t know about that tool, or isn’t familiar with the language being used to describe what it does, so you can flip this: describing a desired data processing outcome in a way that feels natural to an archivist or librarian, and linking to a tool that solves part or all of the related problem. We have some of this already, scattered around the web; we just need more of it, and better organization.
Finally, we need a shared resource of robust educational materials that reflect the kinds of activities students graduating from LIS programs may undertake using these tools. This one more or less speaks for itself.
Two weeks ago, I completed a semester-long, advanced TEI internship where I learned XSLT and utilized it to migrate two digital collections (explained more here, and check out the blog here) in the Digital Library Program. During these two weeks, I’ve had time to reflect on the impact that internships, especially tech-based, have on students.
At Indiana University, a student must complete an internship to graduate with a dual degree or specialization. Requirement or not, this is my number one piece of advice for any student, but especially library students: do as many internships as you possibly can. The hands-on experience gained during an internship is invaluable when moving into a real-world position, and it is something we can’t always get in courses. This is especially true for internships that introduce and refine tech skills.
I’m going to shock you: learning new technology is difficult. It takes time. It takes patience. It takes applying the skill to a real project. Every new computer program or tech skill I’ve learned has come with a “drowning period,” also known as the learning period. Because technology is abstract, it can be hard to conceptualize how it works, and therefore how to learn and understand it.
An internship, usually unpaid or for student-paid credit, is the perfect safe zone for this drowning period. The student has time to fail and make mistakes, but also learn from them in a fairly low-pressure situation. They work with the people actually doing the job in the real world who can serve as a guide for learning the skills, as well as a career mentor.
The supervisors and departments also benefit from free labor, even if mentoring takes time. Internships are also a chance for supervisors to revisit their own knowledge and solidify it by teaching others. They can look at their standards and see if anything needs to be updated or changed. Supervisors can directly influence the next generation of librarians, teaching them skills and hacks that took them years to figure out.
My two defining internships were: the Walt Whitman Archive at the University of Nebraska-Lincoln, introducing me to digital humanities work, and the Digital Library Program, solidifying my future career. What was your defining library internship? What kinds of internships does your institution offer to students and recent graduates? How does your institution support continuing education and learning new tech skills?
D-Lib: Semantic Description of Cultural Digital Images: Using a Hierarchical Model and Controlled Vocabulary
D-Lib: Facing the Challenge of Web Archives Preservation Collaboratively: The Role and Work of the IIPC Preservation Working Group
D-Lib: Statistical Translation of Hierarchical Classifications from Dewey Decimal Classification to the Regensburger Verbundklassifikation
D-Lib: Helping Members of the Community Manage Their Digital Lives: Developing a Personal Digital Archiving Workshop
Part ee of Amazon crawl.
This item belongs to: data/ol_data.
This item has files of the following types: Data, Metadata, Text
Join a panel of Book Industry Study Group (BISG) and American Library Association (ALA) leaders at this year’s 2015 ALA Annual Conference in San Francisco when they discuss the results of a newly-released study on public library patrons’ use of digital content.
During the conference session “Digital Content in Public Libraries: What Do Patrons Think?” panelists will discuss the results of a new study by the BISG and ALA that was designed to provide invaluable insight into how readers interact with e-books in a library environment. The session takes place from 3:00 to 4:00 p.m. on Sunday, June 28, 2015, at the Moscone Convention Center in room 131 of the North Building.
The digital content survey was developed to understand the behavior of library patrons, including their use of digital resources and other services offered by public libraries. The study examined the impact of digital consumption behaviors, including the adoption of new business models, on library usage across America.

Speakers
- Kathy Rosa, director, Office for Research and Statistics, American Library Association
- Carrie Russell, program director, Public Access to Information, Office for Information Technology Policy, American Library Association
- Nadine Vassallo, project manager, Research & Information, Book Industry Study Group
The post How do library patrons feel about digital content? appeared first on District Dispatch.
A previous post I made reviewing the Ithaka report “Streamlining access to Scholarly Resources” got a lot of attention. Thanks!
The primary issue I’m interested in there: Getting our patrons from a paywalled scholarly citation on the open unauthenticated web, to an authenticated library-licensed copy, or other library services. “Bridging the gap”.
Here, we use Umlaut to turn our “link resolver” into a full-service landing page offering library services for both books and articles: Licensed online copies, local print copies, and other library services.
This means we’ve got the “receiving” end taken care of — here’s a book and an article example of an Umlaut landing page — the problem reduces to getting the user from the open unauthenticated web to an Umlaut page for the citation in question.
Which is still a tricky problem. In this post, a brief discussion of two things: 1) the new “Google Scholar Button” browser extension from Google, which is interesting in this area, but I think ultimately not enough of a solution to keep me from looking for more, and 2) possibilities of Zotero open source code toward our end.

The Google Scholar Button
This plugin will extract the title of an article from a page (either text you’ve selected on the page first, or it will try to scrape a title from HTML markup), and give you search results for that article title from Google Scholar, in a little popup window.
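To give a rough sense of how title scraping like this can work (my own illustration, not Google’s code): many publisher pages embed Highwire-style `citation_title` meta tags, and the page `<title>` is a fallback. A standard-library-only Python sketch:

```python
from html.parser import HTMLParser

class TitleScraper(HTMLParser):
    """Collect candidate article titles from a page: a citation_title
    meta tag if present, with the page <title> as a fallback."""
    def __init__(self):
        super().__init__()
        self.citation_title = None
        self.page_title = None
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and attrs.get("name") == "citation_title":
            self.citation_title = attrs.get("content")
        elif tag == "title":
            self._in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.page_title = (self.page_title or "") + data

def best_title(html):
    """Prefer the structured citation_title over the page title."""
    scraper = TitleScraper()
    scraper.feed(html)
    return scraper.citation_title or (scraper.page_title or "").strip()
```

A real extension has to cope with far messier markup than this, but the basic move, structured metadata first and visible page text as fallback, is the same.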
Interestingly, this is essentially the same thing a couple of third party software packages have done for a while: The LibX “Magic Button”, and Lazy Scholar. But now we get it in an official Google release, instead of hacky workarounds to Google’s lack of API from open source.
The Google Scholar Button is basically trying to bridge the same gap we are: it provides a condensed version of Google Scholar search results, with a link to an open access PDF if Google knows about one (I am still curious how many of these open access PDFs are not-entirely-licensed copies put up by authors or professors without publisher permission). In some cases it also provides an OpenURL link to a library link resolver, which is just what we’re looking for.
However, it’s got some limitations that keep me from considering it a satisfactory ‘Bridging the Gap’ solution:
- In order to get the OpenURL link to your local library link resolver while you are off campus, you have to set your Google Scholar preferences in your browser, which is pretty confusing to do.
- The title has to match in Google Scholar’s index of course. Which is definitely extensive enough to still be hugely useful, as evidenced by the open source predecessors to Google Scholar Button trying to do the same thing.
- Most problematic of all, Google Scholar Button results will show the local library link resolver link only for some citations: those registered as having institutional fulltext access in your institutional holdings registered with Google. I want to get users to the Umlaut landing page for any citation they want, even if we don’t have licensed fulltext (and we might even if Google doesn’t think we do; the holdings registrations are not always entirely accurate). I want to show them local physical copies (especially for books), and ILL and other document delivery services.
- The full Google Scholar gives a hard-to-find (but at least it’s there) OpenURL link for “no local fulltext” under a ‘more’ link, but the Google Scholar Button version doesn’t offer even this.
- Books/monographs might not be the primary use case, but I really want a solution that works for books too. Books are something users may especially want in a physical copy rather than as online fulltext, they are largely absent from our holdings registration with Google (even ebooks), and book titles are far less likely to return hits in Google Scholar at all.
I really want a solution that works all or almost all of the time to get the patron to our library landing page, not just some of the time, and my experiments with Google Scholar Button revealed more of a ‘sometimes’ experience.
I’m not sure if the LibX or Lazy Scholar solutions can provide an OpenURL link in all cases, regardless of Google institutional holdings registration. They are both worth further inquiry for sure. But Lazy Scholar isn’t open source, and I find its UI not great for our purposes. And I find LibX a bit too heavyweight for solving this problem, and have some other concerns about it.
So let’s consider another avenue for “Bridging the Gap”: Zotero’s scraping logic.
Instead of trying to take a title and find a hit in a mega-corpus of scholarly citations like the Google Scholar Button approach, another approach would be to try to extract the full citation details from the source page, and construct an OpenURL to send straight to our landing page.
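Concretely, “constructing an OpenURL” here means serializing the scraped citation fields into a KEV (key/encoded-value) query string pointed at the link resolver. A minimal sketch, where the resolver base URL is a hypothetical placeholder standing in for our Umlaut instance:

```javascript
// Sketch: build an OpenURL 1.0 (KEV format) query from citation fields
// scraped off a source page. RESOLVER_BASE is hypothetical; substitute
// your institution's link resolver / landing page URL.
const RESOLVER_BASE = "https://resolver.example.edu/resolve";

function buildOpenURL(cite) {
  const params = new URLSearchParams({
    url_ver: "Z39.88-2004",
    ctx_ver: "Z39.88-2004",
    rft_val_fmt: "info:ofi/fmt:kev:mtx:journal",
    "rft.genre": "article",
  });
  // Only include fields the scraper actually found.
  if (cite.title)   params.set("rft.atitle", cite.title);
  if (cite.journal) params.set("rft.jtitle", cite.journal);
  if (cite.issn)    params.set("rft.issn", cite.issn);
  if (cite.volume)  params.set("rft.volume", cite.volume);
  if (cite.spage)   params.set("rft.spage", cite.spage);
  if (cite.date)    params.set("rft.date", cite.date);
  if (cite.doi)     params.set("rft_id", "info:doi/" + cite.doi);
  return RESOLVER_BASE + "?" + params.toString();
}

const url = buildOpenURL({
  title: "An Example Article",
  journal: "Journal of Examples",
  issn: "1234-5678",
  volume: "12",
  spage: "34",
  date: "2015",
});
console.log(url);
```

The point is that once you have the citation elements, getting the patron to the landing page is just a redirect to a URL like this; the hard part is the scraping.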
And, hey, it has occurred to me that there’s already software that can scrape citation data elements from quite a long list of websites our patrons might want to start from: Zotero. (And Mendeley too, for that matter.)
In fact, you could use Zotero as a method of ‘Bridging the Gap’ right now. Sign up for a Zotero account, install the Zotero extension. When you are on a paywalled citation page on the unauthenticated open web (or a search results page on Google Scholar, Amazon, or other places Zotero can scrape from), first import your citation into Zotero. Then go into your Zotero library, find the citation, and — if you’ve properly set up your OpenURL preferences in Zotero — it’ll give you a link to click on that will take you to your institutional OpenURL resolver. In our case, our Umlaut landing page.
We know from some faculty interviews that some faculty definitely use Zotero, though it’s hard to say whether a majority do. I don’t know how many have managed to set up their OpenURL preferences in Zotero as part of their use of it.
Even of those who have, I wonder how many have figured out on their own that they can use Zotero to “bridge the gap” in this way. And even if we undertook an education campaign, it is a somewhat cumbersome process. You might not want to actually import into your Zotero library; you might want to take a look at the article first. And not everyone chooses to use Zotero, and we don’t want to require them to for a ‘bridging the gap’ solution.
But that logic is there in Zotero: the pretty tricky task of compiling and maintaining ‘scraping’ rules for a huge list of sites likely to be desirable as ‘Bridging the Gap’ sources has already been done. And Zotero is open source, hmm.
We could imagine adding a feature to Zotero that let the user choose to go right to an institutional OpenURL link after scraping, instead of having to import and navigate to their Zotero library first. But I’m not sure such a feature would match the goals of the Zotero project, or how to integrate it into the UX in a clear way without detracting from Zotero’s core functionality.
But again, it’s open source. We could imagine ‘forking’ Zotero, or extracting just the parts of Zotero that matter for our goal, into our own product that did exactly what we wanted. I’m not sure I have the local resources to maintain a ‘forked’ version of plugins for several browsers.
But Zotero also offers a bookmarklet. It doesn’t have as good a UI as the browser plugins, and it doesn’t support all of the scrapers. But unlike a browser plugin, you can install it on iOS and Android mobile browsers (it’s a bit confusing to do so, but at least it’s possible). And it’s probably ‘less expensive’ for a developer to maintain a ‘fork’ of: we really just want to take Zotero’s scraping behavior, implemented via bookmarklet, and completely replace what happens after it’s scraped. Send it to our institutional OpenURL resolver.
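To make the idea concrete, here’s a much-simplified sketch of what the core of such a forked bookmarklet could do. This is not Zotero’s actual translator code (Zotero’s scrapers cover far more sites); it just reads the Highwire-style “citation_*” meta tags that many publisher pages expose, and the resolver URL is a hypothetical placeholder:

```javascript
// Sketch only, not Zotero's actual code: read Highwire-style
// <meta name="citation_*"> tags off the current page, build an OpenURL
// (KEV format), and send the user to the resolver. RESOLVER_BASE is a
// hypothetical stand-in for our Umlaut landing page.
var RESOLVER_BASE = "https://resolver.example.edu/resolve";

// getMeta is injected so the URL-building part can run outside a browser.
function buildResolverURL(getMeta) {
  var fields = {
    "rft.atitle": getMeta("citation_title"),
    "rft.jtitle": getMeta("citation_journal_title"),
    "rft.issn": getMeta("citation_issn"),
    "rft.volume": getMeta("citation_volume"),
    "rft.spage": getMeta("citation_firstpage"),
    "rft.date": getMeta("citation_publication_date")
  };
  var query = [
    "url_ver=Z39.88-2004",
    "rft_val_fmt=" + encodeURIComponent("info:ofi/fmt:kev:mtx:journal")
  ];
  Object.keys(fields).forEach(function (k) {
    if (fields[k]) {
      query.push(encodeURIComponent(k) + "=" + encodeURIComponent(fields[k]));
    }
  });
  var doi = getMeta("citation_doi");
  if (doi) query.push("rft_id=" + encodeURIComponent("info:doi/" + doi));
  return RESOLVER_BASE + "?" + query.join("&");
}

// In the browser, the bookmarklet body would be roughly:
//   var getMeta = function (name) {
//     var el = document.querySelector('meta[name="' + name + '"]');
//     return el ? el.getAttribute("content") : null;
//   };
//   window.location.href = buildResolverURL(getMeta);
```

A real fork would swap the meta-tag lookup for Zotero’s full scraping machinery; the “completely replace what happens after it’s scraped” step is just the final redirect.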
I am very intrigued by this possibility; it seems at least worth some investigatory prototypes to have patrons test. But I haven’t yet figured out where to actually find the bookmarklet code, and the related code in Zotero that may be triggered by it, let alone taken the next step of figuring out whether it can be extracted into a ‘fork’. I’ve tried looking around in the Zotero repo, but I can’t figure out what’s what. (I think all of Zotero is open source?)
Anyone know the Zotero devs, and want to see if they want to talk to me about it with any advice or suggestions? Or anyone familiar with the Zotero source code themselves and want to talk to me about it?