Planet Code4Lib - http://planet.code4lib.org

Eric Lease Morgan: Limit to full text in VuFind

Wed, 2016-08-24 20:16

This posting outlines how a “limit to full text” functionality was implemented in the “Catholic Portal’s” version of VuFind.

While there are many dimensions of the Catholic Portal, one of its primary components is a sort of union catalog of rare and infrequently held materials of a Catholic nature. This union catalog comprises metadata from MARC records, EAD files, and OAI-PMH data repositories. Some of the MARC records include URLs in 856$u fields. These URLs point to PDF files that have been processed with OCR. The Portal’s indexer has been configured to harvest these PDF documents when it comes across them. Once harvested, the OCR text is extracted from the PDF file, and the resulting text is added to the underlying Solr index. The values of the URLs are saved to the Solr index as well. Almost by definition, all of the OAI-PMH content indexed by the Portal is full text; almost all of the OAI-PMH content includes pointers to images or PDF documents.
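
For readers who want a concrete picture of that harvest-and-extract step, here is a minimal sketch of the general idea in Python. It is not the Portal’s actual indexer; the Solr URL and the field names (url, fulltext) are assumptions for illustration only.

import subprocess
import tempfile

import requests

SOLR_UPDATE = 'http://localhost:8983/solr/biblio/update'   # hypothetical Solr core

def index_pdf(record_id, pdf_url):
    # fetch the PDF pointed to by the 856$u field
    pdf = requests.get(pdf_url, timeout=60)
    pdf.raise_for_status()
    with tempfile.NamedTemporaryFile(suffix='.pdf') as tmp:
        tmp.write(pdf.content)
        tmp.flush()
        # extract the OCRed text layer with poppler's pdftotext
        text = subprocess.check_output(['pdftotext', tmp.name, '-']).decode('utf-8', 'replace')
    # a real indexer would add these fields alongside the rest of the
    # MARC-derived document; this just shows the shape of the data
    doc = {'id': record_id, 'url': pdf_url, 'fulltext': text}
    requests.post(SOLR_UPDATE, params={'commit': 'true'}, json=[doc]).raise_for_status()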

Consequently, if a reader wanted to find only full text content, then it would be nice to: 1) do a search, and 2) limit to full text. And this is exactly what was implemented. The first step was to edit Solr’s definition of the url field. Specifically, its “indexed” attribute was changed from false to true. Trivial. Solr was then restarted.

The second step was to re-index the MARC content. Once this was complete, the reader was able to search the index for URL content — “url:*”. In other words, find all records whose url field contains any value.

The third step was to understand that all of the local VuFind OAI-PMH identifiers have the same shape. Specifically, they all include the string “oai”. Consequently, the very astute reader could find all OAI-PMH content with the following query: “id:*oai*”.

The fourth step was to turn on a VuFind checkbox option found in facets.ini. Specifically, the “[CheckboxFacets]” section was augmented to include the following line:

id:*oai* OR url:* = "Limit to full text"

When this was done a new facet appeared in the VuFind interface.

Finally, the whole thing comes to fruition when a person does an initial search. The results are displayed, and the facets include a limit option. Upon selection, VuFind searches again, but limits the query by “id:*oai* OR url:*” — only items that have URLs or come from OAI-PMH repositories. Pretty cool.
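
Behind the scenes, the checkbox facet simply adds that filter query to whatever the reader searched for. A rough way to see the same limit applied directly against Solr is sketched below; the Solr URL, core name, and sample query are assumptions for illustration, not the Portal’s actual configuration.

import requests

SOLR_SELECT = 'http://localhost:8983/solr/biblio/select'   # hypothetical Solr core

params = {
    'q': 'thomas merton',            # the reader's original search
    'fq': 'id:*oai* OR url:*',       # the "Limit to full text" filter
    'wt': 'json',
    'rows': 10,
}
response = requests.get(SOLR_SELECT, params=params).json()
print(response['response']['numFound'], 'full text records found')

The individual probes described above (“url:*” and “id:*oai*”) can be tested the same way by swapping them in as the q parameter.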

Kudos go to Demian Katz for outlining this process. Very nice. Thank you!

LITA: Jobs in Information Technology: August 24, 2016

Wed, 2016-08-24 18:51

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week

American Institute for Radiologic Pathology, Medical Archivist / Case manager, Silver Spring, MD

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

Andromeda Yelton: An open letter to Heather Bresch

Wed, 2016-08-24 13:49

Dear Heather Bresch,

You lived in Morgantown. I did, too: born and raised. My parents are retired from the university you attended. My elementary school took field trips to Mylan labs. They were shining, optimistic.

You’re from West Virginia. I am, too. This means we both know something of the coal industry that has both sustained and destroyed our home. You know, as I do, how many miners have been killed in explosions: trapped underground when a pocket of methane ignites. We both know that miners long carried safety lamps: carefully shielded but raw flames that would go out when the oxygen went too low, a warning to get away — if they had not first exploded, as open flames around methane do. Perhaps you know, as I only recently learned, that miners were once required to buy their own safety lamps: so when safer ones came out, ones that would only warn without killing you first, miners did not carry them. They couldn’t afford to. They set probability against their lives, went without the right equipment, and sometimes lost, and died.

I’m a mother. You are, too. I don’t know if your children carry medication for life-threatening illnesses; I hope you have not had to face that. I have. In our case it’s asthma, not allergies, and an inhaler, not an Epi-Pen. It’s a $20 copay with our insurance and lasts for dozens of doses. It doesn’t stop asthma attacks once they start — my daughter’s asthma is too severe for that — but sometimes it prevents them. And when it does not, it still helps: we spend two days in the hospital instead of five; we don’t go to the ICU. (Have you ever been with your child in a pediatric ICU? It is the most miraculous, and the worst, place on earth.)

Most families can find their way to twenty dollars. Many cannot find six hundred. They’ll go without, and set probability against their children’s lives. Rich children will live; poor children will sometimes lose, and die.

I ask you to reconsider.

Sincerely,

Andromeda Yelton


Equinox Software: Year 2010 : Sine Qua Non

Wed, 2016-08-24 13:43

This is the fifth in our series of posts leading up to Evergreen’s Tenth birthday.  

I often tell people I hire that when you start a new job the first month is the honeymoon period. At month three you are panicking and possibly wondering why you thought you could do this. At six months you realize you’ve actually got the answers and at twelve months it’s like you never worked anywhere else. For me, 2010 represented months six through eighteen of my employment with Equinox and it was one of the most difficult, rewarding, and transformative years of my career. Coincidentally, it was also an incredibly transforming year for Evergreen.

In early 2010, Evergreen 1.6 was planned and released on schedule thanks to contributing efforts from the usual suspects back at that time. Bug fixes and new development were being funded or contributed by PINES, Conifer, Mohawk College, Evergreen Indiana, Calvin College, SAGE, and many others in the community. Somewhere in the midst of the ferocious adoption rate and evolution of 2010, Evergreen quietly and without fanfare faced (and passed) its crucible. Instead of being thrown off stride, this amazingly determined community not only met the challenge, but deftly handled the inevitable friction that was bound to arise as the community grew.

In late August of 2010 KCLS went live on a beta version of Evergreen 2.0 after just over a year of intense and exhilarating development. It marked the beginning of another major growth spurt for Evergreen, including full support for Acquisitions and Serials, as well as the introduction of the template toolkit OPAC (or TPAC). I have nothing but positive things to say about the teams that worked to make that go-live a reality. KCLS and Equinox did amazing things together and, while not everything we did was as successful as we had envisioned, we were able to move Evergreen forward in a huge leap. More importantly, everyone involved learned a lot about ourselves and our organizations – including the community itself.

The community learned that we were moving from a small group of “insiders” and enthusiasts into a more robust and diverse community of users. This is, of course, natural and desirable for an open source project but the thing that sticks out in my mind is how quickly and easily the community adapted to rapid change. At the Evergreen Conference in 2010 a dedicated group met and began the process of creating an official governance structure for the Evergreen project. This meeting led to the eventual formation of the Evergreen Oversight Board and our current status as a member project of the Software Freedom Conservancy.

In the day-to-day of the Evergreen project I witnessed how the core principles of open source projects could shape a community of librarians. And I was proud to see how this community of librarians could contribute their core principles to strengthen the project and its broader community. We complement one another even as we share the most basic truths:
* The celebration of community
* The merit of the individual
* The empowerment of collaboration
* The belief that information should be free

Evergreen is special. More importantly, our community is special. And it’s special because behind each line of code there are dozens of people who contributed their time to create it. Each of those people brought with them their passion, their counter-argument, their insight, their thoughtfulness, and their sheer determination. And together, this community created something amazing. They made great things. They made mistakes. They learned. They adapted. They persevered. And those people behind those lines of code? They’re not abstractions. They are people I know and respect; people who have made indelible marks on our community. It’s Mike, Jason, Elizabeth, Galen, Kathy, Bill, Amy, Dan, Angela, Matt, Elaine, Ben, Tim, Sharon, Lise, Jane, Lebbeous, Rose, Karen, Lew, Joan, and too many others to name. They’re my community and when I think back on how much amazing transformation we’ve achieved in just one year, or ten years, I can’t wait to see what we do in the next ten.

– Grace Dunbar, Vice President

Open Knowledge Foundation: Open Knowledge Switzerland Summer 2016 Update

Wed, 2016-08-24 10:46

The first half of 2016 was a very busy one for the Open Knowledge Swiss chapter, Opendata.ch. Between April and June alone, the chapter held 3 hackathons, 15 talks, 3 meetups and 10 workshops. In this blog post we highlight some of these activities to update the Open Knowledge community about our chapter’s work.


Main projects

Our directors worked on relaunching the federal Open Government Data portal and its new online handbook. We gathered and published datasets and ran workshops in support of various hackdays – and we migrated and improved our web infrastructure with better support for the open Transport API (handling up to 1.7 million requests per day!).


Main events

We held our annual conference in June, ran energy-themed hackdays in April and an OpenGLAM hackathon in July. Additionally, we supported two smaller regional hackathons in the spring, and a meetup on the occasion of Open Data Day.


Challenges

Like other organisations in this space, our main challenge is redefining our manifesto and restructuring our operations to become a smoother running chapter that is more responsive to the needs of our members and community. This restructuring continues to be a challenge that we are learning from – and need to learn more about.


Successes

Our media presence and public identity continues to be stronger than ever. We are involved in a wide range of political and inter-organizational activities in support of diverse areas of openness, and in general we are finding that our collective voice is stronger and our messages better received everywhere we go.


Governance

We have had several retreats with the board to discuss changes in the governance and to welcome new directors: Catherine Pugin (ta-swiss.ch, datastory.ch), Martin Grandjean (martingrandjean.ch) and Alexandre Cotting (hevs.ch).

We are primarily working on a better overall organizational structure to support our community and working groups: starting and igniting new initiatives will be the next step. Among them will be the launch of a business-oriented advocacy group called “Swiss Data Alliance”.


Looking ahead

We will soon announce a national program on food data, which includes hackdays and a funded follow-up/incubation phase for the prototypes produced. And we are busy setting up a hackathon with international scope and support at the end of September, called Hack for Ageing Well. Follow #H4AW for more info.

We are excited about upcoming cross-border events like #H4AW and Jugend Hackt, opening doors to development and research collaborations. Reach out through the Open Knowledge forums and we’ll do our best to connect you into the Swiss community!

LibUX: Helping users easily access content on mobile

Wed, 2016-08-24 04:55

 Pages that show intrusive interstitials provide a poorer experience to users than other pages where content is immediately accessible. This can be problematic on mobile devices where screens are often smaller. To improve the mobile search experience, after January 10, 2017, pages where content is not easily accessible to a user on the transition from the mobile search results may not rank as highly.

I wonder, given their description, whether this covers the exit-intent pop-ups that OptinMonster made popular.

 Showing a popup that covers the main content, either immediately after the user navigates to a page from the search results, or while they are looking through the page.

One can hope.

Helping users easily access content on mobile

LibUX: A few things Brodie Austin learned doing usability tests on library websites

Wed, 2016-08-24 04:43

Preach.

 My #1 rule when it came to thinking about website usability was that no one was allowed to claim to know what “normal people” would think or do until we actually sat down with normal(ish) people.

So, you want to do usability testing on your library website

Galen Charlton: Visualizing the global distribution of Koha installations from Debian packages

Wed, 2016-08-24 04:15

A picture is worth a thousand words:


This represents the approximate geographic distribution of downloads of the Koha Debian packages over the past year. Data was taken from the Apache logs from debian.koha-community.org, which MPOW hosts. I counted only completed downloads of the koha-common package, of which there were over 25,000.

Making the map turned out to be an opportunity for me to learn some Python. I first adapted a Python script I found on Stack Overflow to query freegeoip.net and get the latitude and longitude corresponding to each of the 9,432 distinct IP addresses that had downloaded the package.
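
That lookup step might look something like the sketch below. It is not the actual script; the freegeoip.net response fields reflect that service’s JSON API at the time, the per-IP counts are placeholders, and the output columns simply match what the plotting script further down expects.

import csv

import requests

# hypothetical totals per IP address, tallied from the Apache logs
download_counts = {'192.0.2.1': 3, '198.51.100.7': 12}

def locate(ip):
    # freegeoip.net returned JSON with "latitude" and "longitude" fields
    geo = requests.get('http://freegeoip.net/json/' + ip, timeout=10).json()
    return geo['latitude'], geo['longitude']

with open('koha-with-loc.csv', 'w', newline='') as out:
    writer = csv.writer(out)
    writer.writerow(['lat', 'lon', 'value'])
    for ip, downloads in download_counts.items():
        lat, lon = locate(ip)
        writer.writerow([lat, lon, downloads])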

I then fed the results to OpenHeatMap. While that service is easy to use and is written with GPL3 code, I didn’t quite like the fact that the result is delivered via an Adobe Flash embed.  Consequently, I turned my attention to Plotly, and after some work, was able to write a Python script that does the following:

  1. Fetch the CSV file containing the coordinates and number of downloads.
  2. Exclude as outliers rows where a given IP address made more than 100 downloads of the package during the past year — there were seven of these.
  3. Truncate the latitude and longitude to one decimal place — we need not pester corn farmers in Kansas for bugfixes.
  4. Submit the dataset to Plotly to generate a bubble map.

Here’s the code:

#!/usr/bin/python
# adapted from example found at https://plot.ly/python/bubble-maps/
import plotly.plotly as py
import pandas as pd

df = pd.read_csv('http://example.org/koha-with-loc.csv')
df.head()

# scale factor for the size of the bubbles
scale = 3

# filter out rows where an IP address did more than
# one hundred downloads
df = df[df['value'] <= 100]

# truncate latitude and longitude to one decimal place
df['lat'] = df['lat'].map('{0:.1f}'.format)
df['lon'] = df['lon'].map('{0:.1f}'.format)

# sum up the 'value' column as 'total_downloads'
aggregation = { 'value' : { 'total_downloads' : 'sum' } }

# create a DataFrame grouping by the truncated coordinates
df_sub = df.groupby(['lat', 'lon']).agg(aggregation).reset_index()

coords = []
pt = dict(
    type = 'scattergeo',
    lon = df_sub['lon'],
    lat = df_sub['lat'],
    # cast the download totals to strings for the hover text
    text = 'Downloads: ' + df_sub['value']['total_downloads'].astype(str),
    marker = dict(
        size = df_sub['value']['total_downloads'] * scale,
        color = 'rgb(91,173,63)',   # Koha green
        line = dict(width=0.5, color='rgb(40,40,40)'),
        sizemode = 'area'
    ),
    name = ''
)
coords.append(pt)

layout = dict(
    title = 'Koha Debian package downloads',
    showlegend = True,
    geo = dict(
        scope = 'world',
        projection = dict(type='eckert4'),
        showland = True,
        landcolor = 'rgb(217, 217, 217)',
        subunitwidth = 1,
        countrywidth = 1,
        subunitcolor = "rgb(255, 255, 255)",
        countrycolor = "rgb(255, 255, 255)"
    ),
)

fig = dict(data=coords, layout=layout)
py.iplot(fig, validate=False, filename='koha-debian-downloads')

An interactive version of the bubble map is also available on Plotly.

HangingTogether: Slam bam WAM: Wrangling best practices for web archiving metadata

Wed, 2016-08-24 01:14

The OCLC Research Library Partnership Web Archiving Metadata Working Group (WAM, of course) was launched last January and has been working hard–really hard–ever since. Twenty-five members from Partner libraries and archives have dug in to address the challenge of devising best practices for describing websites–which are, it turns out, very odd critters compared to other types of material for which descriptive standards and guidelines already exist. In addition, user needs and behaviors are quite different from those we’re familiar with.

Our plan at the outset: do an extensive literature review on both user needs and existing metadata practices in the web context, study relevant descriptive standards and institution-specific web archiving metadata guidelines, engage the community along the way to confirm the need for this work and obtain feedback, and, ultimately, issue two reports: the first on user needs and behaviors specific to archived web content, the second outlining best practices for metadata. The heart of the latter will be a set of recommended data elements accompanied by definitions and the types of content that each should contain.

At this juncture we’ve drawn several general conclusions:

  • Descriptive standards don’t address the unique characteristics of websites.
  • Local metadata guidelines have little in common with each other.
  • It’ll therefore be challenging to sort it all out and arrive at recommended best practices that will serve the needs of users of archived websites.

We’ve reviewed nine sets of institution-specific guidelines. The table below shows the most common data elements, some of which are defined very differently from one institution to another. Only three appear in all nine guidelines: creator/contributor, title, and description.

Collection name/title       Language
Creator/contributor         Publisher
Date of capture             Rights/access conditions
Date of content             Subject
Description                 Title
Genre                       URL

Our basic questions: Which types of content are the most important to include in metadata records describing websites? And which generic data elements should be designated for each of these concepts?

Here are some of the specific issues we’ve come across:

  • Website creator/owner: Is this the publisher? Creator? Subject? All three?
  • Publisher: Does a website have a publisher? If so, is it the harvesting institution or the creator/owner of the live site?
  • Title: Should it be transcribed verbatim from the head of the home page? Or edited to clarify the nature/scope of the site? Should acronyms be spelled out? Should the title begin with, e.g., “Website of the …”
  • Dates: Beginning/end of the site’s existence? Date of capture by a repository? Content? Copyright?
  • Extent: How should this be expressed? “1 online resource”? “6.25 Gb”? “approximately 300 websites”?
  • Host institution: Is the institution that harvests and hosts the site the repository? Creator? Publisher? Selector?
  • Provenance: In the web context, does provenance refer to the site owner? The repository that harvests and hosts the site? Ways in which the site has evolved?
  • Appraisal: Does this mean the reason why the site warrants being archived? The collection of a set of sites as named by the harvesting institution? The scope of the parts of the site that were harvested?
  • Format: Is it important to be clear that the resource is a website? If so, how best to do this?
  • URL: Which URLs should be linked to? Seed? Access? Landing page?
  • MARC21 record type: When coded in the MARC 21 format, should a website be considered a continuing resource? Integrating resource? Electronic resource? Textual publication? Mixed material? Manuscript?

We’re getting fairly close to completing our literature review and guidelines analysis, at which point we’ll turn to determining the scope and substance of the best practices report. In addition to defining a set of data elements, it’ll be important to set the problem in context and explain how our analysis has led to the conclusions we draw.

So stay tuned! We’ll be sending out a draft for community review and are hoping to publish both reports within the next six months. In the meantime, please send your own local guidelines, as well as pointers to a few sample records, to me at dooleyj@oclc.org. Help us make sure we get it right!

About Jackie Dooley

Jackie Dooley leads OCLC Research projects to inform and improve archives and special collections practice.


DuraSpace News: VIVO Updates for August 21–Conference Wrap-Up, Improved Documentation

Wed, 2016-08-24 00:00

From Mike Conlon, VIVO project director

Equinox Software: Evergreen 2009: Not Just Code

Tue, 2016-08-23 19:13

This is the fourth in our series of posts leading up to Evergreen’s Tenth birthday.  

I first became aware of Evergreen in 2007 when I saw a posting on a library technology listserv.  As an open source advocate and a librarian, I began following its progress. Skip forward to a cold morning in January 2009 and I was letting IT managers and library directors from around the state of South Carolina into a meeting room.  I was the IT manager at the Florence County Library and two months previously we, as a library, had decided to move to Evergreen.  We had written a long term technology plan and a critical part was our Integrated Library System.  Aside from Georgia, we saw Evergreen being adopted in Michigan and Indiana.  I knew that in time Evergreen would match and surpass our other options.

We also knew that an open source community was going to require changing our perspective of what our relationship to the ILS looked like.  The old proprietary vendor had legal control over aspects of the community and there were limits to what we could share among ourselves as customers.  Libraries had to strike special deals with strict non-disclosure agreements to gain access to source code and the insight to how the ILS worked behind the user interface.

To say that this was going to be different would be an understatement.  The source code was not only not confidential but openly published.  People developed reports and freely published them on community wikis while articles appeared in journals and on personal blogs.  The lack of a corporate gatekeeper was both invigorating and a little overwhelming.  Bringing in a vendor to run an ILS as a service made sense to us but could we convince others to join us?

We asked if other libraries would be interested in an Evergreen consortium.  The answer was yes.  There were a lot of concerns but the experts we called in, Equinox Software, seemed the perfect choice.  No one knew Evergreen as a team as well as they did and they had worked with small libraries and big consortiums.  Partnering with Equinox allowed us to start the migration process quickly despite very little in-house knowledge of Evergreen across our libraries.  And without a proprietary gatekeeper, other libraries in the consortium could dig into the deeper technologies to the degree they were comfortable doing so.  My library was definitely one of the others that wanted more.

We knew that user community was important.  Even with its limitations the user group of our previous ILS had been valuable.  2009 was the year I went to the inaugural Evergreen Conference.  2009 was the year I became active on the listservs, mostly watching but answering questions where I could.  2009 was the year I first volunteered to help out with community activities.  2009 was the year I first gave feedback on new features and bugs.  2009 was the year I, as a user, became a part of the community and saw an impact from it.  And, frankly, it was kind of easy.  I had an advantage being both a librarian and having a technical background but as I met others as new to the community as I was I saw them doing the same thing.  Where they became involved varies based on their interests and skills but everyone who wanted to found a place.  I even recognized a few from the user groups of my old ILS.  Before these people had been names and faces I vaguely recognized from meetings at ALA and listservs.  They had been users of the same thing I used.  Now, in the Evergreen community they were fellows and peers.

The open development process meant that I got a chance to provide feedback on features being developed that we weren’t paying for.  I had participated in feedback about features for proprietary ILSes.  It always felt like throwing pennies in a wishing well and crossing my fingers.  I didn’t work at libraries large enough to drive the process of development so we had to hope that the really big customers wanted the same things we did.  Here, the process wasn’t just open in name but discussion about requirements and needs was being done in public forums.  Input was not just allowed but encouraged.  It was clearly a matter of pride for the developers to know that their work was as widely useful as possible.  I could follow the process and choose to participate if it was a feature I was interested in.  And behind each of these things was a person, someone I got to know on a listserv or a conference.  

In December of 2009 our third wave of libraries went live.  Things had calmed down from the hectic early days of migrations.  It had been almost a year now since that early meeting when we went from “we want to” to “we are doing this.”  I remember having time to spend looking at bugs because we had the Christmas slowdown common to public libraries when a developer at another consortium sent me an email.  I commented on a bug that wasn’t a high community priority but it, well, bugged me.  I had helped this developer with testing some patches both to help him out and to give myself more patching experience.  He carved out time and fixed that bug for me.  The truth is that any human endeavor involves an economy of personalities.  But in the software world of meritocracy, open source projects are often more about code than people.  They are often tightly focused projects that do specific things.  An ILS isn’t tightly focused.  It touches on a vast swath of a library’s operations.  It took me a while to realize what now seems obvious in hindsight, but Evergreen isn’t tightly focused and it’s not about code.  Code is critical to the project as it is the means to an end but Evergreen is about people.  I learned a lot in 2009 but things have changed.  I’ve changed jobs and code has changed but the fact that Evergreen is about people hasn’t changed.

–Rogan Hamby, Project and Data Analyst

SearchHub: Solr Troubleshooting: Treemap Approach

Tue, 2016-08-23 17:58

As we countdown to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting a talk from the newest Solr committer, Alexandre Rafalovitch. Alexandre will also be presenting at Lucene/Solr Revolution 2016.

Solr is too big of a product to troubleshoot as if it were a black box or by random tests. Fortunately, there is a way to use Solr’s API, logs, specific queries, and even source code to iteratively narrow a problem definition to something concrete and fixable. Solr already has most of the tools for good troubleshooting, but they are not positioned or documented as such. Additionally, there are various operating system tools that can be used for troubleshooting Solr. This talk provides viewers with the mental model and practical tools to become better troubleshooters with their own Solr projects.

Alexandre is a full-stack IT specialist with more than 20 years of industry and non-profit experience, including in Java, C# and HTML/CSS/JavaScript. He develops projects on Windows, Mac and Linux. His current focus is a consultancy specializing in popularizing Apache Solr. Alex has written one book about Solr already (Apache Solr for Indexing Data How-to). He has presented at Lucene/Solr Revolution 2014 and 2015, as well as multiple times at JavaOne and various smaller venues. Alexandre became an Apache Lucene/Solr committer in August 2016.

Solr Troubleshooting – Treemap Approach: Presented by Alexandre Rafalovitch, United Nations from Lucidworks

Join us at Lucene/Solr Revolution 2016, the biggest open source conference dedicated to Apache Lucene/Solr on October 11-14, 2016 in Boston, Massachusetts. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post Solr Troubleshooting: Treemap Approach appeared first on Lucidworks.com.

Library of Congress: The Signal: Congress.gov Nominated for Award

Tue, 2016-08-23 17:24

Poster of the legislative process. Congress.gov

FedScoop, the Washington DC government tech media company, announced that Congress.gov is one of their nominees for the 2016 FedScoop 50 awards.

Features on Congress.gov (which In Custodia Legis has been posting about throughout its development) include:

  • Ability to narrow and refine search results
  • Ability to simultaneously search all content across all available years
  • Bill summaries and bill status from the 93rd Congress through the present
  • Bill text from the 103rd Congress through the present
  • Congressional Record
  • Committee landing pages
  • Comprehensive searching across bill text
  • Congressional Record index
  • Congressional reports
  • Easier identification of current bill status
  • Effective display on mobile devices
  • Executive communications
  • House and Senate calendars
  • Links to video of the House and Senate floor
  • Members’ legislative history and biographical profiles
  • Nominations
  • Persistent URLs
  • Top searched bills
  • Treaties

The FedScoop website states, “Congress.gov is the official website for U.S. federal legislative information. The site provides free access to accurate, timely, and complete legislative information. The Library of Congress manages Congress.gov and ingests data from the Office of the Clerk of the U.S. House of Representatives, the Office of the Secretary of the Senate, the Government Publishing Office, and the Congressional Budget Office. Congress.gov is fully responsive and intuitive. The success of Congress.gov has enabled the Library of Congress to retire legacy systems, better serve the public, members of Congress and congressional staff, and to work more effectively with data partners.”

Vote for your favorite Tech Program of the Year.

LibUX: Blueprint for Trello

Tue, 2016-08-23 15:21

If you think of a journey map as an aerial view, the top-down plot of your user’s tour through a service — imagine each step involved registering for a new library card — then the service blueprint is its cross-section.

Think of the service blueprint as the cross-section of a journey map.

Sussing out the systems and processes that underlie that journey returns a lot of insight for the time you spend, and I would guess that, like the journey map, the service blueprint might provide the most bang for your buck.

Blueprinting was born out of sticky notes and conference rooms and lends itself to being lo-fi, especially because it’s a team sport. And although the best UI for group-think is a wall, there is a need for remote tools like Mural that are priced for teams without an enterprise budget.

I noticed in Erik Flowers and Megan Miller’s Practical Service Blueprinting Guide, this example —

www.practicalservicedesign.com

— that, to me, looks a little like Trello. Can you see it?

Trello is an organize-how-you-will collaborative kanban board, sort of.

Trello brings a lot to the table but in the world of budget-usability its real boons are that it’s free, ubiquitous, and extensible. It can be integrated into just about any workflow and accessed anywhere from any device. Updates are real-time. Shoot, I almost wish I had an affiliate link. I use it for everything — and, chances are, a lot of you do too — and for that alone it makes sense to me to leverage its consistency in our user experience work as a blueprinting tool, if we can.

So, inspired by Scrum for Trello — a browser extension that adds Fibonacci numbers and a burn down chart — I started Blueprint for Trello. All it really does is make items using shortcodes like [touchpoint] look more like the practical service blueprint.

Really beta

This is my first browser extension — so it’s just Chrome, for now — and at the time of this writing there is an important to-do item that until checked makes this pretty incomplete. Namely, it only applies the skin on page load – so as you add cards or move things around, they lose their styling and need a refresh. Still, I want to give Blueprint for Trello its first spin at a workshop I’m teaching this afternoon, so I thought I’d share what I have.

Installation
  1. Download and unzip the folder (Github).
  2. Using Chrome, navigate to chrome://extensions and check the box “Developer Mode”
  3. Choose “Load unpacked extensions”
    1. Find the folder blueprint-for-trello or blueprint-for-trello-master
    2. And, depending how you downloaded it, choose the folder of the same name inside it.


Usage

Blueprint for Trello applies styling to cards that include in their title a number of short codes:

  • [touchpoint]
  • [actor]
  • [system]
  • [stakeholder]
  • [observation]
  • [data]
  • [question]
  • [critical]
  • [policy]
  • [idea]

It replaces these with icons and applies a color to the cards.

From there, the Practical Service Blueprinting Guide should take you the rest of the way.

David Rosenthal: Content negotiation and Memento

Tue, 2016-08-23 15:00
Back in March Ilya Kreymer summarized discussions he and I had had about a problem he'd encountered building oldweb.today thus:
a key problem with Memento is that, in its current form, an archive can return an arbitrarily transformed object and there is no way to determine what that transformation is. In practice, this makes interoperability quite difficult.

What Ilya was referring to was that, for a given Web page, some archives have preserved the HTML, the images, the CSS and so on, whereas some have preserved a PNG image of the page (transforming it by taking a screenshot). Herbert van de Sompel, Michael Nelson and others have come up with a creative solution. Details below the fold.


I suggested that what we were really talking about was yet another form of content negotiation; Memento (RFC7089) specifies content negotiation in the time dimension, HTTP specifies content negotiation in the format and language "dimensions", and what Ilya wanted was content negotiation in the "transform" dimension to allow a requestor to choose between transformed and untransformed versions of the page. Ilya's list of transforms was:
  • none - the URL content exactly as originally received.
  • screenshot - an image of the rendered page.
  • altered-dom - the DOM altered as, for example, by archive.is.
  • url-rewritten - URLs in the page rewritten to point to preserved pages in the archive.
  • banner-inserted - the page framed by archival metadata as, for example, by the Wayback Machine.
Ilya's and my idea was that a new HTTP header would be defined to support this form of content negotiation.

[Image: banner-inserted content, outlined in red]

Shawn Jones, Herbert and Michael objected that defining new HTTP headers was hard, and wrote a detailed post which explained the scope of the problem:

In the case of our study, we needed to access the content as it had existed on the web at the time of capture. Research by Scott Ainsworth requires accurate replay of the headers as well. These captured mementos are also invaluable to the growing number of research studies that use web archives. Captured mementos are also used by projects like oldweb.today, that truly need to access the original content so it can be rendered in old browsers. It seeks consistent content from different archives to arrive at an accurate page recreation. Fortunately, some web archives store the captured memento, but there is no uniform, standard-based way to access them across various archive implementations.

Their proposal was to use two different Memento TimeGates, one for the transformed and one for the un-transformed content.

The elegance of Herbert et al's latest proposal comes from eliminating the need to define new HTTP headers or to use multiple TimeGates. Instead, they propose using the standard Prefer header from RFC7240. They write:
Consider a client that prefers a true, raw memento for http://www.cnn.com. Using the Prefer HTTP request header, this client can provide the following request headers when issuing an HTTP HEAD/GET to a memento.

GET /web/20160721152544/http://www.cnn.com/ HTTP/1.1
Host: web.archive.org
Prefer: original-content, original-links, original-headers
Connection: close
As we see above, the client specifies which level of raw-ness it prefers in the memento. In this case, the client prefers a memento with the following features:
  1. original-content - The client prefers that the memento returned contain the same HTML, JavaScript, CSS, and/or text that existed in the original resource at the time of capture.
  2. original-links - The client prefers that the memento returned contain the links that existed in the original resource at the time of capture.
  3. original-headers - The client prefers that the memento response uses X-Archive-Orig-* to express the values of the original HTTP response headers from the moment of capture.
The memento that is returned can carry the Preference-Applied HTTP response header indicating which of the requested preferences have been applied to the returned content. This is closely analogous to the earlier suggestion of content negotiation but doesn't require either new headers or multiple TimeGates.
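
As a concrete illustration (not code from the proposal itself), a Python client talking to an archive that implements these semantics might look something like the sketch below; the memento URL is the example above, and the Prefer/Preference-Applied exchange is the only part that matters.

import requests

# the example memento from the proposal; any memento URI would do
memento = 'http://web.archive.org/web/20160721152544/http://www.cnn.com/'

# ask for the raw capture rather than a rewritten rendering
resp = requests.get(memento, headers={
    'Prefer': 'original-content, original-links, original-headers'
})

# the archive indicates which preferences it actually honored
applied = resp.headers.get('Preference-Applied', '')
if 'original-content' in applied:
    raw_html = resp.text   # body is the capture as originally received
else:
    print('archive returned a transformed memento; applied:', applied)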

The details of their proposal are important, you should read it.

Open Knowledge Foundation: Come talk Open Data and Land Governance with Cadasta on the LandPortal! Join the online discussion Sept 6-20th, 2016

Tue, 2016-08-23 11:39

Earlier this year, Open Knowledge International announced a joint-initiative with the Cadasta Foundation to explore property rights data with the ultimate goal of defining the land ownership dataset for the Global Open Data Index. Lindsay Ferris from the Cadasta team shares more on how you can get involved on issues related to open data in land governance. Register as a user on the Land Portal and take part in September 2016’s LandDebate.

Are you interested in ensuring land governance is open and transparent? Do you want to understand your role in using land administration information and how it can help provide greater security of  property rights? We are excited to hear from you!

Cadasta Foundation is pleased to announce that we are partnering with the Land Portal Foundation to organize an online discussion on Open Data and Land. From September 6th – 20th, 2016, we will facilitate a LandDebate on the Land Portal, posing questions to spark conversation. The Land Portal is an online resource hub and networking platform for the land community. We will hear from leading members of the open data and land governance communities on the topic of open data in land governance — all stakeholders, including CSOs, government officials, private sector actors and researchers are invited to be involved! To find out more about LandDebates, take a look at some past topics here.

Image credit: Cadasta

The land administration sector plays a critical role in governing what is often the most valuable asset of states – the land and natural resources. Unfortunately, given the high value of land, and the power that goes along with access to it, the land sector is ripe for potential abuse. As such, it is a sector where greater transparency plays a critical role in ensuring accountability and equitable access and enforcement of land rights. Opacity in land governance can enable major corruption in land management, increase difficulty in unlocking the value of the land as an asset, and foster a lack of awareness of land policies and legal frameworks by citizens; all of which can undermine land tenure security.

Unfortunately, land administration data ranging from property registries and cadastres to datasets collected through participatory mapping and research is often inaccessible. The information needed to close these gaps to understand who has a right to what property and under what terms remains closed, often at the expense of the most vulnerable populations. Further, due to privacy and security concerns associated with sharing information on vulnerable populations, opinions remain mixed on what should be released as “open data” for anyone to access, reuse and share. The hope is that opening up the data in a way that takes these concerns into account can level the playing field and reduce information asymmetry so that everyone — individuals, communities, NGOs, governments and the private sector — can benefit from land information.

As part of Cadasta Foundation’s on-going research on open data in land, the aim of this discussion is to bring together these stakeholders to address the implications of open data for land governance, including understanding the links between transparency and global challenges, such as overcoming poverty, strengthening property rights for vulnerable populations, enhancing food security and combating corruption. We also hope to broaden consensus on this issue, define what data is important for the community to be open and begin to collect examples of best practices that can be used as an advocacy point going forward. All of Cadasta’s open data research resources can be found here.

To be a part of the LandDebate, simply register as a user on the Land Portal. Then, you’ll be able to dive right in when the conversation begins on September 5th through the Open Data and Land page. If you’d like to reach out with questions on the content, how to get involved or to contribute comments in advance of the LandDebate, contact us at open@cadasta.org! Finally, to get some background on open data in land, check out Cadasta’s existing resources on the topic here. We’re excited to hear from you.

This piece was written by Lindsay Ferris and is cross-posted on the Cadasta blog.

FOSS4Lib Recent Releases: ePADD - 2.0

Mon, 2016-08-22 19:56

Last updated August 22, 2016. Created by Peter Murray on August 22, 2016.

Package: ePADD
Release Date: Friday, August 19, 2016

SearchHub: Where Search Meets Machine Learning

Mon, 2016-08-22 18:38

As we countdown to the annual Lucene/Solr Revolution conference in Boston this October, we’re highlighting talks and sessions from past conferences. Today, we’re highlighting Verizon’s Joaquin Delgado and Diana Hu’s talk, “Where Search Meets Machine Learning”.

Joaquin and Diana discuss ML-Scoring, an open source framework they’ve created that tightly integrates machine learning models into popular search engines, replacing the default IR-based ranking function.

Joaquin A. Delgado, PhD. is currently Director of Advertising and Recommendations at OnCue (acquired by Verizon). Previous to that he held CTO positions at AdBrite, Lending Club and TripleHop Technologies (acquired by Oracle). He was also Director of Engineering and Sr. Architect Principal at Yahoo! His expertise lies in distributed systems, advertising technology, machine learning, recommender systems and search. He holds a Ph.D in computer science and artificial intelligence from Nagoya Institute of Technology, Japan.

Diana Hu is currently the lead data scientist on the architecture team at Verizon OnCue. She steers the algorithm efforts to bring models from research to production for machine learning, NLP, and computer vision projects in TV recommender systems and advertising. Previously, she worked at Intel Labs, where she researched large scale machine learning frameworks. She holds an MS and BS in Electrical and Computer Engineering from Carnegie Mellon University, where she graduated with highest honors and was inducted into the Electrical Engineering Honor Society – Eta Kappa Nu.

Where Search Meets Machine Learning: Presented by Diana Hu & Joaquin Delgado, Verizon from Lucidworks

Join us at Lucene/Solr Revolution 2016, the biggest open source conference dedicated to Apache Lucene/Solr on October 11-14, 2016 in Boston, Massachusetts. Come meet and network with the thought leaders building and deploying Lucene/Solr open source search technology. Full details and registration…

The post Where Search Meets Machine Learning appeared first on Lucidworks.com.

Equinox Software: Evergreen 2008

Mon, 2016-08-22 17:42

This is the third in our series of blog posts leading up to Evergreen’s 10th birthday.  The other posts can be found here and here.

At the beginning of 2008, I was working for the South Carolina State Library. Like many libraries, we were joining the OSS movement by implementing open source tools to market services and increase discoverability of our collections. We were encouraging our public libraries to do the same, offering classes on blogging, wikis, and social media. It was no surprise when the State Library hosted visitors from PINES and Equinox to introduce Evergreen.

Having worked for an ILS vendor previously, I was intrigued by the possibility of an open source ILS. The idea that libraries could take back some control over how their ILS developed was exciting! There was a fair amount of skepticism in the audience. Evergreen was in its toddler years and, for some, they needed to see it mature a little before jumping on board. For others, they were ready for a change and saw Evergreen as the opportunity they needed to move their libraries into the future. This is where SC LENDS started to form, but I’ll let my colleague Rogan tell you that story.

For me, 2008 was the year I packed my bags for Georgia to join the Equinox team. I was inspired by what PINES had started and wanted to be a part of building Evergreen further. I knew libraries would gravitate quickly toward the open source business model and developing their own solutions. After all, libraries have been centered around open access and community since their inception.

It’s been a privilege to watch Evergreen grow from those early days to adulthood. I no longer have to talk to potential customers about open source concerns or the maturity of Evergreen. Our discussions are centered around the robust features of Evergreen and how it can work for their library. I still encounter skepticism but it often results in the best discussions. On those occasions where skeptics become true believers, we find our strongest community supporters.

Evergreen is turning 10 years old! It feels more like a 21st birthday celebration because we’ve come so far so fast. I raise my glass to Evergreen and everyone who has been a part of this first 10 years. I can’t wait to see what the next 10 years will bring!

–Shae Tetterton, Director of Sales

Harvard Library Innovation Lab: Summer Fellows Share, Join Us

Mon, 2016-08-22 16:36

LIL fellows are wrapping up their terms this week! Please join us and learn from our Fellows as they present their research on ways we can explore and utilize technology to preserve, prepare, and present information for the common good.

Over 12 weeks, the Fellows produced everything from book chapters and web applications to board games – and, ultimately, an immeasurable amount of inspiration that extends far beyond the walls of Langdell. They explored subjects such as text data modeling, web archiving, opening legal data, makerspaces, and preserving local memory in places disrupted by disaster.

Please RSVP to Gail Harris

Our fellows will be sharing their work on these fascinating topics on Wednesday, August 24 from 1:00-3:00 in the Caspersen Room.
