Evergreen Hosting Booth 3345 at ALA San Francisco
The Evergreen community will be hosting booth 3345 at ALA San Francisco on June 25th through 30th. Evergreen is open source library software that is used in more than 1,300 libraries around the world.
ALA conferences provide attendees with information on global issues affecting libraries. At ALA San Francisco there will be about 900 exhibitors, including Evergreen.
Evergreen has all of the modules of a traditional ILS, but without the restrictions of a vendor-driven ILS. Included with the software are modules for self-checkout, self-registration, acquisitions, reports and serials. It is continuously improved through semiannual updates and is customizable to meet the needs of libraries of various sizes.
Make sure to check out the Evergreen booth 3345 and meet real Evergreen users and hear their stories.
Tim and I are headed to San Francisco this weekend for the ALA Annual Conference.
Visit Us. Stop by booth #3634 to talk to us, get a demo, and learn about all the new and fun things we’re up to with LibraryThing for Libraries!
Stay tuned this week for more announcements of what we’ll be showing off. No, really. It’s going to be awesome.
Get in Free. In the SF area and want to go to ALA? We have free exhibit only passes. Click here to sign up and get one. It will get you just into the exhibit hall, not the conference sessions themselves.
Director Zhang began by surveying the digital landscape, emphasizing the rise of ebooks, digital journals, and machine reading. The CAS decided to embrace the digital-first approach, and canceled all print subscriptions for Chinese-language journals. Anything they don’t own they obtain through consortial relationships ...
This approach works well for a growing proportion of the CAS constituency, which Xiaolin referred to as “Generation Open” or “Generation Digital”. This group benefits from – indeed, expects – a transition from print to open access. For them, and for our presenter, “only ejournals are real journals. Only smartbooks are real books… Print-based communication is a mistake, based on historical practicality.” It’s not just consumers, but also funders who prefer open access.
Below the fold, some thoughts on Director Zhang's vision.
Almost a decade ago, Vicky Reich created this fake Starbucks page to illustrate the fate awaiting libraries without collections that saw their role simply as purchasing agents for subscription content. Even if it succeeded in competing with Starbucks, such a library wouldn't be a research library; nothing would distinguish it as a venue for research. In countries that are negotiating access to subscription content for all their institutions centrally, or even for all their citizens, or if the Max Planck Institute's plan for open access succeeds, the library's role as providing access goes away. Director Zhang sees this clearly:
Chinese faculty now see the library’s main role as that of a buyer and archive maintainer. Yet libraries have outsourced collections, either deliberately or by the rise of the web. Libraries now hold on to a diminishing part of scholarly knowledge. Moreover, director Zhang observed that his library’s foot traffic has been declining – and he helped make it happen, by making an aggressive shift to the digital world. Which led him to ask a dangerous question: are libraries losing the right to be research libraries?
His answer was that libraries need to evolve:
To begin with, the library needs to embed itself more deeply in the research and development process. Researchers need to do environmental scanning, trends and path analysis, data management and analysis, content distribution, identifying emerging topics, mapping trends, technology scanning, competition analysis, R+D exploration + discovery, and more. Xiaolin urged us to repurpose libraries to directly support these needs. Put another way, an analytical platform should be at the center of research libraries.
Researchers clearly need these capabilities, but what advantages does a library at an individual University, or even a single national library, have in delivering such a platform to researchers? The key requirements for success are:
- Access to all the data which, until the open access transition is complete, individual libraries are not going to have.
- A highly-skilled, fast-moving team of developers, which individual libraries are not going to have, because the rewards in industry are much better.
- Access to large-scale compute and storage resources, which both libraries and companies can rent from the cloud.
- Mind-share, which Google in particular already has over libraries.
First, libraries need to build out their data analysis capacity. Second, they should create customized information environments for researchers.
I am skeptical that this will work to keep University libraries, or even most national libraries, as true research libraries. Nevertheless, I applaud the efforts he is making:
The National Science Library publicly advocates for open access policies, infrastructure, and financial support. NSL is growing its digital repositories. It also helps local libraries analyze research topics, collaboration opportunities, and talent profiles. NSL now plays a role in national digital preservation, assists with strategic decision-making for STEM researchers and enterprises, and is now developing knowledge mapping and research profiling services. These are all things a national library should be doing, but note how difficult they would be for an individual University library. For almost all these libraries, remaining a research library rather than just a generic campus service requires distinguishing themselves from the herd. They aren't going to do that by layering services on top of content to which everyone has access, because they will lose the competition with companies and, Director Zhang hopes, national libraries in those spaces. The only way to do it is to have unique content on which to base unique services. In other words, collections.
As part of the White House’s Open Ebooks initiative, DPLA is calling on librarians and other information professionals to help coordinate books for inclusion in the program to help connect children with ebooks.
We are seeking motivated, engaged community members who have experience with building and organizing children’s and young adult book collections, and who have time to spend building out the first two collections.
What’s involved in being a member of the DPLA Curation Corps? Primarily, enriching metadata to ensure that the best books get connected to their reader. We will be asking participants to help us cull publisher contributions and public domain collections to find the best titles to publish in September 2015 and January 2016. It is important that we not only indicate a title’s reading level, but also its age appropriateness, as well as additional subject headings so that a girl interested in grasshoppers, for instance, can find the right book to meet her need.
If you are interested in helping us connect books to young readers, and you have expertise in this area, please consider being a member of our Ebook Collection Curation Corps. We will announce the first class of Collection Curation Corps on July 18, 2015.
You do NOT have to be affiliated with a current DPLA hub library to participate. Participants will receive a $1,500 stipend.
To find out more about this project go to our White House Open Ebooks initiative page.
Questions about the Ebooks Collection Curation Corps? Email us.
In this series, inspired by the New York Times’ Sunday Routines, we gain a glimpse into the lives of the people behind LITA. This post focuses on Aimee Fifarek, who was recently elected Vice-President/President-Elect.
Aimee is the Customer Service, Technology and Digital Initiatives Deputy Director for Phoenix Public Library in Arizona. She made the move to PPL in April 2013 from Scottsdale Public Library, where she’d worked for 10 years, first as the IT Manager and then later as Senior Manager over IT, Technical Services and Collection Development. Aimee’s typical work week can include everything from contract negotiations to planning technology projects to addressing customer concerns.
WORKING OUT AND CLEANING UP Sundays are days for sleeping in at the South Scottsdale home that Aimee shares with her fiancé Jason Boland. A Senior Trainer for Innovative Interfaces, Jason is often away during the work week for training trips, so the weekends are when most of the chores get done. Laundry gets started before a trip to the gym for Yoga or Step Class, and cleanup of the remnants of a crazy week gets done after – but all that doesn’t start until 7 or 8am.
GREEN THUMB In addition to the inside chores, Aimee and Jason enjoy spending time in the back yard vegetable garden. They built large planter boxes this past year in order to keep the weeds out and give the veggies a chance during the year-round growing season. Squash, carrots, peppers and herbs are frequent thrivers.
BATTER UP! Living as they do in the heart of Spring Training activities, Sundays in March and April frequently involve trips to the many baseball stadiums in the area. Jason is a California native and a devoted Oakland A’s fan. In addition to the A’s, Aimee and Jason try to find time to take in a Milwaukee Brewers game (Wisconsin is Aimee’s home state) or the hometown Arizona Diamondbacks.
WHERE TO? The two spend many weekends traveling. Despite his intense schedule, Jason loves to travel and will happily fly off for a weekend just after returning from a week away for work. Sometimes Aimee is able to meet up with Jason at the end of one of his business trips, like recent trips to Minneapolis and Toronto. She enjoys being able to take advantage of Jason’s frequent flier miles and A-List status.
KP Aimee is just as happy at home, however, especially when spending time in the kitchen. Although they take full advantage of the fabulous restaurants and craft cocktail venues that Scottsdale has to offer, Sundays afford the extra time needed for shopping for and preparing a really good meal. Eating healthy in the Boland-Fifarek household is more about avoiding processed foods and cooking from scratch than counting calories – not to mention using lots of fun gadgets like the sous vide or the garlic chopper. Regardless of what they are preparing there is a 99% chance it will contain garlic.
WORDS AND PLAY Evening calls for a little “couch time.” Jason and Aimee are big fans of Sci-Fi and mystery series and routinely give their DVR a workout. Marvel’s Agents of S.H.I.E.L.D., Orphan Black, and Elementary are particular favorites. This is also a good time to finish the New York Times Sunday Crossword and KenKen before heading off to bed at 9pm or 10pm.
This is the second part to a talk given June 3, 2015, called The User Experience (here’s part one!). The recording is courtesy of Florida Library Webinars, where you can find the video for this talk and many more.
Slides
Show Notes and Snippets
- What is a “heuristic evaluation”?
- The Truth about Carousels and Other Antipatterns by me — Michael Schofield
- UX, consideration, and a CMMI-based Model by Coral Sheldon-Hess
When an organization is well and truly steeped in UX, with total awareness of and buy-in on user-centered thinking, its staff enact those principles, whether they’re facing patrons or not. In short, UX thinking makes a person considerate. Coral Sheldon-Hess
- Heuristic Evaluations in Reverse by Bohyun Kim
- A customer journey map may be the most bang for your buck
Increasingly, the journey often begins online and is punctuated by time and potential disenchantment before the patron even enters the building. Michael Schofield
- X\O Participatory Design by the University of Michigan Libraries UX Department
- A High Functioning Research Website
Libraries are approaching their mobile moment. Whether it’s true for you now, it will definitely be true for you later. … Navigation must flow. The content must be worth it. Everything must be fast. There is no place for bullshit. Michael Schofield
- Make sweeping improvements to the user experience by knowing a little about how people read the web.
- The LibUX Core Content Audit
- LibGuides — How Usable is the Three Column Layout? — on LibUX
Where your patrons spend most of their time influences their basic expectations of the library. Being user-centric demands that we are aware and don’t scoff at the habits our users have. Convention matters. Michael Schofield
- The most relevant trends libraries should watch are in e-commerce.
- When people talk about the size of websites in bytes, think in terms of seconds
- After 10 seconds, there is a missed opportunity. You will never know how many patrons you failed to reach, but the data suggests it’s a lot.
I write the Web for Libraries each week — a newsletter chock-full of data-informed commentary about user experience design, including the bleeding-edge trends and web news I think user-oriented thinkers should know.
Today I found the following resources and bookmarked them on Delicious.
- Sass: Syntactically Awesome Style Sheets Sass is the most mature, stable, and powerful professional grade CSS extension language in the world.
Is your organization interested in taking a leadership role in the digital stewardship community? Consider applying to host and perform secretariat duties for the National Digital Stewardship Alliance.
The application process starts with your organization sending a letter of inquiry (up to one page) to the NDSA Coordinating Committee Chair, Micah Altman (escience [at] mit.edu). Follow-up calls and meetings will be scheduled to start a dialogue with the potential hosts about the goals of their organization, the goals of the NDSA and the mutual benefits in a partnership. These conversations with the Coordinating Committee and the potential hosts will continue over the summer with the hope that a selection is made in early August and the transition is made by the end of September.
Do you have questions about what is entailed in the host/secretariat role? Send an email to the NDSA Chair (escience [at] mit.edu) or send a letter of inquiry stating your organization is interested but wants to learn more. The deadline is June 30, 2015.
The American Library Association (ALA) filed comments with the Chief Officers of State Library Agencies (COSLA) to the Departments of Labor and Education on the Notice of Proposed Rulemaking for the Workforce Innovation and Opportunity Act (WIOA). The comments are now available for review (pdf).
America’s libraries are well positioned to play an essential role in this country’s efforts to assist the workforce. For the first time, libraries will be considered eligible training partners; non-mandatory one-stop partners; models for digital technology; adult education providers and leadership-training grant recipients.
Please continue to follow updates through our District Dispatch. We hope funds will become available soon.
When I graduated from library school, I worried about anti-online-degree bias. I worried that people would think my degree was somehow “less than” because I’d done it fully online. I remember being asked some questions about it at one interview that made the search committee’s biases pretty clear, but the people who eventually hired me seemed to see it as an asset rather than a weakness (mind you, it was for a distance learning librarian position).
That was in 2004. I assumed that 11 years later, people had gotten the message that online courses and online degrees are not necessarily less than, and that the people who go through them can be just as (and in some cases more) qualified as students who did on-site programs. That was until I read this article by Angela Galvan from In the Library With the Lead Pipe:
Hiring Librarians has documented responses from hiring managers claiming students in online programs cannot work in teams or learn effectively, when many students choose online programs for the exact opposite reasons. As with myths about poverty which overshadow the well-established resourcefulness of poor students, online MLIS students are dismissed as asocial and not “team players”… Suggesting online programs lack rigor or cannot result in “real” learning is harmful, technophobic, and helps maintain the whiteness of academic libraries. This attitude favors applicants with the wealth and time to enroll in face to face programs, even though very little of their development as librarians occurs in lecture style, classroom settings.
After reading this, I went down the rabbit hole into the Hiring Librarians site, which interviews people who hire librarians. I think I’d seen some appalling interview on the site a while back in which an interviewee said they wouldn’t hire a woman who didn’t wear makeup. That is beyond deranged and discriminatory in my opinion. But I hadn’t really delved into the site since then. So I saw comments like this:
“I am reluctant to hire online only students. There is an important dynamic missing when one does not have to interact in person with other students and the instructor. I consider those who are currently working in a library as usually a better hire.”
Are there any library schools whose alumni you would be reluctant to hire?
“If program was purely online courses.”
“I feel graduates from online only schools suffer from lack of camaraderie and group study experiences. They do too much learning in a personal vacuum.”
Which library schools give candidates an edge (you prefer candidates from these schools)?
“University of Texas at Austin and other schools who do not rely entirely in online coursework.”
“Prefer someone who attended graduate school in person over and online degree.”
“I will say, and it pains me to say this, but exclusively online programs don’t graduate the same caliber students as those who have at least some on-site matriculation. There’s no substitute for creating relationships in the classroom that you’ll carry with you your entire career.”
“I don’t trust completely online programs.”
“‘Most’ librarians work with people. It is odd to get a degree for that kind of job online…I believe that many folks are graduating that should not…”
I’m amazed that there is this much ignorance in our profession.
I chose to do my degree online at FSU, in spite of getting offered a nearly full-ride to the University of Maryland, for love. I was in a new relationship with a guy I was crazy about, and while I knew it was silly to give up such an offer, I followed my heart. That guy ended up becoming my husband and I shudder to think of what I’d have missed had I chosen the other path. There are all sorts of reasons people choose to do their degree online and it rarely has anything to do with being lazy. People have spouses, kids, infirm parents, jobs, financial limitations, and myriad other things that tie them to a specific place. For the straight-out-of-college student, it is easier to move for school because they usually have fewer things tying them down. I moved to Tallahassee for social work school when I was 22, and it was a great experience. But, for a degree like ours, I really don’t feel like someone will be irreparably harmed by not taking classes face-to-face. So much of what we learn is on-the-job, and I think a library school does a greater disservice to students by not requiring an internship or some sort of work experience in libraries than they do in not requiring face-to-face class attendance.
Like Angela Galvan, I also feel like by saying that you would not want to hire someone with an online degree you are expressing a bias against people who do not have the means or privilege to move to a place with a library school. That feels really wrong to me.
I’ve been teaching for San Jose State University’s online iSchool program for about 7 1/2 years. When I first started teaching in the program it was not all online and most of the students in my online classes came from California. Soon, the program became 100% online and I started teaching students from all over the country, North America, and the world. Another shift was that I found that most of my students were working in libraries and some had more experience than I did. At Portland State, our Access Services Manager was going through the program and, in my current job, our serials librarian is doing it. Both women are amazing, full-of-energy, and experienced in the field. That said, I’ve had a lot of less experienced students in my classes who knocked my socks off and have gone on to do amazing work in libraries.
I’ve been nothing less than blown away by the caliber of students graduating from San Jose State’s program. Yes, in every class I teach there are a few slackers who do the minimum amount of work to make it through the class (or less), but the majority are thoughtful and deeply engaged with what they’re learning. I require blogging in my class, so students do a lot of reflective learning and then have discussions in the blog comments around those reflections. These really become thoughtful asynchronous conversations in many cases. I also usually require group work. (I’m not sure where this assumption that online programs don’t require group work comes from. It was required in my program in the age before collaborative tools like Google Docs.) This semester, I’m requiring more group work than I ever have before (for a three-part, scaffolded project), which I really think is valuable. In our jobs, we do projects with other people with diverse skill sets and levels of motivation and have to make things work. Often a lot of our collaboration happens online, even if we work in the same building (it’s even more challenging at PCC with four campuses/libraries). It seems like a good idea to prepare students for that reality.
I’ve heard lots of negative things over the years about San Jose State’s program from people who have no experience with it. The assumption is that if it’s big, it must be bad. It must be just a diploma mill — churning out degrees willy nilly to unqualified new librarians. While I agree that library schools are churning out too many degreed students vis a vis the job market, this is an across-the-board problem not just limited to one school. I’ve been so impressed with the quality of San Jose State’s program, which is head and shoulders above my experience with the online program at FSU (keep in mind that I got my degree 11 years ago and probably their online program isn’t as dismal as it was then). There is a considerable focus on getting students practical experience and educating them about career options beyond just working in libraries. As an instructor, I’ve been impressed with the level of training and support they offer their instructors (even lowly lecturers like me). I was allowed to dump the LMS and use blogs for my course instead. I use a hosted-by-the-iSchool WordPress Multiuser platform for the class and have been totally supported in that. My course, like all courses, was assessed by a full-time member of the faculty. We’re offered all sorts of professional development and training and, now, each person teaching for the iSchool (even part-timers like me) is required to attend or watch some professional development programming that the iSchool offers each year. The administration is deeply devoted to quality and supporting faculty in supporting students. I have been nothing but impressed and that’s why I continue to teach for them after all these years.
It’s true that it’s more difficult to develop bonds with instructors and other students through fully online programs, though it’s not impossible. I stay in touch with some of my former students and one of my best students ended up getting a job at Portland State! But I don’t have a network of friends from library school nor do I think any of my instructors from FSU remember me. By doing an online program, you also lose out on the ability to work in the library located at that library school, which I know was an important experience for many of my friends who went to schools like UW and UNC. Yes, those are all benefits of face-to-face programs, but are they worth going into substantial debt to quit one’s job and/or uproot one’s family? Probably not for most people who do not have wealth and/or privilege. Do I feel like I’ve been irreparably harmed by not having those connections and experiences? Not at all. I developed a strong network of professional friends through work, online networking, blogging, and service.
My online program was pretty crappy in 2003-2004, but I learned like crazy on the job and was able to achieve a great deal professionally over the past decade. Unlike many of my students, my experience in libraries was extremely limited — 6 months in circulation at a public library and an internship in a university archive — but I was given an opportunity by my colleagues at Norwich who didn’t see my online degree as an indictment of my potential. Whether someone graduated from an online program or a face-to-face one means nothing in the long run. What matters is the skills and passion they bring to their work.
The American Library Association’s (ALA) annual conference is right around the corner, so we here at the DPLA have pulled together a nifty little schedule of talks, panels, and presentations that feature members of our staff, Board, Committees, and Hubs. Sessions involving DPLA staff are marked [S], while sessions involving Board, Committee, or Hub members are marked with [A], for ‘affiliate’.
FEATURED DPLA EVENTS
SATURDAY, JUNE 27
[S] Knight Foundation Grantee Demo Booth, Day #1
2:00 PM – 3:00 PM / Moscone Convention Center, Exhibit Hall, Booth 3731
Connect with DPLA staffers at the Knight Foundation Grantee Demo Booth on Monday, June 29 from 10 AM – 11 AM.
Participants: Dan Cohen (DPLA Executive Director), Emily Gore (DPLA Director for Content), Amy Rudersdorf (DPLA Assistant Director for Content)
SUNDAY, JUNE 28
[S] Knight Foundation Grantee Demo Booth, Day #2
12:00 PM – 1:00 PM / Moscone Convention Center, Exhibit Hall, Booth 3731
Connect with DPLA staffers at the Knight Foundation Grantee Demo Booth on Sunday, June 28 from 12 PM – 1 PM.
Participants: Dan Cohen (DPLA Executive Director), Emily Gore (DPLA Director for Content), Amy Rudersdorf (DPLA Assistant Director for Content)
[S] What’s Next for the Digital Public Library of America
4:30 PM – 5:30 PM / Moscone Convention Center, Room 131 (N)
After a busy first two years of bringing together the collections of America’s libraries, archives, and museums, the Digital Public Library of America is looking ahead to the next few years and some important strategic initiatives. DPLA will seek to complete its national network of hubs, work to simplify and harmonize rights statements, make an effort to improve the landscape for ebooks, and create new technical infrastructure. DPLA staff will briefly detail some of these efforts, and will respond to questions from the audience. There will be ample time to mingle and interact with the staff and others from the DPLA community who are attending the ALA Annual meeting.
Speakers: Dan Cohen (DPLA Executive Director), Emily Gore (DPLA Director for Content), Amy Rudersdorf (DPLA Assistant Director for Content)
MORE DPLA EVENTS
THURSDAY, JUNE 25
[A] CLA Preconference: Relationship-building and Community Engagement
8:30 AM – 4:00 PM / Moscone Convention Center, Room 2010 (W)
Survival of the public library is about relevance, and the key to relevance is engagement. That’s our future. Engagement, with customers, community, stakeholders, partners, and staff, is about people being in relationships. Public libraries need to approach relationships with the confidence that we have something of value to offer, and clarity about what we hope to gain from others that will move our strategic initiatives forward. In this session we’ll explore the various meanings of community engagement, talk about staff engagement, and discuss what it takes to build relationships in both our outward and inward worlds. We’ll hear about strategies for building productive relationships with staff, communities, partners, and stakeholders. We’ll talk about how to rightsize our relationships – recognizing there should be a correlation between the level of effort we put into nurturing relationships and the value we both offer and receive. We’ll discuss how to seek out strategic relationships that align to organizational priorities, and practice having conversations to build relationships in which you might have something to teach, want to learn, or hope to collaborate. Please show up ready to be engaged, interactive, and appreciative of all that is offered, you contribute, and acquire in this daylong session. Our guest speakers are Susan Hildreth, Gary Wasdin, Luis Herrera, and Jan Sanders. Cheryl Gould and Sam McBane Mulford will facilitate the workshop. If you are a member of CLA use special code AFL2015 to receive the price of $219.
Speakers: Cheryl Gould, Gary Wasdin, Jan Sanders, Luis Herrera (City Librarian, San Francisco Public Library, and member of the DPLA Board of Directors), Sam McBane Mulford, Susan Hildreth
FRIDAY, JUNE 26
[A] Looking to the Future: Strategic Foresight and Scenario Planning
9:00 AM – 4:00 PM / Moscone Convention Center, Room 228-230 (S)
Go beyond trend spotting and learn how professional futurists leverage strategic foresight tools and approaches to look and see the big picture of where we are headed. Join our presenters, two consultants trained in Foresight by the University of Houston, as they lead hands-on activities where you will learn tools and techniques you can leverage for creating future scenarios at your own organization.
Speakers: Jamie Hollier (Member of the DPLA Board of Directors), Jen Chang
[A] Building the New Nostalgia: Making the Case for Why Libraries Matter
2:00 PM – 3:00 PM / Marriott Marquis San Francisco, Yerba Buena Salon 13-15
The new book “BiblioTech” argues that libraries are crucial institutions for serving Americans’ 21st century information needs, but that they are also at risk. Librarians and their allies explore how we can best position libraries to thrive in the digital age by leveraging existing—and new—assets amid dwindling government support.
Moderators: Carol Coletta (VP of Community and National Initiatives, John S. and James L. Knight Foundation), John Bracken (VP of Media Innovation, John S. and James L. Knight Foundation)
Speakers: Dale Dougherty (Founder and Executive Chairman, Maker Media), John Palfrey (Head of School, Phillips Academy, and former Chair of DPLA Board of Directors), Meaghan O’Connor (Assistant Director, Programs and Partnerships, District of Columbia Public Library)
SATURDAY, JUNE 27
[A] Herding the fuzzy bits: What do you do after crowdsourcing?
8:30 AM – 10:00 AM / Moscone Convention Center, Room 133 (N)
Once you’ve invited the crowd to help, how do you use what they provide? Presenters will share ideas for incorporating crowdsource-enhanced data from many sources (flickr, transcription, twitter) back into collections, along with approaches–including “whoopsies” and remaining challenges–for quality control, data discovery, data disagreement, building communities, and scalability. The session will be interactive on Twitter locally and remotely, and will include fun activities to demonstrate some of the issues and methods at play. Additional information will be made available in an open online space for viewing and editing during the session at: http://s.si.edu/16ZfZ9b
Speakers: Grace Costantino (Outreach and Communication Manager, Biodiversity Heritage Library, Smithsonian Institution Libraries), Jacqueline Chapman (Digital Collections Librarian, Biodiversity Heritage Library, Smithsonian Institution Libraries), Martin Kalfatovic (Associate Director, Digital Services and Program Director, Biodiversity Heritage Library, Smithsonian Institution Libraries), Suzanne Pilsk (Head, Metadata Unit, Smithsonian Institution Libraries)
[S] Data Clean-up: Let’s Not Sweep it Under the Rug
1:00 PM – 2:30 PM / Moscone Convention Center, Room 2022 (W)
Data migration is inevitable in a world in which technological infrastructures and data standards continue to evolve. Whether you work in a catalog database or a digital library/archives/institutional repository, working with library resource data means that you will eventually be required to usher data from one system or standard to another. Three speakers working in different library contexts will share their data normalization experiences.
Speakers: Amy Rudersdorf (DPLA Assistant Director for Content), Kyle Banerjee (Digital Collections and Metadata Librarian, Oregon Health and Science University), Terry Reese (Associate Professor, Head, Digital Initiatives, Ohio State University)
SUNDAY, JUNE 28
[A] Getting Started with Library Linked Open Data: Lessons from UNLV and NCSU
8:30 AM – 9:30 AM / Moscone Convention Center, Room 2002 (W)
This program will focus on the practical steps involved in creating and publishing linked data including data modeling, data clean up, enhancing the data with links to other data sets, converting the data to various forms of RDF, and publishing the data set. At each step of the process, the speakers will share their experiences and the tools they used to give the audience multiple perspectives on how to approach linked data creation.
Speakers: Cory Lampert (Head, Digital Collections, University of Nevada, Las Vegas, and former DPLA Community Rep), Eric Hanson (Electronic Resources Librarian, North Carolina State University Libraries), Silvia Southwick (Digital Collections Metadata Librarian, University of Nevada, Las Vegas)
[A] How to Work with Government Officials on Community Wide Issues
8:30 AM – 9:30 AM / Moscone Convention Center, Room 2016 (W)
A panel discussion of library leaders and local officials. Discussions will center around the library as an important community partner/leader and how we can lead change in our local community.
Speakers: Hydra Mendoza (Mayor’s Education Policy Advisor and SF Unified School District Board member), Karen Danczak Lyons (Director, Evanston Public Library), Luis Herrera (City Librarian, San Francisco Public Library, and member of the DPLA Board of Directors), Siobhan Reardon (Director, Free Library of Philadelphia), Wally Bobkiewicz (Evanston (Ill.) City Manager)
[A] Transforming Neighborhoods, One Library at a Time: The San Francisco Experience
10:30 AM – 11:30 AM / Moscone Convention Center, Room 121 (N)
The San Francisco Public Library’s Branch Library Improvement Program (BLIP) was the largest capital project in the library’s history and completely transformed the neighborhood branch system. During its 14-year span, the $200 million program undertook the renovation of sixteen neighborhood libraries and the construction of eight new buildings. The completion of the ambitious program resulted in seismically safe, ADA accessible, 21st Century libraries. Each project required strong community engagement in the midst of one of the more politically charged cities in the nation.
Speakers: Charles Higueras, Jewell Gomez, Luis Herrera (City Librarian, San Francisco Public Library, and member of the DPLA Board of Directors), Mindy Linetzky
MONDAY, JUNE 29
[S][A] Digital Archiving for Humans
10:30 AM – 11:30 AM / Moscone Convention Center, Room 120 (N)
Socializing the archive! Hear about the overlaps and differences between traditional archives and digital startups engaging the social web to make material more interoperable, searchable and usable.
Speakers: Alexis Rossi (Director of Web Services, Internet Archive), Anne Wootton (Co-founder & CEO, Pop Up Archive), Dan Cohen (DPLA Executive Director), P. Toby Graham (University Librarian and Associate Provost, University of Georgia Libraries)
The catchy all-encompassing title
Courtesy of Jirka Matousek (2012). Flickr
The title of the program is the catch. It serves as a brief description and hooks the interested party into reading the scope and objectives of the program. When a potential participant is browsing through a list of upcoming workshops from an e-mail, website or course catalog, certain terms/phrases will be the only reason for them to read the course description. “Building a Successful Website” is not as provocative as “Website Management with Google Analytics.” Usually the length of the course name does not make a difference unless it requires two lines. Keep in mind your audience. Busy people are inundated with information. When you’re a member of multiple Listservs, you’ll receive an excessive number of e-mails a day. I personally scan my list of new e-mails for subject lines that interest me, read those, and delete the rest. The title can function as a minor descriptor of what the course entails. It is also a summary of the main objectives of the course. If you’re only going to refer to Google Analytics for fifteen minutes during a two-hour workshop, then don’t put it in the title. Workshop participants will feel that you have wasted their time if you create a misleading description of your course.
Set objectives and goals upfront
The list of objectives can be a deciding factor. Providing a course outline ahead of time is an often overlooked concept. I personally like to pace myself by being aware of which topics will be included and for what length of time. There have been many times when, after receiving the course outline in class, I realize that the topic I was interested in is not being covered or is a small component of the lecture. I feel that workshop coordinators are either still revising their outline or guarding it like a trade secret. A lecture outline, with timetables, is a great resource for the attendee to have upfront and it also works as a time management tool for lecturers to prepare from. It’s a great organization tool for everyone involved.
Make time for questions
For short-term workshops, answer questions after the workshop. Believe it or not, you can easily get off track and end up answering questions instead of meeting your objectives. If you are taking questions during a lecture, don’t hesitate to interrupt in order to get back on track. Asking participants to write down questions on a sheet of paper as they come up, to be answered later, is also a great idea. As a presenter you should be prepared for a cold crowd. Sometimes participants don’t have immediate questions. Ahead of time, make a list of common questions that are asked about the topic. At the end of the workshop, if no one responds to your prompt for questions, be prepared to present those frequently asked questions. Provide your contact information so that they may contact you if they have follow-up questions after the workshop has ended.
The phrase “refreshments will be served” goes a long way
Food…and snacks. Everyone loves free food. Feeding a group of 30 can be pricey. If you charge a nominal fee for attendance it can be like crowd funding for group catering. Most workshops can cost upwards of $200 for attendance. If you charge everyone $10, it will be more inviting to attend and the payment easily covers the cost of catering for a sizable group. Refreshments go a long way and are highly appreciated during workshop breaks. One of the drawbacks of serving heavier food at workshops is that participants will be so busy trying to eat their meal that they won’t have time to mingle. Keep it light and simple. Additionally, when serving food, consider food allergies, vegetarians, vegans and other special diets. In other words, don’t place peanut butter cookies next to the fresh fruit bowl.
There is always room for improvement
Conducting a user survey is one way to gauge the user experience of your attendees. You will want to know if any improvements are needed in terms of the presenter, handouts/materials, technology, seating arrangement, number of breaks, disability/accessibility accommodations, etc. You should also include an option to suggest other course topics they are interested in for the next class.
Rating systems are great, but don’t make them complicated. The goal is improvement, but you don’t want to make the process difficult or you will not receive thorough and complete responses. This would defeat the purpose and effort of conducting surveys. You may want to consider making them anonymous. Getting someone to participate in a survey that will be somehow associated with them may not be an easy task. Anonymity allows everyone to respond honestly without fear of the instructor/coordinators knowing who they are. Consider whether the format of the survey should be web or paper based. Web-based surveys are easy and convenient. They are available for as long as your survey service will allow, and people can complete them quickly when they have time. Paper-based surveys are also effective and can be done in class. The workshop will be the best time to have their undivided attention. Scheduling time at the end of the workshop to conduct the survey is a beneficial option because participants will better recall their experience. Give the participants a reasonable amount of time to complete the survey. If deciding on paper, digitization for long term review is an option, but consider recycling. Years of using paper-based surveys can leave a hefty carbon footprint. The survey should focus on the class and not the instructor. A participant’s review of the class and the instructor will automatically be a reflection of their experience in the class. You can include a few questions about the instructor, but you want a survey that is evaluating the worth of the class.
If you build it, they will come. Do you have unique tips for creating a successful onsite workshop? Please share them in the comments section.
Part er of Amazon crawl..
This item belongs to: data/ol_data.
This item has files of the following types: Data, Data, Metadata, Text
From Emilio Lorenzo, Arvo Consulting
Tecnalia Research & Innovation Foundation, a technological applied-research centre with over 1500 employees, teamed up with Arvo Consulting, to build its Institutional Repository. The main objective was to expose and provide wider visibility to its multidisciplinary research results.
Part cf of Amazon crawl..
This item belongs to: data/ol_data.
This item has files of the following types: Data, Data, Metadata, Text
We do have a bit of a performance challenge with heavy faceting on large result sets in our Solr based Net Archive Search. The usual query speed is < 2 seconds, but if the user requests aggregations based on large result sets, such as all resources from a whole year, processing time jumps to minutes. To get an idea of how bad it is, here’s a chart for response times when faceting on a field with 640M unique values.
Yes, the 80M hits query does take 16 minutes! As outlined in Heuristically correct top-X facets, it seems possible to use sampling to determine the top-X terms of the facet result and then fine count only those terms. The first version of heuristically correct top-X facets has now been implemented (download the latest Sparse faceting WAR to try it out), so it is time for an evaluation.
Three facet fields
For this small scale evaluation we use just a single 900GB shard with 250M documents, generated from harvested web resources. The three fields of interest are
- domain, with 1 value/document and 1.1M unique values. Of these, 230K are only referenced by a single document. The most popular domains are referenced by 4M documents.
Intuitively, domain seems fitting for sampling, with relatively few unique values, not too many single instance values and a high amount of popular domains.
- url, with 1 value/document and 200M unique values. Of these, 185M are only referenced by a single document. The most popular urls are referenced by 65K documents.
Contrary to domain, url seems more problematic to sample, with relatively many unique values, a great deal of single value instances and not very many popular urls.
- links, with 10 values/document and 600M unique values. Of these, 420M are only referenced by a single document. The most popular links are referenced by 8M documents.
In between domain and url is links, with relatively many unique values, but only 10% of the 6 billion references being to single instance values and with a high number of popular links.
Caveat lector: This test should not be seen as authoritative, but rather as an indicator of trade-offs. It was done on a heavily loaded machine, so real-world performance should be better. However, the relative differences in speed should not be too far off (tested ad hoc at a time when the machine was not under heavy load).
Eleven very popular terms were extracted from the general text field and used as query terms, to simulate queries that are heavy in terms of the number of hits.
Term         Hits
og           77M
a            54M
10           50M
to           45M
ikke         40M
søg          33M
denne        25M
også         22M
under        18M
telefon      10M
indkøbskurv  7M
The top 25 terms were requested with facet.limit=25 and sampling was performed by using only part of the result set to update the facet counters. The sampling was controlled by 2 options:
- fraction (facet.sparse.heuristic.fraction=0.xx): How much of the total number of documents to sample. If fraction is 0.01, this means 1% or 0.01*250M = 2.5M documents. Note that these are all the documents, not only the ones in the result set!
- chunks (facet.sparse.heuristic.sample.chunks=xxx): How many chunks to split the sampling in. If chunks is 10 and fraction is 0.01, the 2.5M sample documents will be checked by visiting the first 250K, skipping ahead, visiting another 250K etc. 10 times.
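To make the interplay between fraction and chunks concrete, here is a minimal sketch of which document ranges a chunked sample would visit. It is not the sparse faceting implementation: the function name sample_ranges and the assumption that chunks are spaced evenly across the index are ours, chosen only to reproduce the 250K-per-chunk arithmetic from the example above.

```python
def sample_ranges(total_docs, fraction, chunks):
    """Return (start, end) document ranges, end exclusive, for a chunked sample.

    Assumption (not from the real implementation): chunks are spread
    evenly over the index, each starting total_docs // chunks apart.
    """
    sample_size = round(total_docs * fraction)  # documents to visit in total
    chunk_size = sample_size // chunks          # documents visited per chunk
    stride = total_docs // chunks               # distance between chunk starts
    return [(i * stride, i * stride + chunk_size) for i in range(chunks)]


# 250M documents, fraction 0.01, 10 chunks: ten runs of 250K documents each
ranges = sample_ranges(250_000_000, 0.01, 10)
for start, end in ranges[:3]:
    print(start, end)
```

Raising chunks while keeping fraction fixed visits the same number of documents, but in more and smaller runs spread across the index.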
To get a measure of validity, a full count was performed for each facet with each search term. The results from the sample runs were then compared to the full count, by counting the number of correct terms from the top to the first error. Example: If the fully counted result is
- a (100)
- b (80)
- c (50)
- d (20)
- e (20)
and the sample result is
- a (100)
- b (80)
- c (50)
- e (20)
- f (18)
then the score would be 3. Note that the counts themselves are guaranteed to be correct. Only the terms are unreliable.
Measurements
Facet field domain (1.1M unique values, 1 value/document)
First we sample using half of all documents (sample fraction 0.5), for varying amounts of chunks: c10 means 10 chunks, c10K means 10000 chunks. As facet.limit=25, highest possible validity score is 25. Scores below 10 are marked with red, scores from 10-19 are marked purple.
Term         Hits  c10  c100  c1K  c10K  c100K
og           77M    19     9   25    25     25
a            54M    20     4   25    25     25
10           50M    20     5   25    25     25
to           45M    18    14   25    25     25
ikke         40M    16    15   25    25     25
søg          33M    16    15   23    25     24
denne        25M    17    18   23    24     25
også         22M    17    12   25    25     25
under        18M     4    12   23    23     25
telefon      10M    16     8   23    23     25
indkøbskurv  7M      8     2   16    21     25
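For reference, the validity score (the number of correct terms from the top until the first error) can be sketched as follows. The function name validity_score is ours and this is only an illustration of the scoring rule, not the code used for the test runs.

```python
def validity_score(full_terms, sample_terms):
    """Count matching terms from the top until the first mismatch."""
    score = 0
    for expected, got in zip(full_terms, sample_terms):
        if expected != got:
            break
        score += 1
    return score


# The example from the text: the sample result diverges at position 4
full_count = ["a", "b", "c", "d", "e"]
sampled = ["a", "b", "c", "e", "f"]
print(validity_score(full_count, sampled))  # 3
```

Note that the score is strict: a term that is present in both lists but at a different position still counts as an error.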
Looking at this, it seems that c1K (1000 chunks) is good, except for the last term indkøbskurv, and c10K (10000 chunks) is really good. Alas, sampling with half the data is nearly the full work.
Looking at a sample fraction of 0.01 (1% of total size) is more interesting:
Term         Hits  c10  c100  c1K  c10K  c100K
og           77M     4     9   24    23     25
a            54M     4     4   23    24     25
10           50M     3     4   23    25     20
to           45M     0     0   24    24     24
ikke         40M     5    13   25    24     25
søg          33M     0     0   20    21     25
denne        25M     0     0   18    22     23
også         22M     6    12   23    25     25
under        18M     3     4   22    23     24
telefon      10M     5     7   12    12     25
indkøbskurv  7M      0     1    4    16     23
Here it seems that c10K is good and c100K is really good, using only 1% of the documents for sampling. If we were only interested in the top-10 terms, the over-provisioning call for top-25 would yield valid results for both c10K and c100K, as every score in those two columns is at least 10. If we want all top-25 terms to be correct, over-provisioning to top-50 or something like that should work.
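The over-provisioning argument can be checked directly against the 1% measurements. The sketch below copies the c1K, c10K and c100K columns from the 1% table; the helper name safe_for_top is ours. A column is considered safe for top-N if every query scored at least N correct terms when top-25 was requested.

```python
# Validity scores at sample fraction 0.01, one entry per query term,
# in the order og, a, 10, to, ikke, søg, denne, også, under, telefon, indkøbskurv
scores_1pct = {
    "c1K":   [24, 23, 23, 24, 25, 20, 18, 23, 22, 12, 4],
    "c10K":  [23, 24, 25, 24, 24, 21, 22, 25, 23, 12, 16],
    "c100K": [25, 25, 20, 24, 25, 25, 23, 25, 24, 25, 23],
}


def safe_for_top(scores, wanted):
    """True if every query got at least `wanted` correct terms from the top."""
    return min(scores) >= wanted


for name, scores in scores_1pct.items():
    print(name, "worst score:", min(scores), "safe for top-10:", safe_for_top(scores, 10))
```

With these numbers c10K and c100K pass the top-10 check, while c1K fails because of the last query term.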
The results are viable, even with a 1% sample size, provided that the number of chunks is high enough. So how fast is it to perform heuristic faceting, as opposed to full count?
The blue line represents the standard full counting faceting, no sampling. It grows linearly with result size, with the worst case being 14 seconds. Sample based counting (all the other lines) also grows linearly, but with the worst case at 2 seconds. Furthermore, the speed difference between the number of chunks is so small that choosing 100K chunks, and thereby the best chance of getting viable results, is not a problem.
In short: Heuristic faceting on the domain field for large result sets is 4-7 times faster than standard counting, with a high degree of viability.
Facet field url (200M unique values, 1 value/document)
The speed up is a modest 2-4 times for the url field, but worse, the viability is low, even when using 100000 chunks. Raising the minimum result set size for heuristic faceting to 20M hits could conceivably work, but the url field still seems a poor fit. Considering that the url field does not have very many recurring values, this is not too surprising.
Facet field links (600M unique values, 10 values/document)
The heuristic viability of the links field is just as good as with the domain field: As long as the number of chunks is above 1000, sampling with 1% yields great results. The performance is 10-30 times that of standard counting. This means that the links field is an exceptionally good fit for heuristic faceting.
Removing the full count from the chart above reveals that the worst case in this setup is 22 seconds. Not bad for a result set of 77M documents, each with 10 references to any of 600M values.
Summary
Heuristically correct faceting for large result sets allows us to reduce the runtime of our heaviest queries by an order of magnitude. Viability and relative performance are heavily dictated by the term count distribution for the concrete fields (the url field was a poor fit) and by cardinality. Anyone considering heuristic faceting should test viability on their corpus before enabling it.
Word of caution
Heuristic faceting as part of Solr sparse faceting is very new and not tested in production. It is also somewhat rough around the edges; simple features such as automatic over-provisioning have not been implemented yet.
Today I found the following resources and bookmarked them on Delicious.
- explainshell.com Write down a command-line to see the help text that matches each argument
Time: 6:00 – 7:30 pm, Friday, June 26, 2015
Place: Marriott Marquis (map)
Room: Pacific H, capacity: 30
The MarcEdit user community is large and diverse and honestly, I get to meet far too few community members. This meeting has been put together to give members of the community a chance to come together and talk about the development road map, hear about the work to port MarcEdit to the Mac, and give me an opportunity to hear from the community. I’ll talk about future work and areas of potential partnership, and hear from you what you’d like to see in the program to make your metadata lives a little easier. If this sounds interesting to you, I really hope to see you there.
Acknowledgements:
A *big* thank you to John Chapman and OCLC for allowing this to happen. As folks might guess, finding space at ALA can be a challenging and expensive endeavor, so when I originally broached the idea with OCLC, I had pretty low expectations. But they truly went above and beyond any reasonable expectation, working with the hotel and ALA so this meeting could take place. And while they didn’t ask for it, they have my personal thanks and gratitude. If you can attend the event, or heck, wish you could have but your schedule made it impossible, make sure you let OCLC know that this was appreciated.