Feed aggregator

Cynthia Ng: Mozilla Festival 2015 Day 1: Opening Keynotes

planet code4lib - Sat, 2015-11-07 10:31
The first day of Mozilla Festival started off with a MozFest magic carpet ride talk from @amirad and how to make the most of MozFest before moving on to the keynotes. Mark Surman Small history of MozFest. The proto-Mozilla festival in 2010, called Mozilla Drumbeat with a hackfest and with tents out in the museum … Continue reading Mozilla Festival 2015 Day 1: Opening Keynotes

District Dispatch: Librarian of Congress now term limited

planet code4lib - Fri, 2015-11-06 22:23

With the stroke of a pen, the President has established for the first time a set term of office for the Librarian of Congress. Rather than serve for life, the next and all future Librarians will enjoy a 10-year term of office renewable for the same length of time upon reconfirmation by the Senate. Legislation authorizing the change, the Librarian of Congress Succession Modernization Act of 2015 (S. 2162), was both introduced and passed in the Senate by unanimous consent on October 7. It was again approved by unanimous consent of the House less than two weeks later. The Librarian’s position remains vacant in the wake of James Billington’s resignation on September 30th. A successor has not yet been named. ALA has urged the President to appoint a credentialed librarian.

The post Librarian of Congress now term limited appeared first on District Dispatch.

FOSS4Lib Upcoming Events: Mid-Atlantic Fedora Users Group

planet code4lib - Fri, 2015-11-06 18:17
Date: Monday, November 30, 2015 - 08:00 to Tuesday, December 1, 2015 - 17:00
Supports: Fedora Repository, Hydra, Islandora

Last updated November 6, 2015. Created by Peter Murray on November 6, 2015.

From the website:
Our inaugural meeting will be a two-day event at the Chemical Heritage Foundation in Philadelphia, PA on Nov. 30th – Dec. 1st, 2015. The schedule offers a combination of community presentations, hands-on training with Fedora 4, a Hydra-in-a-Box focus group, and opportunities for self-organizing breakout sessions and workshops.

District Dispatch: DOL grants offer a window of opportunity

planet code4lib - Fri, 2015-11-06 14:32

The U.S. Department of Labor’s Employment and Training Administration is seeking grant applications from community organizations, including libraries.

Libraries are already in the business of serving their communities. Librarians may not be aware, however, that the U.S. Department of Labor (DOL) offers a wide variety of grants to organizations and other entities, such as libraries, that are able to match up their resources with the variety of DOL training, educational and employment programs aimed to prepare and equip the workforce for the 21st Century.

DOL’s Employment and Training Administration web portal offers a wealth of grant opportunities. There you’ll find a wide range of programs listed, many of which provide multi-million dollar grants. The nature of grant opportunities varies greatly, from programs to help give veterans a leg-up as they prepare after active duty to re-enter the workforce, to those that connect unemployed youth with training and certification programs to better arm them with the skills to successfully compete for jobs. Others seek data to better inform the government about those out of the workforce seeking employment.

The website also features a section that answers common questions about the grant application process, as well as tips toward a successful grant application. It also shares examples of grants awarded, as well as “how to” on filling out the financial information correctly.  It also provides guidance on how to make sure you include all the elements required for a complete proposal. Take a moment to check it out…you might find a window to possible resources that align with the knowledge, experience and strengths you can bring to bear.

The post DOL grants offer a window of opportunity appeared first on District Dispatch.

Journal of Web Librarianship: A Review of "Rightsizing the Academic Library Collection"

planet code4lib - Fri, 2015-11-06 01:09
Bradford Lee Eden

Journal of Web Librarianship: A Review of "Social Media for Creative Libraries"

planet code4lib - Fri, 2015-11-06 01:08
Robert J. Vander Hart

OCLC Dev Network: November 8 System Maintenance

planet code4lib - Thu, 2015-11-05 20:30

Scheduled maintenance affecting WSKey will occur on 11/8/2015 from 9:00 pm to 10:00 pm EST.

David Rosenthal: Cloud computing; Threat or Menace?

planet code4lib - Thu, 2015-11-05 16:00
Back in May The Economist hosted a debate on cloud computing:
Big companies have embraced the cloud more slowly than expected. Some are holding back because of the cost. Others are wary of entrusting sensitive data to another firm’s servers. Should companies be doing most of their computing in the cloud?
It was sponsored by Microsoft, who larded it with typical cloud marketing happy-talk such as:
The Microsoft Cloud creates technology that becomes essential but invisible, to help you build something amazing. Microsoft Azure empowers organizations with the creation of innovative apps. Dynamics CRM helps companies market smarter and more effectively, while Office 365 enables employees to work from virtually anywhere on any device. So whether you need on-demand scalability, real-time data insights, or technology to connect your people, the Microsoft Cloud is designed to empower your business, allowing you to do more and achieve more.
Below the fold, some discussion of actual content.

Arguing "yes" was Simon Crosby, and "no" was Bruce Schneier, who also posted a three part essay on his blog. Crosby's opening statement for the "yes" side starts:
Running a given computing workload in the cloud, rather than on a company’s own information-technology (IT) infrastructure, yields little or no cost advantage today.
Schneier's for the "no" side starts:
The economics of cloud computing are compelling. For companies, the lower operating costs, the lack of capital expenditure, the ability to quickly scale and the ability to outsource maintenance are just some of the benefits.
Schneier ends by saying:
In the future, we will do all our computing in the cloud: both commodity computing and computing that requires personalised expertise. But this future will only come to pass when we manage to create trust in the cloud.
So even Schneier on the "no" side thinks that the cloud is inevitable, but he zeroes in on the key question: why should anyone trust the cloud? He identifies the key areas in which trust is currently lacking:
  • Control: "Cloud computing is cheaper because of economics of scale, and—like any outsourced task—you tend to get what you get." The result is limited scope for customization. And, as Backblaze demonstrates, you don't have to be very big to get most of the economies. And, remember, with cloud services such as Amazon's, you aren't getting all the economies of scale, just the part left over after Amazon's margins.
  • Security: Crosby writes "Today’s IT infrastructure is a Swiss cheese of vulnerable networks, operating systems and applications developed before the internet. It is difficult and expensive to keep running—and easy to penetrate. In 2014 Verizon reported more than 2,100 data breaches." Schneier admits that "For most companies, the cloud provider is likely to have better security than them—by a lot. All but the largest companies benefit from the concentration of security expertise at the cloud provider." But he points out that "a large cloud provider is a juicier target. Whether or not this matters depends on your threat profile. Criminals already steal far more credit-card numbers than they can monetise; they are more likely to go after the smaller, less-defended networks. But a national intelligence agency will prefer the one-stop shop a cloud provider affords. That is why the National Security Agency (NSA) broke into Google’s data centres."
  • Accountability: Schneier calls this area "trust" but I think accountability describes it better. He writes: "I know that, at least in America, [cloud providers] can sell my data at will and disclose it to whomever they want. It can be made public inadvertently by their lax security. My government can get access to it without a warrant." And he points out "Try asking either Amazon Web Services or to see the details of their security arrangements, or even to indemnify you for data breaches on their networks."
Ludwig Siegele, the moderator, summed things up:
Simon Crosby did a great job in explaining the business imperatives for moving into the cloud. Bruce Schneier convincingly laid out the reasons why many firms will take their time to make that step: they do not feel entirely comfortable with living in the computing skies.
He is right. It was a good debate and worth reading, because both sides made good arguments about general business use of the cloud. I'm still strongly of the opinion that, for digital preservation (PDF), the cloud can at most be one component of a hybrid system. I'm sorry it took me so long to get around to blogging about it.

DPLA: DPLA Announces Appointment of Sarah Burnes to Board of Directors

planet code4lib - Thu, 2015-11-05 15:50

The Digital Public Library of America is pleased to announce the appointment of Sarah Burnes to its Board of Directors. Burnes is an agent for The Gernert Company, a prominent literary agency located in New York City.

After stints in the editorial departments of Houghton Mifflin, the Knopf group, and Little, Brown, Sarah Burnes became an agent in 2001. Joining The Gernert Company in 2005, she now represents adult fiction writers (Alice McDermott and Tony Earley among them); children’s fiction writers (New York Times bestsellers Margaret Stohl and Pseudonymous Bosch); and journalists and critics (New York Times Magazine contributor Jon Gertner and Freeman’s John Freeman). The awards her writers have either won or been shortlisted for include the National Book Award, the Pulitzer Prize, the Story Prize, the Los Angeles Times First Book Prize, the Whiting Writer’s Award, and the Barnes and Noble Discover Award; and they have received grants and fellowships from the Guggenheim Foundation, the Lannan Foundation, and the National Endowment for the Arts, among others. Sarah also sits on the board of the non-profit progressive publisher The New Press and lives with her husband and three children in Brooklyn, NY.

“Sarah’s deep knowledge of authors and publishing coupled with her commitment to learning and reading makes her an ideal board member for the DPLA,” said Amy Ryan, DPLA Board Chair. “We welcome her ideas, expertise and vision as DPLA expands our vision to reach out to children throughout the country.”

“Sarah’s incredible experience, intelligence, and the way she understands and connects with authors make her a wonderful addition to the DPLA board,” said executive director Dan Cohen. “She will undoubtedly help us make sense of the changing landscape for writers and their readers.”

Working closely with Cohen, the Board seeks to fulfill DPLA’s broad commitment to openness, inclusiveness, and accessibility, and it endeavors towards those ends in the best interest of its stakeholders, employees, future users, and other affected parties. The Board supports the DPLA’s goal of creating and maintaining a free, open, and sustainable national digital library resource.

LITA: Follow Up Post to: Is Technology Bringing in More Skillful Male Librarians?

planet code4lib - Thu, 2015-11-05 14:48

My main motive for my recent post was to generate discussion on the topic of stereotypes of male librarians, technology, and our profession. It can get lonely as a writer when you do not have an exchange with readers. It was not meant to be an opinion piece. I wanted to move away from posting a technology review or sharing something I tried at my library. I wanted to present information I found while reading. These negative views of our profession are alive and well in our society – to not write about it is to sweep it under the rug.

It may be an exploration of my own experience. I live it every day. I am a 40-year-old male librarian who fits the stereotype, and all these stereotypical elements point to someone who is less than. When I tell someone that I am a librarian, I get the “you must read a lot” comment, which insinuates that my job is not that important if I am leisurely reading passively. Or that librarianship is a “women’s profession” and not worthy of respect. Or that I could not make it in a more stressful, rigorous career environment, so librarianship became my default. Being a librarian was my first choice and I continue to love this profession. Only recently have I seen a shift in reactions, since I work at a College of Medicine. Since medicine has a higher reputation, I get some more respect and awe. I am a father and married to my lovely wife, and I hold the opinion that our sexuality is fluid and not a box you can check off. I do not follow or play sports. I am not a manly man. I love to read and consider myself scholarly. I wear thick plastic glasses on purpose and did before the fad and will continue after the fad fades. I am categorized as brown or colored in some parts of the nation. All these elements make me less than in society’s eyes.

These are elements that affect the way we are perceived, affecting our salaries, seat at tables, and, most importantly, the level of respect our profession receives from the outside world.


I do recommend reading this month’s article in American Libraries magazine, The Stereotype Stereotype: Our Obsession with Librarian Representation, which goes into the topic further.

Coral Sheldon-Hess: On chronic illness (and other disabilities) as perceived imposition

planet code4lib - Wed, 2015-11-04 23:21

This is another entry in a series of posts on the mental aspects of chronic illness (for me; I speak only for myself). The image above (which says “You’re not a burden. You’re a human.”) is available as a card from Emily McDowell.

We have this ideal, in American (western?) society, of a “low maintenance” person, and it feels to me like this ideal is placed especially heavily on women’s shoulders.* We should be easygoing, never complaining; and whatever is offered to us should always be enough, should be accepted with gratitude. We must never impose on others. Being properly low-maintenance seems to also require that, if one suffers, they do so in silence. (OK, that one is applied equally across genders, I think.)

It’s a little bit Puritan and a lot … whatever you call the school of thought that yearns for the “good old days” and just wants things to be simple and straightforward. (Never mind that the past was not as simple or as great as these nostalgic folks seem to believe.) Maybe it’s conservative? Or regional? But wherever it comes from, it suffused so many of my childhood teachings that it became a large part of my behavior and preconceptions of others’ behavior well into adulthood.

I think I’m having trouble describing it, because it was like water to a goldfish for me, for a long time. Parts of it probably still are, as I’ll describe below.

The thing I should make clear, at this point, is that anybody who has a disability of any kind is, kind of by definition, “high maintenance.” We have to do a lot of extra work (maintenance) to keep our bodies functioning. We sometimes (often? always?) need different affordances than people who don’t have disabilities, and if you think of able-bodied people as default, then it’s easy for you to think of our needs as impositions.

And the world around me has always thought of able-bodied people (and people without illnesses or dietary restrictions or allergies) as default. For a long time, I’m ashamed to admit, so did I.

In some ways, I still do it—I treat my disabilities as impositions, as something I should apologize for—it’s hard to leave a worldview entirely behind, right? I take an apologetic stance about everything from my inability to wear most shoes (not just high heels, but most flats and even many tennis shoes) to my lack of energy for attending every possible event I might be invited to. (, why do you not have a “maybe” option? Seriously.)

Although I’ve only recently come to think of it in these terms, avoiding being labeled “high maintenance,” or an imposition, due to a disability has been a life-long struggle for me, because I have really, really bad pet (cat, dog, rabbit) allergies—like, serious asthma attack bad. Countless times, I have apologized profusely to people when I couldn’t spend time at their houses, because their pets make me physically ill, or—this is how serious this social conditioning was—I have gone anyway and felt miserable and excused myself to another room when I needed to use my inhaler, so that I would seem like less trouble. I have felt—and many people have subtly made it clear that they agreed—that my disability** was my fault and an imposition on them. Even my current landlord, no joke, put the word “allergies” in scare-quotes in an email and acted like my spouse and I were trying to swindle her when we asked her to remove the pet-stained carpet in the house we’re renting, because it smelled like dogs and was full of dander and was making us sick. (She told us she would replace all the carpets before we moved in. We didn’t get that in writing, so everything is horrible.)

Maybe pet allergies are a special thing; pets are basically family members, and here I am telling people that I can’t be in their houses because of them. I’m sure that sounds a lot like blaming the pets, or not liking them, and I get how that could make someone angry. In reality, I love cats and dogs (and rabbits and birds and lizards and hedgehogs and pretty much all animals, really), but, I admit, not being able to spend time near certain animals is a real damper on my exuberance.

So let’s talk about food, instead of pets. Maybe that’s closer to universal. (I’ll get to arthritis in a bit.)

Maybe this is a weird example, but although I remember practically nothing about the movie “Twister,” all these years later, I still remember how my gut wrenched at this scene, because I was vegetarian at the time (against my family’s wishes) and because I fell for the (rather good) evocation of simplicity and of family and friends-who-grow-into-a-chosen-family and of home and of … just lots of good things that I felt I couldn’t be a part of, because I didn’t eat steak; the lady who raised her eyebrows at the steak was so clearly an outsider, in the scene, and that hurt.

The thing is, vegetarianism was, for me, a choice; since then, although I have gone back to eating meat, I have also developed quite a few food allergies and sensitivities, so I’m in the same boat for a different reason. I imagine that clip would have hurt worse, when I was a kid, if I’d had an allergy to one of those foods.

(I should say, I’m over it now. Happily, I am far less credulous than I was as a teenager. Honestly, it’s a weird thing for me to remember. I should also say, as far as I know, vegetarianism is not ever really a disability, though there are people with allergies to various meats. Still, although neither is a disability, I believe vegetarianism and veganism are things we should respect, in the interests of inclusion.)

This is from 2013, and it boils my blood.

For a more recent example, check out this cartoon. Now, I get that the goal of this is more to skewer liberal parenting, or something, than kids with food allergies. But imagine you are a kid with food allergies, and you see this. Or not even a kid; this one got my goat, and I’m fairly information literate and used to awful rhetoric. Imagine how much it would hurt to have your legitimate needs—”if I eat this and don’t get to a hospital in time it will probably kill me”—lumped in with clearly ridiculous demands like “gender-neutral candy” and “caramel-phobia.”

The cartoon artist should be ashamed: picking on people with food allergies, even just catching them in the spray of some other social commentary, is kind of punching down instead of up, and picking on children is [figuratively] punching literally down. Children with food allergies should be protected, never made fun of.


As far as the arthritis goes, my main feelings of imposition—and subsequent apologies—have mostly had to do with not having the energy to be social. Every now and then there’s awkwardness about not being able to sit on the ground. (To be clear, I actually can sit on the ground. But standing up from it, with an arthritic wrist and knees, can be a challenge, and some days I’m not willing to risk it.) I don’t think I’ve ever actually apologized for that, but sometimes I want to, like when I’m visiting with my little niece and nephew.

And then there are weird one-off things. Like, I had an interview, a while back, and a few days before it happened they sent me the schedule, which had two building tours on it; it didn’t occur to them to ask if I needed accommodation, so I ended up writing a kind of awkward email, asking if I could wear sneakers for the tours. I didn’t disclose a disability, just referred to the foot issue as “an injury” (which is true enough; 3 years ago, I injured it, and because of arthritis it never healed). They were super cool about it, but part of me wonders if that affected my chances at the job. (I didn’t get it. It’s OK. It wasn’t Pittsburgh-based, and I am glad I live here now.)

Or, I don’t know, I went to a corn maze with some friends and almost couldn’t climb the hay bale “steps” into the top of the barn, where there was a huge slide. (I’m glad I made it; it was fun!) I don’t remember if I apologized, but it was awkward, because I also don’t think they knew I had arthritis, at that point. So maybe they just thought I was very out of shape? Again, nobody was mean about it—we’re all still friends—but I felt awkward and misunderstood.

But that’s kind of what it always comes down to: lots of times I either have to disclose my disability, or else offer some alternative explanation for why I am falling outside of people’s expectations for “normalcy.” And with that explanation, by default, comes an apology, because I’m worried people will feel imposed upon. Even friends. Even family.

These attitudes that privilege those who have the luxury of being “low maintenance,” that treat disabilities and differences of any kind, including allergies, as impositions? They are incredibly harmful.

Please understand that there are people with chronic illnesses, disabilities, depression, allergies, autism, etc.—there are lots of people who might fall outside of your defaults.

There are too many examples of people treating others’ disabilities as inconveniences or added costs. (On that last link: I’m not slamming Nina for writing that very good post; I’m slamming the folks who ask the question she was answering.)

So perhaps when I say that I am apologetic when I shouldn’t be, that isn’t quite right. I am apologetic when I shouldn’t have to be, but when most of the implicit signals I receive, both day to day and in the moment, suggest that I should be. Still, it’s a habit I’m trying to break.


* Maybe it’s worse where I grew up, or in my family. Maybe other people don’t feel this as strongly as I do, and it’s blown out of proportion in my head?

** The Americans with Disabilities Act includes allergies in its protections. If you have a library cat, or your office allows dogs other than service animals, maybe think about who you’re excluding. Legally, you can be asked to change that policy. Personally, as someone who is in danger if I go somewhere full of cat or dog dander, I am asking you. Please.

Patrick Hochstenbach: Jakarta Street View

planet code4lib - Wed, 2015-11-04 21:25
Filed under: Doodles, Sketchbook Tagged: copic, doodle, doodles, jakarta, Photoshop, sketchbook, urbansketching

LITA: Jobs in Information Technology: November 4, 2015

planet code4lib - Wed, 2015-11-04 19:53

New vacancy listings are posted weekly on Wednesday at approximately 12 noon Central Time. They appear under New This Week and under the appropriate regional listing. Postings remain on the LITA Job Site for a minimum of four weeks.

New This Week:

Serials Librarian, University of Arkansas, Fayetteville, AR

Head, Technology Systems and Support Services, Massachusetts Institute of Technology, Cambridge, MA

Vice President for Libraries & Information Technology Services, CUNY Queens College, Flushing, NY

Visit the LITA Job Site for more available jobs and for information on submitting a job posting.

NYPL Labs: Emigrant City: An Introduction

planet code4lib - Wed, 2015-11-04 17:09

NYPL Labs and the Irma and Paul Milstein Division of United States History, Local History and Genealogy are excited to announce the launch of Emigrant City, the Library's newest, online participatory project. Emigrant City invites you to help transcribe recently digitized mortgage and bond record books from the Library’s collection of Emigrant Savings Bank records. Your transcriptions will help make the materials digitally accessible to all, including genealogists, educators, historians. In the process, you'll get a detailed glimpse of real estate transactions and immigrant life during a foundational period of New York City's history. Help the Library build this exciting new resource!

Still operating today, Emigrant Bank is the oldest savings bank in New York City and the ninth-largest privately owned bank in the country. It was founded in 1850 by 18 members of the Irish Emigrant Society with the goal of serving the needs of the immigrant community in New York. NYPL’s Manuscripts and Archives Division houses the Library’s collection of the early records from the bank. The collection’s first mortgage record is dated February 20, 1851. From the mid-19th century through the 1920s, there are an estimated 6,400 mortgages, each telling a story of upward mobility in a rapidly expanding city. (Two of these stories are found in mortgages 1 and 87, belonging to Francis A. Kipp and Mary O'Connor, respectively. Stay tuned for a forthcoming blog post detailing their stories.)

These real estate records have remained largely invisible and difficult to search. However, the full Emigrant Savings Bank collection is frequently consulted by genealogists and historians, among others. This collection contains a wide variety of materials about the bank's depositors and borrowers, including minutes of the board of trustees and finance committee. Portions of this larger collection, the test books,  have even been digitized and made available through (This resource is available onsite at all NYPL locations.) Through digitization of the real estate records, and transcribing the hand-written information they contain, we hope to expose this underused portion of the collection to enable new discoveries and research.

Emigrant City is also an experiment. Digitizing materials is much more than simply creating a digital image of a manuscript or artifact. Though computers have made fantastic advances in automatically converting digitized pages into searchable text, vast troves of information exist in libraries and archives that require careful human labor to unlock their deeper contents to search engines and digital researchers. So here at NYPL Labs, we’ve been working with the citizen science mavens at Zooniverse, with generous support from the National Endowment for the Humanities, to prototype a highly configurable crowdsourcing framework called Scribe that could be used on a wide range of historical and archival material.

Emigrant City joins a collection of crowdsourcing projects launched by NYPL in recent years, including Building Inspector and What's on the Menu? Go to to get started! There are lots of records to go through, and when finished, we’ll have a robust data-set of verified, structured data. Meanwhile, the team is working to create browsing and bulk download options for this wealth of information. With the growing data set, we’ll be able to find myriad stories, like those of Francis Kipp and Mary O'Connor, and to ask innumerable questions. It’s a lot of work, but we’re confident we can do it. Join us!

Roy Tennant: EBook Reader Ownership Falls. Duh.

planet code4lib - Wed, 2015-11-04 16:56

A Pew Research Center survey has discovered something that some might be surprised to read: “Today, about one-in-five adults (19%) report owning an e-reader, while in early 2014 that share was a third (32%).” This is quite a notable drop, especially considering that MP3 player ownership has dropped only slightly in the same period. One could argue that a smartphone is an excellent replacement for an MP3 player, but is a less than satisfactory replacement for an ebook reader.

Tablet ownership (45% of U.S. adults) is much higher than that of ebook readers, but of course as we all know tablets can be quite serviceable ebook readers. In fact, I called single purpose devices dead upon the arrival of Apple’s iPad, and although it has taken much longer than I expected, the trend surfaced by Pew seems to bear out that prediction. I know that I happily use my iPad as an e-reader, and I know that many others do as well.

The fact that I can also stream video, play music, do online banking, surf the web, etc. makes an ebook reader begin to sound like the brick that more people are discovering that it is.

Islandora: Dispatches from the User List: Embedding objects, CPU loads, and ingest performance improvements

planet code4lib - Wed, 2015-11-04 15:21

The Islandora listserv is a great place to get help, let the community know about interesting things you're working on, and seek collaborators. This week (as we have done in the past), we'll take a quick look at some conversations going on that more people should know about.

Embedding Islandora content in other sites is a use case brought up by Jennifer Eustis from the University of Connecticut. They have users who might like to grab an Islandora object, copy a line of code, and plop it into other sites running on different platforms, just as you can do with a YouTube video or a Google map. Turns out there's been a JIRA ticket for this feature since June, where Nick Ruest notes that the University of Oklahoma has already started some work, using oEmbed and the Drupal oEmbed module, which could be generalized for the Islandora community. If it's something you might want to use too, chime in and add your use case. Certainly there are multiple ways of tackling this need - Donald Moses and Paul Pound from UPEI built their own Islandora Video Filter module to accomplish site-wide embedding of videos.

The thread CPU Load was Went up sharply! contains some brilliant troubleshooting by University of North Carolina Charlotte's Brad Spry, after a user reported problems with CPU load for an Islandora site that had to support 10,000 users. The original problem is still under investigation with further support from some other volunteer troubleshooters, but Brad's tools and methods for diagnosing and alleviating server load issues have broad applications for other sites and are well worth exploring if you have experienced similar issues.

In another example of volunteer troubleshooters being awesome, Eric Koester from Andrews University went to the listserv to get advice on options to improve overall ingest timeframes, and Diego Pino and Brad Spry delivered some options that are worth just quoting here:


First: RAM. Derivatives and ingestion of binaries are memory consuming. The more fine-tuned you have your Java env, the more speed you will get. @Brad Spry has deep knowledge on this.

Second: Logging. Generating logs is good for debugging and understanding what is happening, but if you already have everything tuned, working, and tested, my experience is that overly fine-grained logs for Fedora, GSearch, Solr and Catalina will also add some ingestion time.

Third: If you disable GSearch (even ActiveMQ if needed) during massive ingestion, re-enable it afterwards and do the reindexing manually, a speed-up is gained also. Same for derivatives: good idea to do them offline.

But there are also other options here:

a.1) You can batch ingest only metadata first, then put together a script for completing the binary datastreams (keeping track of the PIDs) using the Fedora client (look at a.2 for ideas).

a.2) @Giancarlo Birello has some good info on batch ingesting (using tools external to Islandora). They have a lot of books and they do derivatives outside Islandora.

a.3) They also have a Taverna workflow.

b) Fedora allows read-only replication. This is very useful because you can have a master that gets the ingestion and some "clones - slaves" that serve (using a journaling system) read-only access to the outside world. Since the slaves get all the ActiveMQ messages, they also do GSearch indexing.

c) You can also easily (but time-consumingly) rebuild a parallel Fedora server using only the object store (Akubra or the legacy one) by shutting down one Fedora, copying that folder to another Fedora, rebuilding, and starting. You can copy ActiveMQ messages still waiting to be processed if you want, but I think in your case b) is more optimal.

Also, another way, the way we do things, is to have multiple repos acting as "one to the public" by sharing a common Solr collection using, e.g., SolrCloud. So you can split your work across different servers and expose at least global search via a common search.

Lastly, but very important: it's good to take Fedora 4 and Islandora 2 into consideration. Fedora 4 resolves a lot of the issues regarding distributed scenarios and concurrent ingesting, and @Daniel Lamb has come up with some very interesting implementations based on Camel and also directly on PHP (Chullo) to manage your problems. We are at a development stage where use cases and of course involvement (developers from the community are very much needed) are a must, so I encourage you to get involved.


One performance tip: If you place your object upload location in close proximity to Drupal's temp and Fedora's temp, some operations, like ingest file operations, can happen on the same drive instead of having to copy files across the system bus between multiple drives.

This particular issue has a noticeable effect on derivative generation performance:

...but I'm shifting my hope to Islandora 2.0 and Fedora 4.0 for ultimately resolving that issue.   Every Islandorian shares the same desire for the very best ingest and derivative generation performance!  

There are also some very advanced Islandora implementations, like Diego mentioned, which background and offload derivative generation processes.  You can read more about the characteristics of such a configuration here:

To your question of creating a fleet of Islandora boxes for simultaneous Fedora ingest, that is an intriguing possibility...  If each system could utilize the same MySQL and same filesystem, it sounds feasible; it certainly inspires curiosity :-)    On AWS, one can use RDS for centralized MySQL and EFS for a true shared filesystem, but EFS is still in preview mode and not released for production, YET.    S3 is not appropriate for Fedora's objectStore and resourceIndex, this much I learned the hard way.    But an autoscaling fleet of ingest servers has a definite appeal, for sure :-)

It is absolutely possible to have a single master "ingestion" box and then copy the results to a live production server at night; that pretty much describes my current implementation.   I have such a strategy with a built-in safety mechanism, which only allows a full sync (the Tomcat side) to happen if NO ingest or BagIt writing operations are detected:

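# Count process lines that mention drush; the grep itself may appear in the
# list, so a count of 0 or 1 means no drush job is running.
drush_ready=$(ps aux 2>/dev/null | grep drush 2>/dev/null | wc -l)
# Count files on the loading dock that are open for writing (w) or read/write (u).
loadingdock_ready=$(/usr/bin/lsof /mnt/island1-loadingdock | grep -e "[[:digit:]]\+[wu]\{1\}" | wc -l)

# && binds tighter than || in shell arithmetic, so the drush test is grouped explicitly.
if (( (drush_ready == 0 || drush_ready == 1) && loadingdock_ready == 0 )); then
    # full sync
    :
fi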
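
The check above boils down to two counts and one decision, which can be easier to test when pulled into small functions. The following is only a sketch under assumptions: the function names `count_write_handles` and `safe_to_full_sync` are hypothetical and not part of the original setup, and the awk filter assumes lsof's default column layout, where the file descriptor is the fourth field.

```shell
# Sketch only: both function names are hypothetical.

# Read lsof-style output on stdin and count rows whose FD column (4th field)
# is a number followed by w (write) or u (read/write).
count_write_handles() {
    awk '$4 ~ /^[0-9]+[wu]/ { n++ } END { print n + 0 }'
}

# Succeed (exit status 0) only when at most one "drush" line was counted
# and no file is open for writing.
# $1: drush line count, $2: open-for-write handle count
safe_to_full_sync() {
    [ "$1" -le 1 ] && [ "$2" -eq 0 ]
}
```

It could be wired up as `safe_to_full_sync "$drush_ready" "$(/usr/bin/lsof /mnt/island1-loadingdock | count_write_handles)"`, with the sync step running only when that command succeeds.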

I came up with a "heartbeat" style strategy to communicate with the receiving system exactly what is about to happen.   If a full sync is detected, the receiving system will shut down Tomcat in anticipation of full synchronization.   After the full sync is complete, the receiving system rebuilds its Fedora Resource Index and starts itself up.    It can be done!
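
As a rough illustration of the receiving side, the sequence described above can be written as a dry run that prints each step instead of executing it. This is a sketch under assumptions, not the actual implementation: the function name `receiver_plan`, the message values `full` and `partial`, and the behavior for a non-full sync are all hypothetical.

```shell
# Sketch only: message values and step descriptions are hypothetical,
# and every step is printed rather than executed.
receiver_plan() {
    case "$1" in
        full)
            echo "stop Tomcat"
            echo "wait for full sync to finish"
            echo "rebuild Fedora Resource Index"
            echo "start Tomcat"
            ;;
        partial)
            # Assumption: anything short of a full sync leaves Tomcat running.
            echo "keep Tomcat running"
            ;;
        *)
            echo "unknown heartbeat: $1" >&2
            return 1
            ;;
    esac
}
```

Printing the plan first makes it possible to check the ordering (stop, sync, rebuild, start) before any real service commands are substituted in.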

Want more? Sign up or browse the archives on Google Groups

