planet code4lib

Planet Code4Lib - http://planet.code4lib.org
Updated: 17 hours 1 min ago

Terry Reese: MarcEdit 6 Update Posted

Wed, 2014-12-17 06:25

A new update has been posted.  The changes are noted below:

  • Enhancement: Installation changes – for administrators, the program now allows quiet installations and adds an option to prevent users from enabling automated updates.  Some IT admins have been asking for this for a while.  The installation program takes a command-line option, autoupdate=no, to turn automated updates off.  Because MarcEdit manages individual profiles, the option works by creating a file in the program directory that, if present, prevents automatic updates.  This file will be removed (or not recreated) if this command-line option isn’t set – so users doing automated installations will need to remember to always set this value if they wish to prevent automated updates from being enabled.  I’ve also added a note in the Preferences window indicating whether the administrator has disabled the option.
  • Bug Fix: Swap Field Task List – one of the limiters wasn’t being passed (the process one field per swap limiter)
  • Bug Fix: Edit Field Task List – when editing a control field, the positions text box wasn’t being shown. 
  • Bug Fix: Edit Field Regular Expression options – when editing control fields, the edit field function evaluated the entire field data – not just the items to be edited.  So, if I wanted to use a regular expression to evaluate two specific values, I couldn’t.  This has been corrected.
  • Enhancement: Linked Data Linker – added support for FAST headings. 
  • Bug Fix: Linked Data Linker – when processing data against LC’s id.loc.gov, some of the fuzzy matching was causing values to be missed.  I’ve updated the normalization to correct this.
  • Enhancement: Edit Subfield Data – Moving Field data – an error can occur if the field having data moved to is a control field, and the control field is smaller than the position where the data should be moved to.  An error check has been added to ensure this error doesn’t pop up.
  • Bug Fix: Auto Translation Plug-in – updated code because some data was being dropped on translation, meaning that it wouldn’t show up in the records.
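
The control-field fixes in the list above come down to scoping: a regular expression should see only the selected positions of the field, not the whole field, and the field must be long enough to contain those positions. A minimal Python sketch of that idea follows. It is not MarcEdit's code; the function name `edit_positions` and the sample 008 value are made up for illustration.

```python
import re

def edit_positions(field_data, start, length, pattern, replacement):
    """Apply a regex only to field_data[start:start + length] (0-based start)."""
    end = start + length
    # Length check: refuse to touch positions the control field does not have.
    if start < 0 or length < 0 or end > len(field_data):
        raise ValueError("field has %d characters; positions %d-%d requested"
                         % (len(field_data), start, end - 1))
    segment = re.sub(pattern, replacement, field_data[start:end])
    # Control fields are fixed-width, so the edited slice must keep its size.
    if len(segment) != length:
        raise ValueError("replacement would change the fixed-width layout")
    return field_data[:start] + segment + field_data[end:]

# Hypothetical 40-character 008 field, built piece by piece so the
# positions are easy to verify (language code sits at positions 35-37).
sample_008 = ("141217" + "s" + "2014" + " " * 4 + "xxu"
              + " " * 11 + "000 0 " + "eng" + " " + "d")

# Only positions 35-37 are evaluated, so "^eng$" matches the language code
# even though the field as a whole contains much more than "eng".
print(edit_positions(sample_008, 35, 3, r"^eng$", "fre"))
```

Because the pattern is matched against the slice rather than the whole field, anchors such as `^` and `$` refer to the selected positions.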

Update can be found at: http://marcedit.reeset.net/downloads or via the automated updating tool.  The plug-in updates can be downloaded via the Plug-in Manager within the MarcEdit Application.

–tr

Information Technology and Libraries: Usability Testing for Greater Impact: A Primo Case Study

Wed, 2014-12-17 05:00

This case study focuses on a usability test conducted by four librarians at Texas Tech University. Eight students were asked to complete a series of tasks using OneSearch, the TTU Libraries’ implementation of the Primo discovery tool. Based on the test, the team identified three major usability problems, as well as potential solutions. These problems typify the difficulties patrons face while using library search tools, but have a variety of simple solutions.

 

Library Tech Talk (U of Michigan): Practical Relevance Ranking for 11 Million Books, Part 3: Document Length Normalization

Wed, 2014-12-17 00:00
Relevance is a complex concept which reflects aspects of a query, a document, and the user as well as contextual factors. Relevance involves many factors such as the user's preferences, task, stage in their information-seeking, domain knowledge, intent, and the context of a particular search. This post is the third in a series by Tom Burton-West, one of the HathiTrust developers, who has been working on practical relevance ranking for all the volumes in HathiTrust for a number of years.

District Dispatch: Families and Work Institute seeks library and museum input

Tue, 2014-12-16 19:13

If your library or museum has organized programs to help children develop their skills and their minds, the Families and Work Institute (FWI) asks that you fill out their brief survey. Data pulled from this survey will aid FWI in developing a national report on the roles that libraries and museums play in supporting children, families, and the professionals who work with them.

For the past 14 years, Mind in the Making, a program of FWI, has been sharing research on what we can do to help children thrive now and in the future. They have been calling attention to the importance of early brain development and promoting Executive Function skills, which study after study reveals have been a critical missing piece in efforts to promote school readiness and school, work and life success.

Surveys should be submitted by December 22, 2014 to be included. More information is available at the IMLS blog, Museums and Libraries: Be a Part of our Brain Building Journey.

The post Families and Work Institute seeks library and museum input appeared first on District Dispatch.

Jenny Rose Halperin: Leaving Mozilla as staff

Tue, 2014-12-16 16:58

December 31 will be my last day as paid staff on the Community Building Team at Mozilla.

One year ago, I settled into a non-stop flight from Raleigh, NC to San Francisco and immediately fell asleep. I was exhausted; it was the end of my semester and I had spent the week finishing a difficult databases final, which I emailed to my professor as soon as I reached the hotel, marking the completion of my coursework in Library Science and the beginning of my commitment to Mozilla.

The next week was one of the best of my life. While working, hacking, and having fun, I started on the journey that has carried me through the past exhilarating months. I met more friendly faces than I could count and felt myself becoming part of the Mozilla community, which has embraced me. I’ve been proud to call myself a Mozillian this year, and I will continue to work for the free and open Web, though currently in a different capacity as a Rep and contributor.

I’ve met many people through my work and have been universally impressed with your intelligence, drive, and talent. To David, Pierros, William, and particularly Larissa, Christie, Michelle, and Emma, you have been my champions and mentors. Getting to know you all has been a blessing.

I’m not sure what’s next, but I am happy to start on the next step of my career as a Mozillian, a community mentor, and an open Web advocate. Thank you again for this magical time, and I hope to see you all again soon. Let me know if you find yourself in Boston! I will be happy to hear from you and pleased to show you around my hometown.

If you want to reach out, find me on IRC: jennierose. All the best wishes for a happy, restful, and healthy holiday season.

District Dispatch: Down with the divide – support the E-rate program

Tue, 2014-12-16 16:35

Librarians know how essential the E-rate has been and will be to meeting their communities’ needs for high-speed, broadband internet service and public access to the internet in the 21st century. That’s why ALA fought hard to create the program and, for the past 18 months, to encourage the Federal Communications Commission to dramatically increase the program’s funding and streamline its application procedures.

Libraries did it! Starting in 2015, an additional $1.5 billion will be available to libraries across the country that they can use to further narrow and ultimately close the digital divide . . . and funds will be easier to apply for.

According to the FCC, the E-rate modernization will make the program more efficient, maximize the use of ratepayer funds, and will provide support for libraries and schools across the country.  An FCC fact sheet notes that “the demand for broadband is growing at least 50% per year, which means that total bandwidth costs will continue to grow even with significant broadband price reductions…Chairman Wheeler’s…$1.5 billion cap increase is consistent with all schools and libraries achieving the long-term goals…because Wi-Fi within every classroom and library space is an essential element of 21st century learning.”

Some Members of Congress have expressed concerns with the FCC action and may not fully appreciate the urgent need in our library community for E-rate modernization. Some have gone as far as questioning the justification of the E-rate program’s existence at all.

Your help is needed to ensure Congress does not overturn the additional E-rate funding for our patrons. We urge you to contact your members of Congress during the December recess and inform them of what services E-rate enables you to supply to your community.

Our message to Congress:

  • Libraries across the country are far behind the broadband capacity they need. A 21st century E-rate program with additional funding will allow libraries to offer state-of-the-art connectivity and critical services to patrons. Many patrons can only access the internet through libraries.
  • The current 20th Century E-rate program has failed to keep pace with inflationary cost increases and has resulted in cost-prohibitive commercially available connectivity. Bringing the program into the 21st Century ensures libraries can secure affordable high-speed connectivity for their patrons.
  • The increasing demands on patrons to connect to the internet – for employment and entrepreneurship, education, community engagement, and individual empowerment – have created a tremendous need for greater bandwidth and faster access.
  • E-rate modernization benefits patrons at libraries of all sizes and in communities across the country, whether urban, suburban, or rural.
  • Please provide Congress with examples of the range of programs and services you offer to patrons benefiting the local community.

We have previously reported on E-rate here and here. Click here to take action now.

The post Down with the divide – support the E-rate program appeared first on District Dispatch.

David Rosenthal: Hardware I/O Virtualization

Tue, 2014-12-16 16:00
At enterprisetech.com, Timothy Prickett Morgan has an interesting post entitled A Rare Peek Into The Massive Scale Of AWS. It is based on a talk by Amazon's James Hamilton at the re:Invent conference. Morgan's post provides a hierarchical, network-centric view of the AWS infrastructure:
  • Regions, 11 of them around the world, contain Availability Zones (AZ).
  • The 28 AZs are arranged so that each Region contains at least 2 and up to 6 datacenters.
  • Morgan estimates that there are close to 90 datacenters in total, each with 2000 racks, burning 25-30MW.
  • Each rack holds 25 to 40 servers.
AZs are no more than 2ms apart measured in network latency, allowing for synchronous replication. This means the AZs in a region are only a couple of kilometres apart, which is less geographic diversity than one might want, but a disaster still has to have a pretty big radius to take out more than one AZ. The datacenters in an AZ are not more than 250us apart in latency terms, close enough that a disaster might take all the datacenters in one AZ out.
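
Multiplying out the estimates in the list above gives a sense of scale. The sketch below does nothing more than that arithmetic; the function name is made up, and the results are products of the quoted estimates, not figures published by Amazon.

```python
def fleet_estimate(datacenters, racks_per_datacenter, servers_per_rack):
    """Return (low, high) server counts from a (low, high) servers-per-rack range."""
    low, high = servers_per_rack
    racks = datacenters * racks_per_datacenter
    return racks * low, racks * high

# Figures quoted above: about 90 datacenters, 2000 racks each, 25 to 40 servers per rack.
per_datacenter = fleet_estimate(1, 2000, (25, 40))
total = fleet_estimate(90, 2000, (25, 40))

print("servers per datacenter: %d to %d" % per_datacenter)
print("servers in total:       %d to %d" % total)
```

On these inputs a single datacenter holds 50,000 to 80,000 servers, and the whole fleet 4.5 to 7.2 million.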

Below the fold, some details and the connection between what Amazon is doing now, and what we did in the early days of NVIDIA.


Amazon uses custom-built hardware, including network hardware, and their own network software. Doing so is simpler and more efficient than generic hardware and software because they only need to support a very restricted set of configurations and services. In particular they build their own network interface cards (NICs). The reason is particularly interesting to me, as it is to solve exactly the same problem that we faced as we started NVIDIA more than two decades ago.

The state of the art in PC games, and thus PC graphics, was based on Windows, at that stage little more than a library on top of MS-DOS. The game was the only application running on the hardware. It didn't have to share the hardware with, and thus need the operating system (OS) to protect it from, any other application. Coming from the Unix world we knew how the OS shared access to physical hardware devices, such as the graphics chip, among multiple processes while protecting them (and the operating system) from each other. Processes didn't access the devices directly; they made system calls which invoked device driver code in the OS kernel that accessed the physical hardware on their behalf.

We understood that Windows would have to evolve into a multi-process OS with real inter-process protection. Our problem, like Amazon's, was two-fold: latency and the variance of latency. If the games were to provide arcade performance on mid-90s PCs, there was no way the game software could take the overhead of calling into the OS to perform graphics operations on its behalf. It had to talk directly to the graphics chip, not via a driver in the OS kernel.

If there had been only a single process, such as the X server, doing graphics, this would not have been a problem. Using the Memory Management Unit (MMU), the hardware provided to mediate access of multiple processes to memory, the OS could have mapped the graphics chip's I/O registers into that process' address space. That process could access the graphics chip with no OS overhead. Other processes would have to use inter-process communications to request graphics operations, as X clients do.

SEGA's Virtua Fighter on NV1

Because we expected there to be many applications simultaneously doing graphics, and they all needed low, stable latency, we needed to make it possible for the OS safely to map the chip's registers into multiple processes at one time. We devoted a lot of the first NVIDIA chip to implementing what looked to the application like 128 independent sets of I/O registers. The OS could map one of the sets into a process' address space, allowing it to do graphics by writing directly to these hardware registers. The technical name for this is hardware I/O virtualization; we pioneered this technology in the PC space. It provided the very low latency that permitted arcade performance on the PC, despite other processes doing graphics at the same time. And because the competition between the multiple processes' accesses to their virtual I/O resources was mediated on-chip as it mapped the accesses to the real underlying resources, it provided very stable latency without the disruptive long tail that degrades the user experience.
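
The division of labour described above, where the OS is involved once to hand a process its own register set and is then off the fast path, can be modelled in a few lines. This is a toy Python model for illustration only, not NVIDIA's design: the class, the method names, and the `DRAW_X` register are hypothetical, and the ownership check stands in for the protection that real hardware gets from the MMU mapping.

```python
class VirtualRegisterFile:
    """Toy model of a chip exposing several independent register sets."""

    def __init__(self, num_contexts=128):
        self.free = list(range(num_contexts))  # register sets not yet handed out
        self.owner = {}                        # register set index -> process id
        self.registers = {}                    # register set index -> {name: value}

    def map_context(self, pid):
        """Slow path, done once by the OS: give a process a private register set."""
        if not self.free:
            raise RuntimeError("no free register sets")
        ctx = self.free.pop(0)
        self.owner[ctx] = pid
        self.registers[ctx] = {}
        return ctx

    def write(self, pid, ctx, reg, value):
        """Fast path, done by the process itself with no OS call involved."""
        if self.owner.get(ctx) != pid:
            # Stand-in for the MMU: a process cannot reach an unmapped set.
            raise PermissionError("register set %d is not mapped for process %d" % (ctx, pid))
        self.registers[ctx][reg] = value


chip = VirtualRegisterFile()
game = chip.map_context(pid=101)
video = chip.map_context(pid=202)
chip.write(101, game, "DRAW_X", 5)
chip.write(202, video, "DRAW_X", 9)
print(chip.registers[game], chip.registers[video])
```

Each process sees what looks like its own device, while the arbitration between register sets stays inside the chip object.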

Amazon's problem was that, like PCs running multiple graphics applications on one real graphics card, they run many virtual machines (VMs) on each real server. These VMs have to share access to the physical network interface card (NIC). Mediating this in software in the hypervisor imposes both overhead and variance. Their answer was enhanced NICs:
The network interface cards support Single Root I/O Virtualization (SR-IOV), which is an extension to the PCI-Express protocol that allows the resources on a physical network device to be virtualized. SR-IOV gets around the normal software stack running in the operating system and its network drivers and the hypervisor layer that they sit on. It takes milliseconds to wade down through this software from the application to the network card. It only takes microseconds to get through the network card itself, and it takes nanoseconds to traverse the light pipes out to another network interface in another server. “This is another way of saying that the only thing that matters is the software latency at either end,” explained Hamilton. SR-IOV is much lighter weight and gives each guest partition on a virtual machine its own virtual network interface card, which rides on the physical card.

This, as shown on Hamilton's graph, provides much less variance in latency:

The new network, after it was virtualized and pumped up, showed about a 2X drop in latency compared to the old network at the 50th percentile for latency on data transmissions, and at the 99.9th percentile the latency dropped by about a factor of 10X.

The importance of reducing the variance of latency for Web services at Amazon scale is detailed in a fascinating, must-read paper, The Tail At Scale by Dean and Barroso.
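
For readers unfamiliar with percentile comparisons like the one quoted above, here is a small Python sketch of how the 50th and 99.9th percentiles are read off a set of latency samples. The numbers are synthetic, chosen only to reproduce the quoted 2X and 10X ratios; they are not Amazon measurements, and `percentile` is a plain nearest-rank implementation written for this example.

```python
def percentile(samples, per_mille):
    """Nearest-rank percentile; per_mille is in thousandths (500 = p50, 999 = p99.9)."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling division avoids floating-point surprises at 99.9%.
    rank = max(1, -(-per_mille * len(ordered) // 1000))
    return ordered[rank - 1]

# Synthetic latencies in microseconds: mostly fast, with a small slow tail.
old_network = [200] * 998 + [10000] * 2
new_network = [100] * 998 + [1000] * 2

for label, per_mille in (("p50", 500), ("p99.9", 999)):
    old = percentile(old_network, per_mille)
    new = percentile(new_network, per_mille)
    print("%s: old=%d us, new=%d us, improvement=%.0fx" % (label, old, new, old / new))
```

The median barely notices the two slow samples in each set, while the 99.9th percentile is determined entirely by them, which is why tail percentiles are the ones to watch when variance matters.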

Amazon had essentially the same problem we had, and came up with the same basic hardware solution - hardware I/O virtualization.

DuraSpace News: DSpace-CRIS News from Cineca

Tue, 2014-12-16 00:00

From Michele Mennielli, Cineca, International Relations

DuraSpace News: NOW OPEN: OR2015 Conference System–Submit Your Open Repositories Conference Proposal

Tue, 2014-12-16 00:00

From the organizers of the Open Repositories 2015 (OR2015) Conference.

Indianapolis, IN  The Tenth International Conference on Open Repositories, OR2015, will be held June 8-11, 2015 in Indianapolis (Indiana, USA). The organizers are pleased to invite you to contribute to the program.

HangingTogether: Gifts for archivists and librarians: from the practical to the luxurious

Mon, 2014-12-15 23:55

We asked for suggestions for gifts that would be suitable for librarians or archivists and the community responded! Thank you so much for all the wonderful and thoughtful gift ideas!

Here are the nominations: if you have other ideas please leave them in the comments below. To ensure that you get what you want, think about leaving this page on computers in your reading room or information commons — I’m sure that certain someone will get the hint.

Lumio Book Lamp

Practical gifts: some information professionals are very focused on getting the job done. For these folks, a gift that helps them do the work at hand is just the thing. Gifts in this category include:

  • A mobile scanner: Laura suggests that perhaps the Flip-Pal might be useful for those who are zipping around “scanning madly.”
  • Of course it’s not all about shelving books or arranging collections. We also attend lots of meetings and conferences. How about a fountain pen? Nadia Nasr suggests the Cross Stratford as a nice looking model that’s affordable.
  • For all that professional reading, what about a book shaped lamp? Lumio’s book lamp (although pricey) was suggested by Stephanie as being “pretty rad.” Comes in dark walnut and blonde maple to complement any decor.
  • What is more painful than losing your place in a book? Hunting around for a bookmark. Lynn Jones suggests the Albatros Bookmark: you never need to look for your bookmark because it’s in the book, and it also places itself.  Comes in packs of 6.
  • What about a card catalog shaped flash drive? These will be available soon from Unshelved. Thanks to Carol Street for the suggestion.

The Archivist Wine

Food and drink: everyone likes to eat and drink. Here are some suggestions vetted by librarians and archivists:

  • The chefs among us might appreciate cookbooks from historical societies. Melissa M. loves her cookbooks from the King’s Landing Historical Settlement in New Brunswick. I couldn’t find those online but you can find plenty of good ideas in Cookbook Finder. I noted that King’s Landing does have an historic inn that serves period food, so check with your local historical society!
  • Beer for archivists: Although I normally hate to reinforce stereotypes about archivists that involve either attics or cellars, I was pleased to hear Jill Tatem’s nomination for Cellar Dweller, which is only available at the Great Lakes Brewing Company in Cleveland. Since this is the site of the 2015 Society of American Archivists annual meeting, I know many archivists will take a rain check on this brew.
  • A toast to archivists! From Sonoma Estate Vintners, the Archivist. Pick your poison: cab, chardonnay, or pinot noir. The description includes the word “appraise” so you know you are in the right place.

 

Pride and Prejudice Tote

Clothing and accessories: suggestions range from items that are practical to those that show your style.

  • Melissa M. says, “every processing archivist could use steel-toed boots (required for the first archival job I ever had, and I actually managed to find quite a stylish pair).” Melissa was not able to find her boots, which fetched compliments outside the workplace, but perhaps something like these engineer boots would work.
  • To go with your boots, perhaps some library card socks from NYPL? (Hat tip to Bruce Washburn.)
  • You can wear your heart on your sleeve, and now you can wear your favorite book, as a t-shirt or water-resistant tote. From Lithographs. Also available: posters and (temporary) tattoos. From Lorcan Dempsey and Pam Kruger.
  • A favorite from last year was the microfiche jewelry from Oinx. Styles have been updated and now you can spread the “I’d rather be fiching” message via t-shirt and bumper sticker.

Mini Hollinger Document Cartons

Little luxuries: sometimes it’s the little things

  • Candles are a great seasonal gift. You can choose between The Archivist candles from Greenmarket (lots to choose from, particularly if you like the idea of “fragrance records accumulated to preserve moments, stories, and people they represent”) and Library candles from Paddywax (which feature scents that will conjure your favorite author). Thanks to Casey Davis and Carol Street for calling these to our attention!
  • Hollinger boxes are a staple for archivists, and mini document boxes have long been a popular giveaway at conferences — so popular that Hollinger now sells them as a separate item. Jennifer suggests that in addition to being just plain adorable, they would be the perfect way to pop the question.
  • Cream for hands, dried out from processing documents and handling other materials, was a popular item on last year’s list. This year, Melissa M. recommends Lush’s Charity Pot lotion.

Can’t buy happiness: of course, the things that everyone really wants can’t be purchased. At the top of almost every information professional’s wish list is space (to put anything, as our anonymous contributor put it). Another thing that we’d all like to see is reflected in this lovely blog post by Maarja Krusten:

…the greatest gift you can give archivists and librarians is the opportunity to share physically and virtually the knowledge found in their collections and holdings.

Now, that sentiment is something I think we can all get behind! Happy holidays to all of you!

About Merrilee Proffitt


Information Technology and Libraries: President's Message: Twitter Nodes to Networks: Thoughts on the #litaforum

Mon, 2014-12-15 23:28

Harvard Library Innovation Lab: Link roundup December 15, 2014

Mon, 2014-12-15 19:44

Beach balls, stickers, books, moveable cities, and figuring out what you want. This is a good batch of links.

Eyeo 2014 – Santiago Ortiz

“A question that should be answered with action, not thought.”

The City That Is Moving Down the Road

What if the city were on legs? Fascinating piece on potential mobility of libraries, and other community commons.

Best Books of 2014 : NPR

What if the library had an interface like this, but weekly? Acquisitions of the week. Good covers. Fun to browse.

Who’s been naughty and nice this year? Our 2014 draft survey results | SchoolStickers

Clever use of data for holiday fun.

Spinning Beach Ball of Death

The spinning beach ball can be beautiful.

Library of Congress: The Signal: “Elementary!” A Sleuth Activity for Personal Digital Archiving

Mon, 2014-12-15 17:15

Sherlock Holmes and Doctor Watson. “The Adventure of Silver Blaze,” in The Strand Magazine. Illustration by Sidney Paget (1860-1908). On Wikimedia.

As large institutions and organizations continue to implement preservation processes for their digital collections, a smattering of self-motivated information professionals are trying to reach out to the rest of the world’s digital preservation stakeholders —  individuals and small organizations — to help them manage their digital collections.

Part of that challenge is just making people aware that:
1. Their digital possessions are at risk of becoming inaccessible
2. They need to take responsibility for preserving their own stuff
3. The preservation process is easy.

The Library of Congress offers personal digital archiving resources and takes an active role in outreach. [Watch for the announcement of Personal Digital Archiving 2015 next April in New York City.] And we are always happy to discover novel approaches by our colleagues to teaching personal digital archiving. Consider the work of one group of information professionals from Georgia.

The Society of Georgia Archivists, the Atlanta Chapter of ARMA International and the Georgia Library Association have collaborated on a curriculum for a personal digital archiving workshop that addresses the basic problems and solutions. Among the steps they outline, they emphasize the need to make files “findable.”

To that end they devised an activity called “Find the Person in the Personal Digital Archive” (the activity data set and all the workshop materials are free and available for download, reuse and remixing). The premise is simple and the game is fun but it drives home an important message about organizing your files. The producers created a folder filled with files and sub-folders — messy, disorganized files; pointless sub-folders; mis-named files; highly personal files mixed with business files; encrypted files and obsolete file formats, many sourced from the Open Preservation Foundation’s Format Corpus — and they invite people to participate in a forensics activity, to look through all the files and directories and try to piece together some information about the owner of the files.

Courtesy of the Society of Georgia Archivists.

As the user looks through the folder, there are questions to answer, such as “How would you describe the contents?”, “How did the creator of the archive name and arrange the files?” and “How do the features of the archive (such as file names, organization scheme, file format, etc.) make some of the records easy to understand and some of them impossible to understand?”

Though the goal is to deduce the identity and fate of the owner through various clues and “Aha!” moments, in doing the activity the user ends up making judgments about what is useful (like descriptive file and folder names) and what is not (files called “things.xml” and “untitled.txt”). Poring over a fake mess such as this drives home a point: how do you organize your own personal stuff? If someone, such as a loved one, had to go through your digital files, how easy or difficult would it be for them to find specific files and make sense of it all? Are you leaving a mess for someone else to trudge through?
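
The same judgment can be roughed out in code. The Python sketch below walks a folder and lists files whose names say little about their contents. It is an illustration of the point about descriptive names, not part of the workshop materials; the list of "vague" words and both function names are assumptions made for this example.

```python
import os
import re

# Hypothetical list of name stems that say little about a file's contents,
# optionally followed by separators and digits (e.g. "IMG_0042", "untitled 2").
VAGUE_STEM = re.compile(
    r"^(untitled|things|stuff|misc|new|document|doc|file|copy|temp|tmp|scan|image|img)"
    r"[\s_\-()0-9]*$",
    re.IGNORECASE,
)

def is_vague(filename):
    """True if the name (ignoring its extension) is very short or generic."""
    stem, _extension = os.path.splitext(filename)
    return len(stem) < 3 or VAGUE_STEM.match(stem) is not None

def find_vague_names(root):
    """Return the sorted paths of all vaguely named files under root."""
    hits = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if is_vague(name):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)

for name in ("things.xml", "untitled.txt", "2014-12-15_workshop-notes.txt"):
    print(name, "->", "vague" if is_vague(name) else "descriptive")
```

Running `find_vague_names` on your own documents folder gives a quick list of files a stranger would struggle to interpret.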

Wendy Hagenmaier, the outreach manager for the Society of Georgia Archivists, is one of the workshop producers. Hagenmaier wanted to reach beyond her community to demystify digital archives stewardship and explain to the general public why digital preservation matters and how they can preserve their own stuff. She researched other like-minded organizations in Georgia to find interested parties for the workshop. “This topic really seems to be taking off in public libraries,” said Hagenmaier, “and genealogists are very much interested in personal digital archiving, though I don’t know if the topic comes up in their circles on its own.”

Michelle Kirk, president of the Atlanta Chapter of ARMA, gives a presentation. Photo courtesy of the Society of Georgia Archivists.

Hagenmaier — and her colleagues Michelle Kirk, Cathy Miller and Oscar Gittemeier — geared the workshop toward information professionals and encouraged the workshop attendees to go out and teach the workshop to others so that the message will reach the general public in a sort of trickle-down effect. So far she has presented the curriculum at a “train the trainer” webinar, a workshop and at a Georgia State Archives genealogy event.

The Society of Georgia Archivists also offers a Personal Digital Archiving Workshop Outreach Grant to help information professionals in Georgia promote the idea that librarians, archivists and records managers are a source of expertise for assisting individuals (the public, family members, students, corporate employees, etc.) with their personal digital archiving needs. The grant will be given to individuals who apply after hosting and teaching a workshop at their institutions or in their communities, using the curriculum materials designed by SGA, GLA and Atlanta ARMA.

Hagenmaier is fervent about getting the word out to people, making them aware that they casually create and use digital stuff in their everyday lives, so digital stewardship could and should be just as casual and effortless. She feels that knowledge of digital stewardship will empower people and assure them that their digital files can be safe if they keep them safe. She said that in the course of her work she sees in people a fear of the unknown, a huge anxiety about the fate of digital files. To illustrate her point she cites a moment during her genealogy conference presentation when she asked a group of genealogists, “How many of you think you will be able to access your digital files in ten years?” No one raised a hand.

“They are hopeful but not confident,” said Hagenmaier. “Personal digital archiving is still foreign to people. It is important for us to just get the word out that they can preserve their own stuff.”

OCLC Dev Network: WorldCat Registry RDF Interface Write Operations to be Decommissioned

Mon, 2014-12-15 17:00

The WorldCat Registry RDF interface write operations will be decommissioned in January.  Developers will continue to have read-only access to the WorldCat Registry via both the RDF and SRU/XSD interfaces.

LITA: Making Connections in the New Year

Mon, 2014-12-15 16:25

This new year, make a resolution to be more proactive, network and update your professional skills. Resolve to attend a professional conference, discussion or symposium!

Flickr, 2010

GameDevHacker Conference
New York, January 28

The GameDevHacker conference is just around the corner. Combining the wits of three segments of the gaming industry, the gamers, developers and hackers, the conference aims at discussing future developments. The tagline for next year’s conference is “Past Trends and Future Bets.”

 

The Creativity Workshop
New York, February 20 – 23 & April 17 – 20

Do you have writer’s block, want to create dynamic programming or transform the way you view digital arts? The Creativity Workshop is geared toward professionals in the sciences, business, arts and humanities. Two 4-day workshops will be held in spring 2015.

 

2015 National STEM Education Conference
Cleveland, April 16-17

The typical STEMcon audience includes educators in the K-12 arena. However, if altering the current landscape of STEM education is important to you, STEMcon may be a great venue to voice those concerns. Participants will, among other topics, discuss using educational technology and bridging gender and ethnic divides in the science, technology, engineering and math fields.

 

8th Annual Emerging Technologies for Online Learning Symposium
Dallas, April 22 – 24

Perhaps you may not be interested in STEM education at the K-12 level, but almost everyone in the information field has either facilitated or participated in online learning technologies. Web-based technology will continue to alter delivery of instruction to students around the world. Network, share and learn about new trends with participants from around the nation.

 

Educause Annual Conference
Indianapolis and Virtual, October 27 – 30

If travel is an issue, Educause will hold a virtual conference in October of next year. The conference is geared toward IT professionals in higher education, but can be useful for students and novice practitioners. More information will be published in the spring of 2015.

 

Have a happy New Year, LITAblog readers!

Ed Summers: Languages on Twitter.

Mon, 2014-12-15 16:12

There have been some interesting visualizations of languages in use on Twitter, like this one done by Gnip and published in the New York Times. Recently I’ve been involved in some research on a particular topical collection of tweets. One angle that’s been particularly relevant for this dataset is language. When perusing some of the tweet data we retrieved from Twitter’s API we noticed that there were two lang properties in the JSON. One was attached to the embedded user profile stanza, and the other was a top level property of the tweet itself.

We presumed that the user profile language was the language the user (who submitted the tweet) had selected, and that the second language on the tweet was the language of the tweet itself. The first is what Gnip used in its visualization. Interestingly, Twitter’s own documentation for the /get/statuses/:id API call only shows the user profile language.

When you send a tweet you don’t indicate what language it is in. For example you might indicate in your profile that you speak primarily English, but send some tweets in French. I can only imagine that detecting language for each tweet isn’t a cheap operation for the scale that Twitter operates at. Milliseconds count when you are sending 500 million tweets a day, in real time. So at the time I was skeptical that we were right…but I added a mental note to do a little experiment.

This morning I noticed my friend Dan had posted a tweet in Hebrew, and figured now was as good a time as any.

אנחנו נתגבר

— Dan Chudnov (@dchud) December 4, 2014

I downloaded the JSON for the Tweet from the Twitter API and sure enough, the user profile had language en and the tweet itself had language iw, which is the deprecated ISO 639-1 code for Hebrew (the current code is he). Here’s the raw JSON for the tweet; search for lang:

{ "contributors": null, "truncated": false, "text": "\u05d0\u05e0\u05d7\u05e0\u05d5 \u05e0\u05ea\u05d2\u05d1\u05e8", "in_reply_to_status_id": null, "id": 540623422469185537, "favorite_count": 2, "source": "<a href=\"http://tapbots.com/software/tweetbot/mac\" rel=\"nofollow\">Tweetbot for Mac</a>", "retweeted": false, "coordinates": null, "entities": { "symbols": [], "user_mentions": [], "hashtags": [], "urls": [] }, "in_reply_to_screen_name": null, "id_str": "540623422469185537", "retweet_count": 0, "in_reply_to_user_id": null, "favorited": true, "user": { "follow_request_sent": false, "profile_use_background_image": true, "profile_text_color": "333333", "default_profile_image": false, "id": 17981917, "profile_background_image_url_https": "https://pbs.twimg.com/profile_background_images/3725850/woods.jpg", "verified": false, "profile_location": null, "profile_image_url_https": "https://pbs.twimg.com/profile_images/524709964905218048/-CuYZQQY_normal.jpeg", "profile_sidebar_fill_color": "DDFFCC", "entities": { "description": { "urls": [] } }, "followers_count": 1841, "profile_sidebar_border_color": "BDDCAD", "id_str": "17981917", "profile_background_color": "9AE4E8", "listed_count": 179, "is_translation_enabled": false, "utc_offset": -18000, "statuses_count": 14852, "description": "", "friends_count": 670, "location": "Washington DC", "profile_link_color": "0084B4", "profile_image_url": "http://pbs.twimg.com/profile_images/524709964905218048/-CuYZQQY_normal.jpeg", "following": true, "geo_enabled": false, "profile_banner_url": "https://pbs.twimg.com/profile_banners/17981917/1354047961", "profile_background_image_url": "http://pbs.twimg.com/profile_background_images/3725850/woods.jpg", "name": "Dan Chudnov", "lang": "en", "profile_background_tile": true, "favourites_count": 1212, "screen_name": "dchud", "notifications": false, "url": null, "created_at": "Tue Dec 09 02:56:15 +0000 2008", "contributors_enabled": false, "time_zone": "Eastern Time (US & Canada)", 
"protected": false, "default_profile": false, "is_translator": false }, "geo": null, "in_reply_to_user_id_str": null, "lang": "iw", "created_at": "Thu Dec 04 21:47:22 +0000 2014", "in_reply_to_status_id_str": null, "place": null }

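Here is a minimal Python sketch of pulling the two lang values out of a tweet like the one above. The JSON is trimmed to the relevant fields, and the helper names (tweet_languages, current_code) are made up for illustration; they are not part of any Twitter client library. The small table of deprecated codes covers only the three two-letter codes that ISO 639 replaced (iw, in, ji).

```python
import json

# Trimmed copy of the tweet JSON shown above, keeping only the fields that
# matter here. The raw string leaves the \u escapes for the JSON parser.
RAW_TWEET = r'''
{
  "text": "\u05d0\u05e0\u05d7\u05e0\u05d5 \u05e0\u05ea\u05d2\u05d1\u05e8",
  "id_str": "540623422469185537",
  "user": {"screen_name": "dchud", "lang": "en"},
  "lang": "iw"
}
'''

# Deprecated ISO 639 two-letter codes and their current replacements.
DEPRECATED_CODES = {"iw": "he", "in": "id", "ji": "yi"}


def tweet_languages(tweet):
    """Return (profile language, tweet language) for a decoded tweet.

    Hypothetical helper; either value is None when the property is absent.
    """
    profile_lang = tweet.get("user", {}).get("lang")
    tweet_lang = tweet.get("lang")
    return profile_lang, tweet_lang


def current_code(code):
    """Map a deprecated language code to its current form, if known."""
    return DEPRECATED_CODES.get(code, code)


tweet = json.loads(RAW_TWEET)
profile_lang, tweet_lang = tweet_languages(tweet)
print("profile:", profile_lang)
print("tweet:", tweet_lang, "->", current_code(tweet_lang))
```

Keeping the two values separate matters for a dataset like ours: the profile value says what the user chose in their settings, while the top level value is Twitter's guess about the text of that one tweet.
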
Although tweets are short they certainly can contain multiple languages. I was curious what would happen if I tweeted two words, one in English and one in French.

testing, essai

— Ed Summers (@edsu) December 15, 2014

When I fetched the JSON data for this tweet the language of the tweet was indicated to be pt, or Portuguese! As far as I know neither testing nor essai is Portuguese.

This made me think perhaps the tweet was a bit short so I tried something a bit longer, with the number of words in each language being equal.

Désolé for le noise, je suis just seeing how détection de la language works.

— Ed Summers (@edsu) December 15, 2014

This one came across with lang fr. So having the text be a bit longer helped in this case. Admittedly this isn’t a very sound experiment, but it seems interesting and useful to see that Twitter is detecting language in tweets. It isn’t perfect, but that shouldn’t be surprising at all given the nature of human language. It might be useful to try a more exhaustive test using a more complete list of languages to see how it fares. I’m adding another mental note…

Islandora: Islandora and Fedora 4

Mon, 2014-12-15 16:09

Now that Fedora 4 has a production release, we at the Islandora Foundation would like to share our plans to integrate so that Islandora users, new and existing, can take advantage of the expanded performance and functionality of this major update to Fedora.

The details of our proposed plan and budget can be found in our Fedora 4 Prospectus and Project Plan, prepared by the Fedora 4 Interest Group. In short, we have established Drupal 7.x as the front end for a Fedora 4 prototype, and will commence development in January 2015 with an update and demo of the new system to be ready in time for the Open Repositories conference in June. To reach this goal, we have identified funding needs and begun soliciting support from the Islandora community. The primary use for these funds will be a dedicated developer to serve as Technical Lead on the project and oversee our effort to get an initial product out to the community. The developer for this position was recently selected, and we are very pleased to announce that Daniel Lamb of discoverygarden will be spearheading the technical development of Fedora 4/Islandora 7 integration. 

Taking the lead on overall management of the project will be Project Director Nick Ruest, two-time Islandora Release Manager and heavy contributor to the Islandora stack, with project support from the Islandora Foundation. We will be calling for volunteers from the community to join in the effort and we plan to hold one or two hackfests in the coming year. While developers, both individual and as in-kind donations from supporter institutions, are vital to this project, we are also very much in need of non-developer volunteers to test, document, and provide use cases to determine the features and scope of the update. A call for volunteers will go out in late January or early February. In the meantime, if you have questions or would like to commit funds or developer time to support the project, please contact us, or ask on the mailing list.

For more information about how Islandora will work with Fedora 4, please check out this recent webinar hosted by discoverygarden. In the coming months, Nick Ruest and Daniel Lamb will be providing more detailed plans and reports here, and the Fedora 4 Interest Group will move its mandate from being a group to get the community moving towards Fedora 4 integration to a group that will work through the integration by soliciting feedback on proposed use cases and implementations. In addition, the group will work with the greater Fedora community towards a generic Fedora 3.x to Fedora 4 migration scenario.

FOSS4Lib Recent Releases: CollectionSpace - 4.1.1

Mon, 2014-12-15 14:56

Last updated December 15, 2014. Created by Peter Murray on December 15, 2014.

Package: CollectionSpace
Release Date: Monday, December 15, 2014

FOSS4Lib Recent Releases: Service-Proxy - 0.40

Mon, 2014-12-15 14:18
Package: Service-Proxy
Release Date: Monday, December 1, 2014

Last updated December 15, 2014. Created by Peter Murray on December 15, 2014.

0.40 Mon Dec 1 13:04:00 CEST 2014

- mutable request, adds mechanism for modifying request on the fly, MKSP-138
- adds one more RIS field (KW) to RIS export, MKSP-140
- changes mappings for three RIS fields (SP,VL,IS), MKSP-140
- adds option to override default RIS-to-Pazpar2 field mappings, MKSP-140
- removes handling of obsolete Identity field 'proxyPattern', MKSP-133
- fixes null pointer exception in request parameter handling, MKSP-139

ACRL TechConnect: How I Work (Margaret Heller)

Mon, 2014-12-15 14:00

Editor’s Note: This post is part of ACRL TechConnect’s series by our regular and guest authors about The Setup of our work.

 

Margaret Heller, @margaret_heller

Location: Chicago, IL

Current Gig: Digital Services Librarian, Loyola University Chicago

Current Mobile Device: iPhone 5s. It took me years and years of thinking to finally buy a smart phone, and I did it mainly because my iPod Touch and slightly smart phone were both dying so it could replace both.

Current Computer:

Work: Standard issue Dell running Windows 7, with two monitors.

Home: Home built running Windows 7, in need of an upgrade that I will get around to someday.

Current Tablet: iPad 3, which I use constantly. One useful tip is that I have the Adobe Connect, GoToMeeting, Google Hangout, and Lync apps which really help with participating in video calls and webinars from anywhere.

One word that best describes how you work: Tenaciously

What apps/software/tools can’t you live without?

Outlook and Lync are my main methods of communicating with other library staff. I love working at a place where IMing people is the norm. I use these both on desktop and on my phone and tablet. I love that a recent upgrade means that we can listen to voice mails in our email.

Firefox is my normal work web browser. I tend to use Chrome at home. The main reason for the difference is synced bookmarks. I have moved my bookmarks between browsers so many times that I have some of the original sites I bookmarked when I first used Netscape in the late 90s. Needless to say, very few of the sites still exist, but it reminds me of old hobbies and interests. I also don’t need the login to stream shows from my DVR in my bookmark toolbar at work.

Evernote I use for taking meeting notes, conference notes, recipes, etc. I usually have it open all day at work.

Notepad++ is where I do most of my code writing.

OpenRefine is my favored tool for bulk editing metadata, closely aligned with Excel since I need Excel to get data into our institutional repository.

Filezilla is my favored FTP client.

WriteMonkey is the distraction free writing environment I use on my desktop computer (and how I am writing this post). I use Editorial on my iPad.

Spotify and iTunes for music and podcasts.

RescueTime for staying on track with work–I get an email every Sunday night so I can swear off social media for the next week. (It lasts about a day).

FocusBooster makes a great Pomodoro timer.

Zotero is my constant lifesaver when I can’t remember how to cite something, and the only way I stay on track with writing posts for ACRL TechConnect.

Feedly is my RSS reader, and most of the time I stay on top of it.

Instapaper is key to actually reading rather than skimming articles, though of course I am always behind on it.

Box (and Box Sync) is our institutional cloud file storage service, and I use it extensively for all my collaborative projects.

Asana is how we keep track of ongoing projects in the department, and I use it for prioritizing personal projects as well.

What’s your workspace like?: A large room in the basement with two people full time, and assorted student workers working on the scanner. We have pieces of computers sitting around, though we moved out an old server rack that was taking up space. (Servers are no longer located in the library but in the campus data centers). My favorite feature is the white board wall behind my desk, which provides enough space to sketch out ideas in progress.

I have a few personal items in the office: a tea towel from the Bodleian Library in Oxford, a reproduction of an antique map of France, Belgium, & Holland, a photo of a fiddlehead fern opening, and small stone frogs to rearrange while I am talking on the phone. I also have a photo of my baby looking at a book, though he’s so much bigger now I need to add additional photos of him. My desk has an in tray, an out tray, and a book-cart-shaped business card holder I got at a long-ago ALA conference. I am a big proponent of a clean desk. The later in the semester it gets, the more likely I am to have extra papers, but it’s important to my focus to have an empty desk.

There’s usually a lot going on in here and no natural light, so I go outside to work in the summer, or sometimes just to another floor in the building to enjoy the lake view and think through problems.

What’s your best time-saving trick?: Document and schedule routine tasks so I don’t forget steps or when to take care of them. I also have a lot of rules and shortcuts set up in my email so I can process email very quickly and not work out of my inbox. Learn the keyboard shortcuts! I can mainly get through Gmail without touching the mouse and it’s great.

What’s your favorite to-do list manager?: Remember the Milk is how I manage tasks. I’ve been using it for years for Getting Things Done. I pay for it, and so currently have access to the new version which is amazing, but I am sworn to secrecy about its appearance or features. I have a Google Doc presentation I use for Getting Things Done weekly reviews, but just started using an Asana project to track all my ongoing projects in one place without overwhelming Remember the Milk or the Google Doc. It tells me I currently have 74 projects. A few more have come in that I haven’t added yet either.

Besides your phone and computer, what gadget can’t you live without?: For a few more weeks, my breast pump, which I am not crazy about, but it makes the hard choices of parenting a little bit easier. I used to not be able to live without my Nook until I cut my commute from an hour on the train to a 20 minute walk, so now I need earbuds for the walk. I am partial to Pilot G2 pens, which I use all the time for writing ideas on scrap paper.

What everyday thing are you better at than everyone else?: Keeping my senses of humor and perspective available for problem solving.

What are you currently reading?: How to be a Victorian by Ruth Goodman (among other things). So far I have learned how Victorians washed themselves, and it makes me grateful for central heating.

What do you listen to while you work?: Podcasts (Roderick on the Line is required listening), mainly when I am doing work that doesn’t require a lot of focus. I listen mostly to full albums on Spotify (I have a paid account), though occasionally will try a playlist if I can’t decide what to listen to. But I much prefer complete albums, and try to stay on top of new releases as well as old favorites.

Are you more of an introvert or an extrovert?: A shy extrovert, though I think I should be an introvert based on the popular perception. I do genuinely like seeing other people, and get restless if I am alone for too long.

What’s your sleep routine like?: I try hard to get in bed at 9:30, but by 10 at the latest. Or ok, maybe 10:15. Awake at 6 or whenever the baby wakes up. (He mostly sleeps through the night, but sometimes I am up with him at 4 until he falls asleep again). I do love sleeping though, so chances to sleep in are always welcome.

Fill in the blank: I’d love to see _________ answer these same questions. Occasional guest author Andromeda Yelton.

What’s the best advice you’ve ever received?: You are only asked to be yourself. Figure out how you can best help the world, and work towards that rather than comparing yourself to others. People can adjust to nearly any circumstance, so don’t be afraid to try new things.
