You are here

Feed aggregator

Jonathan Rochkind: Blacklight Community Survey

planet code4lib - Thu, 2015-08-20 17:18

I’ve created a Google Docs survey targeted at organizations that have Blacklight installations (or vendor-hosted BL installations on their behalf? Is that a thing?).

Including Blacklight-based stacks like Hydra.

The goal of the survey is to learn more about how Blacklight is being used in “the wild”, specifically (but not only) the software stacks people are using BL with.

If you host (or have, or plan to) a Blacklight-based application, it would be great if you filled out the survey!

Filed under: General

District Dispatch: Building bridges at the “Department of Opportunity”

planet code4lib - Thu, 2015-08-20 15:51

This week, the Department of Housing and Urban Development (HUD) (aka the Department of Opportunity) gathered national partners and local government and housing leaders from 28 communities to begin the real work of ConnectHome. Launched in July, the demonstration project aims to connect more than 275,000 households in 28 communities to low-cost internet, devices and technology training. It was my pleasure to participate on behalf of ALA and libraries (along with Metropolitan New York Library Council Director Nate Hill) and discuss the commitment and power of libraries and librarians in closing the digital divide and boosting creation/making as well as access/consumption.

Zach Leverenz, CEO of EveryoneOn, and Robert Greenblum, senior policy advisor to HUD Secretary Julián Castro, opened the convening and served as masters of ceremony throughout the day. “If you’re not digitally literate in the 21st century, you’re illiterate,” Greenblum said, recalling an early conversation with leaders at the 80/20 Foundation in Austin. Greenblum and other HUD officials highlighted the power of 28 communities working at the same time to build broadband connections and technology skills that boost educational and economic opportunity. At the national level, HUD is setting an agency goal around broadband adoption, including developing metrics for measuring progress on closing the digital divide.

ConnectHome Logo

“Empowerment” was the one-word description of the impact of digital inclusion work underway in Austin, according to Sylvia Blanco, executive vice president for the Housing Authority of the City of Austin (HACA). In a city with 92% broadband adoption, only about 30% of public housing residents had some sort of computing device, and only 28% of these residents were connected to the internet. Blanco and local Austin partners in the “Unlocking the Connection” program will serve as “peer mentors” for the ConnectHome initiative. (Of note, I also learned that the Institute of Museum and Library Services’ Building Digital Communities framework served as the template for Austin’s digital inclusion strategic plan.)

Besides meeting new and ongoing collaborators, a favorite thing about gatherings like this is that when I make a presentation about the opportunities afforded by libraries, I am immediately approached by audience members who want to tell me about the great staff in their local libraries. In this case, I heard about the leadership of Denver Public Library Director Michelle Jeske and Rockford (IL) Public Library Director Lynn Stainbrook. While ConnectHome offers a new avenue for serving community residents, librarians have already made a mark through early learning opportunities, afterschool programs and technology training. This suite of programs and services is a hallmark of libraries.

At the risk of overusing the metaphor (I know, too late!), this week also is one of connecting the dots. Affordability is a key barrier for accessing the internet, and the Federal Communications Commission is currently considering how to address this through the Lifeline program. The ALA will file comments in this public comment period, joining with others in the civil rights and digital inclusion community to advocate for updating this universal service program to include broadband. Less than half of people with incomes under $25,000 have home broadband access, hobbling equitable access to information in the digital society and undermining economic and innovation goals.

But, as the ConnectHome effort clearly recognizes, broadband access must be married with robust adoption efforts. The Lifeline program is one part of a larger effort needed to scaffold digital opportunity, including relevant content, context and digital literacy training. We hope this message will be reflected in the Broadband Opportunity Council report that goes to President Obama this month.

And all of this activity is part of the concerted effort to increase awareness of libraries as part of the solution for advancing national policy priorities among decision makers and influencers. Digital inclusion and innovation are threaded throughout the National Policy Agenda for Libraries and The E’s of Libraries®.

The ConnectHome attention now turns to local convenings, which will take place in the 28 communities around the country through the end of October. We invite local librarians and other partners in this program to share their impressions and plans from these meetings so we can continue to learn from each other and move the country forward.

The post Building bridges at the “Department of Opportunity” appeared first on District Dispatch.

SearchHub: Introducing Anda: a New Crawler Framework in Lucidworks Fusion

planet code4lib - Thu, 2015-08-20 15:01

Introduction Lucidworks Fusion 2.0 ships with roughly 30 out-of-the-box connector plugins to facilitate data ingestion from a variety of common datasources. 10 of these connectors are powered by a new general-purpose crawler framework called Anda, created at Lucidworks to help simplify and streamline crawler development. Connectors to each of the following Fusion datasources are powered by Anda under the hood:

  • Web
  • Local file
  • Box
  • Dropbox
  • Google Drive
  • SharePoint
  • JavaScript
  • JIRA
  • Drupal
  • Github

Inspiration for Anda came from the realization that most crawling tasks have quite a bit in common across crawler implementations. Much of the work entailed in writing a crawler stems from generic requirements unrelated to the exact nature of the datasource being crawled, which indicated the need for some reusable abstractions. The following crawler functionalities are implemented entirely within the Anda framework code, and while their behavior is quite configurable via properties in Fusion datasources, the developer of a new Anda crawler needn’t write any code to leverage these features:

  • Starting, stopping, and aborting crawls
  • Configuration management
  • Crawl-database maintenance and persistence
  • Link-legality checks and link-rewriting
  • Multithreading and thread-pool management
  • Throttling
  • Recrawl policies
  • Deletion
  • Alias handling
  • De-duplication of similar content
  • Content splitting (e.g. CSV and archives)
  • Emitting content

Instead, Anda reduces the task of developing a new crawler to providing the Anda framework with access to your data. Developers provide this access by implementing one of two Java interfaces that form the core of the Anda Java API: Fetcher and FS (short for filesystem). These interfaces provide the framework code with the necessary methods to fetch documents from a datasource and discern their links, enabling traversal to additional content in the datasource. Fetcher and FS are designed to be as simple to implement as possible, with almost all of the actual traversal logic relegated to framework code.

Developing a Crawler

With so many generic crawling tasks, it’s just inefficient to write an entirely new crawler from scratch for each additional datasource. So in Anda, the framework itself is essentially the one crawler, and we plug in access to the data that we want it to crawl. The Fetcher interface is the more generic of the two ways to provide that access.

Writing a Fetcher

    public interface Fetcher extends Component<Fetcher> {
        public Content fetch(String id, long lastModified, String signature) throws Exception;
    }

Fetcher is a purposefully simple Java interface that defines a method fetch() to fetch one document from a datasource. There’s a WebFetcher implementation of Fetcher in Fusion that knows how to fetch web pages (where the id argument to fetch() will be a web page URL), a GithubFetcher for Github content, etc. The fetch() method returns a Content object containing the content of the “item” referenced by id, as well as any links to additional “items”, whatever they may be. The framework itself is truly agnostic to the exact type of “items”/datasource in play—dealing with any datasource-specific details is the job of the Fetcher.

A Fusion datasource definition provides Anda with a set of start-links (via the startLinks property) that seed the first calls to fetch() in order to begin the crawl, and traversal continues from there via links returned in Content objects from fetch(). Crawler developers simply write code to fetch one document and discern its links to additional documents, and the framework takes it from there. Note that Fetcher implementations should be thread-safe, and the fetchThreads datasource property controls the size of the framework’s thread-pool for fetching.
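To make the fetch() contract and the framework-driven traversal concrete, here is a minimal, self-contained sketch. The Content and Fetcher types below are simplified stand-ins for illustration only (the real Anda API has more fields and a Component lifecycle), and InMemoryFetcher is a hypothetical implementation that “crawls” an in-memory map of documents, returning each item’s body and outgoing links:

```java
import java.util.*;

// Simplified stand-in for Anda's Content type (illustration only).
class Content {
    final String id;
    final String body;          // the fetched content
    final List<String> links;   // links to additional items discovered in this item
    Content(String id, String body, List<String> links) {
        this.id = id; this.body = body; this.links = links;
    }
}

// Simplified stand-in for the Anda Fetcher interface.
interface Fetcher {
    Content fetch(String id, long lastModified, String signature) throws Exception;
}

// Hypothetical Fetcher that "crawls" a map of id -> (body, links).
class InMemoryFetcher implements Fetcher {
    private final Map<String, Content> store = new HashMap<>();

    void put(String id, String body, String... links) {
        store.put(id, new Content(id, body, Arrays.asList(links)));
    }

    @Override
    public Content fetch(String id, long lastModified, String signature) {
        return store.get(id); // a real fetcher would talk to a remote datasource here
    }
}

public class CrawlSketch {
    // Breadth-first traversal from a start link, roughly as the framework would drive it.
    static List<String> crawl(Fetcher fetcher, String startLink) {
        List<String> visited = new ArrayList<>();
        Deque<String> queue = new ArrayDeque<>();
        Set<String> seen = new HashSet<>();
        queue.add(startLink);
        seen.add(startLink);
        while (!queue.isEmpty()) {
            Content c;
            try {
                c = fetcher.fetch(queue.poll(), -1L, null);
            } catch (Exception e) {
                throw new RuntimeException(e); // the real framework persists and reports errors
            }
            if (c == null) continue;
            visited.add(c.id);
            for (String link : c.links)
                if (seen.add(link)) queue.add(link); // follow new links, skip already-seen ones
        }
        return visited;
    }

    public static void main(String[] args) {
        InMemoryFetcher f = new InMemoryFetcher();
        f.put("a", "root page", "b", "c");
        f.put("b", "child page");
        f.put("c", "another child", "a"); // cycle back to "a" -- must not loop forever
        System.out.println(crawl(f, "a")); // prints [a, b, c]
    }
}
```

Note how the hypothetical fetcher only answers “give me item X”; all queueing, de-duplication and cycle handling lives in the traversal code, which is the part Anda supplies for you.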

Incremental Crawling

The additional lastModified and signature arguments to fetch() enable incremental crawling. Maintenance and persistence of a crawl-database is one of the most important tasks handled completely by the framework: values for lastModified (a date) and signature (an optional String holding any kind of version identifier, e.g. an ETag in a web crawl) are returned as fields of Content objects, saved in the crawl-database, and then read from the crawl-database and passed to fetch() in re-crawls. A Fetcher should use this metadata to avoid reading and returning an item’s content when it hasn’t changed since the last crawl, e.g. by setting an If-Modified-Since header with the lastModified value in the case of making HTTP requests. There are special “discard” Content constructors for the scenario where an unchanged item didn’t need to be fetched.
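The skip-if-unchanged decision can be sketched as a pure function. The method and names below are illustrative, not the actual Anda API; a real web fetcher would instead send an If-Modified-Since header (or the signature as an If-None-Match/ETag) and treat a 304 response as “unchanged”:

```java
public class IncrementalSketch {
    /**
     * Decide whether an item needs to be re-fetched. The crawl-database supplies
     * the lastModified date and signature recorded on the previous crawl; the
     * datasource reports the item's current values. Illustrative logic only.
     */
    static boolean needsFetch(long storedLastModified, String storedSignature,
                              long currentLastModified, String currentSignature) {
        if (storedLastModified < 0) return true;               // never crawled before
        if (currentSignature != null && storedSignature != null)
            return !currentSignature.equals(storedSignature);  // e.g. ETag comparison
        return currentLastModified > storedLastModified;       // fall back to dates
    }

    public static void main(String[] args) {
        // First crawl: no stored metadata, so fetch.
        System.out.println(needsFetch(-1, null, 1000, null));    // true
        // Re-crawl, unchanged signature: skip and return a "discard" Content.
        System.out.println(needsFetch(1000, "v1", 2000, "v1"));  // false
        // Re-crawl, newer modification date, no signatures: fetch again.
        System.out.println(needsFetch(1000, null, 2000, null));  // true
    }
}
```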

Emitting Content

Content objects returned from fetch() might be discards in incremental crawls, but those containing actual content will be emitted to the Fusion pipeline for processing and to be indexed into Solr. The crawler developer needn’t write any code in order for this to happen. The pipelineID property of all Fusion datasources configures the pipeline through which content will be processed, and the user can configure the various stages of that pipeline using the Fusion UI.


Fetcher extends another interface called Component, which defines its lifecycle and provides configuration. Configuration properties themselves are defined using an annotation called @Property, e.g.:

    @Property(title="Obey robots.txt?", type=Property.Type.BOOLEAN, defaultValue="true")
    public static final String OBEY_ROBOTS_PROP = "obeyRobots";

This example from WebFetcher (the Fetcher implementation for Fusion web crawling) defines a boolean datasource property called obeyRobots, which controls whether WebFetcher should heed directives in robots.txt when crawling websites (disable this setting with care!). Fields with @Property annotations for datasource properties should be defined right in the Fetcher class itself, and the title= attribute of a @Property annotation is used by Fusion to render datasource properties in the UI.

Error Handling

Lastly, it’s important to note that fetch() is allowed to throw any Java exception. Exceptions are persisted, reported, and handled by the framework, including logic that decides how many times fetch() must consecutively fail for a particular item before that item is deleted from Solr. Most Fetcher implementations will want to catch and react to certain errors (e.g. retrying failed HTTP requests in a web crawl), but any hard failures can simply bubble up through fetch().
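A typical pattern inside a Fetcher is to retry transient failures a bounded number of times before letting the last exception bubble up to the framework. This is a generic sketch, not Anda code; the helper name and shape are assumptions for illustration:

```java
import java.util.concurrent.Callable;

public class RetrySketch {
    /** Run an action, retrying up to maxAttempts times; the last failure bubbles up. */
    static <T> T withRetries(Callable<T> action, int maxAttempts) throws Exception {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (Exception e) {
                last = e; // transient failure, e.g. a timed-out HTTP request: try again
            }
        }
        throw last; // hard failure: let the framework persist and report it
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Fails twice, then succeeds on the third attempt.
        String result = withRetries(() -> {
            if (++calls[0] < 3) throw new RuntimeException("transient");
            return "fetched";
        }, 5);
        System.out.println(result + " after " + calls[0] + " attempts"); // fetched after 3 attempts
    }
}
```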

What’s next?

Anda’s sweet spot at present is definitely quick and easy development of crawlers, which usually connotes something a bit more specific than the term “connector”. That items link to other items is currently a core assumption of the Anda framework. Web pages have links to other web pages and filesystems have directories linking to other files, yielding structures that clearly require crawling.

We’re working towards enabling additional ingestion paradigms, such as iterating over result-sets (e.g. from a traditional database) instead of following links to define the traversal. Mechanisms to seed crawls in such a fashion are also under development. For now, it may make sense to develop connectors whose ingestion paradigms are less about crawling (e.g. the Slack and JDBC connectors in Fusion) using the general Fusion connectors framework. Stay tuned for future blog posts covering new methods of content ingestion and traversal in Anda.

An Anda SDK with examples and full documentation is also underway, and this blog post will be updated as soon as it’s available. Please contact Lucidworks in the meantime.

Download Fusion

Additional Reading

Fusion Anda Documentation

Planned upcoming blog posts (links will be posted when available):

Web Crawling in Fusion
The Anda-powered Fusion web crawler provides a number of options to control how web pages are crawled and indexed, control speed of crawling, etc.

Content De-duplication with Anda
De-duplication of similar content is a complex but generalizable task that we’ve tackled in the Anda framework, making it available to any crawler developed using Anda.

Anda Crawler Development Deep-dive
Writing a Fetcher is one of the two ways to provide Anda with access to your data; it’s also possible to implement an interface called FS (short for filesystem). Which one you choose will depend chiefly on whether the target datasource can be modeled in terms of a standard filesystem. If a datasource generally deals in files and directories, then writing an FS may be easier than writing a Fetcher.

The post Introducing Anda: a New Crawler Framework in Lucidworks Fusion appeared first on Lucidworks.

Open Knowledge Foundation: The 2015 Global Open Data Index is around the corner – these are the new datasets we are adding to it!

planet code4lib - Thu, 2015-08-20 14:57

After two months, 82 ideas for datasets, 386 voters, thirteen civil society organisation consultations and very active discussions on the Index forum, we have finally arrived at a consensus on which datasets will be included in the 2015 Global Open Data Index (GODI).

This year, as part of our objective to ensure that the Global Open Data Index is more than a simple measurement tool, we started a discussion with the open data community and our partners in civil society to help us determine which datasets are of high social and democratic value and should be assessed in the 2015 Index. We believe that by making the choice of datasets a collaborative decision, we will be able to raise awareness of and start a conversation around the datasets required for the Index to truly become a civil society audit of the open data revolution. The process included a global survey, a civil society consultation and a forum discussion (read more in a previous blog post about the process).

The community had some wonderful suggestions, making deciding on fifteen datasets no easy task. To narrow down the selection, we started by eliminating the datasets that were not suitable for global analysis. For example, some datasets are collected at the city level and therefore cannot easily be compared at a national level. Secondly, we looked to see if there was a global standard that would allow us to easily compare between countries (such as UN requirements for countries, etc.). Finally, we tried to find a balance between financial datasets, environmental datasets, geographical datasets and datasets pertaining to the quality of public services. We consulted with experts from different fields and refined our definitions before finally choosing the following datasets:

  1. Government procurement data (past and present tenders) – This dataset is crucial for monitoring government contracts be it to expose corruption or to ensure the efficient use of public funds. Furthermore, when combined with budget and spending data, contracting data helps to provide a full and coherent picture of public finance. We will be looking at both tenders and awards.
  2. Water quality – Water is life and it belongs to all of us. Since water is a basic building block of society, access to data on drinking water may assist us not only in monitoring safe drinking water but also in helping to provide it everywhere.
  3. Weather forecast – Weather forecast data is not only one of the most commonly used datasets in mobile and web applications, it is also of fundamental importance for agriculture and disaster relief. Having both weather predictions and historical weather data helps not only to improve quality of life, but to monitor climate change. As such, through the Index, we will measure whether governments openly publish both data on the five-day forecast and historical figures.
  4. Land ownership – Land ownership data can help citizens understand urban planning and development, as well as assisting in legal disputes over land. To assess this category, we are using national cadastres, maps showing the land registry.
  5. Health performance data – While this was one of the most popular datasets requested during the consultation, it was challenging to define what would be the best dataset(s) to assess health performance (see the forum discussion). We decided to use this category as an opportunity to test ideas about what to evaluate. After numerous discussions and debates, we decided that this year we would use the following as proxy indicators of health performance:
      Location of public hospitals and clinics.
      Data on infectious diseases rates in a country.
    That being said, we are actively seeking and would greatly appreciate your feedback! Please use the country-level comment section to suggest any other datasets you encounter that might also be a good measure of health performance (for example, from number of beds to budgets). This feedback will help us learn and define this data category even better for next year’s Index.



In addition to the new datasets, we refined the definitions of some of the existing datasets, using our new dataset definition guidelines. These were written both to produce a more accurate measurement and to create more clarity about what we are looking for with each dataset. The guidelines specify at least three key data characteristics for each dataset, define how often each dataset needs to be updated in order to be considered timely, and suggest the level of aggregation acceptable for each dataset. The following datasets were changed in order to meet the guidelines:

Election results – Data should be reported at the polling station level to allow civil society to better monitor election results and uncover false reporting. In addition, we added indicators such as the number of registered voters, number of invalid votes and number of spoiled ballots.

National map – In addition to the scale of 1:250,000, we added features such as markings of national roads, national borders, streams, rivers, lakes and mountains.

Pollutant emissions – We defined the specific pollutants that should be included in the datasets.

National Statistics – GDP, unemployment and population have been selected as the indicators that must be reported.

Public Transport – We refined the definition so that it examines only national-level services (as opposed to intra-city ones). We are also not looking for real-time data, but timetables.

Location datasets (previously Postcodes) – Postcode data is incredibly valuable for all kinds of business and civic activity; however, 60 countries in the world do not have a postcode system and as such, this dataset has been problematic in the past. For these countries, we have suggested examining a different dataset, administrative boundaries. While it is not as specific as postcodes, administrative boundaries can help to enrich different datasets and create better geographical analysis.

Adding datasets and changing definitions has been part of ongoing iterations and improvements that we have done to the Index this year. While it has been a challenge, we are hoping that these improvements help to create a more fair and accurate assessment of open data progress globally. Your feedback plays an essential role in shaping and improving the Index going forward, please do share it with us.

The full descriptions of this year’s datasets can be found here.

Galen Charlton: Evergreen 2.9: now with fewer zombies

planet code4lib - Thu, 2015-08-20 01:57

While looking to see what made it into the upcoming 2.9 beta release of Evergreen, I had a suspicion that something unprecedented had happened. I ran some numbers, and it turns out I was right.

Evergreen 2.9 will feature fewer zombies.

Considering that I’m sitting in a hotel room taking a break from Sasquan, the 2015 World Science Fiction Convention, zombies may be an appropriate theme.

But to put it more mundanely, and to reveal the unprecedented bit: more files were deleted in the course of developing Evergreen 2.9 (as compared to the previous stable version) than entirely new files were added.

To reiterate: Evergreen 2.9 will ship with fewer files, even though it includes numerous improvements, including a big chunk of the cataloging section of the web staff client.

Here’s a table counting the number of new files, deleted files, and files that were renamed or moved from the last release in a stable series to the first release in the next series.

Between release…  …and release  Entirely new files  Files deleted  Files renamed
rel_1_6_2_3       rel_2_0_0     1159                75             145
rel_2_0_12        rel_2_1_0     201                 75             176
rel_2_1_6         rel_2_2_0     519                 61             120
rel_2_2_9         rel_2_3_0     215                 137            2
rel_2_3_12        rel_2_4_0     125                 30             8
rel_2_4_6         rel_2_5_0     143                 14             1
rel_2_5_9         rel_2_6_0     83                  31             4
rel_2_6_7         rel_2_7_0     239                 51             4
rel_2_7_7         rel_2_8_0     84                  30             15
rel_2_8_2         master        99                  277            0

The counts were made using git diff --summary --find-renames FROM..TO | awk '{print $1}' | sort | uniq -c and ignoring file mode changes. For example, to get the counts between release 2.8.2 and the master branch as of this post, I did:

    $ git diff --summary --find-renames origin/tags/rel_2_8_2..master | awk '{print $1}' | sort | uniq -c
         99 create
        277 delete
          1 mode

Why am I so excited about this? It means that we’ve made significant progress in getting rid of old code that used to serve a purpose, but no longer does. Dead code may not seem so bad — it just sits there, right? — but like a zombie, it has a way of going after developers’ brains. Want to add a feature or fix a bug? Zombies in the code base can sometimes look like they’re still alive — but time spent fixing bugs in dead code is, of course, wasted. For that matter, time spent double-checking whether a section of code is a zombie or not is time wasted.

Best for the zombies to go away — and kudos to Bill Erickson, Jeff Godin, and Jason Stephenson in particular for removing the remnants of Craftsman, script-based circulation rules, and JSPac from Evergreen 2.9.

DuraSpace News: UPDATE: SHARE Research Information Systems Task Group

planet code4lib - Thu, 2015-08-20 00:00

Winchester, MA  The SHARE Research Information Systems Task Group led by DuraSpace CEO Debra Hanken Kurtz will write a brief white paper that surfaces key considerations concerning the quality and completeness of research activity administrative data.

SearchHub: If They Can’t Find It, They Can’t Buy It

planet code4lib - Wed, 2015-08-19 20:13
Sarath Jarugula, VP Partners & Alliances at Lucidworks has a blog post up on IBM’s blog, If They Can’t Find It, They Can’t Buy It: How to Combine Traditional Knowledge with Modern Technical Advances to Drive a Better Commerce Experience: “Search is at the heart of every ecommerce experience. Yet most ecommerce vendors fail to deliver the right user experience. Getting support for the most common types of search queries can be a challenge for even the largest online retailers. Let’s take a look at how traditional online commerce and retail is being transformed by technology advances across search, machine learning, and analytics.” Read the full post on IBM’s blog. Join us for our upcoming webinar Increase Conversion With Better Search.

The post If They Can’t Find It, They Can’t Buy It appeared first on Lucidworks.

FOSS4Lib Upcoming Events: Fedora 4 Workshop at eResearch Australasia

planet code4lib - Wed, 2015-08-19 18:45
Date: Friday, October 23, 2015 - 08:00 to 17:00
Supports: Fedora Repository

Last updated August 19, 2015. Created by Peter Murray on August 19, 2015.

From the announcement:

Harvard Library Innovation Lab: Link roundup August 19, 2015

planet code4lib - Wed, 2015-08-19 18:11

We found some cool stuff you might like.

Michael Itkoff :: How To

Vintage exercise how-to GIFs – mesmerizing

Delight Your Inner Kid With This Giant Lite-Brite | Mental Floss

A really big Lite-Bright

Locking the Web Open: A Call for a Distributed Web

All the pieces are in place for a better web. Let’s build it.

Looking for a Breakthrough? Study Says to Make Time for Tedium

“Moving innovation forward requires effort and time not directly related to the idea itself”

Kodak’s First Digital Moment

Tools, like cameras, are built by linking together complex chains of logic.

Roy Tennant: Where Your Favorite Programming Language Ranks

planet code4lib - Wed, 2015-08-19 16:04

Every programmer knows that any time you want to start a religious war just ask everyone’s favorite programming language and why. This will almost certainly touch off an ever-more-heated exchange as to why one’s particular choice should be every thinking person’s obvious selection. It may even devolve so far as to include name calling. But hey, we’re all friends here so no need to be nasty about our favorite tools.

And why use opinion when we have actual usage data? In this case, the popularity of various languages on Stack Overflow, a popular programming discussion site, and GitHub, the now “go to” code repository. Thus we have the twin lenses of what people say they do and what they actually do.

The result isn’t really all that surprising overall. All the usual suspects appear at the top of the scattergram: Java, Javascript, PHP, Python, flavors of C, Ruby, etc. But I have to say that I’m gratified that Perl, my classic tool (I’m now also dabbling in Python) is still fairly popular.

Why CSS (Cascading Style Sheets) and XML are there is anyone’s guess, as the last time I checked they weren’t programming languages (XSLT justifiably appears). But whatever.

Check out your favorites and see where they fall on the curve.


Note: Thanks to Lorcan Dempsey for pointing this out.

DPLA: Unexpected: Animals do the most amazing things

planet code4lib - Wed, 2015-08-19 15:30

We’ve always had a strange relationship with animals. Some are beloved family members; others we farm, hunt, and fish; and some leave us awestruck by their natural beauty and power. Whatever we think of them, we love to photograph them. And that’s been the case since cameras started to capture their likenesses in the 19th century.

Dogs hold a particular place in our hearts. These sled dogs from the 1870s were part of the winter mail line near Lake Superior.

Gems of Lake Superior scenery, No. 95 [Stereograph], ca. 1870s. Childs, B. F. (Brainard F.) (ca. 1841-1921). Courtesy of The New York Public Library.

Dogs are especially photogenic when they are doing tricks. Especially when they carry kittens, children, and tiny cans of dog food in their carts.

St. Bernard Lodge, P.O. Mill Creek, California, 1946. Eastman, Jervie Henry. Courtesy of the University of California, Davis, Library via the California Digital Library.


Publicity at Hollywood dog training school, Southern California, 1935. Courtesy of the University of Southern California Libraries.


Apparently, harnessing and riding animals of all sorts was, in the early era of photography, an American pastime.

Cawston Ostrich Farm Postcard: Anna Held Riding an Ostrich. Courtesy of the South Pasadena Public Library via the California Digital Library.


Frank Buck’s Jungleland from the New York World’s Fair (1939-1940). Courtesy of The New York Public Library.


Boy riding catfish, 1941. Douglass, Neal. Courtesy of the Austin History Center, Austin Public Library via The Portal to Texas History.


A Young Girl in a Goat-Drawn Wagon, 1926. Courtesy of the Private Collection of T. Bradford Willis via The Portal to Texas History.


Children Riding a Deer-Drawn Wagon. Courtesy of the Private Collection of T. Bradford Willis via The Portal to Texas History.


Photographs of animals riding other animals warm our hearts, too.

Horse & dog pals – winter time. Copyright (c) Leslie Jones. This work is licensed for use under a Creative Commons Attribution Non-Commercial No Derivatives License (CC BY-NC-ND). Courtesy of the Boston Public Library via Digital Commonwealth.


Monkey riding a goat, 1935. Copyright (c) Leslie Jones. This work is licensed for use under a Creative Commons Attribution Non-Commercial No Derivatives License (CC BY-NC-ND). Courtesy of the Boston Public Library via Digital Commonwealth.


And finally, this donkey on wheels just leaves us speechless.

Charles “Chick” Hoover and his roller skating donkey, Pinky, in Banning, California, ca. 1958. Courtesy of the Banning Library District via the California Digital Library.

LITA: Attend the 2015 LITA Forum

planet code4lib - Wed, 2015-08-19 12:00

Don’t Miss the 2015 LITA Forum
Minneapolis, MN
November 12-15, 2015

Registration is Now Open!

Join us in Minneapolis, Minnesota, at the Hyatt Regency Minneapolis for the 2015 LITA Forum, a three-day education and networking event featuring 2 preconferences, 3 keynote sessions, more than 55 concurrent sessions and 15 poster presentations. This year including content and planning collaboration with LLAMA. It’s the 18th annual gathering of the highly regarded LITA Forum for technology-minded information professionals. Meet with your colleagues involved in new and leading edge technologies in the library and information technology field. Registration is limited in order to preserve the important networking advantages of a smaller conference. Attendees take advantage of the informal Friday evening reception, networking dinners and other social opportunities to get to know colleagues and speakers.

Keynote Speakers:

  • Mx A. Matienzo, Director of Technology for the Digital Public Library of America
  • Carson Block, Carson Block Consulting Inc.
  • Lisa Welchman, President of Digital Governance Solutions at ActiveStandards.

The Preconference Workshops:

  • So You Want to Make a Makerspace: Strategic Leadership to support the Integration of new and disruptive technologies into Libraries: Practical Tips, Tricks, Strategies, and Solutions for bringing making, fabrication and content creation to your library.
  • Beyond Web Page Analytics: Using Google tools to assess searcher behavior across web properties.

Comments from past attendees:

“Best conference I’ve been to in terms of practical, usable ideas that I can implement at my library.”
“I get so inspired by the presentations and conversations with colleagues who are dealing with the same sorts of issues that I am.”
“After LITA I return to my institution excited to implement solutions I find here.”
“This is always the most informative conference! It inspires me to develop new programs and plan initiatives.”

Forum Sponsors:

EBSCO, Ex Libris, Optimal Workshop, OCLC, Innovative, BiblioCommons, Springshare, A Book Apart and Rosenfeld Media.

Get all the details, register and book a hotel room at the 2015 Forum Web site.

See you in Minneapolis.

William Denton: Music, Code and Data: Hackfest and Happening at Access 2015

planet code4lib - Wed, 2015-08-19 01:09

Access is the annual Canadian conference about libraries and technology. The 2015 conference is in Toronto (the program looks great). As usual at Access, before the conference starts there’s a one-day hackfest. Katie Legere and I are running a special hackfest about music and sonification, to be followed by a concert after the hackfest is over. It’s a hackfest! It’s a happening! It’s code and data and music! Full details: Music, Code and Data: Hackfest and Happening at Access 2015.
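If you're wondering what sonification looks like in practice, here is a minimal sketch (purely illustrative, not the hackfest's actual toolchain): each data point is mapped to a sine-tone pitch and the result is written out as a WAV file using only the Python standard library.

```python
import math
import struct
import wave

def sonify(values, path="sonified.wav", rate=44100, note_len=0.25):
    """Map each data point to a sine-tone pitch and write a mono 16-bit WAV."""
    lo, hi = min(values), max(values)
    with wave.open(path, "w") as w:
        w.setnchannels(1)
        w.setsampwidth(2)   # 16-bit samples
        w.setframerate(rate)
        for v in values:
            # Scale the value onto a two-octave range above A3 (220 Hz).
            frac = (v - lo) / (hi - lo) if hi != lo else 0.5
            freq = 220 * 2 ** (2 * frac)
            n = int(rate * note_len)
            w.writeframes(b"".join(
                struct.pack("<h", int(32767 * 0.4 * math.sin(2 * math.pi * freq * i / rate)))
                for i in range(n)))

# Eight digits of pi become an eight-note melody.
sonify([3, 1, 4, 1, 5, 9, 2, 6])
```

Real sonification work usually involves more musical mappings (scales, duration, timbre), but the core idea — data in, pitch out — is this simple.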

DuraSpace News: REGISTER: Fedora 4 Workshop at eResearch Australasia in October

planet code4lib - Wed, 2015-08-19 00:00

Winchester, MA  A one-day Fedora 4 Training Workshop will be held on October 23, 2015 in Brisbane, Queensland, Australia. The event coincides with the eResearch Australasia Conference and will take place in the same venue–the Brisbane Convention & Exhibition Centre. The workshop is being generously subsidized by the University of New South Wales (UNSW) so the cost for attending is only $80AUD. Register here.

Jonathan Rochkind: “Registered clinical trials make positive findings vanish”

planet code4lib - Tue, 2015-08-18 18:06

Via: Registered clinical trials make positive findings vanish

The launch of the registry in 2000 seems to have had a striking impact on reported trial results, according to a PLoS ONE study that many researchers have been talking about online in the past week.

A 1997 US law mandated the registry’s creation, requiring researchers from 2000 to record their trial methods and outcome measures before collecting data. The study found that in a sample of 55 large trials testing heart-disease treatments, 57% of those published before 2000 reported positive effects from the treatments. But that figure plunged to just 8% in studies that were conducted after 2000….

…Irvin says that by having to state their methods and measurements before starting their trial, researchers cannot then cherry-pick data to find an effect once the study is over. “It’s more difficult for investigators to selectively report some outcomes and exclude others,” she says….

“Loose scientific methods are leading to a massive false positive bias in the literature,”
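The mechanism Irvin describes is easy to simulate: if researchers can choose the best-looking of many endpoints after the fact, "positive" findings appear far more often than the nominal 5% false-positive rate. Here is a toy simulation (illustrative numbers only, not the study's data) of null trials with and without post hoc outcome selection.

```python
import random

def trial_with_cherry_picking(n_outcomes, alpha=0.05, sims=20_000, seed=1):
    """Fraction of null trials reporting a 'positive' result when the
    best of n_outcomes independent endpoints can be chosen post hoc."""
    rng = random.Random(seed)
    hits = sum(any(rng.random() < alpha for _ in range(n_outcomes))
               for _ in range(sims))
    return hits / sims

print(trial_with_cherry_picking(1))   # preregistered single endpoint: ~0.05
print(trial_with_cherry_picking(20))  # pick the best of 20: ~0.64
```

With twenty candidate endpoints and no preregistration, roughly two-thirds of null trials can report a nominally significant result — which is exactly the selective-reporting loophole the registry closes.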

Filed under: General

David Rosenthal: Progress in solid-state memories

planet code4lib - Tue, 2015-08-18 15:00
Last week's Storage Valley Supper Club provided an update on developments in solid state memories.

First, the incumbent technology, planar flash, has reached the end of its development path at the 15nm generation. Planar flash will continue to be the majority of flash bits shipped through 2018, but the current generation is the last.

Second, all the major flash manufacturers are now shipping 3D flash, the replacement for planar. Stacking the cells vertically provides much greater density; the cost is a much more complex manufacturing process and, at least until the process is refined, much lower yields. This has led to much skepticism about the economics of 3D flash, but it turns out that the picture isn't as bad as it appeared. The reason is, in a sense, depressing.

It is always important to remember that, at bottom, digital storage media are analog. Because 3D flash is much denser, there are a lot more cells; because of the complexity of the manufacturing process, the quality of each cell is much worse. But because there are many more cells, the impact of the worse quality is reduced. More flash controller intelligence adapting to the poor quality or even non-functionality of individual cells, and more of the cells devoted to error correction, mean that 3D flash can survive lower yields of fully functional cells.
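The trade-off between cell quality and error-correction overhead can be sketched with a toy binomial model. The numbers below are purely illustrative (not actual flash parameters): even if the raw cell error rate is ten times worse, dedicating more of each codeword to correction can leave the post-ECC failure rate lower than before.

```python
from math import comb

def codeword_failure_prob(n, t, p):
    """Probability that more than t of n cells are in error, i.e. an
    ECC correcting up to t errors per n-cell codeword fails."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range(t + 1, n + 1))

# Hypothetical numbers: denser cells are 10x less reliable, but the
# controller corrects up to 40 errors per codeword instead of 8.
planar = codeword_failure_prob(n=1024, t=8, p=1e-4)
dense = codeword_failure_prob(n=1024, t=40, p=1e-3)
assert dense < planar  # stronger ECC more than compensates
```

The controller pays for this in parity cells and decoding effort, which is part of why denser flash does not translate directly into proportionally cheaper flash.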

The advent of 3D means that flash prices, which had stabilized, will resume their gradual decrease. But anyone hoping that 3D will cause a massive drop will be disappointed.

Third, the post-flash solid state technologies such as Phase Change Memory (PCM) are increasingly real but, as expected, they are aiming at the expensive, high-performance end of the market. HGST has demonstrated a:

"PCM SSD with less than two microseconds round-trip access latency for 512B reads, and throughput exceeding 3.5 GB/s for 2KB block sizes."

which, despite the near-DRAM performance, draws very little power.

But the big announcement was Intel/Micron's 3D XPoint. They are very cagey about the details, but it is a resistive memory technology with 1000 times the speed of NAND, 1000 times its endurance, and 100 times its density. They see the technology initially being deployed, as shown in the graph, as an ultra-fast but non-volatile layer between DRAM and flash, but it clearly has greater potential once it gets down the price curve.

FOSS4Lib Upcoming Events: Open Source with OPF: JHOVE Stewardship

planet code4lib - Tue, 2015-08-18 13:22
Date: Wednesday, August 26, 2015 - 09:00 to 10:00
Supports: JHOVE

Last updated August 18, 2015. Created by Peter Murray on August 18, 2015.

From the announcement:

During March and April 2015 the OPF assumed stewardship of JHOVE after the existing maintainer, Gary McGath, expressed his wish to step down. The OPF’s initial aims were to take ownership of the JHOVE resources and establish a sustainable home for the project on GitHub. Following this, we’ve updated the build, testing and distribution process for the project.

LITA: Interacting with patrons through their mobile devices

planet code4lib - Tue, 2015-08-18 13:00

Mobile technologies, specifically smartphones, have become a peripheral appendage to our everyday experience. We often see individuals, oblivious to their surroundings, giving dedicated attention to their mobile devices. This behavior is often viewed in a negative light; however, given the level of global media engagement these devices make possible, it can be hard to blame them. The ability to participate in social media, send quick messages to friends, listen to music, watch videos, surf the web, fact-check information, or even read a great book is all right in your hand.

When attempting to interact with patrons through technology, leveraging their familiarity with their own mobile devices can help achieve a more positive experience. This is when “Let’s build an app” often becomes the refrain. Although that is a great idea, app development is a complex process, and there are a number of ways to achieve interactive experiences without building a new mobile application.

Over the course of the next several blog posts, I will be discussing various methods of interacting with patrons’ mobile devices to enhance their experiences through the use of QR codes, NFC (Near Field Communication) tags, and BLE (Bluetooth Low Energy) beacons. Each of these technologies allows for a different experience and has areas where it excels and falters, but when incorporated appropriately they can together create a comprehensive interactive experience that enhances information seeking.
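As a small taste of what these technologies involve under the hood, here is a rough sketch of how a BLE beacon broadcasts a URL: packing it into an Eddystone-URL advertisement frame, which compresses common scheme prefixes and domain suffixes into single bytes so the URL fits in a tiny radio packet. The function name, the abbreviated code tables, and the default TX-power byte below are illustrative; consult the Eddystone-URL specification for the full encoding.

```python
# Single-byte codes for common URL prefixes and suffixes (subset of the
# Eddystone-URL tables; with-slash suffixes are listed first so they
# match before their slashless counterparts).
SCHEMES = {"https://www.": 0x01, "http://www.": 0x00,
           "https://": 0x03, "http://": 0x02}
SUFFIXES = {".com/": 0x00, ".org/": 0x01, ".edu/": 0x02, ".net/": 0x03,
            ".info/": 0x04, ".biz/": 0x05, ".gov/": 0x06,
            ".com": 0x07, ".org": 0x08, ".edu": 0x09, ".net": 0x0A,
            ".info": 0x0B, ".biz": 0x0C, ".gov": 0x0D}

def eddystone_url_frame(url, tx_power=0xBA):
    """Build the service-data bytes for an Eddystone-URL frame."""
    for prefix, code in sorted(SCHEMES.items(), key=lambda kv: -len(kv[0])):
        if url.startswith(prefix):
            body = url[len(prefix):]
            out = bytearray([0x10, tx_power, code])  # frame type, TX power, scheme
            break
    else:
        raise ValueError("URL scheme not supported by Eddystone-URL")
    while body:
        for suffix, code in SUFFIXES.items():
            if body.startswith(suffix):
                out.append(code)
                body = body[len(suffix):]
                break
        else:
            out.append(ord(body[0]))
            body = body[1:]
    if len(out) > 20:
        raise ValueError("encoded URL too long for one frame")
    return bytes(out)

frame = eddystone_url_frame("https://www.example.org/")
print(frame.hex())
```

A phone in range decodes the frame back into a full URL and can surface it as a notification — no app install required, which is precisely what makes beacons attractive for library wayfinding and exhibits.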

SearchHub: Solr Developer Survey 2015

planet code4lib - Mon, 2015-08-17 20:09
Every day, we hear from organizations looking to hire Solr talent. Recruiters want to know how to find and hire the right developers and engineers, and how to compensate them accordingly. Lucidworks is conducting our annual global survey of Solr professionals to better understand how engineers and developers at all levels of experience can take advantage of the growth of the Solr ecosystem – and how they are using Solr to build amazing search applications.

This survey will take about 2 minutes to complete. Responses are anonymized and confidential. Once our survey and research is completed, we’ll share the results with you and the Solr community.

As a thank you for your participation, you’ll be entered in a drawing to win one of our blue SOLR t-shirts plus copies of the popular books Taming Text and Solr in Action. Be sure to include your t-shirt size in the questionnaire.

We’d appreciate your input by Wednesday, Sept 9th. Click here to take the survey. Thanks so much for your participation!

The post Solr Developer Survey 2015 appeared first on Lucidworks.


Subscribe to code4lib aggregator