Planet Code4Lib - http://planet.code4lib.org

D-Lib: Tools for Discovering and Archiving the Mobile Web

Mon, 2015-03-16 11:14
Article by Frank McCown, Monica Yarbrough and Keith Enlow, Harding University

D-Lib: Trustworthiness: Self-assessment of an Institutional Repository against ISO 16363-2012

Mon, 2015-03-16 11:14
Article by Bernadette Houghton, Deakin University, Geelong, Australia

D-Lib: Digital Library Research in Action: Supporting Information Retrieval in Sowiport

Mon, 2015-03-16 11:14
Article by Daniel Hienert, Frank Sawitzki and Philipp Mayr, GESIS, Leibniz Institute for the Social Sciences, Germany

D-Lib: The Practice of Digital Libraries

Mon, 2015-03-16 11:14
Editorial by Laurence Lannom, CNRI

D-Lib: Managing Digital Collections Survey Results

Mon, 2015-03-16 11:14
Article by Liz Bishoff, The Bishoff Group, and Carissa Smith, DuraSpace

D-Lib: In Brief: DocSouth Data

Mon, 2015-03-16 11:14

D-Lib: In Brief: DigitalLearn.org

Mon, 2015-03-16 11:14

Hydra Project: OR2015 NEWS: Registration Opens; Speakers from Mozilla and Google Announced

Mon, 2015-03-16 09:39

Of interest to many in the Hydra Community:

We are pleased to announce that registration is now open for the 10th International Conference on Open Repositories, to be held on June 8-11, 2015 in Indianapolis, Indiana, United States of America. Full registration details and a link to the registration form may be found at: http://www.or2015.net/registration

OR2015 is co-hosted by Indiana University Bloomington Libraries, University of Illinois at Urbana-Champaign Library, and Virginia Tech Libraries.

OR2015 Registration and Fees:

An early registration fee of $450 USD will be available until May 8. After May 8, the registration fee will increase to $500 USD. This registration fee covers participation in general conference sessions, workshops, and interest group sessions, as well as the conference dinner on Wednesday, June 10 and poster reception on Tuesday, June 9. For a draft outline of the conference schedule, please see: http://www.or2015.net/program/schedule-at-a-glance

Participants may register online at: http://www.or2015.net/registration. If you have any questions about registering for OR2015, please contact the Conference Registrar at iuconfs@indiana.edu. Any other questions about the conference may be directed to the conference organizing committee by using the form at: http://www.or2015.net/contact-us

Hotel Reservations:

The OR2015 conference will take place at the Hyatt Regency Indianapolis hotel, conveniently located in the heart of downtown Indianapolis. Special room rates at the Hyatt starting at $159 USD per night have been negotiated for conference attendees and will be available for booking through May 16. More information on hotel reservations and travel is available at: http://www.or2015.net/conference-hotel-and-travel

Keynote and Featured Speakers:

Reflecting the significant milestone of the 10th Open Repositories conference and this year’s theme of “Looking Back, Moving Forward: Open Repositories at the Crossroads,” we are pleased to announce the conference’s two plenary speakers:

Kaitlin Thaney will be giving the opening keynote talk on the morning of Tuesday, June 9. Kaitlin is director of the Mozilla Science Lab, an open science initiative of the Mozilla Foundation focused on innovation, best practice and skills training for research. Prior to Mozilla, she served as the Manager of External Partnerships at Digital Science, a technology company that works to make research more efficient through better use of technology. Kaitlin also advises the UK government on infrastructure for data intensive science and business, serves as a Director for DataKind UK, and is the founding co-chair for the Strata Conference series in London on big data. Prior to Mozilla and Digital Science, Kaitlin managed the science program at Creative Commons, worked with MIT and Microsoft, and wrote for the Boston Globe. You can learn more about the Science Lab at http://mozillascience.org and follow Kaitlin online at @kaythaney.

Anurag Acharya will be the featured speaker at the plenary session on the morning of Wednesday, June 10, presenting on “Indexing repositories: pitfalls and best practices.” Anurag is a Distinguished Engineer at Google and creator of Google Scholar, and he previously led the indexing group at Google. He has a Bachelors in Computer Science from the Indian Institute of Technology, Kharagpur and a PhD in Computer Science from Carnegie Mellon. Prior to joining Google, he was a post-doctoral researcher at the University of Maryland, College Park and an assistant professor at the University of California, Santa Barbara.

We look forward to seeing you at OR2015!

Jon Dunn, Julie Speer, and Sarah Shreeves
OR2015 Conference Organizing Committee

Holly Mercer, William Nixon, and Imma Subirats
OR2015 Program Co-Chairs

Galen Charlton: Henriette Avram versus the world: Is COBOL capable of processing MARC?

Mon, 2015-03-16 02:17

Is the COBOL programming language capable of processing MARC records?

A computer programmer in 2015 could be excused for thinking to herself, what kind of question is that!?! Surely it’s obvious that any programming language capable of receiving input can parse a simple, antique record format?

In 1968, it apparently wasn’t so obvious. I turned up an article by Henriette Avram and a colleague, MARC II and COBOL, that was evidently written in response to a review article by a Hillis Griffin where he stated

Users will require programmers skilled in languages other than FORTRAN or COBOL to take advantage of MARC records.

Avram responded to Griffin’s concern in the most direct way possible: by describing COBOL programs developed by the Library of Congress to process MARC records and generate printed catalogs. Her article even includes source code, in case there were any remaining doubts!

I haven’t yet turned up any evidence that Henriette Avram and Grace Hopper ever met, but it was nice to find a close, albeit indirect connection between the two of them via COBOL.

Is the debate between Avram and Griffin in 1968 regarding COBOL and MARC anything more than a curiosity? I think it is: many of the discussions she participated in are reminiscent of debates that are taking place now. To be fair to Griffin, I don’t know enough about the computing environment of the late sixties to be able to say definitively that his statement was patently ill-informed at the time. But given that by 1962 IBM had announced that they were standardizing on COBOL, it seems hardly surprising that Avram and her group would be writing MARC processing code in COBOL on an IBM/360 by 1968. To me, the concerns that Griffin raised seem on par with objections to Library Linked Data that assume that each library catalog request would necessarily mean firing off a dozen requests to RDF providers: objections that have rejoinders that are obvious to programmers, but perhaps not so obvious to others.

Plus ça change, plus c’est la même chose?

FOSS4Lib Recent Releases: Avalon Media System - 3.3

Mon, 2015-03-16 01:12
Package: Avalon Media System
Release Date: Tuesday, March 10, 2015

Last updated March 15, 2015. Created by Peter Murray on March 15, 2015.

Indiana University and Northwestern University are pleased to announce Avalon Media System 3.3. Release 3.3 adds the following capabilities:

  • MARC Metadata Import
  • Ingestion of pre-transcoded derivatives with multiple quality levels 
  • Script for recovering disk space taken up by temporary Matterhorn files
  • UI Improvements and Bug fixes

Users of Avalon 3.2 can take advantage of these new features by upgrading from Avalon 3.2 to Avalon 3.3.

FOSS4Lib Updated Packages: MediaSCORE

Sun, 2015-03-15 23:41

Last updated March 15, 2015. Created by Peter Murray on March 15, 2015.

MediaSCORE (Media Selection: Condition, Obsolescence, and Risk Evaluation) enables a detailed analysis of degradation and obsolescence risk factors for most analog and physical digital audio and video formats.

MediaRIVERS (Media Research and Instructional Value Evaluation and Ranking System) guides a structured assessment of research and instructional value for media holdings.

Some additional key features of the software include:

  • Browser-based web application that works on Windows and Mac operating systems with all popular browsers.
  • Enables teams to enter and edit data simultaneously.
  • Permissions-based access and views across MediaSCORE and MediaRIVERS.
  • Controlled vocabularies and field validation to help ensure consistent data entry.
  • Provides an auditing path to help with quality assurance and transparency.

The two applications are bundled together but may be used separately. They can be found along with a detailed user guide on GitHub at https://github.com/IUMDPI/MediaSCORE . Also available is a conceptual document that explores assessment of research and instructional value.

The software requires installation and configuration on a server, requiring the appropriate expertise. AVPreserve is also offering MediaSCORE/RIVERS as a hosted application on a monthly subscription basis.

Package Type: Data Preservation and Management
License: Apache 2.0
Development Status: In Development
Operating System: Browser/Cross-Platform
Programming Language: PHP
Open Hub Link: https://www.openhub.net/p/MediaSCORE

Jenny Rose Halperin: Productivity rules

Sun, 2015-03-15 21:44

Last month I blogged for Safari about successfully changing fields, being fearless, and improving yourself through reading and learning. My blog post received a wonderful response, and I am proud to share that this month I begin my official new position as Safari’s Customer Success Manager. I am staying with my team and am super excited to make our product more useful to both our new and existing customers. I’ll be blogging intermittently about what I’m doing, learning, and making with Safari.

While I believe what I wrote in that post, I’ve felt a bit hypocritical because it’s come to my attention throughout this long, dark winter that

I waste a lot of time.

So much time! From time spent writing emotive letters that I never send to time reading the Wikitravel details of places I want to visit. I sleep late on weekends, occasionally drink too many glasses of wine on weeknights, often eat way more than an allotted portion while distractedly checking my phone during dinner, and spend hours looking at pairs of black pants on the Internet that I will never buy. (I love black pants, particularly loose, comfortable ones. Let this link be a hint to anyone who ever wants to buy me a present.) I believe strongly that there is a healthy balance between time-wasting and productivity, and I am afraid that this winter I crossed my own line and need to work on getting myself back to my center.

I’ve always been an over-achieving time waster; I’m the kind of person who knows all the details of Madonna’s Wikipedia page and still somehow finds the time to do all the things. I manage to consistently find the time for birthday parties, lazy afternoons, potlucks, puppet shows, and performing while always submitting applications, papers, and my taxes on time. I have always volunteered with my community, whether gardening or teaching or manning a booth, and I try to be there both in time and spirit for my friends. I am a master of very little and a generalist who can do a lot of things adequately, including playing music, speaking German and Spanish, and holding intelligent conversation on about a million topics. My lack of focus is what drew me to the interdisciplinarity of American Studies and later Library Science, but

because I am okay at a lot of things, I have often felt like I am not good at anything.

My lack of mastery augments an incredible social knowledge that makes me great at cocktail parties, but not so great at specialized skills, particularly those that I have tried and failed to learn repeatedly like drawing or programming computers.

Lounging around and wasting time makes me stressed, and yet I find myself in Wikipedia holes, on Buzzfeed lists, mindlessly thumbing through Instagram, and Googling ex-boyfriends more than I would like to admit. I have an addictive information-seeking brain, and the Internet has been both an asset and a curse for me as I find myself up late, watching the bar below my apartment close, absorbing both everything and nothing at once. (Pro-tip for other addictive minds: Never begin a television program with a seemingly unlimited number of episodes at 9PM on a week night. You will regret it.)

The Internet has made it easier to live vicariously through others, which is another double-edged sword that often makes life feel more complicated than it actually is. All my friends, professional contacts, and the celebrities who interest me seem to be living fulfilled lives, so I submit to the worst kind of voyeurism, one that’s tinged with envy and the feeling that this life could be mine if I were only more “_______.” This kind of time wasting makes me want to delete all my Internet history, take a shower, and maybe smash my phone against a wall. Even admitting that I do it in a public manner makes me feel slightly uncomfortable, but I think it’s important to recognize this is a human byproduct of the Internet age.

I’m not taking the capitalist tack that says all time has to be productive, self-improvement time, and one only has to read a Romantic novel to realize that people probably were not more productive in “olden days.” (I wonder how much time a Jane Austen heroine spent staring at the wall?) Instead of judging or feeling shame (both feelings that society unfortunately encourages), I want to practice weaning myself off behaviors that don’t make me feel like my best self, and I hope that others feel inspired to make similar changes for their health and the health of their communities.

In order to kick off this process, I did what I do best, and what I do to make most of my decisions: I made a chart.

I titled the page:

“Be more productive. Overcome winter blues. Get moving.”

The chart’s four cardinal directions pointed to:

  • Have to do
  • Want to do
  • Do less of
  • Do more of

I brainstormed for about 25 minutes and then wrote a list of the immediate tasks I needed to do within the next week in order to make these “productivity hacks” a reality (excuse the jargon).

I wasn’t sure what was going to come out of the exercise, but when I looked at the page, I was surprised to see that most of my “negative” behaviors revolved around a few distinct categories. In making the chart, I saw that “worrying about the opinion of others” came up 4 times, “relying too much on technology” came up 5 times, and “drinking less frequently” came up 2 times. (My 26-year-old hangovers are much worse than my 21-year-old hangovers!)

In contrast, doing creative work like playing music, dancing, and writing came up 7 times and giving back to my community came up 4 times on my “positive” behavior list. Being kinder to my environment, both in terms of resources and social awareness also came up frequently.

I am going to use the weeks leading up to my 27th birthday to take some steps towards doing my best work and realizing my unique talents through this exercise and others encouraged by productivity experts. I am also going to use this month to research improving productivity and share out my findings on this blog.

It’s time to focus on my creative and nurturing self and feel more alive in my body this spring. Winter has been hard on all of us Bostonians, but in adapting my behaviors to fit my goals, I am taking the first steps toward a daily practice to be my best self.

Nicole Engard: SxSW: A New Generation: Creativity and Open Source

Sun, 2015-03-15 21:24

For my final session of the day I attended an interview of Ryan Leslie by Matt Mullenweg titled “A new generation: Creativity and Open Source”.

Ryan Leslie graduated from Harvard at 19 and is at the forefront of technology and music. Ryan gave us all his cell phone number so we could text him during the session because he’s very into being open and giving back. “As creators we try to find our way in the dark – we don’t have any concrete data on who’s buying our music”. When his second album was released he wondered why the label couldn’t just email everyone who bought his first album to tell them about the new one. The problem is that the labels don’t have that info. So Ryan shares his info with his fans so that he can keep in touch with them. He used the Twilio API to create a tool that replies to the text messages he gets, asking the sender for their email address.

Ryan decided that he was going to sell records on his own instead of through a label and has made significantly more money that way. Working with a label is like taking the worst business loan ever. Instead you can use tools like Tilt, Kickstarter and other crowdfunding sites to get that loan to start up the business and publish your own records. When you’re able to connect directly with your users you can identify who your real supporters are.

Instead of spending money on a tool like Salesforce, Ryan decided to spend 40 days on Codecademy, learned Ruby on Rails, and wrote his own tool using Twilio to gather info on his audience. Before, he would just get sales reports from iTunes – once he started selling direct he got a better feel for who was listening to his music. “Everyone you know, whether they buy your album or not, can contribute to your project” – contacts as currency. “It’s beautiful when the communication can be two way”.

Ryan shared a story where he and several other musicians were all together and started coming up with a song together and collaborating on the sound. Matt said this was the story that was the most like open source – where several artists come together to collaborate and build something special together.

Ryan made a very open source comment – some of the most successful solutions come from people solving a problem they face personally. He wants software that was developed by people like him. He mentioned TopSpin, which never really worked for him because he didn’t know who wrote it – or where they came from (life experiences). It comes down to shared experiences – even moving up in the music industry is about the people you know and your relationship equity.

Matt recommended that we all read 1,000 True Fans by Kevin Kelly.

“When creators are inspired they share it” – Ryan Leslie

The post SxSW: A New Generation: Creativity and Open Source appeared first on What I Learned Today....

Related posts:

  1. SxSW: Building the Open Source Society
  2. ATO2014: Open Source Schools: More Soup, Less Nuts
  3. The open source behind Twitter

Nicole Engard: SxSW: Magical UX and the Internet of Things

Sun, 2015-03-15 18:31

This afternoon Josh Clark spoke to us about ‘Magical UX and the Internet of Things’.

A lot of what we’re seeing these days with tech interaction has come with mobile technology. Touch started it all and now we have things like voice and facial recognition. So now makers of digital products need to think about these new ways we should interact with the digital world. There is now a way for us to cast “spells” – wave our arms and something happens. We can even get our own magic wand at thewandcompany.com. Josh even showed us how his wand could be used to light candles. It’s not all novelty like that though, there are some real business and practical uses for this.

Josh showed us a video : bit.ly/grab-magic where the developer created a hack where he grabs things from his TV and puts them on his mobile phone. While it looks awesome it’s so simple with our household devices.

Magic and Technology

“Any sufficiently advanced technology is indistinguishable from magic” – Arthur C. Clarke.

For example – “no one is ever going to want to wear a computer on their body” – but now we have smart watches. Sometimes it’s what we see now that prevents us from developing things that seem like magic.

“Fantasy fulfills a need for a simpler more controllable world” – Alan Kay

We need to make technology seem like magic. Touch for example makes it seem like we’re actually touching the data. We need to use fantasy to think about how we want to interact – Alan Kay also said “One goal: the computer disappears into the environment.” An example of this is the magic wand. But we don’t have to go to Ollivander’s to find that wand – we all have one already in our smart phone. Our phone is the magic wand for everyone. The phone is the first IoT (Internet of Things) device for us all. Now we want to put more of the smarts our phone has into other things. Our phones (and other IoT devices) have “Sensors + Smarts + Connectivity”.

Think of the first time you used Shazam – that was another kind of magic – it was paying attention to the world around you to listen to music. Now we’re seeing this kind of thing with our cameras and translation apps (we used Google Translate to translate signs while in France last week). This also means that we can carry fewer things with us – we don’t carry maps, or cameras or alarm clocks – we use our phones for this. Mobile phones actually bring computing power to immobile objects because we have these phones with us all the time – locks, light bulbs, etc. We can embed smartphone brains in everything and anything. The nappy notifier is a device and app to help you manage your baby’s diapers.

On average we spend over 3 hours a day looking at our phone screen. The more connected we are the more disconnected we are – this means that those who have been designing mobile interfaces these last few years have been doing too good a job. We want to move these things off of the screen in our hands and out into the world. Neiman Marcus has this magic mirror that lets you see your entire outfit and compare it to others you have tried on.

Josh recommends that we read ‘Enchanted Objects’ by David Rose. We have many magically smart devices in our homes these days – the Roomba for example is like the broom in Fantasia or Google Now can be like the sorting hat. There is even a device that emulates the ruby slippers in the Wizard of Oz! Bt.tn is a tool that’s like the easy button for life. There’s also IFTTT that is all about magically making things happen! There’s also Zapier which is similar to IFTTT.

So the point of Magic and Technology is to make the computer invisible – we want it to be easier not harder. The magic happens at the point of inspiration – we embed the smarts in our devices around the house. We have centuries of UX ideas to pull from (Wizard of Oz and Fantasia for example).

Up till now we have been tying our digital experiences only to screens. Now we can interact with the world – the world is the interface. For example, how do we make the physical shopping experience as easy as the online shopping experience? Each environment has info that you can’t get when you’re in the other.

The world is a data source – we need tools that gather data from around us. For example the Snapshot device from Progressive Insurance. Automatic though provides info to the driver (us) to help the car talk to us. Propeller Health is a device you can add to an inhaler that will help those with asthma learn more about their disease – because they’re also used in communities the devices can gather group data and explain the environment. These devices are passive – they’re the modern crystal balls. They gather data from the physical world and push it back to the digital world.

The world is reactive – the things we do in the world cause a reaction. Our actions are a source of data! The Ares Sand Table is an example of this – this way physical and digital are totally in sync. The Minkoff Mirror is a device to help bring the digital in to the dressing room in the physical store – so now the world is interacting with the data. These are intentional interfaces.

The world is a big canvas – if you thought designing for mobile and desktop was tough – imagine designing for an entire room. For example the Immersion Room or this video: bit.ly/smart-dumb or this bit.ly/room-e. These are therables (instead of wearables) – these are smart environments – which means we can wear fewer bracelets.

The world has depth and mass – we’re used to 2 dimensional interfaces – the world is not flat! Our magic has to account for that. MIT Thaw is about this. Thaw shows us how to have these expensive devices (our phone and laptop) talk to each other better. What would be a better way to move music from your phone to your computer – why not have the phone and computer work together better : bit.ly/happy-together-app.

So we want to gather data for insight, channel intention in to action, use the whole big canvas of the world – make the world smarter and keep in mind that interactions have mass.

Magic Imagined

This is not a challenge of technology, it’s a challenge of imagination!

Let’s start with Google Glass – it always looked like an engineering project. Instead they should have asked ‘what if this thing was magic?’

Let’s look at a coffee cup and ask ‘what if this thing were magic’ – what is this cup witness to and what actions is it next to? What can it hear, what can it see and how can it serve us more? We’re not turning it into something else, we’re designing for the thing’s essential thingness. The goal is not to make things talk – the goal is to improve the conversation. We want things that do their jobs better. We should be bending technology to our lives – make us more human – not less. Colorup takes the color from around it for the light. We need to bank on illusion and embrace misdirection. Context aware experiences should reflect the lies we tell ourselves about how these things work. One of the ways we do this is to expose as little technology as possible (make the computer invisible). If everything can be an interface we don’t want everything shouting at us. Oliver for example weaves the number of unread messages we have in our email into our lives – instead of shouting at us. Remember that magic can be a little ridiculous. It’s okay to make things a little bit ridiculous right now.

What happens when magic goes wrong? Technology lets us down all the time. As we have more and more technology in our lives it becomes more real. “How smart does your bed have to be before you are afraid to go to bed at night” – Rich Gold. Humans know better. We need to build systems that know they’re not smart enough. Sometimes we build these systems with people first like Lyft and Uber – then when technology catches up (self driving cars) we can bring in the magic. It’s not Harry Potter’s wand that’s magic – it’s Harry. Humans are needed.

It’s not “Can we do this?” – it’s “How will we?”

The post SxSW: Magical UX and the Internet of Things appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Open Source & the Internet of Things
  2. Planning for the handheld mobile future
  3. NFAIS: Creating new value for business professionals

Mark Leggott

Sun, 2015-03-15 16:30

It pains me to say this after a number of good Loomware years, but it is increasingly challenging for me to keep Loomware updated, so I am "archiving" the site with this post - leaving it open but making the last post the one that says "No more posts!".

Patrick Hochstenbach: First time in my life to a barber

Sun, 2015-03-15 09:07
Filed under: Doodles Tagged: arthur, barber, brushpen, cartoon, cartoons, cat, doodle

John Miedema: A first rough cut at Lila’s calculation of association between slips. Keep it simple to scale for large volumes of text.

Sun, 2015-03-15 03:43

How will Lila calculate the strength of association between slips? Here is a first rough cut. The key thing I want to illustrate is the use of simple steps to decide the subject of a slip and compute a quantitative measure of association with other slips. The method must be simple and quick to scale for large volumes of text. A chess program has to contend with a practically unlimited number of combinations. It uses a simple evaluation of game rules and a sum of chess piece points. E.g., if a move leaves white with ten points and black with nine points, then that move is a good one for white. This simple calculation (applied for as many iterations as a game level will allow) is sufficient to beat most chess players on the planet. Lila uses a comparable approach.
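The chess comparison can be made concrete with a minimal sketch. It is not taken from any particular chess program: the point values are the conventional ones (pawn 1, knight 3, bishop 3, rook 5, queen 9), and the names `PIECE_POINTS`, `material` and `evaluate` are hypothetical, chosen only for this illustration.

```python
# Conventional piece values; the king is left out because it is never captured.
PIECE_POINTS = {"P": 1, "N": 3, "B": 3, "R": 5, "Q": 9}


def material(pieces):
    """Sum the point values of one side's remaining pieces."""
    return sum(PIECE_POINTS.get(piece, 0) for piece in pieces)


def evaluate(white_pieces, black_pieces):
    """Positive favours white, negative favours black."""
    return material(white_pieces) - material(black_pieces)


# A position where white has ten points and black has nine.
white = ["Q", "P"]  # 9 + 1
black = ["Q"]       # 9
print(evaluate(white, black))  # 1, so the position favours white
```

A real engine repeats an evaluation like this over many candidate moves; the point here is only that each individual evaluation is cheap.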

The simple rules for text analysis are the following:

  1. Extract nouns. Basic grammar says that you find the subject of a sentence in the nouns: people, places and things.
  2. Use word properties to rank their relative meaningfulness. For example, if a word has a lower frequency of usage it can be considered more interesting and important. There are several such word properties that can be applied by a simple calculation. Just word frequency will be used here. Based on the properties, rank the nouns.
  3. Use synonyms and variant forms to match on meaning rather than just a single surface form. Variant forms and synonyms are a simple and powerful semantic matching technique.

Here is an example. Suppose Stephen Hawking used Lila when writing A Brief History of Time. Suppose this line was a slip in his writing project:

Most people would find the picture of our universe as an infinite tower of tortoises rather ridiculous, but why do we think we know better? What do we know about the universe, and how do we know it?

In this first rough cut, Lila analyzes the slip and generates the following table:

Noun | Frequency of usage (academic) | Rank | Synonym
tortoise | 123 | 1 | turtle
tower | 1563 | 2 | castle
universe | 4868 | 3 | cosmos

Lila has applied the rules:

  1. Nouns were extracted.
  2. For each noun, frequency of usage was used to calculate rank. An arbitrary rule for this example limits the nouns of interest to the top three ranked nouns.
  3. Synonyms, e.g., tortoise = turtle, were generated by simple look-up from a list.

The nouns are used to find other related slips and compute their strength of association.
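As a rough illustration of how the rules could produce the table for the Hawking slip, here is a minimal Python sketch. It assumes the nouns have already been extracted by some other tool, and it uses only the frequencies and synonyms given in the example. The names `FREQUENCY`, `SYNONYMS`, `rank_nouns` and `build_table` are hypothetical and are not part of Lila.

```python
# Frequencies (academic usage) and synonyms copied from the example table.
FREQUENCY = {"tortoise": 123, "tower": 1563, "universe": 4868}
SYNONYMS = {"tortoise": "turtle", "tower": "castle", "universe": "cosmos"}


def rank_nouns(nouns, frequency, top_n=3):
    """Return the rarest nouns first; nouns with no known frequency are skipped."""
    known = [noun for noun in nouns if noun in frequency]
    return sorted(known, key=lambda noun: frequency[noun])[:top_n]


def build_table(nouns, frequency, synonyms, top_n=3):
    """Build (noun, frequency, rank, synonym) rows for the top-ranked nouns."""
    rows = []
    for rank, noun in enumerate(rank_nouns(nouns, frequency, top_n), start=1):
        rows.append((noun, frequency[noun], rank, synonyms.get(noun)))
    return rows


# Assumed output of a noun extractor run on the slip.
slip_nouns = ["universe", "tower", "tortoise"]
for row in build_table(slip_nouns, FREQUENCY, SYNONYMS):
    print(row)
```

Sorting by ascending frequency puts the least common noun at rank 1, which matches the idea that rarer words are more interesting.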

A first Google search was performed on [tortoise tower universe]. It would make sense to apply a boost factor to the keywords based on the ranking; in this case I trusted Google to use word order. Many results were nearly identical to the original slip. Nearly identical slips may be interesting to Hawking but will not add much insight.

A second search was performed on the synonyms [turtle castle cosmos]. Divergent results were found, such as a website about Turtle’s ice cream. A snippet was selected from the website for analysis by the Lila algorithm:

“Turtles Oreo
There’s more to see…
Sign up to discover and save different things to try in 2015.
About this map
Ice Cream

28 Pins •
[Image: Cosmic Castle”

Noun | Frequency of usage (academic) | Rank | Association
cream | 419 | 1 | No Match
pins | 724 | 2 | No Match
castle | 788 | 3 | Match

This site is about an unrelated subject, ice cream. Limiting again to the top three ranked nouns, there is only one match — between Hawking’s term “tower” and its synonym “castle.” A measure of association of 1/3 or 0.33 is computed. This low value could be used to obscure or exclude the slip.

Another result matched better, a blog about turtles in cosmology. A snippet was analyzed using Lila’s algorithm:

The Cosmic Turtle Around the World

Japan:

In Japanese mythology, the tortoise supports the ‘Abode of the Immortals’ and the ‘Cosmic Mountain’, where the Cosmic Mountain relates to the axis mundi – the world axis.

Noun | Frequency of usage (academic) | Rank | Association
cosmos/cosmic | 955 | 1 | Match
turtle | 1116 | 2 | Match
axis | 2046 | 3 | No Match

Perhaps Hawking would not be interested in such a blog. Perhaps he would. In this case there are two matches. A measure of association of 2/3 or 0.67 is computed. A more refined algorithm would weigh in the triple use of “cosmic.” This site covers related subject matter.
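The two association measures above can be reproduced with a short sketch. It rests on assumptions not stated in the post: the measure is taken to be the number of matching nouns divided by the number of top-ranked nouns in the candidate snippet, and “cosmic” is treated as a variant form of “cosmos”. The names `SOURCE_TERMS` and `association` are hypothetical.

```python
# Each source noun is grouped with its synonym from the first table.
# "cosmic" is added as a variant form of "cosmos" (an assumption for this sketch).
SOURCE_TERMS = [
    {"tortoise", "turtle"},
    {"tower", "castle"},
    {"universe", "cosmos", "cosmic"},
]


def association(source_terms, candidate_nouns):
    """Fraction of candidate nouns that match a source noun, synonym or variant."""
    if not candidate_nouns:
        return 0.0
    vocabulary = set().union(*source_terms)
    matches = sum(1 for noun in candidate_nouns if noun in vocabulary)
    return matches / len(candidate_nouns)


ice_cream = ["cream", "pins", "castle"]       # top three nouns, ice cream snippet
cosmic_turtle = ["cosmic", "turtle", "axis"]  # top three nouns, cosmology snippet
print(round(association(SOURCE_TERMS, ice_cream), 2))      # 0.33
print(round(association(SOURCE_TERMS, cosmic_turtle), 2))  # 0.67
```

A threshold on this value could then decide whether a slip is shown, obscured or excluded.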

This is a first rough cut of how Lila will calculate the strength of association between slips. Certainly a more sophisticated algorithm is required, taking into account multiple word properties. The algorithm should weigh words as more important if they repeat within a slip, especially if they repeat in author-suggested categories and tags. But sophistication must always answer to the need for a simple algorithm. Simplicity is the only way to achieve reasonable performance when analyzing large quantities of text.
