Feed aggregator

Patrick Hochstenbach: Homework assignment #5 – bis Sketchbookskool

planet code4lib - Thu, 2014-10-23 19:29
I was so happy with my new Lamy fountain pen that I drew a second version of my homework assignment: one using my favorite Arthur and Fietje Precies characters.   Filed under: Comics, Doodles Tagged: cartoon, cat, christmas, doodle, fondue,

Patrick Hochstenbach: Homework assignment #5 Sketchbookskool

planet code4lib - Thu, 2014-10-23 19:24
As a second assignment we needed to draw a fantasy image, preferably one using some meta story inside the story. I was drawing monsters the whole week during my commute, so I used these drawings as inspiration. Filed under: Comics Tagged: cartoon,

Patrick Hochstenbach: Homework assignment #4 Sketchbookskool

planet code4lib - Thu, 2014-10-23 19:21
This week we were asked to draw a memory: our first day at school. I tried to find old school pictures but didn’t find anything nice I could use. I only remembered I cried a lot on my first day

Nicole Engard: ATO2014: How ‘Open’ Changes Products

planet code4lib - Thu, 2014-10-23 18:44

Next up at All Things Open was Karen Borchert talking about How ‘Open’ Changes Products.

We started by talking about the open product conundrum – something that happens when we think about creating products in an open world. To understand it, we must first understand what a product is. A product is a good, idea, method, information or service that we want to distribute. In open source we think differently about this: we think more about tools and toolkits than packaged products, because these are more conducive to contribution and extension. ‘Open’ products work a bit more like Ikea – you have all the right pieces and instructions, but you have to make something out of them – a table or chair or whatever. Ikea products are toolkits to make things. When we’re talking about software, most buyers are thinking about what they get out of the box, so a toolkit is not a product to our consumers.

Open Atrium is a product that Phase2 produces and people say a lot about it, like “It’s an intranet in a box” – but in reality it’s a toolkit. People use it a lot of different ways – some do what you’d expect them to do, others make something completely different. This is the great thing about open source – but it also causes a problem for us, because in Karen’s example a table != a bike. “The very thing that makes open source awesome is what makes our product hard to define.”

Defining a product in the open arena is simple – “Making an open source product is about doing what’s needed to start solving a customer problem on day 1.” Why are we even going down this road? Why are we creating products? Making something that is usable out of the box is what people are demanding. Products also provide a different opportunity for revenue and profit.

This comes down to three things:

  • Understanding the value
  • Understanding the market
  • Understanding your business model

Adding value to open source means having something that someone who knows better than you put together. If you have an apple you have all you need to grow your own apples, but you’re not going to bother to do that. You’d rather (or most people would rather) leave that to the expert – the farmer. Just because anyone can take the toolkit and build whatever they want with it doesn’t mean that they will.

Markets are hard for us in open source because we have two markets – one that gives the product credibility and one that makes money – and often these aren’t the same market. Most of the time the community isn’t paying you for the product – they are usually other developers or people using it to sell to their clients. You need this market because you benefit from it, even if not financially. You also need to think about the people who will pay you for the product and services. You have to invest in both markets to help your product succeed.

Business models include the ability to have two licenses – two versions of the product. There is a model around paid plugins or themes to enhance a product. And sometimes you see services built around the product. These are not all of the business models, but they are a few of the options. People buy many things in open products: themes, hosting, training, content, etc.

What about services? Services can be really important in any business model. You don’t have to deliver a completely custom set of services every time you deliver. It’s not less of a product because it’s centered around services.

Questions people ask:

Is it going to be expensive to deal with an open source product? Not necessarily but it’s not going to be free. We need to plan and budget properly and invest properly.

Am I going to make money on my product this year?
Maybe – but you shouldn’t count on it. Don’t bet the farm on your product business until you’ve tested the market.

Everyone charges $10/mo for this so I’m just going to charge that – is that cool? Nope! You need to charge what the product is worth, what people will pay for it, and what you can afford to sell it for. Think about your ROI.

I’m not sure we want to be a products company. It’s very hard to be a product company without buy-in – a lot of service companies ask this. Consider instead a pilot program and set a budget to test out this new model. Write a business plan.

The post ATO2014: How ‘Open’ Changes Products appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Using Bootstrap to create a common UI across products
  2. ATO2014: Open source, marketing and using the press
  3. ATO2014: Saving the world: Open source and open science

Nicole Engard: ATO2014: Women in Open Source Panel

planet code4lib - Thu, 2014-10-23 17:04

Over lunch today we had a panel of 6 women in open source talk to us.

The first question was about their earlier days – what made them interested in open source or computer science or all of it.

Intros

Megan started in humanities and then just stumbled into computer programming. Once she got into it, though, she really enjoyed it. Elizabeth got involved with Linux through a boyfriend early on. She really fell in love with Linux because she was able to do anything she wanted with it. She joined the local Linux users group and they were really supportive and never made a big deal about the fact that she was a woman. Her first task in the open source world was writing documentation (which was really hard) but from there her career grew. Erica has been involved in technology all her life (which she blames her brother for). When she went to school, she wanted to be creative and study arts, but her father gave her the real life speech and she realized that computer programming let her be creative and practical at the same time. Estelle started by studying architecture, which was more sexist than her computer science program – toward the end of her college career she found that she was teaching people to use their computers. Karen was always the geekiest person she knew growing up – and her father really encouraged her. She went to engineering school, but it wasn’t until she set up her Unix account at the college computer center that she really got hooked. She got passionate about open source because of the pacemaker she needs to live – she realized that the entire system is completely proprietary and started thinking about the implications of that.

The career path

Estelle has noticed in the open source world that the men she knows on her level work for big corporations whereas the women are working for themselves. This is because there aren’t as many options for women to move up the ladder. As for why she picked the career she picked, it was because her parents were sexist and she wanted to piss them off! Elizabeth noticed that a lot of women get involved in open source because they’re recruited into a volunteer organization. She also notices that more women are being paid to work on open source, whereas men are more often doing it for fun. Megan had never been interviewed by or worked for a woman until she joined academia. Erica noticed that the career path of the women she has met is more convoluted than that of the men she has met. The men take computer science classes and then go into the field; women, however, didn’t always know that these opportunities were available to them. Karen sees that women who are junior have to work a lot harder – they have to justify their work more often [this is something I totally had to deal with in the past]. Women in these fields get so tired because it’s so much work – so they move on to do something else. Erica says this is partially why she has gone to work for herself: she gets to push forward her own ideas. Megan says that there are a lot of factors involved in this problem – it’s not just one thing.

Is diversity important in technology?

Erica feels that if you’re building software for people you need ‘people’ – not just one type of person – working on the project. Megan says that a variety of perspectives is necessary. Estelle says that because women often follow a different path to technology, it adds even more diversity than just gender [I for example got into the field because of my literature degree and the fact that I could write content for the website]. It’s also important to note that diversity isn’t just about gender – it’s so much more. Karen pointed out that even at 20 months old we’re teaching girls and boys differently – we start teaching boys math and problem solving earlier and we help the girls for longer. This reinforces the gender roles we see today. Elizabeth feels that diversity is needed to engage more talent in general.

What can we do to change the tide?

Megan likes to provide a variety in the types of problems she uses in her classes, with a variety of approaches, so that she reaches a variety of students instead of alienating those who don’t learn the way she’s teaching. Karen wants us to help keep women from being overlooked. When a woman makes a suggestion, acknowledge it – and stop people from interrupting women (because we are interrupted more). Don’t just repeat what the woman says but amplify it. Estelle brings up an example from SurveyMonkey – they have a mentorship program and also let you take time off when you need to (very good for parents). Erica tries to get to youth before the preconception forms that technology is for boys. One of the things she noticed is that language matters as well – telling girls you’re going to teach them to code turns them off, but saying we’re going to create apps gets them excited. Elizabeth echoed the language issue – a lot of job ads are geared toward men as well. Editing your job ads will actually attract more women.

What have you done in your career that you’re most proud of?

Estelle’s example is not related to technology – it was an organization called POWER that was meant to help students who were very likely to have a child before graduation – to graduate before becoming a parent. It didn’t matter what field they went into – just that they finished high school. Erica is proud that she has a background that lets her mentor so many people. Elizabeth wrote a book! It was on her bucket list and now she has a second book in the works. It was something she never thought she could do, and she did it. She also said that it feels great to be a mentor to other women. Megan is just super proud of her students and watching them grow up and get jobs and be successful. Karen is mostly proud of the fact that she was able to turn something that was so scary (her heart condition) into a way to articulate that free software is so important. She loves hearing others tell her story to explain why freedom in software is so important.

The post ATO2014: Women in Open Source Panel appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Women in Open Source
  2. ATO2014: Open Source – The Key Component of Modern Applications
  3. ATO2014: Building a premier storytelling platform on open source

Open Knowledge Foundation: Open Access and the humanities: On our travels round the UK

planet code4lib - Thu, 2014-10-23 15:58

This post is part of our Open Access Week blog series to highlight great work in Open Access communities around the world. It is written by Alma Swan, Director of Key Perspectives Ltd, Director of Advocacy for SPARC Europe, and Convenor for Enabling Open Scholarship.

Large amounts of public money are spent on obtaining access to published research results, amounting to billions of dollars per year.

Whither the humanities in a world moving inexorably to open values in research? There has been much discussion and debate on this issue of late. It has tended to focus on two matters – the sustainability of humanities journals and the problem(s) of the monograph. Neither of these things is a novel topic for consideration or discussion, but nor have solutions been found that are satisfactory to all the key stakeholders, so the debate goes on.

While it does, some significant developments have been happening, not behind the scenes as such but in a quiet way nevertheless. New publishers are emerging in the humanities that are offering different ways of doing things and demonstrating that Open Access and the humanities are not mutually exclusive.

These publishers are scholar-led or are academy-based (university presses or similar). Their mission is to offer dissemination channels that are Open, viable and sustainable. They don’t frighten the horses in terms of trying to change too much, too fast: they have left the traditional models of peer review practice and the traditional shape and form of outputs in place. But they are quietly and competently providing Open Access to humanities research. What’s more, they understand the concerns, fears and some bewilderment of humanities scholars trying to sort out what the imperative for Open Access means to them and how to go about playing their part. They understand because they are of and from the humanities community themselves.

The debate about OA within this community has been particularly vociferous in the UK in the wake of the contentious Finch Report and the policy of the UK’s Research Councils. Fortuitously, the UK is blessed with some great innovators in the humanities, and many of the new publishing operations are also UK-based. This offers a great opportunity to show off some of these new initiatives and to help reassure UK humanities authors at the same time. So SPARC Europe, with funding support from the Open Society Foundations, is now endeavouring to bring these new publishers together with members of the UK’s humanities community.

We are hosting a Roadshow comprising six separate events in different cities round England and Scotland. At each event there are short presentations by representatives of the new publishers and from a humanities scholar who can give the research practitioner perspective on Open Access. After the presentations, the publishers are available in a small exhibition area to display their publications and talk about their publishing programmes, their business models and their plans for the future.

The publishers taking part in the Roadshow are Open Book Publishers, Open Library of the Humanities, Open Humanities Press and Ubiquity Press. In addition, the two innovative initiatives OAPEN and Knowledge Unlatched are also participating. The stories from these organisations are interesting and compelling, and present a new vision of the future of publishing in the humanities.

Humanities scholars from all higher education institutions in the locality of each event are warmly invited to come along to the local Roadshow session. The cities we are visiting are Leeds, Manchester, London, Coventry, Glasgow and St Andrews. The full programme is available here.

We will assess the impact of these events and may send the Roadshow out again to new venues next year if they prove to be successful. If you cannot attend but would like further information on the publishing programmes described here, or would like to suggest other venues the Roadshow might visit, please contact me at sparceurope@arl.org

Library of Congress: The Signal: Results from the 2013 NDSA U.S. Web Archiving Survey

planet code4lib - Thu, 2014-10-23 15:25

The following is a guest post from Abbie Grotke, Web Archiving Team Lead, Library of Congress and Co-Chair of the NDSA Content Working Group.

The National Digital Stewardship Alliance is pleased to release a report of a 2013 survey of Web Archiving institutions (PDF) in the United States.

A bit of background: from October through November of 2013, a team of National Digital Stewardship Alliance members, led by the Content Working Group, conducted a survey of institutions in the United States that are actively involved in, or planning to start, programs to archive content from the web. This survey built upon a similar survey undertaken by the NDSA in late 2011 and published online in June of 2012. Results from the 2011-2012 NDSA Web Archiving Survey were first detailed in May 2, 2012 in “Web Archiving Arrives: Results from the NDSA Web Archiving Survey” on The Signal, and the full report (PDF) was released in July 2012.

The goal of the survey was to better understand the landscape of web archiving activities in the U.S. by investigating the organizations involved, the history and scope of their web archiving programs, the types of web content being preserved, the tools and services being used, access and discovery services being provided and overall policies related to web archiving programs. While this survey documents the current state of U.S. web archiving initiatives, comparison with the results of the 2011-2012 survey enables an analysis of emerging trends. The report therefore describes the current state of the field, tracks the evolution of the field over the last few years, and forecasts future activities and developments.

The survey consisted of twenty-seven questions (PDF) organized around five distinct topic areas: background information about the respondent’s organization; details regarding the current state of their web archiving program; tools and services used by their program; access and discovery systems and approaches; and program policies involving capture, availability and types of web content. The survey was started 109 times and completed 92 times for an 84% completion rate. The 92 completed responses represented an increase of 19% in the number of respondents compared with the 77 completed responses for the 2011 survey.
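The response-rate arithmetic above can be double-checked in a couple of lines of Python:

```python
# Figures reported in the survey: 109 starts, 92 completions,
# versus 77 completions in the 2011 survey.
started, completed, prev_completed = 109, 92, 77

completion_rate = completed / started                    # ~0.844
growth = (completed - prev_completed) / prev_completed   # ~0.195

print(f"completion rate: {completion_rate:.0%}")   # 84%
print(f"growth in respondents: {growth:.0%}")      # 19%
```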

Overall, the survey results suggest that web archiving programs nationally are both maturing and converging on common sets of practices. The results highlight challenges and opportunities that are, or could be, important areas of focus for the web archiving community, such as opportunities for more collaborative web archiving projects. We learned that respondents are highly focused on the data volume associated with their web archiving activity and its implications on cost and the usage of their web archives.

Based on the results of the survey, cost modeling, more efficient data capture, storage de-duplication, and anything that promotes web archive usage and/or measurement would be worthwhile investments by the community. Unsurprisingly, respondents continue to be most concerned about their ability to archive social media, databases and video. The research, development and technical experimentation necessary to advance the archiving tools on these fronts will not come from the majority of web archiving organizations with their fractional staff time commitments; this seems like a key area of investment for external service providers.

We hope you find the full report interesting and useful, whether you are just starting out developing a web archiving program, have been active in this area for years, or are just interested in learning more about the state of web archiving in the United States.

Nicole Engard: ATO2014: Open source, marketing and using the press

planet code4lib - Thu, 2014-10-23 15:05

Steven Vaughan-Nichols was up to talk to us about open source, marketing and using the press.

Before Steven was a journalist he was a techie. This makes him unusual – a journalist who actually gets technology. Steven is here to tell us that marketing is a big part of your job if you want a successful open source company. He has heard a lot of people saying that marketing isn’t necessary anymore. The reason it is necessary is that writing great code is not enough – if no one else knows about it, it doesn’t matter. You need to talk with people about the project to make it a success.

We like to talk about open source being a meritocracy – that’s not 100% true – the meritocracy is an ideal, or a convenient fiction. The meritocracy is only part of the story – it’s not just about your programming, it’s about getting the right words to the right people so that they know about your project. You need marketing for this reason.

Any successful project needs two things. One you already know: it solves a problem that needs a solution. The other is that it must be able to convince a significant number of people that it is the solution to their problem. One problem open source has is that we confuse open source with the community – they are not the same thing. Marketing is getting info about your project to the world. The community is used for defining what the project really is.

Peter Drucker says, “The aim of marketing is to know and understand the customer so well the product or service fits him and sells itself.” Knowing the customer better than they know themselves is not an easy job – but it’s necessary to market/sell your product/service. If your project doesn’t fit the needs of your audience then it won’t go anywhere.

David Packard: “Marketing is too important to be left to the marketing department” – and it really is. There is a tendency to see marketing as a separate thing. It should not be – it should be honest about what you do, and it should be the process of getting that message to the world. Each person who works on the project (or for the company) is a representative of your product – we are always presenting our product to the world (you might not like it – but it’s true). If your name is attached to a project/company then people are going to be watching you. You need to avoid zinging competing products and portray a positive image of yourself and your product. Even if you’re not thinking about what you’re saying as marketing, it is.

Branding is another thing that open source projects don’t always think through enough – they assume it’s trivial. Branding actually does matter! The images, words and name you use to describe your product matter. These become the shorthand that people see your project as. For example, if you see the Apple logo you know what it’s about. In our world of open source there is the Red Hat shadow man – whenever you see that image you know it means Red Hat and all the associations you have with that. You can use that association in your marketing. People might not know what Firefox is (yes, there are people who don’t know) but they do recognize the cute little logo.

You can no longer talk just on IRC or online, you have to get out there. You need to go to conferences and make speeches and get the word out to people. And always remember to invite people to participate because this is open source. You have to make an active network and get away from the keyboard and talk to people to get the word out there. At this point you need to start thinking about talking to people from the press.

One thing to say to people, to the press, is a statement that will catch on – a catch phrase that will reach the audience you want to reach. The press are the people to talk to the world at large. These are people who are talking to the broader world – talking to people at opensource.com and other tech sites is great – but if you want to make the next leap you need to get to these type of people. Don’t assume that the press you’re talking to don’t know what you’re talking about – but just because they happen to like open source or what you’re talking about – it does not mean that they will write only positive things. The press are critics – they’re not really on your side – even if they like you they won’t just talk your products up. You need to understand that going in.

Having said all that – you do need to talk to the press at some point. And when you do, you need to be aware of a few things. Never ever call the press – they are always on perpetual deadline – but you can’t go wrong with email. When you do send an email, be sure to cover a few important things: tell them what you’re doing, tell them what’s new (they don’t care that you have a new employee – they might care if a bigwig quits or is fired), get your message straight (if you don’t know what you’re doing then the press can’t figure it out), and hit it fast (tell them in the first line what you’re doing, who your audience is and why the world should care). Be sure to give the name of someone they can call or email for more info – this can’t be emphasized enough – so often Steven has gotten press releases without contact info on them. Put the info on your website – make sure that there is always a contact in your company for the press. If your project is pretty, remember to send screenshots – this will save the press a lot of time installing and getting the right images. Steven says “You need to spoon feed us”.

You also want to be sure to know what the press person you’re contacting writes about – do your homework – don’t contact them with your press release if it’s not something they write about. Also be sure to speak in a language that the person you’re talking to will understand [I know I always shy away from OPAC and ILS when talking to the press]. Not everyone you’re talking to has experience in technology. Don’t talk down to the press; just be sure to talk to the person in words they understand. Very carefully craft your message – be sure to give people context and tell them why they should care – if you can’t tell them that, then they can’t tell anyone else your story.

Final points – remember to be sweet and charming when talking to the press. When they say something that bothers you, don’t insult them. If you alienate the press they will remember. In the end the press has more ink/pixels than you do – their words will have a longer reach than yours. If the press completely misrepresents you, send a polite note to the person explaining what was wrong – without using the word ‘wrong’. Be firm, but be polite.

The post ATO2014: Open source, marketing and using the press appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Open Source at Facebook
  2. ATO2014: Open Source – The Key Component of Modern Applications
  3. ATO2014: Women in Open Source

David Rosenthal: Facebook's Warm Storage

planet code4lib - Thu, 2014-10-23 15:00
Last month I was finally able to post about Facebook's cold storage technology. Now, Subramanian Muralidhar and a team from Facebook, USC and Princeton have a paper at OSDI that describes the warm layer between the two cold storage layers and Haystack, the hot storage layer. f4: Facebook's Warm BLOB Storage System is perhaps less directly aimed at long-term preservation, but the paper is full of interesting information. You should read it, but below the fold I relate some details.
A BLOB is a Binary Large OBject. Each type of BLOB contains a single type of immutable binary content, such as photos, videos, documents, etc. Section 3 of the paper is a detailed discussion of the behavior of BLOBs of different kinds in Facebook's storage system.
Figure 3 shows that the rate of I/O requests to BLOBs drops rapidly through time. The rates for different types of BLOB drop differently, but all 9 types have dropped by 2 orders of magnitude within 8 months, and all but 1 (profile photos) have dropped by an order of magnitude within the first week.

The vast majority of Facebook's BLOBs are warm, as shown in Figure 5 - notice the scale goes from 80-100%. Thus the vast majority of the BLOBs generate I/O rates at least 2 orders of magnitude less than recently generated BLOBs.

In my talk to the 2012 Library of Congress Storage Architecture meeting I noted the start of an interesting evolution:

a good deal of previous meetings was a dialog of the deaf. People doing preservation said "what I care about is the cost of storing data for the long term". Vendors said "look at how fast my shiny new hardware can access your data". ... The interesting thing at this meeting is that even vendors are talking about the cost.

This year's meeting was much more cost-focused. The Facebook data make two really strong cases in this direction:
  • That significant kinds of data should be moved from expensive, high-performance hot storage to cheaper warm and then cold storage as rapidly as feasible.
  • That the I/O rate that warm storage should be designed to sustain is so different from that of hot storage, at least 2 and often many more orders of magnitude, that attempting to re-use hot storage technology for warm and even worse for cold storage is futile.
This is good, because hot storage will be high-performance flash or other solid state memory and, as I and others have been pointing out for some time, there isn't going to be enough of it to go around.

Haystack uses RAID-6 and replicates data across three data centers, using 3.6 times as much storage as the raw data. f4 uses two fault-tolerance techniques:
  • Within a data center it uses erasure coding with 10 data blocks and 4 parity blocks. Careful layout of the blocks ensures that the data is resilient to drive, host and rack failures at an effective replication factor of 1.4.
  • Between data centers it uses XOR coding. Each block is paired with a different block in another data center, and the XOR of the two blocks stored in a third. If any one of the three data centers fails, both paired blocks can be restored from the other two.
The result is fault-tolerance to drive, host, rack and data center failures at an effective replication factor of 2.1, reducing overall storage demand from Haystack's factor of 3.6 by nearly 42% for the vast bulk of Facebook's BLOBs.  When fully deployed, this will save 87PB of storage. Erasure-coding everything except the hot storage layer seems economically essential.
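The arithmetic behind these effective replication factors, and the XOR recovery scheme, can be sketched in a few lines of Python. This is a toy illustration of the scheme the paper describes, not Facebook's code:

```python
import os

# Within a data center: Reed-Solomon-style erasure coding with
# 10 data blocks and 4 parity blocks, as the f4 paper describes.
data_blocks, parity_blocks = 10, 4
within_dc = (data_blocks + parity_blocks) / data_blocks  # 1.4x

# Each block lives in two data centers at 1.4x each, and the XOR of
# the pair lives in a third at 1.4x, amortized over the two blocks.
overall = 3 * within_dc / 2  # ~2.1x

print(f"within-DC replication: {within_dc:.1f}x, overall: {overall:.1f}x")
print(f"savings vs Haystack's 3.6x: {(3.6 - overall) / 3.6:.0%}")

# XOR coding across data centers: store A, B, and A^B in three
# different data centers; the loss of any one center is recoverable.
def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

block_a = os.urandom(64)        # block in data center 1
block_b = os.urandom(64)        # paired block in data center 2
parity = xor(block_a, block_b)  # stored in data center 3

# If data center 1 fails, A is rebuilt from B and the parity block:
assert xor(block_b, parity) == block_a
```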

Another point worth noting from the paper relates to heterogeneity as a way of avoiding correlated failures:
We recently learned about the importance of heterogeneity in the underlying hardware for f4 when a crop of disks started failing at a higher rate than normal. In addition, one of our regions experienced higher than average temperatures that exacerbated the failure rate of the bad disks. This combination of bad disks and high temperatures resulted in an increase from the normal ~1% AFR to an AFR over 60% for a period of weeks. Fortunately, the high-failure-rate disks were constrained to a single cell and there was no data loss because the buddy and XOR blocks were in other cells with lower temperatures that were unaffected.

Nicole Engard: ATO2014: Women in Open Source

planet code4lib - Thu, 2014-10-23 14:01

DeLisa Alexander from Red Hat was up next to talk to us about women in open source.

How many of you knew that the first computer – the ENIAC – was programmed by women mathematicians? DeLisa is here to share with us a passion for open source and transparency – and something similarly important – diversity.

Why does diversity matter? Throughout history we have been able to innovate our way out of all kinds of problems. In the future we’re going to have to do this faster than ever before. Diversity of thoughts, theories and views is critical to this process. It’s not just “good” to think about diversity; it’s important for innovation and for solving problems more quickly.

Why are we having so much trouble finding talent? 47% of the workforce is made up of women, but only 12% of computer and information science degrees are going to women – and only 1-5% of open source contributors are women. How much faster could we solve the world’s big problems if the other half of the population were participating? We need to be part of this process.

When you meet a woman who is successful in technology, there is usually one person (man or woman) who mentored her and helped her feel positive about her path – we could be that voice for a girl or woman that we know. Another thing we can do is help our kids understand what is going on and what opportunities there are. Kids today don’t think about the fact that the games they’re playing were developed by a human – they just think that computers magically have software on them. They have no clue that someone had to design the hardware and program the software [I actually had someone ask me once what 'software' was - the hardest question I've ever had to answer!].

We can each think about the opportunities in open source. There is the GNOME for women program, Girl Develop It and the Women in Open Source award.

The challenge for us is to decide on one person that we’re going to try to influence to stay in the field, join the field, or nominate for an award. If each of us does this one thing, next year this room could be filled with 50% women.

The post ATO2014: Women in Open Source appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Building a premier storytelling platform on open source
  2. ATO2014: Easing into open source
  3. ATO2014: Open Source Schools: More Soup, Less Nuts

Raffaele Messuti: a wayback machine (pywb) on a cheap, shared host

planet code4lib - Thu, 2014-10-23 11:00

For a long time the only free implementation of web archival replay software (I'm unaware of commercial ones) has been the Wayback Machine (now OpenWayback). It's a stable and mature piece of software, with a strong community behind it.
To use it you need to be comfortable deploying a Java web application; not so difficult, and the documentation is exhaustive.
But there is a new player in the game, pywb, developed by Ilya Kreymer, a former Internet Archive developer.
Built in Python, it is relatively simpler than Wayback, and it is now used in a professional archiving project at Rhizome.

DuraSpace News: DSpaceDirect: Your Hosted “DSpace in the Cloud”

planet code4lib - Thu, 2014-10-23 00:00

 Winchester, MA  DSpaceDirect (http://dspacedirect.org) is a hosted repository solution for low-cost discovery, access, archiving, and preservation.

SearchHub: Stump The Chump D.C.: Meet The Panel

planet code4lib - Wed, 2014-10-22 22:04

If you haven’t heard: On November 13th, I’ll be back in the hot seat at Lucene/Solr Revolution 2014 answering tough Solr questions — submitted by users like you — live, on stage, sight unseen.

Today, I’m happy to announce the Panel of experts that will be challenging me with those questions, and deciding which questions were able to Stump The Chump!

In addition to taunting me with the questions, and ridiculing all my “Um”s and “Uhh”s as I struggle to answer them, the Panel members will be awarding prizes to the folks who have submitted the questions that do the best job of “Stumping” me. Questions can be submitted to our panel via stump@lucenerevolution.org any time until the day of the session. Even if you won’t be able to attend the conference, you can still participate – and do your part to humiliate me – by submitting your tricky questions.

To keep up with all the “Chump” news fit to print, you can subscribe to this blog (or just the “Chump” tag).

The post Stump The Chump D.C.: Meet The Panel appeared first on Lucidworks.

Nicole Engard: ATO2014: Pax Data

planet code4lib - Wed, 2014-10-22 21:39

Doug Cutting from Cloudera gave our closing keynote on day 1.

Hadoop started a revolution. It is an open source platform that really harnesses data.

In movies the people who harness the data are always the bad guys – so how do we save ourselves from becoming the bad guys? And what good is coming out of data?

Education! The better data we have the better our education system can be. Education will be much better if we can have a custom experience for each student – these kinds of observations are fed by data. If we’re going to make this happen we’re going to need to study data about these students. The more data you amass the better predictions you can make. On the flip side it’s scary to collect data about kids. inBloom was an effort to collect this data, but they ended up shutting down because of the fear. There is a lot of benefit to be had, and it would be sad if we didn’t enable this type of application.

Healthcare is another area where data comes in handy. Medical research benefits greatly from data. The better the data we collect, the better we can care for people. Once again this is an area where people have fears about shared data.

Climate is the last example. The climate is changing, and in order to understand how we can affect it, data plays a huge role. Data about our energy consumption is part of this. Some people say that certain data is not useful to collect – but this isn’t a good approach. We want to collect all the data and then evaluate it. You don’t know in advance what value the data you collect will have.

How do we collect this data if we don’t have trust? How do we build that trust? There are some technology solutions like encrypting data and anonymizing data sets – these methods are imperfect though. In fact if you anonymize the data too much it muddies it and makes it less useful. This isn’t just a technical problem – instead we need to build trust.
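Those limits are easy to demonstrate. Here is a toy example (not any specific vendor's pipeline) of why naively hashing an identifier is weak pseudonymization: when the input space is small and the scheme is known, an attacker can simply enumerate it.

```python
import hashlib

def pseudonymize(pin: str) -> str:
    """'Anonymize' a 4-digit PIN by hashing it – a naive scheme."""
    return hashlib.sha256(pin.encode()).hexdigest()

# Value published in a supposedly 'safe' data set.
released = pseudonymize("1234")

# An attacker who knows the format just tries all 10,000 inputs
# and links the released hash back to the original identifier.
reidentified = next(
    pin for pin in (str(i).zfill(4) for i in range(10_000))
    if pseudonymize(pin) == released
)
assert reidentified == "1234"
```

Real anonymization schemes are stronger than this sketch, but the underlying tension stands: the more you perturb the data to resist re-identification, the less useful it becomes, which is Cutting's point that technology alone can't substitute for trust.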

The first way to build trust is to be transparent. If you’re collecting data you need to let people know you’re collecting it and what you’re going to use it for.

The next key element is establishing best practices around data. These are the technical elements like encryption and anonymization. This also includes language to agree/disagree to ways our data is shared.

Next we need to draw clear lines that people can’t step over – for example we can’t show someone’s home address without their express permission. Which gives us a basis for the last element.

Enforcement and oversight is needed. We need someone who is checking up on these organizations that are collecting data. Regulation can sound scary to people, but we have come to trust it in many markets already.

This is not just a local issue – it needs to be a global effort. As professionals in this industry we need to think about how to build this trust and get to the point where data can be stored and shared.

The post ATO2014: Pax Data appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Modern Applications & Data
  2. ATO2014: How Raleigh Became an Open Source City
  3. ATO2014: What Academia Can Learn from Open Source

Nicole Engard: ATO2014: Saving the world: Open source and open science

planet code4lib - Wed, 2014-10-22 20:49

Marcus Hanwell, another fellow opensource.com moderator, was the last session of the day with his talk about saving the world with open source and open science!

In science there was a strong ethic of ‘trust, but verify’ – and if you couldn’t reproduce the efforts of the scientist then the theory was dismissed. The ‘but verify’ part of that has kind of gone away in recent years. In science the primary measure of whether you were successful or not was to publish – citations to your work are key. Then when you do publish, your content is locked down in costly journals instead of being available in the public domain. So if you pay large amounts of money you can have access to the article – but not necessarily the data. Data is kept locked up more and more to keep the findings with the publishing researcher so that they get all the credit.

Just like in the earlier talk today on what academia can learn from open source, Marcus showed us an article from the 17th century next to an article from today – the method of publishing has not changed. Plus these articles are full of obtuse academese.

All of this makes it very important to show what’s in the black box. We need to show what’s going on in these experiments at all levels. This includes sharing your steps to run calculations – the source code used to get this info should be released as open source, because right now the tools used are basically notebooks with no version control system. We have to stop putting scientists on these pedestals and start to hold them accountable.

A great quote that Marcus shared from an Economist article was: “Scientific research has changed the world. Now it needs to change itself.” Another was “Publishing research without data is simply advertising, not science.” Scientists need to think more about licenses – they give their rights away to journals because they don’t pay enough attention to the licenses that are out there like the creative commons.

What is open? How do we change these behaviors? Open means that everyone has the same access. Certain basic rights are granted to all – the ability to share, modify and use the information. There is a fear out there that sharing our data means that we could prove that we’re wrong or stupid. We need to change this culture. We need more open data (shared in open formats) and using open source software, more open standards and open access.

We need to push boundaries – most of what is published is publicly funded, so it should be open and available to all of us! We do need some software to share this data – that’s where we come in and where open source comes in. In the end the lesson is that we need to get scientists to show all their data and not reward academics solely for their citations, because this model is rubbish. We need to find a new way to reward scientists though – a more open model.

The post ATO2014: Saving the world: Open source and open science appeared first on What I Learned Today....

Related posts:

  1. ATO2014: What Academia Can Learn from Open Source
  2. ATO2014: Easing into open source
  3. ATO2014: How Raleigh Became an Open Source City

Nicole Engard: Bookmarks for October 22, 2014

planet code4lib - Wed, 2014-10-22 20:30

Today I found the following resources and bookmarked them online:

  • vokoscreen Open source screencasting
  • Waffle.io Waffle creates a full project management solution from your existing GitHub Issues.

Digest powered by RSS Digest

The post Bookmarks for October 22, 2014 appeared first on What I Learned Today....

Related posts:

  1. Open Access Day in October
  2. Governments Urging the use of Open Source
  3. Digsby Goes Open Source

Nicole Engard: ATO2014: Open Source in Healthcare

planet code4lib - Wed, 2014-10-22 19:59

Luis Ibanez, my fellow opensource.com moderator, was up next to talk to us about Open Source in Healthcare. Luis’s story was so interesting – I hope I caught all the numbers he shared – but the moral of the story is that hospitals could save insane amounts of money if they switched to an open system.

There are 7 billion people on the planet making $72 trillion a year. In the US we have 320 million people – 5% of the global population – but we generate 22% of the economic production on the planet. What do we do with that money? 24% of it is spent on healthcare ($3.8 trillion) – not just by the government; this is the spending of the entire country. That is more than Germany and France spend. However, we’re ranked 38th in healthcare quality in the world. France is #1, and it spends only 12% of its money on healthcare. This is an example of how spending more money on the problem is not helping.
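The figures quoted in the talk fit together; a quick back-of-the-envelope check (using the rounded numbers as given):

```python
# Sanity-checking the talk's (rounded) healthcare spending numbers.
world_output = 72e12                 # ~$72 trillion global output
us_output = world_output * 0.22      # US share: ~$15.8 trillion
healthcare = us_output * 0.24        # 24% of that goes to healthcare
print(round(healthcare / 1e12, 1))  # prints 3.8 (trillion dollars)
```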

Is there something that geekdom can do to set this straight? Luis says ‘yes!’

So, why do we go to the doctor? To get information. We want the doctor to tell us if we have a problem they can fix and know how to fix it. Information connects directly to our geekdom.

Today if you go to a hospital your data will be stored on paper and will go into a “data center” (a filing cabinet). In 2010, 84% of hospitals were keeping paper records rather than using software. The healthcare industry is the only industry that needs to be paid to switch to using software to store this information – $20 billion was spent between 2010 and 2013 to get us to 60% of hospitals storing information electronically. This is one of the reasons we’re spending so much on healthcare right now.

The problem here (and this is Luis’s rant) is that the hospitals have to pay for this software in the first place. And you’re not allowed to share anything about the system: you can’t take screenshots, you can’t talk about the features, you are completely locked down. This system will run your hospital (a combination of hotel, restaurant, and medical facility) – hospitals have been called the most complex institutions of the century. These systems cost $100 million for a 400-bed hospital – and hospitals have to buy them with little or no knowledge of how they work because of the security measures around seeing and sharing information about the software. This is against the idea of a free market because of the NDA you have to sign to see and use the software.

An example that Luis gave us was Wake Forest hospital, which ended up being in the red by $56 million – all because they bought software for $100 million, leading to them having to fire people, stop making retirement payments and make other cuts. [For me this sounds a lot like what libraries are doing - paying for an ILS instead of saving money on the ILS and putting it toward people and services]

Another problem in the medical industry is that only 41% (less than half) have the capability to send secure messages to patients. This is not a technology problem – it is a cultural problem in the medical world. Other industries have solved this technology problem already.

So, why do we care about all of this? There are 5,723 hospitals in the US: 211 of them are federally run (typically military hospitals), 413 are psychiatric hospitals, 2,894 are non-profits, and the others are private or state run. That totals nearly 1 million beds, and $830 billion a year is spent in hospitals. The software these hospitals are buying costs about $250 billion.

The federal hospitals are running a system called VistA that was released into the public domain. OSEHRA was founded to protect this software. This software, though, is written in MUMPS. That is the same language the $100 million systems are written in! Except there is a huge difference in price.

If hospitals switched they’d spend $0. To keep this software running and updated we’d need about 20 thousand developers – but divided across those hospitals, that’s about 4 developers per hospital. These developers don’t need to be programmers, though – they could be doctors, nurses or pharmacists – because MUMPS is so easy to learn.

The post ATO2014: Open Source in Healthcare appeared first on What I Learned Today....

Related posts:

  1. ATO2014: Open Source – The Key Component of Modern Applications
  2. ATO2014: Easing into open source
  3. ATO2014: Open Source & the Internet of Things

LITA: LITA Forum: Online Registration Ends Oct. 27

planet code4lib - Wed, 2014-10-22 19:59

Don’t miss your chance to register online for the 2014 LITA Forum “From Node to Network” to be held Nov. 5-8, 2014 at the Hotel Albuquerque in Albuquerque N.M. Online registration closes October 27, 2014. You can register on site, but it’s so much easier to have it all taken care of before you arrive in Albuquerque.

Book your room at the Hotel Albuquerque. The guaranteed LITA room rate date has passed, but when you call 505-843-6300, ask for the LITA room rate – there might be a few rooms left in our block.

Three keynote speakers will be featured at this year’s forum:

  • AnnMarie Thomas, Engineering Professor, University of St. Thomas
  • Lorcan Dempsey, Vice President, OCLC Research and Chief Strategist
  • Kortney Ryan Ziegler, Founder Trans*h4ck.

More than 30 concurrent colleague inspired sessions and a dozen poster sessions will provide a wealth of practical information on a wide range of topics.

Two preconference workshops will also be offered:

  • Dean B. Krafft and Jon Corson-Rikert of Cornell University Library will present
    “Linked Data for Libraries: How libraries can make use of Linked Open Data to share information about library resources and to improve discovery, access, and understanding for library users”
  • Francis Kayiwa of Kayiwa Consulting will present
    “Learn Python by Playing with Library Data”

Networking opportunities, a major advantage of a smaller conference, are an important part of the Forum. Take advantage of the Thursday evening reception and sponsor showcase, the Thursday game night, the Friday networking dinners or Kitchen Table Conversations, plus meals and breaks throughout the Forum to get to know LITA leaders, Forum speakers, sponsors, and peers.

2014 LITA Forums sponsors include EBSCO, Springshare, @mire, Innovative and OCLC.

Visit the LITA website for more information.

Library and Information Technology Association (LITA) members are information technology professionals dedicated to educating, serving, and reaching out to the entire library and information community. LITA is a division of the American Library Association.

LITA and the LITA Forum fully support the Statement of Appropriate Conduct at ALA Conferences

Islandora: Islandora Deployments Repo

planet code4lib - Wed, 2014-10-22 18:58

Ever wonder what another institution's Islandora deployment looks like in detail? Look no further: York and Ryerson have shared their deployments with the community on GitHub, including details such as software versions, general settings, XACML policies, and Drupal modules. If you would like to share your deployment, please contact Nick Ruest so he can add you as a collaborator on the repo.
