news aggregator

Reese, Terry: MarcEdit Update

I just uploaded a new version of MarcEdit 5.1. Most of the updates are related to changes in the MarcEditor. So here’s the list of changes:
Couple of notes – I’m currently writing up some new notes on using MarcEdit on Linux. Mono 2.0+ has essentially added all the functionality necessary to run MarcEdit on Linux. I’ll be creating a handful of YouTube videos for folks interested in giving this a try. As for running on a Mac – well, I’ll look at that next. You can download the new version of MarcEdit from here: MarcEdit_Setup.msi

–TR

Eaton, Alf: YQL Open Data Tables

I nearly got as far as setting up an Open Data Table definition for WeFollow, so that it could be queried using YQL. Sadly the HTML parser that YQL exposes for arbitrary URLs isn't available to web services defined using Open Data Tables: they have to return either well-formed XML or JSON. Still, I like the style of the Open Data Table definitions, and found some useful documentation in Chapter 5 of the Yahoo Query Language guide ("Using YQL Open Data Tables"). There are lots of community-contributed definitions in Sam Pullara (of Yahoo!)'s GitHub. Also, it would be great if YQL could use CSS selectors, in the way that Freebase's Acre does using Sizzle and Rhino, alongside the existing XPath selectors that are available for use on parsed HTML.

Eaton, Alf: Festive 50 Spotify Playlists

Lots of people seem to be making Spotify playlists of John Peel's Festive 50 charts. The raw data for Festive 50 track listings doesn't seem to be available so I scraped them into TSV files. This should be the same data as an OpenDocument spreadsheet. I can't check if it's valid, because OpenOffice is retarded and Google Docs' upload isn't working. Once I can get the despotify gateway working (it's not accepting the session IDs at the moment), it should be possible to resolve the tracks to Spotify track IDs and create Spotify playlists automatically.

Update: I got the Python bindings installed, but it's apparently not possible to get a track's Spotify URI this way, only the relative track number within a playlist or list of search results.
Here are XSPF versions of the tracklists, for use with Last.fm's Boffin (if you have the audio files locally). Ideally Spotify would read XSPF playlists and resolve them itself.

The description of how the despotify authors found the recent security vulnerability in Spotify is interesting:

It turned out that whenever you added someone else's shared playlist, the Spotify client software would request information from Spotify's servers about the author of that playlist. The information returned contained things like a hash (based on a salt and the user's password), date of birth, city and other things that Spotify knew about this user.

We realized that the password hash that was transferred to the client when you added someone else's playlist could be used as a way of authenticating to the server as the owner of the playlist, without knowing his or her password.

Corrado, Ed: Corned Beef and Cabbage in a Crock Pot

I don’t normally do food recipes on my blog, but a couple of people on Facebook have asked me how I cook corned beef and cabbage in my crock pot. Here is how I do it. I really like the way this comes out, but as they say, your mileage may vary:
Corned beef (I like ones with a separate seasoning package)
1 head cabbage
Carrots
3 or 4 white or red potatoes (I keep the skin on)
1 turnip (optional; if you use turnips, make pretend they are potatoes in the rest of the recipe)
1 onion (I usually use red, but it doesn’t matter)
1 bay leaf (optional)
1 or 2 crushed garlic cloves (optional; I don’t bother dicing it, because I’m going to take it out and not eat it)
Water
2 tablespoons of dry white wine OR 1 splash of white wine vinegar

Method:
1. Skin and cut carrots into pieces about 1 1/2 or 2 inches long (less if they are thick). Optionally, cheat and use baby carrots.
2. Cut onion up into disks (about 1/3 to 1/2 inch thick… it doesn’t really matter because I don’t eat them).
3. Cut potatoes (usually in thirds, depending on size). I keep the skin on, but do what you want.
4. Place onions and carrots on bottom of crock pot (if they don’t all fit, don’t worry, just put them in with the potatoes).
5. Place corned beef on top of onions and carrots.
6. Place potatoes and any remaining carrots and onions around the corned beef.
7. Cover with water.
8. Add wine or vinegar. I’ve also used beer in the past and it was OK, but I think wine is better. If you use wine, for best results place a little extra into chef.
9. Throw in bay leaf and crushed garlic.
10. Turn crock pot on low.
11. Make coffee and go to work.
12. Come home from work.
13. About 20 or 30 minutes before you want to eat (depending on how al dente you want your cabbage), cut your cabbage into wedges. Depending on the size of the cabbage, I either quarter it or cut it into sixths.
14. Place cabbage in crock pot.
15. Turn crock pot on high.
16. Cook until cabbage is to your liking.
17. Serve with spicy mustard and, if you can get it, some nice rye bread.

Note: In Binghamton rye bread is not a viable option :-(.

Note: If you cook the corned beef all day, the corned beef will basically fall apart when you try to cut it.
I like that, but you may choose to cook it for a shorter length of time if you would rather have nice slices. If you live close enough to work, you can get everything ready before work and go home for lunch and turn the crock pot on then.

Note 2: If I have some other extra veggies, such as celery, I might throw them into the pot as well, but I take them out before serving.

del.icio.us: Science in the open: There are crowds, and then there are crowds...

“there is no crowd”

"Polymath project (via Michael Nielsen). For those who missed this, the project aimed to solve a defined mathematical problem through a series of small contributions from a group of people; essentially building a mathematical proof, or the outlines of one, via crowd-sourcing. Tim’s blogpost and the accompanying comments are a gold mine for anyone who wants to understand how this type of project does and does not work. The main success of the project was in solving the problem, or in fact a more general form of the originally stated problem. The main “failure” was that the team of actively involved people was rather small and made up of people who might be seen as “the usual suspects” in this context, a group of the best mathematicians who are active on the web."

"there are at least two distinctly different types of crowd sourcing. The “wisdom of crowds”...“broadcast request - expert response”."
Mignault, John: Molecular mixology

Shaken and Stirred - Creating an Irish Whiskey Cocktail, With a Twist - NYTimes.com: Remind me not to take the Times seriously anymore. After this quite sensible advice:

YOU will be tempted, on Tuesday, to drink something green. Resist it. No, what I’m talking about is the cocktail equivalents of green beer, all the “obligatory Midori and crème de menthe drinks,” as Anthony Malone, the Dublin-born general manager and bartender at Puck Fair, an Irish bar on Lafayette Street near Houston, put it. “All those awful green things,” he said, such as the Everybody’s Irish, a drink that calls for Irish whiskey, crème de menthe, Chartreuse and a green olive. Everybody’s Irish? Everybody’s gagging.

We get this utterly sickening bilge:

Earlier this year, he issued a curious challenge to a select group of bartenders in New York, Chicago, Boston and San Francisco. He asked them to create cocktails based upon the traditional Irish breakfast – eggs, bacon, black and white pudding, and toast. And Bushmills, of course, though Mr. Egan hesitated to cite his product as a breakfast staple. “More like brunch,” he demurred. Here in New York, Jim Meehan of PDT responded with a drink in which bacon-infused Bushmills is combined with maple syrup, orange and lemon juice and a whole egg. The entry from Eben Freeman, at Tailor, was more baroque: bacon-infused Bushmills, again, adorned with roasted tomato gelée squares, a slow-poached quail egg yolk, an Irish breakfast-tea foam and crispy black-pudding bits.

Shite and onions, as John Joyce would say. If you’re going to have whiskey for breakfast, leave off the garnishes.

Brantley, Peter: The Orphan Monopoly

Last Friday, I was able to attend a very interesting meeting at Columbia Law School on the long term ramifications of the Google Book Search settlement. Some of what was discussed will be drawn out over future posts, here or elsewhere. The conference was covered on Twitter at #gbslaw.
There is a lot to ponder: This is arguably a massive re-writing of copyright for books without any legislative input; Marybeth Peters (MBP), the U.S. Register of Copyrights, observed that the settlement essentially proposes a private agreement for compulsory licensing between a large class of IP holders and the world’s largest search engine. The potential scope and policy ramifications are significant. MBP mentioned that there might be treaty implications under international conventions. And despite that, one of the most shocking of her statements was that the Copyright Office has not received a single inquiry from any of the 535 elected representatives of the people of the United States. Not. One.

Orphan works

What I want to discuss in this post is a persistent theme that ranged across the panels and discussions: concern with the status of orphan works in the settlement proposal. Only a subset of the works covered by the settlement will actually be orphan: some of the works will have identifiable rights holders, and many new rights holders will come forward. Indeed, the settlement offers to change the rights status of a great number of works, which is by and large a useful clarification. However, there will be a tremendous number of works for which the rights status is murky at best: they may be likely in-copyright, but with no identified rightsholder, or they might be likely out of copyright, but no one can easily verify this to be the case.

An indirect indication of the magnitude of this body of unclaimed books is foretold by Google’s set-aside of $45 million to compensate rightsholders (RH) for already digitized works. There are differing payments for books and inserts, but let’s assume all works with newly identified rightsholders are books, which is the maximum payout ($60/title). Dividing $45 million by $60 gives us a maximum count of 750,000 titles expected to receive compensation.
The settlement does note that $45 million might not be enough to cover claims and more funds might be required to be added by Google, but nonetheless this must be a rough best guess on Google’s, the publishers’, and the authors’ part. There are rough estimates of around 7 million digitized volumes in GBS; subtracting 750,000 newly identified works gives us 6.25 million. Let’s take a guess that there are maybe 1.5 million public domain works (this is not entirely out of the blue, but informed by earlier orphan works studies and reports), leaving 4.75 million titles. That’s a lot of books – about 2/3 of the total. It might be more, it might be less; it is a big number.

This is not inexplicable. There are a large number of ways that books might fall into orphan status. A quick consultation of Peter Hirtle’s copyright table at Cornell Univ. allows us to see how easy this is. The impact of foreign rights is fiendishly complicated, and even the rules for U.S. publications are baroque; for older works it is a crafty rightsholder indeed who can figure out whether they might retain a claim. As Peter Hirtle observed to me in an email, “The lengthening copyright terms and the gradual removal of formalities (especially the automatic renewal of works published since 1963) means that works that would have passed into the public domain in the past because the rights owners weren't concerned are still protected. The chances that the rights holders are either unidentifiable or not locatable also goes up.”

Further, many Copyright Office records have not yet been digitized and require manual examination; a very high portion of these records are dirty, with missing metadata (including basic information such as Title or Author); obviously incorrect metadata (e.g. misspellings); transposed metadata fields; updated records with no explicit connection to superseded records; and so on. (In other words, they are a real mess.)
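The back-of-the-envelope estimate above (the $45 million set-aside, $60 per title, roughly 7 million digitized volumes, and a guessed 1.5 million public domain works) can be retraced in a few lines of Python. This is only a sketch of the arithmetic as stated in the post; the constant and function names are made up for illustration, and the inputs are the post's rough guesses rather than measured figures.

```python
# Retrace the post's rough estimate of unclaimed titles.
# All figures come from the post; the names are illustrative only.

SET_ASIDE_USD = 45_000_000       # Google's set-aside for already digitized works
PAYOUT_PER_BOOK_USD = 60         # per-title payout assumed in the post
DIGITIZED_VOLUMES = 7_000_000    # rough estimate of digitized volumes in GBS
PUBLIC_DOMAIN_GUESS = 1_500_000  # the post's guess at public domain works


def estimate_unclaimed(set_aside, payout, volumes, public_domain):
    """Return (compensated titles, unclaimed titles, unclaimed share of volumes)."""
    compensated = set_aside // payout
    unclaimed = volumes - compensated - public_domain
    return compensated, unclaimed, unclaimed / volumes


compensated, unclaimed, share = estimate_unclaimed(
    SET_ASIDE_USD, PAYOUT_PER_BOOK_USD, DIGITIZED_VOLUMES, PUBLIC_DOMAIN_GUESS
)
print(f"compensated titles: {compensated:,}")
print(f"unclaimed titles:   {unclaimed:,}")
print(f"share of corpus:    {share:.0%}")
```

Changing any one of the four inputs shows how sensitive the "about 2/3" figure is to the guesses behind it, which is the point of the caveat that it might be more or less.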
There have been several efforts to digitize these data, with varying success and rigor. The most active rights identification efforts currently are those at Google and the University of Michigan.

A large number of these orphans are going to be truly public domain books, just like pre-1923 works. However, we may never know that they actually have public domain status due to historically incomplete record keeping, and the lack of a national rights tracking and notification infrastructure.

Additionally, unlike the proposed orphan works legislation which came close to passing the House and Senate last year but did not, the rights claiming process is opt-in. This simplifies things considerably for Google and the BRR, because – unlike the proposed legislation – the BRR is not required to undertake at any point a “diligent search” for the rightsholder/s of works on an item by item basis. This puts the burden on the RHs to come forward to make their claims.

The settlement parties are correct to observe that the agreement engendered perhaps the single largest class notification program in the history of class action settlements in the United States, but despite its completeness, it is just not going to reach everyone who might have a stake in the suit (e.g., classic lineage problems such as the daughter of the niece of the co-author who is the last surviving heir, who doesn’t even know there were transmitted rights).

An entire group of authors that the notification will not reach are “non-active” authors of orphan works, who do not realize that they may have rights to titles digitized by Google under the proposed settlement. Orphan works authors and rightsholders won’t opt out of the settlement, nor will they opt in; by definition they are not aware they have a right to file claims. This raises troubling questions about the representative completeness of the author sub-class in the settlement.
Monetization of Orphans

At Columbia Law on Friday, the most vexing issue for orphans was the distribution of income from their monetization by Google for the benefit of BRR, Google, and the Class parties (authors and publishers of books, as identified in the proposal). The distribution of income differs considerably depending on whether it is derived from non-subscription sales (mostly, individual purchases or licensed uses), versus through institutional sales to libraries and related, approved, organizations.

In the rough, the non-subscription sale income goes first to the BRR for operational assistance, and to fund a reserve endowing support for future BRR programs. In the consequent improbable event that there are leftover funds, they are apportioned to RHs until they have received 70 percent of the gross revenues for each book, and then (finally) leftover funds go to not-for-profits supporting reading, literacy, libraries, etc. That trickle down is not likely to generate much dew on the thirsty gardens of the public sector.

This distribution is likely to generate an appreciable percentage of the total income for the BRR, a complex entity with many diverse goals, including policy, arbitration, distribution, and rights maintenance, in addition to its own internal administration. (Even with these funds, it seems worthwhile to question whether the BRR can support itself as an independent concern without additional on-going subsidy.)

For subscription sales, which might well be ultimately the most significant source of income, the revenue is apportioned straight to the rightsholders by the BRR. (I’ve appended the relevant settlement language at the end of this post, in its entirety.)

The essential problem is that the settlement parties have a vested interest in maintaining a monopoly over access to orphan books.
Marybeth Peters speculated that the resolve of settlement participants to support future orphan works legislation might be weakened, regardless of their zeal for such clarification in the past. As the Chicago Law professor Randal Picker noted at the meeting [slides here], there is a built-in incentive for licensing associations to protect guaranteed income sources from external claimants: the settling parties want to maintain the property status of orphans as copyrighted works against outsiders.

This is wrong on the face of it; it is an abrogation of the public’s right of access that there is no structural incentive to identify public domain works within the corpus of orphans, and that the largest share of revenue generated from their digitization goes to RHs who have, by definition, no right to that income. Randal Picker suggested that creating a more symmetric MFN status for commercial exploitation of the works covered by the settlement, such as unbundling orphan works by opening them up to exploitation by non-profits, might be a useful attenuation of this inherent danger.

There is a further problem. In addition to the income from settlement-proposed schemes, Google uniquely will be able to generate income from not-covered uses, such as integrating the content with web, dataset, and news data to build more robust discovery services. The advertising revenue against this aggregation will be uniquely Google’s to reap.

As Jule Sigall (formerly Copyright Office, now Microsoft) and Jane Ginsburg (Columbia Law) wryly noted at the Columbia Law meeting, it is as if Google has managed to maneuver itself to the verge of a court-sanctioned release of potential liability covering the exploitation of orphan books, for the benefit of a single commercial actor. If this is the best train coming down the tracks, it might be time to throw a red light.
Settlement: 6.3 (a) Unclaimed Funds

(ii) Unclaimed Funds-Subscription Revenue Models. Any revenues paid to the Registry and due to Rightsholders of Books under Section 4.1 (Institutional Subscriptions) and, if agreed, Section 4.7(d) (Consumer Subscription Models), but that are unclaimed by such Rightsholders within five (5) years of the last date of the reporting period in which the Books earned such revenues (“Unclaimed Funds-Subscription”), will be distributed by the Registry as soon as practicable in accordance with the Plan of Allocation following the end of such five (5)-year period.

Original post blogged on b2evolution.

Leggott, Mark: One Key Reason to Digitize

German History Buried Under Rubble

I tend to cringe and hold my tongue when I hear people say that digitization is bad for preservation because it makes people think we don't have to preserve the paper or microfilm, which is the only way to keep these cultural artifacts around for the long term. Bollocks. The only way to preserve ALL of our cultural heritage is to digitize it. When I think of the wealth of manuscript and print material that has been lost over the ages, whether from disaster, like this most recent example in Cologne, or from war or religious or political book burning, it is clear that the only way to preserve everything, regardless of stripe, is to digitize. And not just digitize, but make freely available so it can be copied and preserved in multiple online libraries and never lost again. One need only think back to the selective "preservation" of the Bush regime when it came to information to also realize that allowing corporate or political interests to steward and preserve is a bad idea as well. Libraries have a critical role to play not just in digitization, but also in ensuring that all information is preserved for future generations.
Schneider, Jodi: Newspapers in an Age of Revolution (aka The Internet as an Agent of Change)

Clay Shirky writes of newspapers in an age of revolution: 15 years of anticipated problems* viewed optimistically, patched with one-size-fits-all solutions. Those solutions don’t attack the main issue: “the core problem publishing solves – the incredible difficulty, complexity, and expense of making something available to the public – has stopped being a problem.”

It’s a revolution, he says, drawing on the print revolution of the early 1400s, and no one knows what will happen.

The old stuff gets broken faster than the new stuff is put in its place. The importance of any given experiment isn’t apparent at the moment it appears; big changes stall, small changes spread. Even the revolutionaries can’t predict what will happen. Agreements on all sides that core institutions must be protected are rendered meaningless by the very people doing the agreeing. (Luther and the Church both insisted, for years, that whatever else happened, no one was talking about a schism.) Ancient social bargains, once disrupted, can neither be mended nor quickly replaced, since any such bargain takes decades to solidify.

And so it is today. When someone demands to know how we are going to replace newspapers, they are really demanding to be told that we are not living through a revolution. They are demanding to be told that old systems won’t break before new systems are in place. They are demanding to be told that ancient social bargains aren’t in peril, that core institutions will be spared, that new methods of spreading information will improve previous practice rather than upending it. They are demanding to be lied to.

There are fewer and fewer people who can convincingly tell such a lie.

Shirky sees the future of journalism as “overlapping special cases” with a variety of funding and business models.
It’s a time for experimentation, and while he sees failure and risk, he has hope, too:

Many of these models will fail. No one experiment is going to replace what we are now losing with the demise of news on paper, but over time, the collection of new experiments that do work might give us the reporting we need.

Society needs reporting, not newspapers. That need is real, and worth restating:

Society doesn’t need newspapers. What we need is journalism. For a century, the imperatives to strengthen journalism and to strengthen newspapers have been so tightly wound as to be indistinguishable. That’s been a fine accident to have, but when that accident stops, as it is stopping before our eyes, we’re going to need lots of other ways to strengthen journalism instead.

When we shift our attention from ‘save newspapers’ to ‘save society’, the imperative changes from ‘preserve the current institutions’ to ‘do whatever works.’ And what works today isn’t the same as what used to work.

Go read the whole essay, then let it stew with other thoughts on the future of publishing.

*Circa 1993: “When a 14 year old kid can blow up your business in his spare time, not because he hates you but because he loves you, then you got a problem.”

Via John Dupuis’ post in Confessions of a Science Librarian.

Mignault, John: Merge

[del.icio.us] Karen Coyle's notes on MARC merging
Schmidt, Aaron: 5 euro coin features books as buildings

This is a stunning coin. The portrait consists of the names of Dutch architects. The back side is even better. The artist made the shape of the Netherlands by creating building-esque shapes out of Dutch architecture book spines. Amazing. Want more? Each bird is flying over the capital of a Dutch province. The artist’s post How to Make Money with Free Software is worth the click to see the process.

Schneider, Karen G: A Basic Homebrewing Collection for Your Library

In the last week I have been immersed in a writing project I am thoroughly enjoying, so I’ve had just enough personal time to exercise, fiddle around with homebrewing, and do a little reading (finally almost done with The Astonishing Life of Octavian Nothing – which is nothing less than astonishing). But I keep meaning to update you with my homebrewing – the reading, in any event. Homebrewing is a surprisingly bookish craft, and many of the books make wonderful reading. But if you can’t read all of the homebrewing books (or watch all the videos) you could start with these two books (scandalously underrepresented in public library collections):

Papazian, Charlie. The Complete Joy of Homebrewing. Now in its third edition, this cheery, reassuring book has walked many a new homebrewer through that crucial first brew. The pictures and illustrations are hokey, but not in a bad way.

Palmer, John. How to Brew. A thorough book that digs deep into the technical aspect of brewing. Palmer is a metallurgist, and his love of science and technical precision combine with an engaging voice to make an absorbing read. A great second book after Papazian.

If your poor downturn-eviscerated book budget has even a nickel to spare, you could add these as well:

Mosher, Randy. Radical brewing : recipes, tales, and world-altering meditations in a glass. Go to the edge of brewing and back! Elegant and inspiring.

Hieronymus, Stan. Brew Like a Monk.
Great for understanding those wonderful Belgian beers, and beautiful reading. A book of style and history – not a how-to or recipe book.

Spencer, James. Introduction to extract home brewing. This is a DVD by the host of Basic Brewing podcasts and video casts. I listen to Spencer’s podcasts regularly and have watched his free online videos. Though I haven’t yet viewed his DVDs, I recommend anything he produces. His relaxed, reassuring style and his deep domain knowledge are a winning combo, particularly when he pairs up with cohort Steve Wilkes and they nerd it up in their button-down shirts in an average American kitchen (I love it when the dog wanders in and out). Spencer has a number of other videos; his Stepping Into All-Grain is on my personal purchase wish list (since only 3 libraries carry it!). I’m not sure I want to try all-grain brewing without Spencer holding my hand.

There are many more good brewing books, some broad and some quite specific (I’m seriously tempted to write Brewing for Little Old Librarians) and I may have left yours off. Make a pitch for the brewing books you love!

del.icio.us: MARC-XML -> Qualified Dublin Core XSLT - CODE4LIB Archives

The LOC maintains a large collection of XSLT for MARCXML that are very thorough: http://www.loc.gov/standards/marcxml/xslt/
del.icio.us: Announcing the New eXtensible Text Framework (XTF) Tutorial - CODE4LIB Archives

The California Digital Library (CDL) is pleased to announce the availability of an extensive self-guided tutorial for its eXtensible Text Framework (XTF) application (http://xtf.wiki.sourceforge.net/). XTF is an open source, highly customizable piece of software supporting the search, browse, and display of heterogeneous digital content and offering efficient and practical methods for creating customized end-user interfaces for distinct digital collections. The tutorial provides guidance for implementing and customizing XTF, from core functionality to overall look and feel.