Using the TeraGrid [1] and the SRB DataGrid [2], we have sufficient
computational and storage facilities to run processing tasks that would
normally be prohibitively expensive. By integrating text and data mining
tools [3][4] within the Cheshire3 [5] information architecture, we can
parse the natural language present in 20 million MARC records (the
University of California’s MELVYL collection) and extract information to
provide to search and retrieval applications. In this talk, we’ll discuss
the results of applying new techniques to ‘old’ data.
1: http://www.teragrid.org
2: http://www.sdsc.edu/srb
3: http://www.ailab.si/orange
4: http://www-tsujii.is.s.u-tokyo.ac.jp/
5: http://www.cheshire3.org/
Rob Sanderson (azaroth@liv.ac.uk)
| Attachment | Size |
|---|---|
| code4lib.odp.zip | 32.3 KB |
