Feed aggregator

David Fiander: Mac OS vs Emacs: Getting on the right (exec) PATH

planet code4lib - Wed, 2016-04-20 13:36

One of the minor annoyances about using Emacs on Mac OS is that the PATH environment variable isn't set properly when you launch Emacs from the GUI (that is, the way we always do it). This is because the Mac OS GUI doesn't rely on the shell to launch applications, but if you are using brew, or other packages that install command line tools, you do care about the PATH that the shell would have set up.

Apple has changed the way that the PATH is set over the years, and the old environment.plist method doesn't actually work anymore, for security reasons. For the past few releases, the official way to properly set up the PATH is to use the path_helper utility program. But again, that only really works if your shell profile or rc file is run before you launch Emacs.
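For reference, you can see what path_helper produces by running it from a terminal. The exact value depends on the contents of /etc/paths and /etc/paths.d, so treat this as a sketch:

$ /usr/libexec/path_helper -s
PATH="/usr/local/bin:/usr/bin:/bin:/usr/sbin:/sbin"; export PATH;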

So, we need to put a bit of code into Emacs' site-start.el file to get things set up for us:


(when (file-executable-p "/usr/libexec/path_helper")
  (let ((path (shell-command-to-string
               "eval `/usr/libexec/path_helper -s`;
echo -n \"$PATH\"")))
    (setenv "PATH" path)
    (setq exec-path (append (parse-colon-path path)
                            (list exec-directory)))))

This code runs the path_helper utility, saves the output into a string, and then uses the string to set both the PATH environment variable and the Emacs exec-path lisp variable, which Emacs uses to run subprocesses when it doesn't need to launch a shell.

If you are using the brew version of Emacs, put this code in /usr/local/share/emacs/site-lisp/site-start.el and restart Emacs.

Open Knowledge Foundation: Global Open Data Index Insights – Open Data in the Arab world

planet code4lib - Wed, 2016-04-20 08:55

This blog post was written by Riyadh Al Balushi from the Sultanate of Oman.

I recently co-authored a report with Sadeek Hasna that looks at the status of open data in the Arab World and the extent to which governments succeed or fail in making their data available to the public in a useful manner. We decided to use the results of the Global Open Data Index as the starting point of our research because the Index covered all the datasets that we chose to examine, for almost all Arab countries. Choosing to use the Global Open Data Index as a basis for our paper saved us time and provided us with a systematic framework for evaluating how Arab countries are doing in the field of open data.

We chose to examine only four datasets, namely: the annual budget, legislation, election results, and company registration data. Our selection was driven by the fact that most Arab countries have already published data in these areas, so there is content to look at and evaluate. Furthermore, the laws of most of the countries we examined impose a legal obligation on the government to release these datasets, making it more likely that governments would make an effort to make this data public.

Our analysis uncovered many good examples of government attempts at releasing data in an open manner in the Arab World. Examples include the website of the Ministry of Finance of the UAE, which releases the annual budget in Excel format; the legislation website of Qatar, which publishes the laws in text format and explicitly applies a Creative Commons license to the website; the Elections Committee website of Egypt, which releases the elections data in Excel format; and the website of the Company Register of Bahrain, which does not make the data directly available for download but provides a very useful search engine for finding all sorts of information about companies in Bahrain. We also found several civil society projects and business initiatives that take advantage of government data, such as Mwazna – a civil society project that uses the annual budget data in Egypt to communicate the financial standing of the government to the public in a visual way – and Al Mohammed Network, a business based on legislation data in the Arab World.

“Map of Arabic-speaking countries” by Illegitimate Barrister – Licensed under CC Attribution 3.0.

What was interesting is that even though many Arab countries now have national open data initiatives and dedicated open data portals, none of the successful open data examples in the Arab World are part of the national data portals; they are operated independently by the departments responsible for creating the data in question. While the establishment of these open data portals is a great sign of the growing interest in open data by Arab governments, in many circumstances the portals appear to be of very limited benefit, primarily because the data is usually out of date and incomplete. For example, the Omani open data portal provides population data only up to the year 2007, while Saudi Arabia’s open data portal provides demographic data only up to the year 2012. In some cases the data is not properly labeled, and it is impossible for the user to figure out when it was collected or published; an example is the dataset of statistics on disabilities in the population on the Egyptian government’s open data page. The majority of these portals seem to have been created as one-off initiatives, probably in response to the global trend of improving e-government services, and were never updated afterwards. The websites are also very hard to navigate and are not user-friendly.

Another problem we noticed, which applies to the majority of government websites in the Arab World, is that very few of these websites license their data under an open license; instead, they almost always explicitly declare that they retain the copyright over their data. In many circumstances this might not even be in line with the position of domestic copyright laws, which exempt official documents such as the annual budget and legislation from copyright protection. Such practices confuse members of the public and give many the impression that they are not allowed to copy or use the data without the permission of the government, even when that is not true.

Another big challenge for utilising government data is that many Arab government websites upload their documents as scanned PDF files that cannot be read or processed by computer software. For example, it is very common for the annual budget to be uploaded as a scanned PDF file when it would be more useful to the end user in a machine-readable format such as Excel or CSV. Such formats can easily be used by journalists and researchers to analyse the data in more sophisticated ways and to create charts that present the data in a more meaningful manner.

Finally, none of the datasets examined above were available for download in bulk; each document had to be downloaded individually. While this may be acceptable for typical users, those who need to do a comprehensive analysis of the data over an extensive period of time will not be able to do so efficiently. For example, a user who wants to analyse the change in the annual budget over a period of 20 years would have to download 20 individual files. A real open data portal should enable the user to download the whole dataset in bulk.

In conclusion, even though many governments in the Arab World have taken initiatives to release and open their data to the public, for these initiatives to have a meaningful impact on government efficiency, business opportunities, and civil society participation, the core principles of open data must be followed. There is an improvement in the amount of data that governments in the Arab World release to the public, but more work needs to be done. For a detailed overview of the status of open data in the Arab World, you can read our report in full here.

Casey Bisson: get list of functions in bash script…look for those in argv

planet code4lib - Wed, 2016-04-20 03:59
# Get function list as array
funcs=($(declare -F -p | cut -d " " -f 3))

# parse out functions and non-functions
i=1
declare -a cmdargs
declare -a otherargs
for var in "$@"; do
  if [[ " ${funcs[@]} " =~ " ${var} " ]]; then
    cmdargs[i]=${var}
  else
    otherargs[i]=${var}
  fi
  ((i++))
done

echo ${cmdargs[*]}
echo ${otherargs[*]}
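For context, here is a minimal, self-contained sketch of how the snippet might be used inside a script; the function names and the sample invocation are hypothetical:

#!/usr/bin/env bash

# Hypothetical functions defined by the script
start() { echo "starting $*"; }
stop()  { echo "stopping $*"; }

# Get function list as array (same approach as above)
funcs=($(declare -F -p | cut -d " " -f 3))

# Split "$@" into recognized function names and everything else
declare -a cmdargs otherargs
for var in "$@"; do
  if [[ " ${funcs[@]} " =~ " ${var} " ]]; then
    cmdargs+=("$var")
  else
    otherargs+=("$var")
  fi
done

# Invoke each recognized function with the leftover arguments
for cmd in "${cmdargs[@]}"; do
  "$cmd" "${otherargs[@]}"
done

# Example: ./example.sh start stop --force
# prints:  starting --force
#          stopping --force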

DuraSpace News: WORKSHOP: Publishing Assets as Linked Data with Fedora 4

planet code4lib - Wed, 2016-04-20 00:00

Austin, TX  David Wilcox, Fedora product manager and Andrew Woods, Fedora tech lead, will offer a workshop entitled, "Publishing Assets as Linked Data with Fedora 4" at the Library Publishing Forum (LPForum 2016) to be held at the University of North Texas Libraries, Denton, Texas on May 18 from 1:00 PM-3:30 PM. All LPForum 2016 attendees are welcome—there is no need to pre-register for this introductory-level workshop.

Max Planck Digital Library: MPG/SFX server maintenance, Wednesday 20 April, 8-9 am

planet code4lib - Tue, 2016-04-19 23:54

The MPG/SFX server will be updated to a new database (MariaDB) on Wednesday morning. The downtime will begin at 8 am and is scheduled to last until 9 am.

We apologize for any inconvenience.

District Dispatch: LIVE: Watch the confirmation hearing for Dr. Carla Hayden

planet code4lib - Tue, 2016-04-19 19:52

Watch Dr. Carla Hayden’s Confirmation Hearing Live on Wednesday, April 20, 2016 at 2:15 pm ET

The Confirmation Hearing for Librarian of Congress Nominee, Carla Hayden, by the U.S. Senate Committee on Rules and Administration, will air LIVE on C-SPAN3, C-SPAN Radio and C-SPAN.org on Wednesday, April 20, 2016 at 2:15pm ET.

The hearing will also be webcast from the Senate Committee on Rules and Administration hearing page. The webcast will be available approximately 15 minutes prior to the start of the hearing, and the archive will be available approximately 1 hour after the completion of the hearing.

Previous: Preparing for a librarian…Librarian (March 4, 2016)

The post LIVE: Watch the confirmation hearing for Dr. Carla Hayden appeared first on District Dispatch.

District Dispatch: 2016 winners of the National Medal for Museum and Library Service announced

planet code4lib - Tue, 2016-04-19 19:06

10 Winners of the 2016 National Medal for Museum and Library Service Announced

On April 19, 2016, the Institute of Museum and Library Services (IMLS) announced the 10 recipients of the 2016 National Medal for Museum and Library Service, the nation’s highest honor given to museums and libraries for service to the community. Now in its 22nd year, the National Medal celebrates libraries and museums that “respond to societal needs in innovative ways, making a difference for individuals, families, and their communities.”

The award will be presented in Washington, D.C. on June 1st. To learn more about the 2016 National Medal winners and 30 finalists, click here.

The 2016 National Medal recipients are:

  • Brooklyn Public Library (Brooklyn, N.Y.)
  • The Chicago History Museum (Chicago, Ill.)
  • Columbia Museum of Art (Columbia, S.C.)
  • Lynn Meadows Discovery Center for Children (Gulfport, Miss.)
  • Madison Public Library (Madison, Wis.)
  • Mid-America Science Museum (Hot Springs, Ark.)
  • North Carolina State University Libraries (Raleigh, N.C.)
  • Otis Library (Norwich, Conn.)
  • Santa Ana Public Library (Santa Ana, Calif.)
  • Tomaquag Museum (Exeter, R.I.)

Read More from IMLS:

“This year’s National Medal recipients show the transforming role of museums and libraries from educational destinations to full-fledged community partners and anchors,” said Dr. Kathryn K. Matthew, director of the Institute of Museum and Library Services. “We are proud to recognize the extraordinary institutions that play an essential role in reaching underserved populations and catalyzing new opportunities for active local involvement.”

The Institute of Museum and Library Services (IMLS) is the primary source of federal support for the nation’s 123,000 libraries and 35,000 museums.

Previous: Libraries: Apply now for 2016 IMLS National Medals (July 23, 2015)

The post 2016 winners of the National Medal for Museum and Library Service announced appeared first on District Dispatch.

District Dispatch: 3D/DC: A great day for libraries

planet code4lib - Tue, 2016-04-19 18:18

L-R: Diego Tamburini, Autodesk; Charlie Wapner, American Library Association; Adam Schaeffer, D.C. Public Library; U.S. Rep. Mark Takano (CA-41). Image from Becky Button, via Twitter.

Last week, I participated in 3D/DC, an annual Capitol Hill event exploring 3D printing and public policy. Programming focused on 3D printing’s implications for education, the arts, the environment, the workforce and the public good. In my reflections on last year’s 3D/DC, I averred that the event was “a good day for libraries.” This year, “good” graduated to “great.” Libraries were mentioned as democratizers of technology too many times to count over the course of the day, and the library community had not one, but two representatives on the speaker slate.

It was my privilege to be a panelist for a program exploring the role of 3D printing in closing the workforce skills gap. Thankfully, my national-level outlook on how libraries harness 3D printing to build critical workforce skills was buttressed by the on-the-ground perspective of Library Associate and Maker Extraordinaire Adam Schaeffer of the Washington, D.C. Public Library (DCPL). The other participants on the panel were Robin Juliano of the White House National Economic Council, Gad Merrill of TechShop and Diego Tamburini of Autodesk.

I described libraries as informal learning labs; places where people are free to use digital technologies like 3D printers, laser cutters and computer numerical control (CNC) routers to build advanced engineering skills through the pursuit of their personal creative interests. I argued that in combination with the suite of other job search and skill-building services libraries provide, library 3D printers are powerful tools for fostering workforce preparedness. Adam Schaeffer offered powerful anecdotes to illustrate this point. His kinetic overview of the wide array of products he’d helped patrons launch with a 3D printer was a tour de force of the 21st century library’s power as an innovative space.

It was a pretty light lift to convince those in attendance of the value of library 3D printing services to the task of workforce development. Nearly every word I, and my Library Land compatriot, Adam, uttered in furtherance of this effort was met with a sturdy nod or a knowing grin. I found this surprising at first – but after a minute, I realized it was in keeping with an ongoing trend. In my just-over two years at ALA, I’ve seen a steady proliferation of stories in popular news and blog outlets about 3D printers being used in libraries to build prototypes of new products and foster engineering and design skills. As a result, library “making” has reached an inflection point. It’s no longer seen as quaint, cute or trivial; it’s acknowledged as a means of advancing personal and societal goals.

That this is the case is a testament to the ingenuity of library professionals. From New York to California and everywhere in between, the men and women of the library community have built communities around their 3D printers; library makerspaces have become cathedrals of creativity – and their congregations are growing by the day. I know… because last week, I was preaching to the converted. To all the library makers out there: keep up the good work.

ALA would like to thank Public Knowledge for including libraries in 3D/DC this year. I’d personally like to thank Public Knowledge for the opportunity to speak during the event.

The post 3D/DC: A great day for libraries appeared first on District Dispatch.

David Rosenthal: How few copies?

planet code4lib - Tue, 2016-04-19 15:00
A question that always gets asked about digital preservation is "how many copies do I need to be safe?" The obvious questions in response are "how safe do you need to be?" - it isn't possible to be completely safe - and "how much can you afford to spend being safe?" - costs tend to rise rapidly with each additional 9 of reliability.

User rblandau at MIT-Informatics has a high-level simulation of distributed preservation that looks like an interesting way of exploring these questions. Below the fold, my commentary.

rblandau's conclusions from the first study using the simulation are:
  • Across a wide range of error rates, maintaining multiple copies of documents improves the survival rate of documents, much as expected.
  • For moderate storage error rates, in the range that one would expect from commercial products, small numbers of copies suffice to minimize or eliminate document losses.
  • Auditing document collections dramatically improves the survival rate of documents using substantially fewer copies (than required without auditing).
  • Auditing is expensive in bandwidth. We should work on (cryptographic) methods of auditing that do not require retrieving the entire document.
  • Auditing does not need to be performed very frequently.
  • Glitches increase document loss more or less in proportion to their frequency and impact. They cannot be distinguished from overall increases in error rate.
  • Institutional failures are dangerous in that they remove entire collections and expose client collections to higher risks of permanent document loss.
  • Correlated failures of institutions could be particularly dangerous in this regard by removing more than one copy from the set of copies for long periods.
  • We need more information on plausible ranges of document error rates and on institutional failure rates.
My comments on these conclusions are:
  • Auditing document collections dramatically improves the survival rate - no kidding! If you never find out that something has gone wrong you will never fix it, so you will need a lot more copies.
  • Auditing is expensive in bandwidth - not if you do it right. There are several auditing systems that do not require retrieving the entire document, including LOCKSS, ACE and a system from Mehul Shah et al at HP Labs. None of these systems is ideal in all possible cases, but their bandwidth use isn't significant in their appropriate cases. And note the beneficial effects of combining local and networked detection of damage. (A generic sketch of the hash-based idea appears after this list.)
  • Auditing does not need to be performed very frequently - it depends. Oversimplifying, the critical parameters are MeanTimeToFailure (MTTF), MeanTimeToDetection (MTTD) and MeanTimeToRepair (MTTR), and the probability that the system is in a state with an un-repaired failure is (MTTD+MTTR)/MTTF. MTTD is the inverse of the rate at which auditing occurs. A system with an un-repaired failure is at higher risk because its replication level is reduced by one. (A small worked example appears after this list.)
  • Institutional failures are dangerous - yes, because repairs are not instantaneous. At scale, MTTR is proportional to the amount of damage that needs to be repaired. The more data a replica loses, the longer it will take to repair, and thus the longer the system will be at increased risk. And the bandwidth that it uses will compete with whatever bandwidth the audit process uses.
  • Correlated failures of institutions could be particularly dangerous - yes! Correlated failures are the elephant in the room when it comes to simulations of systems reliability, because instead of decrementing the replication factor of the entire collection by one, they can reduce it by an arbitrary number, perhaps even to zero. If it gets to zero, it's game over.
  • We need more information - yes, but we probably won't get much. There are three kinds of information that would improve our ability to simulate the reliability of digital preservation:
    • Failure rates of storage media. The problem here is that storage media are (a) very reliable, but (b) less reliable in the field than their specification. So we need experiments, but to gather meaningful data they need to be at an enormous scale. Google, NetApp and even Backblaze can do these experiments, preservation systems can't, simply because they aren't big enough. It isn't clear how representative of preservation systems these experiments are, and in any case it is known that media cause only about half the failures in the field.
    • Failure rates of storage systems from all causes including operator error and organizational failure. Research shows that the root cause for only about half of storage system failures is media failure. But this means that storage systems are also so reliable that collecting failure data requires operating at large scale.
    • Correlation probabilities between these failures. Getting meaningful data on the full range of possible correlations requires collecting vastly more data than for individual media reliability.
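Two small sketches to make the auditing points above concrete. First, the general idea behind auditing without retrieving the entire document is nonce-salted hashing: the auditor issues a fresh random challenge, each side hashes the challenge plus its copy locally, and only the digests cross the network. This is a generic illustration of the idea, not the actual protocol of LOCKSS, ACE or the HP Labs system, and the file names are hypothetical:

# Generic nonce-salted hash audit (illustrative only; file names are hypothetical).
# A fresh nonce prevents a replica from replaying a precomputed answer.
nonce=$(openssl rand -hex 16)

# Auditor side: expected digest computed from its own copy
expected=$({ printf '%s' "$nonce"; cat local_copy.pdf; } | sha256sum | cut -d' ' -f1)

# Replica side: digest computed remotely over its copy; only this value is returned
reported=$({ printf '%s' "$nonce"; cat replica_copy.pdf; } | sha256sum | cut -d' ' -f1)

if [ "$expected" = "$reported" ]; then echo "copies agree"; else echo "damage detected"; fi

Second, the (MTTD+MTTR)/MTTF approximation can be evaluated directly; the numbers here are purely hypothetical:

# Hypothetical numbers: MTTF = 50,000 h, monthly audits (average MTTD ~ 360 h), MTTR = 72 h
awk 'BEGIN { mttf = 50000; mttd = 360; mttr = 72;
             printf "P(un-repaired failure) = %.4f\n", (mttd + mttr) / mttf }'
# prints: P(un-repaired failure) = 0.0086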

DuraSpace News: Upcoming LYRASIS and DuraSpace CEO Town Hall Meetings

planet code4lib - Tue, 2016-04-19 00:00

Austin, TX  The LYRASIS and DuraSpace Boards announced an "Intent to Merge" the two organizations in January. As part of ongoing merger investigations LYRASIS CEO Robert Miller and DuraSpace CEO Debra Hanken Kurtz have been working with our communities to share information widely about the proposed merger and to gather input. 

District Dispatch: ALA urges House and Senate approps subcommittees to support LSTA, IAL

planet code4lib - Mon, 2016-04-18 21:17

IAL Grant Applications Due by May 9

The American Library Association filed comments last week with House and Senate Appropriations Committees in support of funding for the Library Services and Technology Act (LSTA) and Innovative Approaches to Literacy (IAL).

As the Appropriations Committees begin their consideration of 12 appropriations bills, ALA is urging the Committees to fund LSTA at $186.6 million and IAL at $27 million for FY 2017. Both programs received increases in last year’s FY 2016 funding bills and were included in the President’s February budget request to Congress.

“Without LSTA funding, these and many other specialized programs targeted to the needs of their communities across the country likely will be entirely eliminated, not merely scaled back. In most instances, LSTA funding (and its required but smaller state match) allows libraries to create new programs for their patrons,” noted Emily Sheketoff in comments to both Committees.

The $186.6 million funding level for LSTA mirrors last year’s request to Congress from the President and is also supported by “Dear Appropriator” letters recently circulated in the Senate and House for Members’ signatures. LSTA was funded at $155.8 million for FY 2016 and ALA expressed concern that the President is requesting only $154.8 million for FY 2017.

In supporting $27 million in IAL funding for school libraries, ALA commented that “studies show that strong literacy skills and year-round access to books is a critical first-step towards literacy and life-long learning. For American families living in poverty, access to reading materials is severely limited. These children have fewer books in their homes than their peers, which hinders their ability to prepare for school and to stay on track.”

Congress provided $27 million in FY 2016 IAL funding and the President requested the same level for FY 2017. IAL, which dedicates half of its resources for school libraries, was authorized in last year’s Every Student Succeeds Act. “Dear Appropriator” letters circulated in the Senate and House on its behalf called for $27 million in FY 2017 funding.

ALA reminds its members that the Department of Education recently announced that it has opened its FY 2016 window for new IAL grant applications. The DOE’s announcement with full application filing details is available online. Grant applications must be submitted by May 9, 2016.

In additional support for library funding, LSTA and IAL were highlighted in the annual Committee for Education Funding (CEF) Budget Response to Congress: Education Matters: Investing in America’s Future. The CEF budget response, which reserves two chapters for the LSTA and IAL programs, provides an explanation of the programs, examples of how funds have been used, and a justification for the funding levels sought.

Note: You can view the testimonies submitted by Emily Sheketoff to the House and Senate.

The post ALA urges House and Senate approps subcommittees to support LSTA, IAL appeared first on District Dispatch.

HangingTogether: Metadata for research data management

planet code4lib - Mon, 2016-04-18 19:39

That was the topic discussed recently by OCLC Research Library Partners metadata managers, initiated by John Riemer of UCLA.  With increasing expectations that research data created with the support of grant funding will be archived and made available to others, many institutions are becoming aware of the need to collect and curate this new scholarly resource. To maximize the chances that metadata for research data are shareable (that is, sufficiently comparable) and helpful to those considering re-using the data, our communities would benefit from sharing ideas and discussing plans to meet emerging discovery needs. OCLC Research Scientist Ixchel Faniel’s two-part blog entry “Data Management and Curation in 21st Century Archives” (Sept 2015) provided useful background to this discussion.

The discussions revealed a wide range of experiences, from those just encountering researchers who come to them with requests to archive and preserve their research data to those who have been handling research data for some years. National contexts differ. For example, our Australian colleagues can take advantage of Australia’s National Computational Infrastructure for big data and the Australian Data Archive for the social sciences. Canada is developing a national network called Portage for the “shared stewardship of research data”.

The US-based metadata managers were split about whether to have a single repository for all data or a separate repository for research data, although there seems to be a movement to separate data that is to be re-used (providing some capacity for computing on it) from data that is only to be stored. A number of fields have a discipline-based repository, or researchers take advantage of a third-party service such as DataCite, also used for discovery. The library can fill the gap for research data that does not have a better home.

The recently published Building Blocks: Laying the Foundation for a Research Data Management Program includes a section on metadata:

Datasets are useful only when they can be understood. Encourage researchers to provide structured information about their data, providing context and meaning and allowing others to find, use and properly cite the data. At minimum, advise researchers to clearly tell the story of how they gathered and used the data and for what purpose. This information is best placed in a readme.txt file that includes project information and project-level metadata, as well as metadata about the data itself (e.g., file names, file formats and software used, title, author, date, funder, copyright holder, description, keywords, observation unit, kind of data, type of data and language).
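As a concrete illustration, here is a minimal sketch of such a readme.txt skeleton; the field names simply mirror the elements listed above and are illustrative rather than a prescribed standard:

# Sketch only: adapt the fields to the project at hand
cat > readme.txt <<'EOF'
Project title:
Author(s) / data steward:
Date(s) of data collection:
Funder / grant number:
Copyright holder and license:
Description: how the data were gathered and used, and for what purpose
Keywords:
Observation unit / kind of data / language:
Files: names, formats, and software used
EOF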

A number of institutions have developed templates to capture metadata in a structured form. Some metadata managers noted the need to keep such forms as simple as possible as it can be difficult to get researchers to fill them in. All agreed data creators needed to be the main source of metadata. But how to inspire data creators to produce quality metadata? New ways of training and outreach are needed.

We also had general agreement on the data elements required to support re-use by others: licenses, processing steps, tools, data documentation, data definitions, data steward, grant numbers and geospatial and temporal data (where relevant). Metadata schema used include Dublin Core, MODS (Metadata Object Description Schema) and DDI (Data Documentation Initiative’s metadata standard). The Digital Curation Centre in the UK provides a linked catalog of metadata standards. The Research Data Alliance’s Metadata Standards Directory Working Group has set up a community-maintained directory of metadata standards for different disciplines.

The importance of identifiers for both the research data and the creator has become more widely acknowledged. DOIs, Handles and ARKs (Archival Resource Key) have been used to provide persistent access. Identifiers are available at the full data set level and for component parts, and they can be used to track downloads and potentially help measure impact. Both ORCID (Open Researcher and Contributor ID) and ISNI (International Standard Name Identifier) are in use to identify data creators uniquely.

Some have started to analyze the metadata requirements for the research data life cycle, not just the final product. Who are the collaborators? How do various projects use different data files? What kind of analysis tools do they use? What are the relationships of data files across a project, between related projects, and to other scholarly output such as related journal articles? The University of Michigan’s Research Data Services is designed to assist researchers during all phases of the research data life cycle.

Curation of research data as part of the evolving scholarly record requires new skill sets, including deeper domain knowledge, data modeling, and ontology development. Libraries are investing more effort in becoming part of their faculty’s research process and offering services that help ensure that their research data will be accessible if not also preserved. Good metadata will help guide other researchers to the research data they need for their own projects—and the data creators will have the satisfaction of knowing that their data has benefitted others.

Graphic by Martin Grandjean. Source: https://commons.wikimedia.org/wiki/File:Social_Network_Analysis_Visualization.png

 

About Karen Smith-Yoshimura

Karen Smith-Yoshimura, senior program officer, works on topics related to creating and managing metadata with a focus on large research libraries and multilingual requirements.


District Dispatch: What to know before you go

planet code4lib - Mon, 2016-04-18 19:23

Preparing for National Library Legislative Day 2016 (or Virtual Library Legislative Day?), but not sure what to expect? We have some handy resources for you, straight from advocacy expert Stephanie Vance!

 

The post What to know before you go appeared first on District Dispatch.

Library of Congress: The Signal: A Beginner's Guide to Record Retention

planet code4lib - Mon, 2016-04-18 16:29

This is a guest post by Carmel Curtis.

Interior view of auditorium in the Howard Gilman Opera House at BAM, 2010. Photograph by Elliot Kaufman.

Over the past eight months I have been working as the National Digital Stewardship Resident at the Brooklyn Academy of Music. BAM is the oldest continually running performing arts center in the country and is home to a range of artistic expressions in dance, theater, film, and beyond. Over 150 years old, BAM has a rich history.

I have been working on a records management project at BAM. My mentor, processing archivist Evelyn Shunaman, and I have conducted 41 hour-long interviews with all divisions, departments and sub-departments to get a sense of what and how many electronic records are being created, saved and accessed. Then we created or revised departmental Record Retention Schedules to ensure they reflect BAM’s current workflows and practices.

Here are some of the basics of records retention and tips on creating a Records Retention Schedule.

A Records Retention Schedule is an institutional or organizational policy that defines the information that is being created and identifies retention requirements based on legal, business and preservation requirements. An RRS can take many forms. Example 1 shows our RRS spreadsheet.

Item | Record Series Title | Description/Example | Total Retention | Transfer to Archives
(Item number) | (Category of record) | (Explanation of record category) | (Time period records are retained) | (Whether or not records are sent to the Archives)
AD-1 | Audience Development Survey | Questionnaire and results from survey conducted every 3 years on BAM audience demographic | permanent | yes

Example 1. BAM RRS spreadsheet.

An RRS is a way for an institution to:

  • Be accountable to any legal requirements – An RRS is a policy that ensures records are retained in accordance with state or federal legal requirements. It provides an outline for the minimum legal requirements related to the retention and destruction of records.
  • Identify archivally significant materials – Appraisal and selection are not dead. While storage may be increasing in capacity and decreasing in cost, there is still considerable need for decisions to be made around what comes into the Archive and what does not. An RRS can help provide a framework for this decision making process.
  • Identify when things can be deleted – People want permission to be able to delete their digital content. Similar to paper and other physical based records, there is little incentive to get rid of things until one runs out of space. With electronic records, it is not uncommon to purchase more storage instead of deleting unnecessary files. However, digital clutter is a real thing that can induce stress and anxiety as well as make retrievability challenging. Having an RRS can help reduce digital clutter by identifying what records can be deleted and when.
  • Assist the archive in preservation planning – Once an RRS has been created, it can be a helpful tool in planning for the specific preservation needs of the categories of records coming into the Archive. With the assistance of an RRS, you can think through file-format identification and decisions around normalization, requirements around minimum associated metadata, and estimates of how much information will need to be transferred into the Archive and thus how much space will be required.

Records management may be different than archives management but when there is no Records Manager, the responsibility often falls on the Archivist. While records management is concerned with all information created, not exclusively information that has archival significance, it can be useful for the Archive to have a comprehensive picture of work that is being done across the institution. Having a wide-ranging understanding of workflows will only strengthen decisions around selection of what needs to come into the Archive.

So how do you begin? Here are some tips on developing an RRS based off of my experience at BAM.

  1. Work with IT. While the creation of an RRS does not necessarily require technical expertise or someone with an information technology background, the eventual transfer of materials into the Archive and the management of an electronic repository will take some technical know-how. Collaborating with IT at an early stage will only improve relations down the road. If you don’t have an IT Department, it is okay! The Archivist often wears many hats.
  2. Talk to as many staff members as possible! Those who create records are the experts in the records they are creating. Trust their words and do not aim to alter their workflows. Work with them! Conduct an interview with a general framework, not a strict roadmap. Give people space to speak and guide them when necessary. Consider this interview outline:
    1. Walk through the general responsibilities of your department with an emphasis on what kinds of records or information is being created.
      1. Who creates record(s)?
      2. How it is created? Specific software?
      3. What format is it?
      4. How is it identified (filename/folder)? Standard naming conventions?
      5. Are there multiple copies? Multiple versions? How are finals identified?
      6. Where is it stored?
      7. How long is it used/accessed/relevant to your department?
    2. What is the historical significance/long-term research value in information created by your department?
  3. Make people feel comfortable and not embarrassed. The Archive asking about records can have an intimidating feel. Few people are as organized as they would like to be. These interviews should not be about shaming people but are an opportunity to listen and identify issues across your institution.
  4. To record or not to record? To transcribe or not to transcribe? Think carefully about the decision to audio or video record these interviews. You want your interviewee to feel comfortable and you also want to be able to refer back to things you may have missed. Transcribing interviews can be helpful but it takes a considerable effort. Consider the amount of time and resources that are available to you.
  5. Determine a format for your RRS. Consider making a spreadsheet with the column headings from Example 1 (a minimal CSV sketch of such a spreadsheet appears after this list).
  6. Develop Record Series Titles based off of workflows present within the department. To encourage compliance to an RRS, it is recommended to have the categories be as reflective of workflows within your institution as possible. If you think of it as a map or a crosswalk, developing an RRS to mirror record types and folder structures currently being used will only make things easier. Directly referencing language used by departments within the Records Series Title or Description will facilitate the process of compliance.
  7. Determine retention periods and whether or not records should be transferred to the Archive. Use this decision tree to help establish appropriate time periods.
  8. Get legal advice. For record series with legal considerations, consult your legal department. If there is no legal department, look at existing records retention schedules and at your local legal requirements. Here are some useful resources:
    1. New York State Archives Retention and Disposition Schedule for Government Based Records – Includes useful justifications of all retention categories.
    2. IRS – How Long Should I Keep Records? – Guidance on financial based records.
    3. Society for Human Resource Management’s Federal Records Retention Requirements – Legal guidance on retention periods for HR based records.

    It is always best to look up the underlying laws cited in example RRSs to confirm applicable interpretation.

  9. To help mitigate duplication, consider limiting records transferred to the Archive exclusively to the creating department. In other words, for information shared across departments or created collaboratively across departments, consider getting the department that holds the final version to transfer the record to the Archive, as opposed to all departments that have a copy.
  10. Make a note of information that is required to be transferred to the Archive but is stored in databases or other systems used by your institution. If any information that is required to be transferred into the Archive is stored on removable media or third party proprietary systems, make sure these are flagged and a specific archival ingest process is developed for these records.
  11. Appoint a departmental records coordinator and require yearly approval. Designating responsibility to a specific person will dissuade finger pointing. If every department has a specific records retention coordinator, there will be a person with whom the Archives can communicate with, thus improving likelihood of compliance. It is important to make sure that the RRS is reviewed annually to ensure that it continues to reflect current workflows and practices.
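To make step 5 concrete, here is the Example 1 row expressed as a CSV file; this is purely illustrative, and any spreadsheet with the same columns would serve just as well:

cat > rrs.csv <<'EOF'
Item,Record Series Title,Description/Example,Total Retention,Transfer to Archives
AD-1,Audience Development Survey,"Questionnaire and results from survey conducted every 3 years on BAM audience demographic",permanent,yes
EOF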

Writing an RRS is a big step; however, it is only the beginning. At BAM, now that we have completed revisions on our RRS, we are working on developing workflows for transferring materials into the Archive.

Using TreeSize Pro, we have scanned the network storage systems of all departments and have estimated the amount of data that will need to be brought into the Archives based off of the RRS.
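For shops without TreeSize Pro, a rough equivalent can be sketched with standard command-line tools; this assumes departmental shares are mounted under a hypothetical /mnt/shares path and that a sort with human-numeric support (GNU sort) is available:

# Approximate per-department totals, largest first
du -sh /mnt/shares/*/ | sort -rh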

We are now working to establish timelines and requirements for when and how departments should transfer materials to the Archive. Presently, we are testing AVPreserve’s Exactly file delivery tool as a way to receive files and to require minimum metadata to be associated with deposits. Follow the NDSR-NY blog for updates on this phase of the project as it continues to unfold.

LITA: Transmission #2

planet code4lib - Mon, 2016-04-18 16:27

Welcome back to Begin Transmission, the biweekly vlog interview series. Joining me for today’s discussion is none other than Brianna Marshall, our fearless leader here at the LITA Blog. Remember to follow her on Twitter @notsosternlib.

 

Begin Transmission will return May 2nd! Stay tuned.

District Dispatch: Librarians, you have been challenged!

planet code4lib - Mon, 2016-04-18 14:10

As part of National Library Legislative Day 2016 (NLLD), we are offering library advocates unable to attend in person the chance to participate through Virtual Library Legislative Day (VLLD)!

This year, we’re teaming up with the Harry Potter Alliance (HPA) to help expand our efforts and take advantage of the momentum started by the nearly 400 library supporters who plan to attend NLLD. Their Chapters’ members are pledging their time to make calls, send emails, tweet, and otherwise raise awareness about library issues during the week of May 2nd. To date, the HPA has received pledges for nearly 500 actions from their members!

We think we can do our wizarding friends one better.

Over the next few weeks, please take a second to register and then ask everyone in your circles — members, followers, patrons, fellow library staffers, and listservs — to join us! We’ll follow up by sending out talking points and other handy resources you can use to advocate easily and effectively. We’ll also be including a link to a webstream of the National Library Legislative day program, live from Washington, on the morning of May 2nd. You’ll get to hear our keynote speaker, former Congressman Rush Holt, and listen in on this year’s issue briefing.

There’s also a handy resource toolkit, put together by the Harry Potter Alliance, for librarians who may want to get younger advocates involved. You can also find out more by visiting the United for Libraries and the Harry Potter Alliance webpages, or by subscribing to the Action Center.

Please feel free to contact Lisa Lindle, Grassroots Communications Specialist for ALA Washington, if you have any questions.

The post Librarians, you have been challenged! appeared first on District Dispatch.

FOSS4Lib Recent Releases: VuFind - 2.5.3

planet code4lib - Mon, 2016-04-18 13:24
Package: VuFind
Release Date: Monday, April 18, 2016

Last updated April 18, 2016. Created by Demian Katz on April 18, 2016.

Minor bug fix / PHP 7 compatibility release.

DuraSpace News: VIVO Updates for April 18–OpenVIVO for Everyone, Help with Implementation

planet code4lib - Mon, 2016-04-18 00:00

From Mike Conlon, VIVO Project Director

Did we launch OpenVIVO?  Yes, we did.  See http://openvivo.org  Have an ORCID?  Sign on.  Don't have an ORCID?  Get an ORCID at http://orcid.org and sign on.  It's that easy. If you follow VIVO on Twitter (@vivocollab) you'll see good people saying nice things about OpenVIVO.  It would be great if you did that too!

DuraSpace News: JOIN the LYRASIS and DuraSpace CEO Town Hall Meeting on April 21

planet code4lib - Mon, 2016-04-18 00:00

Austin, TX  The LYRASIS and DuraSpace Boards announced an "Intent to Merge" the two organizations in January. Join us for the second session of the CEO Town Hall Meeting series with Robert Miller, CEO of LYRASIS and Debra Hanken Kurtz, CEO of DuraSpace. Robert and Debra will review how the organizations came together to investigate a merger that would build a more robust, inclusive, and truly global community with multiple benefits for members and users. They will also unveil a draft mission statement for the merged organization.

DuraSpace News: CALL for Proposals: Digital Preservation 2016 "Building Communities of Practice"

planet code4lib - Mon, 2016-04-18 00:00

From Oliver Bendorf, program associate, Digital Library Federation

Washington, DC  The National Digital Stewardship Alliance (NDSA) invites proposals for Digital Preservation 2016, to be held in Milwaukee, Wisconsin, 9-10 November 2016.
