Data Access, Data Preservation and Data Publication Issues (6/20/12)
Access to high-quality, scientific data is important for geoscience research, development and education. It is important that federal government agencies, such as the U.S. Geological Survey, are able to maintain accessible databases on natural resources, geologic and geophysical information, and basic information such as topography. In addition to government research, much information is collected by the academic and private sectors using, at least in part, U.S. federal funding. Though this information is often competitive and proprietary, many feel that any research coming from taxpayers’ dollars should be publicly accessible. Geosciences are global, so it is important that data is maintained and accessible worldwide.
The government plays a role in these issues by maintaining federal databases and helping to ensure data access in other countries through funding and federal preservation programs. In addition, the federal government has been held increasingly accountable for data integrity and the quality of scientific data it uses in policymaking. Funding and legislation will hopefully continue to provide widespread access to scientifically sound data. The 111th Congress approved the reauthorization of the America COMPETES Act (H.R. 5116), which authorizes increases for research at the National Science Foundation, the National Institute of Standards and Technology and the Energy Department’s Office of Science. It requires the White House Office of Science and Technology Policy (OSTP) to coordinate and organize public access to government-funded research, including the development of online databases of scientific information within agencies. How the requirement will affect peer-reviewed journals and publishers, though, is unknown.
Some of AGI's member societies have issued their own position statements on data preservation and open access.
Briefing on “Digitizing Science Collections: Unlocking Data for Research and Innovation” (06/12)
On June 5, 2012, the Natural Science Collections (NSC) Alliance hosted a briefing titled “Digitizing Science Collections: Unlocking Data for Research and Innovation.” The briefing discussed the technology behind digitizing natural science collections and its ability to increase access. The presenters focused on biological collections and the impact digitizing would have on the future of information access.
Mary Jameson began the briefing by saying “collecting is a scientific pursuit.” She discussed the powers of collections in various aspects of biodiversity. Even Science, Technology, Engineering, and Mathematics (STEM) Education would benefit from digitizing information, as it would provide a resource for students. Jameson said cyberinfrastructure is used to drill down into the data and re-use specimens. Digitizing collections would assist societal needs like management of agricultural pests, land use planning, and conservation planning. She closed by saying digitizing biological collections is the “nexus for innovation” and new discoveries.
Larry Page said the biodiversity information contained in natural history collections is more “than in all other sources of information combined.” He emphasized that this information is inaccessible to potential users because the collections are “dispersed” across the U.S. Page broke digitizing into three parts: databasing, georeferencing, and imaging. Databasing is the “what, where, and when” aspect, while georeferencing provides information useful for generating distribution maps. Images are beneficial in scientific studies and to a range of users, including federal and state agencies as well as educators, who can use very basic principles to “get them [students] excited about science.” He discussed the barrier created by the lack of digitization and how the National Science Foundation (NSF) is providing funding for this project. He briefly mentioned Integrated Digitized Biocollections (iDigBio), which is funded by NSF. Currently, 130 institutions across the U.S. are receiving support. He said that getting past this barrier will “lead to enhanced environmental policy.”
Michael Mares finished off the presentations by emphasizing that collections “are irreplaceable.” He referred to biological collections as a “universal investment” and said the U.S. would have an advantage if the data is fully utilized, which is currently not the case. Mares listed some problems with collections, including storage in old buildings, dispersal, poor maintenance, and poor funding. He listed the types of basic data that must be gathered in order to fully comprehend and protect collections. Mares said that after digitization, “we’re going to see robotics and rapid identifications.”
The three speakers fielded questions from the audience. One audience member asked where the rest of the money would come from, given that the collections are huge and NSF is providing only a small amount. Mares used the University of Oklahoma as an example and said it obtained funding through state bond elections and some money from private donors. Page mentioned that NSF is giving $100 million over 10 years to the biological digitization project, as cited in the September 2011 BioScience article “New Push to Bring U.S. Biological Collections to the World’s Online Community.” NSF’s web site lists these grants as available to the geosciences for paleontological collections as well as to the biological sciences. Proposals will only be accepted from universities, non-profit, non-academic organizations, and state and local governments.
The aforementioned article discusses the development of InvertNet, “a virtual insect museum” by the Illinois Natural History Survey. The principal investigator for this project is Chris Dietrich (email@example.com). Corinna Gries (firstname.lastname@example.org) at the University of Wisconsin-Madison will capture images of 2.3 million specimens of North American lichen and mosses in her research. Digitization would allow the information to be shared among the public, which can advance learning.
The Geological and Geophysical Data Preservation Program was created by the Energy Policy Act of 2005 (P.L. 109-58, Sec. 351). The program is managed by the United States Geological Survey (USGS) with support from the state geological surveys. The data archived by the program includes “geologic, geophysical, and engineering data, maps, well logs, and samples.” The act authorized $30 million per year for the program for 2006 through 2010. USGS plans to award about $600,000 in fiscal year 2012 to state geological surveys to fund preservation activities.
For more information on the briefing, please visit the NSC Alliance web site.
Administration Announces Big Data Initiative (04/12)
In 2012, NSF will award the first round of grants to support EarthCube, an initiative to support integrated data management structures in the geosciences. DOE will provide $25 million in funding to establish the Scalable Data Management, Analysis and Visualization (SDAV) Institute. Based at Lawrence Berkeley National Laboratory, the SDAV Institute will develop tools to manage and visualize data on DOE’s supercomputers. USGS has announced its fiscal year 2012 working groups for the John Wesley Powell Center for Analysis and Synthesis. Working groups at the Powell Center collaborate on Earth systems science projects with state-of-the-art computing and facilities.
OSTP Releases Interagency Public Access Coordination Report to Congress (04/12)
World Bank Implements New Open Access Policy (04/12)
On the day of the announcement, the World Bank adopted a set of copyright licenses for content published by the institution. These new licenses, offered by Creative Commons, allow anyone to distribute, reuse, and build upon the Bank’s published work, though users are required to credit the Bank for the data. The portal, known as the Open Knowledge Repository, meets the Open Archives Initiative’s protocol for metadata harvesting.
Issa Will Not Seek Passage of Research Work Act, Elsevier Drops Support (02/12)
H.R. 3699 was introduced by Representative Darrell Issa (R-CA) and is cosponsored by Representative Carolyn Maloney (D-NY). The measure would prevent federal agencies from any type of “network dissemination” of “private-sector research” without the prior consent of the publisher, or the assent of the author or employer. “Private-sector research” is defined as an article intended to be published in a scientific journal describing or interpreting research funded by a federal agency to which the publisher has entered into an agreement to make a value-added contribution such as editing or peer review. Since Elsevier’s withdrawal of support, Issa has announced he will no longer seek passage of the bill.
White House Extends Opportunity to Comment on Open Access Policy (12/11)
Bill to Increase Transparency of Federal Grants Passes Committee (12/11)
UNESCO Hosts Open Access Forum (12/11)
Kentucky Unveils Most Detailed State Geologic Map Series in United States (11/11)
Geologic maps can show surface and subsurface rock types, formations, and structures and are a tremendous economic and recreational contribution to society. Information provided by geologic maps assists in the production of resources, protection of groundwater and the environment, stability of foundations and infrastructure, and avoidance of hazards. The Kentucky Geological Survey has found that its geologic maps are valued at $2.25 billion to $3.35 billion in 1999 U.S. dollars.
Federal Geographic Data Committee Unveils Geospatial Web Platform (11/11)
On the web site, users can create their own maps utilizing the web site’s data as well as their own data brought to the platform, without any software download requirements. Users can also collaborate in public or private groups through the platform, giving them the ability to share data.
This project brought together the efforts of many related agencies. The FGDC is composed of representatives from the Executive Office of the President, and cabinet level and independent federal agencies including the Department of the Interior, the Environmental Protection Agency and the National Oceanic and Atmospheric Administration. In conjunction with feedback from stakeholders and experts, FGDC will collaborate with its partners to continuously expand the content and resources available through the site.
OSTP Seeks Input on Public Access to Data (11/11)
OSTP released two Requests for Information (RFI) in November, soliciting comment from the public and stakeholders on ways to preserve long term access to federally funded research. Comments and answers to the RFIs are encouraged and should be sent by email to the Public Access and Digital Data policy groups.
University of Virginia Resists Subpoena for Professor’s Research (04/11)
Judge Rules Google Cannot Digitize Books (03/11)
Senate Passes Bipartisan Patent Reform Act (03/11)
Chinese Court Rejects Jailed U.S. Geologist's Appeal (2/11)
Publishing Company Wiley Announces Open Access Journals (2/11)
Nature Publishing Group (NPG) has also expanded the number of its journals with an open access option. Dozens of NPG publications now allow researchers to self-archive articles for free or have their work published immediately for open access for a publication fee. More information and a full list of participating journals can be found here.
Court Case to Impact University Patents (01/11)
More than 50 universities and science societies have joined Stanford in the case by filing “friend of the court” briefs. Former Senator Birch Bayh (D-IN) has filed a brief as well, stating his intention to favor the rights of universities over individual inventors.
Stanford initially lost the suit in a federal district court, but the Supreme Court has decided to hear the case on February 28, 2011. Amicus briefs from the Association of Public and Land-Grant Universities (APLU brief) and other organizations are available online.
Administration Releases Scientific Integrity Guidelines (12/10)
The memorandum is divided into five parts. The first part states the foundations of scientific integrity in government work, including honesty, credibility, open access and principles for science communication. OSTP calls for government data to be accessible to the public following the Open Government Initiative. The second part primarily discusses communication of government science with the media. Science communications should be objective and non-partisan, and scientific findings may not be altered by any agency officials. Disputes about proceeding with media interviews should be resolved by “mechanisms”, but the memo does not define the mechanisms. The third part discusses the role and establishment of federal advisory committees. The fourth part discusses professional development of government scientists, including how they can publish in peer-reviewed literature and how they can work with science societies. The fifth part discusses the role of the Office of Management and Budget (OMB) in reviewing science-based congressional testimony. The memo concludes by asking each agency to prepare a report within 120 days on how it will implement the policies set forth in the document.
Any questions regarding the memorandum can be directed to email@example.com
Data is important for geoscience research and development and geoscience education. Federal government agencies, such as the U.S. Geological Survey, maintain data on natural resources and basic details such as topography. It is important that these databases are maintained and accessible. Geosciences are global and it is also important that data from around the world is maintained and accessible. The federal government plays a role in these issues by maintaining federal data and helping to ensure data access in other countries.
In addition to government data, information is also collected by the academic, private and other sectors. Much of the information collected by the U.S. academic sector is derived from federal funding in the U.S. and thus should be maintained and accessible. Information collected by the private sector is competitive and proprietary. Over time, some private sector information becomes non-proprietary, but can remain useful for the broad geoscience community if there is some mechanism to maintain this data.
Efforts are being made to preserve this data as well as other data within federal-state-local partnerships. One example of such a partnership, supported by the 109th Congress, is the data preservation section of the Energy Policy Act of 2005. The section authorizes funding for the U.S. Geological Survey Data Preservation Program in partnership with the state geological surveys to maintain data, including that contributed by the private sector.
Although the intent is to make federally-funded publications accessible to the taxpayers who pay for the research, the measure seems to work outside copyright law and can negatively impact non-profit society journals. Non-profit society journals tend to have a long history of high quality publications, and the publications provide needed support for their societies. If the NIH model were applied to all federally funded science, geoscience society journals might suffer serious deterioration, which could lead to the demise of the societies themselves. This would lead to a reduction of high quality publications and an increase in commercial publication prices (as commercial publishers would not have to compete with lower cost non-profit publishers).
In the past decade, data integrity has become more of an issue. Accusations of political appointees changing scientific data to further their own goals have called into question the reliability and quality of data being released by the government. A 2003 report by the House Committee on Government Reform said the Bush Administration had “manipulated the scientific process and distorted or suppressed scientific findings.” A follow-up report in 2007, Political Interference with Climate Change Science Under the Bush Administration, found that the White House censored climate change scientists and significantly edited climate change reports. In addition, the variety of sources and ease of access provided by the internet present another problem in monitoring the integrity of scientific data. Responding to these concerns, President Obama issued a Presidential Memorandum on Scientific Integrity in March 2009. Following that memorandum, the White House OSTP released scientific integrity guidelines for federal agencies on December 17, 2010, which require federal agencies to release their guidelines within 120 days of the OSTP release.
The Federal Information Quality Act
The Data Quality Act aimed to quell concerns about the accuracy of federal data. The Center for Regulatory Effectiveness claimed, “Federal agencies regularly publish information on which influential policies or decisions are made that have significant impact on the economy of a region or even the entire nation or that can influence wrong decisions about public and private investments, opportunities and other issues. States, cities, counties, as well as private and public organizations, and citizens in general have had difficulties getting the federal agencies to either substantiate the information published or correct the information to ensure its quality.”
OMB defines information "quality" as information that offers utility, objectivity and integrity to information consumers where:
Agencies must define these terms in a rigorous, operational fashion to ensure that they have a shared understanding with their customers of the information provided."
For online access to the agency specific OMB guidelines go to the Center for Regulatory Effectiveness Website.
The 111th Congress passed the reauthorization of the America COMPETES Act (H.R. 5116), which authorizes increases for research at the National Science Foundation, the National Institute of Standards and Technology and the Energy Department’s Office of Science. The act requires the White House Office of Science and Technology Policy to coordinate and organize public access to government-funded research, including the development of online databases of scientific information within agencies. Congress included a statement recognizing the role of scientific publishers in the peer-review process; however, non-profit science societies will need to consider the impacts of this legislation on the quality and value of their long-standing journals.
Sources: AGI's Monthly Review.
Contributed by Linda Rowan, Geoscience Policy staff; Dana Thomas, AAPG/AGI Spring 2011 Intern; Erin Camp, AAPG/AGI Fall 2011 Intern; and Krista Rybacki, AIPG/AGI Summer 2012 Intern.
Background section includes material from AGI's summaries and updates for Data Quality and Public Access in the 111th Congress.
Please send any comments or requests for information to AGI Geoscience Policy.
Last updated on June 20, 2012