Data Access, Data Preservation and Data Publication Issues (10/8/10)

Access to high-quality scientific data is important for geoscience research, development and education. It is important that federal government agencies, such as the U.S. Geological Survey, are able to maintain accessible databases on natural resources, geologic and geophysical information, and basic information such as topography. In addition to government research, much information is also collected by the academic and private sectors using, at least in part, U.S. federal funding. Though this information is often competitive and proprietary, many feel that any research funded by taxpayer dollars should be publicly accessible. The geosciences are also a global field of study, so it is important that data are maintained and accessible worldwide.

The government plays a role in these issues by maintaining federal databases and helping to ensure data access in other countries through funding and federal preservation programs. In addition, the federal government has been held increasingly accountable for data integrity and the quality of scientific data it uses in policymaking. Funding and legislation will hopefully continue to provide widespread access to scientifically sound data.

Recent Action

Open Access to Climate Codes
Two British software engineers, David Jones and Nick Barnes, and a British administrator for the British National Park Authority, Philippa Davey, founded a new non-profit organization called the Climate Code Foundation in August 2010. According to its website, the foundation hopes to promote public understanding of climate science, improve the clarity of source code for climate science software and encourage the publication of more source code in climate science.

There are two projects associated with the foundation, the Clear Climate Code to provide clarity of computer codes and the Open Climate Code to encourage access to source codes. So far the Open Climate Code page only lists the founders’ ideas, while the Clear Climate Code offers a “simplified” version of the GISTEMP analysis software used by NASA’s Goddard Institute for Space Studies.

Previous Action

Chinese Court Sentences American Geologist (7/10)
Xue Feng, an American geologist and employee of IHS Inc., has been sentenced to eight years in a Chinese prison and fined $30,000 for selling a classified Chinese oil industry database. U.S. Ambassador to China Jon Huntsman has issued a statement calling for Xue’s immediate release and repatriation to the United States.

Born in China, Xue studied in the United States, earned a doctorate from the University of Chicago, and eventually became an American citizen. Xue’s sentence concludes a two-and-a-half year case. Chinese officials claim that Xue received documents on geological conditions of onshore oil wells and the coordinates of more than 30,000 of those wells, which belong to the China National Petroleum Corporation and are considered state secrets. The Dui Hua Foundation reports, however, that the Chinese government did not classify the information until after it had been sold to IHS Inc. The Colorado-based company declined to comment on China’s broad interpretation of state secrets. An IHS Inc. spokesman stated that the company was never notified of wrongdoing. Xue’s thesis advisor at the University of Chicago, David Rowley, said that Xue is a “straight-up individual who worked hard, who didn’t push limits,” and that he was “simply doing his job.”

The Beijing No. 1 Intermediate People’s Court stated that Xue’s actions “endangered our country’s national security.” During the trial, Xue argued that the information he gathered is “data that the oil sector in countries around the world make public.”

House Introduces Open Access Bill (4/10)
Representatives Mike Doyle (D-PA), Henry Waxman (D-CA), Rick Boucher (D-VA), Debbie Wasserman-Schultz (D-FL), Dana Rohrabacher (R-CA) and Gregg Harper (R-MS) introduced a measure on April 15 that would require federal agencies to provide public access to federally-funded research. The bill, the Federal Research Public Access Act of 2009 (H.R. 5037), would require every federal agency with more than $100 million in extramural research expenditures to provide public access to research papers and research results. The measure stipulates that an author should submit an electronic version of the original manuscript or research results to the agency. The author must then submit any revised manuscript after peer review and, with the publisher’s permission, the final published version. The manuscripts must be made available to the public within six months of publication. Agencies would also be required to maintain a database of all published research findings.

The bill has been referred to the House Committee on Homeland Security and Government Affairs. The bill is identical to a Senate measure (S. 1373) introduced by Senator Joseph Lieberman (I-CT) in June of 2009 and referred to the Senate Homeland Security and Governmental Affairs Committee. It is unclear whether either bill will move through Congress or whether the bills might be included with others.

The America COMPETES reauthorization (H.R. 5116) includes language that calls for further study of public access to research. Specifically, the introduced measure calls for a working group to be established under the National Science and Technology Council to coordinate federal science agency policies related to the dissemination and long-term preservation of the results of unclassified research, including digital data and peer-reviewed publications.

Authors, publishers and other stakeholders of unclassified research should consider the language in these bills and provide input to policymakers now.

White House Update on Open Access Policy (3/10)
On March 8, 2010, the White House Office of Science and Technology Policy (OSTP) posted an update on its Public Access Policy Forum. The forum allowed interested parties to submit comments about improving public access to the results of federally funded research. The update states that OSTP is still reviewing all of the comments and is not yet ready to publicly discuss policy recommendations. OSTP has moved all of the blog posts and email submissions to a series of PDF documents, which interested readers can view through links in the update post. AGI and several member societies submitted comments. All geoscience societies with publications should review the OSTP initiative and submissions and consider how policy changes might affect their publications and organizations.

CIA Will Share Satellite Data with Select Scientists (1/10)
An old environmental surveillance program has been reopened for the benefit of science. The Measurements of Earth Data for Environmental Analysis (Medea) program at the Central Intelligence Agency (CIA) has resumed after President George W. Bush unexpectedly shut it down in 2001, following nine years of operation. Medea gives 60 of the nation’s top scientists access to classified data from reconnaissance satellites and other spy sensors. The scientists, mainly from academia with a few representatives from industry and federal agencies, conduct scientific research under the guidance of the National Academy of Sciences.

CIA Director Leon Panetta strongly supports the program, believing the national security implications of desertification, sea level rise, and population shifts justify this collaboration. However, the program has come under scrutiny in Congress, particularly from Senator John Barrasso (R-WY), who thinks the CIA should spend more time fighting terrorists, “not spying on sea lions.”

The Medea program has little to no impact on regular intelligence gathering and is more or less free. It releases information that has already been collected, or uses already-deployed sensors to gather environmental data while they pass over wilderness areas. The images that have been declassified are released at a lower resolution to mask the true abilities of CIA satellites. So far, the data scientists have received have allowed them to analyze Arctic sea ice to help with summer melt records. In addition to sea ice data, scientists hope to gather information on clouds, glaciers, deserts, and tropical forests.

Updates on Open Access Plans (1/10)
The White House Office of Science and Technology Policy (OSTP) held an online forum to receive input on public access policy. Stakeholders could submit information and ideas through a blog or email. The blog allowed for additional comments on submissions. The project started on December 9, 2009 and was supposed to end on January 7, 2010. The American Geological Institute and others requested an extension of the deadline because of the holidays, and the comment period was extended to January 21, 2010. Several geoscience societies, including the Mineralogical Society of America, the American Association of Petroleum Geologists, the American Society of Limnology and Oceanography and the Society for Sedimentary Geology, submitted comments. OSTP will now assess all of the comments and draft a policy plan. View the comments from their blog:

Last year the House Science and Technology Committee in coordination with OSTP convened a Scholarly Publishing Roundtable of stakeholders to consider ways to increase public access to research papers published in peer-reviewed journals. Their report was released on January 12, 2010.

The report recommends that published, peer-reviewed papers, where the research was supported by federal funds, should be made available through a public database. The report suggests an embargo period of as long as 12 months, but allows for shorter or longer periods.

The roundtable web page for the full report as well as press releases and related materials:

In related news, ArXiv, the “free” e-print server of physics papers, is requesting donations from educational institutions to maintain the database. ArXiv received $883,000 in stimulus funds from the National Science Foundation (NSF) in 2009 to enhance ArXiv, but the system needs additional funds for basic operation and maintenance (basic O&M costs about $400,000 annually). Cornell University Library posted a collaborative business model for ArXiv on its web page on January 21, 2010. Cornell manages the database and provides 15 percent of the operating budget. All stakeholders in peer-reviewed scientific publishing are encouraged to review the business model and consider the future of open access for all fields of research.

Read the ArXiv press release from NSF:
View the Cornell University Library business model online:

Administration Considers Open Access (12/09)
The Obama Administration, through the White House Office of Science and Technology Policy (OSTP), is seeking input on access to federally funded research. A full description of the request for information, as well as the ongoing input, can be found at the OSTP web site. The original deadline for input was January 7, 2010; however, a December 31 Federal Register notice extended the deadline to January 21, 2010. As of this posting, the OSTP web site does not note the new deadline.

OSTP is asking for comments on two issues regarding free and open access to publications of federally-funded research results. The first issue is which federal agencies, besides the National Institutes of Health, should adopt public access policies. The second issue is a four-part question: how should public access be designed with regard to timing, which version of the paper to use, mandatory versus voluntary posting requirements and other concerns.

The American Geological Institute requested an extension of the January 7 deadline to allow geoscience societies more time to consider the request and offer input.

You may view the request through the updated Federal Register notice. This notice provides additional contact information (telephone and email) and questions, while the OSTP web site provides entry primarily to a blog where you can add your comments and view the comments of others.

Congress Is Tweeting Away Says CRS Report (10/09)
A recent Congressional Research Service (CRS) report analyzed congressional Twitter use during a two-week period in August 2009. Twitter is a micro-blogging service that allows users to post “tweets” of 140 characters or fewer online that are in turn delivered to their subscribers. The report found that 29 percent of the House and 31 percent of the Senate were registered on Twitter, capitalizing on the new social networking and communication tool to increase communication with their constituents. At the time of the report, 158 Representatives and Senators were using Twitter. More than 200 are now reported to be “tweeters.”

The CRS data show that nearly 1,200 “tweets” were sent in the two-week sample, an average of 85 per day, with the most sent on Thursdays. According to the report, House Republicans sent the most tweets (54 percent), followed by House Democrats (27 percent), Senate Republicans (10 percent), and Senate Democrats (9 percent). “Members' use of Twitter can be divided into six categories: position taking, press or web links, district or state activities, official congressional action, personal, and replies. The data suggest that the most frequent type of tweets were press and web link tweets…followed by official congressional action tweets during session (33 percent) and position-taking tweets during recess (14 percent).”

Refer to or to find members of Congress on Twitter.

House Authorizes $10 Million for AmericaView (10/09)
The House approved a bill to establish a national cooperative geospatial mapping program. The National Land Remote Sensing Outreach Act (H.R. 2489) authorizes $10 million annually for an AmericaView program at the U.S. Geological Survey (USGS) in fiscal years 2010 through 2019.

The program goal is to disseminate imagery from satellites and airplanes for education, research, assessment, and monitoring purposes across the nation. It is based on the current AmericaView project that started as a partnership between the USGS Earth Resources Observation and Science (EROS) Center and a consortium of universities that expanded the use and distribution of satellite data and imagery. The bill would create a consolidated program at the USGS.

A companion bill (S. 1078) was voted out of committee in the Senate in August, but is awaiting full Senate approval.

Bill for Open Source Textbooks and Other Materials (9/09)
On September 24, 2009, Senator Richard Durbin (D-IL) introduced the Open College Textbook Act (S. 1714), a measure to authorize grants for the creation or adaptation of open source college textbooks. The purpose of the bill is to make college textbooks freely available to the public and reduce the costs of textbooks for college students. The grants would be distributed through the Department of Education for the creation of high quality open textbooks for postsecondary education that would be licensed through an open license and freely available in digital form. The National Science Foundation would help the Education Department set up a merit-based, peer-review system for distributing the grants. The Education Department would receive $15 million for the grants in fiscal year 2010.

In addition to this grant program, the bill requires all other educational materials for elementary, secondary and postsecondary courses created using federal grants to be licensed under an open license. This section of the bill takes the legislation far beyond open access for new college textbooks created under the Education grants and would mean that all such educational materials must be posted, free of charge, on an accessible website. This may have a significant impact on any organization or individual who prepares educational materials using federal grants.

In March, a similar bill was introduced in the House by Representative Bill Foster (D-IL). The measure, entitled Learning Opportunities With Creation of Open Source Textbooks (LOW COST) (H.R. 1464), would “require federal agencies to collaborate in the development of freely-available open source educational materials for college-level physics, chemistry, and math and for other purposes.” This measure would require every federal agency that spends more than $10 million on scientific education to use at least 2 percent of such funds for collaboration on the development of open source materials for educational outreach.

This effort would be guided by the National Science Foundation (NSF) and the Department of Energy (DOE), which would oversee the “veracity, accuracy and educational effectiveness” of open source materials. The materials must be posted on a “Federal Open Source Material website” and must be free of copyright violations. In addition to the 2 percent of educational funds, NSF and DOE would jointly administer a grant program to produce open source materials, for which $15 million would be authorized in fiscal year 2010.

Senate Committee Approves Geospatial Mapping Bill (8/09)
The Senate Commerce Committee unanimously passed a bill to establish a comprehensive geospatial imagery mapping program at the U.S. Geological Survey (USGS) on August 5, 2009. The bill (S. 1078) was introduced by Senators Tim Johnson (D-SD) and George Voinovich (R-OH) in May. The program would work to disseminate imagery from satellites and airplanes for education, research, assessment, and monitoring purposes across the nation. It is based on the AmericaView project that started as a partnership between the USGS Earth Resources Observation and Science (EROS) Center and a consortium of ten Ohio universities that cooperatively expanded the use and distribution of satellite data and imagery. The EROS satellite data was given to these universities, which established the computer and network infrastructure to distribute the data to researchers and citizens.

AmericaView now has members in 35 states with programs dedicated to expanding access to satellite data. The bill would consolidate the state programs and the EROS Center to form the AmericaView program at the USGS. The program would benefit local, state, and federal agencies, industry, communities, and educational institutions by making national remote sensing data accessible and easy to use.

The House Subcommittee on Energy and Mineral Resources is considering an identical bill (H.R. 2489) introduced by Representatives Stephanie Herseth Sandlin (D-SD) and Steven LaTourette (R-OH). The House bill has not yet moved to the full committee and is behind the Senate bill.

Wants Your Opinion (6/09)
The Exchange was created as part of the Open Government Initiative recently established by the White House to bring greater transparency, participation, and collaboration to how the government serves the public. The Environmental Protection Agency (EPA) is the leading partner agency of the website and the Exchange. The website is the on-line source for citizens to search, view, and comment on regulations issued by the U.S. government. The Exchange is the on-line forum for the public to explore designs and features proposed to improve the website. You can share your suggestions on how the website can best meet your needs by posting your thoughts on the Exchange from now until July 21, 2009.

Supreme Court Agrees to Hear Gene Patent Case (6/09)
In mid-May the American Civil Liberties Union and the Public Patent Foundation filed a lawsuit challenging the U.S. government’s practice of granting patents on human genes, specifically the BRCA1 and BRCA2 genes that are associated with breast cancer. In early June, the U.S. Supreme Court agreed to hear the case in the fall.

The outcome could have far-reaching implications for scientists and engineers, including those working in the geosciences. An invention is patent-eligible subject matter if it is new and useful. A process is patent-eligible subject matter if it (1) is “tied to a particular machine or apparatus” or (2) transforms a particular article into “a different state or thing” according to previous court decisions. Patent-eligible processes or inventions do not include “laws of nature, natural phenomena, [or] abstract ideas” according to U.S. law and previous court decisions. The question of whether human genes or other “natural articles” are patentable will test some of these past court decisions and U.S. patent law.

Preliminary Scientific Integrity Report Released (5/09)
In March, the Bipartisan Policy Center (BPC) released recommendations to help the Office of Science and Technology Policy create guidelines to comply with President Obama’s scientific integrity memorandum. This preliminary report, which will be finalized early this summer, aims to separate science and policy in regulatory issues. Their key recommendations are to: 1) identify whether a dispute is over scientific results or policy in regulatory documents, 2) use advisory panels of scientific experts solely for conclusions on science and not for policy recommendations, and 3) not give equal weight to all studies in a field. This is the first report from the BPC “Science for Policy Project.”

The report is available as a PDF on the BPC website:

The President’s memorandum for the heads of executive departments and agencies on the subject of scientific integrity is available at: the_press_office/Memorandum-for-the-Heads-of Executive-Departments-and-Agencies-3–9–09/).

President’s Task Force Seeks Comments on Scientific Integrity (4/09)
The following Federal Register notice appeared on April 23, 2009 in volume 74, number 77 on page 18597.

On March 9, 2009, the President issued a memorandum for the heads of executive departments and agencies on the subject of scientific integrity. The memorandum requires the Director of the Office of Science and Technology Policy (OSTP) to craft recommendations for Presidential action to ensure scientific integrity in the executive branch. This notice solicits public input to inform the drafting of those recommendations. The notice asks a series of questions to help guide the public in responding to this request.

There is a 21-day period for public comment from April 23, 2009 to May 13, 2009.

You may submit comments by any of the following methods: Web Site: Click the link to "Scientific Integrity" and follow the instructions for submitting comments electronically. Electronic Mail:

All members of the geosciences community are encouraged to view the full notice online and consider providing comments as appropriate and related to the six specific principles listed in the full notice. A link is also provided below in the Key Federal Register section of this Monthly Review.

House Introduces Fair Copyright Bill (2/09)
Congressman John Conyers (D-MI), chairman of the House Judiciary Committee, has introduced a measure that would eliminate the National Institutes of Health public archive of published scientific research. The “Fair Copyright in Research Works Act” (H.R. 801) would eliminate PubMed Central, a digital archive of peer-reviewed journal articles reporting research funded in whole or in part by the National Institutes of Health. The legislation would also prevent other federal agencies from establishing similar archives.

This measure is favored by some non-profit professional societies and for-profit publishers because it protects the value and copyright of their journal articles. The measure is opposed by some institutions, libraries and associations because it limits access to published research.

Here is the summary of the measure as prepared by the Congressional Research Service and posted on Thomas:

“Prohibits any federal agency from imposing any condition, in connection with a funding agreement, that requires the transfer or license to or for a federal agency, or requires the absence or abandonment, of specified exclusive rights of a copyright owner in an extrinsic work.

Prohibits any federal agency from: (1) imposing, as a condition of a funding agreement, the waiver of, or assent to, any such prohibition; or (2) asserting any rights in material developed under any funding agreement that restrain or limit the acquisition or exercise of copyright rights in an extrinsic work.

Defines 'funding agreement' as any contract, grant, or other agreement entered into between a federal agency and any person under which funds are provided by a federal agency for the performance of experimental, developmental, or research activities.

Defines 'extrinsic work' as any work, other than a work of the U.S. government, that is related to a funding agreement and is also funded in substantial part by, or results from a meaningful added value contributed by, one or more nonfederal entities that are not a party to the funding agreement.”


Background

Data is important for geoscience research and development and geoscience education. Federal government agencies, such as the U.S. Geological Survey, maintain data on natural resources and basic details such as topography. It is important that these databases are maintained and accessible. Geosciences are global and it is also important that data from around the world is maintained and accessible. The federal government plays a role in these issues by maintaining federal data and helping to ensure data access in other countries.

In addition to government data, information is also collected by the academic, private and other sectors. Much of the information collected by the U.S. academic sector is derived from federal funding and thus should be maintained and accessible. Information collected by the private sector is competitive and proprietary. Over time, some private sector information becomes non-proprietary, but it can remain useful to the broad geoscience community only if there is some mechanism to maintain this data.

Efforts are being made to preserve this data as well as other data within federal-state-local partnerships. One example of such a partnership, supported by the 109th Congress, is the data preservation section of the Energy Policy Act of 2005. The section authorizes funding for the U.S. Geological Survey Data Preservation Program in partnership with the state geological surveys to maintain data, including that contributed by the private sector.

Data is also important to scientific publications, and concerns have been raised regarding access to data or to publications. Congress has received some pressure, particularly from health-related groups, to make publications that are supported in whole or in part by federal funds accessible to the public without any fees. Authors and publishers add value to the publications and hold copyrights, making no-fee access difficult to implement. Nonetheless, Congress required the National Institutes of Health (NIH) to develop a database of all publications funded by NIH in whole or in part, and all authors must submit their publications to the database.

Although the intent is to make federally-funded publications accessible to the taxpayers who pay for the research, the measure seems to work outside copyright law and can negatively impact non-profit society journals. Non-profit society journals tend to have a long history of high quality publications, and the publications provide needed support for their societies. If the NIH model were applied to all federally funded science, geoscience society journals might suffer serious deterioration, which could lead to the demise of the societies themselves. This would lead to a reduction of high quality publications and an increase in commercial publication prices (as commercial publishers would not have to compete with lower cost non-profit publishers).

In the past decade, data integrity has become more of an issue. Accusations of political appointees changing scientific data to further their own goals have called into question the reliability and quality of data being released by the government. A 2003 report by the House Committee on Government Reform said the Bush Administration had “manipulated the scientific process and distorted or suppressed scientific findings.” A follow-up report in 2007, Political Interference with Climate Change Science Under the Bush Administration, found that the White House censored climate change scientists and significantly edited climate change reports. In addition, the variety of sources and ease of access provided by the internet present another problem in monitoring the integrity of scientific data. New rules will have to be introduced in the coming years to ensure the quality of data, especially data from federally funded research.

The Federal Information Quality Act
On February 22, 2002, in accordance with the Data Quality Act (Section 515 of Public Law 106-554), the White House Office of Management and Budget (OMB) released Information-Quality Guidelines. These were issued to promote "the quality, objectivity, utility, and integrity of information (including statistical information) disseminated by Federal agencies." The OMB also directed all federal agencies to implement their own guidelines to correct information that did not comply with those released by OMB.

The Data Quality Act aimed to quell concerns about the accuracy of federal data. The Center for Regulatory Effectiveness claimed, “Federal agencies regularly publish information on which influential policies or decisions are made that have significant impact on the economy of a region or even the entire nation or that can influence wrong decisions about public and private investments, opportunities and other issues. States, cities, counties, as well as private and public organizations, and citizens in general have had difficulties getting the federal agencies to either substantiate the information published or correct the information to ensure its quality.”

OMB defines information "quality" as information that offers utility, objectivity and integrity to information consumers where:

  • Utility is the usefulness of the disseminated information to the intended consumers.
  • Objectivity is that the disseminated information is presented in an accurate, clear, complete and unbiased manner and, as a matter of substance, is accurate, reliable and unbiased.
  • Integrity refers to security: the protection of information from unauthorized access or revision (modification) to ensure that the information is not compromised through corruption or falsification.

Agencies must define these terms in a rigorous, operational fashion to ensure that they have a shared understanding with their customers of the information provided."

For online access to the agency specific OMB guidelines go to the Center for Regulatory Effectiveness Website.

Sources: AGI's Monthly Review.

Contributed by Corina Cerovski-Darriau, Rachel Poor, and Linda Rowan, Government Affairs staff; Merilie Reynolds, AGI/AAPG Fall 2008 Intern; Joey Fiore, AGI/AIPG Summer 2009 Intern; Mollie Pettit, AGI/AAPG Fall 2009 Intern; Elizabeth Brown, AGI/AIPG Summer 2010 Intern; Elizabeth Huss, AGI/AIPG Summer 2010 Intern; and Kiya Wilson, AGI/AIPG Summer 2010 Intern.

Background section includes material from AGI's summaries and updates for Data Quality and Public Access in the 109th Congress.

Please send any comments or requests for information to AGI Government Affairs Program.

Last updated on October 8, 2010