April 2, 1999
Mr. F. James Charney
Office of Management and Budget, Room 6025
New Executive Office Building
Washington, DC 20503
Dear Mr. Charney:
We are writing to communicate the views of the American Association for
the Advancement of Science (AAAS) on the proposed revision to OMB Circular
A-110 published in the Federal Register on February 4, 1999. AAAS is the
world's largest multidisciplinary science association, with more than
250 affiliated scientific, engineering, and medical societies representing
all disciplines of science. We have long supported data access and sharing
in science. For example, just this past January, in response to the law
that precipitated OMB's proposal, the AAAS Council adopted a resolution
stating that "it supports the public disclosure of scientific data that
form the evidentiary basis for scientific findings and regulatory decisions,
at the appropriate time and with appropriate safeguards…." Despite our
long-standing commitment to access and sharing of data, we have deep concerns
about the proposed changes to Circular A-110, especially with the use
of the Freedom of Information Act (FOIA) as the mechanism for implementing
the new requirement.
We acknowledge OMB's efforts to limit the scope of the proposed rule.
Nevertheless, the proposed revisions to Circular A-110 represent a fundamental
shift in federal policy in a direction that will create serious unintended
consequences for scientists, their institutions, federal funding agencies,
and the wider public. While the objective of improving the rule-making
process to make it more transparent and intelligible to the public is
laudable, we believe that the proposed revision is poorly constructed
and too vague to achieve that goal.
The revision proposes that "data relating to published research findings
produced under an award that were used by the Federal Government in developing
policy or rules" be made available to the public under FOIA. This represents
an expansion of FOIA to include materials that have not traditionally
been considered under the control of the government, even though a 1980 ruling
of the U.S. Supreme Court (Forsham v. Harris, 445 U.S. 169) rejected such
expansions of FOIA's mandate. As such, it places new burdens on researchers
as well as their institutions with respect to the interpretation of the
rule. This has enormous implications for the scientific community and
the public's well-being. For these reasons, we would like to express our
specific reservations about the proposed revision and, where appropriate
and feasible, offer recommendations for addressing our concerns.
How will "Data" be Defined?
As a professional society that includes scientists from all disciplines,
we are acutely aware of the differences across scientific fields in the
types of data collected. A definition that fails to take into account,
for example, the difference between data generated by a survey instrument
and perishable data, such as blood and tissue samples or rare fossil remains,
will inevitably prove to be disruptive to the course of research and adversely
affect the production of valuable knowledge for society. Both NASA and
NSF have aggressively promoted openness and sharing in the research they
support. Nevertheless, these organizations decided to restrict access
to pieces of a Martian meteorite in order to reduce the risk of contamination.
This common-sense recognition of important differences undermines the
notion of a one-size-fits-all definition of data for regulatory purposes.
At this point, however, we do not even know what would be included within
the meaning of "data."
Given the realities of conducting scientific research across a wide range
of disciplines, we recommend that the proposal state explicitly that
the definition of data shall be determined as part of the grant negotiating
process between federal agencies and grantee institutions. Researchers
are entitled to know what is expected of them and the agencies should
be authorized to specify what obligations researchers assume for archiving
and releasing data when they accept federal funding. This negotiating
process will identify appropriately different levels of access based on
the sensitivity of the data.
At What Point Must Data be Released?
In the proposed revision the timing of release of data is linked to the
publication of research findings. This suggested timing raises a number
of ambiguities. What does OMB consider "publication"? In some scientific
fields, abstracts presented at scientific meetings are published as part
of the conference proceedings, even though the findings may be preliminary
or incomplete. Would such an activity trigger the data release requirement?
Would posting research findings on a scientist's home page on the World
Wide Web be considered a publication? What about longitudinal studies
that produce a series of publications over time? Would the first publication
based on early data require the release of all data as the study progressed?
If not, would grantees be expected to make new releases with each publication?
Our concerns about this matter are sparked by the possibility of the
premature release of undocumented and unverified data. In 1985, the National
Research Council issued a report that reflects our own views on data release.
It declared that "Scientists have a special responsibility to share data
as quickly and as widely as possible where the data are or will become
relevant to public policy" (Committee on National Statistics, Sharing
Research Data, p. 27). But the report then stated: "This recommendation
is not intended to support the public release of analyses prior to appropriate
review." Good reasons for this caution should be readily apparent.
The premature release of research data before careful analysis and without
independent scientific review could increase the risk of disclosing unreliable
or misleading findings, perhaps leading to public confusion and bad policy.
In longitudinal studies conducted over several years, disclosure of data
collected in the early stages may discourage people from participating
in the study, or alter their behavior in a way that confounds the study.
Moreover, raw data are virtually useless without documentation and interpretation,
thus leading us to question how much would be gained by the infusion of
massive amounts of raw data into the public arena. We strongly recommend
that any reference to published research findings state that "publication"
refers to "appearance in a scientific journal after formal peer review."
When are Data Regarded as "Used" in Developing Federal Policy or Rules?
The proposed revision does not define how it will be determined that
the government has "used" a federally-funded study to develop a policy
or rule. There is a difference between data collected for the express
purpose of developing a policy or regulation and data that simply provide
background. Yet the proposal offers no guidance on the degree of direct
linkage required to prompt release of data. What is the threshold that
would trigger the requirement? We recommend that the threshold be any
new regulation or policy submitted for public comment; that any such regulation
or policy include explicit reference to specific studies used to develop
it and that the agency sponsoring the research be so notified; and that
the sponsoring agency, in consultation with the regulatory agency, shall
determine which data produced under federally-funded grants are relevant
and therefore subject to release. These specific proposals would reduce
the potential for nuisance requests to agencies and harassment of researchers
who may never have intended that their studies would impact policy.
Since policy and science are often inextricably linked, we must also
question how scientists can be expected to know whether research done
today will be used to develop future policies. How long will researchers
have to retain their data? This issue is further clouded because the proposed
revision is unclear about whether it will be applied retroactively. Will
the release requirement be made applicable to all existing research data
used to establish past rules or policies? If not, how will data covered
by the revision be distinguished from earlier data in ongoing research
projects that include both types? This is not a trivial matter, since
many federally-funded laboratories are conducting research initiated decades
ago. Either way the law is interpreted, a costly administrative burden
will be incurred.
How will Reasonable Fees for Covering Costs be Determined and Allocated?
Both the legislation and the OMB proposal allow for a "reasonable fee" to
cover the "cost of obtaining the data." But both are conspicuously silent
on how the fees will be determined and apportioned among the agencies,
researchers, and their institutions. Indeed, it is the agencies that may
charge the fees without any requirement that these fees be shared with
those who bear the burden of archiving and preparing the data for release.
If the requirement for the public release of data is implemented, then
AAAS strongly recommends the inclusion of a cost-recovery system ensuring
that grantees are appropriately reimbursed for the costs associated with
archiving and releasing research data to the public.
What will be the Effects of the OMB Proposal on Collaboration and
Participation in Research?
One of the strengths of the research enterprise in the United States
is its partnerships: among scientists, between scientists in academe and
those in industry, between those scientists and foreign partners, and between
researchers and volunteer subjects. We worry about the effects of the OMB proposal
on those relationships and the consequences for science. Most of the foreseeable
problems in this regard stem from the limitations of FOIA as a mechanism
for making grantee research data available to the public.
Data and funding associated with a particular study can originate from
many sources in addition to a federal grant. For example, they may originate
with industry or foreign partners, or with collaborating researchers who
have their own institutional funding. Once commingled, it may be difficult
to distinguish data produced with federal funds from those produced with
other funds. We are concerned that such partners may grow reluctant to
enter or to continue a collaboration that could lead to the public release
of data they would prefer to disclose at their discretion. Despite the
exemption in FOIA that protects "commercial or financial information,"
the ambiguity associated with determining which data in a university-industry
partnership would be subject to release is likely to make industry nervous
about pursuing such collaborations. This result would, of course, be contrary
to the objectives of the Bayh-Dole Act (P.L. 96-517), enacted in 1980
to spur the commercialization of research results by granting patent rights
to universities for inventions developed with federal funds.
Our nation owes a great deal to those who have voluntarily participated
as subjects in research done to increase our knowledge of human biology
and behavior. Although FOIA exempts "the disclosure of [information] that
would constitute a clearly unwarranted invasion of personal privacy,"
there is considerable fear among scientists and funders that this exemption
may not be sufficient to offer adequate assurances of protection to research
subjects. There are several reasons for these concerns.
Under FOIA, it is the agency--not the scientists who interact with research
subjects--that would determine what data to mask in order to protect personal
privacy. Subjects might be less than forthcoming in the details they reveal,
or even reluctant to participate at all, if they thought that the federal
government, let alone members of the public, might have access to their
data. It is likely, for example, that it would become far more difficult
to recruit participants for clinical studies on self-destructive or dangerous
behaviors such as the use of illegal substances or violence.
FOIA's exemption is limited in another way. In much research, the focus
of study is not individuals but institutions or communities, which are
not protected under FOIA. Furthermore, once one identifies the institution
or community in which research subjects are patients or in which they
reside, only a short additional step leads to a personal identification.
If research subjects lose faith in the ability of scientists to protect
their privacy, we risk losing an indispensable source of knowledge.
How will the OMB Proposal Protect Sensitive Research Data?
In addition to proprietary and human subjects' data, other types of research
data, if disclosed, could also adversely affect the conduct of science.
Data released to the public that could lead to the identification of historically
and scientifically valuable archeological sites could invite looting and
destruction. Similarly, data enabling one to identify the location of
rare botanical species outside the United States could lead to unwanted
bioprospecting and could damage the relationship between researchers and
the host community. There appears to be no mechanism through FOIA to protect
such identification from those determined to use released data to serve
their private interests.
In light of the complexity and uncertainty surrounding the new law and OMB's
proposal, we recommend that a study be conducted of the issues raised
by the scientific community and of alternative approaches to achieving
public access to research data in a way that balances the public's right
to know with safeguards for the conduct of science. AAAS is prepared
to offer its services in conducting such a study; the National Research
Council or the General Accounting Office may also be well positioned to
undertake this effort.
We agree with the recent statement by the National Science Board ("On
the Sharing of Research Data," February 23, 1999) that "Current sharing
practices promote free and open exchange of research data in a context
that supports the rapid creation of knowledge,…." Both NSF and NIH are
well known for their aggressive data sharing policies affecting grantees.
Indeed, NIH announced late last year that it plans to spend $100 million
over the next five years to secure public access to genetic data (Science,
December 11, 1998, p. 1967). Scientific societies have incorporated expanded
data sharing provisions into their professional standards. For example,
the Code of Ethics of the American Sociological Association states that
"Sociologists share data and pertinent documentation as a regular practice"
and that data sharing should be anticipated "as an integral part of a
research plan…." Scientific journals also play an important role in fostering
data sharing. In 1998, for example, both Science and Nature, declaring
that "the interests of the scientific community and the public are best
served by making freely available the data on which the ideas in our papers
are based," began to require "unrestricted release" of certain types of
data "on or before the date of publication for all new manuscripts…."
Those are only a few examples of current practices. The study we recommend
should examine these and related practices in order to determine whether
models exist among them that, if modified or expanded, could serve the
objectives of the new law without unleashing the problems that we identified
above. AAAS recommends that the comment period be deferred until such
a study is completed. At a minimum, OMB should extend the period of
comment in view of the complex issues involved and the absence of any
public discussion at the time the law was enacted.
We hope that these comments are helpful and we appreciate your giving
them careful attention. AAAS is prepared to work with you and others to
achieve a policy acceptable to both lawmakers and the scientific community.
Chair, Board of Directors, AAAS
Chancellor, University of California, Santa Cruz
Stephen Jay Gould
Professor of Zoology, Harvard University
Mary L. Good
Managing Member, Venture Capital Investors, LLC