Introducing the Gateway Data Enclave
Written by: Codi Young and Drystan Phillips
Published on: Oct 22, 2024
We are pleased to introduce the Gateway Data Enclave, a secure, remote-access computing desktop that allows U.S.-based researchers to analyze publicly available data from the Health and Retirement Studies – International Network of Studies (HRS-INS) while safeguarding these data sources from unauthorized access and use. The Enclave is hosted on the National Institute on Aging’s LINKAGE platform.
The Gateway Data Enclave initiative was partially motivated by the NIH’s new Data Management and Sharing Policy, enacted in January 2023, which requires all NIH-funded research to include a data management and sharing plan that specifies how, where, and when data will be shared. Scientific data must be shared by the time of publication and/or at the end of the research project. In addition to this new policy, the Gateway user community has frequently reported difficulty obtaining and accessing data from the HRS-INS network.
With the Gateway Data Enclave, researchers need to submit only one data use agreement to access data from multiple HRS-INS studies whose data are housed in the Enclave. This saves the time and effort otherwise required to prepare a separate data use agreement for each survey and to download each dataset individually.
How to apply
To apply for a Gateway Data Enclave workspace, interested researchers must first create a Gateway user account and then complete and sign a data use agreement via Docusign, which can be accessed on our Data Enclave page by selecting Application for a Gateway Enclave Workspace.
On the form, applicants will be asked to answer some general questions about their research interests and past experiences using the HRS-INS studies, indicate the public-use data files that they would like access to and the type of workspace they’re applying for, and fill in their personal details.
Researchers can apply for two types of workspaces: individual and shared. An individual workspace is recommended for those working alone on their research projects. A shared workspace is perfect for researchers working on group papers, where all members plan to work on the analyses together.
After a Gateway Compliance Officer has approved the application, prospective users will be asked to complete Remote Identity Proofing to verify their identity. Once these steps have been completed, a Gateway Compliance Officer will set up the workspace and provide users access to the data requested.
Note: Due to the restrictions of the LINKAGE Platform, the Gateway Data Enclave is currently available only to users who are affiliated with a U.S.-based institution and hold a U.S. government-issued photo ID. Expanding access to international researchers is currently in the works.
What data resources are in the Gateway Data Enclave?
As of the time of this blog post, the Gateway Data Enclave houses publicly available data from the following select HRS-INS studies:
- Health and Retirement Study (HRS), United States:
  - Harmonized data for core HRS, End of Life, HRS-HCAP, and HRS COVID
  - 2002-20 Core HRS Interview data (10 waves)
- English Longitudinal Study of Ageing (ELSA), England:
  - Harmonized data for core ELSA, End of Life, ELSA-HCAP, and ELSA COVID
  - 2002-19 Core ELSA Interview data (9 waves)
- Longitudinal Aging Study in India (LASI), India:
  - Harmonized data for core LASI and LASI-DAD
  - 2017-19 Core LASI Interview data (1 wave) and 2020-22 LASI-DAD Interview data (1 wave)
- Chilean Social Protection Survey (SPS), Chile:
  - 2017-18 Core SPS Interview data (Wave 7 data only) and 2019 Chile-Cog Interview data (1 wave)
- Northern Ireland Cohort for the Longitudinal Study of Ageing (NICOLA), Northern Ireland:
  - 2013-19 Core NICOLA Interview data (2 waves)
In addition to these study data, the Gateway has prepared two specialized analysis datasets focused on:
- Harmonized longitudinal data on disability from both the HRS and ELSA
- Cross-country cognition data from the Harmonized Cognitive Assessment Protocol (HCAP) studies in the U.S., England, and India, as well as data from the linked Core studies
The Gateway is designed to easily and efficiently provide access to these data for research analysis by allowing users to conduct their analysis work on these data directly in the Enclave. By design, however, the Enclave does not allow users to export these data out of the Enclave.
What can you do in the Gateway Data Enclave?
Researchers can run analyses using common analytic packages like R, Python, SAS, and Stata from within the Enclave. Additionally, users are welcome to upload external data to their Enclave workspace and can link them to the existing data resources where applicable.
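For readers wondering what linking uploaded external data to the existing data resources might look like, here is a minimal Python sketch. Everything in it is hypothetical and made up for illustration: the field names (`country`, `year`, `pm25`), the values, and the helper `link_by_country_year` are assumptions, not the layout of any file in the Enclave.

```python
# Hypothetical respondent-level rows standing in for a study file (made-up values).
respondents = [
    {"id": 1, "country": "US", "year": 2018, "age": 67},
    {"id": 2, "country": "EN", "year": 2018, "age": 72},
    {"id": 3, "country": "IN", "year": 2018, "age": 61},
]

# Hypothetical external contextual data uploaded by the user (made-up values).
external = [
    {"country": "US", "year": 2018, "pm25": 7.4},
    {"country": "EN", "year": 2018, "pm25": 10.1},
]


def link_by_country_year(rows, context):
    """Left-join context records onto rows using (country, year) as the key."""
    lookup = {(c["country"], c["year"]): c for c in context}
    linked = []
    for row in rows:
        match = lookup.get((row["country"], row["year"]), {})
        merged = dict(row)                  # copy so the input rows stay unchanged
        merged["pm25"] = match.get("pm25")  # None when no external record matches
        linked.append(merged)
    return linked


linked = link_by_country_year(respondents, external)
for row in linked:
    print(row)
```

The sketch keeps every respondent row and fills in `None` where the external file has no matching record, so unmatched cases stay visible rather than silently dropping out of the analysis.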
The platform also provides access to productivity software including Adobe Acrobat Reader, Microsoft Office products (Excel, Word, and PowerPoint), and Notepad++ for the user’s convenience.
While the provided data resources can’t be exported out of the Enclave, statistical summaries (such as frequency tabulations and regression coefficients) and analysis programs can be. All output will be reviewed and approved by a Gateway Compliance Officer prior to download.
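To make the two kinds of statistical summaries mentioned above concrete, the following Python sketch computes a frequency tabulation and the coefficients of a one-predictor least-squares regression. The records, field names, and helper functions are all made up for illustration and are not drawn from any HRS-INS study; the point is only that the results are aggregates rather than respondent-level rows.

```python
from collections import Counter

# Made-up records for illustration only.
records = [
    {"retired": "yes", "age": 70, "score": 20.0},
    {"retired": "no", "age": 60, "score": 25.0},
    {"retired": "yes", "age": 80, "score": 15.0},
    {"retired": "no", "age": 50, "score": 30.0},
]


def frequency_table(rows, field):
    """Count how many rows take each value of `field`."""
    return dict(Counter(row[field] for row in rows))


def ols_simple(xs, ys):
    """Return (intercept, slope) of the least-squares line y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


freq = frequency_table(records, "retired")
intercept, slope = ols_simple(
    [r["age"] for r in records],
    [r["score"] for r in records],
)

print("Retirement status counts:", freq)
print("Intercept:", intercept, "Slope:", slope)
```

In practice, most users would rely on the regression routines in R, SAS, Stata, or a Python statistics library; the hand-written version here just keeps the example self-contained.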
To orient new users of the Enclave, we offer a comprehensive Gateway Enclave User Guide and a webinar that provide detailed instructions for researchers to follow as they work on their analyses within the Enclave. And of course, users can always reach out to the Gateway Enclave Help Desk by emailing enclave@g2aging.org.
Looking ahead, the Gateway Data Enclave plans to include more HRS-INS studies and add new kinds of data not previously available anywhere else, like contextual data capturing historical and present policies and environmental exposures (e.g., air pollution, climate). Make sure to sign up for data alerts and announcements in your profile to stay updated!
- Codi Young is a Programmer at the University of Southern California.
- Drystan Phillips is a Project Manager at the University of Southern California.