Access to MagAO-X data will be provided using the University of Arizona's (UA) CyVerse infrastructure. CyVerse provides several resources for scalable data management with connections to large-scale and distributed computing resources.
Laboratory and Engineering Data
Laboratory and on-telescope engineering data (whether from daytime testing or night-time observations) generated for this project and used in a publication will be made publicly available after a three-month period. This delay is primarily to give the team time to analyze the data and perform quality control.
By default, raw data from science observations enabled by this project will, prior to publication, be subject to a 24-month proprietary period during which the responsible astronomer will have exclusive access. The CyVerse system will automatically enforce this proprietary period by leveraging “rules” used by the Data Store (part of the iRODS architecture). Exceptions to this policy will be made based on astronomer requests and the policies of the Magellan partner institutions. Published data will be made available upon request, with approval by the responsible astronomer, using the CyVerse infrastructure.
Policies and provisions for re-use, re-distribution, and production of derivatives
Raw data access will come with a simple request to cite the relevant MagAO-X-based publication if used in a public product. Reproductions of published images made available on the MagAO-X project website will always include citations to the primary source, with standard academic citation practices applied. All software generated by this project will be released under the GPL or the MIT license, and integrated in the appropriate CyVerse resource for ease of reuse by the scientific community. No reach-through rights or intellectual property rights will be claimed on the outcomes of this research, including associated data, software, and hardware designs.
Archiving of Data
We will use the CyVerse infrastructure at the University of Arizona (UA) to store data for the life of the MagAO-X system and beyond. All data will be saved in FITS format, preserving all relevant metadata in the headers. Management of data at UA will be coordinated through three phases: initial, near term, and long term. The specific practices will depend on several factors, especially the available technology and associated costs, data quantity, and long-term public access.
Initial data management will facilitate robust storage and later analysis phases. At least two copies of all data files will reside on redundant disk systems; a working copy will be backed up by at least one archive copy. All analyses will occur using the working copy. The primary goal of these measures is to preserve raw data in case of catastrophic hardware failure.
Near-term data management will support the project’s computational analyses and facilitate public access. Once data are made public, they will be available for anonymous public access through a specific area of the CyVerse Data Store.
In the long term, the data will be safely stored for later use. The Data Store replicates data between computers at UA and the Texas Advanced Computing Center (TACC). Currently, each site has at least 2 petabytes of storage.
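As an illustration of the working-copy and archive-copy arrangement, the sketch below compares two copies of a file by checksum. It is a minimal example using only the Python standard library; the function names `sha256_of` and `copies_match` are hypothetical, and the choice of SHA-256 is an assumption rather than a description of how CyVerse or the Data Store verifies replicas.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large FITS files are not loaded whole.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def copies_match(working: Path, archive: Path) -> bool:
    """Return True when the working and archive copies are byte-identical.

    Hypothetical helper: the checksum algorithm is an assumption.
    """
    return sha256_of(working) == sha256_of(archive)
```

A check of this kind would typically run after each backup, so that a corrupted working copy is detected before it replaces a good archive copy.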
Access permissions will be managed automatically. FITS headers will be used to assign user permissions and provide a searchable database. Data will automatically become public at the end of the proprietary period.
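The release logic described above can be sketched as a small date calculation. This is an illustrative example only: the actual enforcement is expected to rely on Data Store rules, and the sketch assumes that the proprietary clock starts at the observation date recorded in the standard FITS `DATE-OBS` keyword, which the policy text does not specify. The header is represented as a plain dictionary, and the names `add_months`, `release_date`, and `is_public` are hypothetical.

```python
import calendar
from datetime import date, datetime

# Default proprietary period for science data, taken from the policy text.
PROPRIETARY_MONTHS = 24


def add_months(start: date, months: int) -> date:
    """Shift a date by whole calendar months, clamping to the month's end."""
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def release_date(header: dict, months: int = PROPRIETARY_MONTHS) -> date:
    """Return the date on which a file becomes public.

    Assumption: the period is counted from the DATE-OBS header value.
    """
    observed = datetime.fromisoformat(header["DATE-OBS"]).date()
    return add_months(observed, months)


def is_public(header: dict, today: date,
              months: int = PROPRIETARY_MONTHS) -> bool:
    """Return True once the proprietary period has elapsed."""
    return today >= release_date(header, months)
```

For example, passing `months=3` would model the three-month delay applied to laboratory and engineering data, under the same assumption about when the period begins.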
All data sets and software generated by this project will be documented and linked to using a wiki provided by UA, CyVerse, or another publicly available online documentation site. This site will also serve as a coordination hub between groups for documenting how data were handled, pre-processed, post-processed, published, and made publicly available.