LOFAR is a high-throughput data facility that has been operational since 2012 and is currently undergoing a major upgrade towards LOFAR 2.0. Operating such a state-of-the-art facility for the first time has given us the chance to optimize our tools, processes, and operational model to match the complexity of this groundbreaking telescope. The upgrades in LOFAR 2.0 will enable the simultaneous use of the low- and high-band antennas or, alternatively, a doubling of the survey speed in one of the observing bands. The fully commensal correlator will provide interferometric and tied-array beam data products at the same time. The new observing regime will demand high performance not only from the correlator, but also in the specification, scheduling, and quality assessment of the observations. Processing to produce science-ready data for the community will be performed within the long-term archive infrastructure. This will come at a cost in computing and storage resources, and will pose challenges in the development and optimization of pipelines and algorithms and in the control of complex chains of data processes. In this talk, I will describe how the lessons learned in several aspects of LOFAR operations, from telescope calibration to data storage and discovery, have triggered important technological, operational, and policy advancements that will pave the way towards LOFAR 2.0 and beyond.
The LOFAR Data Valorization (LDV) project aims to curate and add value to the multi-petabyte distributed data collection of the LOFAR Long Term Archive (LTA), and to balance resource usage across data centers by re-distributing archived files. It serves as a demonstration of practices to be implemented in production for future LTA management. The main goals of the first phase of the LDV project presented here are to improve the sustainability of operations and the user data-access experience. This paper describes the following LDV topics:

- Data curation: Assuring metadata completeness for the tens of millions of files archived over fifteen years of telescope operations called for a thorough screening of the content and consistent annotation of the information throughout the data collection catalog. We show how this challenge was tackled over the first year of LDV operations; a sketch of such a completeness screen follows this list.

- Data editing: The collection of tens of petabytes spans a broad spectrum of file sizes and data product types. To edit these collections, a suite of workflows has been developed and operated, reducing the required storage resources by many petabytes.

- Data placement: Handling petabyte-scale data transfers is both time- and effort-consuming. To make the transfer processes more manageable, we have used two services: the LOFAR Stager, responsible for staging LOFAR data, and the SURF-operated File Transfer Service (FTS), which handles bulk data transfers while allowing users to monitor and debug problems; a submission sketch follows this list. Within the LDV project, these systems have been used to transfer over one petabyte of LOFAR data so far and are planned to transfer at least three more petabytes.

This paper presents the results achieved so far and outlines the next steps for valorizing the LOFAR archive data collection.
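To make the curation step concrete, the following is a minimal, hypothetical sketch of a metadata-completeness screen over a catalog export. The required field names and the CSV input format are illustrative assumptions only, not the actual LTA catalog schema or the tooling used in LDV.

```python
import csv
from collections import Counter

# Hypothetical required fields; the real LTA catalog schema differs.
REQUIRED_FIELDS = ["project", "observation_id", "dataproduct_type",
                   "file_size", "checksum", "storage_site"]

def screen_catalog(path):
    """Count records missing each required metadata field."""
    missing = Counter()  # field name -> number of incomplete records
    total = 0
    with open(path, newline="") as fh:
        for row in csv.DictReader(fh):
            total += 1
            for field in REQUIRED_FIELDS:
                if not row.get(field):  # field absent or value empty
                    missing[field] += 1
    return total, missing

if __name__ == "__main__":
    total, missing = screen_catalog("lta_catalog_export.csv")  # placeholder path
    print(f"screened {total} records")
    for field, n in missing.most_common():
        print(f"  {field}: {n} incomplete ({100 * n / total:.1f}%)")
```

A screen of this shape yields a per-field incompleteness report, which is the natural starting point for the consistent annotation work described above.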
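For data placement, the sketch below shows how a bulk transfer could be submitted through FTS using the `fts3-rest` Python "easy" bindings, which the FTS project provides. The endpoint URL and the source/destination SURLs are placeholder assumptions, and the actual LDV submission workflow may differ.

```python
# Sketch of a bulk transfer submission via the FTS "easy" bindings
# (fts3-rest client). Requires a valid X.509 proxy for authentication.
import fts3.rest.client.easy as fts3

FTS_ENDPOINT = "https://fts.example.org:8446"  # placeholder, not the SURF endpoint

def submit_bulk_transfer(pairs):
    """Submit one FTS job moving a list of (source, destination) SURL pairs."""
    context = fts3.Context(FTS_ENDPOINT)
    transfers = [fts3.new_transfer(src, dst) for src, dst in pairs]
    # Checksum verification and retries make large campaigns more robust.
    job = fts3.new_job(transfers, verify_checksum=True, retry=3)
    return fts3.submit(context, job)

# Placeholder SURLs for illustration only.
pairs = [
    ("gsiftp://site-a.example.org/lofar/L123456_SB000.MS.tar",
     "gsiftp://site-b.example.org/lofar/L123456_SB000.MS.tar"),
]
job_id = submit_bulk_transfer(pairs)
# FTS tracks each job; its state can be polled until it reaches a terminal state.
print(fts3.get_job_status(fts3.Context(FTS_ENDPOINT), job_id)["job_state"])
```

Delegating retries, checksumming, and monitoring to FTS in this way is what makes petabyte-scale campaigns manageable with limited operator effort.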