Operating a cutting-edge radio telescope like ALMA demands optimal use of every minute available on sky. As more observation hours are allocated to researchers each year, the need for continuous, seamless operations grows. ALMA relies on an array of computer systems running full time with numerous concurrent users, generating approximately 50,000 logs per minute and about 70 million logs per day. To manage this data flow, Log Detector was developed as an in-house solution that automates the detection and reporting of known issues. The tool lets users define the states and transitions of a Finite State Machine (FSM) and then feed logs into the machine; the resulting state transitions signal potential problems or support system monitoring tasks. This article highlights the capabilities of Log Detector and its impact on operational efficiency. It also offers insights into the lessons learned while developing an in-house operational tool and outlines future development plans.
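The FSM approach described in the abstract above can be illustrated with a minimal sketch. This is not Log Detector's actual implementation or configuration format, which the abstract does not describe; the class names (`Transition`, `LogFSM`), the `detect` helper, the state names, and the log messages are all hypothetical and chosen only for illustration. Each transition pairs a source state with a regular expression, and entering a state listed as an alarm state is treated as a detected issue.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Transition:
    source: str   # state the machine must currently be in
    pattern: str  # regular expression matched against a log line
    target: str   # state entered when the pattern matches


@dataclass
class LogFSM:
    initial: str
    alarm_states: set
    transitions: list
    state: str = field(init=False)

    def __post_init__(self):
        self.state = self.initial

    def feed(self, line):
        """Take the first matching transition; return True if an alarm state was entered."""
        for t in self.transitions:
            if t.source == self.state and re.search(t.pattern, line):
                self.state = t.target
                return self.state in self.alarm_states
        return False  # no transition matched, state unchanged


def detect(lines):
    """Return the 1-based numbers of the log lines that drove the machine into an alarm state."""
    # Hypothetical states and messages, assumed for this example only.
    fsm = LogFSM(
        initial="idle",
        alarm_states={"stuck"},
        transitions=[
            Transition("idle", r"Scan started", "scanning"),
            Transition("scanning", r"Scan finished", "idle"),
            Transition("scanning", r"Timeout waiting for antenna", "stuck"),
            Transition("stuck", r"Scan started", "scanning"),
        ],
    )
    return [n for n, line in enumerate(lines, start=1) if fsm.feed(line)]
```

In this sketch a timeout message only raises an alarm when it arrives while a scan is in progress; the same message seen in the idle state is ignored, which is the kind of context a plain pattern match on single log lines cannot express.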
The Atacama Large Millimeter/submillimeter Array (ALMA) is the world’s largest ground-based facility for observations at millimeter/submillimeter wavelengths. Inaugurated in March 2013, ALMA has already completed ten years of continuous steady-state operations. It comprises 66 antennas located at an altitude of approximately 5000 meters on the Chajnantor Plateau in the Atacama Desert in northern Chile. The ALMA partnership established the ALMA 2030 development program to improve ALMA’s capabilities and avoid obsolescence over the next decade. The Wideband Sensitivity Upgrade (WSU) project, the first initiative of the ALMA 2030 development program, will replace the entire digital processing system, which includes the wideband digitizers, data transmission system, and data correlation system. A working group was charged with developing a WSU Deployment Concept based on a parallel deployment approach to minimize scientific downtime during the upgrade period, which could last up to five years. In this paper, the authors present the relevant aspects of this analysis and its conclusions, which will pave the way for the definition of the AIVC concept and the corresponding AIVC plan of the WSU project.
The COVID-19 pandemic forced some ALMA Observatory teams to change their working models from on-site or office-based to fully remote. The performance achieved by the groups during this emergency showed that a hybrid working model would be suitable for long-term implementation, especially for the teams whose activities are concentrated away from the observatory site or the Santiago offices. The science and computing groups were the teams best suited to adopting a different working model. Many lessons learned from this experience contributed to establishing a permanent hybrid model. The ALMA Software group, consisting of 18 engineers, transitioned in this direction, taking into account the knowledge gained during the pandemic and achieving a smooth and successful transition that maintained productivity levels and a cohesive team spirit regardless of the physical location of the group members. This paper provides an overview of key considerations, challenges, and benefits associated with the shift towards a hybrid working model. Factors and challenges such as technological infrastructure, communication and collaboration, collaborators' well-being, and performance metrics are analyzed from the point of view of managers and supervisors. The paper also describes the challenges that the group will face in the near future and the actions developed to mitigate the risks and disadvantages of the new working environment.
The Atacama Large Millimeter/submillimeter Array (ALMA) has been in the operations regime since 2013. After almost 10 years of successful operation, hardware and software obsolescence has emerged. In addition, the ALMA 2030 plan will add new disruptive capabilities to the ALMA telescope. Both efforts will require an increased amount of technical time for testing in order to minimize the risk of introducing instability into operations when new equipment and software are integrated into the telescope. Therefore, a process to design and implement a new simulation environment, which must be comparable to the production environment, was started in 2017 and passed the Critical Design and Manufacturing Review (CDMR) in 2020. In this paper, the current status of the project is reviewed, focusing on the assembly and integration period and on the use cases that are starting to be built on top of this testing facility.
During operations, the ALMA observatory generates a huge amount of logs, which contain valuable information not only about specific failures but also for long-term performance analysis. We implemented a big data solution based on Elasticsearch, Logstash and Kibana. These components are configured as a decoupled system that has zero impact on existing operations and is able to keep more than six months of operation logs online. In this paper, we describe this infrastructure, the applications built on top of it, and the problems that we faced during its implementation.
After the inauguration of the Atacama Large Millimeter/submillimeter Array (ALMA), the Software Operations Group in Chile has refocused its objectives on: (1) providing software support to tasks related to System Integration, Scientific Commissioning and Verification, as well as Early Science observations; (2) testing the remaining software features, still under development by the Integrated Computing Team across the world; and (3) designing and developing processes to optimize and increase the level of automation of operational tasks. Because they have different stakeholders, these tasks vary widely in importance, lifespan and complexity. Aiming to provide the proper priority and traceability for every task without stressing our engineers, we introduced the Kanban methodology in our processes in order to balance the demand on the team against the throughput of the delivered work.
The aim of this paper is to share the experience gained during the implementation of Kanban in our processes, describing the difficulties we encountered and the solutions and adaptations that led us to our current but still evolving implementation, which has greatly improved our throughput, prioritization and problem traceability.
The main telescope of the UC Observatory Santa Martina is a 50 cm optical telescope donated by ESO to Pontificia
Universidad Catolica de Chile. During the past years the telescope has been refurbished and used as the main facility for
testing and validating new instruments under construction by the center of Astro-Engineering UC. As part of this work,
the need to develop a more efficient and flexible control system arose. The new distributed control system has been
developed on top of the Internet Communications Engine (ICE), a framework developed by ZeroC, Inc. This framework
features a lightweight but powerful and flexible inter-process communication infrastructure and provides bindings to
classic and modern programming languages such as C/C++, Java, C#, Ruby, Objective-C, etc. The result of this work
shows ICE as a real alternative to CORBA and other de facto distributed programming frameworks. A classical control
software architecture has been chosen. It comprises an observation control system (OCS), the orchestrator of the
observation, which controls the telescope control system (TCS) and the detector control system (DCS). The real-time
control and monitoring system is deployed and running on ARM-based single-board computers. Other features such as
logging and configuration services have been developed as well. Interoperation with other main astronomical control
frameworks is foreseen in order to achieve a smooth integration of instruments when they are integrated into the main
observatories in the north of Chile.
The Atacama Large Millimeter/submillimeter Array (ALMA) will be a unique research instrument composed of at least
66 reconfigurable high-precision antennas, located at the Chajnantor plain in the Chilean Andes at an elevation of 5000
m. Each antenna contains instruments capable of receiving radio signals from 31.3 GHz up to 950 GHz. These signals
are correlated inside a Correlator and the spectral data are finally saved into the Archive system together with the
observation metadata. This paper describes the progress in the development of the ALMA operation support software,
which aims to increase the efficiency of the testing, distribution, deployment and operation of the core observing
software. This infrastructure has become critical as the main array software evolves during the construction phase. In
order to support and maintain the core observing software, it is essential to have a mechanism to align and distribute the
same version of software packages across all systems. This is achieved rigorously with weekly regression tests and
strict configuration control. A build farm to provide continuous integration and testing in simulation has been established
as well. Given the large number of antennas, it is also imperative to have a monitoring system that allows trend analysis of
each component in order to trigger preventive maintenance activities. A challenge for which we are preparing this year
consists of testing the whole ALMA software in a complete end-to-end operation, from proposal submission to
data distribution to the ALMA Regional Centers. The experience gained during deployment, testing and operation
support will be presented.
KEYWORDS: Antennas, Software development, Observatories, Optical correlators, Astronomy, Software engineering, Prototyping, Information technology, Solar thermal energy, Control systems
Starting in 2009, the ALMA project initiated one of its most exciting phases within construction: the first antenna
from one of the vendors was delivered to the Assembly, Integration and Verification team. With this milestone and
the closure of the ALMA Test Facility in New Mexico, the JAO Computing Group in Chile found itself in the front
line of the project's software deployment and integration effort. Among the group's main responsibilities are the
deployment, configuration and support of the observation systems, in addition to infrastructure administration,
all of which needs to be done in close coordination with the development groups in Europe, North America
and Japan. Software support has been the primary interaction key with the current users (mainly scientists,
operators and hardware engineers), as the software is normally the most visible part of the system.
During this first year of work with the production hardware, three consecutive software releases have been
deployed and commissioned. Also, the first three antennas have been moved to the Array Operations Site, at
5,000 meters elevation, and the complete end-to-end system has been successfully tested. This paper shares the
experience of this 15-person group, working as part of the construction team at the ALMA site and together
with Computing IPT, on the achievements made and problems overcome during this period. It explores the excellent
results of teamwork, and also some of the troubles that such a complex and geographically distributed project
can run into. Finally, it addresses the challenges still to come, with the transition to the ALMA operations
plan.