Overview

The vidavis applications read data in the MeasurementSet v4 (MSv4) Zarr format. They also accept MeasurementSet v2 (MSv2) file paths: an MSv2 is automatically converted to an MSv4 Zarr file if the necessary packages are installed. If they are not installed, the plots fail.
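
As a rough illustration of the difference between the two inputs, the sketch below guesses which kind of store a path points to by looking for marker files on disk. It assumes that a casacore table directory (the MSv2 container) holds a top-level table.dat file and that a Zarr group holds a .zgroup (Zarr v2) or zarr.json (Zarr v3) file. The name guess_store_kind is hypothetical and is not part of vidavis or XRADIO. A Zarr group is not necessarily a valid MSv4, so this is only a quick pre-check, not a validation.

```python
from pathlib import Path

# Marker files assumed to identify a Zarr group (v2 and v3 layouts).
ZARR_MARKERS = (".zgroup", "zarr.json")


def _is_zarr_group(directory):
    """Return True if the directory contains a Zarr group marker file."""
    return any((directory / marker).is_file() for marker in ZARR_MARKERS)


def guess_store_kind(path):
    """Guess whether a path is a casacore table, a Zarr store, or unknown.

    Hypothetical helper for illustration only; it inspects marker files
    and never opens the dataset.
    """
    p = Path(path)
    if not p.is_dir():
        return "unknown"
    # Assumption: casacore tables (MSv2) keep table.dat at the top level.
    if (p / "table.dat").is_file():
        return "casacore-table"
    # Assumption: the store itself, or its immediate children, are Zarr groups.
    children = (c for c in p.iterdir() if c.is_dir())
    if _is_zarr_group(p) or any(_is_zarr_group(c) for c in children):
        return "zarr"
    return "unknown"
```

A result of "casacore-table" suggests that conversion will be needed before plotting, while "unknown" suggests the path should be checked before it is passed to an application.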

Infrastructure

The applications use the Bokeh backend for plotting. Bokeh provides built-in plot tools that let the user zoom, pan, select regions, inspect data values, and save the plot. Additional libraries are used for data I/O, logging, plotting, and interactive dashboards:

  • XRADIO (Xarray Radio Astronomy Data I/O) implements the MeasurementSet v4 schema as Zarr files and uses Xarray to provide an interface to the data

  • ToolVIPER is used for creating the Dask.distributed client and for logging

  • GraphVIPER is used for Dask-based MapReduce to convert MSv2 datasets and calculate statistics

  • The HoloViz library hvPlot allows easy visualization of Xarray data objects as interactive Bokeh plots

  • The HoloViz library HoloViews makes it easy to lay out and overlay plots

  • The HoloViz library Panel streamlines the development of apps and dashboards for the plots

Implementation

Application Modes

The vidavis applications create plots for the user to view interactively or save to disk. The applications can be used in three ways from Python:

  • create plots to export to file

  • create interactive Bokeh plots to show in a browser window

  • use a GUI dashboard in a browser window to select plot parameters and create interactive Bokeh plots

Data Exploration

XRADIO allows the user to explore MeasurementSet data with a summary of its metadata, and to make plots of antenna positions and phase center locations of all fields. These features can be accessed in all applications.

Installation

Requirements

  • Python 3.11 or greater

The required dependency packages are automatically installed with vidavis, including those listed in the Infrastructure section.

Conversion from MSv2 to MSv4 in the applications is optional. To enable it, XRADIO must be installed with either python-casacore or casatools.

Install vidavis

You may want to use the conda environment manager from miniforge to create a clean, self-contained runtime where vidavis and its dependencies can be installed:

conda create --name vidavis python=3.12 --no-default-packages
conda activate vidavis

Install required packages:

pip install vidavis

Install for MSv2 Conversion

On Linux, the casacore and zarr Python packages are currently installed automatically with XRADIO for vidavis.

On macOS, only zarr is installed. To enable conversion from MSv2 to MSv4, first install python-casacore using conda install -c conda-forge python-casacore.

It is also possible to use casatools as the backend for reading the MSv2. See the XRADIO casatools setup guide for more information.
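
Before passing an MSv2 path to an application, it can be useful to check whether a conversion backend is present. The following is a minimal sketch using only the standard library. It assumes that python-casacore is importable as casacore, casatools as casatools, and XRADIO as xradio; the function names are hypothetical and are not part of vidavis.

```python
from importlib.util import find_spec

# Assumed mapping from distribution name to importable module name.
BACKEND_MODULES = {
    "python-casacore": "casacore",
    "casatools": "casatools",
}


def msv2_conversion_backends():
    """Report which MSv2 reading backends can be imported.

    find_spec only locates the module; it does not import it, so the
    check stays cheap even for large packages.
    """
    return {
        package: find_spec(module) is not None
        for package, module in BACKEND_MODULES.items()
    }


def can_convert_msv2():
    """Return True if XRADIO and at least one backend appear to be installed."""
    has_xradio = find_spec("xradio") is not None
    return has_xradio and any(msv2_conversion_backends().values())


if __name__ == "__main__":
    for package, found in msv2_conversion_backends().items():
        print(f"{package}: {'found' if found else 'not found'}")
    print("MSv2 conversion available:", can_convert_msv2())
```

This only confirms that the modules can be located. It does not verify that the installed versions are compatible with each other.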

Dask.distributed Scheduler

For parallel processing workflows, you can set up a local Dask cluster using ToolVIPER. Dask.distributed is a centrally managed, distributed, dynamic task scheduler.

Before using the vidavis applications, you may elect to start a Dask Client (local machine) or a Dask LocalCluster (cluster). For a local client, set the number of cores and the memory limit per core. The logging level for the main thread and the worker threads may also be set (default “INFO”). For small datasets, however, the client adds overhead that can make plotting slower than running without it.

ToolVIPER has an interface to create and access a local or distributed client. See the client tutorial and help(local_client) or help(distributed_client) for more information.
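
The core count and per-core memory limit for a local client can be worked out from a memory budget before the client is created. The sketch below is one way to do that arithmetic. The function plan_local_client and its parameters are hypothetical and are not part of ToolVIPER; consult help(local_client) for the actual argument names and accepted formats.

```python
import os


def plan_local_client(total_memory_gb, reserve_cores=1, available_cores=None):
    """Suggest a core count and a per-core memory limit for a local client.

    total_memory_gb: memory budget to divide among the workers (assumption:
        chosen by the user; this sketch does not query the system).
    reserve_cores: cores left free for the main process and the browser.
    available_cores: override for the detected core count (useful in tests).
    """
    if total_memory_gb <= 0:
        raise ValueError("total_memory_gb must be positive")
    if available_cores is None:
        # os.cpu_count() can return None on some platforms.
        available_cores = os.cpu_count() or 1
    # Always keep at least one core for the workers.
    cores = max(1, available_cores - reserve_cores)
    memory_per_core_gb = total_memory_gb / cores
    return cores, f"{memory_per_core_gb:.1f}GB"


if __name__ == "__main__":
    cores, memory_limit = plan_local_client(total_memory_gb=16)
    print(f"cores={cores}, memory limit per core={memory_limit}")
```

Reserving a core is a design choice rather than a requirement: it leaves headroom for the main process, which also serves the interactive plots.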

Warning

If Python scripts are used to make plots, the client should not be created at the top level of the script. Create it inside an if __name__ == "__main__": block instead. For more details, see Standalone Python scripts in the Dask Scheduling documentation.