Overview
The vidavis applications read data in the MeasurementSet v4 (MSv4) Zarr
format, but they also accept MeasurementSet v2 (MSv2) file paths. An MSv2 is
automatically converted to an MSv4 Zarr file if the necessary packages are
installed; otherwise, the plots will fail.
Infrastructure
The applications utilize the Bokeh backend for the plots. Bokeh provides built-in plot tools allowing the user to zoom, pan, select regions, inspect data values, and save the plot. Additional libraries are used for data I/O, logging, plotting, and interactive dashboards:
XRADIO (Xarray Radio Astronomy Data I/O) implements the MeasurementSet v4 schema as Zarr files and uses Xarray to provide an interface for data
ToolVIPER is used for creating the Dask.distributed client and for logging
GraphVIPER is used for Dask-based MapReduce to convert MSv2 datasets and calculate statistics
HoloViz library hvPlot allows easy visualization of Xarray data objects as interactive Bokeh plots
HoloViz library HoloViews makes it easy to lay out and overlay plots
HoloViz library Panel streamlines the development of apps and dashboards for the plots
Implementation
Application Modes
The vidavis applications create plots for the user to view interactively or
save to disk. The applications can be used in three ways from Python:
create plots to export to file
create interactive Bokeh plots to show in a browser window
use a GUI dashboard in a browser window to select plot parameters and create interactive Bokeh plots
Data Exploration
XRADIO allows the user to explore MeasurementSet data with a summary of its metadata, and to make plots of antenna positions and phase center locations of all fields. These features can be accessed in all applications.
Installation
Requirements
Python 3.11 or greater
The required dependency packages are automatically installed with vidavis,
including those listed in the Infrastructure section.
Conversion from MSv2 to MSv4 in the applications is optional. To enable it, XRADIO additionally needs python-casacore or casatools.
Install vidavis
You may want to use the conda environment manager from miniforge to create a clean, self-contained runtime where vidavis and its dependencies can be installed:
conda create --name vidavis python=3.12 --no-default-packages
conda activate vidavis
Install required packages:
pip install vidavis
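After installation, a quick check can confirm that the interpreter meets the stated minimum version and that the package is visible to Python. The following is a minimal sketch using only the standard library; the helper names `python_ok` and `installed_version` are hypothetical and are not part of vidavis.

```python
import sys
from importlib.metadata import PackageNotFoundError, version

# Minimum interpreter version stated in the Requirements section.
MIN_PYTHON = (3, 11)


def python_ok(version_info=sys.version_info):
    """Return True if the given version tuple meets the minimum."""
    return tuple(version_info[:2]) >= MIN_PYTHON


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


print("Python version OK:", python_ok())
print("vidavis version:", installed_version("vidavis"))
```

If the second line prints `None`, the package was most likely installed into a different environment than the one currently active.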
Install for MSv2 Conversion
Currently, the casacore and zarr Python packages are automatically installed with XRADIO for vidavis if the system platform is Linux.
On macOS, only zarr is installed. To enable conversion from MSv2 to MSv4 there,
first install python-casacore with the following command:
conda install -c conda-forge python-casacore
It is also possible to use casatools as the backend for reading the MSv2. See the XRADIO casatools setup guide for more information.
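Before passing an MSv2 path to an application, it can help to check whether either conversion backend is importable in the current environment. This is a minimal sketch using only the standard library; the helper name `available_backends` is hypothetical. It relies on the fact that python-casacore is imported as `casacore` and casatools as `casatools`.

```python
from importlib.util import find_spec

# Import names of the two MSv2 backends named in this section.
BACKENDS = ("casacore", "casatools")


def available_backends(candidates=BACKENDS):
    """Return the candidate modules that can be found, without importing them."""
    return [name for name in candidates if find_spec(name) is not None]


found = available_backends()
if found:
    print("MSv2 backend(s) found:", ", ".join(found))
else:
    print("No MSv2 backend found; only MSv4 Zarr input is expected to work.")
```

The check only confirms that a module can be located. It does not verify that XRADIO is configured to use that backend or that a conversion will succeed.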
Dask.distributed Scheduler
For parallel processing workflows, you can set up a local Dask cluster using ToolVIPER. Dask.distributed is a centrally managed, distributed, dynamic task scheduler.
Prior to using the vidavis applications, you may elect to start a Dask
Client (local machine) or a Dask LocalCluster (cluster). For a local client, set
the number of cores and memory limit per core to use. The logging level for the
main thread and the worker threads may also be set (default “INFO”). When
plotting small datasets, however, this adds overhead which may make plotting
slower than without the client.
ToolVIPER has an interface to create and access a local or distributed
client. See the client tutorial and help(local_client) or
help(distributed_client) for more information.
Warning
If Python scripts are used to make plots, the client should not be created in the main thread. For more details, see Standalone Python scripts in the Dask Scheduling documentation.
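The warning above follows the pattern recommended in the Dask documentation: the client is created inside a function that is called only under the `if __name__ == "__main__":` guard. The following is a minimal sketch that uses dask.distributed directly rather than the ToolVIPER interface, whose exact arguments should be checked with help(local_client). The worker count, the memory limit, and the `make_plots` placeholder are assumptions for illustration.

```python
def make_plots():
    """Hypothetical placeholder: put the vidavis plotting calls here."""
    return "plots done"


def main(n_workers=2, memory_limit="2GB"):
    """Create a local Dask client, run the plotting work, then clean up."""
    try:
        # Imported lazily so the script still runs where Dask is absent.
        from dask.distributed import Client
    except ImportError:
        print("dask.distributed is not installed; running without a client")
        return make_plots()

    # memory_limit applies per worker; each worker process gets one thread.
    # Leaving the with-block closes the client and the cluster it started.
    with Client(n_workers=n_workers, threads_per_worker=1,
                memory_limit=memory_limit):
        return make_plots()


if __name__ == "__main__":
    # Worker processes may re-import this file under a different module
    # name, so the client is created only when the file is run as a script.
    main()
```

Without the guard, each worker process that re-imports the script could attempt to start its own client, which typically ends in an error or a hang.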