1. Data download and preliminaries

From prior work and data:

1.1. Preliminaries

1.1.1. Configure env

1.1.2. Obtain data

  • The Python methods below use osfclient. Run pip install osfclient if required.

  • Alternatively, the data can be downloaded via the OSF web interface.

  • A full clone with the osfclient CLI pulls 99 files (1.1 GB).

  • The Python route uses the local module qbanalysis; run pip install -e . from the repo root to install it.

Options:

  1. Clone the full OSF repository/project.

  2. Pull only the data matching the published case: the file Xe_hyperfine_VMI_processing_distro_211217.zip in the OSF repo.
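
For reference, option (2) can also be sketched directly against the osfclient Python API rather than through qbanalysis. The helper below, fetch_named_file, is hypothetical (it is not part of qbanalysis or osfclient). It assumes only that a storage object exposes a files iterable whose items have a name attribute and a write_to(fileobj) method, which is how osfclient storage and file objects are understood to behave.

```python
from pathlib import Path


def fetch_named_file(storage, file_name, dest_dir):
    """Download the first file called `file_name` from `storage` into `dest_dir`.

    Hypothetical helper: `storage` is assumed to behave like an osfclient
    storage object, i.e. `storage.files` yields objects with `.name` and
    `.write_to(fileobj)`.
    """
    dest_dir = Path(dest_dir).expanduser()
    dest_dir.mkdir(parents=True, exist_ok=True)

    for remote_file in storage.files:
        if remote_file.name == file_name:
            target = dest_dir / file_name
            # Remote content is written as bytes, so open in binary mode
            with open(target, 'wb') as fp:
                remote_file.write_to(fp)
            return target

    raise FileNotFoundError(f'{file_name} not found in storage')


# Usage with osfclient (needs network access and `pip install osfclient`):
# from osfclient import OSF
# storage = OSF().project('ds8mk').storage('osfstorage')
# fetch_named_file(storage, 'Xe_hyperfine_VMI_processing_distro_211217.zip', '/tmp/xe_analysis')
```

Passing the storage object in, rather than creating it inside the function, keeps the download logic easy to exercise without a network connection.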

from pathlib import Path

project = 'ds8mk'
# dataPath = Path('~/tmp/xe_analysis_2024_scratch')
dataPath = Path('/tmp/xe_analysis')
dataFile = 'Xe_hyperfine_VMI_processing_distro_211217.zip'
# Option (1): download full repo at CLI
# fetch all files from a project and store them in `output_directory`
# Should pull 99 files/1.1 GB.

# !osf -p {project} clone {dataPath.as_posix()}
# Option (2) Minimal data via Python API
# Just pull final analysis `Xe_hyperfine_VMI_processing_distro_211217.zip`
# Note: can also use the CLI, `!osf fetch {project}/{dataFile} {(dataPath/dataFile).as_posix()}`,
# if the local env is configured for this.

# Load module
# import qbanalysis as qb
from qbanalysis import getOSFdata

# Get data
# Alternatively can run with project defaults as `getOSFdata.main()`
projDict = getOSFdata.getProjectFile(project,dataPath,dataFile)
2025-06-03 16:49:37.394 | INFO     | qbanalysis.config:<module>:11 - PROJ_ROOT path is: /home/runner/work/Quantum-Beat_Photoelectron-Imaging_Spectroscopy_of_Xe_in_the_VUV/Quantum-Beat_Photoelectron-Imaging_Spectroscopy_of_Xe_in_the_VUV
2025-06-03 16:49:38.740 | INFO     | qbanalysis.getOSFdata:getProjectFile:61 - Found OSF project: Quantum Beat Photoelectron Imaging Spectroscopy of Xe in the VUV, https://osf.io/ds8mk/ .
2025-06-03 16:49:39.892 | INFO     | qbanalysis.getOSFdata:getProjectFile:69 - Created destination dir /tmp/xe_analysis.
2025-06-03 16:49:39.893 | INFO     | qbanalysis.getOSFdata:getProjectFile:89 - Scanning OSF project files...
2025-06-03 16:49:41.717 | INFO     | qbanalysis.getOSFdata:getProjectFile:92 - Downloading Xe_hyperfine_VMI_processing_distro_211217.zip (index n=1)...
100%|██████████| 10.1M/10.1M [00:00<00:00, 13.1Mbytes/s]
2025-06-03 16:49:44.112 | SUCCESS  | qbanalysis.getOSFdata:getProjectFile:97 - Downoaded data file to /tmp/xe_analysis/Xe_hyperfine_VMI_processing_distro_211217.zip.
2025-06-03 16:49:44.112 | INFO     | qbanalysis.getOSFdata:getProjectFile:102 - Unzipping Xe_hyperfine_VMI_processing_distro_211217.zip.

# The returned dictionary contains a file list and other info
projDict.keys()
dict_keys(['project', 'name', 'URL', 'dataPath', 'dataFile', 'fullPath', 'fileList', 'fileNames'])
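
The log above shows the archive being unzipped after download. As a rough illustration of that step using only the standard library (this is not the qbanalysis implementation, which is not shown here), the hypothetical helper unzip_and_list extracts an archive and returns the paths of the extracted files.

```python
import zipfile
from pathlib import Path


def unzip_and_list(zip_path, dest_dir=None):
    """Extract `zip_path` and return the list of extracted file paths.

    Hypothetical helper for illustration; extracts into the archive's
    parent directory unless `dest_dir` is given.
    """
    zip_path = Path(zip_path)
    dest_dir = Path(dest_dir) if dest_dir is not None else zip_path.parent

    with zipfile.ZipFile(zip_path) as zf:
        zf.extractall(dest_dir)
        # Skip directory entries so only files are returned
        return [dest_dir / info.filename for info in zf.infolist() if not info.is_dir()]
```

A call such as unzip_and_list(dataPath / dataFile) would then give a file list broadly comparable in purpose to the fileList entry above, although the exact contents of that entry are not shown in this page.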

1.2. Quick plot to check dataset

Basic functions are provided to reformat the raw data and plot the \(\beta_{LM}(t)\); the result should match Figure 5 in the manuscript.

from qbanalysis.dataset import loadFinalDataset
from qbanalysis.plots import plotFinalDatasetBLMt
* sparse not found, sparse matrix forms not available. 
* natsort not found, some sorting functions not available. 
* Setting plotter defaults with epsproc.basicPlotters.setPlotters(). Run directly to modify, or change options in local env.
* Set Holoviews with bokeh.
* pyevtk not found, VTK export not available. 
dataDict = loadFinalDataset(dataPath)
2025-06-03 16:50:21.435 | INFO     | qbanalysis.dataset:loadDataset:268 - Loaded data cpBasex_results_cycleSummed_rot90_quad1_ROI_results_with_FT_NFFT1024_hanningWindow_270717.mat.
2025-06-03 16:50:21.485 | INFO     | qbanalysis.dataset:loadDataset:268 - Loaded data cpBasex_results_allCycles_ROIs_with_FTs_NFFT1024_hanningWindow_270717.mat.
2025-06-03 16:50:21.805 | INFO     | qbanalysis.dataset:loadFinalDataset:244 - Processed data to Xarray OK.
/opt/hostedtoolcache/Python/3.10.11/x64/lib/python3.10/site-packages/xarray/core/concat.py:500: FutureWarning: unique with argument that is not not a Series, Index, ExtensionArray, or np.ndarray is deprecated and will raise in a future version.
  common_dims = tuple(pd.unique([d for v in vars for d in v.dims]))
plotFinalDatasetBLMt(**dataDict)

Cf. Figure 5 in the manuscript (lower two panels) for the \(\beta\) parameters.

(Figure from Authorea version: https://doi.org/10.22541/au.156045380.07795038.)

1.3. Save reformatted data

Write the Xarrays to file. This uses routines from the epsproc.IO module, which include complex number handling, although this may not be necessary with newer versions of Xarray (TBC).

from epsproc import IO

for key, data in dataDict.items():
    IO.writeXarray(data, fileName=f'Xe_dataset_{key}', filePath=dataPath)
writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
Retrying file write with sanitized attrs.
['Written to h5netcdf format, with sanitized attribs (may be lossy)', '/tmp/xe_analysis/Xe_dataset_BLMall.nc']
writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
Retrying file write with sanitized attrs.
['Written to h5netcdf format, with sanitized attribs (may be lossy)', '/tmp/xe_analysis/Xe_dataset_BLMerr.nc']
writeXarray caught exception: Invalid value for attr 'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True}. For serialization to netCDF files, its value must be of one of the following types: str, Number, ndarray, number, list, tuple
Retrying file write with sanitized attrs.
['Written to h5netcdf format, with sanitized attribs (may be lossy)', '/tmp/xe_analysis/Xe_dataset_BLMerrCycle.nc']
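
The messages above show the writer retrying after a dictionary-valued attribute ('harmonics') was rejected for netCDF serialization. One possible way to sanitize attributes is sketched below. This is an assumption for illustration, not necessarily what epsproc does: the hypothetical sanitize_attrs function keeps values of the accepted types and stores anything else as a JSON string, so nested dictionaries stay recoverable with json.loads.

```python
import json
from numbers import Number


def sanitize_attrs(attrs):
    """Return a copy of `attrs` with netCDF-incompatible values stored as JSON strings."""
    clean = {}
    for key, value in attrs.items():
        # Strings, numbers, lists and tuples are kept as they are; array-likes
        # are detected via `__array__` so that numpy is not required here.
        if isinstance(value, (str, Number, list, tuple)) or hasattr(value, '__array__'):
            clean[key] = value
        else:
            # e.g. nested dicts: serialise to text (recoverable with json.loads)
            clean[key] = json.dumps(value, default=str)
    return clean


attrs = {
    'harmonics': {'dtype': 'sph', 'kind': 'complex', 'normType': 'ortho', 'csPhase': True},
    'units': 'arb',
}
clean = sanitize_attrs(attrs)
print(type(clean['harmonics']).__name__)  # → str
```

Only top-level values are checked in this sketch, so a list that itself contains dictionaries would still need separate handling.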
# Check data - read from HDF5/NetCDF files
dictFileTest = {}
for key in dataDict:
    dictFileTest[key] = IO.readXarray(fileName=f'Xe_dataset_{key}.nc', filePath=dataPath.as_posix())
*** Read /tmp/xe_analysis/Xe_dataset_BLMall.nc.
*** Read /tmp/xe_analysis/Xe_dataset_BLMerr.nc.
*** Read /tmp/xe_analysis/Xe_dataset_BLMerrCycle.nc.
# Test for identical values to verify round-trip to file
# Note: signed differences can cancel in the sum, so this is a quick sanity check only
import numpy as np
for key in dataDict:
    diff = (dictFileTest[key] - dataDict[key]).sum()

    if np.abs(diff) < 1e-10:
        print(f'{key}: OK')
    else:
        print(f'{key} Diff = {np.abs(diff)}')
BLMall: OK
BLMerr: OK
BLMerrCycle: OK
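
The check above sums signed differences, so in principle errors of opposite sign could cancel. A stricter variant is sketched below with plain NumPy arrays standing in for the datasets (the arrays and the max_abs_diff helper are illustrative assumptions, not part of qbanalysis or epsproc). It compares the largest element-wise absolute difference instead. For Xarray objects, xarray.testing.assert_allclose is another option.

```python
import numpy as np


def max_abs_diff(a, b):
    """Largest element-wise absolute difference, ignoring NaNs."""
    return float(np.nanmax(np.abs(np.asarray(a) - np.asarray(b))))


original = np.array([1.0, -1.0, 0.5])
reloaded = np.array([0.0, 0.0, 0.5])   # two entries differ, with opposite sign

signed_sum = np.abs((reloaded - original).sum())   # differences cancel to zero
strict = max_abs_diff(reloaded, original)          # picks up the mismatch

print(signed_sum, strict)  # → 0.0 1.0
```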

1.4. Versions

import scooby
scooby.Report(additional=['qbanalysis','pemtk','epsproc', 'holoviews', 'hvplot', 'xarray', 'matplotlib', 'bokeh'])
Tue Jun 03 16:50:22 2025 UTC
OS Linux (Ubuntu 24.04) CPU(s) 4 Machine x86_64 Architecture 64bit
RAM 15.6 GiB Environment Jupyter File system ext4
Python 3.10.11 (main, Sep 30 2024, 21:36:13) [GCC 13.2.0]
qbanalysis 0.0.1 pemtk Module not found epsproc 1.3.2.dev0 holoviews 1.20.2
hvplot 0.11.3 xarray 2022.3.0 matplotlib 3.5.3 bokeh 3.7.3
numpy 1.23.5 scipy 1.15.3 IPython 8.37.0 scooby 0.10.1
# # Check current Git commit for local ePSproc version
# from pathlib import Path
# !git -C {Path(qbanalysis.__file__).parent} branch
# !git -C {Path(qbanalysis.__file__).parent} log --format="%H" -n 1
# # Check current remote commits
# !git ls-remote --heads https://github.com/phockett/qbanalysis
# Check current Git commit for local code version
import qbanalysis
!git -C {Path(qbanalysis.__file__).parent} branch
!git -C {Path(qbanalysis.__file__).parent} log --format="%H" -n 1
* master
4dc763f8848565a5c3d947470fb3eb687d07ba5d
# Check current remote commits
!git ls-remote --heads https://github.com/phockett/Quantum-Beat_Photoelectron-Imaging_Spectroscopy_of_Xe_in_the_VUV
4dc763f8848565a5c3d947470fb3eb687d07ba5d	refs/heads/master
2ff23ede221ac1a0ae8b5351c6c505a6ecd1b65d	refs/heads/uncertainties
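
To compare the local commit with the remote heads programmatically, the git ls-remote output can be parsed into a small lookup. The sketch below uses the two output blocks shown above as input; parse_ls_remote is a hypothetical helper, not part of qbanalysis.

```python
def parse_ls_remote(text):
    """Map branch name -> commit hash from `git ls-remote --heads` output."""
    heads = {}
    for line in text.strip().splitlines():
        # Each line is "<sha><whitespace>refs/heads/<branch>"
        sha, ref = line.split(None, 1)
        heads[ref.strip().split('refs/heads/', 1)[-1]] = sha
    return heads


remote_output = """4dc763f8848565a5c3d947470fb3eb687d07ba5d\trefs/heads/master
2ff23ede221ac1a0ae8b5351c6c505a6ecd1b65d\trefs/heads/uncertainties"""
local_commit = '4dc763f8848565a5c3d947470fb3eb687d07ba5d'

heads = parse_ls_remote(remote_output)
matches = [branch for branch, sha in heads.items() if sha == local_commit]
print(matches)  # → ['master']
```

Here the local commit matches the remote master head, so the local code corresponds to the current remote master at the time of the run.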