NEXRAD Level-2: open_nexradlevel2_datatree crashes on files with skipped elevation cuts (KeyError) #356

## Summary

`open_nexradlevel2_datatree` raises `KeyError: <sweep_index>` on NEXRAD Level-2 files where the raw volume legitimately skips one or more elevation cuts. The raw file is well-formed per the NEXRAD ICD (2620002W). The crash is in the datatree builder, which assumes sweep indices are contiguous `0..N-1`, while `NEXRADLevel2File.data` is keyed by `elevation_number - 1` and is therefore non-contiguous when cuts are skipped.

## Reproduction (xradar 0.11.2.dev10+g6cba751ce)

```python
import os
import tempfile

import fsspec  # the s3:// protocol also requires s3fs to be installed
from xradar.io import open_nexradlevel2_datatree

urls = [
    "s3://unidata-nexrad-level2/2020/10/22/KLOT/KLOT20201022_151509_V06",  # VCP-12,  KeyError: 1
    "s3://unidata-nexrad-level2/2020/10/04/KLOT/KLOT20201004_155850_V06",  # VCP-215, KeyError: 2
    "s3://unidata-nexrad-level2/2020/10/15/KLOT/KLOT20201015_030655_V06",  # VCP-215, KeyError: 6
    "s3://unidata-nexrad-level2/2020/10/21/KLOT/KLOT20201021_235554_V06",  # VCP-215, KeyError: 14
]
fs = fsspec.filesystem("s3", anon=True)
for url in urls:
    with tempfile.NamedTemporaryFile(suffix="_V06", delete=False) as out:
        fs.get_file(url, out.name)
        tmp = out.name
    try:
        open_nexradlevel2_datatree(tmp)
    except Exception as e:
        print(url.split("/")[-1], "->", type(e).__name__, e)
    finally:
        os.unlink(tmp)
```

Output:
```
KLOT20201022_151509_V06 -> KeyError 1
KLOT20201004_155850_V06 -> KeyError 2
KLOT20201015_030655_V06 -> KeyError 6
KLOT20201021_235554_V06 -> KeyError 14
```

## Root cause

Inspecting `KLOT20201022_151509_V06` (VCP-12, AVSET-terminated, 7 sweeps):

```python
from xradar.io.backends.nexrad_level2 import NEXRADLevel2File
# tmp must point to a local copy of the file; the loop above unlinks its temp files.
nex = NEXRADLevel2File(tmp)
header_elev_nums = [sw[0]["elevation_number"] for sw in nex.msg_31_header]
print(header_elev_nums)       # [1, 3, 4, 5, 6, 7, 8]  <- elevation_number=2 skipped by RDA
print(sorted(nex.data.keys())) # [0, 2, 3, 4, 5, 6, 7]  <- data keyed by elev_num - 1
```

`.data` is keyed by `elevation_number - 1` (ICD index), so keys are `[0, 2, 3, 4, 5, 6, 7]`.

In `xradar/io/backends/nexrad_level2.py:~2124`:

```python
if incomplete_sweep == "drop":
    sweeps = [f"sweep_{i}" for i in range(act_sweeps) if i not in incomplete]
```

where `act_sweeps = len(nex.msg_31_data_header) = 7`. This produces `sweep_0..sweep_6`, then `open_sweeps_as_dict` looks up `nex.data[1]` — which does not exist — raising `KeyError: 1`.
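
The mismatch can be shown without any radar file. The following is a minimal sketch that uses a plain dict as a stand-in for `nex.data`, with the keys printed above. `build_positional_names` is a hypothetical helper that mimics the `range(act_sweeps)` naming; it is not xradar's actual code.

```python
# Stand-in for nex.data: keys copied from the KLOT20201022_151509_V06 output above.
data = {k: f"sweep payload {k}" for k in [0, 2, 3, 4, 5, 6, 7]}


def build_positional_names(n_sweeps, incomplete=()):
    """Hypothetical helper mimicking the positional range(act_sweeps) naming."""
    return [f"sweep_{i}" for i in range(n_sweeps) if i not in incomplete]


names = build_positional_names(len(data))
name_indices = {int(n.split("_")[1]) for n in names}

# Names that point at a key the data does not have (these lookups would fail).
missing = [n for n in names if int(n.split("_")[1]) not in data]
# Data keys that no positional name ever reaches.
unreached = sorted(set(data) - name_indices)

print(missing)    # ['sweep_1']
print(unreached)  # [7]
```

Besides the failing lookup, the positional scheme would also never reach the last collected cut (key `7` here), since it stops at `N-1`.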

## Why the raw file is valid

The NEXRAD ICD (2620002W §3.2, table III) does not require contiguous elevation numbers in MSG_31 records; the RDA may skip an elevation cut (e.g., operator override, AVSET early termination mid-VCP, hardware fault on a single elevation). The recorded `msg_31_header` lists exactly the cuts that were collected, and `.data` preserves their ICD elevation index. All four sample files above show this pattern:

| File | msg_31 header elev_nums | .data keys |
|---|---|---|
| KLOT20201022_151509 (VCP-12, AVSET) | `[1, 3, 4, 5, 6, 7, 8]` | `[0, 2, 3, 4, 5, 6, 7]` |
| KLOT20201004_155850 (VCP-215, AVSET) | `[1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12]` | `[0, 1, 3, 4, 5, 6, 7, 8, 9, 10, 11]` |
| KLOT20201015_030655 (VCP-215, AVSET) | `[1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15]` | `[0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 12, 13, 14]` |
| KLOT20201021_235554 (VCP-215, AVSET) | `[1..14, 16, 17, 18]` | `[0..13, 15, 16, 17]` |

For every file: `data_keys == [elev_num - 1 for elev_num in header_elev_nums]` (100% consistent).
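
As a cross-check, the table can be tied back to the reproduction output. The sketch below takes the header elevation numbers from the table, derives the `.data` keys as `elevation_number - 1`, and finds the first positional index in `range(len(keys))` that has no matching key. It assumes sweeps are looked up in ascending positional order, so the first missing index is the one that raises; `first_missing_positional_key` is a hypothetical helper name.

```python
# Header elevation numbers from the table, paired with the KeyError value
# reported in the reproduction output.
files = {
    "KLOT20201022_151509": ([1, 3, 4, 5, 6, 7, 8], 1),
    "KLOT20201004_155850": ([1, 2, 4, 5, 6, 7, 8, 9, 10, 11, 12], 2),
    "KLOT20201015_030655": ([1, 2, 3, 4, 5, 6, 8, 9, 10, 11, 12, 13, 14, 15], 6),
    "KLOT20201021_235554": (list(range(1, 15)) + [16, 17, 18], 14),
}


def first_missing_positional_key(elev_nums):
    """Return the first i in range(len(keys)) that is not a data key, or None."""
    keys = {e - 1 for e in elev_nums}  # data keyed by elevation_number - 1
    return next((i for i in range(len(keys)) if i not in keys), None)


predicted = {
    name: first_missing_positional_key(nums) for name, (nums, _) in files.items()
}
for name, (_, reported) in files.items():
    print(name, "predicted:", predicted[name], "reported:", reported)
```

Under that assumption, the predicted index matches the reported `KeyError` value for each of the four files.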

## Impact

Observed in a bulk ingestion of NEXRAD Level-2 KLOT data to AWS Open Data: **65 of 4090 files (1.6%) in a single month** (Oct 2020) fail this way. If a similar rate holds across other years and sites, thousands of valid files would be unreadable through the xradar datatree path.

## Proposed fix

In `open_nexradlevel2_datatree` (`nexrad_level2.py` around line 2124), replace positional iteration with the actual `.data` keys so sweep names map to real ICD indices:

```python
if incomplete_sweep == "drop":
    # Use the actual data keys (ICD elevation index = elevation_number - 1),
    # not positional range(), since the raw file may legitimately skip cuts.
    actual_keys = sorted(nex.data.keys())
    sweeps = [f"sweep_{i}" for i in actual_keys if i not in incomplete]
```

A parallel adjustment is needed for the `incomplete_sweep == "pad"` branch and for any downstream code that computes `nex.data[i]` from a positional `range`.
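
For downstream code that still works with positions, one option is to go through the sorted key list instead of indexing `nex.data` directly. This is a minimal sketch with a plain dict as a stand-in for `nex.data`; `sweep_at_position` is a hypothetical helper, and the sketch does not settle whether `incomplete` holds positions or ICD indices in xradar.

```python
def sweep_at_position(data, pos):
    """Hypothetical helper: return (key, value) for the pos-th collected sweep."""
    keys = sorted(data)  # e.g. [0, 2, 3, 4, 5, 6, 7]
    key = keys[pos]      # raises IndexError past the last collected sweep
    return key, data[key]


# Stand-in for nex.data, using the keys from KLOT20201022_151509_V06.
data = {k: f"payload {k}" for k in [0, 2, 3, 4, 5, 6, 7]}

pairs = [sweep_at_position(data, pos) for pos in range(len(data))]
print([key for key, _ in pairs])  # [0, 2, 3, 4, 5, 6, 7]
```

Every position now resolves to a real key, including the last cut, which a bare `range(act_sweeps)` lookup would miss.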

Happy to open a PR with the fix + a regression test seeded from one of the reproducer files if the approach looks right.

## Environment

- xradar: `0.11.2.dev10+g6cba751ce` (openradar/xradar main @ 6cba751ce)
- Python 3.12, Linux
