concord-io

File reading, re-referencing, and data slicing — the only package that touches files

Role in the system

concord-io is the gateway from disk to memory. It is the only package that knows about file formats. Everything it produces is a Recording container; it never returns raw file handles or format-specific objects.

Package Structure

concord-io/
  src/concord_io/
    __init__.py       exports read_edf, read_bids_ieeg, to_bipolar, to_car,
                              select_channels, slice_time, scan_datasets,
                              scan_subjects, scan_sessions, scan_recordings
    reader.py         EDF reader, BIDS sidecar parsing
    montage.py        re-referencing operations
    epoch.py          channel selection and time slicing
    bids.py           BIDS dataset scanning and discovery

reader.py

read_edf

read_edf(path, channel_names=None) → Recording

Reads a raw EDF file using MNE's EDF reader internally. Data is returned in volts: MNE converts from whatever physical unit the EDF stores (typically µV).

Parameter	Type	Description
path	str | Path	Path to the EDF file.
channel_names	list[str] | None	If given, only these channels are loaded (in the specified order). Names must match EDF header labels exactly after stripping whitespace. If None, all signals are loaded.

Returns a Recording with montage="monopolar". Events are taken from EDF-embedded annotations.
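The name-matching rule for channel_names can be illustrated with a small sketch. This is not the package's code: resolve_channel_indices is a hypothetical helper, and both the error on unknown names and the handling of duplicate labels are assumptions.

```python
def resolve_channel_indices(header_labels, channel_names=None):
    """Map requested channel names to indices into the EDF signal list.

    Hypothetical helper: header labels are stripped of surrounding
    whitespace before comparison, and the result follows the order of
    `channel_names`, not the order in the file.
    """
    stripped = [label.strip() for label in header_labels]
    if channel_names is None:
        # No filter: keep every signal in file order.
        return list(range(len(stripped)))

    index_of = {}
    for i, label in enumerate(stripped):
        # Assumption: keep the first occurrence if a label is duplicated.
        index_of.setdefault(label, i)

    missing = [name for name in channel_names if name not in index_of]
    if missing:
        # Assumption: unknown names are an error rather than silently skipped.
        raise KeyError(f"channels not found in EDF header: {missing}")
    return [index_of[name] for name in channel_names]
```

EDF header labels are fixed-width fields padded with spaces, which is why stripping is needed before an exact comparison.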

read_bids_ieeg

read_bids_ieeg(edf_path, load_subject_info=True) → Recording

Full BIDS iEEG loader. Calls read_edf() first, then enriches the Recording by parsing all BIDS sidecar files alongside the EDF.

Sidecar file	What is parsed	Where it goes in Recording
*_channels.tsv	Per-channel status, type, units, etc.	channel_metadata[ch_name]
*_events.tsv	Seizure events, trial types, onset/duration	events (replaces EDF annotations)
*_electrodes.tsv	MNI x/y/z coordinates per contact	merged into channel_metadata[ch_name]
*_ieeg.json	Recording-level metadata (task, institution, etc.)	metadata["ieeg"]
participants.tsv	Subject demographics (age, sex, outcome, etc.)	subject_metadata

The function uses the BIDS naming convention to find sidecar files: if the EDF is sub-X_run-01_ieeg.edf, it looks for sub-X_run-01_channels.tsv, etc. in the same directory.
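The naming rule above amounts to swapping the trailing _ieeg suffix for the sidecar suffix. A minimal sketch, assuming the EDF filename always ends in _ieeg; sidecar_path is a hypothetical helper, not part of the package API.

```python
from pathlib import Path


def sidecar_path(edf_path, suffix):
    """Return the expected sidecar path next to a BIDS iEEG EDF file.

    `suffix` is e.g. "channels.tsv", "events.tsv", or "ieeg.json".
    """
    edf_path = Path(edf_path)
    stem = edf_path.stem                      # "sub-X_run-01_ieeg"
    if not stem.endswith("_ieeg"):
        raise ValueError(f"not a BIDS iEEG data file: {edf_path.name}")
    prefix = stem[: -len("_ieeg")]            # "sub-X_run-01"
    return edf_path.with_name(f"{prefix}_{suffix}")
```

In many BIDS datasets the *_electrodes.tsv file omits the task and run entities, so a real loader may need a fallback search; the sketch does not cover that case.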

For participants.tsv, it walks up the directory tree (up to 5 levels) until it finds the file.
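The upward search can be sketched as below. How the five levels are counted is an assumption here (the EDF's own directory counts as the first), and find_participants_tsv is a hypothetical name.

```python
from pathlib import Path


def find_participants_tsv(edf_path, max_levels=5):
    """Search the directories above `edf_path` for participants.tsv.

    Checks at most `max_levels` directories, starting with the one that
    contains the EDF. Returns the path if found, else None.
    """
    directory = Path(edf_path).resolve().parent
    for _ in range(max_levels):
        candidate = directory / "participants.tsv"
        if candidate.is_file():
            return candidate
        if directory.parent == directory:
            break                     # reached the filesystem root
        directory = directory.parent
    return None
```

With the layout dataset/sub-*/ses-*/ieeg/, the dataset root is the fourth directory checked, so a limit of five leaves one level of slack.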

Private helpers

Function	Purpose
_detect_encoding(path)	Reads first 4 bytes to detect BOM; supports UTF-8/16/32. Needed because some BIDS tools write UTF-16 TSV files on Windows.
_read_channels_tsv(path)	Parses TSV into {channel_name: {field: value}}. Converts "n/a" strings to None.
_read_events_tsv(path)	Parses TSV into list of (onset, duration, trial_type) tuples.
_read_electrodes_tsv(path)	Parses TSV into {contact_name: {x, y, z, ...}}.
_load_subject_metadata(edf_path)	Extracts subject ID from path (regex on "sub-*"), walks up to find participants.tsv.
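To illustrate the first two helpers, here is a sketch of BOM-based encoding detection followed by TSV parsing. It operates on bytes rather than a path so it stays self-contained; the function names and the fallback to plain UTF-8 when no BOM is present are assumptions, not the package's actual behaviour.

```python
import codecs
import csv
import io

# Longest BOMs first: the UTF-32 LE BOM begins with the UTF-16 LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def detect_encoding(head):
    """Pick a codec name from the first four bytes of a file."""
    for bom, encoding in _BOMS:
        if head.startswith(bom):
            return encoding
    return "utf-8"          # assumption: no BOM means plain UTF-8


def parse_channels_tsv(raw):
    """Parse TSV bytes into {channel_name: {field: value}}, "n/a" -> None."""
    text = raw.decode(detect_encoding(raw[:4]))
    reader = csv.DictReader(io.StringIO(text, newline=""), delimiter="\t")
    table = {}
    for row in reader:
        name = row.pop("name")   # BIDS channels.tsv keys rows by "name"
        table[name] = {k: (None if v == "n/a" else v) for k, v in row.items()}
    return table
```

The "utf-16", "utf-32", and "utf-8-sig" codecs all consume the BOM while decoding, so the first column header comes out clean.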

montage.py

to_bipolar

to_bipolar(recording, pairs=None) → Recording

Creates a bipolar montage by computing differences between adjacent contacts. Returns a new Recording with montage="bipolar" and populated anode_names / cathode_names.

Auto-detection: If pairs is None, the function groups channel names by electrode prefix (the alphabetic part of names like "SEEG1", "SEEG2") and creates consecutive pairs: SEEG1−SEEG2, SEEG2−SEEG3, etc. This is the standard clinical SEEG bipolar reference.

Explicit pairs: Pass a list of (anode, cathode) tuples to override auto-detection.

The result channel name is formatted as "SEEG1-SEEG2".
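The auto-detection step can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: names that do not look like letters followed by digits are skipped, and contacts are only paired when their numbers differ by exactly one. Both function names are hypothetical.

```python
import re
from collections import defaultdict

_CONTACT = re.compile(r"^([A-Za-z]+)(\d+)$")


def auto_bipolar_pairs(channel_names):
    """Build (anode, cathode) pairs of neighbouring contacts per electrode."""
    shafts = defaultdict(list)
    for name in channel_names:
        match = _CONTACT.match(name)
        if match is None:
            continue              # assumption: unparseable names are skipped
        prefix, number = match.group(1), int(match.group(2))
        shafts[prefix].append((number, name))

    pairs = []
    for contacts in shafts.values():
        contacts.sort()           # order contacts by their numeric index
        for (n1, first), (n2, second) in zip(contacts, contacts[1:]):
            if n2 == n1 + 1:      # assumption: do not bridge a missing contact
                pairs.append((first, second))
    return pairs


def bipolar_name(anode, cathode):
    """Format the derived channel name, e.g. "SEEG1-SEEG2"."""
    return f"{anode}-{cathode}"
```

Sorting by the parsed integer rather than by the string keeps contact 10 after contact 9 instead of after contact 1.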

to_car

to_car(recording) → Recording

Common average reference. Subtracts the mean signal across all channels from each channel at every sample: data_car = data - mean(data, axis=0), where axis 0 is the channel axis. Returns a new Recording with montage="car".

Channel names are unchanged. The result has the same shape as the input.
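The formula can be checked numerically with NumPy. This sketch assumes data is shaped (n_channels, n_samples), which is what axis=0 implies; common_average_reference is a hypothetical stand-in for the array operation inside to_car, and the signals are synthetic.

```python
import numpy as np


def common_average_reference(data):
    """Subtract the across-channel mean from every channel.

    `data` is assumed to have shape (n_channels, n_samples).
    """
    data = np.asarray(data, dtype=float)
    return data - data.mean(axis=0, keepdims=True)


rng = np.random.default_rng(0)
local = rng.normal(size=(4, 1000))               # independent per-channel activity
shared = 5.0 * np.sin(np.linspace(0, 20, 1000))  # artifact present on every channel
car = common_average_reference(local + shared)
```

Because the shared term is identical on every channel, it is removed completely, and the result equals the CAR of the local activity alone. The output keeps the input shape.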

Why Different Montages?

Montage	Equation	Effect	Typical use
Monopolar	raw signal	Includes all common-mode noise; reference electrode determines baseline	Raw inspection, source localization
Bipolar	V(ch_n) − V(ch_{n+1})	Cancels far-field activity; highlights local activity between adjacent contacts	Seizure onset zone localization (clinical standard)
CAR	V(ch) − mean(all V)	Reduces widespread artifact; assumes activity is spatially sparse	Resting state, network analysis

epoch.py

select_channels

select_channels(recording, names=None, pattern=None) → Recording

Returns a new Recording containing only the specified channels.

Parameter	Type	Description
names	list[str] | None	Explicit list of channel names to include.
pattern	str | None	Regex pattern matched against channel names. E.g. `r"^SEEG\d+$"`.
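A sketch of how the two filters might combine. The source does not say what happens when both are given, whether matching uses search or full-match semantics, or which order the result takes. Here the filters intersect, re.search is used, and the recording's original channel order is kept; all three are assumptions, and pick_channel_names is a hypothetical helper.

```python
import re


def pick_channel_names(all_names, names=None, pattern=None):
    """Return the channel names to keep, in the recording's original order."""
    keep = set(all_names) if names is None else set(names)
    unknown = keep - set(all_names)
    if unknown:
        # Assumption: requesting a channel that does not exist is an error.
        raise KeyError(f"unknown channels: {sorted(unknown)}")
    regex = re.compile(pattern) if pattern is not None else None
    return [
        ch for ch in all_names
        if ch in keep and (regex is None or regex.search(ch))
    ]
```

Anchoring the pattern with ^ and $, as in the example above, makes the search behave like a full match.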

slice_time

slice_time(recording, t_start, t_end) → Recording

Extracts a time window from a Recording. Returns a new Recording covering only the specified range. start_time on the result is set to t_start.
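Slicing by time comes down to converting seconds into sample indices. A minimal sketch, assuming a half-open window, rounding to the nearest sample, and clamping to the available data; time_window_to_samples and its parameters are hypothetical.

```python
def time_window_to_samples(t_start, t_end, fs, n_samples, start_time=0.0):
    """Convert a [t_start, t_end) window in seconds to sample indices.

    `start_time` is the time of the recording's first sample, so windows
    can be taken from a Recording that is itself a slice.
    """
    if t_end <= t_start:
        raise ValueError("t_end must be greater than t_start")
    first = round((t_start - start_time) * fs)
    last = round((t_end - start_time) * fs)
    # Assumption: clamp to the available data instead of raising.
    first = max(0, min(first, n_samples))
    last = max(0, min(last, n_samples))
    return first, last
```

At fs = 1024 Hz, a window from 5 s to 15 s maps to samples 5120 through 15359, which is 10240 samples or exactly 10 seconds.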

bids.py — Dataset Scanning

Functions for discovering and navigating BIDS dataset structures without loading any data. Used by the server's BIDS browser to let users navigate datasets, subjects, sessions, and recordings.

scan_datasets(root) → list[dict]

Scans a directory for BIDS datasets — subdirectories containing dataset_description.json.

Returns: list of dicts with keys: dataset_id, name, path, description, bids_version, n_subjects.
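The discovery step might look like the sketch below. It is illustrative only: taking dataset_id from the folder name and counting sub-* directories for n_subjects are assumptions, and the description key is omitted because its source is not stated. Name and BIDSVersion are standard fields of dataset_description.json.

```python
import json
from pathlib import Path


def scan_datasets_sketch(root):
    """List immediate subdirectories of `root` that look like BIDS datasets."""
    datasets = []
    for directory in sorted(Path(root).iterdir()):
        description_file = directory / "dataset_description.json"
        if not description_file.is_file():
            continue              # not a BIDS dataset root
        description = json.loads(description_file.read_text(encoding="utf-8"))
        n_subjects = sum(
            1 for p in directory.iterdir()
            if p.is_dir() and p.name.startswith("sub-")
        )
        datasets.append({
            "dataset_id": directory.name,    # assumption: folder name as ID
            "name": description.get("Name"),
            "path": str(directory),
            "bids_version": description.get("BIDSVersion"),
            "n_subjects": n_subjects,
        })
    return datasets
```

Only the small JSON file is read for each dataset, which keeps the scan cheap.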

scan_subjects(dataset_path) → dict

Scans a BIDS dataset for subjects. Parses participants.tsv and participants.json (field descriptions). Tags each subject with downloaded: bool based on whether actual data files exist.

Returns: dict with keys: fields (from participants.json) and subjects (list of participant rows).

scan_sessions(dataset_path, subject_id) → list[dict]

Lists sessions (ses-* directories) for a subject. Detects available modalities (ieeg, eeg, meg, anat, etc.) per session.

Returns: list of dicts with keys: session_id, path, modalities.

scan_recordings(dataset_path, subject_id, session_id=None) → list[dict]

Lists recording files within a subject/session directory. Discovers recordings from both actual data files and JSON sidecars (marking undownloaded files as downloaded: false). Parses BIDS entities (task, run, acquisition, modality) from filenames.

Returns: list of dicts with keys: filename, path, task, run, acquisition, modality, has_events, has_channels, downloaded.
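Entity parsing relies on the BIDS filename layout: underscore-separated key-value pairs followed by a suffix. A sketch, assuming the modality is taken from the filename suffix (it could equally come from the directory name); parse_bids_entities is a hypothetical helper and the filename used to exercise it is made up.

```python
import re

_ENTITY = re.compile(r"([A-Za-z0-9]+)-([A-Za-z0-9]+)")


def parse_bids_entities(filename):
    """Extract task, run, acquisition, and modality from a BIDS filename."""
    stem = filename.split(".", 1)[0]          # drop ".edf", ".json", ".tsv.gz", ...
    *parts, suffix = stem.split("_")          # last chunk is the BIDS suffix
    entities = {}
    for part in parts:
        match = _ENTITY.fullmatch(part)
        if match:
            entities[match.group(1)] = match.group(2)
    return {
        "task": entities.get("task"),
        "run": entities.get("run"),           # kept as a string, e.g. "01"
        "acquisition": entities.get("acq"),   # BIDS abbreviates this as "acq"
        "modality": suffix,                   # assumption: suffix names the modality
    }
```

Because the data file and its JSON sidecar share the same entities, a recording can be listed from the sidecar alone.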

Example Usage

from concord_io import read_bids_ieeg, to_bipolar, slice_time

# Load full recording from BIDS dataset
rec = read_bids_ieeg("ds004100/sub-HUP117/ses-presurgery/ieeg/sub-HUP117_run-01_ieeg.edf")
print(rec.n_channels, rec.duration, rec.fs)   # e.g. 72  300.0  1024.0

# Re-reference to bipolar
rec_bi = to_bipolar(rec)
print(rec_bi.montage)                          # "bipolar"
print(rec_bi.channel_names[:3])               # ["SEEG1-SEEG2", "SEEG2-SEEG3", ...]

# Extract 10 seconds around a seizure
onset = rec.events[0][0]                       # first event onset time
window = slice_time(rec_bi, onset - 5, onset + 5)
print(window.duration)                         # 10.0

BIDS Dataset: HUP117

The primary test dataset is:

Dataset: HUP iEEG (OpenNeuro ds004100)
Subject: sub-HUP117 — SEEG, temporal lobe, lesional, Engel 1A outcome
Path: concord-data/datasets/ds004100/sub-HUP117/ses-presurgery/ieeg/
Contents: 3 ictal EDF runs, 2 interictal EDF runs, ~72 channels, ~56 with 3D coordinates
SOZ contacts: OFAL1-3, STG1-4 (labeled in channels.tsv as status_description="soz")