Metadata Handling
biosigIO lets you read, add, and modify metadata such as recording session information, subject details, and other contextual data. Careful metadata handling matters most when research data will be shared or archived.
The Recording object stores various pieces of metadata loaded from the source file or added manually.
Accessing Metadata
When you load data into biosigIO, any available metadata from the source file is automatically imported:
# Load data
rec = Recording.from_file('data.set', importer='eeglab')
# Access all metadata
all_metadata = rec.metadata
print(all_metadata)
# Access specific metadata field
subject = rec.get_metadata('subject')
print(f"Subject: {subject}")
# Check if a metadata field exists
if rec.has_metadata('condition'):
    condition = rec.get_metadata('condition')
    print(f"Condition: {condition}")
You can access the general metadata dictionary directly or use helper methods:
# Access the entire metadata dictionary
all_metadata = rec.metadata
print(all_metadata)
# Get a specific metadata value
subject_id = rec.get_metadata('subject')
fs = rec.get_metadata('sampling_frequency')
print(f"Subject: {subject_id}, Fs: {fs}")
# Set or update a metadata value
rec.set_metadata('task', 'Isometric Contraction')
The specific keys available in rec.metadata depend on the data format and the information present in the source file header. Common keys might include sampling_frequency, subject_id, recording_date, device information, etc.
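Because the available keys vary by format, lookups that tolerate missing or alternately named keys can help. The following is a minimal sketch that uses a plain dict as a stand-in for rec.metadata; the helper first_present and the alias list are hypothetical and not part of biosigIO.

```python
from typing import Any, Iterable, Mapping


def first_present(metadata: Mapping[str, Any], keys: Iterable[str], default: Any = None) -> Any:
    """Return the value of the first key in `keys` that exists in `metadata`."""
    for key in keys:
        if key in metadata:
            return metadata[key]
    return default


# Stand-in for rec.metadata (illustrative values only)
metadata = {"subject": "S001", "srate": 2000}

# Try several possible key names for the sampling rate (assumed aliases)
fs = first_present(metadata, ["sampling_frequency", "srate"])
device = first_present(metadata, ["device"], default="unknown")
print(f"Fs: {fs}, device: {device}")
```

With a loaded recording, the same helper could be pointed at rec.metadata, assuming it behaves like a regular dictionary.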
Channel-Specific Information
Information specific to each channel (like its type, physical dimension/unit, or prefiltering details) is stored within the rec.channels dictionary, keyed by the channel label:
for channel_name, channel_info in rec.channels.items():
    print(f"Channel: {channel_name}")
    print(f" Type: {channel_info.get('channel_type', 'N/A')}")
    print(f" Unit: {channel_info.get('physical_dimension', 'N/A')}")
    print(f" Sampling Freq: {channel_info.get('sample_frequency', 'N/A')}")
    print(f" Prefilter: {channel_info.get('prefilter', 'N/A')}")
# Access info for a specific channel
emg1_info = rec.channels['EMG1']
print(f"EMG1 Unit: {emg1_info['physical_dimension']}")
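Since channel information is a dictionary keyed by label, it can be regrouped with ordinary dictionary operations. The sketch below groups channel labels by type, using a plain dict as a stand-in for rec.channels; the sample entries are illustrative only.

```python
from collections import defaultdict

# Stand-in for rec.channels (illustrative values only)
channels = {
    "EMG1": {"channel_type": "EMG", "physical_dimension": "µV"},
    "EMG2": {"channel_type": "EMG", "physical_dimension": "µV"},
    "ACC1": {"channel_type": "ACC", "physical_dimension": "g"},
}

# Collect channel labels under their channel type
by_type = defaultdict(list)
for label, info in channels.items():
    by_type[info.get("channel_type", "N/A")].append(label)

for channel_type, labels in by_type.items():
    print(f"{channel_type}: {', '.join(labels)}")
```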
Annotations / Events
Time-stamped annotations or events associated with the recording are stored in the rec.events attribute as a pandas DataFrame.
This DataFrame has the following standard columns:
- onset: The start time of the event in seconds, relative to the beginning of the recording.
- duration: The duration of the event in seconds.
- description: A string describing the event.
Loading Annotations:
- EDF/BDF: Annotations stored in the EDF+/BDF+ annotation channel are automatically loaded into rec.events by the EDFImporter.
- WFDB: Annotations stored in a corresponding .atr (or similar) file are automatically loaded into rec.events by the WFDBImporter when loading the .hea file.
- EEGLAB .set: EEGLAB events are loaded into rec.events by the EEGLABImporter; the event latency and duration (in samples) are converted to onset and duration in seconds, and the event type becomes the description.
- Other Formats: For formats that don't have standardized annotation support within the file (like CSV, Trigno, OTB), annotations are typically not loaded automatically. You may need to load them from a separate file and add them manually.
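The sample-to-seconds conversion described for EEGLAB events can be sketched as follows. This illustrates the arithmetic only and is not the importer's actual code: the helper samples_to_seconds and the sample events are hypothetical. EEGLAB latencies are conventionally one-based, and the text does not say whether the importer adjusts for that, so the sketch applies no offset.

```python
def samples_to_seconds(n_samples: float, fs: float) -> float:
    """Convert a sample count to seconds at sampling rate `fs` (Hz)."""
    if fs <= 0:
        raise ValueError("fs must be positive")
    return n_samples / fs


fs = 2000.0  # Hz, illustrative sampling rate

# Hypothetical EEGLAB-style events: latency and duration given in samples
raw_events = [
    {"type": "Marker 1", "latency": 21000, "duration": 10000},
    {"type": "Stimulus", "latency": 31000, "duration": 0},
]

# Map each event onto the onset/duration/description layout
converted = [
    {
        "onset": samples_to_seconds(ev["latency"], fs),
        "duration": samples_to_seconds(ev["duration"], fs),
        "description": str(ev["type"]),
    }
    for ev in raw_events
]
print(converted)
```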
Accessing Annotations:
# Check if any events were loaded
if rec.events is not None and not rec.events.empty:
    print(f"Loaded {len(rec.events)} events.")
    print("First 5 events:")
    print(rec.events.head())
    # Filter events by description (na=False skips rows with a missing description)
    marker_events = rec.events[rec.events['description'].str.contains('Marker', na=False)]
    print("\nMarker Events:")
    print(marker_events)
else:
    print("No events/annotations loaded.")
Adding Annotations Manually:
You can add events programmatically using the rec.add_event() method:
rec.add_event(onset=10.5, duration=5.0, description="Rest Period")
rec.add_event(onset=15.5, duration=0.0, description="Stimulus Onset")
print("\nEvents after adding manually:")
print(rec.events)
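For formats without built-in annotation support, events kept in a separate file can be parsed first and then added one by one. The sketch below assumes a hypothetical CSV layout with onset, duration, and description columns; it only parses the rows, and the final step of passing them to rec.add_event() is shown as a comment because it needs a loaded recording.

```python
import csv
import io

# Hypothetical annotation file contents; a real file would be opened with open(path, newline="")
csv_text = """onset,duration,description
10.5,5.0,Rest Period
15.5,0.0,Stimulus Onset
"""

events = []
for row in csv.DictReader(io.StringIO(csv_text)):
    events.append(
        {
            "onset": float(row["onset"]),        # seconds from start of recording
            "duration": float(row["duration"]),  # seconds
            "description": row["description"],
        }
    )

for event in events:
    # With a loaded recording, each dict could be passed on as rec.add_event(**event)
    print(event)
```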
Exporting Annotations:
When exporting data using rec.to_edf(), the events stored in rec.events are automatically written to the EDF+/BDF+ annotation channel.
Setting Metadata
You can add or modify metadata fields:
# Set a single metadata field
rec.set_metadata('subject', 'S001')
# Set several fields with repeated set_metadata calls
rec.set_metadata('condition', 'resting')
rec.set_metadata('experimenter', 'John Doe')
rec.set_metadata('recording_date', '2023-01-15')
# Or update the metadata dictionary directly
rec.metadata.update({
'subject': 'S001',
'condition': 'resting',
'experimenter': 'John Doe',
'recording_date': '2023-01-15'
})
Common Metadata Fields
While biosigIO is flexible about what metadata you can store, some common fields include:
| Field | Description | Example |
|---|---|---|
| subject | Subject identifier | "S001" |
| session | Session identifier | "1" |
| condition | Experimental condition | "rest" |
| recording_date | Date of recording | "2023-01-15" |
| device | Recording device | "Trigno Wireless" |
| srate | Sampling rate in Hz | 2000 |
Metadata in Exported Files
When exporting to EDF/BDF, biosigIO automatically includes metadata in the file header and generates a sidecar channels.tsv file with channel-specific metadata following BIDS conventions:
# Export to EDF with metadata
rec.to_edf('output')
# This will create:
# - output.edf or output.bdf (depending on format selection)
# - output_channels.tsv (channel metadata in tab-separated format, BIDS naming)
The channels.tsv file will include information like:
name type units sampling_frequency ...
EMG1 EMG µV 2000 ...
EMG2 EMG µV 2000 ...
ACC1 ACC g 2000 ...
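A sidecar in this layout can be read back with the standard library. The sketch below parses tab-separated text that mirrors the example; the in-memory string stands in for a file such as output_channels.tsv, and the actual file may contain more columns than shown here.

```python
import csv
import io

# Illustrative sidecar contents mirroring the example; columns are tab-separated
tsv_text = (
    "name\ttype\tunits\tsampling_frequency\n"
    "EMG1\tEMG\tµV\t2000\n"
    "ACC1\tACC\tg\t2000\n"
)

# Each row becomes a dict keyed by the header names (all values are strings)
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
units_by_channel = {row["name"]: row["units"] for row in rows}
print(units_by_channel)
```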
Copying Metadata Between Recording Objects
When working with multiple Recording objects, you can copy metadata between them:
# Create a subset with only EMG channels
emg_only = rec.select_channels(channel_type='EMG')
# Copy all metadata from original to subset
emg_only.metadata = rec.metadata.copy()
# Or selectively copy metadata
emg_only.set_metadata('subject', rec.get_metadata('subject'))
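Note that dict.copy() produces a shallow copy, so nested values stay shared between the two dictionaries. The sketch below shows the difference using plain dicts. It assumes rec.metadata behaves like a regular dictionary, and the nested device entry is a hypothetical structure.

```python
import copy

# Stand-in for rec.metadata with a nested value (hypothetical structure)
original = {"subject": "S001", "device": {"name": "Trigno Wireless", "gain": 1000}}

shallow = original.copy()       # top-level keys copied, nested dict shared
deep = copy.deepcopy(original)  # nested dict duplicated as well

# Mutating the nested dict through the original also affects the shallow copy
original["device"]["gain"] = 500

print(shallow["device"]["gain"])
print(deep["device"]["gain"])
```

If the metadata holds nested containers, copy.deepcopy is the safer choice when the two recordings should be edited independently.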
Best Practices for Metadata
- Consistency: Establish conventions for metadata fields and stick to them
- Completeness: Include all relevant information about the recording context
- Standardization: Use standard units and nomenclature
- Validation: Verify metadata accuracy before export
- Documentation: Document your metadata structure for collaborators
By maintaining good metadata practices, you ensure that your EMG data remains interpretable and useful for future analysis.
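The validation point above can be sketched as a small pre-export check. This is a minimal example operating on a plain dict: the required-field list and the validate_metadata helper are hypothetical conventions, not something biosigIO enforces.

```python
from datetime import date

REQUIRED_FIELDS = ("subject", "condition", "recording_date")  # hypothetical convention


def validate_metadata(metadata: dict) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in metadata]
    recording_date = metadata.get("recording_date")
    if recording_date is not None:
        try:
            # Accepts ISO dates such as 2023-01-15
            date.fromisoformat(str(recording_date))
        except ValueError:
            problems.append(f"recording_date is not YYYY-MM-DD: {recording_date!r}")
    return problems


good = {"subject": "S001", "condition": "resting", "recording_date": "2023-01-15"}
bad = {"subject": "S001", "recording_date": "15/01/2023"}

print(validate_metadata(good))
print(validate_metadata(bad))
```

Such a check could be run on rec.metadata before calling rec.to_edf(), with the field list adapted to your own conventions.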