Real-World Example: NEMAR EEG Processing Pipeline
This example demonstrates the complete NEMAR (EEGLAB-based) EEG processing pipeline documented in nemar_pipeline.signalJourney.json. This production pipeline processes OpenNeuro EEG datasets and showcases advanced signalJourney features including inline data preservation, multi-level quality metrics, and extension schema integration.
Pipeline Architecture
The NEMAR pipeline implements a 12-step EEG preprocessing workflow organized into four main processing stages:
- Data Import & Validation (Steps 1-3): BIDS import, status verification, channel location checks
- Channel Selection & Preprocessing (Steps 4-6): Non-EEG removal, DC offset correction, high-pass filtering
- Automated Artifact Rejection (Steps 7-9): Clean Raw Data algorithm (Step 7), ICA decomposition (Step 8), ICLabel classification (Step 9)
- Quality Assessment & Export (Steps 10-12): Data quality metrics, power spectral analysis, dataset export
Processing Flow
The diagram below shows the complete 12-step NEMAR pipeline workflow with color-coded elements:
flowchart TD
A["1.Import BIDS Dataset<br/>'pop_importbids'"] --> B["2.Check Import Status"]
B --> C["3.Check Channel Locations"]
C --> D["4.Remove Non-EEG Channels<br/>'pop_select'"]
D --> E["5.Remove DC Offset<br/>'pop_rmbase'"]
E --> F["6.High-pass Filter<br/>'pop_eegfiltnew'"]
F --> G["7.Clean Raw Data<br/>'clean_rawdata'"]
G --> H["8.Run ICA Decomposition<br/>'runica'"]
H --> I["9.ICLabel Classification<br/>'ICLabel'"]
I --> J["10.Data Quality Assessment<br/>Quality metrics"]
J --> K["11.Power Spectral Analysis<br/>Line noise assessment"]
K --> L["12.Save Processed Dataset<br/>Final output"]
%% Inline Data Outputs
D --> D1["📊 removed_channels<br/>Channel list"]
G --> G1["📊 clean_sample_mask<br/>Time samples"]
G --> G2["📊 clean_channel_mask<br/>Channel mask"]
G --> G3["📊 rejected_channels<br/>Rejected list"]
H --> H1["📊 ica_weights<br/>Weight matrix"]
H --> H2["📊 ica_sphere<br/>Sphere matrix"]
H --> H3["📊 ica_components<br/>Activations"]
I --> I1["📊 ic_classification<br/>Probabilities"]
I --> I2["📊 flagged_components<br/>Artifact indices"]
J --> J1["📊 data_quality_metrics<br/>QC measures"]
K --> K1["📊 power_spectrum<br/>PSD data"]
K --> K2["📊 line_noise_assessment<br/>Noise levels"]
%% Saved File Outputs
J --> J2["💾 dataqual.json<br/>Quality report"]
L --> L1["💾 processed_eeg.set<br/>Final dataset"]
L --> L2["💾 pipeline_status.csv<br/>Step status"]
%% Styling
classDef processStep fill:#e1f5fe,stroke:#01579b,stroke-width:2px
classDef inlineData fill:#f3e5f5,stroke:#4a148c,stroke-width:1px
classDef savedFile fill:#e8f5e8,stroke:#1b5e20,stroke-width:2px
class A,B,C,D,E,F,G,H,I,J,K,L processStep
class D1,G1,G2,G3,H1,H2,H3,I1,I2,J1,K1,K2 inlineData
class J2,L1,L2 savedFile
Legend:
- 🔵 Light Blue: Processing steps (EEGLAB/MATLAB functions)
- 🟣 Purple: Inline data (small parameters/values preserved in JSON)
- 🟢 Green: Saved files (datasets and reports)
Note: In-memory processing objects (raw EEG structures, filtered data) are not shown because they represent the default temporary data flow between steps.
Advanced Features
Inline Data Preservation
Critical intermediate results are preserved using inlineData targets, enabling post-hoc analysis and reproducibility. Key examples include ICA decomposition matrices, component classifications, and quality control masks:
{
"targetType": "inlineData",
"name": "ica_weights",
"data": "{{ica_weights_matrix}}",
"formatDescription": "Matrix of ICA unmixing weights [n_components x n_channels]",
"description": "ICA unmixing weight matrix"
}
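As a minimal sketch of how such targets might be read back for post-hoc analysis, the Python below walks a signalJourney-style document and collects every inlineData target by name. The `processingSteps` and `outputTargets` keys, the helper name `collect_inline_data`, and the toy document are assumptions made for illustration; check the key names against the actual file before relying on them.

```python
import json


def collect_inline_data(journey: dict) -> dict:
    """Return {target name: data} for every inlineData output target.

    Assumes steps live under "processingSteps" and their outputs under
    "outputTargets"; both key names are assumptions in this sketch.
    """
    found = {}
    for step in journey.get("processingSteps", []):
        for target in step.get("outputTargets", []):
            if target.get("targetType") == "inlineData":
                found[target["name"]] = target.get("data")
    return found


# Tiny hand-made document used only to exercise the helper.
example = json.loads("""
{
  "processingSteps": [
    {
      "stepId": "8",
      "outputTargets": [
        {"targetType": "inlineData", "name": "ica_weights",
         "data": [[0.5, -0.1], [0.2, 0.9]]},
        {"targetType": "file", "location": "processed_eeg.set"}
      ]
    }
  ]
}
""")

inline = collect_inline_data(example)
print(sorted(inline))
```

Only the inlineData target is returned; the file target is skipped because its payload lives outside the JSON document.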
Multi-Level Quality Assessment
Quality metrics are computed at both the step level and the pipeline level, so quality can be checked for an individual step as well as for the run as a whole:
Step-level metrics (e.g., Step 7 clean_rawdata):
"qualityMetrics": {
"percentDataRetained": "{{percent_clean_data}}",
"percentChannelsRetained": "{{percent_clean_channels}}",
"channelsRejected": "{{num_rejected_channels}}"
}
Pipeline-level summary:
"summaryMetrics": {
"pipelineCompleted": true,
"totalProcessingSteps": 12,
"overallDataQuality": {
"goodDataPercent": "{{overall_good_data_percent}}",
"goodChannelsPercent": "{{overall_good_channels_percent}}",
"goodICAPercent": "{{overall_good_ica_percent}}"
}
}
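To make the step-level fields concrete, the sketch below derives the three Step 7 values from a pair of boolean retention masks. It assumes that True marks a retained sample or channel, which the mask names suggest but the pipeline file does not confirm here; the function names and toy masks are illustrative only.

```python
def percent_true(mask):
    """Percentage of truthy entries in a sequence (0.0 if it is empty)."""
    mask = list(mask)
    if not mask:
        return 0.0
    return 100.0 * sum(bool(m) for m in mask) / len(mask)


def step_quality_metrics(sample_mask, channel_mask):
    """Build a step-level qualityMetrics block from two retention masks."""
    channel_mask = list(channel_mask)
    return {
        "percentDataRetained": percent_true(sample_mask),
        "percentChannelsRetained": percent_true(channel_mask),
        "channelsRejected": sum(1 for keep in channel_mask if not keep),
    }


# Toy masks: True means the sample or channel was kept (assumed convention).
sample_mask = [True] * 90 + [False] * 10
channel_mask = [True, True, True, False]

metrics = step_quality_metrics(sample_mask, channel_mask)
print(metrics)
```

With these toy masks, 90 of 100 samples and 3 of 4 channels are kept, so one channel is counted as rejected.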
Extension Schema Integration
Domain-specific metadata is captured using the NEMAR extension schema:
"extensions": {
"nemar": {
"dataset_id": "{{openneuro_dataset_id}}",
"processing_cluster": "SDSC Expanse",
"eeglab_plugins": ["clean_rawdata", "ICLabel", "AMICA", "firfilt"],
"custom_code_applied": "{{custom_dataset_code}}",
"batch_processing": true
}
}
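A lightweight sanity check on this block could look like the sketch below. The expected field types are inferred from the example above, not taken from the published NEMAR extension schema, and `check_nemar_extension` is a hypothetical helper; a real deployment would more likely validate against the schema itself.

```python
# Field types inferred from the example block; this is an assumption,
# not the official extension schema.
EXPECTED_NEMAR_FIELDS = {
    "dataset_id": str,
    "processing_cluster": str,
    "eeglab_plugins": list,
    "custom_code_applied": str,
    "batch_processing": bool,
}


def check_nemar_extension(journey: dict) -> list:
    """Return a list of human-readable problems; empty means none found."""
    block = journey.get("extensions", {}).get("nemar")
    if block is None:
        return ["missing extensions.nemar block"]
    problems = []
    for field, expected in EXPECTED_NEMAR_FIELDS.items():
        if field not in block:
            problems.append(f"missing field: {field}")
        elif not isinstance(block[field], expected):
            problems.append(f"{field} should be {expected.__name__}")
    return problems


# Hand-made document with one wrong type and one missing field.
doc = {
    "extensions": {
        "nemar": {
            "dataset_id": "{{openneuro_dataset_id}}",
            "processing_cluster": "SDSC Expanse",
            "eeglab_plugins": ["clean_rawdata", "ICLabel", "AMICA", "firfilt"],
            "batch_processing": "yes",
        }
    }
}

problems = check_nemar_extension(doc)
print(problems)
```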
Conditional Algorithm Selection
Step 8 demonstrates algorithm selection logic with complete parameter documentation for both AMICA and runica ICA methods:
{
"stepId": "8",
"name": "Run ICA Decomposition",
"description": "Perform Independent Component Analysis using either AMICA (if >=5 channels) or extended Infomax ICA",
"software": {
"name": "AMICA/EEGLAB",
"version": "1.7/2023.1",
"functionCall": "runamica17_nsg(EEG, 'batch', 1) OR pop_runica(EEG, 'icatype', 'runica', 'extended', 1)"
},
"parameters": {
"method": "{{ica_method}}",
"amica_options": {"batch": 1},
"runica_options": {
"icatype": "runica",
"concatcond": "on",
"extended": 1,
"lrate": 1e-5,
"maxsteps": 2000
}
}
}
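The selection rule in the step description can be sketched in a few lines of Python. This is an illustration of the documented condition (AMICA when at least 5 channels remain, extended Infomax otherwise), not the pipeline's actual MATLAB code; the function name and the `"amica"` / `"runica"` labels for the `method` value are assumptions.

```python
def select_ica(n_channels: int, min_channels_for_amica: int = 5) -> dict:
    """Pick an ICA method and its options from the remaining channel count."""
    if n_channels < 1:
        raise ValueError("n_channels must be a positive integer")
    if n_channels >= min_channels_for_amica:
        # Mirrors "amica_options" in the step's parameters.
        return {"method": "amica", "options": {"batch": 1}}
    # Mirrors "runica_options" in the step's parameters.
    return {
        "method": "runica",
        "options": {
            "icatype": "runica",
            "concatcond": "on",
            "extended": 1,
            "lrate": 1e-5,
            "maxsteps": 2000,
        },
    }


many = select_ica(64)
few = select_ica(4)
print(many["method"], few["method"])
```

Recording the chosen method next to both option sets, as Step 8 does, lets a reader see which branch ran without losing the parameters of the branch that did not.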
Template Variables and Batch Processing
Template variables (e.g., {{subject}}, {{session}}, {{openneuro_dataset_id}}) enable automated batch processing while maintaining complete parameter documentation. This approach supports systematic processing of multiple datasets with consistent methodology.
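One way such placeholders might be filled in per dataset is sketched below. The `render` helper, the `outputFile` key, and the dataset ID `ds000000` are hypothetical; the sketch only assumes the `{{name}}` syntax shown above and says nothing about how NEMAR itself performs the substitution.

```python
import re

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(node, values):
    """Recursively substitute {{name}} placeholders in a JSON-like structure.

    Raises KeyError if a placeholder has no entry in `values`.
    """
    if isinstance(node, dict):
        return {key: render(val, values) for key, val in node.items()}
    if isinstance(node, list):
        return [render(item, values) for item in node]
    if isinstance(node, str):
        whole = PLACEHOLDER.fullmatch(node)
        if whole:
            # A string that is exactly one placeholder takes the raw value,
            # so numbers and lists keep their JSON type.
            return values[whole.group(1)]
        return PLACEHOLDER.sub(lambda m: str(values[m.group(1)]), node)
    return node


# Hypothetical template fragment and per-run values.
template = {
    "extensions": {"nemar": {"dataset_id": "{{openneuro_dataset_id}}"}},
    "qualityMetrics": {"percentDataRetained": "{{percent_clean_data}}"},
    "outputFile": "sub-{{subject}}_ses-{{session}}_eeg.set",
}
values = {
    "openneuro_dataset_id": "ds000000",
    "percent_clean_data": 92.5,
    "subject": "01",
    "session": "01",
}

rendered = render(template, values)
print(rendered["outputFile"])
```

Failing loudly on an unknown placeholder is a deliberate choice here: in a batch run, a silently unfilled variable would leave the provenance record incomplete.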
Research Applications
This documentation format enables:
- Exact reproduction through complete parameter and dependency documentation
- Quality assessment via comprehensive multi-level metrics
- Method comparison with complete parameter sets and summary metrics
- Regulatory compliance through full audit trails of the processing steps and parameters
- Storage efficiency through inline data preservation of key intermediate results, reducing the need to store a separate dataset for every intermediate processing stage
References
This NEMAR example demonstrates the full potential of signalJourney for documenting complex, production-grade signal processing workflows with complete transparency and reproducibility.