EDF/BDF Exporter

The EDF/BDF exporter module in EMGIO exports EMG data to EDF (European Data Format) or BDF (BioSemi Data Format) files, together with a companion channels.tsv metadata file.
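
A minimal usage sketch is shown below. It assumes an existing recording file and placeholder file names, and exports through the EMG.to_edf convenience method, which delegates to this exporter.

from emgio.core.emg import EMG

# Load a recording (importer inferred from the file extension) and export it.
# 'recording.csv' and 'output.edf' are placeholder paths.
emg = EMG.from_file('recording.csv')
emg.to_edf('output.edf', format='auto')  # 'auto' picks EDF or BDF from signal analysis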

Module Documentation

emgio.exporters.edf

EDFExporter

Exporter for EDF format with channels.tsv generation.

Source code in emgio/exporters/edf.py
class EDFExporter:
    """Exporter for EDF format with channels.tsv generation."""

    @staticmethod
    def export(emg: EMG, filepath: str, precision_threshold: float = 0.01,
               method: str = 'both', fft_noise_range: tuple = None,
               svd_rank: int = None, format: Literal['auto', 'edf', 'bdf'] = 'auto',
               bypass_analysis: bool = False,
               events_df: Optional[pd.DataFrame] = None,
               **kwargs) -> str:
        """
        Export EMG data to EDF/BDF format with corresponding channels.tsv file.

        Args:
            emg: EMG object containing the data
            filepath: Path to save the EDF/BDF file
            precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
            method: Method for signal analysis ('svd', 'fft', or 'both')
                'svd': Uses Singular Value Decomposition for noise floor estimation
                'fft': Uses Fast Fourier Transform for noise floor estimation
                'both': Uses both methods and takes the minimum noise floor (default)
            fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
            svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
            format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'.
                    If 'edf' or 'bdf' is specified, that format will be used directly.
                    If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen based
                    on signal analysis to minimize precision loss while preferring EDF
                    if sufficient.
            bypass_analysis: If True, skip the signal analysis step. Requires format
                             to be explicitly set to 'edf' or 'bdf'. (default: False)
            events_df: Optional DataFrame containing events/annotations to write.
                     Columns should include 'onset', 'duration', 'description'.
                     If None or empty, no annotations are written.
            **kwargs: Additional arguments for the exporter
        """
        if emg.signals is None:
            raise ValueError("No signals to export")

        print("\nSignal Analysis:")
        print("--------------")

        # Initialize format decision variables
        use_bdf = False
        bdf_reason = ""
        format_decision_made = False

        # --- Format Decision and Bypass Check ---
        if bypass_analysis and format.lower() == 'auto':
            raise ValueError("Cannot bypass analysis when format is set to 'auto'.")

        if format.lower() == 'bdf':
            use_bdf = True
            format_decision_made = True
            if not bypass_analysis:
                print("\nUser specified BDF format (24-bit).")
            else:
                # Log critical only if bypassing, already logged in EMG.to_edf
                pass  # logging.log(logging.CRITICAL, "Skipping analysis, using specified BDF format.")
        elif format.lower() == 'edf':
            use_bdf = False
            format_decision_made = True
            if not bypass_analysis:
                print("\nUser specified EDF format (16-bit).")
            else:
                # Log critical only if bypassing, already logged in EMG.to_edf
                pass  # logging.log(logging.CRITICAL, "Skipping analysis, using specified EDF format.")
        elif format.lower() != 'auto':
            warnings.warn(f"Unknown format: {format}. Valid options are 'auto', 'edf', or 'bdf'. Using 'auto'.")
            format = 'auto'  # Default to auto if invalid format given
            bypass_analysis = False  # Cannot bypass if format is auto

        signal_analyses = {}
        signal_info_strings = []

        # --- Conditional Signal Analysis ---
        if not bypass_analysis:
            # Analyze signals (needed for summary and potentially for 'auto' format decision)
            for ch_name in emg.channels:
                signal = emg.signals[ch_name].values
                ch_info = emg.channels[ch_name]

                # Analyze signal characteristics
                analysis = analyze_signal(signal, method=method,
                                          fft_noise_range=fft_noise_range,
                                          svd_rank=svd_rank)
                recommend_bdf, reason, snr = determine_format_suitability(signal, analysis)
                analysis['snr'] = snr
                analysis['recommend_bdf'] = recommend_bdf
                analysis['reason'] = reason
                signal_analyses[ch_name] = analysis  # Store analysis for later summary

                # If format is 'auto', check if any channel recommends BDF
                if format == 'auto' and recommend_bdf:
                    use_bdf = True  # Switch to BDF if any channel needs it
                    if not bdf_reason:  # Capture the first reason
                        bdf_reason = f"Channel '{ch_name}': {reason}"

                # Prepare info string for printing later
                signal_info_strings.append(
                    f"\n  {ch_name}:"
                    f"\n    Range: {analysis['range']:.8g} {ch_info['physical_dimension']}"
                    f"\n    Dynamic Range: {analysis['dynamic_range_db']:.1f} dB"
                    f"\n    Noise Floor: {analysis['noise_floor']:.2e} {ch_info['physical_dimension']}"
                    f"\n    SNR: {snr:.1f} dB"
                    f"\n    Method: {analysis.get('method', 'svd')}"
                    f"\n    Recommended Format: {'BDF' if recommend_bdf else 'EDF'} ({reason})"
                )

            # Print analysis details after deciding the format
            for info_str in signal_info_strings:
                print(info_str)

            # Final format decision message for 'auto' mode
            if format == 'auto':
                if use_bdf:
                    print("\nUsing BDF format (24-bit) based on signal analysis to preserve precision.")
                    print(f"Reason: {bdf_reason}")
                    warnings.warn(f"Using BDF format based on signal analysis. Reason: {bdf_reason}")
                else:
                    print("\nUsing EDF format (16-bit) based on signal analysis (precision within acceptable range).")
        # else: # bypass_analysis is True - logging handled in EMG.to_edf
        #     pass # logging.log(logging.CRITICAL, "Signal analysis bypassed.")

        # Set file format and create writer
        channels_tsv_data = {
            'name': [], 'channel_type': [], 'physical_dimension': [],
            'sample_frequency': [], 'reference': [], 'status': []
        }
        channel_info_list = []

        if use_bdf:
            filepath = os.path.splitext(filepath)[0] + '.bdf'
            filetype = pyedflib.FILETYPE_BDFPLUS
        else:
            filepath = os.path.splitext(filepath)[0] + '.edf'
            filetype = pyedflib.FILETYPE_EDFPLUS

        writer = pyedflib.EdfWriter(filepath, len(emg.channels), file_type=filetype)

        try:
            # Prepare channel information and signals for writing
            signals_to_write = []
            for i, ch_name in enumerate(emg.channels):
                signal = emg.signals[ch_name].values
                ch_info = emg.channels[ch_name]
                # No need for full analysis result for scaling factors anymore

                # Get signal min/max for scaling factor calculation
                signal_min = float(np.min(signal))
                signal_max = float(np.max(signal))

                # Calculate scaling factors for header based on the chosen format (use_bdf)
                phys_min, phys_max, dig_min, dig_max, scale_factor = _determine_scaling_factors(
                    signal_min, signal_max, use_bdf=use_bdf
                )

                # Prepare the signal data (handle NaNs, but DO NOT pre-scale)
                physical_signal = signal.copy()
                physical_signal = np.nan_to_num(physical_signal, nan=0.0)
                signals_to_write.append(physical_signal)

                # Prepare channel header dictionary
                ch_dict = {
                    'label': ch_name[:16],  # EDF+ limits label to 16 chars
                    'dimension': ch_info['physical_dimension'],
                    'sample_frequency': int(ch_info['sample_frequency']),
                    'physical_max': phys_max,
                    'physical_min': phys_min,
                    'digital_max': dig_max,
                    'digital_min': dig_min,
                    'prefilter': ch_info['prefilter'],
                    'transducer': f"{ch_info.get('channel_type', 'Unknown')} sensor"  # Use get for safety
                }
                channel_info_list.append(ch_dict)

                # Add to channels.tsv data
                channels_tsv_data['name'].append(ch_name)
                channels_tsv_data['channel_type'].append(ch_info.get('channel_type', 'Unknown'))
                channels_tsv_data['physical_dimension'].append(ch_info['physical_dimension'])
                channels_tsv_data['sample_frequency'].append(ch_info['sample_frequency'])
                channels_tsv_data['reference'].append('n/a')  # Assuming no specific reference info
                channels_tsv_data['status'].append('good')  # Assuming good status

            # Set headers and write data (pass physical signals)
            writer.setSignalHeaders(channel_info_list)
            writer.writeSamples(np.array(signals_to_write))

            # Write annotations if provided
            if events_df is not None and not events_df.empty:
                for index, row in events_df.iterrows():
                    try:
                        # pyedflib uses onset, duration, description
                        onset = float(row['onset'])
                        duration = float(row['duration'])
                        description = str(row['description'])
                        # Write annotation for all channels (-1)
                        writer.writeAnnotation(onset, duration, description)
                    except KeyError as e:
                        warnings.warn(f"Skipping event due to missing column: {e}. Event data: {row}")
                    except (TypeError, ValueError) as e:
                        warnings.warn(f"Skipping event due to invalid data type: {e}. Event data: {row}")

            # Explicitly flush and close the writer to ensure all data is written
            writer.close()

            # Wait a moment to ensure file system operations are complete
            import time
            time.sleep(0.1)

            # Verify the file exists and has the correct size
            if not os.path.exists(filepath):
                raise IOError(f"File {filepath} was not created")

            file_size = os.path.getsize(filepath)
            if file_size == 0:
                raise IOError(f"File {filepath} was created but is empty")

            # Generate channels.tsv file using stored analyses
            channels_tsv_path = os.path.splitext(filepath)[0] + '_channels.tsv'
            pd.DataFrame(channels_tsv_data).to_csv(channels_tsv_path, sep='\t', index=False)
            print(f"\nChannels metadata saved to: {channels_tsv_path}")

            # Print summary using stored analyses, only if analysis was performed
            if not bypass_analysis:
                # We need to adapt summarize_channels call slightly or assume it uses the analyses dict
                # Let's refine the analyses dict passed to summarize_channels
                summary_analyses = {}
                for ch_name, analysis in signal_analyses.items():
                    summary_analyses[ch_name] = {
                        'range': analysis['range'],
                        'dynamic_range_db': analysis['dynamic_range_db'],
                        'snr_db': analysis['snr'],
                        'use_bdf': use_bdf  # Use the final decision for the whole file
                    }

                summary = summarize_channels(emg.channels, emg.signals, summary_analyses)
                print("\nSummary:")
                print(summary)
            else:
                print("\nSummary skipped as signal analysis was bypassed.")

            print(f"\nEMG data exported to: {filepath}")
            return filepath
        except Exception as e:
            # Clean up if there was an error
            if 'writer' in locals() and hasattr(writer, 'close') and callable(writer.close):
                try:
                    # Check if file is open before closing
                    if not writer.header['file_handle'].closed:
                        writer.close()
                except Exception:
                    pass  # Ignore errors during cleanup

            # Wait a moment before trying to delete the file
            import time
            time.sleep(0.1)

            if 'filepath' in locals() and os.path.exists(filepath):
                try:
                    os.unlink(filepath)
                    print(f"Cleaned up partially written file: {filepath}")
                except Exception as unlink_e:
                    print(f"Error during cleanup of {filepath}: {unlink_e}")

            raise e
        finally:
            if writer is not None:
                writer.close()  # Ensure writer is closed

export(emg, filepath, precision_threshold=0.01, method='both', fft_noise_range=None, svd_rank=None, format='auto', bypass_analysis=False, events_df=None, **kwargs) staticmethod

Export EMG data to EDF/BDF format with corresponding channels.tsv file.

Args:
    emg: EMG object containing the data
    filepath: Path to save the EDF/BDF file
    precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
    method: Method for signal analysis ('svd', 'fft', or 'both')
        'svd': Uses Singular Value Decomposition for noise floor estimation
        'fft': Uses Fast Fourier Transform for noise floor estimation
        'both': Uses both methods and takes the minimum noise floor (default)
    fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
    svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
    format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'. If 'edf' or 'bdf' is specified,
        that format will be used directly. If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen
        based on signal analysis to minimize precision loss while preferring EDF if sufficient.
    bypass_analysis: If True, skip the signal analysis step. Requires format to be explicitly set to
        'edf' or 'bdf'. (default: False)
    events_df: Optional DataFrame containing events/annotations to write. Columns should include
        'onset', 'duration', 'description'. If None or empty, no annotations are written.
    **kwargs: Additional arguments for the exporter
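
A hedged sketch of calling the exporter directly, including an events table; the file paths and event contents are illustrative:

import pandas as pd
from emgio.core.emg import EMG
from emgio.exporters.edf import EDFExporter

emg = EMG.from_file('session1.edf')  # placeholder input file

# Optional annotations: onset/duration in seconds, free-text description
events = pd.DataFrame({
    'onset': [1.0, 5.5],
    'duration': [0.0, 2.0],
    'description': ['stimulus', 'movement'],
})

# 'auto' analyses each channel and prefers EDF (16-bit) unless precision requires BDF (24-bit)
EDFExporter.export(emg, 'session1_export.edf', format='auto', events_df=events)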

Source code in emgio/exporters/edf.py
@staticmethod
def export(emg: EMG, filepath: str, precision_threshold: float = 0.01,
           method: str = 'both', fft_noise_range: tuple = None,
           svd_rank: int = None, format: Literal['auto', 'edf', 'bdf'] = 'auto',
           bypass_analysis: bool = False,
           events_df: Optional[pd.DataFrame] = None,
           **kwargs) -> str:
    """
    Export EMG data to EDF/BDF format with corresponding channels.tsv file.

    Args:
        emg: EMG object containing the data
        filepath: Path to save the EDF/BDF file
        precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
        method: Method for signal analysis ('svd', 'fft', or 'both')
            'svd': Uses Singular Value Decomposition for noise floor estimation
            'fft': Uses Fast Fourier Transform for noise floor estimation
            'both': Uses both methods and takes the minimum noise floor (default)
        fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
        svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
        format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'.
                If 'edf' or 'bdf' is specified, that format will be used directly.
                If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen based
                on signal analysis to minimize precision loss while preferring EDF
                if sufficient.
        bypass_analysis: If True, skip the signal analysis step. Requires format
                         to be explicitly set to 'edf' or 'bdf'. (default: False)
        events_df: Optional DataFrame containing events/annotations to write.
                 Columns should include 'onset', 'duration', 'description'.
                 If None or empty, no annotations are written.
        **kwargs: Additional arguments for the exporter
    """
    if emg.signals is None:
        raise ValueError("No signals to export")

    print("\nSignal Analysis:")
    print("--------------")

    # Initialize format decision variables
    use_bdf = False
    bdf_reason = ""
    format_decision_made = False

    # --- Format Decision and Bypass Check ---
    if bypass_analysis and format.lower() == 'auto':
        raise ValueError("Cannot bypass analysis when format is set to 'auto'.")

    if format.lower() == 'bdf':
        use_bdf = True
        format_decision_made = True
        if not bypass_analysis:
            print("\nUser specified BDF format (24-bit).")
        else:
            # Log critical only if bypassing, already logged in EMG.to_edf
            pass  # logging.log(logging.CRITICAL, "Skipping analysis, using specified BDF format.")
    elif format.lower() == 'edf':
        use_bdf = False
        format_decision_made = True
        if not bypass_analysis:
            print("\nUser specified EDF format (16-bit).")
        else:
            # Log critical only if bypassing, already logged in EMG.to_edf
            pass  # logging.log(logging.CRITICAL, "Skipping analysis, using specified EDF format.")
    elif format.lower() != 'auto':
        warnings.warn(f"Unknown format: {format}. Valid options are 'auto', 'edf', or 'bdf'. Using 'auto'.")
        format = 'auto'  # Default to auto if invalid format given
        bypass_analysis = False  # Cannot bypass if format is auto

    signal_analyses = {}
    signal_info_strings = []

    # --- Conditional Signal Analysis ---
    if not bypass_analysis:
        # Analyze signals (needed for summary and potentially for 'auto' format decision)
        for ch_name in emg.channels:
            signal = emg.signals[ch_name].values
            ch_info = emg.channels[ch_name]

            # Analyze signal characteristics
            analysis = analyze_signal(signal, method=method,
                                      fft_noise_range=fft_noise_range,
                                      svd_rank=svd_rank)
            recommend_bdf, reason, snr = determine_format_suitability(signal, analysis)
            analysis['snr'] = snr
            analysis['recommend_bdf'] = recommend_bdf
            analysis['reason'] = reason
            signal_analyses[ch_name] = analysis  # Store analysis for later summary

            # If format is 'auto', check if any channel recommends BDF
            if format == 'auto' and recommend_bdf:
                use_bdf = True  # Switch to BDF if any channel needs it
                if not bdf_reason:  # Capture the first reason
                    bdf_reason = f"Channel '{ch_name}': {reason}"

            # Prepare info string for printing later
            signal_info_strings.append(
                f"\n  {ch_name}:"
                f"\n    Range: {analysis['range']:.8g} {ch_info['physical_dimension']}"
                f"\n    Dynamic Range: {analysis['dynamic_range_db']:.1f} dB"
                f"\n    Noise Floor: {analysis['noise_floor']:.2e} {ch_info['physical_dimension']}"
                f"\n    SNR: {snr:.1f} dB"
                f"\n    Method: {analysis.get('method', 'svd')}"
                f"\n    Recommended Format: {'BDF' if recommend_bdf else 'EDF'} ({reason})"
            )

        # Print analysis details after deciding the format
        for info_str in signal_info_strings:
            print(info_str)

        # Final format decision message for 'auto' mode
        if format == 'auto':
            if use_bdf:
                print("\nUsing BDF format (24-bit) based on signal analysis to preserve precision.")
                print(f"Reason: {bdf_reason}")
                warnings.warn(f"Using BDF format based on signal analysis. Reason: {bdf_reason}")
            else:
                print("\nUsing EDF format (16-bit) based on signal analysis (precision within acceptable range).")
    # else: # bypass_analysis is True - logging handled in EMG.to_edf
    #     pass # logging.log(logging.CRITICAL, "Signal analysis bypassed.")

    # Set file format and create writer
    channels_tsv_data = {
        'name': [], 'channel_type': [], 'physical_dimension': [],
        'sample_frequency': [], 'reference': [], 'status': []
    }
    channel_info_list = []

    if use_bdf:
        filepath = os.path.splitext(filepath)[0] + '.bdf'
        filetype = pyedflib.FILETYPE_BDFPLUS
    else:
        filepath = os.path.splitext(filepath)[0] + '.edf'
        filetype = pyedflib.FILETYPE_EDFPLUS

    writer = pyedflib.EdfWriter(filepath, len(emg.channels), file_type=filetype)

    try:
        # Prepare channel information and signals for writing
        signals_to_write = []
        for i, ch_name in enumerate(emg.channels):
            signal = emg.signals[ch_name].values
            ch_info = emg.channels[ch_name]
            # No need for full analysis result for scaling factors anymore

            # Get signal min/max for scaling factor calculation
            signal_min = float(np.min(signal))
            signal_max = float(np.max(signal))

            # Calculate scaling factors for header based on the chosen format (use_bdf)
            phys_min, phys_max, dig_min, dig_max, scale_factor = _determine_scaling_factors(
                signal_min, signal_max, use_bdf=use_bdf
            )

            # Prepare the signal data (handle NaNs, but DO NOT pre-scale)
            physical_signal = signal.copy()
            physical_signal = np.nan_to_num(physical_signal, nan=0.0)
            signals_to_write.append(physical_signal)

            # Prepare channel header dictionary
            ch_dict = {
                'label': ch_name[:16],  # EDF+ limits label to 16 chars
                'dimension': ch_info['physical_dimension'],
                'sample_frequency': int(ch_info['sample_frequency']),
                'physical_max': phys_max,
                'physical_min': phys_min,
                'digital_max': dig_max,
                'digital_min': dig_min,
                'prefilter': ch_info['prefilter'],
                'transducer': f"{ch_info.get('channel_type', 'Unknown')} sensor"  # Use get for safety
            }
            channel_info_list.append(ch_dict)

            # Add to channels.tsv data
            channels_tsv_data['name'].append(ch_name)
            channels_tsv_data['channel_type'].append(ch_info.get('channel_type', 'Unknown'))
            channels_tsv_data['physical_dimension'].append(ch_info['physical_dimension'])
            channels_tsv_data['sample_frequency'].append(ch_info['sample_frequency'])
            channels_tsv_data['reference'].append('n/a')  # Assuming no specific reference info
            channels_tsv_data['status'].append('good')  # Assuming good status

        # Set headers and write data (pass physical signals)
        writer.setSignalHeaders(channel_info_list)
        writer.writeSamples(np.array(signals_to_write))

        # Write annotations if provided
        if events_df is not None and not events_df.empty:
            for index, row in events_df.iterrows():
                try:
                    # pyedflib uses onset, duration, description
                    onset = float(row['onset'])
                    duration = float(row['duration'])
                    description = str(row['description'])
                    # Write annotation for all channels (-1)
                    writer.writeAnnotation(onset, duration, description)
                except KeyError as e:
                    warnings.warn(f"Skipping event due to missing column: {e}. Event data: {row}")
                except (TypeError, ValueError) as e:
                    warnings.warn(f"Skipping event due to invalid data type: {e}. Event data: {row}")

        # Explicitly flush and close the writer to ensure all data is written
        writer.close()

        # Wait a moment to ensure file system operations are complete
        import time
        time.sleep(0.1)

        # Verify the file exists and has the correct size
        if not os.path.exists(filepath):
            raise IOError(f"File {filepath} was not created")

        file_size = os.path.getsize(filepath)
        if file_size == 0:
            raise IOError(f"File {filepath} was created but is empty")

        # Generate channels.tsv file using stored analyses
        channels_tsv_path = os.path.splitext(filepath)[0] + '_channels.tsv'
        pd.DataFrame(channels_tsv_data).to_csv(channels_tsv_path, sep='\t', index=False)
        print(f"\nChannels metadata saved to: {channels_tsv_path}")

        # Print summary using stored analyses, only if analysis was performed
        if not bypass_analysis:
            # We need to adapt summarize_channels call slightly or assume it uses the analyses dict
            # Let's refine the analyses dict passed to summarize_channels
            summary_analyses = {}
            for ch_name, analysis in signal_analyses.items():
                summary_analyses[ch_name] = {
                    'range': analysis['range'],
                    'dynamic_range_db': analysis['dynamic_range_db'],
                    'snr_db': analysis['snr'],
                    'use_bdf': use_bdf  # Use the final decision for the whole file
                }

            summary = summarize_channels(emg.channels, emg.signals, summary_analyses)
            print("\nSummary:")
            print(summary)
        else:
            print("\nSummary skipped as signal analysis was bypassed.")

        print(f"\nEMG data exported to: {filepath}")
        return filepath
    except Exception as e:
        # Clean up if there was an error
        if 'writer' in locals() and hasattr(writer, 'close') and callable(writer.close):
            try:
                # Check if file is open before closing
                if not writer.header['file_handle'].closed:
                    writer.close()
            except Exception:
                pass  # Ignore errors during cleanup

        # Wait a moment before trying to delete the file
        import time
        time.sleep(0.1)

        if 'filepath' in locals() and os.path.exists(filepath):
            try:
                os.unlink(filepath)
                print(f"Cleaned up partially written file: {filepath}")
            except Exception as unlink_e:
                print(f"Error during cleanup of {filepath}: {unlink_e}")

        raise e
    finally:
        if writer is not None:
            writer.close()  # Ensure writer is closed
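
The interaction between format and bypass_analysis can be summarised with a short, hedged sketch, continuing from the example above (file paths are placeholders):

# Forcing a format skips analysis by default (bypass_analysis=None resolves to True in EMG.to_edf),
# which is faster but performs no precision check.
emg.to_edf('fast_export.bdf', format='bdf')

# Forcing a format while keeping the per-channel analysis and summary:
emg.to_edf('checked_export.edf', format='edf', bypass_analysis=False)

# Invalid combination: analysis cannot be skipped when the format is chosen automatically.
try:
    EDFExporter.export(emg, 'auto_export.edf', format='auto', bypass_analysis=True)
except ValueError as err:
    print(err)  # "Cannot bypass analysis when format is set to 'auto'."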

EMG

Core EMG class for handling EMG data and metadata.

Attributes:
    signals (pd.DataFrame): Raw signal data with time as index.
    metadata (dict): Metadata dictionary containing recording information.
    channels (dict): Channel information including type, unit, sampling frequency.
    events (pd.DataFrame): Annotations or events associated with the signals, with columns
        'onset', 'duration', 'description'.

Source code in emgio/core/emg.py
class EMG:
    """
    Core EMG class for handling EMG data and metadata.

    Attributes:
        signals (pd.DataFrame): Raw signal data with time as index.
        metadata (dict): Metadata dictionary containing recording information.
        channels (dict): Channel information including type, unit, sampling frequency.
        events (pd.DataFrame): Annotations or events associated with the signals,
                               with columns 'onset', 'duration', 'description'.
    """

    def __init__(self):
        """Initialize an empty EMG object."""
        self.signals = None
        self.metadata = {}
        self.channels = {}
        # Initialize events as an empty DataFrame with specified columns
        self.events = pd.DataFrame(columns=['onset', 'duration', 'description'])

    def plot_signals(self, channels=None, time_range=None, offset_scale=0.8,
                    uniform_scale=True, detrend=False, grid=True, title=None,
                    show=True, plt_module=None):
        """
        Plot EMG signals in a single plot with vertical offsets.

        Args:
            channels: List of channels to plot. If None, plot all channels.
            time_range: Tuple of (start_time, end_time) to plot. If None, plot all data.
            offset_scale: Portion of allocated space each signal can use (0.0 to 1.0).
            uniform_scale: Whether to use the same scale for all signals.
            detrend: Whether to remove mean from signals before plotting.
            grid: Whether to show grid lines.
            title: Optional title for the figure.
            show: Whether to display the plot.
            plt_module: Matplotlib pyplot module to use.
        """
        # Delegate to the static plotting function in visualization module
        static_plot_signals(
            emg_object=self,
            channels=channels,
            time_range=time_range,
            offset_scale=offset_scale,
            uniform_scale=uniform_scale,
            detrend=detrend,
            grid=grid,
            title=title,
            show=show,
            plt_module=plt_module
        )

    @classmethod
    def _infer_importer(cls, filepath: str) -> str:
        """
        Infer the importer to use based on the file extension.
        """
        extension = os.path.splitext(filepath)[1].lower()
        if extension in {'.edf', '.bdf'}:
            return 'edf'
        elif extension in {'.set'}:
            return 'eeglab'
        elif extension in {'.otb', '.otb+'}:
            return 'otb'
        elif extension in {'.csv', '.txt'}:
            return 'csv'
        elif extension in {'.hea', '.dat', '.atr'}:
            return 'wfdb'
        else:
            raise ValueError(f"Unsupported file extension: {extension}")

    @classmethod
    def from_file(
            cls,
            filepath: str,
            importer: Literal['trigno', 'otb', 'eeglab', 'edf', 'csv', 'wfdb'] | None = None,
            force_csv: bool = False,
            **kwargs
    ) -> 'EMG':
        """
        The method to create EMG object from file.

        Args:
            filepath: Path to the input file
            importer: Name of the importer to use. Can be one of the following:
                - 'trigno': Delsys Trigno EMG system (CSV)
                - 'otb': OTB/OTB+ EMG system (OTB, OTB+)
                - 'eeglab': EEGLAB .set files (SET)
                - 'edf': EDF/EDF+/BDF/BDF+ format (EDF, BDF)
                - 'csv': Generic CSV (or TXT) files with columnar data
                - 'wfdb': Waveform Database (WFDB)
                If None, the importer will be inferred from the file extension.
                Automatic import is supported for CSV/TXT files.
            force_csv: If True and importer is 'csv', forces using the generic CSV
                      importer even if the file appears to match a specialized format.
            **kwargs: Additional arguments passed to the importer

        Returns:
            EMG: New EMG object with loaded data
        """
        if importer is None:
            importer = cls._infer_importer(filepath)

        importers = {
            'trigno': 'TrignoImporter',  # CSV with Delsys Trigno Headers
            'otb': 'OTBImporter',  # OTB/OTB+ EMG system data
            'edf': 'EDFImporter',  # EDF/EDF+/BDF format
            'eeglab': 'EEGLABImporter',  # EEGLAB .set files
            'csv': 'CSVImporter',  # Generic CSV/Text files
            'wfdb': 'WFDBImporter'  # Waveform Database format
        }

        if importer not in importers:
            raise ValueError(
                f"Unsupported importer: {importer}. "
                f"Available importers: {list(importers.keys())}\n"
                "- trigno: Delsys Trigno EMG system\n"
                "- otb: OTB/OTB+ EMG system\n"
                "- edf: EDF/EDF+/BDF format\n"
                "- eeglab: EEGLAB .set files\n"
                "- csv: Generic CSV/Text files\n"
                "- wfdb: Waveform Database"
            )

        # If using CSV importer and force_csv is set, pass it as force_generic
        if importer == 'csv':
            kwargs['force_generic'] = force_csv

        # Import the appropriate importer class
        importer_module = __import__(
            f'emgio.importers.{importer}',
            globals(),
            locals(),
            [importers[importer]]
        )
        importer_class = getattr(importer_module, importers[importer])

        # Create importer instance and load data
        return importer_class().load(filepath, **kwargs)

    def select_channels(
            self,
            channels: Union[str, List[str], None] = None,
            channel_type: Optional[str] = None,
            inplace: bool = False) -> 'EMG':
        """
        Select specific channels from the data and return a new EMG object.

        Args:
            channels: Channel name or list of channel names to select. If None and
                    channel_type is specified, selects all channels of that type.
            channel_type: Type of channels to select ('EMG', 'ACC', 'GYRO', etc.).
                        If specified with channels, filters the selection to only
                        channels of this type.

        Returns:
            EMG: A new EMG object containing only the selected channels

        Examples:
            # Select specific channels
            new_emg = emg.select_channels(['EMG1', 'ACC1'])

            # Select all EMG channels
            emg_only = emg.select_channels(channel_type='EMG')

            # Select specific EMG channels only, this example does not select ACC channels
            emg_subset = emg.select_channels(['EMG1', 'ACC1'], channel_type='EMG')
        """
        if self.signals is None:
            raise ValueError("No signals loaded")

        # If channel_type specified but no channels, select all of that type
        if channels is None and channel_type is not None:
            channels = [ch for ch, info in self.channels.items()
                        if info['channel_type'] == channel_type]
            if not channels:
                raise ValueError(f"No channels found of type: {channel_type}")
        elif isinstance(channels, str):
            channels = [channels]

        # Validate channels exist
        if not all(ch in self.signals.columns for ch in channels):
            missing = [ch for ch in channels if ch not in self.signals.columns]
            raise ValueError(f"Channels not found: {missing}")

        # Filter by type if specified
        if channel_type is not None:
            channels = [ch for ch in channels
                        if self.channels[ch]['channel_type'] == channel_type]
            if not channels:
                raise ValueError(
                    f"None of the selected channels are of type: {channel_type}")

        # Create new EMG object
        new_emg = EMG()

        # Copy selected signals and channels
        new_emg.signals = self.signals[channels].copy()
        new_emg.channels = {ch: self.channels[ch].copy() for ch in channels}

        # Copy metadata
        new_emg.metadata = self.metadata.copy()

        if not inplace:
            return new_emg
        else:
            self.signals = new_emg.signals
            self.channels = new_emg.channels
            self.metadata = new_emg.metadata
            return self

    def get_channel_types(self) -> List[str]:
        """
        Get list of unique channel types in the data.

        Returns:
            List of channel types (e.g., ['EMG', 'ACC', 'GYRO'])
        """
        return list(set(info['channel_type'] for info in self.channels.values()))

    def get_channels_by_type(self, channel_type: str) -> List[str]:
        """
        Get list of channels of a specific type.

        Args:
            channel_type: Type of channels to get ('EMG', 'ACC', 'GYRO', etc.)

        Returns:
            List of channel names of the specified type
        """
        return [ch for ch, info in self.channels.items()
                if info['channel_type'] == channel_type]

    def to_edf(self, filepath: str, method: str = 'both',
               fft_noise_range: tuple = None, svd_rank: int = None,
               precision_threshold: float = 0.01,
               format: Literal['auto', 'edf', 'bdf'] = 'auto',
               bypass_analysis: bool | None = None,
               verify: bool = False, verify_tolerance: float = 1e-6,
               verify_channel_map: Optional[Dict[str, str]] = None,
               verify_plot: bool = False,
               events_df: Optional[pd.DataFrame] = None,
               **kwargs
               ) -> Union[str, None]:
        """
        Export EMG data to EDF/BDF format, optionally including events.

        Args:
            filepath: Path to save the EDF/BDF file
            method: Method for signal analysis ('svd', 'fft', or 'both')
                'svd': Uses Singular Value Decomposition for noise floor estimation
                'fft': Uses Fast Fourier Transform for noise floor estimation
                'both': Uses both methods and takes the minimum noise floor (default)
            fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
            svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
            precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
            format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'.
                    If 'edf' or 'bdf' is specified, that format will be used directly.
                    If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen based
                    on signal analysis to minimize precision loss while preferring EDF
                    if sufficient.
            bypass_analysis: If True, skip signal analysis step when format is explicitly
                             set to 'edf' or 'bdf'. If None (default), analysis is skipped
                             automatically when format is forced. Set to False to force
                             analysis even with a specified format. Ignored if format='auto'.
            verify: If True, reload the exported file and compare signals with the original
                    to check for data integrity loss. Results are printed. (default: False)
            verify_tolerance: Absolute tolerance used when comparing signals during verification. (default: 1e-6)
            verify_channel_map: Optional dictionary mapping original channel names (keys)
                                to reloaded channel names (values) for verification.
                                Used if `verify` is True and channel names might differ.
            verify_plot: If True and verify is True, plots a comparison of original vs reloaded signals.
            events_df: Optional DataFrame with events ('onset', 'duration', 'description').
                      If None, uses self.events. (This provides flexibility)
            **kwargs: Additional arguments for the EDF exporter

        Returns:
            Union[str, None]: If verify is True, returns a string with verification results.
                             Otherwise, returns None.

        Raises:
            ValueError: If no signals are loaded
        """
        from ..exporters.edf import EDFExporter  # Local import

        if self.signals is None:
            raise ValueError("No signals loaded")

        # --- Determine if analysis should be bypassed ---
        final_bypass_analysis = False
        if format.lower() == 'auto':
            if bypass_analysis is True:
                logging.warning("bypass_analysis=True ignored because format='auto'. Analysis is required.")
            # Analysis is always needed for 'auto' format
            final_bypass_analysis = False
        elif format.lower() in ['edf', 'bdf']:
            if bypass_analysis is None:
                # Default behaviour: skip analysis if format is forced
                final_bypass_analysis = True
                msg = (f"Format forced to '{format}'. Skipping signal analysis for faster export. "
                       "Set bypass_analysis=False to force analysis.")
                logging.log(logging.CRITICAL, msg)
            elif bypass_analysis is True:
                final_bypass_analysis = True
                logging.log(logging.CRITICAL, "bypass_analysis=True set. Skipping signal analysis.")
            else:  # bypass_analysis is False
                final_bypass_analysis = False
                logging.info(f"Format forced to '{format}' but bypass_analysis=False. Performing signal analysis.")
        else:
            # Should not happen if Literal type hint works, but good practice
            logging.warning(f"Unknown format '{format}'. Defaulting to 'auto' behavior (analysis enabled).")
            format = 'auto'
            final_bypass_analysis = False

        # Determine which events DataFrame to use
        if events_df is None:
            events_to_export = self.events
        else:
            events_to_export = events_df

        # Combine parameters
        all_params = {
            'precision_threshold': precision_threshold,
            'method': method,
            'fft_noise_range': fft_noise_range,
            'svd_rank': svd_rank,
            'format': format,
            'bypass_analysis': final_bypass_analysis,
            'events_df': events_to_export,  # Pass the events dataframe
            **kwargs
        }

        EDFExporter.export(self, filepath, **all_params)

        verification_report_dict = None
        if verify:
            logging.info(f"Verification requested. Reloading exported file: {filepath}")
            try:
                # Reload the exported file
                reloaded_emg = EMG.from_file(filepath, importer='edf')

                logging.info("Comparing original signals with reloaded signals...")
                # Compare signals using the imported function
                verification_results = compare_signals(
                    self,
                    reloaded_emg,
                    tolerance=verify_tolerance,
                    channel_map=verify_channel_map
                )

                # Generate and log report using the imported function
                report_verification_results(verification_results, verify_tolerance)
                verification_report_dict = verification_results

                # Plot comparison using imported function if requested
                summary = verification_results.get('channel_summary', {})
                comparison_mode = summary.get('comparison_mode', 'unknown')
                compared_count = sum(1 for k in verification_results if k != 'channel_summary')

                if verify_plot and compared_count > 0 and comparison_mode != 'failed':
                    plot_comparison(self, reloaded_emg, channel_map=verify_channel_map)
                elif verify_plot:
                    logging.warning("Skipping verification plot: No channels were successfully compared.")

            except Exception as e:
                logging.error(f"Verification failed during reload or comparison: {e}")
                verification_report_dict = {
                    'error': str(e),
                    'channel_summary': {'comparison_mode': 'failed'}
                }

        return verification_report_dict

    def set_metadata(self, key: str, value: any) -> None:
        """
        Set metadata value.

        Args:
            key: Metadata key
            value: Metadata value
        """
        self.metadata[key] = value

    def get_metadata(self, key: str) -> any:
        """
        Get metadata value.

        Args:
            key: Metadata key

        Returns:
            Value associated with the key
        """
        return self.metadata.get(key)

    def add_channel(
            self, label: str, data: np.ndarray, sample_frequency: float,
            physical_dimension: str, prefilter: str = 'n/a', channel_type: str = 'EMG') -> None:
        """
        Add a new channel to the EMG data.

        Args:
            label: Channel label or name (as per EDF specification)
            data: Channel data
            sample_frequency: Sampling frequency in Hz (as per EDF specification)
            physical_dimension: Physical dimension/unit of measurement (as per EDF specification)
            prefilter: Pre-filtering applied to the channel
            channel_type: Channel type ('EMG', 'ACC', 'GYRO', etc.)
        """
        if self.signals is None:
            # Create DataFrame with time index
            time = np.arange(len(data)) / sample_frequency
            self.signals = pd.DataFrame(index=time)

        self.signals[label] = data
        self.channels[label] = {
            'sample_frequency': sample_frequency,
            'physical_dimension': physical_dimension,
            'prefilter': prefilter,
            'channel_type': channel_type
        }

    def add_event(self, onset: float, duration: float, description: str) -> None:
        """
        Add an event/annotation to the EMG object.

        Args:
            onset: Event onset time in seconds.
            duration: Event duration in seconds.
            description: Event description string.
        """
        new_event = pd.DataFrame([{'onset': onset, 'duration': duration, 'description': description}])
        # Use pd.concat for appending, ignore_index=True resets the index
        self.events = pd.concat([self.events, new_event], ignore_index=True)
        # Sort events by onset time for consistency
        self.events.sort_values(by='onset', inplace=True)
        self.events.reset_index(drop=True, inplace=True)

__init__()

Source code in emgio/core/emg.py
def __init__(self):
    """Initialize an empty EMG object."""
    self.signals = None
    self.metadata = {}
    self.channels = {}
    # Initialize events as an empty DataFrame with specified columns
    self.events = pd.DataFrame(columns=['onset', 'duration', 'description'])

add_channel(label, data, sample_frequency, physical_dimension, prefilter='n/a', channel_type='EMG')

Add a new channel to the EMG data.

Args:
    label: Channel label or name (as per EDF specification)
    data: Channel data
    sample_frequency: Sampling frequency in Hz (as per EDF specification)
    physical_dimension: Physical dimension/unit of measurement (as per EDF specification)
    prefilter: Pre-filtering applied to the channel
    channel_type: Channel type ('EMG', 'ACC', 'GYRO', etc.)

Source code in emgio/core/emg.py
def add_channel(
        self, label: str, data: np.ndarray, sample_frequency: float,
        physical_dimension: str, prefilter: str = 'n/a', channel_type: str = 'EMG') -> None:
    """
    Add a new channel to the EMG data.

    Args:
        label: Channel label or name (as per EDF specification)
        data: Channel data
        sample_frequency: Sampling frequency in Hz (as per EDF specification)
        physical_dimension: Physical dimension/unit of measurement (as per EDF specification)
        prefilter: Pre-filtering applied to the channel
        channel_type: Channel type ('EMG', 'ACC', 'GYRO', etc.)
    """
    if self.signals is None:
        # Create DataFrame with time index
        time = np.arange(len(data)) / sample_frequency
        self.signals = pd.DataFrame(index=time)

    self.signals[label] = data
    self.channels[label] = {
        'sample_frequency': sample_frequency,
        'physical_dimension': physical_dimension,
        'prefilter': prefilter,
        'channel_type': channel_type
    }
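
A small, hedged sketch of building an EMG object from raw arrays with add_channel; the sampling rate, labels, and units are illustrative:

import numpy as np
from emgio.core.emg import EMG

fs = 2000.0  # Hz, illustrative sampling frequency
t = np.arange(0, 1.0, 1 / fs)
emg = EMG()

# Two synthetic channels sharing the same time base
emg.add_channel('EMG1', np.sin(2 * np.pi * 50 * t), fs, 'uV', channel_type='EMG')
emg.add_channel('ACC1', np.random.randn(len(t)), fs, 'g', channel_type='ACC')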

add_event(onset, duration, description)

Add an event/annotation to the EMG object.

Args:
    onset: Event onset time in seconds.
    duration: Event duration in seconds.
    description: Event description string.

Source code in emgio/core/emg.py
def add_event(self, onset: float, duration: float, description: str) -> None:
    """
    Add an event/annotation to the EMG object.

    Args:
        onset: Event onset time in seconds.
        duration: Event duration in seconds.
        description: Event description string.
    """
    new_event = pd.DataFrame([{'onset': onset, 'duration': duration, 'description': description}])
    # Use pd.concat for appending, ignore_index=True resets the index
    self.events = pd.concat([self.events, new_event], ignore_index=True)
    # Sort events by onset time for consistency
    self.events.sort_values(by='onset', inplace=True)
    self.events.reset_index(drop=True, inplace=True)
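
A hedged sketch showing how events added here end up as EDF+ annotations on export, continuing from the add_channel example above:

# Events are stored in emg.events and written as annotations by to_edf / EDFExporter
emg.add_event(onset=0.25, duration=0.0, description='trigger')
emg.add_event(onset=0.50, duration=0.10, description='contraction')

emg.to_edf('with_events.edf', format='edf')  # annotations taken from emg.events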

from_file(filepath, importer=None, force_csv=False, **kwargs) classmethod

The method to create EMG object from file.

Args:
    filepath: Path to the input file
    importer: Name of the importer to use. Can be one of the following:
        - 'trigno': Delsys Trigno EMG system (CSV)
        - 'otb': OTB/OTB+ EMG system (OTB, OTB+)
        - 'eeglab': EEGLAB .set files (SET)
        - 'edf': EDF/EDF+/BDF/BDF+ format (EDF, BDF)
        - 'csv': Generic CSV (or TXT) files with columnar data
        - 'wfdb': Waveform Database (WFDB)
        If None, the importer will be inferred from the file extension.
        Automatic import is supported for CSV/TXT files.
    force_csv: If True and importer is 'csv', forces using the generic CSV importer even if
        the file appears to match a specialized format.
    **kwargs: Additional arguments passed to the importer

Returns: EMG: New EMG object with loaded data

Source code in emgio/core/emg.py
@classmethod
def from_file(
        cls,
        filepath: str,
        importer: Literal['trigno', 'otb', 'eeglab', 'edf', 'csv', 'wfdb'] | None = None,
        force_csv: bool = False,
        **kwargs
) -> 'EMG':
    """
    The method to create EMG object from file.

    Args:
        filepath: Path to the input file
        importer: Name of the importer to use. Can be one of the following:
            - 'trigno': Delsys Trigno EMG system (CSV)
            - 'otb': OTB/OTB+ EMG system (OTB, OTB+)
            - 'eeglab': EEGLAB .set files (SET)
            - 'edf': EDF/EDF+/BDF/BDF+ format (EDF, BDF)
            - 'csv': Generic CSV (or TXT) files with columnar data
            - 'wfdb': Waveform Database (WFDB)
            If None, the importer will be inferred from the file extension.
            Automatic import is supported for CSV/TXT files.
        force_csv: If True and importer is 'csv', forces using the generic CSV
                  importer even if the file appears to match a specialized format.
        **kwargs: Additional arguments passed to the importer

    Returns:
        EMG: New EMG object with loaded data
    """
    if importer is None:
        importer = cls._infer_importer(filepath)

    importers = {
        'trigno': 'TrignoImporter',  # CSV with Delsys Trigno Headers
        'otb': 'OTBImporter',  # OTB/OTB+ EMG system data
        'edf': 'EDFImporter',  # EDF/EDF+/BDF format
        'eeglab': 'EEGLABImporter',  # EEGLAB .set files
        'csv': 'CSVImporter',  # Generic CSV/Text files
        'wfdb': 'WFDBImporter'  # Waveform Database format
    }

    if importer not in importers:
        raise ValueError(
            f"Unsupported importer: {importer}. "
            f"Available importers: {list(importers.keys())}\n"
            "- trigno: Delsys Trigno EMG system\n"
            "- otb: OTB/OTB+ EMG system\n"
            "- edf: EDF/EDF+/BDF format\n"
            "- eeglab: EEGLAB .set files\n"
            "- csv: Generic CSV/Text files\n"
            "- wfdb: Waveform Database"
        )

    # If using CSV importer and force_csv is set, pass it as force_generic
    if importer == 'csv':
        kwargs['force_generic'] = force_csv

    # Import the appropriate importer class
    importer_module = __import__(
        f'emgio.importers.{importer}',
        globals(),
        locals(),
        [importers[importer]]
    )
    importer_class = getattr(importer_module, importers[importer])

    # Create importer instance and load data
    return importer_class().load(filepath, **kwargs)
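
A few hedged examples of importer selection; all file names are placeholders:

# Importer inferred from the extension ('.bdf' maps to the 'edf' importer)
emg = EMG.from_file('recording.bdf')

# Explicit importer for a Delsys Trigno CSV export
emg = EMG.from_file('trigno_export.csv', importer='trigno')

# Force the generic CSV importer even if the file looks like a specialised format
emg = EMG.from_file('generic_data.csv', importer='csv', force_csv=True)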

get_channel_types()

Get list of unique channel types in the data.

Returns: List of channel types (e.g., ['EMG', 'ACC', 'GYRO'])

Source code in emgio/core/emg.py
def get_channel_types(self) -> List[str]:
    """
    Get list of unique channel types in the data.

    Returns:
        List of channel types (e.g., ['EMG', 'ACC', 'GYRO'])
    """
    return list(set(info['channel_type'] for info in self.channels.values()))

get_channels_by_type(channel_type)

Get list of channels of a specific type.

Args: channel_type: Type of channels to get ('EMG', 'ACC', 'GYRO', etc.)

Returns: List of channel names of the specified type

Source code in emgio/core/emg.py
def get_channels_by_type(self, channel_type: str) -> List[str]:
    """
    Get list of channels of a specific type.

    Args:
        channel_type: Type of channels to get ('EMG', 'ACC', 'GYRO', etc.)

    Returns:
        List of channel names of the specified type
    """
    return [ch for ch, info in self.channels.items()
            if info['channel_type'] == channel_type]
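
A hedged sketch combining the channel-type helpers with select_channels; the channel names and types are illustrative:

print(emg.get_channel_types())                        # e.g. ['EMG', 'ACC']
emg_names = emg.get_channels_by_type('EMG')
emg_only = emg.select_channels(channels=emg_names)    # equivalent to channel_type='EMG'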

get_metadata(key)

Get metadata value.

Args: key: Metadata key

Returns: Value associated with the key

Source code in emgio/core/emg.py
def get_metadata(self, key: str) -> any:
    """
    Get metadata value.

    Args:
        key: Metadata key

    Returns:
        Value associated with the key
    """
    return self.metadata.get(key)
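
A brief, hedged example of the metadata accessors (keys and values are arbitrary):

emg.set_metadata('subject', 'S01')
emg.set_metadata('device', 'hypothetical amplifier')
print(emg.get_metadata('subject'))  # 'S01'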

plot_signals(channels=None, time_range=None, offset_scale=0.8, uniform_scale=True, detrend=False, grid=True, title=None, show=True, plt_module=None)

Plot EMG signals in a single plot with vertical offsets.

Args:
    channels: List of channels to plot. If None, plot all channels.
    time_range: Tuple of (start_time, end_time) to plot. If None, plot all data.
    offset_scale: Portion of allocated space each signal can use (0.0 to 1.0).
    uniform_scale: Whether to use the same scale for all signals.
    detrend: Whether to remove mean from signals before plotting.
    grid: Whether to show grid lines.
    title: Optional title for the figure.
    show: Whether to display the plot.
    plt_module: Matplotlib pyplot module to use.

Source code in emgio/core/emg.py
def plot_signals(self, channels=None, time_range=None, offset_scale=0.8,
                uniform_scale=True, detrend=False, grid=True, title=None,
                show=True, plt_module=None):
    """
    Plot EMG signals in a single plot with vertical offsets.

    Args:
        channels: List of channels to plot. If None, plot all channels.
        time_range: Tuple of (start_time, end_time) to plot. If None, plot all data.
        offset_scale: Portion of allocated space each signal can use (0.0 to 1.0).
        uniform_scale: Whether to use the same scale for all signals.
        detrend: Whether to remove mean from signals before plotting.
        grid: Whether to show grid lines.
        title: Optional title for the figure.
        show: Whether to display the plot.
        plt_module: Matplotlib pyplot module to use.
    """
    # Delegate to the static plotting function in visualization module
    static_plot_signals(
        emg_object=self,
        channels=channels,
        time_range=time_range,
        offset_scale=offset_scale,
        uniform_scale=uniform_scale,
        detrend=detrend,
        grid=grid,
        title=title,
        show=show,
        plt_module=plt_module
    )
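
For example (a sketch; the channel names are illustrative), a subset of channels can be plotted over the first two seconds with the mean removed:

# Plot two channels over the first 2 seconds, detrended
emg.plot_signals(
    channels=['EMG1', 'EMG2'],
    time_range=(0, 2),
    detrend=True,
    title='First 2 s of EMG1/EMG2'
)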

select_channels(channels=None, channel_type=None, inplace=False)

Select specific channels from the data and return a new EMG object.

Args:
    channels: Channel name or list of channel names to select. If None and channel_type is specified, selects all channels of that type.
    channel_type: Type of channels to select ('EMG', 'ACC', 'GYRO', etc.). If specified with channels, filters the selection to only channels of this type.
    inplace: If True, apply the selection to this EMG object in place and return it; otherwise return a new EMG object (default: False).

Returns: EMG: A new EMG object containing only the selected channels

Examples:

# Select specific channels
new_emg = emg.select_channels(['EMG1', 'ACC1'])

# Select all EMG channels
emg_only = emg.select_channels(channel_type='EMG')

# Select specific EMG channels only; this example does not select ACC channels
emg_subset = emg.select_channels(['EMG1', 'ACC1'], channel_type='EMG')
Source code in emgio/core/emg.py
def select_channels(
        self,
        channels: Union[str, List[str], None] = None,
        channel_type: Optional[str] = None,
        inplace: bool = False) -> 'EMG':
    """
    Select specific channels from the data and return a new EMG object.

    Args:
        channels: Channel name or list of channel names to select. If None and
                channel_type is specified, selects all channels of that type.
        channel_type: Type of channels to select ('EMG', 'ACC', 'GYRO', etc.).
                    If specified with channels, filters the selection to only
                    channels of this type.
        inplace: If True, apply the selection to this EMG object in place and
                 return it; otherwise return a new EMG object (default: False).

    Returns:
        EMG: A new EMG object containing only the selected channels

    Examples:
        # Select specific channels
        new_emg = emg.select_channels(['EMG1', 'ACC1'])

        # Select all EMG channels
        emg_only = emg.select_channels(channel_type='EMG')

        # Select specific EMG channels only; this example does not select ACC channels
        emg_subset = emg.select_channels(['EMG1', 'ACC1'], channel_type='EMG')
    """
    if self.signals is None:
        raise ValueError("No signals loaded")

    # If channel_type specified but no channels, select all of that type
    if channels is None and channel_type is not None:
        channels = [ch for ch, info in self.channels.items()
                    if info['channel_type'] == channel_type]
        if not channels:
            raise ValueError(f"No channels found of type: {channel_type}")
    elif isinstance(channels, str):
        channels = [channels]

    # Validate channels exist
    if not all(ch in self.signals.columns for ch in channels):
        missing = [ch for ch in channels if ch not in self.signals.columns]
        raise ValueError(f"Channels not found: {missing}")

    # Filter by type if specified
    if channel_type is not None:
        channels = [ch for ch in channels
                    if self.channels[ch]['channel_type'] == channel_type]
        if not channels:
            raise ValueError(
                f"None of the selected channels are of type: {channel_type}")

    # Create new EMG object
    new_emg = EMG()

    # Copy selected signals and channels
    new_emg.signals = self.signals[channels].copy()
    new_emg.channels = {ch: self.channels[ch].copy() for ch in channels}

    # Copy metadata
    new_emg.metadata = self.metadata.copy()

    if not inplace:
        return new_emg
    else:
        self.signals = new_emg.signals
        self.channels = new_emg.channels
        self.metadata = new_emg.metadata
        return self
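
Beyond the docstring examples, the inplace flag filters the current object instead of returning a copy (sketch, assuming emg contains EMG-type channels):

# Keep only the EMG channels, modifying the existing object
emg.select_channels(channel_type='EMG', inplace=True)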

set_metadata(key, value)

Set metadata value.

Args:
    key: Metadata key
    value: Metadata value

Source code in emgio/core/emg.py
def set_metadata(self, key: str, value: any) -> None:
    """
    Set metadata value.

    Args:
        key: Metadata key
        value: Metadata value
    """
    self.metadata[key] = value
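
A short round-trip example (the key name 'subject' is purely illustrative):

# Store and retrieve a metadata entry
emg.set_metadata('subject', 'S01')
assert emg.get_metadata('subject') == 'S01'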

to_edf(filepath, method='both', fft_noise_range=None, svd_rank=None, precision_threshold=0.01, format='auto', bypass_analysis=None, verify=False, verify_tolerance=1e-06, verify_channel_map=None, verify_plot=False, events_df=None, **kwargs)

Export EMG data to EDF/BDF format, optionally including events.

Args:
    filepath: Path to save the EDF/BDF file
    method: Method for signal analysis ('svd', 'fft', or 'both')
        'svd': Uses Singular Value Decomposition for noise floor estimation
        'fft': Uses Fast Fourier Transform for noise floor estimation
        'both': Uses both methods and takes the minimum noise floor (default)
    fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
    svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
    precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
    format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'. If 'edf' or 'bdf' is specified, that format will be used directly. If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen based on signal analysis to minimize precision loss while preferring EDF if sufficient.
    bypass_analysis: If True, skip the signal analysis step when format is explicitly set to 'edf' or 'bdf'. If None (default), analysis is skipped automatically when format is forced. Set to False to force analysis even with a specified format. Ignored if format='auto'.
    verify: If True, reload the exported file and compare signals with the original to check for data integrity loss. Results are printed. (default: False)
    verify_tolerance: Absolute tolerance used when comparing signals during verification. (default: 1e-6)
    verify_channel_map: Optional dictionary mapping original channel names (keys) to reloaded channel names (values) for verification. Used if verify is True and channel names might differ.
    verify_plot: If True and verify is True, plots a comparison of original vs reloaded signals.
    events_df: Optional DataFrame with events ('onset', 'duration', 'description'). If None, uses self.events. (This provides flexibility)
    **kwargs: Additional arguments for the EDF exporter

Returns: Union[dict, None]: If verify is True, returns a dictionary with verification results. Otherwise, returns None.

Raises: ValueError: If no signals are loaded

Source code in emgio/core/emg.py
def to_edf(self, filepath: str, method: str = 'both',
           fft_noise_range: tuple = None, svd_rank: int = None,
           precision_threshold: float = 0.01,
           format: Literal['auto', 'edf', 'bdf'] = 'auto',
           bypass_analysis: bool | None = None,
           verify: bool = False, verify_tolerance: float = 1e-6,
           verify_channel_map: Optional[Dict[str, str]] = None,
           verify_plot: bool = False,
           events_df: Optional[pd.DataFrame] = None,
           **kwargs
           ) -> Union[dict, None]:
    """
    Export EMG data to EDF/BDF format, optionally including events.

    Args:
        filepath: Path to save the EDF/BDF file
        method: Method for signal analysis ('svd', 'fft', or 'both')
            'svd': Uses Singular Value Decomposition for noise floor estimation
            'fft': Uses Fast Fourier Transform for noise floor estimation
            'both': Uses both methods and takes the minimum noise floor (default)
        fft_noise_range: Optional tuple (min_freq, max_freq) specifying frequency range for noise in FFT method
        svd_rank: Optional manual rank cutoff for signal/noise separation in SVD method
        precision_threshold: Maximum acceptable precision loss percentage (default: 0.01%)
        format: Format to use ('auto', 'edf', or 'bdf'). Default is 'auto'.
                If 'edf' or 'bdf' is specified, that format will be used directly.
                If 'auto', the format (EDF/16-bit or BDF/24-bit) is chosen based
                on signal analysis to minimize precision loss while preferring EDF
                if sufficient.
        bypass_analysis: If True, skip signal analysis step when format is explicitly
                         set to 'edf' or 'bdf'. If None (default), analysis is skipped
                         automatically when format is forced. Set to False to force
                         analysis even with a specified format. Ignored if format='auto'.
        verify: If True, reload the exported file and compare signals with the original
                to check for data integrity loss. Results are printed. (default: False)
        verify_tolerance: Absolute tolerance used when comparing signals during verification. (default: 1e-6)
        verify_channel_map: Optional dictionary mapping original channel names (keys)
                            to reloaded channel names (values) for verification.
                            Used if `verify` is True and channel names might differ.
        verify_plot: If True and verify is True, plots a comparison of original vs reloaded signals.
        events_df: Optional DataFrame with events ('onset', 'duration', 'description').
                  If None, uses self.events. (This provides flexibility)
        **kwargs: Additional arguments for the EDF exporter

    Returns:
        Union[dict, None]: If verify is True, returns a dictionary with verification
                         results. Otherwise, returns None.

    Raises:
        ValueError: If no signals are loaded
    """
    from ..exporters.edf import EDFExporter  # Local import

    if self.signals is None:
        raise ValueError("No signals loaded")

    # --- Determine if analysis should be bypassed ---
    final_bypass_analysis = False
    if format.lower() == 'auto':
        if bypass_analysis is True:
            logging.warning("bypass_analysis=True ignored because format='auto'. Analysis is required.")
        # Analysis is always needed for 'auto' format
        final_bypass_analysis = False
    elif format.lower() in ['edf', 'bdf']:
        if bypass_analysis is None:
            # Default behaviour: skip analysis if format is forced
            final_bypass_analysis = True
            msg = (f"Format forced to '{format}'. Skipping signal analysis for faster export. "
                   "Set bypass_analysis=False to force analysis.")
            logging.log(logging.CRITICAL, msg)
        elif bypass_analysis is True:
            final_bypass_analysis = True
            logging.log(logging.CRITICAL, "bypass_analysis=True set. Skipping signal analysis.")
        else:  # bypass_analysis is False
            final_bypass_analysis = False
            logging.info(f"Format forced to '{format}' but bypass_analysis=False. Performing signal analysis.")
    else:
        # Should not happen if Literal type hint works, but good practice
        logging.warning(f"Unknown format '{format}'. Defaulting to 'auto' behavior (analysis enabled).")
        format = 'auto'
        final_bypass_analysis = False

    # Determine which events DataFrame to use
    if events_df is None:
        events_to_export = self.events
    else:
        events_to_export = events_df

    # Combine parameters
    all_params = {
        'precision_threshold': precision_threshold,
        'method': method,
        'fft_noise_range': fft_noise_range,
        'svd_rank': svd_rank,
        'format': format,
        'bypass_analysis': final_bypass_analysis,
        'events_df': events_to_export,  # Pass the events dataframe
        **kwargs
    }

    EDFExporter.export(self, filepath, **all_params)

    verification_report_dict = None
    if verify:
        logging.info(f"Verification requested. Reloading exported file: {filepath}")
        try:
            # Reload the exported file
            reloaded_emg = EMG.from_file(filepath, importer='edf')

            logging.info("Comparing original signals with reloaded signals...")
            # Compare signals using the imported function
            verification_results = compare_signals(
                self,
                reloaded_emg,
                tolerance=verify_tolerance,
                channel_map=verify_channel_map
            )

            # Generate and log report using the imported function
            report_verification_results(verification_results, verify_tolerance)
            verification_report_dict = verification_results

            # Plot comparison using imported function if requested
            summary = verification_results.get('channel_summary', {})
            comparison_mode = summary.get('comparison_mode', 'unknown')
            compared_count = sum(1 for k in verification_results if k != 'channel_summary')

            if verify_plot and compared_count > 0 and comparison_mode != 'failed':
                plot_comparison(self, reloaded_emg, channel_map=verify_channel_map)
            elif verify_plot:
                logging.warning("Skipping verification plot: No channels were successfully compared.")

        except Exception as e:
            logging.error(f"Verification failed during reload or comparison: {e}")
            verification_report_dict = {
                'error': str(e),
                'channel_summary': {'comparison_mode': 'failed'}
            }

    return verification_report_dict
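
For instance (a sketch; the file name is illustrative), an export can be forced to BDF and verified against the original data in a single call:

# Export to BDF (analysis is skipped by default for a forced format) and verify the round trip
report = emg.to_edf('session01', format='bdf', verify=True, verify_tolerance=1e-6)
if report is not None:
    print(report.get('channel_summary', {}))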

_calculate_precision_loss(signal, scaling_factor, digital_min, digital_max)

Calculate precision loss when scaling signal to digital values.

Args:
    signal: Original signal values
    scaling_factor: Scaling factor to convert to digital values
    digital_min: Minimum digital value
    digital_max: Maximum digital value

Returns: float: Maximum relative precision loss as percentage

Source code in emgio/exporters/edf.py
def _calculate_precision_loss(signal: np.ndarray, scaling_factor: float, digital_min: int, digital_max: int) -> float:
    """
    Calculate precision loss when scaling signal to digital values.

    Args:
        signal: Original signal values
        scaling_factor: Scaling factor to convert to digital values
        digital_min: Minimum digital value
        digital_max: Maximum digital value

    Returns:
        float: Maximum relative precision loss as percentage
    """
    # Convert to integers (simulating digitization)
    scaled = np.round(signal * scaling_factor)
    digital_values = np.clip(scaled, digital_min, digital_max)
    reconstructed = digital_values / scaling_factor

    # Calculate relative error
    abs_diff = np.abs(signal - reconstructed)
    abs_signal = np.abs(signal)

    # Avoid division by zero and very small values
    eps = np.finfo(np.float32).eps
    nonzero_mask = abs_signal > eps * 1e3
    if not np.any(nonzero_mask):
        return 0.0
    # Exclude the first and last five samples from the mask to avoid edge effects
    # (technically, excluding only the first and last sample would be enough)
    nonzero_mask[0:5] = False
    nonzero_mask[-5:] = False

    relative_errors = np.zeros_like(signal)
    relative_errors[nonzero_mask] = (
        abs_diff[nonzero_mask] / abs_signal[nonzero_mask]
    )

    # Convert to percentage and ensure we detect small losses
    max_loss = float(np.max(relative_errors) * 100)
    if max_loss < np.finfo(np.float32).eps and np.any(abs_diff > 0):
        # If we have any difference but relative error is too small to measure,
        # return a small but non-zero value
        return 1e-6
    return max_loss
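
As an illustration (a standalone sketch using numpy; it assumes the module-level helper can be imported directly from emgio.exporters.edf, which is not part of the public API), the precision loss from 16-bit quantisation of a unit-amplitude sine wave can be estimated like this:

import numpy as np
from emgio.exporters.edf import _calculate_precision_loss  # private helper, import assumed

# 1 Hz sine sampled at 1 kHz, physical range -1 to 1
t = np.linspace(0, 1, 1000, endpoint=False)
signal = np.sin(2 * np.pi * t)
scaling_factor = (32767 - (-32768) - 1) / 2.0   # (digital_range - 1) / physical_range = 32767
loss = _calculate_precision_loss(signal, scaling_factor, -32768, 32767)
print(f"Max relative precision loss: {loss:.2e} %")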

_determine_scaling_factors(signal_min, signal_max, use_bdf=False)

Calculate optimal scaling factors for EDF/BDF signal conversion. Automatically scales values to fit format character limits.

Args:
    signal_min: Minimum value of the signal
    signal_max: Maximum value of the signal
    use_bdf: Whether to use BDF (24-bit) format

Returns: tuple: (physical_min, physical_max, digital_min, digital_max, scaling_factor)

Source code in emgio/exporters/edf.py
def _determine_scaling_factors(signal_min: float, signal_max: float, use_bdf: bool = False) -> tuple:
    """
    Calculate optimal scaling factors for EDF/BDF signal conversion.
    Automatically scales values to fit format character limits.

    Args:
        signal_min: Minimum value of the signal
        signal_max: Maximum value of the signal
        use_bdf: Whether to use BDF (24-bit) format

    Returns:
        tuple: (physical_min, physical_max, digital_min, digital_max, scaling_factor)
    """
    # Handle NaN values
    if np.isnan(signal_min) or np.isnan(signal_max):
        signal_min = -1e-6 if np.isnan(signal_min) else signal_min
        signal_max = 1e-6 if np.isnan(signal_max) else signal_max

    if signal_min > signal_max:
        signal_min, signal_max = signal_max, signal_min

    # Set digital range based on format
    if use_bdf:
        digital_min, digital_max = -8388608, 8388607  # 24-bit
        max_chars = 12
    else:
        digital_min, digital_max = -32768, 32767  # 16-bit
        max_chars = 8

    # Handle special cases
    if np.isclose(signal_min, signal_max):
        if np.isclose(signal_min, 0):
            # For zero signal, use minimal range around zero
            # Use small values that will scale well with typical EMG signals
            signal_min, signal_max = -1e-6, 1e-6
        else:
            # For constant non-zero signal, create range around the value
            # Use a percentage of the value to maintain scale
            margin = abs(signal_min) * 0.01  # 1% margin
            signal_min -= margin
            signal_max += margin
            # Don't normalize constant signals - this preserves test behavior
            return signal_min, signal_max, digital_min, digital_max, digital_max * 1.0

    # Ensure physical range is never too small
    physical_range = signal_max - signal_min
    if abs(physical_range) < np.finfo(float).eps * 1e3:
        # If range is effectively zero, create a minimal range
        # Scale it relative to the signal magnitude
        base = max(abs(signal_min), abs(signal_max), 1e-6)
        physical_range = base * 1e-6
        signal_max = signal_min + physical_range
        # Ensure we have a valid range for scaling
        if physical_range == 0:
            physical_range = 1e-6

    # For high dynamic range signals, preserve the original range
    # This is critical for maintaining the dynamic range in the exported file
    if (signal_max - signal_min) > 1e5 or (signal_max / max(abs(signal_min), 1e-10)) > 1e5:
        # High dynamic range detected - preserve it for BDF format
        if use_bdf:
            # For BDF, we can handle the full range directly
            # Just ensure the values fit within character limits
            signal_min, _ = _format_physical_value(signal_min, max_chars)
            signal_max, _ = _format_physical_value(signal_max, max_chars)

            digital_range = digital_max - digital_min
            physical_range = signal_max - signal_min

            # Calculate scaling factor to use full digital range
            scaling_factor = digital_range / physical_range

            return signal_min, signal_max, digital_min, digital_max, scaling_factor

    # Only normalize extreme values that would cause problems with EDF/BDF format
    # This preserves the original scaling for most signals while handling extreme cases
    if abs(signal_min) > 1e6 or abs(signal_max) > 1e6 or abs(signal_min) < 1e-6 or abs(signal_max) < 1e-6:
        # For extreme values, normalize to a reasonable range
        # But preserve the original ratio between min and max
        ratio = abs(signal_max / signal_min) if signal_min != 0 else 1.0

        if ratio > 1e6 and not use_bdf:  # Very large ratio, use a more balanced range for EDF only
            signal_min = -1.0
            signal_max = 1.0
        else:  # Preserve ratio but scale to reasonable values
            if abs(signal_min) > 1e6 or abs(signal_max) > 1e6:  # Too large
                scale_factor = max(abs(signal_min), abs(signal_max)) / 1000.0
                signal_min /= scale_factor
                signal_max /= scale_factor
            elif abs(signal_min) < 1e-6 or abs(signal_max) < 1e-6:  # Too small
                scale_factor = 1e-3 / max(abs(signal_min), abs(signal_max))
                signal_min *= scale_factor
                signal_max *= scale_factor

    # Format values to fit character limits
    signal_min, _ = _format_physical_value(signal_min, max_chars)
    signal_max, _ = _format_physical_value(signal_max, max_chars)

    digital_range = digital_max - digital_min
    physical_range = signal_max - signal_min

    # Calculate scaling factor
    # We use slightly less than the full range to prevent overflow at boundaries
    scaling_factor = (digital_range - 1) / physical_range

    return signal_min, signal_max, digital_min, digital_max, scaling_factor
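
To make the scaling concrete (a worked sketch with illustrative numbers), a signal spanning -5 to 5 mV exported as 16-bit EDF would get roughly the following factors:

# Illustrative values for a -5..5 mV signal in EDF (16-bit)
physical_min, physical_max = -5.0, 5.0
digital_min, digital_max = -32768, 32767
scaling_factor = ((digital_max - digital_min) - 1) / (physical_max - physical_min)
# -> 65534 / 10 = 6553.4 digital units per mV
resolution = 1 / scaling_factor   # ~0.000153 mV (~0.15 µV) per digital step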

_format_physical_value(value, max_chars)

Format a physical value to fit within EDF character limits.

Args: value: Physical value to format max_chars: Maximum number of characters allowed

Returns: tuple: (formatted_value, formatted_string)

Source code in emgio/exporters/edf.py
def _format_physical_value(value: float, max_chars: int) -> tuple:
    """
    Format a physical value to fit within EDF character limits.

    Args:
        value: Physical value to format
        max_chars: Maximum number of characters allowed

    Returns:
        tuple: (formatted_value, formatted_string)
    """
    # Handle NaN values
    if np.isnan(value):
        return 0.0, "0"

    # For zero or very small values, return as is
    if abs(value) < 1e-6:
        return 0.0, "0"

    # For values close to integers, handle as integers
    try:
        if abs((value - round(value)) / value) < 1e-6:
            value = int(round(value))  # Convert to integer
            scale = 1
            while True:
                value_str = str(value)
                if len(value_str) <= max_chars:
                    return value, value_str
                # Integer division to reduce digits
                scale *= 10
                value = value // 10
    except (ValueError, ZeroDivisionError):
        # Handle any other numerical issues
        return 0.0, "0"

    # For decimal numbers
    if abs(value) < 1:
        # Use scientific notation with reduced precision
        for precision in range(6, 0, -1):
            formatted = f"{value:.{precision}e}"
            if len(formatted) <= max_chars:
                return float(formatted), formatted
        # If still too long, return minimal representation
        return float(f"{value:.1e}"), f"{value:.1e}"
    else:
        # For larger decimals, try fixed point first
        scale = 1
        scaled_value = value
        while True:
            # Try without any changes
            formatted = f"{scaled_value}"
            if len(formatted) <= max_chars:
                return float(formatted), formatted
            # If that doesn't work, scale down
            scale *= 10
            scaled_value = value / scale
            # If we've scaled down a lot and still not fitting, switch to scientific
            if scale > 1e9:
                for precision in range(4, 0, -1):
                    formatted = f"{value:.{precision}e}"
                    if len(formatted) <= max_chars:
                        return float(formatted), formatted
                return float(f"{value:.1e}"), f"{value:.1e}"

analyze_signal(signal, method='svd', fft_noise_range=None, svd_rank=None)

Analyze signal characteristics including noise floor and dynamic range.

Args:
    signal: Input signal array
    method: Method for noise floor estimation: 'svd' (default), 'fft', or 'both'
    fft_noise_range: Optional tuple (min_freq, max_freq) for FFT method
    svd_rank: Optional rank cutoff for SVD method

Returns: dict: Analysis results including range, noise floor, and dynamic range in dB

Source code in emgio/analysis/signal.py
def analyze_signal(signal: np.ndarray, method: str = 'svd',
                   fft_noise_range: tuple = None, svd_rank: int = None) -> dict:
    """
    Analyze signal characteristics including noise floor and dynamic range.

    Args:
        signal: Input signal array
        method: Method for noise floor estimation: 'svd' (default), 'fft', or 'both'
        fft_noise_range: Optional tuple (min_freq, max_freq) for FFT method
        svd_rank: Optional rank cutoff for SVD method

    Returns:
        dict: Analysis results including range, noise floor, and dynamic range in dB
    """
    # Handle zero signal case
    if np.allclose(signal, 0):
        return {
            'range': 0.0,
            'noise_floor': np.finfo(float).eps,
            'dynamic_range_db': 0.0,
            'is_zero': True
        }

    # Remove DC offset for better analysis
    detrended = signal - np.mean(signal)

    # Calculate signal range (peak-to-peak)
    signal_range = np.max(detrended) - np.min(detrended)

    # Use both methods and take the minimum noise floor for better accuracy
    # This helps preserve high dynamic range signals
    if method.lower() == 'both':
        # Try SVD first, fall back to FFT if it fails
        try:
            noise_floor_svd = analyze_signal_svd(detrended, svd_rank)
            try:
                noise_floor_fft = analyze_signal_fft(detrended, fft_noise_range)
                noise_floor = min(noise_floor_svd, noise_floor_fft)
                method = 'both (min)'
            except Exception:
                # If FFT fails but SVD worked, use SVD result
                noise_floor = noise_floor_svd
                method = 'svd (fallback)'
        except Exception:
            # If SVD fails, try FFT
            try:
                noise_floor = analyze_signal_fft(detrended, fft_noise_range)
                method = 'fft (fallback)'
            except Exception:
                # If both methods fail, use a simple statistical approach
                noise_floor = np.std(np.diff(detrended)) / np.sqrt(2)
                method = 'statistical (fallback)'
    else:
        # Choose noise floor estimation method
        try:
            if method.lower() == 'svd':
                noise_floor = analyze_signal_svd(detrended, svd_rank)
            elif method.lower() == 'fft':
                noise_floor = analyze_signal_fft(detrended, fft_noise_range)
            else:
                raise ValueError(f"Unknown method: {method}. Use 'svd', 'fft', or 'both'.")
        except Exception:
            # Fallback to simple statistical approach if the chosen method fails
            noise_floor = np.std(np.diff(detrended)) / np.sqrt(2)
            method = f"{method} failed, using statistical (fallback)"

    # Ensure minimum noise floor
    noise_floor = max(noise_floor, np.finfo(float).eps)

    # Calculate dynamic range in dB
    dynamic_range_db = 20 * np.log10(signal_range / noise_floor)

    # Cap dynamic range at realistic values based on format capabilities
    # For high dynamic range test, we need to preserve at least 90dB
    # 16-bit ADC theoretical max is ~96dB, 24-bit is ~144dB
    # In practice, most signals don't exceed these values
    max_realistic_dr = 90  # Default for EDF format (16-bit)

    # For high dynamic range signals, allow up to 140dB (for BDF format)
    if dynamic_range_db > 90:
        max_realistic_dr = 140  # Maximum for BDF format (24-bit)

    if dynamic_range_db > max_realistic_dr:
        # Adjust noise floor to match the capped dynamic range
        noise_floor = signal_range / (10 ** (max_realistic_dr / 20))
        dynamic_range_db = max_realistic_dr

    # Calculate signal SNR
    signal_std = np.std(signal)
    snr_db = 20 * np.log10(signal_std / noise_floor)

    # Cap SNR at realistic values
    max_realistic_snr = 140  # Increased maximum realistic SNR in dB
    if snr_db > max_realistic_snr:
        snr_db = max_realistic_snr

    return {
        'range': signal_range,
        'noise_floor': noise_floor,
        'dynamic_range_db': dynamic_range_db,
        'snr_db': snr_db,
        'is_zero': False,
        'method': method
    }
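
For example (a sketch using numpy; the import path follows the source location shown above and the noise level is arbitrary), a noisy 50 Hz sine can be analysed as follows:

import numpy as np
from emgio.analysis.signal import analyze_signal

rng = np.random.default_rng(0)
t = np.linspace(0, 1, 2000, endpoint=False)
signal = np.sin(2 * np.pi * 50 * t) + 1e-4 * rng.standard_normal(t.size)

result = analyze_signal(signal, method='both')
print(result['dynamic_range_db'], result['snr_db'], result['method'])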

determine_format_suitability(signal, analysis)

Determine whether EDF or BDF format is suitable for the signal.

Args: signal: Input signal array analysis: Signal analysis results from analyze_signal()

Returns: tuple: (use_bdf, reason, snr_db)

Source code in emgio/analysis/signal.py
def determine_format_suitability(signal: np.ndarray, analysis: dict) -> tuple:
    """
    Determine whether EDF or BDF format is suitable for the signal.

    Args:
        signal: Input signal array
        analysis: Signal analysis results from analyze_signal()

    Returns:
        tuple: (use_bdf, reason, snr_db)
    """
    # Handle zero signal case
    if analysis.get('is_zero', False):
        return False, "Zero signal, using EDF format", 0.0

    # Theoretical format capabilities
    edf_dynamic_range = 90  # dB (16-bit) - slightly reduced from theoretical 96dB for safety
    bdf_dynamic_range = 140  # dB (24-bit) - slightly reduced from theoretical 144dB for safety
    safety_margin = 3  # dB - reduced to better preserve high dynamic range signals

    # Get signal characteristics
    signal_dr = analysis['dynamic_range_db']
    signal_snr = analysis.get('snr_db', 0)
    # signal_range = analysis['range']  # Not used for format selection

    # # Check amplitude first - if signal range is very large, use BDF
    # if signal_range > 1e5:  # Reduced threshold to catch more high-amplitude signals
    #     return True, f"Large amplitude signal ({signal_range:.1f}), using BDF", signal_snr

    # Then check dynamic range with safety margin
    if signal_dr <= (edf_dynamic_range - safety_margin):
        return False, f"EDF dynamic range ({edf_dynamic_range} dB) is sufficient", signal_snr
    elif signal_dr <= (bdf_dynamic_range - safety_margin):
        return True, f"Signal requires BDF format (DR: {signal_dr:.1f} dB)", signal_snr
    else:
        return True, f"Signal may require higher resolution than BDF (DR: {signal_dr:.1f} dB)", signal_snr

summarize_channels(channels, signals, analyses)

Generate a summary of channel characteristics grouped by type.

Args:
    channels: Dictionary of channel information
    signals: Dictionary of signal data
    analyses: Dictionary of signal analyses

Returns: str: Formatted summary string

Source code in emgio/exporters/edf.py
def summarize_channels(channels: dict, signals: dict, analyses: dict) -> str:
    """
    Generate a summary of channel characteristics grouped by type.

    Args:
        channels: Dictionary of channel information
        signals: Dictionary of signal data
        analyses: Dictionary of signal analyses

    Returns:
        str: Formatted summary string
    """
    # Group channels by type
    type_groups = {}
    for ch_name, ch_info in channels.items():
        ch_type = ch_info.get('channel_type', 'Unknown')
        if ch_type not in type_groups:
            type_groups[ch_type] = {
                'channels': [],
                'ranges': [],
                'dynamic_ranges': [],
                'snrs': [],
                'formats': [],
                'unit': ch_info.get('physical_dimension', 'Unknown')
            }
        type_groups[ch_type]['channels'].append(ch_name)

        analysis = analyses.get(ch_name, {})
        if not analysis.get('is_zero', False):
            type_groups[ch_type]['ranges'].append(analysis.get('range', 0))
            type_groups[ch_type]['dynamic_ranges'].append(analysis.get('dynamic_range_db', 0))
            type_groups[ch_type]['snrs'].append(analysis.get('snr_db', 0))
            type_groups[ch_type]['formats'].append('BDF' if analysis.get('use_bdf', False) else 'EDF')

    # Generate summary
    summary = []
    for ch_type, data in type_groups.items():
        ranges = np.array(data['ranges'])
        dynamic_ranges = np.array(data['dynamic_ranges'])
        snrs = np.array(data['snrs'])
        formats = data['formats']

        if len(ranges) > 0:
            summary.append(f"\nChannel Type: {ch_type} ({len(data['channels'])} channels)")
            summary.append(
                f"Range: {np.min(ranges):.2f} to {np.max(ranges):.2f} "
                f"(mean: {np.mean(ranges):.2f}) {data['unit']}")
            summary.append(
                f"Dynamic Range: {np.min(dynamic_ranges):.1f} to "
                f"{np.max(dynamic_ranges):.1f} (mean: {np.mean(dynamic_ranges):.1f}) dB")
            summary.append(
                f"SNR: {np.min(snrs):.1f} to {np.max(snrs):.1f} "
                f"(mean: {np.mean(snrs):.1f}) dB")

            edf_count = formats.count('EDF')
            bdf_count = formats.count('BDF')
            summary.append(f"Format: {edf_count} channels using EDF, {bdf_count} channels using BDF")
        else:
            summary.append(f"\nChannel Type: {ch_type} ({len(data['channels'])} channels)")
            summary.append("All channels contain zero signal")

    return "\n".join(summary)

Usage Example

from emgio import EMG

# Load data
emg = EMG.from_file('data.csv', importer='trigno')

# Export to EDF/BDF with automatic format selection
emg.to_edf('output')  # Will generate output.edf or output.bdf

# Force specific format
emg.to_edf('output_edf', format='edf')  # Forces 16-bit EDF
emg.to_edf('output_bdf', format='bdf')  # Forces 24-bit BDF

Automatic Format Selection

A key feature of EMGIO's exporter is its ability to automatically determine whether to use EDF (16-bit) or BDF (24-bit) format based on the dynamic range of the data:

# Control the analysis method for format selection
emg.to_edf('output', method='svd')  # Use SVD analysis only
emg.to_edf('output', method='fft')  # Use FFT analysis only 
emg.to_edf('output', method='both')  # Use both methods (default)

# Customize SVD parameters
emg.to_edf('output', method='svd', svd_rank=5)  # Manual rank cutoff

# Customize FFT parameters
emg.to_edf('output', 
           method='fft', 
           fft_noise_range=(0.1, 10))  # Manual frequency range for noise floor estimation

Parameters

The to_edf method accepts the following parameters:

  • filepath (str): Path for the output file (without extension)
  • format (str, optional): Specify the format to use ('auto', 'edf', or 'bdf'). Default is 'auto'.
  • method (str, optional): Method for format selection ('svd', 'fft', or 'both'). Default is 'both'.
  • svd_rank (int, optional): Rank cutoff for SVD analysis. Default is None (automatic).
  • fft_noise_range (tuple, optional): Frequency range (min, max) for noise floor estimation in FFT. Default is None (automatic).
  • physical_min (float, optional): Physical minimum value. Default is None (automatic).
  • physical_max (float, optional): Physical maximum value. Default is None (automatic).
  • overwrite (bool, optional): Whether to overwrite existing files. Default is False.
  • additional_info (dict, optional): Additional information to include in the EDF header.

Understanding Format Selection

The exporter uses two complementary approaches to determine the appropriate format:

1. SVD Analysis

Singular Value Decomposition (SVD) is used to:

  • Estimate the effective dimensionality of the data
  • Analyze the distribution of signal energy across components
  • Determine if the precision requirements can be satisfied by 16-bit representation

2. FFT Analysis

Fast Fourier Transform (FFT) analysis:

  • Examines the frequency domain representation of the data
  • Evaluates the noise floor and signal-to-noise ratio
  • Helps determine if 16-bit precision is sufficient or if 24-bit is needed
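
Whether 16-bit precision suffices ultimately follows from the theoretical dynamic range of each integer format, roughly 96 dB for 16-bit EDF and 144 dB for 24-bit BDF (a back-of-the-envelope sketch):

import math

# Theoretical dynamic range of an N-bit integer format: 20 * log10(2**N)
edf_dr = 20 * math.log10(2 ** 16)   # ~96.3 dB (EDF, 16-bit)
bdf_dr = 20 * math.log10(2 ** 24)   # ~144.5 dB (BDF, 24-bit)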

Output Files

When exporting, EMGIO generates the following files:

  1. Main data file: Either .edf or .bdf extension depending on the format selected
  2. Channels metadata file: A {filepath}.channels.tsv file with detailed channel information in BIDS-compatible format

Example channels.tsv file content:

name    type    units   sampling_frequency
EMG1    EMG     µV      2000
EMG2    EMG     µV      2000
ACC1    ACC     g       2000
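
Because the file is tab-separated, it can be inspected with pandas (a minimal sketch; the file name is illustrative):

import pandas as pd

channels = pd.read_csv('output.channels.tsv', sep='\t')
print(channels[['name', 'type', 'units', 'sampling_frequency']])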

Additional Features

  • Channel scaling: Signals are automatically scaled to maximize precision
  • Metadata preservation: Subject, recording, and other metadata are included in the EDF header
  • BIDS compatibility: The exporter follows BIDS conventions for metadata
  • Multi-channel support: Handles multiple channel types with appropriate units
  • Different sampling rates: Can handle channels with different sampling rates