Overview
The Hyser (High densitY Surface Electromyogram Recordings) dataset is a comprehensive open-access collection of high-density surface EMG data from 20 subjects performing diverse hand and finger tasks. This dataset addresses critical gaps in existing sEMG datasets by providing cross-day recordings, random task combinations, and synchronized individual finger force measurements alongside 256-channel HD-sEMG signals.
Developed by researchers at Fudan University and collaborating institutions, Hyser supports three lines of neural interface research that earlier datasets could not: arbitrary switching between degrees of freedom (DoFs), cross-day robustness, and precise finger-level force control for dexterous prosthetic applications.
The dataset is particularly valuable for developing next-generation neural interfaces that can provide intuitive, proportional control of prosthetic hands with individual finger control, rather than just discrete gesture recognition.
Data Description
Contents
- HD-sEMG Data: 256-channel high-density surface EMG recorded at 2048 Hz using four 8×8 electrode arrays placed on forearm extensor and flexor muscles
- Force Measurements: Individual finger force recordings at 100 Hz for four of the five sub-datasets, enabling proportional control research
- Cross-Day Sessions: Each subject recorded twice on different days (3-25 day intervals) to evaluate temporal robustness
- Multiple Tasks: Five distinct experimental paradigms covering gesture recognition and force control applications
- Rich Annotations: Comprehensive labeling of gestures, force trajectories, and experimental conditions
Electrode Configuration
The 256 channels are arranged using four 8×8 electrode arrays (64 channels each):
- ED (Extensor-Distal): Distal end of extensor muscles
- EP (Extensor-Proximal): Proximal end of extensor muscles
- FD (Flexor-Distal): Distal end of flexor muscles
- FP (Flexor-Proximal): Proximal end of flexor muscles
Each electrode array uses 5 mm × 2.8 mm elliptical electrodes with 10 mm inter-electrode spacing, providing high spatial resolution for muscle activation mapping and motor unit decomposition.
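As a minimal sketch of how a flat channel index could be related to the four arrays, the Python snippet below assumes that channels are stored array by array in the order ED, EP, FD, FP, and row-major within each 8×8 grid. That ordering and the name `locate_channel` are assumptions made for illustration; check the dataset documentation before relying on them.

```python
# Hypothetical channel layout: 4 arrays x 64 channels, stored array by array.
# ARRAY_ORDER and the row-major grid order are assumptions, not dataset facts.
ARRAY_ORDER = ("ED", "EP", "FD", "FP")
GRID_ROWS, GRID_COLS = 8, 8
CHANNELS_PER_ARRAY = GRID_ROWS * GRID_COLS  # 64 channels per array


def locate_channel(channel):
    """Map a 0-based channel index (0..255) to (array name, row, col)."""
    total = CHANNELS_PER_ARRAY * len(ARRAY_ORDER)
    if not 0 <= channel < total:
        raise ValueError(f"channel must be in 0..{total - 1}")
    array_index, within_array = divmod(channel, CHANNELS_PER_ARRAY)
    row, col = divmod(within_array, GRID_COLS)
    return ARRAY_ORDER[array_index], row, col
```

Under these assumptions, `locate_channel(73)` falls in the second array (EP) at row 1, column 1.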
Five Sub-Datasets
1. Pattern Recognition (PR) Dataset
Purpose: Hand gesture classification research
Tasks: 34 commonly used hand gestures including individual finger extensions, wrist movements, and combined motions
Data: 204 dynamic tasks (1s transitions) + 68 maintenance tasks (4s holds) per subject per session
Applications: Prosthetic control via discrete gesture commands, human-computer interfaces
Figure: The complete set of 34 hand gestures included in the Pattern Recognition dataset, covering individual finger movements, wrist motions, and combined actions commonly used in daily activities.
2. Maximal Voluntary Contraction (MVC) Dataset
Purpose: Force normalization and maximum strength assessment
Tasks: MVC measurements for flexion and extension of each finger
Data: 2 trials × 5 fingers × 2 directions × 10s duration per subject per session
Applications: Normalizing force data, understanding individual strength capabilities
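The normalization use case can be sketched as follows. This example assumes the MVC reference for a finger and direction is the largest absolute force observed across its MVC trials; the dataset paper may define it differently, and the function names are hypothetical.

```python
def mvc_reference(mvc_trials):
    """Return the peak absolute force across MVC trials (assumed definition)."""
    peak = max(abs(sample) for trial in mvc_trials for sample in trial)
    if peak == 0:
        raise ValueError("MVC trials contain no non-zero force")
    return peak


def to_percent_mvc(force, mvc):
    """Express raw force samples as a percentage of the MVC reference."""
    return [100.0 * sample / mvc for sample in force]
```

With a reference of 10 force units, a raw sample of 3 units maps to 30% MVC, the target level used in the 1-DoF tracking tasks.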
3. One Degree-of-Freedom (1-DoF) Dataset
Purpose: Single-finger proportional control research
Tasks: Force tracking with individual fingers (30% MVC flexion to 30% MVC extension)
Data: 3 trials × 5 fingers × 25s duration with triangular force trajectories
Applications: Proportional control of individual prosthetic fingers
4. N Degrees-of-Freedom (N-DoF) Dataset
Purpose: Multi-finger coordination and control
Tasks: 15 different finger combinations with prescribed force trajectories
Data: 2 trials × 15 combinations × 25s duration, including both synchronized and opposing finger movements
Applications: Coordinated multi-finger prosthetic control, studying finger interaction effects
5. Random Task Dataset
Purpose: Realistic, unconstrained control scenarios
Tasks: Free-form finger contractions without prescribed trajectories
Data: 5 trials × 25s duration of spontaneous finger force combinations
Applications: Developing robust controllers for real-world prosthetic use
Special Research Application: Bracelet vs Sleeve EMG
Optimizing Wearable EMG Placement for Practical Neural Interfaces
One compelling research direction using the Hyser dataset is to compare the information content of wrist-based “bracelet” EMG configurations with that of traditional forearm “sleeve” arrangements. The comparison has significant implications for developing practical, everyday-wearable neural interfaces.
Research Question
How much neural control information is lost when using a compact wrist-mounted EMG bracelet compared to a full forearm sleeve, and what is the optimal electrode count for each configuration?
Methodology Using Hyser Data
The high spatial resolution of Hyser’s 256-channel arrays enables systematic comparison of different electrode subsets:
Bracelet Configuration:
- Select electrodes closest to the wrist from all four arrays
- Compare 8, 16, and 32 electrode subsets arranged circumferentially around the wrist
- Focus on distal muscle activation patterns
Sleeve Configuration:
- Use distributed electrodes across the entire forearm coverage area
- Match electrode counts (8, 16, 32) but with broader spatial sampling
- Capture both proximal and distal muscle activations
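The two configurations above can be sketched as electrode-selection functions. The layout here is hypothetical: it assumes each 8×8 array has rows running along the forearm with the last row nearest the wrist, and columns running around the circumference. The real orientation must be taken from the dataset documentation.

```python
ARRAYS = ("ED", "EP", "FD", "FP")
ROWS, COLS = 8, 8
DISTAL_ROW = ROWS - 1  # assumption: the last row sits closest to the wrist


def bracelet_subset(count):
    """Pick `count` electrodes from the wrist-side row of every array."""
    per_array, remainder = divmod(count, len(ARRAYS))
    if remainder or per_array < 1 or COLS % per_array:
        raise ValueError("count must be 4, 8, 16, or 32")
    step = COLS // per_array
    return [(a, DISTAL_ROW, c) for a in ARRAYS for c in range(0, COLS, step)]


def sleeve_subset(count):
    """Pick `count` electrodes on an evenly spaced sub-grid of every array."""
    per_array, remainder = divmod(count, len(ARRAYS))
    if remainder or per_array not in (1, 2, 4, 8, 16, 32, 64):
        raise ValueError("count must be 4 times a power of two, up to 256")
    # Grow a rows x cols sub-grid until it holds per_array electrodes.
    n_rows = n_cols = 1
    while n_rows * n_cols < per_array:
        if n_rows <= n_cols:
            n_rows *= 2
        else:
            n_cols *= 2
    row_step, col_step = ROWS // n_rows, COLS // n_cols
    return [(a, r, c) for a in ARRAYS
            for r in range(0, ROWS, row_step)
            for c in range(0, COLS, col_step)]
```

Both functions return `(array, row, col)` tuples with matched counts, so the same classifier or force estimator can be trained on each subset and the results compared directly.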
Expected Insights
- Information Loss Quantification: Measure degradation in gesture classification accuracy and force estimation precision
- Optimal Electrode Placement: Identify which wrist locations provide maximum information density
- Task-Dependent Performance: Some gestures may be more robust to wrist-only sensing than others
- Individual Differences: Subject-specific variations in optimal electrode placement
Practical Implications
This research directly addresses the usability vs. performance tradeoff in wearable neural interfaces:
- Usability: Wrist bracelets may be more convenient, less conspicuous, and easier to don/doff
- Performance: Full forearm coverage traditionally provides richer control signals
- Finding the Sweet Spot: Determine minimum electrode requirements for acceptable performance
The results could guide development of next-generation wearable devices that balance practicality with control fidelity, making neural interfaces more accessible for daily use.
Data Management and Access
Dataset Structure
The dataset is organized hierarchically by subject and session:
/dataset_name/
├── pr_dataset/ # Pattern recognition (37.1 GB)
├── mvc_dataset/ # Maximal voluntary contraction (7.8 GB)
├── 1dof_dataset/ # Single finger control (29.3 GB)
├── ndof_dataset/ # Multi-finger control (58.6 GB)
├── random_dataset/ # Free-form tasks (9.8 GB)
├── SHA256SUMS.txt # Data integrity verification
├── equipment_info.pdf # Technical specifications
└── readme.txt # Dataset documentation
Each subdirectory contains folders for individual subjects and sessions:
/pr_dataset/
├── subject01_session1/
├── subject01_session2/
├── subject02_session1/
└── ...
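A small helper can build session folder paths from the naming pattern shown above. This is a sketch: `session_dir` is a hypothetical name, the root folder is whatever the download was extracted to, and the files inside each session folder are not covered here.

```python
from pathlib import PurePosixPath

SUB_DATASETS = ("pr", "mvc", "1dof", "ndof", "random")


def session_dir(root, sub_dataset, subject, session):
    """Build the folder path for one subject and session of a sub-dataset."""
    if sub_dataset not in SUB_DATASETS:
        raise ValueError(f"sub_dataset must be one of {SUB_DATASETS}")
    if not 1 <= subject <= 20:
        raise ValueError("subject must be in 1..20")
    if session not in (1, 2):
        raise ValueError("session must be 1 or 2")
    folder = f"subject{subject:02d}_session{session}"
    return PurePosixPath(root) / f"{sub_dataset}_dataset" / folder
```

For example, `session_dir("/dataset_name", "pr", 1, 2)` yields `/dataset_name/pr_dataset/subject01_session2`.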
File Formats
- EMG Data: WFDB format (.dat + .hea files) for efficient storage and MATLAB compatibility
- Force Data: WFDB format with synchronized timestamps
- Labels: Comma-separated text files for gesture classifications
- Preprocessing: Both raw and preprocessed versions provided
Getting Started
The dataset includes a comprehensive MATLAB toolbox with example scripts:
- demo_pr.m: Pattern recognition analysis
- demo_1dof.m: Single finger force estimation
- demo_ndof.m: Multi-finger coordination analysis
- main_decomposition.m: Motor unit decomposition via ICA
Access Information
Download Locations
- Primary Source: PhysioNet - DOI: 10.13026/ym7v-bh53
- Toolbox: GitHub Repository - MATLAB analysis functions and examples
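After downloading, file integrity can be checked against SHA256SUMS.txt. The sketch below assumes that file follows the usual `sha256sum` layout (a hex digest, whitespace, then a path relative to the dataset root); the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large recordings never load fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def parse_sums(text):
    """Parse sha256sum-style lines into {relative path: lowercase hex digest}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        expected, name = line.strip().split(maxsplit=1)
        entries[name.lstrip("*")] = expected.lower()  # "*" marks binary mode
    return entries


def find_mismatches(root, sums_text):
    """Return the listed paths that are missing or whose digest differs."""
    root = Path(root)
    bad = []
    for name, expected in parse_sums(sums_text).items():
        target = root / name
        if not target.is_file() or sha256_of(target) != expected:
            bad.append(name)
    return bad
```

An empty list from `find_mismatches` means every listed file is present and matches its recorded digest.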
Dataset Statistics
- Total Size: 142.6 GB across all sub-datasets
- Subjects: 20 healthy volunteers (8 female, 12 male, ages 21-34)
- Sessions: 2 per subject on different days (3-25 day intervals)
- Sampling Rates: 2048 Hz (EMG), 100 Hz (force)
- Total Recording Time: ~33 hours of synchronized EMG and force data
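Because 2048 Hz is not an integer multiple of 100 Hz, each force sample spans 20.48 EMG samples on average. The sketch below shows one way to pair the two streams with integer arithmetic. It assumes both recordings start at the same instant and that force is stored at its native 100 Hz; `emg_span` is a hypothetical helper.

```python
EMG_FS = 2048   # EMG sampling rate in Hz
FORCE_FS = 100  # force sampling rate in Hz


def emg_span(force_index):
    """Half-open EMG sample range [start, stop) covering one force sample period."""
    if force_index < 0:
        raise ValueError("force_index must be non-negative")
    start = force_index * EMG_FS // FORCE_FS
    stop = (force_index + 1) * EMG_FS // FORCE_FS
    return start, stop
```

Consecutive spans tile the EMG stream without gaps or overlap, with lengths alternating between 20 and 21 samples, so a 25 s trial (2500 force samples) covers exactly 51200 EMG samples.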
License and Usage
- License: Open Data Commons Attribution License v1.0
- Commercial Use: Permitted with attribution
- Citation Required: Please cite the original dataset paper when using this data
Performance Benchmarks
The dataset paper provides baseline performance metrics for common neural interface tasks:
Gesture Recognition Results
- LDA-based Method: 96.86% accuracy (dynamic tasks), 93.80% accuracy (maintenance tasks)
- CNN-based Method: 88.96% accuracy (dynamic tasks), 89.84% accuracy (maintenance tasks)
- 34-class Classification: Both methods score well above the 2.94% chance level (1 in 34)
Force Estimation Results
- Average RMSE: 18.28 ± 5.82% MVC across all subjects and fingers
- Correlation: 0.8611 ± 0.0358 between estimated and actual forces
- Individual Finger Performance: Consistent across thumb, index, middle, ring, and little fingers
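For reference, the two force estimation metrics above can be computed as in the following sketch. It uses the standard definitions of RMSE and Pearson correlation on forces already expressed in % MVC; the paper's exact evaluation protocol (windowing, averaging across fingers and subjects) is not reproduced here.

```python
import math


def rmse(estimated, actual):
    """Root-mean-square error between two equal-length force sequences."""
    if len(estimated) != len(actual) or not actual:
        raise ValueError("sequences must be non-empty and of equal length")
    squared = sum((e - a) ** 2 for e, a in zip(estimated, actual))
    return math.sqrt(squared / len(actual))


def pearson_r(estimated, actual):
    """Pearson correlation coefficient between estimated and actual force."""
    if len(estimated) != len(actual) or len(actual) < 2:
        raise ValueError("need two equal-length sequences of at least 2 samples")
    n = len(actual)
    mean_e = sum(estimated) / n
    mean_a = sum(actual) / n
    cov = sum((e - mean_e) * (a - mean_a) for e, a in zip(estimated, actual))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_a = sum((a - mean_a) ** 2 for a in actual)
    if var_e == 0 or var_a == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_e * var_a)
```

RMSE captures absolute error in force level, while correlation captures how well the estimate follows the shape of the trajectory, so the two are usually reported together.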
These benchmarks provide reference points for evaluating new algorithms developed using the dataset.
Future Research Directions
The Hyser dataset enables investigation of numerous important questions in neural interface research:
Cross-Day Robustness: How do EMG-based controllers degrade over time? What adaptation strategies work best?
Individual Differences: Can we develop personalized control algorithms that account for anatomical and physiological variations?
Motor Unit Analysis: What insights can motor unit decomposition provide for prosthetic control?
Compressed Sensing: How can we reduce the computational burden of 256-channel processing for real-time applications?
Transfer Learning: Can models trained on one population generalize to new users with minimal calibration?
References
If you use this dataset, please cite the following publications:
- Hyser Dataset Paper: Jiang et al., IEEE TNSRE 2021 - “Open Access Dataset, Toolbox and Benchmark Processing Results of High-Density Surface Electromyogram Recordings”
- PhysioNet Entry: DOI: 10.13026/ym7v-bh53
Related Work
- Neural Interface Reviews: Farina et al., Nature Biomedical Engineering 2017
- HD-sEMG Decomposition: Chen & Zhou, IEEE TNSRE 2016
- Cross-Day EMG Challenges: Phinyomark et al., Expert Systems with Applications 2013
© 2025 Seyed Yahya Shirazi