Processing Data
Overview
This guide explains how to process and transform your data for the Delta Enigma project. Note that the Yoda platform is a data storage and management solution, not a computational environment.
Key Principles
Yoda Platform Limitations
The Yoda platform does not provide computational resources for data processing, transformation, or analysis. Yoda is designed for:
- Data storage and archival 
- Metadata management 
- Data sharing and collaboration 
Processing Requirements
All data cleaning, transformation, and processing must be performed outside of the Yoda environment using:
- Local computing resources (your personal computer or workstation) 
- Institutional computing infrastructure (university clusters, high-performance computing facilities) 
- Cloud computing platforms (public clouds like AWS, Azure, Google Cloud, or private institutional clouds) 
Data Processing Workflow
1. Download Raw Data
- Access your raw data through the SURF Yoda portal 
- Download to your local or institutional computing environment 
- Ensure you have adequate storage space for both raw and processed data 
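The storage check in the last point can be automated before a large download. The following is a minimal sketch using only the Python standard library; the function names, the `processed_ratio` parameter (how large you expect the processed data to be relative to the raw data), and the 10% safety margin are illustrative assumptions, not project requirements.

```python
import shutil
from pathlib import Path


def required_bytes(raw_bytes: int, processed_ratio: float = 1.0, margin: float = 0.1) -> int:
    """Estimate the space needed for raw plus processed data.

    processed_ratio and margin are assumed defaults for illustration only.
    """
    total = raw_bytes * (1.0 + processed_ratio)
    return int(total * (1.0 + margin))


def has_enough_space(target_dir: Path, raw_bytes: int, processed_ratio: float = 1.0) -> bool:
    """Return True if target_dir's file system has room for raw and processed data."""
    free = shutil.disk_usage(target_dir).free
    return free >= required_bytes(raw_bytes, processed_ratio)


if __name__ == "__main__":
    # Example: a 50 GB raw dataset downloaded into the current directory.
    raw_size = 50 * 1024**3
    print("Enough space:", has_enough_space(Path("."), raw_size))
```

Adjust `processed_ratio` to your own workflow; format conversions in particular can make processed data considerably larger or smaller than the raw input.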
2. Process Data Locally or in the Cloud
Perform your data processing using appropriate tools and environments:
- Data cleaning: Remove outliers, handle missing values, and perform quality control 
- Data transformation: Format conversion, unit standardization, coordinate transformations 
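As an illustration of these two steps, the sketch below cleans a short series of hypothetical temperature readings and standardizes their unit. It is only one possible approach: the readings, the function names, and the outlier rule (a modified z-score based on the median absolute deviation, with a threshold of 3.5) are assumptions chosen for the example. The appropriate method depends on your data and should be agreed with your Data Steward.

```python
import statistics
from typing import Optional


def drop_missing(values: list[Optional[float]]) -> list[float]:
    """Handle missing values by dropping them (one of several possible strategies)."""
    return [v for v in values if v is not None]


def remove_outliers(values: list[float], threshold: float = 3.5) -> list[float]:
    """Keep values whose MAD-based modified z-score is within the threshold."""
    if not values:
        return []
    median = statistics.median(values)
    mad = statistics.median([abs(v - median) for v in values])
    if mad == 0:
        # All deviations are zero for at least half the data; nothing to scale by.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - median) / mad <= threshold]


def fahrenheit_to_celsius(value: float) -> float:
    """Unit standardization: convert degrees Fahrenheit to degrees Celsius."""
    return (value - 32.0) * 5.0 / 9.0


if __name__ == "__main__":
    # Hypothetical raw readings in Fahrenheit; None marks a missing value.
    raw = [68.0, None, 70.0, 69.0, 71.0, 500.0]
    cleaned = remove_outliers(drop_missing(raw))
    print([round(fahrenheit_to_celsius(v), 2) for v in cleaned])
```

Whichever rule you choose, record it (including any thresholds) so that the cleaning step can be reproduced from the raw data.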
3. Prepare Refined Data for Publication
Upload only refined but unaggregated data back to Yoda for publication:
- Cleaned and quality-controlled datasets 
- Standardized formats and units 
- Well-documented processing steps 
- Preserved spatial and temporal resolution where scientifically relevant 
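One lightweight way to keep processing steps well documented is to store a small machine-readable record next to each refined dataset. The sketch below builds such a record as JSON. The field names and file names are hypothetical and do not represent a Yoda or project metadata schema; check with your Data Steward which format is expected.

```python
import json
from datetime import datetime, timezone


def build_processing_record(source_file: str, output_file: str, steps: list[str]) -> dict:
    """Describe how a refined file was derived from a raw file.

    The field names are illustrative, not a prescribed schema.
    """
    return {
        "source_file": source_file,
        "output_file": output_file,
        "steps": list(steps),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def record_to_json(record: dict) -> str:
    """Serialize the record so it can be saved alongside the dataset."""
    return json.dumps(record, indent=2)


if __name__ == "__main__":
    record = build_processing_record(
        "raw_readings.csv",      # hypothetical raw file
        "refined_readings.csv",  # hypothetical refined file
        ["dropped missing values", "removed outliers", "converted units to Celsius"],
    )
    print(record_to_json(record))
```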
Data Lifecycle Management
Raw Data Handling
The Data Governance Board is currently developing policies regarding raw data retention. Until these policies are finalized, we recommend that you:
- Temporarily retain raw data in secure storage (institutional or personal backup systems) 
- Consult with your Data Steward before removing any raw data from Yoda 
- Document the processing workflow to ensure reproducibility 
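To support these recommendations, it can help to record a checksum for each raw file you retain. A checksum lets you confirm later that a backup copy is still identical to the file the processing started from. The following is a minimal sketch using SHA-256 from the Python standard library; the function name and the example file name are hypothetical, and the choice of hash algorithm is an assumption rather than a project policy.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1024 * 1024) -> str:
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in fixed-size chunks so large raw files do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    raw_file = Path("raw_readings.csv")  # hypothetical file name
    if raw_file.exists():
        print(raw_file.name, sha256_of_file(raw_file))
```

Storing the digest together with your processing documentation makes it possible to check a retained copy before rerunning the workflow.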
Getting Help
For questions about:
- Data processing best practices: Contact your Data Steward 
- Computing infrastructure: Contact your institutional IT support 
- Yoda data management: See Uploading Data and Adding Metadata 
