Processing Data#

Overview#

This guide explains how to process and transform your data for the Delta Enigma project. It’s important to understand that the Yoda platform serves as a data storage and management solution, not a computational environment.

Key Principles#

Yoda Platform Limitations#

The Yoda platform does not provide computational resources for data processing, transformation, or analysis. Yoda is designed for:

  • Data storage and archival

  • Metadata management

  • Data sharing and collaboration

Processing Requirements#

All data cleaning, transformation, and processing must be performed outside of the Yoda environment using:

  • Local computing resources (your personal computer or workstation)

  • Institutional computing infrastructure (university clusters, high-performance computing facilities)

  • Cloud computing platforms (public clouds like AWS, Azure, Google Cloud, or private institutional clouds)

Data Processing Workflow#

1. Download Raw Data#

  • Access your raw data through the SURF Yoda portal

  • Download to your local or institutional computing environment

  • Ensure you have adequate storage space for both raw and processed data

2. Process Data Locally or in the Cloud#

Perform your data processing using appropriate tools and environments:

  • Data cleaning: Remove outliers, handle missing values, quality control

  • Data transformation: Format conversion, unit standardization, coordinate transformations

3. Prepare Refined Data for Publication#

Only upload refined but unaggregated data back to Yoda for publication:

  • Cleaned and quality-controlled datasets

  • Standardized formats and units

  • Well-documented processing steps

  • Preserved spatial and temporal resolution where scientifically relevant

Data Lifecycle Management#

Raw Data Handling#

The Data Governance Board is currently developing policies regarding raw data retention. Until these policies are finalized, we recommend:

  • Temporarily retain raw data in secure storage (institutional or personal backup systems)

  • Consult with your Data Steward before removing any raw data from Yoda

  • Document the processing workflow to ensure reproducibility

Getting Help#

For questions about:

Additional Resources#