Class: CMORizer::StepsChain

Inherits:
Object
Defined in:
lib/steps_chain.rb

Overview

The StepsChain class manages the sequence of processing steps for converting FESOM variable data to CMOR-compliant format. It initializes the necessary steps, handles the execution of these steps, and manages the metadata required for the process.

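Instance Attribute Summary

  • #input_frequency_name ⇒ Object (readonly)

  • #input_variable_name ⇒ Object (readonly)

Instance Method Summary

  • #execute(fesom_files, experiment, data_request, grid_description_file, version_date) ⇒ Object

  • #initialize(default_step_classes, fesom_variable_description, cmor_variable_description, &block) ⇒ StepsChain (constructor)

  • #method_missing(method_sym, *args, &block) ⇒ Object

  • #to_s ⇒ String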

Constructor Details

#initialize(default_step_classes, fesom_variable_description, cmor_variable_description, &block) ⇒ StepsChain

Initializes a StepsChain instance.

This constructor sets up the processing chain for converting FESOM variable data to CMOR-compliant format. It parses the provided descriptions for FESOM and CMOR variables, initializes the steps from a given block or default step classes, and prepares the chain for execution.

Parameters:

  • default_step_classes (Array<Class>)

    Step classes used when no block is given or the block adds no steps.

  • fesom_variable_description (String)

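    Description of the FESOM variable in the format “variable_name_frequency”. The string is split on underscores, and the first two parts become the variable name and the frequency.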

  • cmor_variable_description (String)

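    Description of the CMOR variable in the format “variable_id_table_id”. The string is split on underscores, and the first two parts become the variable ID and the table ID.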

  • block (Proc)

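    An optional block that customizes the processing steps. It is evaluated in the context of the new instance (via instance_eval), so each bare method call inside it adds a step.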

# File 'lib/steps_chain.rb', line 18

def initialize(default_step_classes, fesom_variable_description, cmor_variable_description, &block)
  @input_variable_name, @input_frequency_name = fesom_variable_description.split('_')
  @cmor_variable_id, @cmor_table_id = cmor_variable_description.split('_')

  @step_classes = []
  @eval_mode = true
  instance_eval(&block) if block_given?
  @eval_mode = false
        
  @step_classes = default_step_classes if @step_classes.empty?
end
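
The two description strings are parsed with a plain split on underscores. The following is a minimal sketch of that parsing in isolation; the example strings ("thetao_monthly", "thetao_Omon", "sea_ice_daily") are hypothetical and were chosen only to illustrate the behavior.

```ruby
# Hypothetical description strings; real values depend on the experiment setup.
fesom_variable_description = "thetao_monthly"
cmor_variable_description = "thetao_Omon"

# Same destructuring as in the constructor: first part, second part.
input_variable_name, input_frequency_name = fesom_variable_description.split('_')
cmor_variable_id, cmor_table_id = cmor_variable_description.split('_')

puts "#{input_variable_name} / #{input_frequency_name}"
puts "#{cmor_variable_id} / #{cmor_table_id}"

# A name that itself contains an underscore yields more than two parts,
# and the destructuring assignment silently keeps only the first two.
name, frequency = "sea_ice_daily".split('_')
puts "#{name} / #{frequency}"
```

The last case suggests that names containing underscores would not survive this parsing intact, so descriptions probably need to contain exactly one underscore.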

Dynamic Method Handling

This class handles dynamic methods through the method_missing method.

#method_missing(method_sym, *args, &block) ⇒ Object

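Handles undefined methods to add steps to the chain. While the constructor block is being evaluated, every undefined method call is treated as a step name: the method name is upcased and passed to add_step. At any other time the call is forwarded to super.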

Parameters:

  • method_sym (Symbol)

    The name of the undefined method.

  • args (Array)

    The arguments passed to the undefined method (not used when adding a step).

  • block (Proc)

    An optional block passed to the undefined method (not used when adding a step).

# File 'lib/steps_chain.rb', line 167

def method_missing(method_sym, *args, &block)
  return super unless @eval_mode
  # we assume every unknown method designates a sub-task
  sym = method_sym.upcase
  add_step sym
end
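
The interplay between instance_eval in the constructor and method_missing can be sketched with a small stand-in class. MiniChain, its add_step, and the step names below are hypothetical: the real add_step is not shown on this page, so this version simply records the upcased symbol. The respond_to_missing? override is an addition that keeps respond_to? consistent with method_missing; it is not part of the source shown above.

```ruby
# Hypothetical stand-in for the step-collecting DSL; it mirrors only the
# control flow of the constructor and method_missing shown above.
class MiniChain
  attr_reader :step_names

  def initialize(default_step_names, &block)
    @step_names = []
    @eval_mode = true                       # unknown methods now count as steps
    instance_eval(&block) if block_given?
    @eval_mode = false                      # back to normal method lookup
    @step_names = default_step_names if @step_names.empty?
  end

  def method_missing(method_sym, *args, &block)
    return super unless @eval_mode
    add_step method_sym.upcase
  end

  # Assumption: not in the original source; added so respond_to? agrees
  # with what method_missing accepts.
  def respond_to_missing?(method_sym, include_private = false)
    @eval_mode || super
  end

  private

  # Assumption: the real add_step is not shown; this one only records the name.
  def add_step(sym)
    @step_names << sym
  end
end

custom = MiniChain.new([:DEFAULT_STEP]) do
  merge_files
  unit_conversion
end
fallback = MiniChain.new([:DEFAULT_STEP])

p custom.step_names
p fallback.step_names
```

Because @eval_mode is reset after the block has run, a misspelled method called on a finished chain still raises NoMethodError instead of silently adding a step.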

Instance Attribute Details

#input_frequency_name ⇒ Object (readonly)

Returns the value of attribute input_frequency_name.


# File 'lib/steps_chain.rb', line 6

def input_frequency_name
  @input_frequency_name
end

#input_variable_name ⇒ Object (readonly)

Returns the value of attribute input_variable_name.


# File 'lib/steps_chain.rb', line 6

def input_variable_name
  @input_variable_name
end

Instance Method Details

#execute(fesom_files, experiment, data_request, grid_description_file, version_date) ⇒ Object

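Executes the processing steps on the provided FESOM files. Returns immediately if the chain has no step classes. New step instances are created on every call so that the method is thread-safe. Steps up to and including the last one whose result file already exists are skipped, which lets an interrupted run resume, and once the final result exists the intermediate results are removed.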

Parameters:

  • fesom_files (Array<File>)

    List of FESOM files to process. Each entry must respond to path, year, and unit.

  • experiment (Experiment)

    The experiment metadata.

  • data_request (DataRequest)

    The data request metadata.

  • grid_description_file (String)

    The grid description file path.

  • version_date (String)

    The version date for the output files.

# File 'lib/steps_chain.rb', line 45

def execute(fesom_files, experiment, data_request, grid_description_file, version_date)      
  return if @step_classes.empty?
  
  # create new step instances here for each call to execute to have this method thread safe
  steps = []
  next_step = nil
  @step_classes.reverse_each do |cls|
    next_step = cls.new(next_step)
    steps << next_step
  end
  steps.reverse!
  unless steps.empty?
    steps[0].forbid_inplace = true # do not modify the original input files
    # filename prefix, has to be sufficient to avoid naming collisions of our files
    steps[0].initial_prefix = "_#{@input_variable_name}_#{@input_frequency_name}--#{@cmor_variable_id}_#{@cmor_table_id}_#{fesom_files.first.year}-#{fesom_files.last.year}"
  end

  # offer info about the current experiment and variable to all step objects
  data_request_variable = data_request.find_variable_id_in_table_id(@cmor_variable_id, @cmor_table_id) # the variable from the data request might have a different frequency than the input variable
  raise "data request does not contain variable #{@cmor_variable_id} in table #{@cmor_table_id}" unless data_request_variable
  cmor_frequency_name = data_request_variable.frequency_in_table(@cmor_table_id)
  global_attributes = create_global_attributes(experiment: experiment,
                                      first_file_year: fesom_files.first.year,
                                      last_file_year: fesom_files.last.year,
                                      variable_id: data_request_variable.variable_id,
                                      frequency: cmor_frequency_name,
                                      table_id: @cmor_table_id,
                                      realms: data_request_variable.realms,
                                      version_date: version_date)
  
  steps.each {|s| s.set_info(outdir: experiment.outdir,
                              grid_description_file: grid_description_file,
                              global_attributes: global_attributes,
                              fesom_variable_name: @input_variable_name,
                              fesom_variable_frequency: @input_frequency_name,
                              fesom_unit: fesom_files.first.unit,
                              out_unit: data_request_variable.unit,
                              variable_id: data_request_variable.variable_id,
                              description: data_request_variable.description,
                              standard_name: data_request_variable.standard_name,
                              out_cell_methods: data_request_variable.cell_methods_in_table(@cmor_table_id),
                              out_cell_measures: data_request_variable.cell_measures_in_table(@cmor_table_id))}

  # fill the first step with all the passed files without executing (i.e. dry run)
  # this will set the resultpath for each step
  fesom_files.each do |f|
    steps.first.add_input(f.path, [f.year], fesom_files.size, false)
  end
  
  # check from which resultpath we can resume this StepsChain (i.e. from the last resultpath for which a file exists)
  last_existing_index = -1
  steps.each_with_index do |step,i|
    last_existing_index = i if File.exist?(step.resultpath)
  end
  # set all steps which result in the last file we have available to skip execution
  steps.each_with_index do |step,i|
    step.needs_to_run = (last_existing_index < i)
  end     
  
  # fill the steps to execute from first to last step
  fesom_files.each do |f|
    steps.first.add_input(f.path, [f.year], fesom_files.size, true)
  end

  # remove all step results except the last one, we did set steps[0].forbid_inplace = true, so the first step has created a copy of the original input
  if File.exist?(steps.last.resultpath)
    steps[0..-2].each do |s|
      FileUtils.rm(s.resultpath) if File.exist?(s.resultpath) # if the step processes all files inplace, the resultpath from the previous step has been renamed and does not exist anymore
    end
  end
end
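
Two parts of execute are easy to misread: the steps are instantiated back to front so that each one receives its successor, and the resume logic marks every step up to the last existing result as not needing to run. The sketch below reproduces both under simplifying assumptions. SketchStep and the file names are hypothetical, the result path is passed to the constructor here (in the source it is set later via add_input), and an array lookup replaces File.exist? so the example does not touch the file system.

```ruby
# Hypothetical step class with only the attributes needed for this sketch.
class SketchStep
  attr_reader :next_step, :resultpath
  attr_accessor :needs_to_run

  def initialize(next_step, resultpath)
    @next_step = next_step
    @resultpath = resultpath
  end
end

# Hypothetical result paths in execution order; pretend the first two exist.
resultpaths = ["a.nc", "b.nc", "c.nc", "d.nc"]
existing = ["a.nc", "b.nc"]

# Build back to front so every step already knows its successor.
steps = []
next_step = nil
resultpaths.reverse_each do |path|
  next_step = SketchStep.new(next_step, path)
  steps << next_step
end
steps.reverse! # restore execution order

# Resume point: index of the last step whose result is already available.
last_existing_index = -1
steps.each_with_index do |step, i|
  last_existing_index = i if existing.include?(step.resultpath)
end

# Only the steps after the resume point have to run.
steps.each_with_index do |step, i|
  step.needs_to_run = (last_existing_index < i)
end

steps.each { |s| puts "#{s.resultpath}: needs_to_run=#{s.needs_to_run}" }
```

Note that the resume index is taken from the last existing result, not the first missing one, so a gap in the earlier results does not cause those earlier steps to run again.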

#to_s ⇒ String

Returns a string representation of the steps chain.

Returns:

  • (String)

    A string in the format “input_variable_input_frequency ==> cmor_variable_id_cmor_table_id”.

# File 'lib/steps_chain.rb', line 33

def to_s
  "#{@input_variable_name}_#{@input_frequency_name} ==> #{@cmor_variable_id}_#{@cmor_table_id}"
end