PR types

New features

PR changes

Others

Describe

Add a 1F1B scheduler for pipeline parallelism, which interleaves the forward and backward phases to reduce memory consumption.

Two pipeline schedulers are now supported: F-then-B and 1F1B.
With the F-then-B scheduler, the forward pass runs for all microbatches, then the backward pass runs for all microbatches, and then the parameters are updated.

With the 1F1B scheduler, after a startup stage, the scheduler alternates between one forward pass and one backward pass, which may save GPU memory because fewer microbatches have their activations held at the same time.
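
The difference in ordering can be illustrated with a small, framework-independent sketch. This is not the implementation in this PR: the function names are hypothetical, and the startup length of `num_stages - stage_id - 1` forward passes is an assumption taken from the commonly described 1F1B convention. "F", "B", and "U" stand for forward, backward, and update.

```python
def f_then_b_schedule(num_microbatches):
    """All forwards, then all backwards, then one update."""
    forward = [("F", i) for i in range(num_microbatches)]
    backward = [("B", i) for i in range(num_microbatches)]
    return forward + backward + [("U", None)]


def one_f_one_b_schedule(num_microbatches, num_stages, stage_id):
    """Startup forwards, then alternate one forward / one backward."""
    # Assumed startup length: later stages need fewer startup forwards.
    startup = min(num_stages - stage_id - 1, num_microbatches)
    steps = [("F", i) for i in range(startup)]
    fwd, bwd = startup, 0
    # Steady state: one forward followed by one backward.
    while fwd < num_microbatches:
        steps.append(("F", fwd))
        fwd += 1
        steps.append(("B", bwd))
        bwd += 1
    # Drain the backward passes that are still pending.
    while bwd < num_microbatches:
        steps.append(("B", bwd))
        bwd += 1
    steps.append(("U", None))
    return steps


def peak_in_flight(steps):
    """Max number of microbatches whose activations are held at once."""
    live = peak = 0
    for kind, _ in steps:
        if kind == "F":
            live += 1
            peak = max(peak, live)
        elif kind == "B":
            live -= 1
    return peak


if __name__ == "__main__":
    print(peak_in_flight(f_then_b_schedule(4)))
    print(peak_in_flight(one_f_one_b_schedule(4, num_stages=2, stage_id=0)))
```

Under these assumptions, with 4 microbatches on a 2-stage pipeline, stage 0 holds at most 2 microbatches in flight with 1F1B, compared with 4 with F-then-B.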

How to use:

from paddle.distributed.fleet import DistributedStrategy

dist_strategy = DistributedStrategy()
dist_strategy.pipeline = True  # enable pipeline parallelism
dist_strategy.pipeline_configs = {
  'schedule_mode': '1F1B', # or 'F-then-B'
}

TODO:

  1. Split the program into a forward program, a backward program, and an update program.
  2. Check that all ops have the op_role attribute.