

2021-10-23 15:54
jjfeing

MindSpore 1.5.0

MindSpore 1.5.0 Release Notes

Major Features and Improvements

New Models

  • [STABLE] Add CV model on Ascend: Fast-SCNN
  • [BETA] Add CV models on Ascend: midas_V2, attgan, FairMOT, CenterNet_resnet101, SEResNext, YOLOV3-tiny, RetinaFace
  • [STABLE] Add CV models on GPU: ssd_mobilenetv1_fpn, shufflenetv1, tinyDarkNet, CNN-CTC, unet++, DeepText, SqueezeNet
  • [STABLE] Add NLP models on GPU: GRU, GNMT2, Bert-Squad
  • [STABLE] Add recommendation models on GPU: NCF
  • [BETA] Add CV models on GPU: FaceAttribute, FaceDetection, FaceRecognition, SENet
  • [BETA] Add Audio models on GPU: DeepSpeech2
  • [STABLE] model_zoo has been separated into an individual repository, models

FrontEnd

  • [STABLE] Support while, break, and continue statements in training networks in GRAPH_MODE.
  • [BETA] Support exporting a MindIR file after model training on the cloud side and evaluating on the edge side by importing the MindIR file.
  • [STABLE] Support forward mode auto-diff interface Jvp(Jacobian-Vector-Product).
  • [STABLE] Support backward mode auto-diff interface Vjp(Vector-Jacobian-Product).
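
The difference between the two auto-diff modes can be illustrated with a small, framework-free sketch. It is plain Python rather than the MindSpore Jvp/Vjp interface, and the function f, its hand-written Jacobian, and the helper names jvp and vjp are assumptions made for illustration only.

```python
# Conceptual sketch only: plain Python, not the MindSpore Jvp/Vjp interface.

def f(x):
    """Example function R^2 -> R^2: f(x0, x1) = (x0 * x1, x0 + x1)."""
    x0, x1 = x
    return [x0 * x1, x0 + x1]

def jacobian(x):
    """Hand-written Jacobian of f at x (rows = outputs, columns = inputs)."""
    x0, x1 = x
    return [[x1, x0],
            [1.0, 1.0]]

def jvp(x, v):
    """Forward mode: J(x) @ v, the directional derivative of f along v."""
    J = jacobian(x)
    return [sum(J[i][j] * v[j] for j in range(len(v))) for i in range(len(J))]

def vjp(x, u):
    """Backward mode: u^T @ J(x), pulling an output-space vector back to the inputs."""
    J = jacobian(x)
    return [sum(u[i] * J[i][j] for i in range(len(u))) for j in range(len(J[0]))]

x = [2.0, 3.0]
jvp_result = jvp(x, [1.0, 0.0])  # selects the first column of the Jacobian
vjp_result = vjp(x, [1.0, 0.0])  # selects the first row of the Jacobian
```

A real auto-diff system computes these products without materializing the Jacobian; the explicit matrix here only makes the two contractions visible.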

Auto Parallel

  • [STABLE] Support distributed pipeline inference.
  • [STABLE] Add implementation of the sparse attention and its distributed operator.
  • [STABLE] Add distributed operator implementations of Conv2d/Conv2dTranspose/Conv2dBackpropInput/Maxpool/Avgpool/Batchnorm/Gatherd.
  • [STABLE] Support configuring the dataset strategy in distributed training and inference modes.
  • [STABLE] Add a high-level API for the Transformer module.

Executor

  • [STABLE] Support AlltoAll operator.
  • [STABLE] Optimize the CPU operator Adam, improving its performance by 50%.
  • [BETA] Support the Adam offload feature, reducing the static memory usage of the Pangu large model by 50%.
  • [STABLE] MindSpore Ascend backend supports configuring the cache path for operator generation and loading.
  • [STABLE] MindSpore Ascend backend supports lazy build in PyNative mode, improving compilation performance by 10 times.
  • [STABLE] The function or Cell decorated by ms_function supports gradient calculation in PyNative mode.
  • [STABLE] The outermost network supports parameters of non-tensor types in PyNative mode.

DataSet

  • [BETA] Add a new method to class Model to support automatic data preprocessing in Ascend 310 inference scenarios.
  • [STABLE] Add a new drawing tool to visualize detection/segmentation datasets.
  • [STABLE] Support a new tensor operation named ConvertColor for color space transformation of images.
  • [STABLE] Enhance the following tensor operations to handle multiple columns simultaneously: RandomCrop, RandomHorizontalFlip, RandomResize, RandomResizedCrop, RandomVerticalFlip.
  • [STABLE] Support electromagnetic simulation dataset loading and data augmentation.
  • [STABLE] Optimize the error logs of Dataset to make them more user-friendly.
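
Handling multiple columns simultaneously matters when, for example, an image and its segmentation mask must receive the same random transform. The sketch below shows the idea in plain Python; it is not the MindSpore dataset API, and the helper names hflip and random_horizontal_flip are hypothetical.

```python
import random

def hflip(rows):
    """Mirror a 2-D list left to right."""
    return [list(reversed(row)) for row in rows]

def random_horizontal_flip(columns, prob=0.5, rng=random):
    """Draw the random decision once and apply it to every column together."""
    if rng.random() < prob:
        return [hflip(col) for col in columns]
    # No flip: return copies so callers never alias the inputs.
    return [[list(row) for row in col] for col in columns]

image = [[1, 2, 3],
         [4, 5, 6]]
mask = [[0, 0, 1],
        [0, 1, 1]]

rng = random.Random(0)  # seeded generator for reproducibility
flipped_image, flipped_mask = random_horizontal_flip([image, mask], prob=1.0, rng=rng)
kept_image, kept_mask = random_horizontal_flip([image, mask], prob=0.0, rng=rng)
```

Because the random draw happens once per sample rather than once per column, the image and the mask stay aligned after augmentation.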

Federated Learning

Running Data Recorder

  • [STABLE] RDR saves collected data files in directories named by Rank ID in distributed training on Ascend, GPU and CPU.

GraphKernel Fusion

API Change

Backwards Incompatible Change

Python API
New Recomputation Configuration for AutoParallel and SemiAutoParallel Scenarios

Configure the recomputation of the communication operations generated by model parallelism and optimizer parallelism to save device memory.
Users can pass mp_comm_recompute and parallel_optimizer_comm_recompute to enable the recomputation of these communication operations.

Bug fixes

FrontEnd

Executor

Dataset

MindSpore Lite

Major Features and Improvements

Converter and runtime

  1. Optimize TDNN-like streaming models by reusing the result of the last inference.
  2. Support dynamic filter Convolution.
  3. Support serializing float32 weights as float16 weights to reduce the size of the model file.
  4. Provide a unified runtime API so that developers can reuse their code between the cloud side and the end side.
  5. Developers can now configure built-in passes as custom passes.
  6. Users can now specify the format and shape of model inputs when converting a model.
  7. Support inference on multiple devices, including CPU, NPU, and GPU. Users can set devices in mindspore::Context.
  8. Support mixed-precision inference. Users can set the inference precision through the LoadConfig API.
  9. Support custom operator registration and enable inference on third-party hardware.
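
The size saving from item 3 can be sketched with the Python standard library alone. This is only an illustration of storing float32 values in half precision, not the MindSpore Lite converter or its file format; the function names and the sample weights are assumptions.

```python
import struct

def serialize_fp32(weights):
    """Pack weights as little-endian float32 (4 bytes each)."""
    return struct.pack(f"<{len(weights)}f", *weights)

def serialize_fp16(weights):
    """Pack weights as little-endian float16 (2 bytes each)."""
    return struct.pack(f"<{len(weights)}e", *weights)

def deserialize_fp16(blob):
    """Unpack float16 values back into Python floats."""
    count = len(blob) // 2
    return list(struct.unpack(f"<{count}e", blob))

weights = [0.5, -1.25, 3.0, 0.1]  # hypothetical weight values
blob32 = serialize_fp32(weights)
blob16 = serialize_fp16(weights)
restored = deserialize_fp16(blob16)

print(len(blob32), len(blob16))  # the float16 blob is half the size
```

Values that are exactly representable in half precision (0.5, -1.25, 3.0) round-trip unchanged, while others such as 0.1 come back with a small rounding error, which is the trade-off for the smaller file.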

ARM backend optimization

  1. Support the nchw data format for some operators, such as Conv and InstanceNorm. The performance of some models converted from onnx and caffe is greatly improved.
  2. Fix memory leak bugs on NPU.

Post quantization

  1. Weight quantization supports mixed bit quantization.
  2. Full quantization supports data pre-processing.
  3. Move the quantization parameters from the command line to the configuration file.
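
As a rough illustration of mixed bit weight quantization, the sketch below applies symmetric linear quantization with a different bit width per layer. The scheme, the layer names, and the bit-width choices are assumptions for illustration; they do not describe the algorithm MindSpore Lite actually uses.

```python
def quantize(weights, bits):
    """Symmetric linear quantization to signed integers of the given bit width."""
    qmax = 2 ** (bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    # Round to the nearest integer level and clamp into [-qmax, qmax].
    q = [max(-qmax, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Map integer levels back to approximate float weights."""
    return [v * scale for v in q]

# Hypothetical layers and per-layer bit widths.
layers = {
    "conv1": [0.9, -0.4, 0.05, 0.3],
    "fc": [1.5, -1.5, 0.6, 0.0],
}
bit_widths = {"conv1": 8, "fc": 4}

quantized = {name: quantize(w, bit_widths[name]) for name, w in layers.items()}

for name, (q, scale) in quantized.items():
    restored = dequantize(q, scale)
    error = max(abs(a - b) for a, b in zip(layers[name], restored))
    print(name, bit_widths[name], q, round(error, 4))
```

Layers that tolerate coarser weights can take fewer bits, while sensitive layers keep more, which is the general motivation for mixing bit widths within one model.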

Training on Device

  1. Unify the Lite external API with MindSpore.
  2. Implement a static memory allocator and a common workspace for TOD, saving 10-20% of memory.
  3. Provide getgradients and setgradients interfaces, and interfaces to get and set optimizer parameters, to support MOE models.
  4. Support user-specified output nodes when exporting an IOD model.
  5. Support more text networks (tinybert, albert) and operators.

Codegen

  1. Support kernel registration for custom ops. Third-party hardware such as NNIE can be accessed through it.

API Change

API Incompatible Change

C++ API