ERNIE-GEN is a multi-flow language generation framework for both pre-training and fine-tuning. Only the fine-tuning procedure is illustrated in this section.
We use the CNN/DailyMail abstractive summarization task to illustrate the usage of ERNIE-GEN; you can download the preprocessed fine-tuning data from here.
To start fine-tuning ERNIE-GEN, run:
python3 -m paddle.distributed.launch \
--log_dir ./log \
./demo/seq2seq/finetune_seq2seq_dygraph.py \
--from_pretrained ernie-gen-base-en \
--data_dir ./data/cnndm \
--save_dir ./model_cnndm \
--label_smooth 0.1 \
--use_random_noice \
--noise_prob 0.7 \
--predict_output_dir ./pred \
--max_steps $((287113*30/64))
Note that you need more than 2 GPUs to run the fine-tuning.
During multi-GPU fine-tuning, max_steps is used as the stopping criterion rather than epoch, to prevent deadlock.
We simply calculate max_steps as: EPOCH * NUM_TRAIN_EXAMPLES / TOTAL_BATCH.
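The max_steps value in the fine-tuning command above follows directly from this formula; a quick shell check, with the values taken from the CNN/DailyMail command (287113 training examples, 30 epochs, total batch size 64):

```shell
# max_steps = EPOCH * NUM_TRAIN_EXAMPLES / TOTAL_BATCH
EPOCH=30
NUM_TRAIN_EXAMPLES=287113
TOTAL_BATCH=64
MAX_STEPS=$((EPOCH * NUM_TRAIN_EXAMPLES / TOTAL_BATCH))
echo "$MAX_STEPS"   # 134584 (integer division, same as $((287113*30/64)))
```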
This demo script will save a finetuned model at --save_dir, run multi-GPU prediction every --eval_steps steps, and save the prediction results to --predict_output_dir.
While fine-tuning, a series of prediction files is generated. First you need to sort and join all the files with:
sort -t$'\t' -k1n ./pred/pred.step60000.* |awk -F"\t" '{print $2}'> final_prediction
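Each prediction file is a per-worker shard whose lines have the form <example_index><TAB><prediction>; the sort restores the original example order across shards, and the awk keeps only the prediction text. A toy sketch with hypothetical shard contents:

```shell
# Toy shards: each line is "<example_index>\t<prediction>" (hypothetical data)
mkdir -p pred_demo
printf '0\tfirst summary\n2\tthird summary\n' > pred_demo/pred.step60000.0
printf '1\tsecond summary\n' > pred_demo/pred.step60000.1
# Sort numerically on the tab-separated index, then keep only the prediction text
sort -t$'\t' -k1n pred_demo/pred.step60000.* | awk -F"\t" '{print $2}' > final_prediction
cat final_prediction
# first summary
# second summary
# third summary
```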
Then use ./eval_cnndm/cnndm_eval.sh to calculate all metrics
(pyrouge is required to evaluate CNN/DailyMail):
sh cnndm_eval.sh final_prediction ./data/cnndm/dev.summary
To run beam-search decoding after you have obtained a finetuned model, try:
cat one_column_source_text| python3 demo/seq2seq/decode.py \
--from_pretrained ./ernie_gen_large \
--save_dir ./model_cnndm \
--bsz 8