What type of PR is this?
/kind feature

What does this PR do / why do we need it:
This PR redesigns the bijector class and makes two major changes to it:

  1. Adds a batch_shape to the bijector class. The broadcast_shape of transformed_distribution is now calculated by broadcasting the bijector's batch_shape with that of the distribution.
  2. Redesigns the bijector's dtype logic and adds restrictions on the dtype of the input values and of the bijector's parameter(s).
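
For reviewers, the shape calculation in item 1 can be pictured with the minimal sketch below. It is not the code in this PR: the helper name `broadcast_shape` and the example shapes are hypothetical, and it assumes the usual NumPy-style broadcasting rule (align trailing dimensions, a size of 1 stretches to match).

```python
from itertools import zip_longest


def broadcast_shape(shape_a, shape_b):
    """Hypothetical helper: NumPy-style broadcast of two shapes.

    Raises ValueError if the shapes are not broadcastable.
    """
    result = []
    # Walk both shapes from the trailing dimension; a missing leading
    # dimension is treated as size 1.
    for a, b in zip_longest(reversed(shape_a), reversed(shape_b), fillvalue=1):
        if a != b and a != 1 and b != 1:
            raise ValueError(
                f"shapes {shape_a} and {shape_b} are not broadcastable"
            )
        # A size-1 dimension takes the size of the other operand.
        result.append(b if a == 1 else a)
    return tuple(reversed(result))


# Assumed example values: a bijector whose parameters have batch shape (3, 1)
# and a base distribution with batch shape (2,).
bijector_batch_shape = (3, 1)
distribution_batch_shape = (2,)
transformed_shape = broadcast_shape(bijector_batch_shape, distribution_batch_shape)
```

Under these assumptions, `transformed_shape` is `(3, 2)`, and incompatible shapes such as `(3,)` and `(2,)` raise a `ValueError` instead of being silently combined.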

Which issue(s) this PR fixes:

Fixes #

Special notes for your reviewers: