DiCE: The Infinitely Differentiable Monte-Carlo Estimator
- Submitting institution
-
University College London
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 14498
- Type
- E - Conference contribution
- DOI
-
-
- Title of conference / published proceedings
- Proceedings of the 35th International Conference on Machine Learning
- First page
- 1529
- Volume
- -
- Issue
- -
- ISSN
- 2640-3498
- Open access status
- Compliant
- Month of publication
- July
- Year of publication
- 2018
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
-
5
- Research group(s)
-
-
- Citation count
- -
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- This is a principled approach for constructing any-order gradient estimators, together with a proof of the correctness of the method. The approach has become the de facto standard for gradient estimation in popular deep learning libraries such as TensorFlow and Pyro, and it has paved the way for simplifying the modeling of multi-agent reinforcement learning (see for example Letcher et al. “Stable Opponent Shaping in Differentiable Games.” ICLR 2019) and meta-learning (see for example Rothfuss et al. “ProMP: Proximal Meta-Policy Search.” ICLR 2019).
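- The paper's central construction, the "MagicBox" operator, can be sketched as follows (here in JAX; the toy log-probability, reward value, and function names are illustrative assumptions, not taken from the paper):

```python
import jax
import jax.numpy as jnp

def magic_box(tau):
    # exp(tau - stop_gradient(tau)) evaluates to 1 in the forward pass,
    # but its derivative is magic_box(tau) * d(tau), so differentiating
    # the surrogate objective any number of times yields score-function
    # gradient estimators of the corresponding order.
    return jnp.exp(tau - jax.lax.stop_gradient(tau))

def surrogate_loss(theta):
    # Toy stand-in: log-probability of one "sampled action" under a
    # parameterised distribution, with a fixed reward of 3.0.
    log_prob = jnp.log(theta)
    return magic_box(log_prob) * 3.0

value = surrogate_loss(0.5)            # forward pass: just the reward, 3.0
grad1 = jax.grad(surrogate_loss)(0.5)  # score-function gradient: 3.0 / theta = 6.0
```

Because `magic_box` is the identity in the forward pass, the surrogate loss has the same value as the original objective, while repeated application of `jax.grad` produces unbiased higher-order gradient estimates without manual derivation.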
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -