2019 09 19 Stuart Armstrong Research Agenda Online Talk

Stuart Armstrong talks about the No Free Lunch result in value learning: you cannot deduce the preferences of a potentially irrational agent from its behaviour alone, and appealing to simplicity doesn't help. He explains how this connects with humans' theory of mind, and sketches out his research agenda for learning human preferences despite this impossibility result.
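
As a rough illustration of why behaviour alone underdetermines preferences, here is a toy sketch in Python. It is not taken from the talk: the states, actions, reward values and function names are all hypothetical. It shows one degenerate case in the spirit of the linked paper, where a fully rational agent with reward R and a fully anti-rational agent with reward -R produce exactly the same policy.

```python
# Toy sketch: every name and number here is a hypothetical assumption,
# chosen only to show two (planner, reward) pairs with identical behaviour.

STATES = ["s0", "s1"]
ACTIONS = ["left", "right"]

# Hypothetical reward table, indexed as REWARD[state][action].
REWARD = {
    "s0": {"left": 1.0, "right": 0.0},
    "s1": {"left": 0.0, "right": 2.0},
}


def negate(reward):
    """Return the reward table with every value's sign flipped."""
    return {s: {a: -r for a, r in acts.items()} for s, acts in reward.items()}


def rational_planner(reward):
    """Pick the reward-maximising action in every state."""
    return {s: max(ACTIONS, key=lambda a: reward[s][a]) for s in STATES}


def anti_rational_planner(reward):
    """Pick the reward-minimising action in every state."""
    return {s: min(ACTIONS, key=lambda a: reward[s][a]) for s in STATES}


# Two very different explanations of the agent...
policy_a = rational_planner(REWARD)               # wants R, pursues it well
policy_b = anti_rational_planner(negate(REWARD))  # wants -R, pursues it badly

# ...that an observer of behaviour cannot tell apart.
print(policy_a == policy_b)
```

Both explanations fit the observed behaviour equally well, so extra assumptions about the agent's rationality are needed to separate them.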

Relevant links:

"Occam's razor is insufficient to infer the preferences of irrational agents" https://arxiv.org/abs/1712.05812

"Research Agenda v0.9: Synthesising a human's preferences into a utility function" https://www.lesswrong.com/posts/CSEdLLEkap2pubjof/research-agenda-v0-9-synthesising-a-human-s-preferences-into