Abstract: We provide a new non-asymptotic analysis of distributed temporal difference learning with linear function approximation. Our approach relies on “one-shot averaging,” where N agents run ...
一些您可能无法访问的结果已被隐去。
显示无法访问的结果一些您可能无法访问的结果已被隐去。
显示无法访问的结果