PMID- 32881705
OWN - NLM
STAT- PubMed-not-MEDLINE
LR  - 20201223
IS  - 2168-2275 (Electronic)
IS  - 2168-2267 (Linking)
VI  - 51
IP  - 1
DP  - 2021 Jan
TI  - Large-Scale Traffic Signal Control Using a Novel Multiagent Reinforcement Learning.
PG  - 174-187
LID - 10.1109/TCYB.2020.3015811 [doi]
AB  - Finding the optimal signal timing strategy is a difficult task for the problem of large-scale traffic signal control (TSC). Multiagent reinforcement learning (MARL) is a promising method to solve this problem. However, there is still room for improvement in extending to large-scale problems and modeling the behaviors of other agents for each individual agent. In this article, a new MARL, called cooperative double Q-learning (Co-DQL), is proposed, which has several prominent features. It uses a highly scalable independent double Q-learning method based on double estimators and the upper confidence bound (UCB) policy, which can eliminate the over-estimation problem existing in traditional independent Q-learning while ensuring exploration. It uses mean-field approximation to model the interaction among agents, thereby making agents learn a better cooperative strategy. In order to improve the stability and robustness of the learning process, we introduce a new reward allocation mechanism and a local state sharing method. In addition, we analyze the convergence properties of the proposed algorithm. Co-DQL is applied to TSC and tested on various traffic flow scenarios of TSC simulators. The results show that Co-DQL outperforms the state-of-the-art decentralized MARL algorithms in terms of multiple traffic metrics.
FAU - Wang, Xiaoqiang
AU  - Wang X
FAU - Ke, Liangjun
AU  - Ke L
FAU - Qiao, Zhimin
AU  - Qiao Z
FAU - Chai, Xinghua
AU  - Chai X
LA  - eng
PT  - Journal Article
DEP - 20201222
PL  - United States
TA  - IEEE Trans Cybern
JT  - IEEE transactions on cybernetics
JID - 101609393
SB  - IM
EDAT- 2020/09/04 06:00
MHDA- 2020/09/04 06:01
CRDT- 2020/09/04 06:00
PHST- 2020/09/04 06:00 [pubmed]
PHST- 2020/09/04 06:01 [medline]
PHST- 2020/09/04 06:00 [entrez]
AID - 10.1109/TCYB.2020.3015811 [doi]
PST - ppublish
SO  - IEEE Trans Cybern. 2021 Jan;51(1):174-187. doi: 10.1109/TCYB.2020.3015811. Epub 2020 Dec 22.