This paper proposes a new method to improve cooperation in concurrent systems within the framework of Multi-Agent Systems (MAS) by utilizing reinforcement learning. When subsystems work independently and concurrently, achieving appropriate cooperation among them is important to improve the effectiveness of the overall system. Treating subsystems as agents makes it easy to explicitly deal with the interactions among them since they can be modeled naturally as communication among agents with intended information. In our approach agents try to learn the appropriate balance between exploration and exploitation via reward, which is important in distributed and concurrent problem solving in general. By focusing on how to give reward in reinforcement learning, not the learning equation, two kinds of reward are defined in the context of cooperation between agents, in contrast to reinforcement learning within the framework of single agent. In our approach reward for insistence by individual agent contributes to facilitating exploration and reward for concession to other agents contributes to facilitating exploitation. Our cooperation method was examined through experiments on the design of micro satellites and the result showed that it was effective to some extent to facilitate cooperation among agents by letting agents themselves learn the appropriate balance between insistence and concession. The result also suggested the possibility of utilizing the relative magnitude of these rewards as a new control parameter in MAS to control the overall behavior of MAS.
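The abstract names two reward types, a reward for insistence on an agent's own proposal (encouraging exploration) and a reward for concession to other agents (encouraging exploitation), and suggests their relative magnitude as a control parameter for the overall MAS. The paper's own learning equation and reward values are not reproduced on this page, so the Python fragment below is only a minimal sketch of that idea, assuming a standard tabular Q-learning update, epsilon-greedy action selection, and hypothetical constants R_INSIST and R_CONCEDE; the agent class, the toy "conflict"/"agreement" states, and the interaction loop are illustrative and not the authors' formulation.

import random
from collections import defaultdict

# Hypothetical reward magnitudes; their ratio plays the role of the
# "control parameter" suggested in the abstract (an assumption here,
# not values taken from the paper).
R_INSIST = 1.0    # reward when the agent keeps its own proposal (exploration)
R_CONCEDE = 0.5   # reward when the agent adopts another agent's request (exploitation)

ALPHA = 0.1    # learning rate
GAMMA = 0.9    # discount factor
EPSILON = 0.1  # epsilon-greedy exploration rate for action selection

ACTIONS = ("insist", "concede")

class NegotiatingAgent:
    """Tabular Q-learning agent that learns when to insist or concede."""

    def __init__(self):
        self.q = defaultdict(float)  # Q-values keyed by (state, action)

    def act(self, state):
        # Epsilon-greedy choice between insisting and conceding.
        if random.random() < EPSILON:
            return random.choice(ACTIONS)
        return max(ACTIONS, key=lambda a: self.q[(state, a)])

    def learn(self, state, action, reward, next_state):
        # Standard one-step Q-learning update.
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + GAMMA * best_next
        self.q[(state, action)] += ALPHA * (td_target - self.q[(state, action)])

def reward_for(action):
    """The two reward types from the abstract: insistence vs. concession."""
    return R_INSIST if action == "insist" else R_CONCEDE

# Toy interaction loop: the "state" is just whether the other agent's
# last request conflicted with this agent's proposal (purely illustrative).
agent = NegotiatingAgent()
state = "conflict"
for step in range(1000):
    action = agent.act(state)
    reward = reward_for(action)
    next_state = "conflict" if action == "insist" else "agreement"
    agent.learn(state, action, reward, next_state)
    state = next_state

print({k: round(v, 2) for k, v in agent.q.items()})

Raising R_INSIST relative to R_CONCEDE would push the learned policy toward insisting (more exploration), while raising R_CONCEDE pushes it toward conceding (more exploitation), which is the kind of control-parameter effect the abstract points to; in the paper's micro-satellite design experiments the rewards would presumably also depend on the outcome of the shared design task rather than on the chosen action alone.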
The copyright of the original papers published on this site belongs to IEICE. Unauthorized use of the original or translated papers is prohibited. See IEICE Provisions on Copyright for details.
Tetsuya YOSHIDA, Koichi HORI, Shinichi NAKASUKA, "Learning the Balance between Exploration and Exploitation via Reward" in IEICE TRANSACTIONS on Fundamentals,
vol. E82-A, no. 11, pp. 2538-2545, November 1999.
Abstract: This paper proposes a new method to improve cooperation in concurrent systems within the framework of Multi-Agent Systems (MAS) by utilizing reinforcement learning. When subsystems work independently and concurrently, achieving appropriate cooperation among them is important to improve the effectiveness of the overall system. Treating subsystems as agents makes it easy to explicitly deal with the interactions among them since they can be modeled naturally as communication among agents with intended information. In our approach agents try to learn the appropriate balance between exploration and exploitation via reward, which is important in distributed and concurrent problem solving in general. By focusing on how to give reward in reinforcement learning, not the learning equation, two kinds of reward are defined in the context of cooperation between agents, in contrast to reinforcement learning within the framework of single agent. In our approach reward for insistence by individual agent contributes to facilitating exploration and reward for concession to other agents contributes to facilitating exploitation. Our cooperation method was examined through experiments on the design of micro satellites and the result showed that it was effective to some extent to facilitate cooperation among agents by letting agents themselves learn the appropriate balance between insistence and concession. The result also suggested the possibility of utilizing the relative magnitude of these rewards as a new control parameter in MAS to control the overall behavior of MAS.
URL: https://global.ieice.org/en_transactions/fundamentals/10.1587/e82-a_11_2538/_p
@ARTICLE{e82-a_11_2538,
author={Tetsuya YOSHIDA and Koichi HORI and Shinichi NAKASUKA},
journal={IEICE TRANSACTIONS on Fundamentals},
title={Learning the Balance between Exploration and Exploitation via Reward},
year={1999},
volume={E82-A},
number={11},
pages={2538-2545},
abstract={This paper proposes a new method to improve cooperation in concurrent systems within the framework of Multi-Agent Systems (MAS) by utilizing reinforcement learning. When subsystems work independently and concurrently, achieving appropriate cooperation among them is important to improve the effectiveness of the overall system. Treating subsystems as agents makes it easy to explicitly deal with the interactions among them since they can be modeled naturally as communication among agents with intended information. In our approach agents try to learn the appropriate balance between exploration and exploitation via reward, which is important in distributed and concurrent problem solving in general. By focusing on how to give reward in reinforcement learning, not the learning equation, two kinds of reward are defined in the context of cooperation between agents, in contrast to reinforcement learning within the framework of single agent. In our approach reward for insistence by individual agent contributes to facilitating exploration and reward for concession to other agents contributes to facilitating exploitation. Our cooperation method was examined through experiments on the design of micro satellites and the result showed that it was effective to some extent to facilitate cooperation among agents by letting agents themselves learn the appropriate balance between insistence and concession. The result also suggested the possibility of utilizing the relative magnitude of these rewards as a new control parameter in MAS to control the overall behavior of MAS.},
month={November}
}
TY - JOUR
TI - Learning the Balance between Exploration and Exploitation via Reward
T2 - IEICE TRANSACTIONS on Fundamentals
SP - 2538
EP - 2545
AU - Tetsuya YOSHIDA
AU - Koichi HORI
AU - Shinichi NAKASUKA
PY - 1999
JO - IEICE TRANSACTIONS on Fundamentals
VL - E82-A
IS - 11
JA - IEICE TRANSACTIONS on Fundamentals
Y1 - November 1999
AB - This paper proposes a new method to improve cooperation in concurrent systems within the framework of Multi-Agent Systems (MAS) by utilizing reinforcement learning. When subsystems work independently and concurrently, achieving appropriate cooperation among them is important to improve the effectiveness of the overall system. Treating subsystems as agents makes it easy to explicitly deal with the interactions among them since they can be modeled naturally as communication among agents with intended information. In our approach agents try to learn the appropriate balance between exploration and exploitation via reward, which is important in distributed and concurrent problem solving in general. By focusing on how to give reward in reinforcement learning, not the learning equation, two kinds of reward are defined in the context of cooperation between agents, in contrast to reinforcement learning within the framework of single agent. In our approach reward for insistence by individual agent contributes to facilitating exploration and reward for concession to other agents contributes to facilitating exploitation. Our cooperation method was examined through experiments on the design of micro satellites and the result showed that it was effective to some extent to facilitate cooperation among agents by letting agents themselves learn the appropriate balance between insistence and concession. The result also suggested the possibility of utilizing the relative magnitude of these rewards as a new control parameter in MAS to control the overall behavior of MAS.
ER -