
ELF: The Excitement Will Continue - An Interview with the "Retired" ELF OpenGo


September 7, 2018   Yike Weiqi Headlines


1. What level has ELF reached now? Which is stronger, ELF or AlphaGo Zero?

We believe ELF OpenGo has roughly reached the level of the 3-day version of AlphaGo Zero. Beyond that, it is hard to say without a direct comparison.

2. Many new AIs have made great progress by training with ELF's game records (SGFs) and weights. As the one who open-sourced this data, what do you think about that?

This is exactly why we open-sourced it. We believe AI should serve the community and the world. Moreover, for an algorithm as good as AlphaGo Zero, we wanted to provide a reliable, reproducible implementation that we can use for other research and that gives researchers around the world a baseline to build on, be creative with, and improve.

3. Why did you choose a 20-block neural network rather than a deeper one? What was the main consideration behind this choice?

Our primary consideration was that the model should run well on a single GPU, so that our work is accessible to a wide audience. A deeper network demands more from the hardware; by starting with an inexpensive architecture, we let people with mass-consumer hardware take advantage of ELF. In the future we may also release deeper models.
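To give a concrete picture of what a "20-block" network means, here is a minimal PyTorch sketch of an AlphaGo Zero-style residual tower. Only the block count comes from the interview; the channel width, input planes, and head layouts below are illustrative assumptions, not the exact ELF OpenGo configuration.

```python
# Minimal sketch of a 20-block residual tower, AlphaGo Zero style.
# Channel width (256) and input planes (17) are assumptions for illustration;
# ELF OpenGo's exact feature set and widths may differ.
import torch
import torch.nn as nn

class ResBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn1 = nn.BatchNorm2d(channels)
        self.conv2 = nn.Conv2d(channels, channels, 3, padding=1, bias=False)
        self.bn2 = nn.BatchNorm2d(channels)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # residual connection

class GoNet(nn.Module):
    def __init__(self, blocks=20, channels=256, board=19, in_planes=17):
        super().__init__()
        self.stem = nn.Sequential(
            nn.Conv2d(in_planes, channels, 3, padding=1, bias=False),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.tower = nn.Sequential(*[ResBlock(channels) for _ in range(blocks)])
        # Policy head: one logit per board point plus pass.
        self.policy = nn.Sequential(
            nn.Conv2d(channels, 2, 1), nn.BatchNorm2d(2), nn.ReLU(inplace=True),
            nn.Flatten(), nn.Linear(2 * board * board, board * board + 1))
        # Value head: a single win-rate estimate in [-1, 1].
        self.value = nn.Sequential(
            nn.Conv2d(channels, 1, 1), nn.BatchNorm2d(1), nn.ReLU(inplace=True),
            nn.Flatten(), nn.Linear(board * board, 256), nn.ReLU(inplace=True),
            nn.Linear(256, 1), nn.Tanh())

    def forward(self, x):
        h = self.tower(self.stem(x))
        return self.policy(h), self.value(h)

# A 20-block tower like this runs comfortably on a single consumer GPU.
policy_logits, value = GoNet()(torch.zeros(1, 17, 19, 19))
```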

4. It is said that when training, ELF selects input games so that Black wins and White wins are balanced 50-50. If this is true, what is the consideration behind this "balance"?

This is by design, to keep the early stage of training from overfitting to a local solution in which one side (Black or White) always wins. Such overfitting would prevent the model from improving further.
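As a rough illustration of the idea (the exact ELF training pipeline is not described here, so the function and field names below are hypothetical), the balance could be achieved by drawing equal numbers of Black-win and White-win games from the self-play pool:

```python
# Hypothetical sketch of winner-balanced sampling from a self-play game pool.
# `games` is assumed to be a list of records with a "winner" field ("B" or "W");
# this illustrates the idea only and is not ELF's actual data pipeline.
import random

def sample_balanced(games, batch_games):
    black_wins = [g for g in games if g["winner"] == "B"]
    white_wins = [g for g in games if g["winner"] == "W"]
    half = batch_games // 2
    # Draw half the batch from each outcome so that neither "Black always wins"
    # nor "White always wins" dominates the gradient signal early in training.
    batch = random.sample(black_wins, half) + random.sample(white_wins, half)
    random.shuffle(batch)
    return batch
```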

5. The latest ELF weights indicate that Black has the higher winning chance at the start of the game, which differs from other AIs' judgment. This surprised us a lot; can you talk about the reason behind it?

This is closely tied to the previous question: we think the balancing described in #4 is the main reason.

6. After retiring from the Go scene, will ELF's research results be applied in other areas?

ELF is a general reinforcement learning platform that also implements a number of common baseline algorithms. We will do more research with ELF on other competitive and strategy games.

7. Compared with other AIs, ELF's win-rate estimate seems to fluctuate much more sharply. What is the reason for this?

This makes the AI more sensitive to the quality differences between candidate moves, so it can pick out the better one. In theory, the closer a model gets to perfect play (the "God of Go"), the larger the difference in evaluation between different moves from the same position should be.
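As a toy illustration of this point (the numbers are invented for the example, not taken from ELF): a sharper evaluator separates candidate moves by a wider margin, so when play actually switches between those moves the displayed win rate jumps further.

```python
# Toy numbers, not ELF output: two evaluators scoring the same pair of
# candidate moves. The "sharper" one separates them more, so the reported
# win rate swings further when the second-best move is actually played.
evals = {
    "flat evaluator":  {"best move": 0.52, "second move": 0.50},
    "sharp evaluator": {"best move": 0.58, "second move": 0.43},
}
for name, ev in evals.items():
    swing = ev["best move"] - ev["second move"]
    print(f"{name}: win-rate drop if the second move is played = {swing:.2f}")
```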

8. Today's AIs have reached a level far beyond human players, yet they still make low-level mistakes such as failing to read ladders. What is the technical difficulty here?

A ladder is an exact, fairly long sequence of moves in which every step must be played precisely, and its outcome depends on details on the far side of the board, so it is globally relevant. MCTS, however, is a randomized algorithm. Before the ladder actually appears, the AI considers moves over the whole board, so only a small fraction of rollouts goes into the ladder branch. If the ladder situation (the ladder breakers) is complicated, the rollouts may not follow a long enough sequence of moves to read it out correctly. The problem is more visible when the number of rollouts is small.
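A back-of-the-envelope toy model (my own simplification; real MCTS is guided by the policy prior rather than uniform sampling) shows why deep ladder lines receive so little search effort:

```python
# Toy estimate, not ELF's search code: if reading a ladder to its end requires
# following one specific move at each of L consecutive plies, and the search
# spreads its attention over roughly B plausible moves per ply, then under a
# uniform-random toy model a single rollout reads the whole ladder with
# probability (1/B)**L. Real MCTS does far better thanks to the policy prior,
# but the exponential decay is the core difficulty.
def expected_ladder_readouts(rollouts, branching, ladder_length):
    p_single = (1.0 / branching) ** ladder_length
    return rollouts * p_single

# Example: 1,600 rollouts, ~10 plausible moves per ply, a 20-ply ladder.
print(expected_ladder_readouts(1_600, 10, 20))  # ~1.6e-17 complete readouts
```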

9. Three years ago, Facebook's DarkForest appeared on the scene, but its strength did not reach the top tier at the time. Now ELF has made a stunning debut. The team must have been through a lot in these three years; please tell Yike's Go fans about the journey of this painstaking development.

First of all, DarkForest and ELF OpenGo are not one continuous project; in terms of scientific novelty, DarkForest was certainly the higher of the two. We started working on Go AI in May 2015. At that time almost nobody believed in computer Go, and it was close to being a joke (in the 2015 human-vs-machine match, Lian Xiao could still win against the top AI while giving six handicap stones, and the common opinion was that Go AI would need decades or even a century to beat top humans). FAIR's open research environment let us explore the direction anyway, and DarkForest came out of that. It is worth stressing that DarkForest went well beyond our expectations at the time: we published the paper in November 2015, did some media interviews, and the AI was on par with Zen, which had taken a team ten years of effort. It was only when AlphaGo appeared in January 2016 that DarkForest was overshadowed. Afterwards, because of limited computing resources and a shift in research interests, we decided not to pursue the project further.
Since 2016 we have worked on other directions in reinforcement learning: we won one track of the 2016 Doom AI (FPS) competition, worked on language-based indoor 3D navigation, designed the ELF reinforcement learning platform, tried training full-game real-time strategy AI, and did some theoretical analysis of deep learning. When the AlphaGo Zero paper came out in October 2017, we found it fascinating that a game as complex as Go could be learned from scratch, and reproducing it would be scientifically valuable. The ELF OpenGo project only started in January 2018; of course, for efficiency, OpenGo reused a lot of code from DarkForest.
FAIR is a research lab and has never had a stable, long-term engineering team maintaining the Go project. Our focus has always been on new ideas and scientific discoveries rather than on building the strongest possible AI on top of existing work. That is also the motivation for open-sourcing ELF OpenGo.

10. The name ELF (Extensive, Lightweight, Flexible) is interesting and reflects the design philosophy. Can you explain in detail how these three aspects are realized in the design?

ELF is described in a paper published at NIPS 2017, a top AI conference. It was originally built for real-time strategy games, but since it is a general platform we also used it to implement the Go algorithm. Paper link: https://arxiv.org/pdf/1707.01067.pdf
Extensive: the platform supports a wide range of settings, such as imperfect information, long-term rewards, concurrency, and simulating the real world.
Lightweight: the framework is heavily optimized and very fast, collecting thousands of experiences per minute.
Flexible: the environments are easy to modify and tune; parameters and model architectures can be changed conveniently, which speeds up research on new algorithms.
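The "Lightweight" point is essentially a batching pattern: many games run concurrently and their states are gathered into a single neural-network call. The sketch below illustrates that pattern in plain Python with hypothetical names; it is not ELF's actual API (ELF implements this with a C++ core and a Python front end).

```python
# Hypothetical illustration of batched experience collection (not ELF's API).
# Many concurrently running games push their current states into a queue; a
# single evaluator batches them, answers each game, and the games continue.
import queue
import threading

state_queue = queue.Queue()

def game_thread(game_id, n_moves):
    for _ in range(n_moves):
        reply = queue.Queue(maxsize=1)
        state_queue.put((game_id, {"board": ...}, reply))
        move = reply.get()  # wait for the batched evaluator's answer

def evaluator(batch_size, total):
    served = 0
    while served < total:
        batch = [state_queue.get() for _ in range(batch_size)]
        # One neural-network forward pass would evaluate all states in `batch` here.
        for game_id, state, reply in batch:
            reply.put("pass")  # placeholder move
            served += 1

threads = [threading.Thread(target=game_thread, args=(i, 4)) for i in range(8)]
for t in threads:
    t.start()
evaluator(batch_size=8, total=8 * 4)
for t in threads:
    t.join()
```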

Before this interview we were uneasy: word had spread that ELF was hanging up its sword and leaving the scene for good, which was hard to accept. Through this interview we learned that the excitement will in fact continue. We look forward to ELF releasing deeper networks in the near future; the world of open-source Go AI is brighter with you in it!

Related coverage of ELF:

With the open-sourcing of Go software, monster-level Go AI is now everywhere

Facebook releases an open-source Go AI

The US Go Congress opens: a carnival of more than 400 attendees
