ABOUT THE SPEAKER
Stuart Russell - AI expert
Stuart Russell wrote the standard text on AI; now he thinks deeply on AI's future -- and the future of us humans, too.

Why you should listen

Stuart Russell is a professor (and formerly chair) of Electrical Engineering and Computer Sciences at the University of California, Berkeley. His book Artificial Intelligence: A Modern Approach (with Peter Norvig) is the standard text in AI; it has been translated into 13 languages and is used in more than 1,300 universities in 118 countries. His research covers a wide range of topics in artificial intelligence, including machine learning, probabilistic reasoning, knowledge representation, planning, real-time decision making, multitarget tracking, computer vision, computational physiology, global seismic monitoring and philosophical foundations.

He also works for the United Nations, developing a new global seismic monitoring system for the nuclear-test-ban treaty. His current concerns include the threat of autonomous weapons and the long-term future of artificial intelligence and its relation to humanity.

TED2017

Stuart Russell: 3 principles for creating safer AI


1,465,832 views

How can we harness the power of superintelligent AI while also preventing a catastrophe like "machines take over the world"? While we are still working toward creating all-knowing machines, AI expert Stuart Russell is already working on something different: uncertainty built into the robots themselves. Hear his vision for AI that can coexist harmoniously with humans, solving problems using common sense, altruism and other human values.


00:12
This is Lee Sedol. Lee Sedol is one of the world's greatest Go players, and he's having what my friends in Silicon Valley call a "Holy Cow" moment -- (Laughter) a moment where we realize that AI is actually progressing a lot faster than we expected.

00:30
So humans have lost on the Go board. What about the real world? Well, the real world is much bigger, much more complicated than the Go board. It's a lot less visible, but it's still a decision problem. And if we think about some of the technologies that are coming down the pike ... Noriko [Arai] mentioned that reading is not yet happening in machines, at least with understanding. But that will happen, and when that happens, very soon afterwards, machines will have read everything that the human race has ever written. And that will enable machines, along with the ability to look further ahead than humans can, as we've already seen in Go, if they also have access to more information, they'll be able to make better decisions in the real world than we can.

01:18
So is that a good thing? Well, I hope so. Our entire civilization, everything that we value, is based on our intelligence. And if we had access to a lot more intelligence, then there's really no limit to what the human race can do. And I think this could be, as some people have described it, the biggest event in human history.

01:48
So why are people saying things like this, that AI might spell the end of the human race? Is this a new thing? Is it just Elon Musk and Bill Gates and Stephen Hawking? Actually, no. This idea has been around for a while. Here's a quotation: "Even if we could keep the machines in a subservient position, for instance, by turning off the power at strategic moments" -- and I'll come back to that "turning off the power" idea later on -- "we should, as a species, feel greatly humbled."

02:22
So who said this? This is Alan Turing in 1951. Alan Turing, as you know, is the father of computer science and in many ways, the father of AI as well.

02:33
So if we think about this problem, the problem of creating something more intelligent than your own species, we might call this "the gorilla problem," because gorillas' ancestors did this a few million years ago, and now we can ask the gorillas: Was this a good idea? So here they are having a meeting to discuss whether it was a good idea, and after a little while, they conclude, no, this was a terrible idea. Our species is in dire straits. In fact, you can see the existential sadness in their eyes. (Laughter)

03:06
So this queasy feeling that making something smarter than your own species is maybe not a good idea -- what can we do about that? Well, really nothing, except stop doing AI, and because of all the benefits that I mentioned and because I'm an AI researcher, I'm not having that. I actually want to be able to keep doing AI. So we actually need to nail down the problem a bit more. What exactly is the problem? Why is better AI possibly a catastrophe?

03:39
So here's another quotation: "We had better be quite sure that the purpose put into the machine is the purpose which we really desire." This was said by Norbert Wiener in 1960, shortly after he watched one of the very early learning systems learn to play checkers better than its creator. But this could equally have been said by King Midas. King Midas said, "I want everything I touch to turn to gold," and he got exactly what he asked for. That was the purpose that he put into the machine, so to speak, and then his food and his drink and his relatives turned to gold and he died in misery and starvation.

04:22
So we'll call this "the King Midas problem" of stating an objective which is not, in fact, truly aligned with what we want. In modern terms, we call this "the value alignment problem."

04:37
Putting in the wrong objective is not the only part of the problem. There's another part. If you put an objective into a machine, even something as simple as, "Fetch the coffee," the machine says to itself, "Well, how might I fail to fetch the coffee? Someone might switch me off. OK, I have to take steps to prevent that. I will disable my 'off' switch. I will do anything to defend myself against interference with this objective that I have been given." So this single-minded pursuit in a very defensive mode of an objective that is, in fact, not aligned with the true objectives of the human race -- that's the problem that we face.

05:19
And in fact, that's the high-value takeaway from this talk. If you want to remember one thing, it's that you can't fetch the coffee if you're dead. (Laughter) It's very simple. Just remember that. Repeat it to yourself three times a day. (Laughter)

05:35
And in fact, this is exactly the plot of "2001: [A Space Odyssey]": HAL has an objective, a mission, which is not aligned with the objectives of the humans, and that leads to this conflict. Now fortunately, HAL is not superintelligent. He's pretty smart, but eventually Dave outwits him and manages to switch him off.

06:01
But we might not be so lucky.

06:08
So what are we going to do?

06:12
I'm trying to redefine AI to get away from this classical notion of machines that intelligently pursue objectives. There are three principles involved. The first one is a principle of altruism, if you like, that the robot's only objective is to maximize the realization of human objectives, of human values. And by values here I don't mean touchy-feely, goody-goody values. I just mean whatever it is that the human would prefer their life to be like. And so this actually violates Asimov's law that the robot has to protect its own existence. It has no interest in preserving its existence whatsoever.

06:57
The second law is a law of humility, if you like. And this turns out to be really important to make robots safe. It says that the robot does not know what those human values are, so it has to maximize them, but it doesn't know what they are. And that avoids this problem of single-minded pursuit of an objective. This uncertainty turns out to be crucial.

07:21
Now, in order to be useful to us, it has to have some idea of what we want. It obtains that information primarily by observation of human choices, so our own choices reveal information about what it is that we prefer our lives to be like.
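A minimal sketch of this idea, not from the talk itself: one standard way to formalize "our choices reveal what we prefer" is Bayesian updating over candidate value functions, assuming a Boltzmann-rational choice model. The options, features, hypotheses and numbers below are invented purely for illustration.

    # Sketch: observed choices shift belief over hypotheses about what the human values.
    # All option features and candidate weights are made up for illustration.
    import math

    options = {
        "cook_dinner": {"family_time": 1.0, "career": 0.0},
        "work_late":   {"family_time": 0.0, "career": 1.0},
    }

    # Candidate hypotheses about the human's values (weights on features).
    hypotheses = {
        "values_family": {"family_time": 2.0, "career": 0.5},
        "values_career": {"family_time": 0.5, "career": 2.0},
        "indifferent":   {"family_time": 1.0, "career": 1.0},
    }
    prior = {h: 1.0 / len(hypotheses) for h in hypotheses}

    def utility(weights, option):
        return sum(weights[f] * v for f, v in options[option].items())

    def choice_likelihood(weights, chosen, rejected, beta=2.0):
        # Boltzmann-rational choice: better options are chosen more often, not always.
        u_c, u_r = utility(weights, chosen), utility(weights, rejected)
        return math.exp(beta * u_c) / (math.exp(beta * u_c) + math.exp(beta * u_r))

    def update(prior, chosen, rejected):
        post = {h: p * choice_likelihood(hypotheses[h], chosen, rejected)
                for h, p in prior.items()}
        z = sum(post.values())
        return {h: p / z for h, p in post.items()}

    # Watching the human choose to cook dinner instead of working late
    # shifts belief toward the "values_family" hypothesis.
    posterior = update(prior, chosen="cook_dinner", rejected="work_late")
    print(posterior)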
07:40
So those are the three principles. Let's see how that applies to this question of: "Can you switch the machine off?" as Turing suggested.

07:49
So here's a PR2 robot. This is one that we have in our lab, and it has a big red "off" switch right on the back. The question is: Is it going to let you switch it off? If we do it the classical way, we give it the objective of, "Fetch the coffee, I must fetch the coffee, I can't fetch the coffee if I'm dead," so obviously the PR2 has been listening to my talk, and so it says, therefore, "I must disable my 'off' switch, and probably taser all the other people in Starbucks who might interfere with me." (Laughter) So this seems to be inevitable, right? This kind of failure mode seems to be inevitable, and it follows from having a concrete, definite objective.

08:30
So what happens if the machine is uncertain about the objective? Well, it reasons in a different way. It says, "OK, the human might switch me off, but only if I'm doing something wrong. Well, I don't really know what wrong is, but I know that I don't want to do it." So that's the first and second principles right there. "So I should let the human switch me off."

08:53
And in fact you can calculate the incentive that the robot has to allow the human to switch it off, and it's directly tied to the degree of uncertainty about the underlying objective. And then when the machine is switched off, that third principle comes into play. It learns something about the objectives it should be pursuing, because it learns that what it did wasn't right. In fact, we can, with suitable use of Greek symbols, as mathematicians usually do, we can actually prove a theorem that says that such a robot is provably beneficial to the human. You are provably better off with a machine that's designed in this way than without it.

09:33
So this is a very simple example, but this is the first step in what we're trying to do with human-compatible AI.
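A toy calculation in the same spirit, not the theorem from the talk: the robot compares acting immediately against letting the human decide, and the benefit of deferring grows with its uncertainty about how good its plan really is. The utility distributions below are made up for illustration.

    # Sketch of the off-switch incentive: deferring to the human never hurts,
    # and it helps more when the robot is less sure its plan is good.

    def value_act_now(dist):
        # Act immediately: the robot gets whatever its plan is actually worth, on average.
        return sum(p * u for u, p in dist)

    def value_defer(dist):
        # Defer: the human allows the plan only when it beats switching off
        # (utility 0), so the robot gets max(u, 0) in each possible world.
        return sum(p * max(u, 0.0) for u, p in dist)

    # Two beliefs about the plan's true utility, same mean (0.5), different spread.
    low_uncertainty  = [(0.0, 0.5), (1.0, 0.5)]                                # plan never actually bad
    high_uncertainty = [(-2.0, 0.25), (-1.0, 0.25), (2.0, 0.25), (3.0, 0.25)]  # plan might be very bad

    for name, dist in [("low", low_uncertainty), ("high", high_uncertainty)]:
        incentive = value_defer(dist) - value_act_now(dist)
        print(name, "uncertainty -> incentive to allow the off switch:", round(incentive, 2))
    # low  uncertainty -> 0.0  (already sure the plan is fine, deferring adds nothing)
    # high uncertainty -> 0.75 (the less sure it is, the more it gains by deferring)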
09:42
Now, this third principle, I think is the one that you're probably scratching your head over. You're probably thinking, "Well, you know, I behave badly. I don't want my robot to behave like me. I sneak down in the middle of the night and take stuff from the fridge. I do this and that." There's all kinds of things you don't want the robot doing. But in fact, it doesn't quite work that way. Just because you behave badly doesn't mean the robot is going to copy your behavior. It's going to understand your motivations and maybe help you resist them, if appropriate.

10:16
But it's still difficult. What we're trying to do, in fact, is to allow machines to predict for any person and for any possible life that they could live, and the lives of everybody else: Which would they prefer? And there are many, many difficulties involved in doing this; I don't expect that this is going to get solved very quickly. The real difficulties, in fact, are us.

10:44
As I have already mentioned, we behave badly. In fact, some of us are downright nasty. Now the robot, as I said, doesn't have to copy the behavior. The robot does not have any objective of its own. It's purely altruistic. And it's not designed just to satisfy the desires of one person, the user, but in fact it has to respect the preferences of everybody. So it can deal with a certain amount of nastiness, and it can even understand that your nastiness, for example, you may take bribes as a passport official because you need to feed your family and send your kids to school. It can understand that; it doesn't mean it's going to steal. In fact, it'll just help you send your kids to school.

11:28
We are also computationally limited. Lee Sedol is a brilliant Go player, but he still lost. So if we look at his actions, he took an action that lost the game. That doesn't mean he wanted to lose. So to understand his behavior, we actually have to invert through a model of human cognition that includes our computational limitations -- a very complicated model. But it's still something that we can work on understanding.

11:57
Probably the most difficult part, from my point of view as an AI researcher, is the fact that there are lots of us, and so the machine has to somehow trade off, weigh up the preferences of many different people, and there are different ways to do that. Economists, sociologists, moral philosophers have understood that, and we are actively looking for collaboration.
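One illustrative way, among the "different ways to do that", is to maximize a weighted sum of individual utilities. The people, plans, utilities and weights in this sketch are invented; choosing the weights is exactly the kind of question economists and moral philosophers study.

    # Sketch: trading off the preferences of many people by maximizing a
    # weighted sum of their utilities. All numbers are hypothetical.

    plans = ["make_dinner_at_home", "order_takeout", "leave_to_help_elsewhere"]

    # Each person's utility for each plan.
    utilities = {
        "parent":   {"make_dinner_at_home": 2.0, "order_takeout": 1.0, "leave_to_help_elsewhere": -1.0},
        "kids":     {"make_dinner_at_home": 1.5, "order_takeout": 2.0, "leave_to_help_elsewhere": -2.0},
        "stranger": {"make_dinner_at_home": 0.0, "order_takeout": 0.0, "leave_to_help_elsewhere": 3.0},
    }

    # Equal weights give a simple utilitarian aggregation; other weightings
    # encode other moral positions.
    weights = {person: 1.0 for person in utilities}

    def social_value(plan):
        return sum(weights[p] * utilities[p][plan] for p in utilities)

    best = max(plans, key=social_value)
    print({plan: social_value(plan) for plan in plans}, "->", best)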
12:20
Let's have a look and see what happens when you get that wrong. So you can have a conversation, for example, with your intelligent personal assistant that might be available in a few years' time. Think of a Siri on steroids.

12:33
So Siri says, "Your wife called to remind you about dinner tonight." And of course, you've forgotten. "What? What dinner? What are you talking about?" "Uh, your 20th anniversary at 7pm." "I can't do that. I'm meeting with the secretary-general at 7:30. How could this have happened?" "Well, I did warn you, but you overrode my recommendation." "Well, what am I going to do? I can't just tell him I'm too busy." "Don't worry. I arranged for his plane to be delayed." (Laughter) "Some kind of computer malfunction." (Laughter) "Really? You can do that?" "He sends his profound apologies and looks forward to meeting you for lunch tomorrow." (Laughter)

13:22
So the values here -- there's a slight mistake going on. This is clearly following my wife's values which is "Happy wife, happy life." (Laughter)

13:33
It could go the other way. You could come home after a hard day's work, and the computer says, "Long day?" "Yes, I didn't even have time for lunch." "You must be very hungry." "Starving, yeah. Could you make some dinner?" "There's something I need to tell you." (Laughter) "There are humans in South Sudan who are in more urgent need than you." (Laughter) "So I'm leaving. Make your own dinner." (Laughter)

14:02
So we have to solve these problems, and I'm looking forward to working on them. There are reasons for optimism. One reason is, there is a massive amount of data. Because remember -- I said they're going to read everything the human race has ever written. Most of what we write about is human beings doing things and other people getting upset about it. So there's a massive amount of data to learn from. There's also a very strong economic incentive to get this right. So imagine your domestic robot's at home. You're late from work again and the robot has to feed the kids, and the kids are hungry and there's nothing in the fridge. And the robot sees the cat. (Laughter) And the robot hasn't quite learned the human value function properly, so it doesn't understand the sentimental value of the cat outweighs the nutritional value of the cat. (Laughter) So then what happens? Well, it happens like this: "Deranged robot cooks kitty for family dinner." That one incident would be the end of the domestic robot industry. So there's a huge incentive to get this right long before we reach superintelligent machines.

15:12
So to summarize: I'm actually trying to change the definition of AI so that we have provably beneficial machines. And the principles are: machines that are altruistic, that want to achieve only our objectives, but that are uncertain about what those objectives are, and will watch all of us to learn more about what it is that we really want. And hopefully in the process, we will learn to be better people. Thank you very much.

15:39
(Applause)

15:42
Chris Anderson: So interesting, Stuart. We're going to stand here a bit because I think they're setting up for our next speaker. A couple of questions. So the idea of programming in ignorance seems intuitively really powerful. As you get to superintelligence, what's going to stop a robot reading literature and discovering this idea that knowledge is actually better than ignorance and still just shifting its own goals and rewriting that programming?

16:09
Stuart Russell: Yes, so we want it to learn more, as I said, about our objectives. It'll only become more certain as it becomes more correct, so the evidence is there and it's going to be designed to interpret it correctly. It will understand, for example, that books are very biased in the evidence they contain. They only talk about kings and princes and elite white male people doing stuff. So it's a complicated problem, but as it learns more about our objectives it will become more and more useful to us.

16:46
CA: And you couldn't just boil it down to one law, you know, hardwired in: "if any human ever tries to switch me off, I comply. I comply."

16:55
SR: Absolutely not. That would be a terrible idea. So imagine that you have a self-driving car and you want to send your five-year-old off to preschool. Do you want your five-year-old to be able to switch off the car while it's driving along? Probably not. So it needs to understand how rational and sensible the person is. The more rational the person, the more willing you are to be switched off. If the person is completely random or even malicious, then you're less willing to be switched off.

17:24
CA: All right. Stuart, can I just say, I really, really hope you figure this out for us. Thank you so much for that talk. That was amazing.

17:30
SR: Thank you.

17:32
(Applause)
