ABOUT THE SPEAKER
Luis von Ahn - Computer scientist
Luis von Ahn builds systems that combine humans and computers to solve large-scale problems that neither can solve alone.

Why you should listen

Luis von Ahn is an associate professor of Computer Science at Carnegie Mellon University, and he's at the forefront of the crowdsourcing craze. His work takes advantage of the ever-growing Web-connected population to achieve collaboration in unprecedented numbers. His projects aim to leverage the crowd for human good. His company reCAPTCHA, sold to Google in 2009, digitizes human knowledge (books), one word at a time. His new project is Duolingo, which aims to get 100 million people translating the Web in every major language.

TEDxCMU

Luis von Ahn: Massive-scale online collaboration

Luis von Ahn: 大規模在線協作

1,740,008 views

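After redefining CAPTCHA so that every human-typed response helps digitize books, Luis von Ahn wondered how else the small contributions of many people on the Internet could be put to use to create enormous value. At TEDxCMU, he shares his ambitious new project, Duolingo, which aims to translate the Web quickly and accurately while helping millions of people learn a new language, all for free.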

00:15
How many好多 of you had to fill填補 out some sort排序 of webWeb form形式
0
0
2000
有幾多人系填寫網頁表格時
00:17
where you've been asked問吓 to read a distorted扭曲 sequence序列 of characters字符 like this?
1
2000
2000
需要識別甘樣扭曲嘅文字?
00:19
How many好多 of you found發現 it really, really annoying?
2
4000
2000
有幾多人覺得哩樣嘢真系好煩?
00:21
Okay, outstanding優秀. So I invented發明 that.
3
6000
3000
都唔少啊。哩樣嘢就系我發明嘅。
00:24
(Laughter笑聲)
4
9000
2000
(笑聲)
00:26
Or I was one of the people who did it.
5
11000
2000
或者講我系其中一個發明人。
00:28
That thing is called a CAPTCHACaptcha.
6
13000
2000
果樣嘢叫CAPTCHA(驗證碼)
00:30
And the reason原因 it is there is to make sure you, the entity實體 filling填充 out the form形式,
7
15000
2000
之所以佢會出現系網頁中,系因為要確認你,填空嘅哩個行為人,
00:32
are actually講真 a human人類 and not some sort排序 of computer計數機 program程序
8
17000
3000
系一個真正嘅人類,而唔系某某專門寫出來、
00:35
that was written to submit提交 the form形式 millions数百万 and millions数百万 of times.
9
20000
2000
為咗千萬次重複填表嘅電腦程式。
00:37
The reason原因 it works工程 is because humans人類,
10
22000
2000
甘樣做系因為人、
00:39
at least最小 non-visually-impaired非視力受損 humans人類,
11
24000
2000
至少系視覺正常的人,
00:41
have no trouble唔該 reading閲讀 these distorted扭曲 squiggly波浪 characters字符,
12
26000
2000
都唔會覺得讀出哩嘀扭曲嘅文字系一種困難,
00:43
whereas computer計數機 programs程序 simply淨係 can't do it as well yet尚未.
13
28000
3000
而電腦就縱未可以好似人甘樣讀得甘好。
00:46
So for example例子, in the case情況下 of Ticketmaster票務,
14
31000
2000
比如講,在Ticketmaster網站上,
00:48
the reason原因 you have to type類型 these distorted扭曲 characters字符
15
33000
2000
你要輸入哩嘀扭曲字符的原因
00:50
is to prevent防止 scalpers黃牛 from writing寫作 a program程序
16
35000
2000
系為咗防止“黃牛”寫程式
00:52
that can buy millions数百万 of tickets, two at a time.
17
37000
2000
兩張一次甘買幾百萬張飛
00:54
CAPTCHAsCAPTCHAs are used all over the Internet互聯網.
18
39000
2000
驗證碼在網絡上嘅應用十分普遍
00:56
And since因為 they're used so often經常,
19
41000
2000
既然我們如此頻繁嘅使用佢
00:58
a lot of times the precise精確 sequence序列 of random隨機 characters字符 that is shown顯示 to the user用戶
20
43000
2000
很多時候用戶就會見到一嘀
01:00
is not so fortunate好彩呀.
21
45000
2000
奇怪嘅文字排序。
01:02
So this is an example例子 from the Yahoo雅虎 registration註冊 page網頁.
22
47000
3000
哩度系一個來自雅虎註冊頁嘅例子
01:05
The random隨機 characters字符 that happened發生 to be shown顯示 to the user用戶
23
50000
2000
展示俾用戶嘅隨機字符“W,A,I,T"
01:07
were W, A, I, T, which, of course課程, spell拼寫 a word.
24
52000
3000
啱好可以組成一個詞,等待。
01:10
But the best最好 part部分 is the message消息
25
55000
3000
最有趣嘅系
01:13
that the Yahoo雅虎 help desk got about 20 minutes分鐘 later之後.
26
58000
3000
20分鐘後幫助後台收到嘅訊息。
01:16
Text文本: "Help! I've been waiting for over 20 minutes分鐘, and nothing happens發生."
27
61000
3000
文字:救命啊!我都等咗廿幾分鐘啦,都冇任何變化啊。
01:19
(Laughter笑聲)
28
64000
4000
(笑聲)
01:23
This person thought they needed需要 to wait.
29
68000
2000
佢縱以為網站系叫佢等。
01:25
This of course課程, is not as bad as this poor可憐 person.
30
70000
3000
當然縱有更黑嘅
01:28
(Laughter笑聲)
31
73000
2000
(笑聲)
01:30
CAPTCHACaptcha Project項目 is something that we did here at Carnegie卡內基 Mellon梅隆 over 10 years ago,
32
75000
3000
驗證碼計劃系我哋十多年前系卡內基梅隆大學搞起嘅
01:33
and it's been used everywhere周圍.
33
78000
2000
並開始被廣泛應用
01:35
Let me now tell you about a project項目 that we did a few幾個 years later之後,
34
80000
2000
以嘎等我哋來傾傾我哋幾年後搞嘅另一個項目
01:37
which is sort排序 of the next evolution演化 of CAPTCHACaptcha.
35
82000
3000
亦即系驗證碼的新版本
01:40
This is a project項目 that we call reCAPTCHA驗證碼,
36
85000
2000
我哋稱之為“reCAPTCHA”
01:42
which is something that we started初時 here at Carnegie卡內基 Mellon梅隆,
37
87000
2000
哩個計劃系從卡內基梅隆大學開始
01:44
then we turned打開 it into a startup啟動 company公司.
38
89000
2000
然後我哋將佢變成一間創業公司
01:46
And then about a year and a half一半 ago,
39
91000
2000
一年半之前
01:48
Google谷歌 actually講真 acquired獲得 this company公司.
40
93000
2000
Google收購咗裡個公司
01:50
So let me tell you how this project項目 started初時.
41
95000
2000
以嘎我來講講哩個項目系點開始噶
01:52
So this project項目 started初時 from the following以下 realization實現:
42
97000
3000
哩個項目出於如下認識:
01:55
It turns輪流 out that approximately大約 200 million CAPTCHAsCAPTCHAs
43
100000
2000
每天在全球範圍之內有大概2億次
01:57
are typed類型 every day每日 by people around the world世界.
44
102000
3000
驗證碼嘅輸入
02:00
When I first heard聽到 this, I was quite都幾 proud驕傲 of myself自己.
45
105000
2000
我第一次聽到嘅時候都幾自豪
02:02
I thought, look at the impact影響 that my research研究 has had.
46
107000
2000
我諗,我哋嘅研究都幾大影響噶喔
02:04
But then I started初時 feeling感覺 bad.
47
109000
2000
跟住我就覺得很難受
02:06
See here's呢度有 the thing, each每個 time you type類型 a CAPTCHACaptcha,
48
111000
2000
因為你每輸入一次驗證碼
02:08
essentially基本上 you waste嘥晒 10 seconds of your time.
49
113000
3000
你就浪費咗10秒鐘嘅時間。
02:11
And if you multiply that by 200 million,
50
116000
2000
如果你將佢乘以2億
02:13
you get that humanity人類 as a whole整個 is wasting嘥晒 about 500,000 hours小時 every day
51
118000
3000
甘人類就因為輸入驗證碼
02:16
typing打字 these annoying CAPTCHAsCAPTCHAs.
52
121000
2000
而每天浪費咗50萬個小時
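The arithmetic behind this estimate can be checked in a few lines. This is only a sketch using the two figures quoted in the talk (200 million CAPTCHAs a day, 10 seconds each); the variable names are illustrative.

```python
# Figures quoted in the talk.
captchas_per_day = 200_000_000   # CAPTCHAs typed worldwide per day
seconds_per_captcha = 10         # time spent typing each one

# Total time spent per day, converted from seconds to hours.
total_seconds = captchas_per_day * seconds_per_captcha
hours_per_day = total_seconds / 3600

print(f"{hours_per_day:,.0f} hours per day")
```

The exact product is a little over 555,000 hours, which the talk rounds to "about 500,000 hours every day".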
02:18
So then I started初時 feeling感覺 bad.
53
123000
2000
所以我就開始覺得唔爽。
02:20
(Laughter笑聲)
54
125000
2000
(笑聲)
02:22
And then I started初時 thinking思維, well, of course課程, we can't just get rid擺脫 of CAPTCHAsCAPTCHAs,
55
127000
3000
跟住我就開始諗,恩,當然啦,我哋冇可能就此拋棄驗證碼系統
02:25
because the security安全 of the WebWeb sort排序 of depends要睇 on them.
56
130000
2000
因為網頁嘅安全指意緊佢
02:27
But then I started初時 thinking思維, is there any way we can use this effort努力
57
132000
3000
但是否有乜辦法可以將佢利用起來
02:30
for something that is good for humanity人類?
58
135000
2000
為人類做嘀好事?
02:32
So see, here's呢度有 the thing.
59
137000
2000
恩,關鍵在於,
02:34
While you're typing打字 a CAPTCHACaptcha, during those 10 seconds,
60
139000
2000
當你輸入一個驗證碼嘅時候,在果10秒鐘,
02:36
your brain大腦 is doing something amazing驚人.
61
141000
2000
你嘅大腦系度做緊一嘀好神奇嘅嘢。
02:38
Your brain大腦 is doing something that computers計數機 cannot唔可以 yet尚未 do.
62
143000
2000
哩個系電腦未可以做到嘅嘢。
02:40
So can we get you to do useful有用 work for those 10 seconds?
63
145000
3000
我哋可唔可以用裡10秒做嘀有用嘅嘢呢?
02:43
Another另一個 way of putting it is,
64
148000
2000
換種講法,
02:45
is there some humongous巨大 problem個問題 that we cannot唔可以 yet尚未 get computers計數機 to solve解決,
65
150000
2000
系唔系有嘀計算機無法解決嘅龐大問題
02:47
yet尚未 we can split分裂 into tiny 10-second chunks
66
152000
3000
而我哋可以將之分為10秒10秒嘅子問題呢?
02:50
such that each每個 time somebody有人 solves解決 a CAPTCHACaptcha
67
155000
2000
甘樣,每次有人輸入一個驗證碼嘅時候,
02:52
they solve解決 a little bit of this problem個問題?
68
157000
2000
佢哋就解決咗哩個問題嘅其中小小部分。
02:54
And the answer回答 to that is "yes," and this is what we're doing now.
69
159000
2000
答案系肯定嘅,而這就是我哋做緊嘅嘢。
02:56
So what you may可能 not know is that nowadays現時 while you're typing打字 a CAPTCHACaptcha,
70
161000
3000
你可能唔知道以嘎你每一次輸入驗證碼嘅時候,
02:59
not only are you authenticating認證 yourself自己 as a human人類,
71
164000
2000
你唔單止證明咗自己人類嘅身份,
03:01
but in addition除咗 you're actually講真 helping幫手 us to digitize數字化 books.
72
166000
2000
同時亦系度幫緊我哋數字化圖書。
03:03
So let me explain解釋 how this works工程.
73
168000
2000
等我來解釋一下:
03:05
So there's a lot of projects項目 out there trying試圖 to digitize數字化 books.
74
170000
2000
目前已經有好多嘅項目做緊數字化圖書,
03:07
Google谷歌 has one. The Internet互聯網 Archive檔案 has one.
75
172000
3000
Google有一個,“互聯網檔案”都有一個。
03:10
Amazon亞馬遜, now with the KindleKindle, is trying試圖 to digitize數字化 books.
76
175000
2000
亞馬遜同埋Kindle,亦都試圖數字化圖書。
03:12
Basically基本上 the way this works工程
77
177000
2000
基本上數字化圖書的方式
03:14
is you start初時 with an old book.
78
179000
2000
系要從一本舊書開始。
03:16
You've seen看到 those things, right? Like a book?
79
181000
2000
你睇過書葛嚯? 一本書?
03:18
(Laughter笑聲)
80
183000
2000
(笑聲)
03:20
So you start初時 with a book, and then you scan掃描 it.
81
185000
2000
你從一本紙質書開始,然後掃描佢。
03:22
Now scanning掃描 a book
82
187000
2000
掃描一本書
03:24
is like taking採取 a digital數字 photograph of every page網頁 of the book.
83
189000
2000
就好似對一本書嘅每一頁影一張數字相片。
03:26
It gives you an image圖像 for every page網頁 of the book.
84
191000
2000
佢可以給你書中每一頁嘅圖像。
03:28
This is an image圖像 with text文本 for every page網頁 of the book.
85
193000
2000
哩嘀圖像包括了書中每一頁上的文字。
03:30
The next step in the process過程
86
195000
2000
下一步就係
03:32
is that the computer計數機 needs需要 to be able to decipher破譯 all of the words的話 in this image圖像.
87
197000
3000
電腦需要能夠識別這個圖像中嘅所有文字。
03:35
That's using使用 a technology技術 called OCROcr,
88
200000
2000
目前使用的技術叫做OCR
03:37
for optical光學 character字符 recognition識別,
89
202000
2000
亦即系,光學字符識別。
03:39
which takes a picture圖片 of text文本
90
204000
2000
OCR首先獲取文字的圖像,
03:41
and tries試圖 to figure out what text文本 is in there.
91
206000
2000
然後嘗試辨認出什麼文字在那個圖像中。
03:43
Now the problem個問題 is that OCROcr is not perfect完美.
92
208000
2000
問題在於OCR並不完美。
03:45
Especially尤其係 for older books
93
210000
2000
尤其對於古舊嘅圖書而言,
03:47
where the ink油墨 has faded褪色 and the pages頁面 have turned打開 yellow黃色,
94
212000
3000
上面嘅墨跡已經變淡,紙頁亦開始變黃,
03:50
OCROcr cannot唔可以 recognize認識 a lot of the words的話.
95
215000
2000
好多文字OCR唔可以識別
03:52
For example例子, for things that were written more than 50 years ago,
96
217000
2000
比如,對五十多年前嘅書,
03:54
the computer計數機 cannot唔可以 recognize認識 about 30 percent百分比 of the words的話.
97
219000
3000
大概有30%的文字唔可以被電腦識別。
03:57
So what we're doing now
98
222000
2000
所以,我哋正在做嘅事
03:59
is we're taking採取 all of the words的話 that the computer計數機 cannot唔可以 recognize認識
99
224000
2000
就係將所有哩嘀電腦唔可以識別嘅文字摞出來,
04:01
and we're getting得到 people to read them for us
100
226000
2000
然後讓其他人在網上輸入驗證碼的同時,
04:03
while they're typing打字 a CAPTCHACaptcha on the Internet互聯網.
101
228000
2000
幫我地將佢哋讀出來。
04:05
So the next time you type類型 a CAPTCHACaptcha, these words的話 that you're typing打字
102
230000
3000
所以,下一次你輸入驗證碼嘅時候,你輸入嘅果嘀文字
04:08
are actually講真 words的話 that are coming from books that are being digitized數字化
103
233000
3000
實質上來自於正在被數字化嘅圖書之中
04:11
that the computer計數機 could not recognize認識.
104
236000
2000
而所有這些文字都是電腦無法識別嘅。
04:13
And now the reason原因 we have two words的話 nowadays現時 instead相反 of one
105
238000
2000
另外,現在同時出現兩個詞而唔系一個詞
04:15
is because, you see, one of the words的話
106
240000
2000
系因為其中一個詞
04:17
is a word that the system系統 just got out of a book,
107
242000
2000
系系統從書中撿出嘅無法識別嘅單詞
04:19
it didn't know what it was, and it's going to present目前 it to you.
108
244000
3000
系統並唔知道哩個詞系乜嘢,
04:22
But since因為 it doesn't know the answer回答 for it, it cannot唔可以 grade年級 it for you.
109
247000
3000
但系既然佢唔知答案系乜嘢,佢都唔可以判斷你是否答啱。
04:25
So what we do is we give you another另一個 word,
110
250000
2000
所以我哋嘅做法就系俾你另一個詞,
04:27
one for which the system系統 does know the answer回答.
111
252000
2000
另一個系統知道答案嘅詞。
04:29
We don't tell you which one's人嘅 which, and we say, please type類型 both.
112
254000
2000
我哋唔講俾你知邊個系邊個,我哋淨系講,請將兩個詞都輸入。
04:31
And if you type類型 the correct word
113
256000
2000
如果你正確甘輸入咗
04:33
for the one for which the system系統 already knows the answer回答,
114
258000
2000
電腦知道答案嘅果個詞,
04:35
it assumes假設 you are human人類,
115
260000
2000
電腦就認為你系人類,
04:37
and it also gets得到 some confidence信心 that you typed類型 the other word correctly.
116
262000
2000
同時亦對你正確輸入另一個詞多咗幾分信心。
04:39
And if we repeat重複 this process過程 to like 10 different不同 people
117
264000
3000
如果我哋對10個人重複哩個過程,
04:42
and all of them agree同意 on what the new新增功能 word is,
118
267000
2000
而佢地都同意果個新詞系乜嘢。
04:44
then we get one more word digitized數字化 accurately準確.
119
269000
2000
甘我哋就可以正確嘅多數字化一個新詞。
04:46
So this is how the system系統 works工程.
120
271000
2000
這就係哩個系統嘅工作原理
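The two-word scheme described above can be sketched roughly as follows. This is a minimal illustration, not the actual reCAPTCHA implementation: the function names, the case-insensitive comparison, and the strict "unanimous agreement" rule are assumptions made for the example, and the threshold of 10 comes from the "like 10 different people" remark in the talk.

```python
from collections import Counter


def grade_response(control_answer, typed_control, typed_unknown, votes):
    """Grade one CAPTCHA response (hypothetical helper).

    The control word has a known answer, so it decides whether the user
    passes as human. Only when it is typed correctly does the user's
    reading of the unknown word count as a vote.
    """
    passed = typed_control.strip().lower() == control_answer.strip().lower()
    if passed:
        votes[typed_unknown.strip().lower()] += 1
    return passed


def accepted_word(votes, required_agreement=10):
    """Return the unknown word once enough users agree on it, else None.

    Simplification: every trusted vote must name the same word.
    """
    if len(votes) == 1:
        word, count = next(iter(votes.items()))
        if count >= required_agreement:
            return word
    return None


votes = Counter()
for _ in range(10):
    grade_response("morning", "morning", "overlooks", votes)

# A user who gets the control word wrong is rejected and their vote is ignored.
grade_response("morning", "mourning", "xyz", votes)

print(accepted_word(votes))  # overlooks
```

A production system would presumably tolerate occasional disagreement rather than require unanimity; the strict rule here simply mirrors the wording of the talk.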
04:48
And basically基本上, since因為 we released釋放 it about three or four years ago,
121
273000
3000
同時,因為我哋已經系三四年前發布咗哩個新系統,
04:51
a lot of websites網站 have started初時 switching
122
276000
2000
許多網站已經開始
04:53
from the old CAPTCHACaptcha where people wasted嘥晒 their佢哋 time
123
278000
2000
從以前浪費用戶時間嘅舊驗證碼系統
04:55
to the new新增功能 CAPTCHACaptcha where people are helping幫手 to digitize數字化 books.
124
280000
2000
轉去哩個新嘅可以幫助數字化圖書嘅新系統。
04:57
So for example例子, Ticketmaster票務.
125
282000
2000
例如,系Ticketmaster的網站上,
04:59
So every time you buy tickets on Ticketmaster票務, you help to digitize數字化 a book.
126
284000
3000
你每一次買票,都系幫緊數字化圖書。
05:02
FacebookFacebook: Every time you add添加 a friend朋友 or poke somebody有人,
127
287000
2000
Facebook,你每一次加好友或者打招呼,
05:04
you help to digitize數字化 a book.
128
289000
2000
你都在幫緊數字化圖書。
05:06
TwitterTwitter and about 350,000 other sites網站 are all using使用 reCAPTCHA驗證碼.
129
291000
3000
twitter同埋另外350000個網站都用緊reCAPTCHA。
05:09
And in fact事實, the number數量 of sites網站 that are using使用 reCAPTCHA驗證碼 is so high
130
294000
2000
事實上,使用reCAPTCHA 嘅網站數量如此之多,
05:11
that the number數量 of words的話 that we're digitizing數字化 per day is really, really large.
131
296000
3000
以至於我哋每天數字化嘅文字數量亦都十分之高。
05:14
It's about 100 million a day,
132
299000
2000
大概每日有1億之多,
05:16
which is the equivalent等傚 of about two and a half一半 million books a year.
133
301000
4000
相當於每年數字化咗250萬本書。
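As a quick consistency check on these two figures, the sketch below works out the average book length they imply. The talk does not state a words-per-book number; it is derived here only from the quoted 100 million words a day and 2.5 million books a year.

```python
# Figures quoted in the talk.
words_per_day = 100_000_000
books_per_year = 2_500_000

# Implied average book length (an inference, not a number from the talk).
words_per_year = words_per_day * 365
words_per_book = words_per_year / books_per_year

print(f"{words_per_book:,.0f} words per book")
```

That works out to 14,600 words per book, so the two figures only line up if the books counted are fairly short on average or if the numbers are loosely rounded.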
05:20
And this is all being done one word at a time
134
305000
2000
而這一切都系一個字一個字甘
05:22
by just people typing打字 CAPTCHAsCAPTCHAs on the Internet互聯網.
135
307000
2000
由用戶在網上輸入驗證碼得來嘅。
05:24
(Applause掌聲)
136
309000
8000
(鼓掌)
05:32
Now of course課程,
137
317000
2000
當然,
05:34
since因為 we're doing so many好多 words的話 per day,
138
319000
2000
既然我哋每天都可以完成甘多字,
05:36
funny有趣 things can happen發生.
139
321000
2000
有趣嘅事可能就會發生。
05:38
And this is especially尤其係 true真係 because now we're giving people
140
323000
2000
尤其現在我哋同時向用戶
05:40
two randomly隨機 chosen選擇 English英文 words的話 next to each每個 other.
141
325000
2000
展示兩個隨機產生、並列出現嘅英文單詞。
05:42
So funny有趣 things can happen發生.
142
327000
2000
好笑的事會發生。
05:44
For example例子, we presented提出 this word.
143
329000
2000
比如,我哋展示了哩個詞。
05:46
It's the word "Christians基督徒"; there's nothing wrong with it.
144
331000
2000
單詞“基督徒”,並冇乜問題。
05:48
But if you present目前 it along沿 with another另一個 randomly隨機 chosen選擇 word,
145
333000
3000
但系如果你同另外一個隨機選擇出來嘅詞擺埋一齊,
05:51
bad things can happen發生.
146
336000
2000
就可能搞獲。
05:53
So we get this. (Text文本: bad christians基督徒)
147
338000
2000
我哋就見到甘樣嘅組合。(文字:壞基督徒)
05:55
But it's even worse更糟, because the particular特定 website網站 where we showed表明 this
148
340000
3000
但更衰嘅系,這組文字啱好出現系一個叫
05:58
actually講真 happened發生 to be called The Embassy大使館 of the Kingdom王國 of God.
149
343000
3000
“上帝王國的使館”的網站上。
06:01
(Laughter笑聲)
150
346000
2000
(笑声)
06:03
Oops哎呀.
151
348000
2000
哎呀
06:05
(Laughter笑聲)
152
350000
3000
(笑声)
06:08
Here's呢度有 another另一個 really bad one.
153
353000
2000
哩度縱有一個
06:10
JohnEdwardsJohnEdwards.comCom
154
355000
2000
JohnEdwards.com
06:12
(Text文本: Damn抵死 liberal自由)
155
357000
3000
(文字:可恶的自由派)
06:15
(Laughter笑聲)
156
360000
2000
(笑声)
06:17
So we keep on insulting侮辱 people left and right every day每日.
157
362000
3000
所以我哋每天都不停嘅侮辱各種人。
06:20
Now, of course課程, we're not just insulting侮辱 people.
158
365000
2000
当然,我哋不单只侮辱人。
06:22
See here's呢度有 the thing, since因為 we're presenting提出 two randomly隨機 chosen選擇 words的話,
159
367000
3000
就这样,既然我哋展示两个随机产生的单词,
06:25
interesting有趣 things can happen發生.
160
370000
2000
有趣嘅事可能发生。
06:27
So this actually講真 has given rise上升
161
372000
2000
所以佢实际上催生了
06:29
to a really big Internet互聯網 memeMeme
162
374000
3000
一个十分庞大嘅、
06:32
that tens成千上萬 of thousands數以千計 of people have participated參與 in,
163
377000
2000
有成千上万人参与嘅互联网流行,
06:34
which is called CAPTCHACaptcha art藝術.
164
379000
2000
叫做“验证码艺术”。
06:36
I'm sure some of you have heard聽到 about it.
165
381000
2000
我相信你地其中一些人已经听说过佢了。
06:38
Here's呢度有 how it works工程.
166
383000
2000
佢系甘运行嘅。
06:40
Imagine想象 you're using使用 the Internet互聯網 and you see a CAPTCHACaptcha
167
385000
2000
想像你自己正系度用紧互联网,
06:42
that you think is somewhat有D peculiar奇特,
168
387000
2000
如果你看到一个有嘀奇怪嘅验证码,
06:44
like this CAPTCHACaptcha. (Text文本: invisible無形 toaster多士爐)
169
389000
2000
好似甘。(文字:隐形嘅土司机)
06:46
Then what you're supposed應該 to do is you take a screen屏幕 shot拍攝 of it.
170
391000
2000
然后,你应该做嘅系将佢截图落来,
06:48
Then of course課程, you fill填補 out the CAPTCHACaptcha
171
393000
2000
跟住,当然啦,你将个验证码填好,
06:50
because you help us digitize數字化 a book.
172
395000
2000
因为甘你就帮紧我哋数字化图书。
06:52
But then, first you take a screen屏幕 shot拍攝,
173
397000
2000
所以,你首先截图落来,
06:54
and then you draw something that is related相關 to it.
174
399000
2000
然后画一些与之相关嘅嘢。
06:56
(Laughter笑聲)
175
401000
2000
(笑声)
06:58
That's how it works工程.
176
403000
3000
就系甘。
07:01
There are tens成千上萬 of thousands數以千計 of these.
177
406000
3000
哩度有成千上万甘样嘅作品。
07:04
Some of them are very cute得意. (Text文本: clenched握緊 it)
178
409000
2000
有嘀都好可爱咖。(文字:捉緊佢)
07:06
(Laughter笑聲)
179
411000
2000
(笑声)
07:08
Some of them are funnier有趣.
180
413000
2000
有嘀更加搞笑。
07:10
(Text文本: stoned飄飄欲仙 founders創始人)
181
415000
3000
(文字:飲high咗嘅建國者)
07:13
(Laughter笑聲)
182
418000
3000
(笑声)
07:16
And some of them,
183
421000
2000
縱有一嘀
07:18
like paleontological古生物 shvisleshvisle,
184
423000
3000
比如“古生物嘅史維錘”
07:21
they contain包含 Snoop探聽 DoggDogg.
185
426000
2000
佢哋包括Snoop Dogg。
07:23
(Laughter笑聲)
186
428000
3000
(笑聲)
07:26
Okay, so this is my favorite中意 number數量 of reCAPTCHA驗證碼.
187
431000
2000
哩個系我最中意嘅reCAPTCHA數字
07:28
So this is the favorite中意 thing that I like about this whole整個 project項目.
188
433000
3000
我最中意嘅部分
07:31
This is the number數量 of distinct不同 people
189
436000
2000
哩個數字系
07:33
that have helped幫手 us digitize數字化 at least最小 one word out of a book through透過 reCAPTCHA驗證碼:
190
438000
3000
通過reCAPTCHA幫助我哋數字化圖書嘅人數
07:36
750 million,
191
441000
2000
7億5千萬,
07:38
which is a little over 10 percent百分比 of the world's世界嘅 population人口,
192
443000
2000
剛好小小地超過世界人口的百分之十,
07:40
has helped幫手 us digitize數字化 human人類 knowledge知識.
193
445000
2000
已經幫助咗我哋數字化人類知識,
07:42
And it is numbers數字 like these that motivate激勵 my research研究 agenda議程.
194
447000
3000
就是甘樣嘅數字激勵咗我嘅研究計劃。
07:45
So the question個問題 that motivates激勵 my research研究 is the following以下:
195
450000
3000
所以激勵我研究嘅問題如下:
07:48
If you look at humanity's人類嘅 large-scale大規模 achievements成就,
196
453000
2000
如果你睇睇人類的大規模的成就,
07:50
these really big things
197
455000
2000
果嘀歷史上嘅、人類聚集起來
07:52
that humanity人類 has gotten得到 together一起 and done historically歷史 --
198
457000
3000
一起完成嘅真正“大”事
07:55
like for example例子, building建築 the pyramids金字塔 of Egypt埃及
199
460000
2000
譬如,建造埃及嘅金字塔
07:57
or the Panama巴拿馬 Canal
200
462000
2000
或者建成巴拿馬運河
07:59
or putting a man on the Moon月亮 --
201
464000
2000
或者將人類送上月球
08:01
there is a curious好奇 fact事實 about them,
202
466000
2000
哩度有一個幾有趣嘅關於佢嘅事實
08:03
and it is that they were all done with about the same相同 number數量 of people.
203
468000
2000
那就系佢哋全部都系被差唔多數量嘅人完成嘅。
08:05
It's weird奇怪; they were all done with about 100,000 people.
204
470000
3000
很奇怪,佢哋全部都系被差唔多10萬人完成嘅。
08:08
And the reason原因 for that is because, before the Internet互聯網,
205
473000
3000
其中嘅原因系,系互聯網出現之前,
08:11
coordinating協調 more than 100,000 people,
206
476000
2000
聯合超過10萬人——更不用說付錢俾佢哋
08:13
let alone一手一腳 paying支付 them, was essentially基本上 impossible冇可能.
207
478000
3000
系幾乎冇可能嘅。
08:16
But now with the Internet互聯網, I've just shown顯示 you a project項目
208
481000
2000
但系因為有咗互聯網,我剛剛展示俾你地嘅項目
08:18
where we've我哋都 gotten得到 750 million people
209
483000
2000
就有7億5千萬人參與
08:20
to help us digitize數字化 human人類 knowledge知識.
210
485000
2000
來幫助我哋數字化人類知識。
08:22
So the question個問題 that motivates激勵 my research研究 is,
211
487000
2000
所以激勵我哋研究嘅問題就系
08:24
if we can put a man on the Moon月亮 with 100,000,
212
489000
3000
如果我哋可以用10萬人就將人類送上月球,
08:27
what can we do with 100 million?
213
492000
2000
我哋可以用1億人做嘀乜嘢?
08:29
So based基於 on this question個問題,
214
494000
2000
基於裡個問題,
08:31
we've我哋都 had a lot of different不同 projects項目 that we've我哋都 been working工作 on.
215
496000
2000
我哋開展咗許多唔同嘅項目。
08:33
Let me tell you about one that I'm most excited興奮 about.
216
498000
3000
等我同你哋介紹下我最為之興奮嘅一個。
08:36
This is something that we've我哋都 been semi-quietly半靜 working工作 on
217
501000
2000
哩個項目我哋已經“半地下”甘進行咗
08:38
for the last year and a half一半 or so.
218
503000
2000
差唔多一年半。
08:40
It hasn't yet尚未 been launched推出. It's called DuolingoDuolingo.
219
505000
2000
佢縱未正式運行,叫做Duolingo.
08:42
Since因為 it hasn't been launched推出, shhhhhshhhhh!
220
507000
2000
因為我哋縱沒投入使用,所以,噓!
08:44
(Laughter笑聲)
221
509000
2000
(笑聲)
08:46
Yeah, I can trust信任 you'll你咪會 do that.
222
511000
2000
我相信你哋都會守口如瓶嘅。
08:48
So this is the project項目. Here's呢度有 how it started初時.
223
513000
2000
哩個項目系甘樣開始嘅。
08:50
It started初時 with me posing構成 a question個問題 to my graduate畢業 student學生,
224
515000
2000
佢開始於我向我嘅研究生Severin Hacker提出嘅一個問題
08:52
SeverinSeverin Hacker黑客.
225
517000
2000
Severin Hacker。
08:54
Okay, that's SeverinSeverin Hacker黑客.
226
519000
2000
這就系佢。
08:56
So I posed提出 the question個問題 to my graduate畢業 student學生.
227
521000
2000
我向佢提咗一個問題
08:58
By the way, you did hear聽到 me correctly;
228
523000
2000
順便一提,你冇聽錯,
09:00
his last name名字 is Hacker黑客.
229
525000
2000
佢確實姓“Hacker(駭客)”
09:02
So I posed提出 this question個問題 to him:
230
527000
2000
我向佢提出咗哩個問題:
09:04
How can we get 100 million people
231
529000
2000
我哋點樣先可以讓1億人免費甘
09:06
translating正在翻譯 the WebWeb into every major主要 language語言 for free自由?
232
531000
3000
將互聯網翻譯為每一種主要嘅語言?
09:09
Okay, so there's a lot of things to say about this question個問題.
233
534000
2000
恩,關於哩個題目我哋可以有好多可以講。
09:11
First of all, translating正在翻譯 the WebWeb.
234
536000
2000
首先系翻譯網頁。
09:13
So right now the WebWeb is partitioned分區 into multiple多個 languages語言.
235
538000
3000
目前,互聯網被分為多種語言。
09:16
A large fraction分數 of it is in English英文.
236
541000
2000
其中很大一部分系英文。
09:18
If you don't know any English英文, you can't access訪問 it.
237
543000
2000
如果你唔識任何英文,你就冇辦法接觸到佢哋。
09:20
But there's large fractions分數 in other different不同 languages語言,
238
545000
2000
但同時亦有很大部分系其他語言,
09:22
and if you don't know those languages語言, you can't access訪問 it.
239
547000
3000
同樣的,如果你唔識果嘀語言,你亦無法接觸到。
09:25
So I would like to translate翻譯 all of the WebWeb, or at least最小 most of the WebWeb,
240
550000
3000
所以我好想可以將整個互聯網,或者至少系大部分
09:28
into every major主要 language語言.
241
553000
2000
翻譯成每一種主要語言。
09:30
So that's what I would like to do.
242
555000
2000
這就係我想做嘅事。
09:32
Now some of you may可能 say, why can't we use computers計數機 to translate翻譯?
243
557000
3000
有嘀人可能會講,點解唔用電腦來翻譯呢?
09:35
Why can't we use machine translation翻譯?
244
560000
2000
點解我哋唔用機器翻譯?
09:37
Machine translation翻譯 nowadays現時 is starting初時 to translate翻譯 some sentences句子 here and there.
245
562000
2000
機器翻譯以嘎已經開始時不時嘅出現,
09:39
Why can't we use it to translate翻譯 the whole整個 WebWeb?
246
564000
2000
點解我哋唔用佢翻譯整個互聯網呢?
09:41
Well the problem個問題 with that is that it's not yet尚未 good enough
247
566000
2000
恩, 問題在於,機器翻譯縱未夠好。
09:43
and it probably可能 won't唔會 be for the next 15 to 20 years.
248
568000
2000
而且哩個問題在未來15、20年後亦唔一定能解決。
09:45
It makes使 a lot of mistakes錯誤.
249
570000
2000
佢出錯太多。
09:47
Even when it doesn't make a mistake錯誤,
250
572000
2000
即使佢冇出錯,
09:49
since因為 it makes使 so many好多 mistakes錯誤, you don't know whether係唔係 to trust信任 it or not.
251
574000
3000
但因為佢出錯太多,你好難知道系唔系應該相信佢。
09:52
So let me show顯示 you an example例子
252
577000
2000
等我俾你哋睇一個例子,
09:54
of something that was translated目標語言 with a machine.
253
579000
2000
系用機器翻譯出來嘅嘢。
09:56
Actually講真 it was a forum論壇 post發布.
254
581000
2000
哩個系一篇網上論壇嘅文章,
09:58
It was somebody有人 who was trying試圖 to ask問吓 a question個問題 about JavaScriptJavascript.
255
583000
3000
文章系一個網民想問一個關於JavaScript嘅問題。
10:01
It was translated目標語言 from Japanese日文 into English英文.
256
586000
3000
佢從日文被翻譯成英文。
10:04
So I'll just let you read.
257
589000
2000
你可以睇睇。
10:06
This person starts初時 apologizing道歉
258
591000
2000
哩個人首先為使用電腦翻譯
10:08
for the fact事實 that it's translated目標語言 with a computer計數機.
259
593000
2000
而道歉。
10:10
So the next sentence句子 is going to be the preamble序言 to the question個問題.
260
595000
3000
下一個句子開始入題
10:13
So he's just explaining解釋 something.
261
598000
2000
佢系度解釋緊一嘀嘢。
10:15
Remember記得, it's a question個問題 about JavaScriptJavascript.
262
600000
3000
請留意,哩個系一個關於JavaScript嘅問題。
10:19
(Text文本: At often經常, the goat-time山羊時間 install安裝 a error錯誤 is vomit.)
263
604000
4000
(文字:常常,山羊時間安裝一個錯誤系嘔吐。)
10:23
(Laughter笑聲)
264
608000
4000
(笑聲)
10:27
Then comes the first part部分 of the question個問題.
265
612000
3000
接著系哩個問題嘅第一個部分。
10:30
(Text文本: How many好多 times like the wind, a pole, and the dragon?)
266
615000
4000
(文字:有幾多次好似風,柱,龍?)
10:34
(Laughter笑聲)
267
619000
2000
(笑聲)
10:36
Then comes my favorite中意 part部分 of the question個問題.
268
621000
3000
跟住系我最中意嘅部分。
10:39
(Text文本: This insult侮辱 to father's老竇 stones石頭?)
269
624000
3000
(文字:這對父親石嘅侮辱?)
10:42
(Laughter笑聲)
270
627000
2000
(笑聲)
10:44
And then comes the ending結束, which is my favorite中意 part部分 of the whole整個 thing.
271
629000
3000
接著到咗問題的最後部分,整件事我最中意嘅部分。
10:47
(Text文本: Please apologize道歉 for your stupidity愚蠢. There are a many好多 thank you.)
272
632000
4000
(文字:請為你嘅愚蠢而道歉。裡度有對你嘅許多感謝。)
10:51
(Laughter笑聲)
273
636000
2000
(笑聲)
10:53
Okay, so computer計數機 translation翻譯, not yet尚未 good enough.
274
638000
2000
所以,電腦翻譯並沒夠好。
10:55
So back to the question個問題.
275
640000
2000
回到問題
10:57
So we need people to translate翻譯 the whole整個 WebWeb.
276
642000
3000
我哋需要人來翻譯互聯網。
11:00
So now the next question個問題 you may可能 have is,
277
645000
2000
你可能要問嘅下一個問題可能系
11:02
well why can't we just pay支付 people to do this?
278
647000
2000
點解我哋唔使錢叫人來做呢?
11:04
We could pay支付 professional專業 language語言 translators翻譯 to translate翻譯 the whole整個 WebWeb.
279
649000
3000
我哋可以使錢請專業嘅語言翻譯家來翻譯整個互聯網。
11:07
We could do that.
280
652000
2000
我哋可以甘做
11:09
Unfortunately不幸, it would be extremely expensive昂貴.
281
654000
2000
但不幸嘅系,甘樣可能會非常貴。
11:11
For example例子, translating正在翻譯 a tiny, tiny fraction分數 of the whole整個 WebWeb, Wikipedia維基百科,
282
656000
3000
例如,將互聯網中嘅極小極小嘅一部分——維基百科,
11:14
into one other language語言, Spanish西班牙文.
283
659000
3000
翻譯為西班牙文。
11:17
Wikipedia維基百科 exists存在 in Spanish西班牙文,
284
662000
2000
雖然有西班牙文嘅維基百科,
11:19
but it's very small compared比較 to the size大小 of English英文.
285
664000
2000
但相比於英文維基百科,佢嘅內容很少。
11:21
It's about 20 percent百分比 of the size大小 of English英文.
286
666000
2000
大概只有英文維基百科嘅20%。
11:23
If we wanted to translate翻譯 the other 80 percent百分比 into Spanish西班牙文,
287
668000
3000
如果我哋想將另外果80%翻譯為西班牙文,
11:26
it would cost成本 at least最小 50 million dollars美元 --
288
671000
2000
可能要使五千萬美元
11:28
and this is at even the most exploited利用, outsourcing外包 country國家 out there.
289
673000
3000
即使系最便宜嘅服務外包國家
11:31
So it would be very expensive昂貴.
290
676000
2000
因此人工翻譯會好貴。
11:33
So what we want to do is we want to get 100 million people
291
678000
2000
我哋想做嘅系將1億人聯合起來,
11:35
translating正在翻譯 the WebWeb into every major主要 language語言
292
680000
2000
將互聯網翻譯為任何一種主要語言,
11:37
for free自由.
293
682000
2000
而唔使一分錢。
11:39
Now if this is what you want to do,
294
684000
2000
如果哩個系你想做嘅,
11:41
you pretty quickly迅速 realize實現 you're going to run運行 into two pretty big hurdles障礙,
295
686000
2000
你好快就會發現自己將面臨兩個幾大嘅攔路石
11:43
two big obstacles障礙.
296
688000
2000
兩個大障礙。
11:45
The first one is a lack缺乏 of bilinguals雙語.
297
690000
3000
第一個就是缺少雙語人才。
11:48
So I don't even know
298
693000
2000
我甚至唔知道
11:50
if there exists存在 100 million people out there using使用 the WebWeb
299
695000
3000
系唔系有1億擁有足夠雙語能力嘅網友
11:53
who are bilingual雙語 enough to help us translate翻譯.
300
698000
2000
會幫我哋翻譯。
11:55
That's a big problem個問題.
301
700000
2000
哩個系個大問題。
11:57
The other problem個問題 you're going to run運行 into is a lack缺乏 of motivation動機.
302
702000
2000
另一個你會遇到嘅問題系缺少激勵。
11:59
How are we going to motivate激勵 people
303
704000
2000
我哋點樣激勵人哋
12:01
to actually講真 translate翻譯 the WebWeb for free自由?
304
706000
2000
去免費翻譯網頁呢?
12:03
Normally通常, you have to pay支付 people to do this.
305
708000
3000
通常來講,你必須使錢僱人來做哩嘀。
12:06
So how are we going to motivate激勵 them to do it for free自由?
306
711000
2000
點樣能激勵人哋免費來翻譯呢?
12:08
Now when we were starting初時 to think about this, we were blocked封鎖 by these two things.
307
713000
3000
當我哋開始諗哩嘀問題嘅時候,我哋就被哩兩個困難限制住咗。
12:11
But then we realized實現, there's actually講真 a way
308
716000
2000
但後來我哋發現系有辦法
12:13
to solve解決 both these problems個問題 with the same相同 solution解決方案.
309
718000
2000
用同一個解決方案同時解決裡嘀問題。
12:15
There's a way to kill two birds with one stone石頭.
310
720000
2000
有一個一石二鳥嘅辦法。
12:17
And that is to transform變換 language語言 translation翻譯
311
722000
3000
就系將語言翻譯轉變為
12:20
into something that millions数百万 of people want to do,
312
725000
3000
大家都想做嘅事,
12:23
and that also helps幫手 with the problem個問題 of lack缺乏 of bilinguals雙語,
313
728000
3000
同時用語言教育
12:26
and that is language語言 education教育.
314
731000
3000
幫助果嘀雙語能力不足嘅人
12:29
So it turns輪流 out that today今日,
315
734000
2000
實際上,現在
12:31
there are over 1.2 billion people learning學習 a foreign外國 language語言.
316
736000
3000
有超過12億人在學習外語。
12:34
People really, really want to learn學習 a foreign外國 language語言.
317
739000
2000
大家都十分、十分想學習外語,
12:36
And it's not just because they're being forced to do so in school學校.
318
741000
3000
而且唔系因為系學校被逼甘做。
12:39
For example例子, in the United聯合 States國家 alone一手一腳,
319
744000
2000
比如,單單系美國
12:41
there are over five million people who have paid支付 over $500
320
746000
2000
就有超過500萬人在軟件上花費咗超過500美元
12:43
for software軟件 to learn學習 a new新增功能 language語言.
321
748000
2000
用於學習一種新語言。
12:45
So people really, really want to learn學習 a new新增功能 language語言.
322
750000
2000
所以,大家真系好想學新語言。
12:47
So what we've我哋都 been working工作 on for the last year and a half一半 is a new新增功能 website網站 --
323
752000
3000
我哋在過去一年半嘅時間裡做嘅系一個新網站,
12:50
it's called DuolingoDuolingo --
324
755000
2000
叫做Duolingo
12:52
where the basic基本 idea想法 is people learn學習 a new新增功能 language語言 for free自由
325
757000
3000
Duolingo嘅基本理念系人們可以免費學習一種新語言,
12:55
while simultaneously同時 translating正在翻譯 the WebWeb.
326
760000
2000
同時義務翻譯網頁。
12:57
And so basically基本上 they're learning學習 by doing.
327
762000
2000
簡單來講,佢哋系度通過實踐來學習。
12:59
So the way this works工程
328
764000
2000
Duolingo運作嘅方式系
13:01
is whenever每當 you're just a beginner初學者, we give you very, very simple簡單 sentences句子.
329
766000
3000
如果你系初學者,我哋會俾你非常非常簡單嘅句子。
13:04
There's, of course課程, a lot of very simple簡單 sentences句子 on the WebWeb.
330
769000
2000
當然網上有許多簡單嘅句子。
13:06
We give you very, very simple簡單 sentences句子
331
771000
2000
我哋俾你非常非常簡單嘅句子
13:08
along沿 with what each每個 word means意味着.
332
773000
2000
同每個單詞嘅意思。
13:10
And as you translate翻譯 them, and as you see how other people translate翻譯 them,
333
775000
3000
隨著你翻譯佢哋,加上看其他人點樣翻譯佢哋,
13:13
you start初時 learning學習 the language語言.
334
778000
2000
你就開始學習果種語言了。
13:15
And as you get more and more advanced先進,
335
780000
2000
當你變得越來越進階嘅時候,
13:17
we give you more and more complex複雜 sentences句子 to translate翻譯.
336
782000
2000
我哋就會俾你更多複雜句子來翻譯。
13:19
But at all times, you're learning學習 by doing.
337
784000
2000
無論何時,你都系通過練習來學習。
13:21
Now the crazy thing about this method方法
338
786000
2000
哩種方法嘅瘋狂在於
13:23
is that it actually講真 really works工程.
339
788000
2000
佢真嘅會成功。
13:25
First of all, people are really, really learning學習 a language語言.
340
790000
2000
首先,人們真嘅學緊語言。
13:27
We're mostly主要 done building建築 it, and now we're testing測試 it.
341
792000
2000
我哋已經建好網站,現在正在測試。
13:29
People really can learn學習 a language語言 with it.
342
794000
2000
人們真嘅可以用佢來學習語言,
13:31
And they learn學習 it about as well as the leading領先 language語言 learning學習 software軟件.
343
796000
3000
也可以學得與使用其他領先嘅語言學習軟件一樣好
13:34
So people really do learn學習 a language語言.
344
799000
2000
人們真系可以學一門語言。
13:36
And not only do they learn學習 it as well,
345
801000
2000
不單止學得一樣好,
13:38
but actually講真 it's way more interesting有趣.
346
803000
2000
佢實際上亦更加有趣。
13:40
Because you see with DuolingoDuolingo, people are actually講真 learning學習 with real真正 content內容.
347
805000
3000
因為在Duolingo,人們使用真正嘅內容來學習,
13:43
As opposed反對 to learning學習 with made-up組成 sentences句子,
348
808000
2000
相對於用編造嘅句子,
13:45
people are learning with real content, which is inherently interesting.
349
810000
3000
人們學緊真嘅內容,從本質上就有趣許多。
13:48
So people really do learn a language.
350
813000
2000
人們真嘅學習一種語言。
13:50
But perhaps more surprisingly,
351
815000
2000
可能更令人驚奇嘅系,
13:52
the translations that we get from people using the site,
352
817000
3000
從用緊哩個網站嘅人——即使佢哋只系初學者,
13:55
even though they're just beginners,
353
820000
2000
得到嘅翻譯
13:57
the translations that we get are as accurate as those of professional language translators,
354
822000
3000
與果嘀專業嘅翻譯師竟然一樣精確。
14:00
which is very surprising.
355
825000
2000
與果嘀專業嘅翻譯師竟然一樣精確。
14:02
So let me show you one example.
356
827000
2000
讓我為你展示一個例子。
14:04
This is a sentence that was translated from German into English.
357
829000
2000
哩個系一個從德文翻譯成英文嘅句子。
14:06
The top is the German.
358
831000
2000
上面系德文。
14:08
The middle is an English translation
359
833000
2000
中間系由專業譯者做嘅英文翻譯
14:10
that was done by somebody who was a professional English translator
360
835000
2000
中間系由專業譯者做嘅英文翻譯
14:12
who we paid 20 cents a word for this translation.
361
837000
2000
我哋為哩個翻譯嘅每個字使咗20美分。
14:14
And the bottom is a translation by users of Duolingo,
362
839000
3000
最下面嘅翻譯系由之前完全唔識德文嘅
14:17
none of whom knew any German
363
842000
2000
我哋嘅網站嘅用戶做嘅翻譯。
14:19
before they started using the site.
364
844000
2000
我哋嘅網站嘅用戶做嘅翻譯。
14:21
You can see, it's pretty much perfect.
365
846000
2000
你哋可以見到,哩個翻譯系幾完美嘅。
14:23
Now of course, we play a trick here
366
848000
2000
當然,我哋使咗一嘀手段
14:25
to make the translations as good as professional language translators.
367
850000
2000
來使哩個翻譯與專業翻譯一樣好。
14:27
We combine the translations of multiple beginners
368
852000
3000
我哋將多個用戶嘅翻譯綜合起來,
14:30
to get the quality of a single professional translator.
369
855000
3000
來得到與一個專業譯者同樣嘅質量。
14:33
Now even though we're combining the translations,
370
858000
5000
而即使我哋綜合多個翻譯者,
14:38
the site actually can translate pretty fast.
371
863000
2000
哩個網站嘅翻譯其實幾快。
14:40
So let me show you,
372
865000
2000
等我展示一下
14:42
this is our estimates of how fast we could translate Wikipedia
373
867000
2000
哩個系我哋估算嘅我哋可以幾快甘將維基百科
14:44
from English into Spanish.
374
869000
2000
從英文翻譯為西班牙文。
14:46
Remember, this is 50 million dollars-worth of value.
375
871000
3000
留意,哩度有5000萬嘅價值。
14:49
So if we wanted to translate Wikipedia into Spanish,
376
874000
2000
如果我哋想將維基百科翻譯為西班牙文
14:51
we could do it in five weeks with 100,000 active users.
377
876000
3000
用10萬活躍用戶我哋可以系5週內完成。
14:54
And we could do it in about 80 hours with a million active users.
378
879000
3000
用100萬活躍用戶我哋可以系80個鐘之內完成。
14:57
Since all the projects that my group has worked on so far have gotten millions of users,
379
882000
3000
既然我嘅團隊接觸嘅所有項目都達到百萬用戶,
15:00
we're hopeful that we'll be able to translate
380
885000
2000
我哋希望可以用哩個項目
15:02
extremely fast with this project.
381
887000
2000
極快甘翻譯。
15:04
Now the thing that I'm most excited about with Duolingo
382
889000
3000
我對Duolingo最為興奮嘅系
15:07
is I think this provides a fair business model for language education.
383
892000
3000
我認為佢為語言教育提供咗一種公平交易嘅商業模式。
15:10
So here's the thing:
384
895000
2000
系甘樣嘅:
15:12
The current business model for language education
385
897000
2000
目前嘅語言教育模式系
15:14
is the student pays,
386
899000
2000
學生付費,
15:16
and in particular, the student pays Rosetta Stone 500 dollars.
387
901000
2000
特別一提嘅系,學生向Rosetta Stone付500美元。
15:18
(Laughter)
388
903000
2000
(笑聲)
15:20
That's the current business model.
389
905000
2000
這就係目前嘅商業模式。
15:22
The problem with this business model
390
907000
2000
哩種商業模式嘅問題系
15:24
is that 95 percent of the world's population doesn't have 500 dollars.
391
909000
3000
95%嘅人口並冇500美元
15:27
So it's extremely unfair towards the poor.
392
912000
3000
所以佢對貧窮人口系極唔公平嘅。
15:30
This is totally biased towards the rich.
393
915000
2000
佢完全偏向富裕人口。
15:32
Now see, in Duolingo,
394
917000
2000
現在,因為你學緊嘅時候
15:34
because while you learn
395
919000
2000
現在,因為你學緊嘅時候
15:36
you're actually creating value, you're translating stuff --
396
921000
3000
你實際上創造緊價值——
15:39
which for example, we could charge somebody for translations.
397
924000
3000
你翻譯緊果嘀本需要使錢翻譯嘅嘢
15:42
So this is how we could monetize this.
398
927000
2000
這就係我哋貨幣化學習嘅方法
15:44
Since people are creating value while they're learning,
399
929000
2000
既然佢哋學習嘅時候創造緊價值
15:46
they don't have to pay their money, they pay with their time.
400
931000
3000
佢哋無需付出金錢,而付出佢哋嘅時間。
15:49
But the magical thing here is that they're paying with their time,
401
934000
3000
但神奇嘅在於哩嘀你付出嘅時間
15:52
but that is time that would have had to have been spent anyways
402
937000
2000
本事就系要來學語言嘅
15:54
learning the language.
403
939000
2000
本事就系要來學語言嘅
15:56
So the nice thing about Duolingo is I think it provides a fair business model --
404
941000
3000
所以我講Duolingo做嘅一件好事就系提供咗一個公平嘅商業模式——
15:59
one that doesn't discriminate against poor people.
405
944000
2000
一個唔歧視貧窮嘅模式。
16:01
So here's the site. Thank you.
406
946000
2000
這就是哩個網站。多謝
16:03
(Applause)
407
948000
8000
(掌聲)
16:11
So here's the site.
408
956000
2000
哩個就系Duolingo嘅網站。
16:13
We haven't yet launched,
409
958000
2000
我們縱未上線,
16:15
but if you go there, you can sign up to be part of our private beta,
410
960000
3000
但系如果你去哩個網站,你可以註冊成為非公開測試版本嘅一員。
16:18
which is probably going to start in about three or four weeks.
411
963000
2000
測試版本可能在未來三四個星期就會開始。
16:20
We haven't yet launched this Duolingo.
412
965000
2000
Duolingo縱未正式上線。
16:22
By the way, I'm the one talking here,
413
967000
2000
順便講一句,雖然系我在這裡做這個演講,
16:24
but actually Duolingo is the work of a really awesome team, some of whom are here.
414
969000
3000
但實際上Duolingo是一個出色團隊的產品,哩個團隊中嘅一些人今日都系哩度。
16:27
So thank you.
415
972000
2000
多謝!
16:29
(Applause)
416
974000
4000
(掌聲)
Translated by Bruce Ding
Reviewed by Yuping Huang

Data provided by TED.
