ABOUT THE SPEAKER
John Wilbanks - Data Commons Advocate
Imagine the discoveries that could result from a giant pool of freely available health and genomic data. John Wilbanks is working to build it.

Why you should listen

Performing a medical or genomic experiment on a human requires informed consent and careful boundaries around privacy. But what if the data that results, once scrubbed of identifying marks, was released into the wild? At WeConsent.us, John Wilbanks thinks through the ethical and procedural steps to create an open, massive, mine-able database of data about health and genomics from many sources. One step: the Portable Legal Consent for Common Genomics Research (PLC-CGR), an experimental bioethics protocol that would allow any test subject to say, "Yes, once this experiment is over, you can use my data, anonymously, to answer any other questions you can think of." Compiling piles of test results in one place, Wilbanks suggests, would turn genetic info into big data--giving researchers the potential to spot patterns that simply aren't viewable up close. 

A campaigner for the wide adoption of data sharing in science, Wilbanks is also a Senior Fellow with the Kauffman Foundation and a Research Fellow at Lybba, and is supported by Sage Bionetworks.

In February 2013, the US government responded to a We the People petition spearheaded by Wilbanks and signed by 65,000 people, and announced a plan to open up taxpayer-funded research data and make it available for free.

TEDGlobal 2012

John Wilbanks: Let's pool our medical data

581,818 views

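When you receive medical care or take part in a medical study, privacy matters, and strict laws limit what researchers can learn about patients. But what if anyone who wanted to test a hypothesis could use people's anonymized medical data? John Wilbanks questions whether the way we protect medical privacy is holding research back, and discusses whether open-sourcing medical information could bring a new wave of medical innovation.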

00:15
So I have bad news, I have good news,
00:19
and I have a task.
00:20
So the bad news is that we all get sick.
00:23
I get sick. You get sick.
00:26
And every one of us gets sick, and the question really is,
00:28
how sick do we get? Is it something that kills us?
00:31
Is it something that we survive?
00:32
Is it something that we can treat?
00:34
And we've gotten sick as long as we've been people.
00:38
And so we've always looked for reasons to explain why we get sick.
00:41
And for a long time, it was the gods, right?
00:43
The gods are angry with me, or the gods are testing me,
00:46
right? Or God, singular, more recently,
00:49
is punishing me or judging me.
00:51
And as long as we've looked for explanations,
00:54
we've wound up with something that gets closer and closer to science,
00:58
which is hypotheses as to why we get sick,
01:00
and as long as we've had hypotheses about why we get sick, we've tried to treat it as well.
01:05
So this is Avicenna. He wrote a book over a thousand years ago called "The Canon of Medicine,"
01:09
and the rules he laid out for testing medicines
01:11
are actually really similar to the rules we have today,
01:13
that the disease and the medicine must be the same strength,
01:16
the medicine needs to be pure, and in the end we need
01:18
to test it in people. And so if you put together these themes
01:22
of a narrative or a hypothesis in human testing,
01:26
right, you get some beautiful results,
01:29
even when we didn't have very good technologies.
01:30
This is a guy named Carlos Finlay. He had a hypothesis
01:33
that was way outside the box for his time, in the late 1800s.
01:36
He thought yellow fever was not transmitted by dirty clothing.
01:39
He thought it was transmitted by mosquitos.
01:41
And they laughed at him. For 20 years, they called this guy
01:44
"the mosquito man." But he ran an experiment in people,
01:47
right? He had this hypothesis, and he tested it in people.
01:50
So he got volunteers to go move to Cuba and live in tents
01:55
and be voluntarily infected with yellow fever.
01:58
So some of the people in some of the tents had dirty clothes
02:01
and some of the people were in tents that were full
02:02
of mosquitos that had been exposed to yellow fever.
02:05
And it definitively proved that it wasn't this magic dust
02:08
called fomites in your clothes that caused yellow fever.
02:11
But it wasn't until we tested it in people that we actually knew.
02:15
And this is what those people signed up for.
02:17
This is what it looked like to have yellow fever in Cuba
02:20
at that time. You suffered in a tent, in the heat, alone,
02:24
and you probably died.
02:27
But people volunteered for this.
02:30
And it's not just a cool example of a scientific design
02:34
of experiment in theory. They also did this beautiful thing.
02:36
They signed this document, and it's called an informed consent document.
02:40
And informed consent is an idea that we should be
02:43
very proud of as a society, right? It's something that
02:45
separates us from the Nazis at Nuremberg,
02:48
enforced medical experimentation. It's the idea
02:51
that agreement to join a study without understanding isn't agreement.
02:55
It's something that protects us from harm, from hucksters,
02:59
from people that would try to hoodwink us into a clinical
03:01
study that we don't understand, or that we don't agree to.
03:05
And so you put together the thread of narrative hypothesis,
03:10
experimentation in humans, and informed consent,
03:12
and you get what we call clinical study, and it's how we do
03:15
the vast majority of medical work. It doesn't really matter
03:18
if you're in the north, the south, the east, the west.
03:20
Clinical studies form the basis of how we investigate,
03:24
so if we're going to look at a new drug, right,
03:26
we test it in people, we draw blood, we do experiments,
03:29
and we gain consent for that study, to make sure
03:31
that we're not screwing people over as part of it.
03:34
But the world is changing around the clinical study,
03:38
which has been fairly well established for tens of years
03:41
if not 50 to 100 years.
03:43
So now we're able to gather data about our genomes,
03:46
but, as we saw earlier, our genomes aren't dispositive.
03:49
We're able to gather information about our environment.
03:52
And more importantly, we're able to gather information
03:54
about our choices, because it turns out that what we think of
03:57
as our health is more like the interaction of our bodies,
03:59
our genomes, our choices and our environment.
04:03
And the clinical methods that we've got aren't very good
04:06
at studying that because they are based on the idea
04:08
of person-to-person interaction. You interact
04:10
with your doctor and you get enrolled in the study.
04:12
So this is my grandfather. I actually never met him,
04:15
but he's holding my mom, and his genes are in me, right?
04:19
His choices ran through to me. He was a smoker,
04:22
like most people were. This is my son.
04:24
So my grandfather's genes go all the way through to him,
04:28
and my choices are going to affect his health.
04:30
The technology between these two pictures
04:33
cannot be more different, but the methodology
04:37
for clinical studies has not radically changed over that time period.
04:41
We just have better statistics.
04:43
The way we gain informed consent was formed in large part
04:47
after World War II, around the time that picture was taken.
04:49
That was 70 years ago, and the way we gain informed consent,
04:53
this tool that was created to protect us from harm,
04:56
now creates silos. So the data that we collect
05:00
for prostate cancer or for Alzheimer's trials
05:03
goes into silos where it can only be used
05:05
for prostate cancer or for Alzheimer's research.
05:08
Right? It can't be networked. It can't be integrated.
05:11
It cannot be used by people who aren't credentialed.
05:15
So a physicist can't get access to it without filing paperwork.
05:18
A computer scientist can't get access to it without filing paperwork.
05:21
Computer scientists aren't patient. They don't file paperwork.
05:25
And this is an accident. These are tools that we created
05:29
to protect us from harm, but what they're doing
05:32
is protecting us from innovation now.
05:35
And that wasn't the goal. It wasn't the point. Right?
05:38
It's a side effect, if you will, of a power we created
05:41
to take us for good.
05:43
And so if you think about it, the depressing thing is that
05:46
Facebook would never make a change to something
05:48
as important as an advertising algorithm
05:51
with a sample size as small as a Phase III clinical trial.
05:55
We cannot take the information from past trials
05:59
and put them together to form statistically significant samples.
06:03
And that sucks, right? So 45 percent of men develop
06:07
cancer. Thirty-eight percent of women develop cancer.
06:10
One in four men dies of cancer.
06:12
One in five women dies of cancer, at least in the United States.
06:16
And three out of the four drugs we give you
06:18
if you get cancer fail. And this is personal to me.
06:21
My sister is a cancer survivor.
06:23
My mother-in-law is a cancer survivor. Cancer sucks.
06:27
And when you have it, you don't have a lot of privacy
06:29
in the hospital. You're naked the vast majority of the time.
06:33
People you don't know come in and look at you and poke you and prod you,
06:36
and when I tell cancer survivors that this tool we created
06:40
to protect them is actually preventing their data from being used,
06:43
especially when only three to four percent of people
06:45
who have cancer ever even sign up for a clinical study,
06:48
their reaction is not, "Thank you, God, for protecting my privacy."
06:51
It's outrage
06:54
that we have this information and we can't use it.
06:56
And it's an accident.
06:59
So the cost in blood and treasure of this is enormous.
07:02
Two hundred and twenty-six billion a year is spent on cancer in the United States.
07:05
Fifteen hundred people a day die in the United States.
07:08
And it's getting worse.
07:11
So the good news is that some things have changed,
07:14
and the most important thing that's changed
07:16
is that we can now measure ourselves in ways
07:18
that used to be the dominion of the health system.
07:21
So a lot of people talk about it as digital exhaust.
07:23
I like to think of it as the dust that runs along behind my kid.
07:26
We can reach back and grab that dust,
07:29
and we can learn a lot about health from it, so if our choices
07:31
are part of our health, what we eat is a really important
07:34
aspect of our health. So you can do something very simple
07:36
and basic and take a picture of your food,
07:38
and if enough people do that, we can learn a lot about
07:41
how our food affects our health.
07:43
One interesting thing that came out of this -- this is an app for iPhones called The Eatery --
07:47
is that we think our pizza is significantly healthier
07:50
than other people's pizza is. Okay? (Laughter)
07:53
And it seems like a trivial result, but this is the sort of research
07:57
that used to take the health system years
07:59
and hundreds of thousands of dollars to accomplish.
08:01
It was done in five months by a startup company of a couple of people.
08:05
I don't have any financial interest in it.
08:08
But more nontrivially, we can get our genotypes done,
08:10
and although our genotypes aren't dispositive, they give us clues.
08:13
So I could show you mine. It's just A's, T's, C's and G's.
08:16
This is the interpretation of it. As you can see,
08:18
I carry a 32 percent risk of prostate cancer,
08:21
22 percent risk of psoriasis and a 14 percent risk of Alzheimer's disease.
08:25
So that means, if you're a geneticist, you're freaking out,
08:28
going, "Oh my God, you told everyone you carry the ApoE E4 allele. What's wrong with you?"
08:32
Right? When I got these results, I started talking to doctors,
08:35
and they told me not to tell anyone, and my reaction is,
08:38
"Is that going to help anyone cure me when I get the disease?"
08:41
And no one could tell me yes.
08:44
And I live in a web world where, when you share things,
08:47
beautiful stuff happens, not bad stuff.
08:50
So I started putting this in my slide decks,
08:51
and I got even more obnoxious, and I went to my doctor,
08:54
and I said, "I'd like to actually get my bloodwork.
08:56
Please give me back my data." So this is my most recent bloodwork.
08:59
As you can see, I have high cholesterol.
09:01
I have particularly high bad cholesterol, and I have some
09:04
bad liver numbers, but those are because we had a dinner party with a lot of good wine
09:07
the night before we ran the test. (Laughter)
09:10
Right. But look at how non-computable this information is.
09:14
This is like the photograph of my granddad holding my mom
09:17
from a data perspective, and I had to go into the system
09:21
and get it out.
09:23
So the thing that I'm proposing we do here
09:26
is that we reach behind us and we grab the dust,
09:28
that we reach into our bodies and we grab the genotype,
09:31
and we reach into the medical system and we grab our records,
09:34
and we use it to build something together, which is a commons.
09:38
And there's been a lot of talk about commonses, right,
09:41
here, there, everywhere, right. A commons is nothing more
09:44
than a public good that we build out of private goods.
09:47
We do it voluntarily, and we do it through standardized
09:49
legal tools. We do it through standardized technologies.
09:52
Right. That's all a commons is. It's something that we build
09:55
together because we think it's important.
09:58
And a commons of data is something that's really unique,
10:01
because we make it from our own data. And although
10:03
a lot of people like privacy as their methodology of control
10:06
around data, and obsess around privacy, at least
10:08
some of us really like to share as a form of control,
10:11
and what's remarkable about digital commonses
10:13
is you don't need a big percentage if your sample size is big enough
10:17
to generate something massive and beautiful.
10:19
So not that many programmers write free software,
10:22
but we have the Apache web server.
10:24
Not that many people who read Wikipedia edit,
10:27
but it works. So as long as some people like to share
10:31
as their form of control, we can build a commons, as long as we can get the information out.
10:35
And in biology, the numbers are even better.
10:37
So Vanderbilt ran a study asking people, we'd like to take
10:40
your biosamples, your blood, and share them in a biobank,
10:43
and only five percent of the people opted out.
10:45
I'm from Tennessee. It's not the most science-positive state
10:48
in the United States of America. (Laughter)
10:51
But only five percent of the people wanted out.
10:54
So people like to share, if you give them the opportunity and the choice.
10:58
And the reason that I got obsessed with this, besides the obvious family aspects,
11:02
is that I spend a lot of time around mathematicians,
11:06
and mathematicians are drawn to places where there's a lot of data
11:09
because they can use it to tease signals out of noise.
11:11
And those correlations that they can tease out, they're not
11:14
necessarily causal agents, but math, in this day and age,
11:18
is like a giant set of power tools
11:21
that we're leaving on the floor, not plugged in in health,
11:25
while we use hand saws.
11:27
If we have a lot of shared genotypes, and a lot of shared
11:31
outcomes, and a lot of shared lifestyle choices,
11:34
and a lot of shared environmental information, we can start
11:37
to tease out the correlations between subtle variations
11:40
in people, the choices they make and the health that they create as a result of those choices,
11:45
and there's open-source infrastructure to do all of this.
11:48
Sage Bionetworks is a nonprofit that's built a giant math system
11:51
that's waiting for data, but there isn't any.
11:55
So that's what I do. I've actually started what we think is
11:59
the world's first fully digital, fully self-contributed,
12:03
unlimited in scope, global in participation, ethically approved
12:08
clinical research study where you contribute the data.
12:12
So if you reach behind yourself and you grab the dust,
12:14
if you reach into your body and grab your genome,
12:17
if you reach into the medical system and somehow extract your medical record,
12:20
you can actually go through an online informed consent process --
12:23
because the donation to the commons must be voluntary
12:26
and it must be informed -- and you can actually upload
12:28
your information and have it syndicated to the
12:31
mathematicians who will do this sort of big data research,
12:34
and the goal is to get 100,000 in the first year
12:37
and a million in the first five years so that we have
12:39
a statistically significant cohort that you can use to take
12:43
smaller sample sizes from traditional research
12:46
and map it against,
12:47
so that you can use it to tease out those subtle correlations
12:50
between the variations that make us unique
12:53
and the kinds of health that we need to move forward as a society.
12:57
And I've spent a lot of time around other commons.
13:00
I've been around the early web. I've been around
13:02
the early creative commons world, and there's four things
13:05
that all of these share, which is, they're all really simple.
13:08
And so if you were to go to the website and enroll in this study,
13:11
you're not going to see something complicated.
13:13
But it's not simplistic. These things are weak intentionally,
13:18
right, because you can always add power and control to a system,
13:21
but it's very difficult to remove those things if you put them in at the beginning,
13:25
and so being simple doesn't mean being simplistic,
13:28
and being weak doesn't mean weakness.
13:30
Those are strengths in the system.
13:32
And open doesn't mean that there's no money.
13:35
Closed systems, corporations, make a lot of money
13:38
on the open web, and they're one of the reasons why the open web lives
13:42
is that corporations have a vested interest in the openness
13:44
of the system.
13:47
And so all of these things are part of the clinical study that we've created,
13:51
so you can actually come in, all you have to be is 14 years old,
13:54
willing to sign a contract that says I'm not going to be a jerk,
13:56
basically, and you're in.
13:59
You can start analyzing the data.
14:00
You do have to solve a CAPTCHA as well. (Laughter)
14:04
And if you'd like to build corporate structures on top of it,
14:08
that's okay too. That's all in the consent,
14:11
so if you don't like those terms, you don't come in.
14:14
It's very much the design principles of a commons
14:17
that we're trying to bring to health data.
14:19
And the other thing about these systems is that it only takes
14:22
a small number of really unreasonable people working together
14:26
to create them. It didn't take that many people
14:29
to make Wikipedia Wikipedia, or to keep it Wikipedia.
14:32
And we're not supposed to be unreasonable in health,
14:34
and so I hate this word "patient."
14:37
I don't like being patient when systems are broken,
14:40
and health care is broken.
14:42
I'm not talking about the politics of health care, I'm talking about the way we scientifically approach health care.
14:46
So I don't want to be patient. And the task I'm giving to you
14:50
is to not be patient. So I'd like you to actually try,
14:53
when you go home, to get your data.
14:56
You'll be shocked and offended and, I would bet, outraged,
14:58
at how hard it is to get it.
15:01
But it's a challenge that I hope you'll take,
15:04
and maybe you'll share it. Maybe you won't.
15:06
If you don't have anyone in your family who's sick,
15:08
maybe you wouldn't be unreasonable. But if you do,
15:11
or if you've been sick, then maybe you would.
15:13
And we're going to be able to do an experiment in the next several months
15:16
that lets us know exactly how many unreasonable people are out there.
15:19
So this is the Athena Breast Health Network. It's a study
15:21
of 150,000 women in California, and they're going to
15:25
return all the data to the participants of the study
15:28
in a computable form, with one-clickability to load it into
15:31
the study that I've put together. So we'll know exactly
15:33
how many people are willing to be unreasonable.
15:36
So what I'd end [with] is,
15:38
the most beautiful thing I've learned since I quit my job
15:41
almost a year ago to do this, is that it really doesn't take
15:45
very many of us to achieve spectacular results.
15:49
You just have to be willing to be unreasonable,
15:51
and the risk we're running is not the risk those 14 men
15:54
who got yellow fever ran. Right?
15:56
It's to be naked, digitally, in public. So you know more
15:58
about me and my health than I know about you. It's asymmetric now.
16:02
And being naked and alone can be terrifying.
16:06
But to be naked in a group, voluntarily, can be quite beautiful.
16:10
And so it doesn't take all of us.
16:12
It just takes all of some of us. Thank you.
16:15
(Applause)
Translated by Huiqing CHE
Reviewed by Psycho Decoder
