TEDxCERN

Sinan Aral: How we can protect truth in the age of misinformation

Fake news can sway elections, tank economies and sow discord in everyday life. Data scientist Sinan Aral demystifies how and why it spreads so quickly -- citing one of the largest studies on misinformation -- and identifies five strategies to help us unweave the tangled web between true and false.

00:13
So, on April 23 of 2013, the Associated Press put out the following tweet on Twitter. It said, "Breaking news: Two explosions at the White House and Barack Obama has been injured." This tweet was retweeted 4,000 times in less than five minutes, and it went viral thereafter.

00:40
Now, this tweet wasn't real news put out by the Associated Press. In fact, it was false news, or fake news, that was propagated by Syrian hackers who had infiltrated the Associated Press Twitter handle. Their purpose was to disrupt society, but they disrupted much more, because automated trading algorithms immediately seized on the sentiment of this tweet and began trading based on the potential that the president of the United States had been injured or killed in this explosion. And as they started tweeting, they immediately sent the stock market crashing, wiping out 140 billion dollars in equity value in a single day.
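
To make that mechanism concrete, here is a purely illustrative sketch of how a naive sentiment-triggered trading rule might react to a headline like that one. The keyword list, the threshold, and the `place_order` stub are assumptions for illustration only; this does not describe any real trading system.

```python
# Illustrative only: a toy sentiment-triggered trading rule.
# The keywords, scoring, and order stub are assumptions, not a real system.

NEGATIVE_KEYWORDS = {"explosion", "explosions", "injured", "attack", "killed"}

def headline_sentiment(text: str) -> int:
    """Crude sentiment score: count negative keywords in the headline."""
    words = {w.strip('.,!?"').lower() for w in text.split()}
    return -sum(1 for w in words if w in NEGATIVE_KEYWORDS)

def place_order(side: str, symbol: str) -> None:
    """Stub standing in for a broker API call."""
    print(f"{side} {symbol}")

def react_to_headline(text: str, threshold: int = -2) -> None:
    # Sell broad-market exposure when the headline looks sharply negative.
    if headline_sentiment(text) <= threshold:
        place_order("SELL", "SPY")

react_to_headline("Breaking news: Two explosions at the White House "
                  "and Barack Obama has been injured.")
```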

01:25
Robert Mueller, special counsel prosecutor in the United States, issued indictments against three Russian companies and 13 Russian individuals for a conspiracy to defraud the United States by meddling in the 2016 presidential election. And what this indictment tells is the story of the Internet Research Agency, the shadowy arm of the Kremlin on social media. During the presidential election alone, the Internet Research Agency's efforts reached 126 million people on Facebook in the United States, issued three million individual tweets and 43 hours' worth of YouTube content. All of it was fake -- misinformation designed to sow discord in the US presidential election.

02:20
A recent study by Oxford University showed that in the recent Swedish elections, one third of all of the information spreading on social media about the election was fake or misinformation. In addition, these types of social-media misinformation campaigns can spread what has been called "genocidal propaganda," for instance against the Rohingya in Burma, and trigger mob killings in India.

02:49
We studied fake news and began studying it before it was a popular term. And we recently published the largest-ever longitudinal study of the spread of fake news online on the cover of "Science" in March of this year. We studied all of the verified true and false news stories that ever spread on Twitter, from its inception in 2006 to 2017. And when we studied this information, we studied news stories that were verified by six independent fact-checking organizations. So we knew which stories were true and which stories were false. We could measure their diffusion, the speed of their diffusion, the depth and breadth of their diffusion, how many people became entangled in these information cascades, and so on.
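
To give a sense of what those diffusion measures look like in practice, here is a minimal sketch of computing cascade size, depth, and breadth from a single retweet tree. The edge-list representation and field names are assumptions for illustration; they are not the actual data structures used in the study.

```python
# Minimal sketch: diffusion metrics for a single retweet cascade.
# The edge-list representation is an assumption for illustration.
from collections import defaultdict, deque

def cascade_metrics(edges, root):
    """edges: list of (parent_user, child_user) retweet pairs."""
    children = defaultdict(list)
    for parent, child in edges:
        children[parent].append(child)

    size = 1                     # number of users reached (including the root)
    depth = 0                    # longest retweet chain from the root
    breadth = defaultdict(int)   # users reached at each depth level
    breadth[0] = 1

    queue = deque([(root, 0)])
    while queue:
        node, level = queue.popleft()
        for child in children[node]:
            size += 1
            depth = max(depth, level + 1)
            breadth[level + 1] += 1
            queue.append((child, level + 1))

    return {"size": size, "depth": depth, "max_breadth": max(breadth.values())}

# Example: user A tweets, B and C retweet A, D retweets B.
print(cascade_metrics([("A", "B"), ("A", "C"), ("B", "D")], root="A"))
# {'size': 4, 'depth': 2, 'max_breadth': 2}
```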

03:40
And what we did in this paper was we compared the spread of true news to the spread of false news. And here's what we found. We found that false news diffused further, faster, deeper and more broadly than the truth in every category of information that we studied, sometimes by an order of magnitude. And in fact, false political news was the most viral. It diffused further, faster, deeper and more broadly than any other type of false news.

04:09
When we saw this, we were at once worried but also curious. Why? Why does false news travel so much further, faster, deeper and more broadly than the truth?

04:20
The first hypothesis that we came up with was, "Well, maybe people who spread false news have more followers or follow more people, or tweet more often, or maybe they're more often 'verified' users of Twitter, with more credibility, or maybe they've been on Twitter longer." So we checked each one of these in turn. And what we found was exactly the opposite. False-news spreaders had fewer followers, followed fewer people, were less active, less often "verified" and had been on Twitter for a shorter period of time. And yet, false news was 70 percent more likely to be retweeted than the truth, controlling for all of these and many other factors.
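
The phrase "controlling for these factors" points to a standard regression setup. Below is a minimal sketch, assuming scikit-learn, pandas, and made-up column names and synthetic data, of estimating how falsity relates to retweet probability while adjusting for follower counts, activity, verification, and account age. It illustrates the general technique, not the exact model specification used in the study.

```python
# Sketch: does falsity predict retweeting, controlling for account covariates?
# Column names and data are synthetic assumptions, not the study's actual model.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
df = pd.DataFrame({
    "is_false": rng.integers(0, 2, n),            # 1 = false story
    "log_followers": rng.normal(6, 2, n),
    "log_followees": rng.normal(5, 2, n),
    "tweets_per_day": rng.exponential(5, n),
    "verified": rng.integers(0, 2, n),
    "account_age_years": rng.integers(30, 4000, n) / 365.0,
})
# Synthetic outcome: whether a given exposed user retweeted the story.
logit = -1.0 + 0.5 * df["is_false"] + 0.1 * df["log_followers"]
df["retweeted"] = rng.random(n) < 1 / (1 + np.exp(-logit))

X = df[["is_false", "log_followers", "log_followees",
        "tweets_per_day", "verified", "account_age_years"]]
model = LogisticRegression(max_iter=1000).fit(X, df["retweeted"])

# Odds ratio for falsity, adjusted for the other covariates.
falsity_or = np.exp(model.coef_[0][X.columns.get_loc("is_false")])
print(f"adjusted odds ratio for false vs. true: {falsity_or:.2f}")
```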

05:00
So we had to come up with other explanations. And we devised what we called a "novelty hypothesis." So if you read the literature, it is well known that human attention is drawn to novelty, things that are new in the environment. And if you read the sociology literature, you know that we like to share novel information. It makes us seem like we have access to inside information, and we gain in status by spreading this kind of information. So what we did was we measured the novelty of an incoming true or false tweet, compared to the corpus of what that individual had seen in the 60 days prior on Twitter.
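
One common way to operationalize that kind of novelty is to compare a tweet's word or topic distribution against the user's recent corpus, for example with a divergence or distance measure. The sketch below uses simple bag-of-words cosine distance; the preprocessing and distance choice are assumptions for illustration, and the study's exact method may differ.

```python
# Sketch: novelty of an incoming tweet relative to a user's recent corpus,
# measured as cosine distance between bag-of-words vectors. Illustrative only.
import math
from collections import Counter

def bag_of_words(text: str) -> Counter:
    return Counter(w.strip('.,!?"').lower() for w in text.split() if w)

def cosine_distance(a: Counter, b: Counter) -> float:
    common = set(a) & set(b)
    dot = sum(a[w] * b[w] for w in common)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)

def novelty(incoming_tweet: str, recent_tweets: list[str]) -> float:
    """1.0 = completely unlike anything the user saw in the prior window."""
    corpus = bag_of_words(" ".join(recent_tweets))
    return cosine_distance(bag_of_words(incoming_tweet), corpus)

seen = ["the game last night was great", "great goal in the game"]
print(novelty("breaking: explosions reported downtown", seen))   # close to 1.0
print(novelty("what a great game last night", seen))             # much lower
```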

05:43
But that wasn't enough, because we thought to ourselves, "Well, maybe false news is more novel in an information-theoretic sense, but maybe people don't perceive it as more novel." So to understand people's perceptions of false news, we looked at the information and the sentiment contained in the replies to true and false tweets. And what we found was that across a bunch of different measures of sentiment -- surprise, disgust, fear, sadness, anticipation, joy and trust -- false news exhibited significantly more surprise and disgust in the replies to false tweets, and true news exhibited significantly more anticipation, joy and trust in the replies to true tweets. The surprise corroborates our novelty hypothesis. This is new and surprising, and so we're more likely to share it.
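
Those emotion categories (surprise, disgust, fear, sadness, anticipation, joy, trust) are typically scored against an emotion lexicon. Here is a minimal sketch of scoring replies with a tiny hand-made lexicon; the word lists are placeholders for illustration, not the lexicon actually used in the study.

```python
# Sketch: counting emotion-lexicon hits in reply text. The tiny lexicon here
# is a placeholder; real analyses use full emotion lexicons.
from collections import Counter

EMOTION_LEXICON = {
    "surprise": {"wow", "unbelievable", "shocking", "what"},
    "disgust": {"gross", "disgusting", "awful"},
    "joy": {"great", "love", "wonderful"},
    "trust": {"reliable", "credible", "confirmed"},
}

def emotion_profile(replies: list[str]) -> Counter:
    """Count how many reply words fall into each emotion category."""
    counts = Counter()
    for reply in replies:
        for word in reply.lower().split():
            word = word.strip('.,!?"')
            for emotion, words in EMOTION_LEXICON.items():
                if word in words:
                    counts[emotion] += 1
    return counts

replies_to_false = ["Wow, unbelievable!", "This is shocking, what?!"]
replies_to_true = ["Great reporting, love it", "Confirmed by a credible source"]

print(emotion_profile(replies_to_false))  # dominated by "surprise"
print(emotion_profile(replies_to_true))   # dominated by "joy" and "trust"
```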

06:43
At the same time, there was congressional testimony in front of both houses of Congress in the United States, looking at the role of bots in the spread of misinformation. So we looked at this too -- we used multiple sophisticated bot-detection algorithms to find the bots in our data and to pull them out. So we pulled them out, we put them back in and we compared what happened to our measurements. And what we found was that, yes indeed, bots were accelerating the spread of false news online, but they were accelerating the spread of true news at approximately the same rate. Which means bots are not responsible for the differential diffusion of truth and falsity online. We can't abdicate that responsibility, because we, humans, are responsible for that spread.
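
Structurally, that robustness check amounts to recomputing the true-versus-false comparison with bot-attributed retweets removed and then added back. A minimal sketch, assuming a simple list-of-retweets representation and a precomputed set of bot account IDs produced by a separate detection step:

```python
# Sketch: compare false vs. true retweet volume with and without bot accounts.
# The data layout and the precomputed bot set are assumptions for illustration.

retweets = [
    # (user_id, story_veracity) for each retweet event
    ("u1", "false"), ("u2", "false"), ("bot7", "false"),
    ("u3", "true"), ("bot7", "true"), ("u4", "true"), ("u5", "false"),
]
bot_accounts = {"bot7"}  # output of a separate bot-detection step

def spread_by_veracity(events, exclude_bots=False):
    counts = {"true": 0, "false": 0}
    for user, veracity in events:
        if exclude_bots and user in bot_accounts:
            continue
        counts[veracity] += 1
    return counts

with_bots = spread_by_veracity(retweets)
without_bots = spread_by_veracity(retweets, exclude_bots=True)

print("with bots:   ", with_bots)     # {'true': 3, 'false': 4}
print("without bots:", without_bots)  # {'true': 2, 'false': 3}
# If the false/true gap persists after removing bots, bots alone
# cannot explain the differential diffusion.
```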

07:34
Now, everything that I have told you so far, unfortunately for all of us, is the good news. The reason is that it's about to get a whole lot worse. And two specific technologies are going to make it worse. We are going to see the rise of a tremendous wave of synthetic media: fake video and fake audio that are very convincing to the human eye and ear.

08:03
And this will be powered by two technologies. The first of these is known as "generative adversarial networks." This is a machine-learning model with two networks: a discriminator, whose job it is to determine whether something is true or false, and a generator, whose job it is to generate synthetic media. So the generator generates synthetic video or audio, and the discriminator tries to tell, "Is this real or is this fake?" And in fact, it is the job of the generator to maximize the likelihood that it will fool the discriminator into thinking the synthetic video and audio that it is creating is actually true. Imagine a machine in a hyperloop, trying to get better and better at fooling us.
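
For readers who want to see the generator-discriminator game concretely, here is a minimal GAN training loop on a toy one-dimensional distribution, assuming PyTorch is available. Real synthetic-media systems use far larger architectures; this sketch only shows the adversarial objective the talk describes.

```python
# Minimal GAN sketch (toy 1-D data), assuming PyTorch. Illustrative only:
# real deepfake models are far larger, but the adversarial game is the same.
import torch
import torch.nn as nn

real_data = lambda n: torch.randn(n, 1) * 0.5 + 3.0   # "real" samples ~ N(3, 0.5)
noise = lambda n: torch.randn(n, 8)                    # generator input

G = nn.Sequential(nn.Linear(8, 32), nn.ReLU(), nn.Linear(32, 1))
D = nn.Sequential(nn.Linear(1, 32), nn.ReLU(), nn.Linear(32, 1), nn.Sigmoid())

opt_g = torch.optim.Adam(G.parameters(), lr=1e-3)
opt_d = torch.optim.Adam(D.parameters(), lr=1e-3)
bce = nn.BCELoss()

for step in range(2000):
    # 1) Train the discriminator to separate real samples from fakes.
    real, fake = real_data(64), G(noise(64)).detach()
    d_loss = bce(D(real), torch.ones(64, 1)) + bce(D(fake), torch.zeros(64, 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()

    # 2) Train the generator to make the discriminator call its fakes "real".
    fake = G(noise(64))
    g_loss = bce(D(fake), torch.ones(64, 1))
    opt_g.zero_grad(); g_loss.backward(); opt_g.step()

print("mean of generated samples:", G(noise(1000)).mean().item())  # drifts toward 3.0
```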

08:51
This, combined with the second technology, which is essentially the democratization of artificial intelligence to the people, the ability for anyone, without any background in artificial intelligence or machine learning, to deploy these kinds of algorithms to generate synthetic media, makes it ultimately so much easier to create videos.

09:14
The White House issued a false, doctored video of a journalist interacting with an intern who was trying to take his microphone. They removed frames from this video in order to make his actions seem more punchy. And when videographers and stuntmen and women were interviewed about this type of technique, they said, "Yes, we use this in the movies all the time to make our punches and kicks look more choppy and more aggressive." They then put out this video and used it in part as justification to revoke the White House press pass of Jim Acosta, the reporter. And CNN had to sue to have that press pass reinstated.

10:00
There are about five different paths that I can think of that we can follow to try and address some of these very difficult problems today. Each one of them has promise, but each one of them has its own challenges.

10:15
The first one is labeling. Think about it this way: when you go to the grocery store to buy food to consume, it's extensively labeled. You know how many calories it has, how much fat it contains -- and yet when we consume information, we have no labels whatsoever. What is contained in this information? Is the source credible? Where is this information gathered from? We have none of that information when we are consuming information. That is a potential avenue, but it comes with its challenges. For instance, who gets to decide, in society, what's true and what's false? Is it the governments? Is it Facebook? Is it an independent consortium of fact-checkers? And who's checking the fact-checkers?
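
As a thought experiment, the "nutrition label for information" described above could be as simple as structured metadata attached to each item. The fields below are hypothetical, chosen only to mirror the questions in the talk (what's in it, is the source credible, where it was gathered from); they are not a proposal from the speaker or any platform.

```python
# Hypothetical "nutrition label" for a piece of online information.
# Field names are illustrative only, mirroring the questions in the talk.
from dataclasses import dataclass, field

@dataclass
class InformationLabel:
    source: str                      # who published it
    source_credibility: float        # e.g., a 0.0-1.0 score from fact-checkers
    provenance: list[str] = field(default_factory=list)   # where it was gathered from
    fact_checks: list[str] = field(default_factory=list)  # links to independent reviews
    is_political_ad: bool = False
    is_synthetic_media: bool = False

label = InformationLabel(
    source="example-news.org",
    source_credibility=0.35,
    provenance=["repost of an unattributed social media screenshot"],
    fact_checks=["https://factcheck.example/claim/123"],
)
print(label)
```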

11:02
Another potential avenue is incentives. We know that during the US presidential election there was a wave of misinformation that came from Macedonia that didn't have any political motive but instead had an economic motive. And this economic motive existed because false news travels so much farther, faster and more deeply than the truth, and you can earn advertising dollars as you garner eyeballs and attention with this type of information. But if we can depress the spread of this information, perhaps it would reduce the economic incentive to produce it in the first place.

11:40
Third, we can think about regulation, and certainly, we should think about this option. In the United States, currently, we are exploring what might happen if Facebook and others are regulated. While we should consider things like regulating political speech, labeling the fact that it's political speech, making sure foreign actors can't fund political speech, regulation also has its own dangers. For instance, Malaysia just instituted a six-year prison sentence for anyone found spreading misinformation. And in authoritarian regimes, these kinds of policies can be used to suppress minority opinions and to continue to extend repression.

12:24
The fourth possible option is transparency. We want to know how Facebook's algorithms work. How does the data combine with the algorithms to produce the outcomes that we see? We want them to open the kimono and show us exactly the inner workings of how Facebook works. And if we want to know social media's effect on society, we need scientists, researchers and others to have access to this kind of information. But at the same time, we are asking Facebook to lock everything down, to keep all of the data secure. So Facebook and the other social media platforms are facing what I call a transparency paradox. We are asking them to be open and transparent and, simultaneously, secure. This is a very difficult needle to thread, but they will need to thread this needle if we are to achieve the promise of social technologies while avoiding their peril.

13:24
The final thing that we could think about is algorithms and machine learning: technology devised to root out and understand fake news, how it spreads, and to try and dampen its flow.
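
As a concrete, deliberately simplified example of that kind of technology, here is a minimal sketch of a text classifier that scores articles as likely true or likely false, assuming scikit-learn and a small labeled training set. Production systems combine many more signals (propagation patterns, source history, network features); this only illustrates the basic machine-learning step.

```python
# Sketch: a bare-bones fake-news text classifier, assuming scikit-learn and
# a tiny labeled dataset. Illustrative only; real systems use far more signals.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

train_texts = [
    "Official report confirms modest economic growth last quarter",
    "City council approves new budget after public hearing",
    "SHOCKING: celebrity secretly replaced by clone, insiders say",
    "Miracle cure doctors don't want you to know about",
]
train_labels = ["true", "true", "false", "false"]  # from fact-checkers

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(train_texts, train_labels)

new_article = "You won't believe this one weird secret the government is hiding"
print(model.predict([new_article])[0])
print(model.predict_proba([new_article]))  # confidence over ["false", "true"]
```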

13:37
Humans have to be in the loop of this technology, because we can never escape that underlying any technological solution or approach is a fundamental ethical and philosophical question about how we define truth and falsity, to whom we give the power to define truth and falsity, which opinions are legitimate, which types of speech should be allowed and so on. Technology is not a solution for that. Ethics and philosophy are a solution for that.

14:10
Nearly every theory of human decision making, human cooperation and human coordination has some sense of the truth at its core. But with the rise of fake news, the rise of fake video, the rise of fake audio, we are teetering on the brink of the end of reality, where we cannot tell what is real from what is fake. And that's potentially incredibly dangerous.

14:38
We have to be vigilant in defending the truth against misinformation: with our technologies, with our policies and, perhaps most importantly, with our own individual responsibilities, decisions, behaviors and actions.

14:57
Thank you very much.

(Applause)

Translated by Ivana Korom
Reviewed by Krystian Aparta

Data provided by TED.