[LIVEWORX THEME]

ANDIA WINSLOW: Welcome back to LiveWorx 2020. My name is Andia Winslow and I'm your host for today's livestream event. Already we've been hearing from some of the top technology thought leaders from the industrial world. Now, if you weren't able to attend those earlier sessions, be sure to check them out on the LiveWorx on-demand catalog, which is available to you from today through June 19th. This online library has over 100 additional sessions of great content. Also, keep an eye out for an upcoming email to gain access to our content archive for an entire year.

Now, let's kick off our next session: artificial emotional intelligence, or emotion AI; what it is and why it matters. Dr. Rana el Kaliouby is the CEO and co-founder of Affectiva. An MIT Media Lab spinoff, Affectiva created and defined the emotion AI category.
Software that can detect nuanced human emotions and complex cognitive states from the face and voice. Dr. el Kaliouby is a pioneer in emotion AI and the author of Girl Decoded: A Scientist's Quest to Reclaim Our Humanity by Bringing Emotional Intelligence to Technology. I am super excited to welcome Dr. Rana el Kaliouby.

RANA EL KALIOUBY: Thank you, Andia. It's a pleasure to be with you all today. I'm on a mission to humanize technology before it dehumanizes us, by building emotional intelligence and empathy into our devices and our technologies. And in doing so, my goal is to re-imagine human-computer interfaces as well as human-to-human connection.

We've all been catapulted into this universe where we're working virtually. That's how we're connecting with our team members. We're learning online (that's how my kids are learning). And we're connecting with friends and families virtually.
However, something is missing. It's not quite the same. All of our nonverbal signals are missing from these virtual environments.

AI is taking on roles that were traditionally done by humans, such as assisting with driving our cars, assisting with our health care, helping us be more productive, and perhaps even hiring your next co-worker. The problem is, we need a new social contract between humans and AI, one that is based on mutual and reciprocal trust. Of course, we need to trust in the AI, and there's a lot of conversation around that. But more importantly, AI needs to trust in us humans. After all, we don't always have a perfect track record of doing the right thing.

Unfortunately, there are already numerous examples where this trust has gone wrong.
A Twitter chatbot that turned racist overnight, self-driving cars getting into fatal accidents, and facial recognition technology that discriminates against certain populations, especially women of color.

To rebuild this trust, let's look at how humans do it. Every day we make thousands of decisions that involve trusting each other, both in our personal and our professional relationships. Sometimes this trust is based on legalese and terms and conditions. But more often, it's based on the implicit, nonverbal, subtle cues that we exchange with one another, with empathy at the core of building this trust.

Technology today has a lot of IQ (a lot of cognitive intelligence) but no EQ, no emotional intelligence. It's the missing component. So my entire career, I've been asking: what if technology could identify human emotions just as we can?
What if your computer could tell the difference between a smile and a smirk? Both involve the lower half of the face, but they have very different meanings. So how can we build AI that understands humans? The way to do this is to look at how humans do it.

Only 7% of how we communicate our mental states is based on the actual choice of words we use. 93% is nonverbal: 55% is split between our facial expressions and our gestures (I do a lot of these), and 38% is vocal intonation; how fast are you speaking, how much energy is in your voice?

A lot of my career has been focused on the face. It's a very powerful canvas for communicating human emotion. The science of facial emotions has existed for over 200 years. This guy, Duchenne, used to electrically stimulate our facial muscles to map how these facial muscles move. We don't do that anymore, thankfully.
And then in the late 1970s, Paul Ekman and his team published the Facial Action Coding System (FACS), an objective method of mapping each facial muscle movement to an action unit, a code. So, for example, when you smile (and try this with me), you're pulling the zygomaticus muscle, and that's the lip corner pull, or action unit 12. When you furrow or frown, that's action unit 4, the brow furrow. And it's typically an indicator of a negative emotion like confusion or anger.

There are about 45 of these facial muscles, and it takes about 100 hours of training to become a certified FACS coder, or face reader. Very laborious, very time intensive. And coding every minute of video takes about five minutes of watching the video in slow motion and saying, aha, I see an eyebrow raise or a squint. We don't need to do that anymore.
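To make the coding idea concrete, here is a minimal sketch. Only AU 12 (lip corner pull) and AU 4 (brow furrow) come from the talk; the toy valence rule and the table format are editorial simplifications, not Affectiva's actual mapping:

```python
# FACS-style lookup: action units (AUs) are numeric codes for individual
# facial muscle movements. AU 12 and AU 4 are the two named in the talk;
# the naive rule below is an illustrative simplification.
ACTION_UNITS = {
    4: "brow furrower",       # often signals confusion or anger
    12: "lip corner puller",  # the core of a smile (zygomaticus major)
}

def naive_valence(active_aus):
    """Toy rule: AU 12 alone suggests positive affect, AU 4 negative."""
    if 4 in active_aus:
        return "negative"
    if 12 in active_aus:
        return "positive"
    return "neutral"
```

A real coder (human or machine) works from dozens of AUs and their combinations; this only shows the shape of the code-to-meaning mapping.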
Instead, we use computer vision, machine learning, and deep learning to automatically train algorithms to detect these facial expressions. We use hundreds of thousands of examples of people smiling and smirking and furrowing their eyebrows to train these algorithms. The deep learning network is able to distill what is common between all these smiles and what's common between all these frowns, and that's how it learns.

To simplify it, the first step of the process is to triangulate where the face is and find facial landmarks like your eyebrows, your mouth, or your nose. Then you feed that region into a deep neural network, which distills what expressions are happening on the face and maps those onto a number of emotional and cognitive states: everything from joy, surprise, anger, and disgust to more complex states like fatigue, attention, cognitive overload, confusion, and more.
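The three stages described above can be sketched as follows. Every function here is a stub standing in for a trained model; the function names and the dummy outputs are assumptions for illustration, not Affectiva's SDK:

```python
# Sketch of the pipeline from the talk: locate the face, find landmarks,
# then classify the face region into emotional and cognitive states.
# All three stages are placeholders for trained models.

def detect_face(frame):
    """Stage 1: return a bounding box (x, y, w, h) for the face.
    Stub: pretends one face fills the whole frame."""
    h, w = len(frame), len(frame[0])
    return (0, 0, w, h)

def find_landmarks(frame, box):
    """Stage 2: locate key points such as eyebrows, nose, and mouth.
    Stub: returns fixed landmark names at the box center."""
    x, y, w, h = box
    center = (x + w // 2, y + h // 2)
    return {name: center
            for name in ("left_eyebrow", "right_eyebrow", "nose", "mouth")}

def classify_states(frame, box, landmarks):
    """Stage 3: a trained deep network would score the face crop.
    Stub: zeroed scores for the states named in the talk."""
    states = ("joy", "surprise", "anger", "disgust",
              "fatigue", "attention", "cognitive_overload", "confusion")
    return {s: 0.0 for s in states}

def analyze_frame(frame):
    box = detect_face(frame)
    landmarks = find_landmarks(frame, box)
    return classify_states(frame, box, landmarks)

# A tiny 2x2 "frame" of grayscale pixels is enough to exercise the stubs.
scores = analyze_frame([[0, 0], [0, 0]])
```

In a production system, each stub would be a model (face detector, landmark localizer, expression network), but the data flow between stages is the part the talk describes.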
In our work over the past number of years, we have amassed the world's largest emotion repository: 9.5 million face videos that we've collected, with everybody's opt-in and consent, in 90 countries around the world. This roughly translates to about 5 billion facial frames. It's by far the largest repository of actual emotional responses out there. And we use that data to train, but also validate, our algorithms.

There are so many applications of this technology that transform industries. My big vision is that in the next few years, we're going to see emotion AI become the de facto human-machine interface. Basically, we would be interacting with our devices just the way we interact with one another. Through conversation: we're already seeing that with conversational devices like Alexa and Siri. Through perception.
Again, we're already starting to see devices that have cameras on them. But perhaps most importantly, through empathy and emotional intelligence.

There are a lot of applications; I'm going to focus on a few. The first application is around quantifying how consumers emotionally engage with the products and brands around them. The way this works is we send out surveys to people and ask them to watch a piece of content. It could be an online video ad, a movie trailer, an actual TV show, or learning content. The idea is that we want to capture the emotional engagement and the emotional response people have to this content. We ask people to turn their cameras on (consent and opt-in are really important), and then we're able to capture these moment-by-moment responses.
Our technology is being used by 25% of the Fortune Global 500, as well as by leading market research firms, to quantify the emotional response that consumers and viewers have to their content. I thought it would be fun to actually show you one of the video ads that we have tested and have permission to share publicly. So let's watch together.

[MUSIC - TONY DALLARA, "COME PRIMA"]

[ITALIAN SINGING]

Uh, mom got there first. So, to date we have tested over 50,000 ads worldwide, and this particular ad scores in the ninetieth percentile, the top 10% of all of those ads. It garners very strong emotional engagement, which is our expressiveness score, and it garners a lot of smiles. You can see the moment-by-moment smile curve for everybody who watched that ad and from whom we were able to record data.
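The moment-by-moment smile curve described here is, in essence, the per-frame smile score averaged across all opted-in viewers. A minimal sketch with invented numbers (each row is one viewer, each column one time step; not Affectiva's actual metric code):

```python
# Each row: one viewer's smile probability over time, as a classifier
# might emit it. Values are made up for illustration.
viewers = [
    [0.1, 0.2, 0.8, 0.9],
    [0.0, 0.3, 0.7, 1.0],
    [0.2, 0.1, 0.6, 0.8],
]

def smile_curve(rows):
    """Average the smile score across viewers at each time step."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

curve = smile_curve(viewers)
# The rise toward the end of this toy curve is the kind of shape the talk
# calls a positive emotional journey culminating at the brand reveal.
```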
What is really fascinating about this particular ad is that if you compare the first time people saw it (the solid green line) with the second time people saw it (the dotted green line), you can actually see that people have memory of the ad. They're anticipating when it's really funny, and they're laughing even before that scene starts, which is exactly what you want. And, more importantly, we're taking the viewers on an emotional journey which culminates with a very positive response at the end, coinciding with the brand reveal. And, again, we know from our research that these kinds of metrics are very positive signals.

I love this particular example, because it shows how brands like Coca-Cola and Unilever are using this technology to really push online video advertising and become more inclusive and more progressive. But they have to do that in a very thoughtful way.
And they're able to capture these subconscious, visceral responses to their ads. We also know, from a lot of work we've done with these brands where we've correlated the emotional responses to actual consumer behavior, that this kind of emotional journey and these positive emotional responses correlate very highly with things like sales lift, purchase intent, and virality.

And because we've tested 50,000 ads worldwide, we're able to create benchmarks. So, for example, we know that in the US, pet care and baby care ads elicit the most enjoyment. Interestingly, in Canada, it's the cereal ads that elicit the most enjoyment. And, unfortunately, telecom ads elicit almost no enjoyment at all. They're pretty boring to watch.
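Benchmarking against a norms database, as described above, boils down to a percentile: what fraction of previously tested ads did this one outscore? A sketch with invented scores (not the actual benchmark methodology):

```python
def percentile_rank(score, norm_scores):
    """Percent of benchmark scores strictly below `score`."""
    below = sum(1 for s in norm_scores if s < score)
    return 100.0 * below / len(norm_scores)

# Ten made-up expressiveness scores standing in for the norms database.
norms = [0.2, 0.4, 0.5, 0.55, 0.6, 0.7, 0.75, 0.8, 0.85, 0.95]

rank = percentile_rank(0.9, norms)  # this ad beats 9 of the 10 norm ads
```

With real data, the norms would also be sliced by country and product category, which is how country-level findings like the pet-care and cereal comparisons are produced.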
00:12:32.010 --> 00:12:34.470 align:middle line:84% So this data is really, really insightful, 00:12:34.470 --> 00:12:36.510 align:middle line:84% and provides brands and marketers 00:12:36.510 --> 00:12:39.120 align:middle line:84% with novel insights that helps them 00:12:39.120 --> 00:12:41.400 align:middle line:84% make decisions around media spend 00:12:41.400 --> 00:12:44.510 align:middle line:84% as well as how to optimize their advertising content. 00:12:44.510 --> 00:12:46.140 align:middle line:84% And it's not just advertising, you 00:12:46.140 --> 00:12:50.803 align:middle line:84% can generate that same type of data with TV shows as well. 00:12:50.803 --> 00:12:52.470 align:middle line:84% And this is one of my favorite examples. 00:12:52.470 --> 00:12:56.430 align:middle line:84% It's a sitcom that we tested a while back for CBS. 00:12:56.430 --> 00:12:58.260 align:middle line:84% It was called Friends With Better Lives. 00:12:58.260 --> 00:12:59.610 align:middle line:90% I don't think it aired for long. 00:12:59.610 --> 00:13:01.120 align:middle line:90% It didn't do very well. 00:13:01.120 --> 00:13:03.810 align:middle line:84% But here, again, you're seeing the smile curve. 00:13:03.810 --> 00:13:06.690 align:middle line:84% And we superimpose the characters 00:13:06.690 --> 00:13:08.850 align:middle line:90% of the scene on that curve. 00:13:08.850 --> 00:13:11.490 align:middle line:84% And you can see that there are two particular characters, 00:13:11.490 --> 00:13:14.770 align:middle line:84% every time they show up they're just not funny. 00:13:14.770 --> 00:13:19.120 align:middle line:84% They're basically the trough in this smile curve. 00:13:19.120 --> 00:13:21.210 align:middle line:84% And, again, very, very interesting data 00:13:21.210 --> 00:13:23.730 align:middle line:84% that the producers were able to use 00:13:23.730 --> 00:13:27.780 align:middle line:90% to switch out these characters. 
Moving on from understanding the emotional engagement people have with content, to another area we're spending a lot of time in: the future of mobility and transportation.

When we first started working in this space, we were approached by a number of automakers around the world who wanted to repurpose our emotion-sensing technology for the car. To do that, we wanted to see how people behave in the car anyway, especially drivers. And so we set out to collect some data. A lot of us in the company were very skeptical: perhaps, because people knew there was going to be a camera on their dashboard, there wouldn't be any interesting emotional or cognitive responses. And we were wrong. So I'm going to show you some examples of video clips.
Again, these people knew that the camera was in their vehicle. They installed it, they consented, and yet we saw very interesting driving behaviors. In this particular case, a dad is driving the vehicle, and he's extremely drowsy. There are four levels of drowsiness, and he's basically asleep. You can also see that his toddler daughter is in the back seat. So very, very unsafe driving.

And I can certainly empathize with that. There were many times, back when we actually traveled, when I'd be back from a long trip, jet-lagged and exhausted, while driving my kids. So I can certainly relate to that. But that's an easy example for the technology to pick up. You can see here that the technology was able to detect things like eye closure, mouth open, and levels of drowsiness. And you can imagine how the car could interject in that case.
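One common drowsiness proxy is the fraction of time the eyes are closed over a rolling window (a PERCLOS-style measure). The talk mentions four levels of drowsiness; the threshold bands below are invented for illustration and are not Affectiva's actual calibration:

```python
def drowsiness_level(eye_closed_flags):
    """Map per-frame eye-closure flags (True = closed) to a level 1-4,
    based on the fraction of the window spent with eyes closed.
    Thresholds are illustrative assumptions."""
    perclos = sum(eye_closed_flags) / len(eye_closed_flags)
    if perclos < 0.10:
        return 1  # alert
    if perclos < 0.25:
        return 2  # mildly drowsy
    if perclos < 0.50:
        return 3  # drowsy
    return 4      # basically asleep; time for the car to interject

alert_driver = drowsiness_level([False] * 20)
sleeping_driver = drowsiness_level([True] * 15 + [False] * 5)
```

A deployed system would also fold in cues like mouth-open (yawning) and head nodding, as the transcript's eye-closure and mouth-open detections suggest.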
Here's another example. She's driving, and she's also texting while driving. But, oh, she has two phones in her hands, so she's very distracted. Her eyes are off the road. Her hands are off the wheel. You do not want to be driving next to this woman.

Again, that's an example the technology can pick up just by looking at her glance behavior and her head pose information, but also by combining those with things like object detection. We can detect that she has not only one, but two cell phones in her hands, and flag her as a very highly distracted driver. And, again, the car can intervene one way or another. You can imagine, if this were a Tesla that could jump into semi-autonomous mode, it could basically say: hang on a second, I'm going to be a better driver at this moment in time than you are; I'm taking control.
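The transcript names three cues that get fused here: glance behavior, head pose, and object detection (phones in hand). A toy fusion is sketched below; the weights, the cap on glance duration, and the intervention threshold are all invented for illustration:

```python
def distraction_score(eyes_off_road_s, head_turned, phones_detected):
    """Fuse three cues into a 0-1 distraction score.
    Weights and caps are illustrative assumptions."""
    score = 0.0
    score += min(eyes_off_road_s, 4.0) / 4.0 * 0.5  # sustained glance away
    score += 0.2 if head_turned else 0.0            # head pose off-axis
    score += 0.15 * min(phones_detected, 2)         # one, or even two, phones
    return round(score, 2)

def should_intervene(score, threshold=0.7):
    """E.g., hand control to a semi-autonomous mode."""
    return score >= threshold

# Eyes off the road, head turned, two phones in hand, as in the clip.
risky = distraction_score(4.0, True, 2)
```

The point of the sketch is the fusion: no single cue decides; the combination of glances, pose, and detected objects does.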
We have done a lot of research in collaboration with MIT's Advanced Vehicle Technology group, where we ran the first-ever large-scale study looking at driver behavior over time. In this particular case, you're looking at participant number 66. She's female, she's 22 years old, and all her Monday drives are concatenated in the Monday row, all her Tuesday drives in the Tuesday row, and so on. You can see that she's pretty tired during the week: there's a lot of yawning happening at the beginning of the day. And during the weekend, and also towards the end of the day, there are a lot more positive expressions.

They say a picture is worth a thousand words; I believe a video is worth even more. So I'll just show you examples of her AM versus PM driving behaviors. Again, these were all mined using our algorithms.
00:17:09.990 --> 00:17:12.930 align:middle line:84% So we basically invoked the yawn classifier 00:17:12.930 --> 00:17:15.060 align:middle line:84% to detect all of her yawning behaviors 00:17:15.060 --> 00:17:20.000 align:middle line:84% across all of her driving sessions. 00:17:20.000 --> 00:17:23.619 align:middle line:84% So you can see that she's often very exhausted in the morning. 00:17:23.619 --> 00:17:25.888 align:middle line:84% And you can imagine, if that's a recurring pattern, 00:17:25.888 --> 00:17:27.430 align:middle line:84% then the car already knows that she's 00:17:27.430 --> 00:17:30.340 align:middle line:90% going to come in exhausted. 00:17:30.340 --> 00:17:32.770 align:middle line:84% Maybe there are suggestions proactively 00:17:32.770 --> 00:17:37.690 align:middle line:84% that the car can make to the driver given her profile. 00:17:37.690 --> 00:17:42.720 align:middle line:84% Now contrast that to her PM driving behaviors. 00:17:42.720 --> 00:17:46.140 align:middle line:84% You can see that she's already looking much more awake. 00:17:46.140 --> 00:17:49.110 align:middle line:84% She's a lot more animated, a lot more smiles, a lot more 00:17:49.110 --> 00:17:50.520 align:middle line:90% positive expressions. 00:17:50.520 --> 00:17:54.300 align:middle line:84% And once again, if the car had her driver profile, 00:17:54.300 --> 00:17:56.940 align:middle line:84% it could customize and personalize 00:17:56.940 --> 00:18:02.170 align:middle line:84% the driving experience based on her emotional experiences. 00:18:02.170 --> 00:18:05.310 align:middle line:84% So we're working very closely with a number of automakers 00:18:05.310 --> 00:18:08.400 align:middle line:84% to re-imagine this transportation experience. 
00:18:08.400 --> 00:18:10.260 align:middle line:84% Not only in cars today where we can 00:18:10.260 --> 00:18:13.890 align:middle line:84% focus on the driver, but also vehicles of the future where 00:18:13.890 --> 00:18:17.910 align:middle line:84% we really have this vision of an in-cabin sensing solution that 00:18:17.910 --> 00:18:20.670 align:middle line:84% looks at the driver but also the other occupants in the vehicle 00:18:20.670 --> 00:18:21.510 align:middle line:90% as well. 00:18:21.510 --> 00:18:23.820 align:middle line:84% As well as other objects in the car. 00:18:23.820 --> 00:18:25.680 align:middle line:90% Is there a child left behind? 00:18:25.680 --> 00:18:27.600 align:middle line:90% Did you leave your phone behind? 00:18:27.600 --> 00:18:29.557 align:middle line:84% How many occupants are in the vehicle? 00:18:29.557 --> 00:18:30.390 align:middle line:90% What is their state? 00:18:30.390 --> 00:18:34.300 align:middle line:84% Can you personalize the music, the content, the lighting, 00:18:34.300 --> 00:18:37.260 align:middle line:90% and so on? 00:18:37.260 --> 00:18:39.110 align:middle line:84% But, of course, a lot of us are not 00:18:39.110 --> 00:18:40.610 align:middle line:84% spending a lot of time in our car 00:18:40.610 --> 00:18:43.310 align:middle line:84% anymore during this global pandemic. 00:18:43.310 --> 00:18:47.870 align:middle line:84% And, in fact, a lot of us are spending much more of our time 00:18:47.870 --> 00:18:52.250 align:middle line:84% on video conferences and on virtual events. 00:18:52.250 --> 00:18:55.700 align:middle line:84% Making human connection via technology is really hard. 00:18:55.700 --> 00:19:00.140 align:middle line:84% And it's primarily because the main method of communication 00:19:00.140 --> 00:19:01.430 align:middle line:90% is nonverbal. 
00:19:01.430 --> 00:19:05.070 align:middle line:84% And sometimes when we are on these virtual conferences 00:19:05.070 --> 00:19:09.140 align:middle line:84% and virtual events and video events, 00:19:09.140 --> 00:19:13.310 align:middle line:84% we're not really tapping into the energy and the expressions 00:19:13.310 --> 00:19:14.000 align:middle line:90% of the audience. 00:19:14.000 --> 00:19:16.220 align:middle line:90% I certainly feel that myself. 00:19:16.220 --> 00:19:19.220 align:middle line:84% I just launched my book Girl Decoded as I mentioned. 00:19:19.220 --> 00:19:21.890 align:middle line:84% And I had to pivot from a book tour 00:19:21.890 --> 00:19:24.650 align:middle line:84% where I was supposed to be traveling nonstop March, April, 00:19:24.650 --> 00:19:28.220 align:middle line:84% and May, to all of these virtual book tours and virtual book 00:19:28.220 --> 00:19:28.910 align:middle line:90% talks. 00:19:28.910 --> 00:19:32.840 align:middle line:84% And it's so different because in a live environment 00:19:32.840 --> 00:19:34.370 align:middle line:90% I can see the audience. 00:19:34.370 --> 00:19:36.710 align:middle line:84% I can see you all and I can riff off of your energy. 00:19:36.710 --> 00:19:38.870 align:middle line:84% And I can customize and personalize 00:19:38.870 --> 00:19:43.010 align:middle line:84% and adapt my content based on how you're engaging with me. 00:19:43.010 --> 00:19:46.190 align:middle line:84% When I do this virtually, it's often a one-way conversation 00:19:46.190 --> 00:19:49.010 align:middle line:84% or at least it feels to me like it's a one-way conversation. 00:19:49.010 --> 00:19:51.750 align:middle line:84% And I find it really painful and unsettling. 00:19:51.750 --> 00:19:55.820 align:middle line:84% But I think there's an opportunity to change that. 
00:19:55.820 --> 00:19:59.390 align:middle line:84% What if we integrated emotion AI 00:19:59.390 --> 00:20:02.300 align:middle line:84% as a way to aggregate audience response, 00:20:02.300 --> 00:20:05.360 align:middle line:84% capturing all of your facial expressions anonymously? 00:20:05.360 --> 00:20:08.070 align:middle line:84% I don't need to see everybody's faces. 00:20:08.070 --> 00:20:11.120 align:middle line:84% In fact, that would probably be overwhelming as I'm presenting. 00:20:11.120 --> 00:20:14.510 align:middle line:84% But if I was able to see a moment-by-moment trace of how 00:20:14.510 --> 00:20:17.042 align:middle line:90% positively or negatively 00:20:17.042 --> 00:20:18.500 align:middle line:84% you are engaging with this content, 00:20:18.500 --> 00:20:20.420 align:middle line:84% that would be very powerful information 00:20:20.420 --> 00:20:22.670 align:middle line:84% that would allow me to customize and get 00:20:22.670 --> 00:20:27.030 align:middle line:84% a sense of the level of engagement of the audience. 00:20:27.030 --> 00:20:28.900 align:middle line:84% And so I wanted to show you one example. 00:20:28.900 --> 00:20:31.390 align:middle line:84% Just from an internal team meeting on Zoom where 00:20:31.390 --> 00:20:34.240 align:middle line:90% we recorded the team's responses. 00:20:34.240 --> 00:20:37.360 align:middle line:84% And you're able to see here the aggregated response. 00:20:37.360 --> 00:20:40.180 align:middle line:84% Of course, there's different ways to visualize this data, 00:20:40.180 --> 00:20:42.070 align:middle line:84% but we're able to track everybody 00:20:42.070 --> 00:20:44.405 align:middle line:84% and we can aggregate people's smile responses. 00:20:44.405 --> 00:20:46.030 align:middle line:84% They were actually talking about-- this 00:20:46.030 --> 00:20:49.960 align:middle line:84% was early in March, just when the pandemic broke. 
00:20:49.960 --> 00:20:51.970 align:middle line:84% And we were talking about what's going 00:20:51.970 --> 00:20:54.287 align:middle line:90% to happen and work from home. 00:20:54.287 --> 00:20:55.870 align:middle line:84% And people were sharing their stories. 00:20:55.870 --> 00:20:58.190 align:middle line:84% And there was a lot of empathy and smiling behavior. 00:20:58.190 --> 00:21:00.190 align:middle line:90% So we're able to track that. 00:21:00.190 --> 00:21:02.890 align:middle line:84% Now compare that to a really flat, boring meeting 00:21:02.890 --> 00:21:04.600 align:middle line:84% where everybody's just not engaging. 00:21:04.600 --> 00:21:06.230 align:middle line:90% You can tell the difference. 00:21:06.230 --> 00:21:09.790 align:middle line:84% And we're able to quantify that data for real-time engagement, 00:21:09.790 --> 00:21:16.050 align:middle line:84% but also do that after the fact as post analytics as well. 00:21:16.050 --> 00:21:18.800 align:middle line:84% Similarly, in an online learning environment, 00:21:18.800 --> 00:21:21.650 align:middle line:84% what if a teacher could measure the emotional engagement 00:21:21.650 --> 00:21:24.920 align:middle line:84% of his or her students just as he or she would 00:21:24.920 --> 00:21:26.150 align:middle line:90% in a live classroom? 00:21:26.150 --> 00:21:28.070 align:middle line:84% That's what an awesome teacher does, right? 00:21:28.070 --> 00:21:30.710 align:middle line:84% You riff off of the engagement of the students 00:21:30.710 --> 00:21:33.710 align:middle line:84% and then you can personalize the learning experience 00:21:33.710 --> 00:21:37.620 align:middle line:84% and maximize learning outcomes as well. 00:21:37.620 --> 00:21:41.390 align:middle line:84% Similarly, there's applications in mental health as well. 
00:21:41.390 --> 00:21:44.090 align:middle line:84% Today when you walk into a doctor's office 00:21:44.090 --> 00:21:46.403 align:middle line:84% they don't ask you for your temperature 00:21:46.403 --> 00:21:48.320 align:middle line:84% or for your blood pressure; 00:21:48.320 --> 00:21:49.970 align:middle line:90% they just measure them. 00:21:49.970 --> 00:21:51.890 align:middle line:84% So what if doctors could objectively 00:21:51.890 --> 00:21:53.930 align:middle line:84% measure how you are feeling the way they 00:21:53.930 --> 00:21:56.090 align:middle line:90% measure other vital signs? 00:21:56.090 --> 00:21:58.430 align:middle line:84% Unfortunately, the gold standard in mental health 00:21:58.430 --> 00:22:01.230 align:middle line:84% is still on a scale from 1 to 10-- 00:22:01.230 --> 00:22:02.840 align:middle line:90% how depressed are you? 00:22:02.840 --> 00:22:04.160 align:middle line:90% How suicidal are you? 00:22:04.160 --> 00:22:05.810 align:middle line:90% How much pain are you in? 00:22:05.810 --> 00:22:08.780 align:middle line:84% And we can bring emotion AI and apply it 00:22:08.780 --> 00:22:12.620 align:middle line:84% in a way that brings objective measures of mental health 00:22:12.620 --> 00:22:15.190 align:middle line:90% conditions. 00:22:15.190 --> 00:22:17.760 align:middle line:84% One area that is very near and dear to my heart, 00:22:17.760 --> 00:22:21.010 align:middle line:84% and the very first application of emotion AI that I explored, 00:22:21.010 --> 00:22:22.300 align:middle line:90% is autism. 00:22:22.300 --> 00:22:24.760 align:middle line:84% Individuals on the autism spectrum struggle 00:22:24.760 --> 00:22:28.780 align:middle line:84% with reading and understanding non-verbal signals. 00:22:28.780 --> 00:22:32.620 align:middle line:84% They find the face, in particular, very overwhelming. 00:22:32.620 --> 00:22:34.300 align:middle line:84% And sometimes they avoid it altogether. 
00:22:34.300 --> 00:22:37.390 align:middle line:84% They avoid face and eye contact altogether. 00:22:37.390 --> 00:22:40.690 align:middle line:84% So we are partnered with a company called Brain Power. 00:22:40.690 --> 00:22:45.910 align:middle line:84% They use Google Glass and our technology to help individuals 00:22:45.910 --> 00:22:49.180 align:middle line:84% on the autism spectrum learn about these non-verbal signals 00:22:49.180 --> 00:22:51.550 align:middle line:90% in a very fun and gamified way. 00:22:51.550 --> 00:22:56.200 align:middle line:84% So I'm going to show you a short video demonstrating that. 00:22:56.200 --> 00:22:57.850 align:middle line:90% What do you see on-screen? 00:22:57.850 --> 00:22:59.090 align:middle line:90% Mom. 00:22:59.090 --> 00:23:02.128 align:middle line:84% 8-year-old Matthew Krieger has been diagnosed with autism. 00:23:02.128 --> 00:23:04.170 align:middle line:84% A lot of the trouble he gets into with other kids 00:23:04.170 --> 00:23:05.820 align:middle line:90% is he thinks he's funny. 00:23:05.820 --> 00:23:07.950 align:middle line:84% And doesn't read at all that he's not 00:23:07.950 --> 00:23:10.020 align:middle line:84% or that they're annoyed or angry. 00:23:10.020 --> 00:23:11.550 align:middle line:84% Matthew's mother, Laura, signed him 00:23:11.550 --> 00:23:15.570 align:middle line:84% up for a clinical trial being conducted by Ned Sahin. 00:23:15.570 --> 00:23:20.010 align:middle line:84% I want to know what's going on inside the brain of someone 00:23:20.010 --> 00:23:21.270 align:middle line:90% with autism. 00:23:21.270 --> 00:23:24.600 align:middle line:84% And it turns out parents want to know that too. 00:23:24.600 --> 00:23:26.700 align:middle line:84% You get points for looking for a while 00:23:26.700 --> 00:23:29.660 align:middle line:84% and then even for looking away and then looking back. 
00:23:29.660 --> 00:23:32.910 align:middle line:84% Sahin's company, Brain Power, uses Affectiva's software 00:23:32.910 --> 00:23:35.940 align:middle line:84% in programs Matthew sees through Google Glass. 00:23:35.940 --> 00:23:37.950 align:middle line:84% These games are trying to help him understand 00:23:37.950 --> 00:23:40.860 align:middle line:84% how facial expressions correspond to emotions 00:23:40.860 --> 00:23:42.840 align:middle line:90% and learn social cues. 00:23:42.840 --> 00:23:45.300 align:middle line:84% One of the key life skills is understanding 00:23:45.300 --> 00:23:47.000 align:middle line:90% the emotions of others. 00:23:47.000 --> 00:23:50.190 align:middle line:84% And another is looking in their direction 00:23:50.190 --> 00:23:51.510 align:middle line:90% when they're speaking. 00:23:51.510 --> 00:23:54.750 align:middle line:84% Looking at your mom and while it's green you're getting 00:23:54.750 --> 00:23:58.020 align:middle line:84% points when it starts to get orange and red you're-- 00:23:58.020 --> 00:24:00.660 align:middle line:90% you slow down with the points. 00:24:00.660 --> 00:24:01.698 align:middle line:90% I am looking at you. 00:24:01.698 --> 00:24:02.490 align:middle line:90% You are looking me. 00:24:02.490 --> 00:24:05.250 align:middle line:84% Just a few minutes later, the difference in Matthew's gaze 00:24:05.250 --> 00:24:06.780 align:middle line:90% overwhelmed his mother. 00:24:06.780 --> 00:24:09.200 align:middle line:90% I want to cry. 00:24:09.200 --> 00:24:09.700 align:middle line:90% Why? 00:24:09.700 --> 00:24:13.540 align:middle line:90% 00:24:13.540 --> 00:24:19.340 align:middle line:84% Because when you look at me it makes 00:24:19.340 --> 00:24:22.020 align:middle line:84% me think you haven't really before because you're 00:24:22.020 --> 00:24:25.230 align:middle line:90% looking at me differently. 
00:24:25.230 --> 00:24:29.260 align:middle line:84% So Brain Power has about 400 of these Google Glass systems 00:24:29.260 --> 00:24:33.400 align:middle line:84% deployed in families across the United States. 00:24:33.400 --> 00:24:35.650 align:middle line:84% And the main question they're trying to answer-- we're 00:24:35.650 --> 00:24:38.290 align:middle line:84% already seeing improvement in terms 00:24:38.290 --> 00:24:41.763 align:middle line:84% of the kids' social and non-verbal understanding, 00:24:41.763 --> 00:24:43.180 align:middle line:84% while they're wearing the glasses. 00:24:43.180 --> 00:24:47.410 align:middle line:84% The key question is what happens when they take off the glasses. 00:24:47.410 --> 00:24:50.580 align:middle line:90% Does this learning generalize? 00:24:50.580 --> 00:24:53.592 align:middle line:84% There's also other applications of this technology 00:24:53.592 --> 00:24:54.300 align:middle line:90% in mental health. 00:24:54.300 --> 00:24:58.050 align:middle line:84% For example, a system for early detection of Parkinson's. 00:24:58.050 --> 00:25:02.070 align:middle line:84% Erin Smith-- she's now a student at Stanford. 00:25:02.070 --> 00:25:05.370 align:middle line:84% Way back when she was a junior in high school, 00:25:05.370 --> 00:25:08.550 align:middle line:84% she emailed us and she said I've been watching 00:25:08.550 --> 00:25:12.240 align:middle line:84% this documentary about Parkinson's and I 00:25:12.240 --> 00:25:14.320 align:middle line:90% want to use your technology. 00:25:14.320 --> 00:25:15.450 align:middle line:90% How much does it cost? 00:25:15.450 --> 00:25:20.187 align:middle line:84% And I remember our head of sales asked me what do I tell her? 00:25:20.187 --> 00:25:21.520 align:middle line:90% She can't afford our technology. 00:25:21.520 --> 00:25:24.390 align:middle line:84% And I said just give it to her for free. 
00:25:24.390 --> 00:25:26.250 align:middle line:84% What is she going to do with it anyways? 00:25:26.250 --> 00:25:29.808 align:middle line:84% And Erin disappeared for a couple of months and came back. 00:25:29.808 --> 00:25:31.350 align:middle line:84% She had partnered with the Michael J. 00:25:31.350 --> 00:25:36.120 align:middle line:84% Fox Foundation and built this system to identify facial 00:25:36.120 --> 00:25:38.250 align:middle line:90% biomarkers of Parkinson's. 00:25:38.250 --> 00:25:40.590 align:middle line:84% And she's continued to work on this research. 00:25:40.590 --> 00:25:43.060 align:middle line:90% Very inspiring young woman. 00:25:43.060 --> 00:25:45.090 align:middle line:84% And I feel very proud that we are playing 00:25:45.090 --> 00:25:47.190 align:middle line:90% a small part of her journey. 00:25:47.190 --> 00:25:50.490 align:middle line:84% We also know that there are facial and vocal biomarkers 00:25:50.490 --> 00:25:51.780 align:middle line:90% of depression. 00:25:51.780 --> 00:25:56.040 align:middle line:84% And there's a lot of work being done to flag suicidal intent 00:25:56.040 --> 00:25:57.630 align:middle line:90% based on these signals. 00:25:57.630 --> 00:26:00.570 align:middle line:84% This is work that we are doing in collaboration 00:26:00.570 --> 00:26:05.960 align:middle line:84% with Professor Steven Benoit and others as well. 00:26:05.960 --> 00:26:08.260 align:middle line:84% But this brings up a very important topic, 00:26:08.260 --> 00:26:10.410 align:middle line:84% which is, OK, there's a lot of applications 00:26:10.410 --> 00:26:13.320 align:middle line:84% of this technology, where do we draw the line? 00:26:13.320 --> 00:26:16.470 align:middle line:84% And I feel very passionately about this concept 00:26:16.470 --> 00:26:20.460 align:middle line:84% of the ethical development and deployment of AI. 
00:26:20.460 --> 00:26:24.990 align:middle line:84% It's not just about acknowledging that there's 00:26:24.990 --> 00:26:27.630 align:middle line:84% so much potential for good, but it's also 00:26:27.630 --> 00:26:29.940 align:middle line:84% acknowledging where this can be abused 00:26:29.940 --> 00:26:33.560 align:middle line:84% and the unintended consequences of this technology. 00:26:33.560 --> 00:26:35.310 align:middle line:84% A number of years ago when we were raising 00:26:35.310 --> 00:26:37.380 align:middle line:84% money for the company, we got approached 00:26:37.380 --> 00:26:41.670 align:middle line:84% by an agency that wanted to give us a lot of money-- 00:26:41.670 --> 00:26:44.700 align:middle line:84% $40 million at the time, which was a lot of money 00:26:44.700 --> 00:26:47.150 align:middle line:90% for our little startup-- 00:26:47.150 --> 00:26:49.680 align:middle line:84% on the condition that they could use the technology for lie 00:26:49.680 --> 00:26:52.500 align:middle line:90% detection and surveillance. 00:26:52.500 --> 00:26:54.690 align:middle line:84% And that was really not in line with our core values 00:26:54.690 --> 00:26:58.020 align:middle line:84% of respecting people's privacy and consent. 00:26:58.020 --> 00:27:02.020 align:middle line:84% And acknowledging that, as a user, 00:27:02.020 --> 00:27:03.730 align:middle line:90% this data is very personal. 00:27:03.730 --> 00:27:05.940 align:middle line:84% And if I'm going to be sharing it, 00:27:05.940 --> 00:27:07.890 align:middle line:84% then I need to know exactly who's using it, 00:27:07.890 --> 00:27:12.370 align:middle line:84% how is it being used, and also what's in it for me? 
00:27:12.370 --> 00:27:14.040 align:middle line:84% So we spent a lot of time thinking 00:27:14.040 --> 00:27:15.780 align:middle line:90% about this power asymmetry-- 00:27:15.780 --> 00:27:17.580 align:middle line:84% what value am I getting in return 00:27:17.580 --> 00:27:20.307 align:middle line:84% for sharing this very personal data. 00:27:20.307 --> 00:27:22.390 align:middle line:84% What's really cool about this is that the industry 00:27:22.390 --> 00:27:25.440 align:middle line:84% is taking a lead in defining these best 00:27:25.440 --> 00:27:26.820 align:middle line:90% practices and guidelines. 00:27:26.820 --> 00:27:30.600 align:middle line:84% So we are part of a consortium called the Partnership on AI. 00:27:30.600 --> 00:27:32.460 align:middle line:84% It was started by the tech giants-- 00:27:32.460 --> 00:27:35.430 align:middle line:84% Amazon, Google, Facebook, Microsoft. 00:27:35.430 --> 00:27:39.120 align:middle line:84% And they have since invited a number of startups 00:27:39.120 --> 00:27:40.440 align:middle line:90% like Affectiva. 00:27:40.440 --> 00:27:44.880 align:middle line:84% But also other stakeholders like ACLU and Amnesty International. 00:27:44.880 --> 00:27:47.760 align:middle line:84% And I'm part of the FATE committee, 00:27:47.760 --> 00:27:51.090 align:middle line:84% which is fair, accountable, transparent, and equitable AI. 00:27:51.090 --> 00:27:55.290 align:middle line:84% And our task is to come up with these guidelines 00:27:55.290 --> 00:27:57.227 align:middle line:90% around thoughtful regulation. 00:27:57.227 --> 00:27:59.310 align:middle line:84% We need regulation, but it needs to be thoughtful. 
00:27:59.310 --> 00:28:02.100 align:middle line:84% We don't want to completely stifle innovation, 00:28:02.100 --> 00:28:03.600 align:middle line:84% but at the same time, we really need 00:28:03.600 --> 00:28:06.030 align:middle line:84% to think through where do we draw the line 00:28:06.030 --> 00:28:11.910 align:middle line:84% and what does this thoughtful regulation look like. 00:28:11.910 --> 00:28:14.240 align:middle line:84% And, at the end of the day, my mission 00:28:14.240 --> 00:28:17.570 align:middle line:84% is to humanize technology before it dehumanizes us. 00:28:17.570 --> 00:28:21.080 align:middle line:84% And I want us to put the emphasis back on the human, 00:28:21.080 --> 00:28:23.200 align:middle line:90% not on the artificial. 00:28:23.200 --> 00:28:27.290 align:middle line:84% And to wrap up, if this made you a little bit more curious 00:28:27.290 --> 00:28:29.750 align:middle line:84% about emotion AI and its applications 00:28:29.750 --> 00:28:34.280 align:middle line:84% and its implications, I am giving away a signed book 00:28:34.280 --> 00:28:37.520 align:middle line:84% to the first three people who post on my social media-- 00:28:37.520 --> 00:28:41.350 align:middle line:84% LinkedIn, Facebook, Twitter, Instagram-- using the hashtags 00:28:41.350 --> 00:28:45.860 align:middle line:84% #LIVEWORX and #GIRLDECODED with a comment or a question 00:28:45.860 --> 00:28:47.970 align:middle line:90% about this presentation. 00:28:47.970 --> 00:28:50.970 align:middle line:90% Thank you. 00:28:50.970 --> 00:28:52.470 align:middle line:90% [LIVEWORX THEME] 00:28:52.470 --> 00:28:54.500 align:middle line:84% Thank you for joining us Dr. el Kaliouby. 00:28:54.500 --> 00:28:56.690 align:middle line:90% What a compelling presentation. 00:28:56.690 --> 00:29:00.000 align:middle line:84% So that concludes our third session of the day. 
00:29:00.000 --> 00:29:01.460 align:middle line:84% Next up at the top of the hour, we 00:29:01.460 --> 00:29:05.240 align:middle line:84% will hear how companies are leveraging SaaS-based CAD, PLM, 00:29:05.240 --> 00:29:07.550 align:middle line:84% and augmented reality as key tools for them 00:29:07.550 --> 00:29:11.270 align:middle line:84% to not only survive disruption but embrace this new normal. 00:29:11.270 --> 00:29:13.160 align:middle line:84% We'll be right back after a short break. 00:29:13.160 --> 00:29:16.510 align:middle line:90% [LIVEWORX THEME] 00:29:16.510 --> 00:29:35.000 align:middle line:90%