Results and submissions : REF 2021

Automated Reasoning for Amazon Cloud Security

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Technological
Is this case study continued from a case study submitted in 2014?: No

Download case study PDF

1. Summary of the impact

The correctness of code, networks, policies, and protocols are essential for the durability, privacy, and security of public cloud systems such as include Amazon’s Amazon Web Services (AWS), Microsoft’s Azure, or Google’s GCP. Professor Byron Cook’s UCL-based automated reasoning research directly changed business practices for Amazon’s AWS, the largest public Cloud offering. This research improved customer experience, security and trust for AWS’ 1,000,000 business customers in 190 countries, including the UK’s Ministry of Justice, the British Broadcast Corporation (BBC), and Vodafone Group Plc. Cook’s work has helped an exponentially growing number of AWS customers safely use the Cloud, with a shrinking number of security bulletins reported.

2. Underpinning research

Professor Byron Cook is a world-renowned expert in the area of automated reasoning, in which algorithms are used to find proofs expressed in mathematical logic. Often automated reasoning tools are then applied to prove properties of computer systems themselves—for example, Cook’s work on temporal reasoning ( R1, R2) for computer programs. This research allows Amazon to make precise claims about how computer programs will use resources such as API calls, time, memory, storage.

Amazon approached Cook in 2014 to apply his work to the problem of security assurance for Amazon Web Services (AWS), the world’s largest cloud service. As the founder and visionary behind Amazon’s AWS Automated Reasoning Group, Cook brought his UCL-based research to the development of bespoke AWS customer-facing security controls, including IAM Access Analyzer, S3 Block Public Access, VPC Reachability Analyzer, as well as mathematical proofs about the underlying foundations of the AWS virtualization and encryption systems. This work included UCL PhD students Kareem Khazem and Pavle Subotic.

Reasoning about programs

At UCL, Cook and team created the first known fully automatic method for proving temporal properties expressed in the logic CTL* of computer programs ( R1, R2). Previously, no automated systems allowed for the verification of such expressive properties ( R1). For the first time, this enabled automatic verification properties for programmes that mix branching-time and linear-time temporal operators ( R1, R2).

These ideas were used within Amazon in two key applications: the formal verification of virtualisation infrastructure, and encryption infrastructure. These proofs offer new and previously unseen levels of security assurance for cloud users. For example, when proving properties of the AWS virtualization infrastructure, Cook and UCL-based PhD student Khazem proved key properties of the boot code running in AWS data centres ( R3). In the space of encryption, Cook and team proved correctness properties of s2n, Amazon’s open-source implementation of the Transport Security Layer (TLS) protocol ( R4).

Reasoning about resources

Building on UCL-based research on resource reasoning (e.g. https://www.resourcereasoning.com/), Byron and UCL PhD student Subotic also developed two automated resource reasoning tools: Tiros, which formalises the semantics of AWS virtualised networking; and Zelkova, which formalizes the semantics of AWS resource policies ( R5, R6). These two tools then use automated theorem provers (such as Subotic’s Souffle tool) to verify security-related properties. Zelkova is now the basis of Amazon’s IAM Access Analyzer feature (see https://docs.aws.amazon.com/IAM/latest/UserGuide/what-is-access-analyzer.html) and S3 Block Public Access feature (see https://docs.aws.amazon.com/AmazonS3/latest/userguide/access\-control\-block\-public\-access.html\). Similarly, Tiros is now the basis of Amazon’s VPC Reachablity Analyzer feature (see https://docs.aws.amazon.com/vpc/latest/reachability/what-is-reachability-analyzer.html) as well as features in Amazon Inspector. These AWS features have helped customers avert security breaches.

4. Details of the impact

Cloud computing is fast becoming the standard computational and storage platform for businesses and government organisations around the world. Amazon’s AWS is the industry’s biggest cloud provider. Amazon approached Professor Byron Cook in 2014 to develop correctness-proving tools for AWS security through automated reasoning ( S1). In response Byron started the AWS Automated Reasoning Group. Cook’s research collaboration with Amazon ( R1 – R6) transformed business practices for AWS ( S2 – S7), and improved customer experience, security and trust for AWS’s over 1,000,000 business customers in 190 countries ( S2, S7).

New cloud security tools transform Amazon’s business practice

Amazon used Cook’s automated reasoning research ( R1 – R4) to fortify critical security applications. Cook’s CTL* research ( R1, R2) led him to develop two proofs for AWS foundations: OS/virtualization (specifically, the memory safety of boot loaders used in data centres in R3) and the correctness of Amazon’s cryptography (in particular, the implementation of the TLS protocol, s2n in R4). Amazon applies Cook’s techniques to continuously prove correctness during code development. Thus, in the example of s2n, proof is used to continuously protect the encryption used by S3’s 1,100,000 requests per second ( S2).

Senior stakeholders at Amazon attest to the importance of Cook’s techniques in how AWS functions. For instance, AWS Security VP confirmed: “Every time a developer commits code, that proof gets run again.” He added that if someone “breaks that proof, we don’t find out about it years later… [W]e find out about it during the build” ( S3). The AWS Chief Technical Officer stated: “AWS has a (not so) secret weapon that helps protect us and our customers—automated reasoning.” Furthermore, “[w]e apply provable security to our infrastructure [to] achieve the highest levels of security while our services rapidly grow” ( S3).

Cook’s formal verification infrastructure ( R4) further ensures that s2n automatically checks properties. This typically eliminates the need for developers to understand or modify the proof following code modifications. An s2n architect and VP attests: “This verification…is built into our public GitHub builds… [and tests] every change…confirming that the tools reject [errors]. [N]o changes to the s2n code were necessary to support the proof” ( S3).

Cook’s new developments transformed Amazon’s cloud security offerings, creating a higher security standard for AWS and their 1,000,000 plus daily business customers, including the BBC, the Ministry of Justice, the Met Office, Europol, National Rail and Vodafone Goup Plc. ( S3). The AWS CTO stated: “With previous tools, auditors could not evaluate all of the code in all configurations [or] evaluate where keys were used. With automated reasoning, millions of customers can use a proof to examine the entire system for a certain value. This creates a higher standard for security beyond today's advanced control measures, such as automated controls, preventive controls, or detective controls” ( S3). AWS Identity VP stated: “We are excited to have these assets and talent in AWS, and to make it available to all builders and AWS customers” ( S3).

Enhanced security tools improve AWS customer experience and trust

Customers are responsible for security within their cloud-based applications ( S3). In some instances, customers have found it hard to get these details right. For example, in 2017, 7% of S3 customers allowed unrestricted public access to buckets; among the consequences were the leaking of over 198,000,000 American voters’ names and addresses ( S4). Professor Cook’s work on Tiros and Zelkova ( R5, R6) formed the basis of new customer-facing features that AWS launched to help customers spot mistakes before they go live, so customers can confidently deploy sensitive workloads ( S4). Features include IAM Access Analyzer, S3 Block Public Access, VPC Reachability Analyzer, Amazon Inspector, Amazon Macie, and AWS Managed Config Rules. Customers who have specifically mentioned relying on these tools include global investment firm Goldman Sachs and their 3,000,000 customers, and Bridgewater Associates, the world’s largest hedge fund ( S7).

Enterprise-level customers have stated that Cook’s enhanced security protection features benefitted them by supporting overall compliance and risk policies, anti-malware, and threat detection:

“We’re closer to the developer, have a faster feedback loop, and they can still be agile in their infrastructure development while maintaining the best security standards.” – Goldman Sachs Developer ( S7)

“Bridgewater uses Zelkova to verify and provide assurances that our policies do not allow data exfiltration, misconfigurations, and many other malicious and accidental undesirable behaviors.” – Bridgewater Senior Software Developer ( S7)

“Coinbase is one of the most widely used bitcoin wallet and exchange companies. Amazon Inspector is helping companies like ours embrace the immutable future and pull our industry out of the security dark ages.” – Coinbase Director ( S7)

“We [are placing] 80% of our IT resources in the Cloud. Amazon Inspector is a great example of how AWS is accelerating investment in security-focused services… and a highly scalable, API driven security service that we can place throughout our cloud operations.” – University of Notre Dame IT Senior Director ( S7)

Without needing security development skills, millions of AWS customers use automated reasoning thousands of times per minute to defend against security leaks through tools such as S3 Block Public Access. The AWS ‘Chief Evangelist’ said, “[S3BPA] is designed to be easy to use, and can be accessed from the S3 Console, the CLI, the S3 APIs, and from within CloudFormation templates” ( S3). The AWS CEO agreed, stating: “S3 is the only object store that allows you to analyse all the access permissions on all your objects and buckets with… IAM Access Analyzer” ( S5).

Human-centred security policy

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Societal
Is this case study continued from a case study submitted in 2014?: Yes

Download case study PDF

1. Summary of the impact

Professor Sasse and colleagues’ influential, user-centric research on cybersecurity has informed security thinking in both government and corporate domains in the UK and globally. This work has shaped revision of official, nation-wide Government guidance from the UK National Cyber Security Centre (NCSC) on how to manage passwords more sustainably without compromising users’ security. This user-centric perspective has also informed guidance targeted at smaller organisations such as businesses, charities, and home users. [TEXT REMOVED FOR PUBLICATION].

2. Underpinning research

Professor Sasse and Dr. Parkin’s research at UCL explored factors that can influence peoples’ behaviours around information security controls and policies, and the role that these behaviours play in the continuing productivity of employees.

Research published in 2003, analysing system logs of login attempts for hundreds of users, showed users struggle to manage an increasing number of passwords ( R1). The research suggested re-considering the “3-strikes” policy commonly applied to password login systems as an immediate way of reducing this demand. They found that not having to change a password reduces the mental load on users and increasing the number of login attempts to 10 reduces the time taken away from, and interference caused with, users’ production tasks.

In 2008, Professor Sasse and team developed the compliance budget concept, which explains how friction between information security and business process reduces both security compliance and personal and organizational productivity ( R2). The user’s ability to comply – the “compliance budget” – is limited and needs to be managed like any other finite corporate resource. The compliance budget concept includes ways to improve secure working, including designing less user-costly technologies and improving awareness support.

Through interviews with approximately 100 employees in each of the two large organizations, the research team identified how employees may create user-centric balance of security and productivity when workable institutional support is not provided, as ‘shadow security’ practices emerge ( R3); this was also an opportunity for organizations to learn from challenges that employees manage in trying to achieve secure working practice.

Through comparison between existing and emerging technologies, UCL researchers were able to identify how individuals weigh up alternative approaches to security tasks against the context and goals of what they were trying to achieve in a primary task ( R4, R5). The case study paints a picture of chronic ‘authentication fatigue’ resulting from current policies and mechanisms, and the negative impact on staff productivity and morale ( R4). Another methodological approach developed in the project included informing methods for understanding the role of security technologies in peoples’ lives, and the team also assessed emerging biometrics technologies (facial liveness detection) as part of representative everyday tasks ( R5). This approach also complemented methods developed to inform top-level decisions about the adoption of usable technologies.

Associated research considers these challenges from a psychological perspective, starting from the premise that understanding how people perceive risks is critical to creating effective awareness campaigns ( R6). Changing behaviour requires more than providing information about risks and reactive behaviours; rather, people must first be able to understand and apply the advice, and secondly, they must be motivated and willing to do so—and the latter requires changes to attitudes and intentions.

4. Details of the impact

This research contributed significantly to the evidence base for two influential pieces of Government and business guidance: firstly, the 2015 GCHQ/NCSC Password Guidance for UK organisations and secondly, the “Awareness is Only the First Step” business whitepaper. These documents superseded previous inferior guidance and offered both business and individual better ways of staying secure. Following on from this policy impact, this research was picked up by iProov and OutThink, two top UK security and IT firms, whose products were not only influenced by Sasse’s research but also both appointed her their Chief Scientific Advisor.

Influence on Government guidance: 2015 GCHQ/NCSC Password Guidance

Findings from R1 and R4 reviewing the ‘3 strikes’ policies have informed the GCHQ/NCSC Password Guidance to UK organisations published in 2015 ( S1, S2). This led to a change in thinking, putting the users of technology in organisations first, and identifying practical ways to achieve productivity and security at the same time (for instance, directly advocating recommendations from R1 be put into practice). For instance, a testimonial from the NCSC stated, “The UCL password ‘Ten strikes and you’re out’ and ‘Great authentication fatigue’ research provided evidence and encouragement for the redevelopment of top-level password guidance which is intended as advice for large and small UK businesses and charities to follow, to more effectively consider the end-user in the management of security in organisations (Specifically, emboldening efforts to move away from a three-attempts ‘anchor’ that would otherwise prevail)” ( S3).

In addition, this research ( R1, R2, R3) have led to increased capacity in sociotechnical security among the ‘Five Eyes’ international intelligence alliance comprising Australia, Canada, New Zealand, UK and the US, as evidenced by further testimony from NCSC: “The ‘Ten strikes and you’re out’, ‘Compliance budget’, and ‘Shadow security’ research (and its inclusion in the ‘Awareness is only the first step’ whitepaper) fed into the evidence base which the sociotechnical security team (in CESG and then NCSC) referred to as a starting point when developing a user-centred perspective on security and leading in this direction among the Five Eyes nations” ( S3).

This approach prevented negative impacts upon users. For instance, Bill Burr (the originator of many of the previous rules about password creation) issued an ‘apology’ in 2017 for those rules and their impact on users, admitting that his 2003 manual was "barking up the wrong tree." He had advised, for instance, that users change their password every 90 days ( S4). When the revised password guidance was issued, the head of NCSC pointed to Sasse’s analysis of Burr’s guidelines, noting that, “That’s why we changed the unworkable password guidance, which Professor Sasse calculated was the equivalent of remembering a new 600-digit number every month” ( S5).

Influence on business policy with “Awareness is Only the First Step” whitepaper

Outputs from the compliance budget and shadow security papers ( R2 and R3) were used to inform a business whitepaper, “Awareness is Only the First Step”, with HP Enterprise (with oversight from CESG) co-authored by Professor Sasse and Dr Simon Parkin ( S6). As evidenced by NCSC testimony: “UCL research into the ‘Compliance budget’, and later the ‘Shadow security’ (and the ‘Awareness is only the first step’ whitepaper resource which is distributed to enterprises, and incorporates principles from these pieces of research) provided evidence and heuristics upon which the You Shape Security advice collection was based, at least in part (as well as their inclusion as resources for further reading for practitioners)” ( S3).

The You Shape Security collection (published early 2019) is the main sociotechnical advice collection provided by the NCSC (and prior to that, CESG), among many which involve how UK organisations manage security for their members . The Deputy Director of the National Technical Authority for Information Assurance described how the whitepaper intended to create organisational change across the UK: "At CESG, we advise both organisations and Government on the challenges that their security practitioners face when it comes to security awareness. With this whitepaper we hope to give them a refreshing new way to approach the challenge of involving employees in order to create a more secure organisation, instead of simply implementing a one-size-fits-all approach" ( S7). The whitepaper has been viewed online 6706 times between August 2017 and December 2020 ( S8).

This advice has been extensively consulted, as can be seen from the unique pageviews from Jan 1 2018 to Dec 31 2020 for Password Policy, You Shape Security, and then the top 5 password-related blogs:

Password policy: updating your approach = 48967
You Shape Security = 5188
Three random words or think random = 62272
Passwords, passwords everywhere = 48820
What does the NCSC think of password managers = 37274
Let them paste passwords = 26704
The problems with forcing regular password expiry = 14416

These webpages had an average time of 2 minutes 54 seconds spent on each page ( S8).

[TEXT REMOVED FOR PUBLICATION]

Improving Data Transfer with Multipath Transmission Control Protocols

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Technological
Is this case study continued from a case study submitted in 2014?: Yes

Download case study PDF

1. Summary of the impact

Research at UCL pioneered the suite of Multipath Transmission Control Protocols (MPTCP). These protocols improve the efficiency of data transfer across the Internet when compared to the classic suite of transfer protocols “TCP/IP”. The UCL team demonstrated the improved effectiveness of these protocols which has led to multiple commercial deployments, either by Internet Service Providers or by companies such as phone handset suppliers. In addition, UCL research has been incorporated into Operating Systems such as Apple’s iOS and Linux and it is now shaping the design of 5G cellular networks. As a result of UCL research, millions of users benefit every day from smoother transitions between different Internet connections ensuring that online services have a much higher performance and reliability.

2. Underpinning research

The Transmission Control Protocol, which accounts for the vast majority of all Internet traffic, allows users to transfer packets of data between two IP addresses. The TCP/IP “suite” of protocols for data transfer is fundamental to the architecture of the Internet. However, since the development of devices able to connect in multiple ways, such as phones that can transmit over Wi-Fi or cellular networks, the use of a single pathway for transmission has limited improvements in transfer speed and efficiency.

Research into Multipath Transmission Control Protocols (MPTCP) by the team at UCL has helped to address this issue. They have developed ways to transfer files between two IP addresses across multiple “subflows”. These subflows allow for a device to move between connections more smoothly or even to use multiple paths simultaneously. Previously, when a phone, for example, disconnected from a Wi-Fi network, it would not have simultaneously been using its cellular signal to transmit data; this causes an interruption to the transmission, sometimes impeding the function of apps. The same interruption could also happen when a new Wi-Fi connection was joined. Now, instead, a phone using MPTCP can transmit data using both pathways. The data transfer can now also migrate over to whichever pathway is more efficient, resulting in a better experience for the user.

Research by Professor Handley and his team at UCL was crucial to the development of these new protocols. In a 2011 paper, in collaboration with other scholars, the team at UCL demonstrated the challenges which would need to be overcome to enable MPTCP to be attached on to the existing Internet architecture ( R1). This research looked at the behaviour of middleboxes (networking devices which filter or inspect data packets). By studying 142 access networks in 24 countries, they showed that issues with middlebox behaviour would interfere with data transmission in ways that would limit the extension of TCP. The measurement results in this paper guided the design of MPTCP.

The team at UCL also developed an algorithm to optimize MPTCP, preventing congestion in the transfer of data ( R2). Implementing the algorithm in Linux, they were able to demonstrate the increased efficiency of MPTCP as compared to TCP. They produced further work on the principles for functioning MPTCP, taking into account the various challenges to its implementation ( R3). They also demonstrated the efficiency gains of MPTCP in datacentres ( R4). Because datacentres have multiple pathways to transmit within their stacks, they can transmit data using MPTCP, allowing for speedier transfer across multiple routes. By running MPTCP on Amazon’s cloud computing service, Amazon EC2, they were able to demonstrate that MPTCP outperforms TCP by a factor of three.

The results of this work were taken up by the Internet Engineering Task Force (IETF, the main standards organization for the Internet). A working group was formed to standardize MPTCP. The team at UCL authored three Request for Comments (RFCs). These are standards documents for the IETF, defining MPTCP and establishing guidelines for its use ( R5, R6). These guidelines have been promoted from the experimental track at the IETF to the standardisation track. This means that individuals who are both developing products and designing deployments commercially, can use MPTCP to help shape the future of the Internet.

4. Details of the impact

As a result of UCL research, its widespread dissemination (over 3,000 citations), and its uptake by the IETF, Prof Handley and team have helped to ensure a faster, smoother, more connected Internet globally. MPTCP is of growing importance to the telecoms industry and its customers. A search on Google patents for “MPTCP” currently lists 1,380 patents granted, with many more applied for. The team’s work has informed the practices of many technology companies, and improved the experiences of millions of users:

Operating systems, such as Apple’s iOS and Linux, have incorporated MPTCP.
The team at UCL has been involved in collaborations with service providers and handset makers, such as Korean Telecom and Samsung, helping to provide faster Internet.
The research has allowed companies such as Tessares to provide more efficient Wi-Fi services in rural areas.
MPTCP is also an important feature of 5G networks, helping to improve connectivity around the world.

In 2019, Mark Handley was awarded a Sigcomm award “For fundamental contributions to Internet multimedia, multicast, congestion control, and multi-path networks, and the standardization of Internet protocols in these domains” ( S1). His work has had a vital impact on the Internet’s ongoing transformation.

Impact on Apple iOS and its users

In 2013, Apple incorporated MPTCP in iOS 7, shipping it from Linux into its own software, after consulting UCL research published in ( R6). Their initial use case was the voice activated Siri assistant ( S2). Siri sends audio samples from the user’s query to Apple’s servers and returns the answer. It is crucial that this exchange happens without delay, but an iPhone cannot know in advance whether its Wi-Fi link or its cellular link will work best. MPTCP allows iOS to try both. The phone can then rapidly migrate to the link that works best at that instant. According to Apple, MPTCP reduces the delay before the first word of the response by 20% and eliminates 80% of failed connections. Based on their experience with Siri, Apple introduced Wi-Fi Assist in iOS 9 (released on 16 September 2015), which allowed apps such as Safari to benefit from the transition between subflows thanks to MPTCP. Apple enabled MPTCP for all applications on iOS 11, meaning app designers can now make use of MPTCP to optimise their functionality. As of 2019, Apple has incorporated MPTCP into their Apple Maps application and their streaming service ( S2). Apple’s Networking Architect Christoph Paasch says that, as a result, users “ are seeing much less music streaming stalls” ( S2). Apple also enabled MPTCP on MacOS from MacOS 10.10 onwards. According to Apple, over 800 million devices run an MPTCP-capable version of iOS or MacOS, meaning all users of those devices now benefit from the improved data transfer enabled by MPTCP.

Impact of MPTCP on other telecoms services

Linux Kernel: MPTCP is now incorporated into the mainline Linux kernel (as of release 5.8, in August 2020), so is available for use in Android and Linux servers. Much of this work was performed by Intel ( S3).

Korean Telecom and Samsung: In 2015, Korean Telecom (KT) teamed up with Samsung to launch a commercial MPTCP proxy service called Giga LTE. Giga LTE makes use of MPTCP to dramatically improve wireless performance. Samsung shipped an Android port of the Linux MPTCP implementation on Samsung S6 and S6 Edge devices sold in Korea. KT deployed proxy servers in their production network using the Linux MPTCP implementation. Together, these can be used to harness the power of Wi-Fi and phone signals at the same time. When a user selects “Giga LTE” on their phone, MPTCP is enabled for all applications. Connections then use both LTE and Wi-Fi to connect to servers on the Internet. Because of the MPTCP congestion control algorithm, rather than simply splitting dataflows equally between the two pathways, the Samsung phones can use all the available power. KT reported in 2015 that this could improve performance significantly. Their published results are included in the news article about Giga LTE ( S4).

Tessares: Tessares, based in Belgium, is pursuing the proxy approach to deploying MPTCP, informed by the research of the UCL team ( S5). Their target market is home or small business networks. In some areas, especially in rural locations, Internet services are delivered by ADSL at speeds slower than can be achieved over the faster LTE service, but LTE bandwidth may be scarce at busy times, and is often much more expensive. By deploying MPTCP proxies in the customer’s home or office router and in the telecom’s network, MPTCP can preferentially use ADSL, but when high bandwidth is required and LTE capacity is available, MPTCP can use both networks simultaneously to improve performance. Proximus, Belgium’s largest mobile provider, deployed Tessares MPTCP solution in 2016 in Frasnes-Lez-Anvaing, and (as of 2017) is planning a larger scale pilot ( S5). A similar approach has been used by Tessares in collaboration with Croatia’s Hrvatski Telecom, to bring faster Internet services to users in rural Croatia. Standards for MPTCP proxies are now being developed in the IETF ( S5).

Impact on the development of 5G

MPTCP is now in the main 5G network system architecture specification, 3GPP TS 23.501, released in 2016 ( S6) . This specification, issued by the overarching standards body for telecommunications protocols, the 3GPP, governs the design and implementation of 5G around the world. The new generation of mobile data networks will use MPTCP to ensure low latency (i.e. minimal delays in the transfer of data). The 3GPP has named the technology ATSSS (Access, Traffic Steering, Switch and Splitting). ATSSS was defined in collaboration with various telecoms companies, including KT, Apple, Deutsche Telekom, Orange, and Cisco. ATSSS will allow for seamless transition between 5G and Wi-Fi, or the use of both 5G and Wi-Fi to maximise speeds. MPTCP “ will play a key role in efficiently combining Wi-Fi and 5G” ( S7).

KT announced on 28 August, 2019 that it had completed the world’s first “5G low latency multi-radio access technology” test in a 5G commercial network in collaboration with Tessares. According to Tessares, “ *ATSSS technology reduces the initial session setup time to achieve 5G ultra-low latency in a multi-radio context, resulting in a setup delay of less than half compared to previous approaches” ( S8). The use of ATSSS has been described as a key “ differentiator”, one of the crucial differences between 5G and previous generations of cellular networks ( S8).

In addition, MPTCP informs other solutions within 5G architecture. As the IETF Network working group pointed out in 2018: “One of the key features of 5G…is dual connectivity (DC). With DC, a 5G device can be served by two different base stations. DC will play an essential role in leveraging the benefit of 5G…MPTCP could be integrated with DC and the 5G protocol stack ( S7). The research into MPTCP by the UCL team has changed networking architecture and will continue to do so during the 5G rollout.

Modelling infectious diseases using web search data

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Health
Is this case study continued from a case study submitted in 2014?: No

Download case study PDF

1. Summary of the impact

The impact of UCL’s research on digital epidemiology based on Web search and Twitter data has (i) contributed to the introduction of a national influenza vaccine for children (approximately 5,000,000 per annum), which is estimated to reduce prevalence of influenza in the general population by 20%; (ii) been adopted by PHE as part of the weekly influenza reports, which are used to determine the start and duration of the annual influenza epidemic and hence, underpinned the recommendation for the commencement of prescribing antivirals to those at risk; (iii) been incorporated in PHE’s publicly available COVID-19 surveillance reports, which informed the decision making process with regards to COVID19 national policy; and (iv) supported NHS UTLA authorities’ regional, tiered response to the pandemic distribution.

2. Underpinning research

Between July 2014 and February 2015, Lampos led a research project at UCL, in collaboration with Google, to help improve machine learning methods for estimating influenza-like illness (ILI) rates from Web search data ( R1). The UCL-led research was the first to show why Google Flu Trends over or under-estimated ILI rates. The paper ( R1) went on to propose nonlinear solutions for this machine learning task given the evident nonlinearities in the data. At the time, the proposed solutions provided state-of-the-art performance.

Lampos and Cox subsequently improved feature (search query) selection within this context by using developments in statistical natural language processing ( R2). That increased the accuracy of ILI rate estimates by between 12% and 28%.

In collaboration with Public Health England (PHE), Lampos and Cox assessed the accuracy of these models with respect to a number of epidemiological indicators, for example, magnitude and date of peak influenza prevalence, rather than common regression accuracy metrics such as mean square error ( R3). This study corroborated their previous findings, reaffirming the potential value of using such approaches as complementary syndromic surveillance indicators.

To mitigate against situations where either sparse data is available (eg: due to an outage or inaccuracy of a syndromic surveillance system) or only a few training examples exist (eg: due to the lack of a comprehensive way to monitor an infectious disease during previous circulations), Lampos and Cox led a research project that proposed multi-task learning approaches for modelling ILI from web search data at national and subnational level ( R4). This was also the first attempt to train models for two different countries jointly. Their models were able to improve accuracy by up to 40% when considering a 1-year long training period.

Lampos and Cox also led the development of a transfer learning approach for mapping a model for ILI based on web searches from one country, where historical ILI rates are available, to a country that has little or no such data, ie, where no comprehensive health surveillance system is in place, such as in low- and middle-income countries (LMIC) ( R5). A variation of this approach was deployed during the first wave of the COVID-19 pandemic to map models from Italy, which had been exposed to a high circulation of COVID-19, to other countries that were in earlier phases of their local epidemics.

Finally, in collaboration with PHE and Microsoft Research, Lampos and Cox led a research project to estimate, for the first time, the effectiveness of a flu vaccination programme using social media and Web search data ( R6). PHE initiated a pilot live attenuated influenza vaccine (LAIV) programme for school age children in seven geographically discrete areas in England during the 2013/14 influenza season. An analysis based on data from conventional syndromic surveillance systems did not yield statistically significant outcomes for the impact of this vaccination programme. Lampos and Cox then constructed models for the prevalence of ILI based on social media and web search activity in pilot (vaccinated) and control (unvaccinated) regions. The control regions were subsequently used to predict the prevalence of influenza in the pilot regions in the absence of the vaccine. The difference between this prediction and the estimated ILI prevalence in pilot regions was used to infer vaccine effectiveness. Estimates of effectiveness were strongly positive and statistically significant, corroborating the non-statistically significant indicators of prior work. This analysis was repeated for the 2014/15 flu season, yielding similar outcomes.

4. Details of the impact

Prior to COVID-19, the UK Government Cabinet Office Risk Register identified pandemic influenza as highly likely to occur and to have the greatest adverse impact on the country. Lampos and Cox led the development of an online tool, Flu Detector (also known as “i-senseFlu”) that displays estimates of flu rates for England based on web search data on a daily basis. The output from Flu Detector was incorporated in PHE’s suite of syndromic surveillance methods during the 2017/18 influenza season for the first time and has been used by PHE in all subsequent years. To the best of our knowledge, this is the first system of its kind that has been formally adopted by a national health agency worldwide. Estimates from this tool have been included in PHE’s weekly reports on influenza. Data in these reports are used to determine the timing and duration of an influenza epidemic and associated health recommendations, eg commencement of prescription antiviral drugs to the elderly. In a paper jointly authored with PHE, Lampos and Cox demonstrated that surveillance based on Web searches could give an earlier indicator (one to two weeks) of the onset of an influenza epidemic or pandemic ( R5) compared to a traditional network of sentinel doctors (coordinated by the Royal College of General Practitioners) that report the fraction of patients presenting at practices with influenza-like illness.

According to PHE, “ (Flu Detector) provides a vital early indicator of changes in influenza activity in the community. This was vital, for example, in the 2019-20 influenza season when activity began earlier than usual and Flu Detector was one of the first systems to detect this. This resulted in an early Chief Medical Officer (CMO) instruction to allow prescribing of antivirals in the community” ( S1).

Children are a major vector (spreader) of influenza in the community. PHE’s models of the spread of influenza suggested that vaccination of children would lead to about a 20% reduction in influenza in the (unvaccinated) community. In collaboration with PHE, Lampos and Cox’s assessment in ( R2) supported this hypothesis. A WHO expert now leading the European response to COVID-19 and formerly leading the Influenza and Other Respiratory Pathogens team at PHE writes: “ Our traditional surveillance metrics and these novel indicators based on non-traditional epidemiological data were used as evidence for introducing an influenza vaccination programme across all primary schools in the UK for years 2-7. *Influenza vaccinations for children became a national policy in 2015/16. NHS England currently recommends and offers a free influenza vaccine for all children between 2 and 7 years of age (over 5,000,000 children).*” ( S2) PHE has estimated that this reduces the prevalence of influenza in the general population (not just vaccinated children) by 20% ( S3).

During the first wave of the COVID-19 epidemic in the UK, PHE requested that the UCL team construct a model for COVID-19 based on their previous work on influenza. Lampos led the development of this model which exploits many of the ideas developed for influenza surveillance together with several novel and significant differences. A Consultant Epidemiologist at Public Health England (PHE) and lead of the COVID-19 surveillance cell writes that “ This system … provided an essential surveillance system for early indication of changes in COVID-19 activity. This was particularly notable at the start of the pandemic when the COVID-19 Google search surveillance system gave us one of our earliest indicators that the national lockdown was successfully reducing COVID-19 activity” ( S1).

The estimates of this model are sent to PHE on a weekly basis and are included in their publicly available COVID-19 surveillance report. The former digital surveillance lead for COVID-19 at PHE, writes these reports were “ used by policy makers at the national level to make decisions on outbreak management policy. In addition to appearing in a written form, the data is presented and discussed in a range of national level situational reports meetings attended by senior staff from Public Health England and the Department of Health and has undoubtedly informed the decision making process with regards to COVID19 national policy” ( S4).

In addition, the data provided by Lampos and Cox in collaboration with Microsoft “ contributed to the provision and interpretation of localised (at Local Authority level) COVID19-related search engine data (using Microsoft Bing data) provided to Public health England regional teams, which supported the detection of localised COVID19 clusters. This has become increasingly important as a tier-based approach to COVID management, based on the local epidemiology has been adopted in England” ( S4).

Sapienz deployment at Facebook

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Technological
Is this case study continued from a case study submitted in 2014?: No

Download case study PDF

1. Summary of the impact

Research at UCL led by Professor Mark Harman since 2010 has revolutionised the way that “bugs” are identified in mobile Apps, improving the everyday experience of mobile phone Apps for millions of users. The research led to a spin-out company, Majicke, to commercialise the testing tool designed and developed by Harman, Jia and Mao – called Sapienz. – which was then acquired by Facebook in 2017. Since then, Harman, Jie and Mao started working at Facebook, where Harman founded the Facebook Sapienz team. The Facebook Sapienz team deployed the Sapienz tool into Facebook’s infrastructure, where it was applied at the largest scales experienced in the software industry by any software testing technology. The Sapienz tool remains in full deployment at Facebook where it directly impact on the user experience of 2,600,000,000 people every day who rely on the communications and social networking apps it tests, such as Facebook, Instagram, Messenger and WhatsApp.

2. Underpinning research

Mark Harman co-founded the research field of Search Based Software Engineering (SBSE), an engineering approach—now widely studied across the software sector—which applies metaheuristic search techniques to software engineering problems ( R1). Of particular relevance is his research on a subset of this discipline: Search Based Software Testing (SBST), which concerns software testing and uses computational search techniques to tackle software engineering problems involving large, complex search spaces.

In this approach, test objectives find natural counterparts as the fitness functions used by SBSE to guide automated search, thereby facilitating SBSE formulations of many diverse testing problems. As a result, SBST has proved to be a widely applicable and effective way of generating test data, in addition to optimising the testing process ( R2).

The approach to search-based testing developed by Harman used a novel multi-objective approach to testing that includes two important innovations: firstly, it minimises the size of test cases used in the approach, and secondly, it simultaneously maximises the coverage achieved by the test cases. The first of these innovations—minimising test case size in comparison to other approaches—is important because it maximises the actionability of any faults found in the process (as shorter fault-revealing tests are easier to debug). The second of these innovations (maximising coverage) is significant because it elevates the number of faults that can be discovered. This work built on Harman’s long-standing advocacy for Pareto-optimal approaches being well-suited to software engineering problems, and here used a multi-objective formulation and the well-known NSGA-II algorithms to find Pareto-optimal solutions. This suitability is based on the observation that most software engineering measurements (including all those that later turned out to be relevant to Sapienz) are ordinal scale measurements, and this makes weighted approaches to multi-objective optimisation inappropriate.

In 2015, Harman began working at UCL with former PhD student and current Associate Professor, Yue Jia, and new PhD student Ke Mao (supervised by Harman and Jia) on the problem of applying Search Based Software Engineering to the automated generation of fault-revealing test cases for Android apps to improve their functionality. Based on this research, the same UCL team developed the prototype search-based testing tool, Sapienz, which was released as open source (and which has been further developed by other research teams).

In 2016, Harman and the team published the algorithm and approach to search-based testing at the first-tier software testing conference, ISSTA ( R3). The Sapienz tool was the first technology to target simultaneously both of the competing and conflicting objectives of test size and coverage and has subsequently been taken up by Facebook.

4. Details of the impact

Research at UCL, led by Professor Mark Harman has revolutionised the way that software bugs are identified in mobile Apps. The UCL team’s software – Sapienz – has been deployed by Facebook and is improving the user experience of more than 2,600,000,000 users of Facebook, Instagram, Messenger and Whatsapp every day.

During a keynote presentation at the International Conference on Software Testing (ICST) in April 2015, Harman set out a vision for search-based testing ( S1). The UCL team developed a software tool – Sapienz – that uses SBSE to automatically design tests that reveal faults. In September 2016, Harman, Mao and Jia co-founded the start-up company Majicke ltd to ensure their search-based testing methods and their ground-breaking Sapienz testing tool would be applied more widely ( S2).

In February 2017, Harman, Mao and Jia took up full-time positions at Facebook ( S3), founding a new Facebook Sapienz team within the company’s Developer Infrastructure organisation ( S3, S4). At the same time, Harman continued his research and development work at UCL, supported by the European Research Council advance fellowship grant EPIC (Evolutionary Program Improvement Collaborators; ERC grant no.741278), for which he is Principal Investigator.

The first prototype of Sapienz technology was deployed at Facebook in September 2017 and used to test continuous master builds of the main Facebook app and the workplace application ( S5). The research prototype reported on at ISSTA in 2016 had already found 558 bugs in the top 1,000 apps ( S2). When the Facebook Sapienz team was established in 2017 it deployed the Sapienz tool into the Facebook infrastructure and the tool went on to find thousands more bugs ( S5). Tested apps now include the key members of Facebook’s family of apps, including Facebook, Instagram, Messenger and WhatsApp, which, in December 2020, had approximately 2,600,000,000 users each day ( S6).

By February 2018, the prototype had been extended to handle continuous testing of each and every change submitted by developers to the central code repository, rather than simply continuously testing master builds ( S5). In 2018, the Facebook Sapienz team reported that over 700 bugs had already been found and fixed by developers by January 2018 ( S5). By 2018, the Facebook Sapienz team had grown from three to eight staff members ( S5).

The work also generated significant media attention. For example, a blog post that the Sapienz team wrote went viral, and it was covered by SD Times, CNET, SiliconANGLE (further picked up by SlashDot), Startup World, The Register, TechCrunch (further picked up by the Verge), ZDNet, The Next Web (further picked up by Wheaton Business Journal), Fossbytes, and JAXenter. It was also the basis of a high profile Forbes article ( S7). In addition to this media attention, members of the Facebook Sapienz team have also given public lectures on the impact of their work ( S8).

At Facebook, Harman collaborated with Peter O’Hearn to launch two calls for funding to further develop the testing and verification research agenda, supported by funding from the Facebook Research Operations and Academic Relations (ROAR) team ( S9). Through Facebook, Harman and O’Hearn ran three successful symposia, drawing industry and academia together on testing and verification ( S10).

In May 2019, in recognition of this impact, Harman received both the IEEE Harlan Mills award and the ACM SIGSOFT Outstanding Research Award — the first time in 20 years that both these awards had been simultaneously given to the same researcher. This was partly in recognition of Harman’s co-founding of the field of Search Based Software Engineering itself, and partly a reflection of the impact that this research work has had, at Facebook and elsewhere.

In July 2019, Harman was invited to give the opening keynote at the International Symposium on Software Testing and Analysis (ISSTA 2019), on the deployment of search-based software engineering research at Facebook; three years after the publication of the underpinning research at ISSTA 2016. The General Chair of ISSTA 2019, said: “Mark received the IEEE Computer Society’s 2019 Harlan D. Mills Award for his fundamental contributions throughout software engineering, most notably on his seminal contributions in establishing search-based software engineering. In recent years, Mark led the Sapienz team in Facebook to deploy Sapienz to continuously test Facebook’s suite of Android and iOS apps, which has made significant impact in practice. Mark’s great accomplishments in both research and practice made him a perfect keynote speaker at ISSTA 2019, the flagship conference of software testing in the software engineering community” ( S11).

Similarly, the Chair of Committee, IEEE (Harlan Mills Award) said: ”[o]ne of the key reasons Professor Mark Harman received the Harlan Mills award in 2019 was his successful deployment of the Sapienz technology, resulting from many years of research at UCL, into a Facebook technology enabling efficient automated testing” ( S11).

Synthesia: cheaper and more accessible presenter-led videos

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Technological
Is this case study continued from a case study submitted in 2014?: No

Download case study PDF

1. Summary of the impact

Advances from the 3D Vision team at UCL, led by Prof. Agapito, have enabled new ways to synthesise video of photorealistic human faces in speech. This technology has been commercialised by Synthesia, a spinout co-founded by Agapito in 2017, via the launch of services with personalised and localised AI presenters that fit within professional content creation pipelines, as well as products including automatic text-to-video synthesis with a framework for ethical control. Synthesia has rapidly grown to be one of the top UK AI companies in terms of investment, revenue, and customer base, serving large companies such as Facebook, Google, Fedex and Tesco across a wide range of sectors. [TEXT REMOVED FOR PUBLICATION]. As a result of using Synthesia’s product, the high-profile campaign Malaria No More (featuring David Beckham) has raised USD14,000,000,000 for the cause.

2. Underpinning research

Synthesising photorealistic, expressive human faces in speech has been a long-standing challenge in computer vision and graphics. For decades, this technology has been the exclusive domain of the film and TV industries, with multi-million budgets needed to build specialised and complex multi-camera 3D capture studios to create digital 3D doubles of humans, and for manual post-production by visual effects artists. While the recent emergence of deep fake technology, based on 2D generative adversarial networks (GANs), has enabled easy creation of videos of talking people, bypassing the 3D capture process comes at the cost of a lack of any form of explicit control or 3D interpretability over the synthesis.

For over 15 years, Agapito’s team has been at the forefront of research in non-rigid 3D modelling from monocular video, a technology that has enabled to fully automate the process of capturing vivid 3D models of humans in motion, directly from videos captured casually with a single commodity camera, without the requirement for expensive studios, or hours of manual editing. Perhaps more importantly, these algorithms do not require expensive 3D supervision, or vast amounts of data, operating in a self-supervised fashion.

Agapito’s team pioneered the first algorithms to demonstrate full dense tracking and 3D reconstruction of deformable surfaces, such as human faces, from monocular sequences –

video clips captured with a single camera. Existing monocular algorithms were simplistic and severely limited to only handle a small set of sparse points. On the other hand, fully dense modelling of non-rigid surfaces had only been shown before for specialised multi-camera setups or depth cameras.

The next level to enable truly light-weight, low-cost, scalable, fast and accurate 3D capture of faces in speech was to enable frame-to-frame sequential operation and to couple tracking and 3D reconstruction into a single inference ( R1). This research ( R1) resulted in the first sequential method to simultaneously track and reconstruct deformable surfaces in motion directly from an input video at close to real-time operation. The innovation in ( R1) was to estimate 3D deformations directly from photometric consistency losses and resulted in the most accurate, fully automated method to reconstruct 3D models of non-rigid surfaces directly from a single video. Agapito’s team pushed this method further in ( R2) to model textures and changes in appearance due to deformations and illumination changes over time.

Beyond faces, Agapito’s team has also pioneered weakly supervised methods for 3D human pose estimation from single images that only require 2D image annotations ( R3) which are cheaper and easier to harvest than the 3D annotations required by other methods.

These breakthrough algorithms for monocular non-rigid 3D reconstruction by Agapito’s team at UCL ( R1-R3) form the underpinning technology that made 3D-driven, photorealistic and low-cost AI video synthesis finally possible and form an integral part of Synthesia’s technology. It is this ability to create photorealistic digital doubles of humans at scale, automatically and directly from casually captured videos (even from a mobile phone), that has enabled Synthesia to incorporate 3D reasoning into the synthesis process to provide the explicit control and interpretability that other 2D generative models (such as GANs) completely lack. In turn, this 3D reasoning and control are responsible for the high quality and photorealism of the synthesised videos that sets Synthesia apart from their competitors.

4. Details of the impact

The ability to model 3D faces in speech from video alone (described in R1-R3) eliminated the need for complex capture studios and 3D scanners, and became the cornerstone for generating ‘synthetic media’ such as AI-generated realistic, human-like avatars. Agapito co-founded Synthesia with other researchers and entrepreneurs to provide commercial solutions for a range of applications for this new technology, from lip-sync dubbing for content localisation to personalised video messages and corporate training. Synthesia technology allows users to create professional looking videos by simply typing a message, using an automated, 3D-driven AI process to synthesize photorealistic results that are indistinguishable from real video but without the need for cameras, actors or expensive film studios.

Agapito’s research has 1) enabled the commercial viability and rapid growth of Synthesia; 2) benefited Synthesia’s customers by providing new and more cost-effective services; and 3) increased public understanding of synthetic media through its high-profile work.

Impact on enabling the commercial viability and rapid growth of Synthesia

Regarding the impact of Agapito’s research on founding Synthesia, CEO and co-founder, says: “The algorithms proposed in her research for accurate 3D non-rigid shape estimation from video based on photo-consistency, were pivotal to the creation of Synthesia and now photometric tracking lies at the core of our technology. The ability to capture the 3D geometry and appearance of a human face in speech from a single video with algorithms building on Agapito’s research has been transformational in allowing us to build a low-cost solution to create high fidelity 3D avatars of humans for animation and synthesis” ( S1). It is this breakthrough in capturing 3D geometry from short video clips or even a single image which allows the low-cost, controllable synthesis that is so valuable in applications.

3D understanding means that the actions of the synthesized faces can be fully decoupled from the input clip, and controlled by smart software, to open up the new applications which are at the heart of Synthesia’s rapid growth. As the CEO continues: “By making it easier to reconstruct 3D photorealistic faces, the algorithms permitted Synthesia to provide new services to clients” ( S1). Synthesia has transitioned from offering high profile video-to-video services towards offering a “Software as a Service” platform where users can create videos simply by writing the speech of the digital ‘actor’. This technology has a wide range of applications, from corporate training to in-house communication to sales. As such, Synthesia’s services have been used by diverse clients, including Reuters, WPP, Dixa, Just Eat, Tesco, FedEx, Facebook and Google.

With this technology in place, Synthesia was extremely well placed to grow rapidly during the recent boom in the use of online services. Synthetic media has become a cheaper and quicker way to produce video, an advantage amplified by the Covid-19 pandemic. With studios and other facilities out of action, companies with needs in corporate communication, training or advertising used synthetic video generation for the first time. In an interview with TechRepublic, the CEO said, “Using AI, we've digitized the video production process and enabled our customers to create a video in 5 to 10 minutes, without the need for any cameras, actors or studios" ( S1).

During 2020, [TEXT REMOVED FOR PUBLICATION]. As a result of its world-leading technology, Forbes magazine named Synthesia one of its “fearless five” Tech companies ( S3).

Reshaping famous faces for global commercials and charity campaigns

Synthesia has provided the technology behind many of the most high-profile uses of photorealistic video face synthesis, as the ease of use and low cost of this technology were combined with the need for extremely high professional standards of quality and ethical use.

For example, the company completed a highly successful project for the food delivery company Just Eat. After filming a campaign with the rapper Snoop Dogg in 2020, the company wanted to extend it to its Australian subsidiary MenuLog. However, rerecording the advert with the new name would have been prohibitively expensive. By using Synthesia’s rendering, they were able to simply edit the original, reaching an entirely new audience without making a new advert, which created substantial savings for the firm. In addition to saving significant costs for the firm, this advert was crucial in helping MenuLog reach new audiences, and the campaign went on to receive over 10,000,000 views ( S4).

The technology was also crucial in a 2019 Malaria No More campaign that raised USD14,000,000,000 to help end the world’s three biggest preventable killer diseases: AIDS, Tuberculosis and malaria. It was used to make David Beckham speak 9 languages as part of a 2019 campaign video, and because this video localised the campaign to suit specific global audiences, it created 700,000,000 online impressions and resulted in the disease’s peak awareness in almost 3 years. The campaign attracted over 1,800 pieces of media coverage, resulting in internet searches for malaria reaching an all-time high ( S5). It won marketing awards, as well as an award for social good in AI, and was instrumental in winning commitments from world leaders to increase funding for fighting malaria. As one of the Malaria No More team said, “ This magic wouldn’t have been possible without the talented team at Synthesia” ( S5).

Cutting costs and increasing engagement for training videos and online assistants

Synthesia’s development of a highly automatic and scalable SaaS platform, which is key to its current growth, has enabled easy-to-use Text to Video services which customers can use to generate video of synthetic actors as easily as typing text.

For example, the multinational communications company WPP used Synthesia in 2020 to provide their corporate training videos in multiple languages, without having to reshoot using different actors and scripts. As WPP’s chief technology officer told Wired magazine, this saved them a considerable amount of money: “ A company-wide internal education campaign might require 20 different scripts for WPP’s global workforce, each costing tens of thousands of dollars to produce. With Synthesia we can have avatars that are diverse and speak your name and your agency and in your language and the whole thing can cost USD100,000” ( S6). The former Global Director of WPP said that “Synthesia allowed us to transform the way we think of training materials” ( S6).

Life Extension Europe, a nutritional supplement supplier, utilised Synthesia to improve its online sales. The success in this approach was reflected in the increased time that visitors spent on their webpages; for instance, with the new videos, the average session duration in the UK increased to 9 minutes 37 seconds from the previous average of 3 minutes and 36 seconds, an increase of +167%. Similarly, the average number of page views per session increased to 6.16 (versus 4.24 previously), an increase of 45.3%. The campaign recorded an additional 101 transactions ( S7).

Improving public understanding of and engagement with synthetic media

The final impact of Agapito’s research, via Synthesia, has been increased awareness of the positive potential of synthetic media. While there has been widespread concern about ‘deepfakes’ and the potential for misinformation, Synthesia has used Agapito’s research to demonstrate the many beneficial applications of synthetic media. Their work, and the code of ethics they have developed for the use of their work, has been widely covered in news media in 2019 - 2020, illuminating ways in which the changed landscape of AI-generated video can be regulated and turned to socially responsible ends. For instance, the MIT Technology Review praised Synthesia for only working with vetted clients in its 2019 article, “Making deepfake tools doesn’t have to be irresponsible. Here’s how.” Similarly, TechCrunch expressed excitement that Synthesia’s products “ could also be used to expand the reach of creators around the world” ( S8).

Synthesia’s code of ethics includes a commitment only to create synthetic video of people who have given their explicit permission. This debate has also been promoted by other industry leaders such as Samsung, who have highlighted Synthesia’s work in their 2020 list of “5 companies leading the creation of AI-enabled images & videos” ( S9). As the Managing Director of Samsung Next (Samsung’s synthetic media initiative) said in relation to the Synthesia’s work, “it looks like AI will actually democratize creativity” ( S9).

In addition to raising awareness of the potentially positive impact of AI, Synthesia created engaging AI videos for public consumption, such as the Synthesia Santa, which allowed users to easily create a video from Santa speaking their text to friends and family. The site made 90,000 cards in the first three weeks of launch, and in September 2020, Synthesia was voted the #2 product by Product Hunt ( S10).

5. Sources to corroborate the impact

S1. Testimonial from Victor Riparbelli, CEO and co-founder of Synthesia.

S2. Confidential information on the finances of Synthesia, available upon request.

S3. Media Coverage of the Synthesia’s commercial prospects:

Forbes : 2019’s Boldest Media & Tech Companies (The “Fearless Five”) https://www.forbes.com/sites/petercsathy/2019/11/29/2019s-boldest-media--tech-companies-the-fearless-five/#521be2ae3d4c

S4. Discussions of the Just Eat campaign.

Forbes covers the making of Snoop dogg MenuLog advert, powered by Synthesia
Video by Synthesia demonstrating how the edit of the advert was done.

S5. Discussions of Synthesia’s work for no More Malaria:

Synthesia case study:
TechCrunch covers David Beckham Malaria-no-more campaign
ABC News, David Beckham 'speaks' 9 languages for new campaign to end malaria
Sky News, David Beckham 'speaks nine languages' in malaria campaign's new video https://news.sky.com/story/david-beckham-speaks-nine-languages-in-malaria-campaigns-new-video-11688600
The Drum, How Malaria No More UK inspired world leaders to commit $4bn to defeat deadliest-ever disease

S6. Discussions of WPP’s use of Synthesia

WIRED: Deepfakes Are Becoming the Hot New Corporate Training Tool https://www.wired.com/story/covid-drives-real-businesses-deepfake-technology/
Futurism: This Company is Making Corporate Training videos using Deepfakes

S7. Zesta NY Resolution video report for Life Extension Europe.

S8. Media discussions of the ethics of ‘deepfakes’

MIT Technology Review Making deepfake tools doesn’t have to be irresponsible. Here’s how.
TechCrunch An optimistic view of Deepfakes

S9. Samsung Next’s ‘Landscape of Synthetic Media and its discussions of Synthesia

Samsung NEXT – Landscape of synthetic media, featuring a synthetic video of the author (Managing Director and General Manager of Samsung NEXT) powered by Synthesia:

S10. Public engagement with Synthesia products

Santa video page usage, 90,000 Santa cards made within 3 weeks of launch: https://www.synthesia.io/santa
Synthesia voted 2nd place of ‘product of the month’ in Sep 2020 on product hub: https://www.producthunt.com/posts/synthesia-2?utm_campaign=producthunt-api&utm_medium=api&utm_source=Application:+IFTTT+(ID:+2742)

xlinkit for fast, cheap, reliable and automated verification of over-the counter ...

Submitting institution: University College London
Unit of assessment: 11 - Computer Science and Informatics
Summary impact type: Technological
Is this case study continued from a case study submitted in 2014?: Yes

Download case study PDF

1. Summary of the impact

The Software Systems Engineering Group at UCL developed and patented xlinkit, an approach that supports the validation of XML documents in general and over-the-counter (OTC) derivative transactions expressed in the Financial Products Markup Language (FpML). The widespread adoption of FpML—also adopted by JP Morgan Chase’s Extensions for securities, repos and security lending (95% of financial market participants now use it for OTC transactions)—has brought about a substantial reduction in market and credit risk for financial institutions, by reducing the time required to confirm derivative transactions from up to ten days to at most one day. About 500 validation rules have been defined for FpML using xlinkit. They define, for example, the constraints that check that the cashflows in an order for a derivative instrument match those held by a counterparty in a confirmation or that payment periods of the agreed cashflows equally divide the contract period. This reduction critically relies on very high straight-through-processing rates, which cannot be achieved if manual interventions are required. High straight-through-processing rates are critically reliant upon a high level of consistency within the OTC contracts, which xlinkit crucially provides. These consistency rules are formally defined in xlinkit and can be automatically checked by FpML validation products. This innovation is especially important given the high value of OTC derivatives, which rose to USD640,400,000,000,000 per annum at the end of June 2019. Message Automation (which markets a product called Message Automation Validator, based on xlinkit patent) has received GBP3,000,000 revenue in that same period. Following the acquisition of Message Automation by Broadridge in 2017, xlinkit remains an essential part of Broadridge Financial Solutions software platform, among others, now in use by over 50 financial institutions globally.

2. Underpinning research

The background of the research that led to xlinkit was Professor Wolfgang Emmerich and Professor Anthony Finkelstein’s work on consistency management of structured and semi- structured software engineering artefacts, usually source-code documents. The consistency management of such documents required three elements: 1) the representation of abstract syntax trees and graphs; 2) the definition of validation rules to define static semantics and inter-document consistency constraints; and 3) the construction of validation engines that can execute these rules.

The adoption of the Internet standards for managing semi-structured documents—most notably XML—created the possibility of applying similar techniques to documents other than source codes. Similar in nature to abstract syntax trees, representation of such semi-structured XML documents is governed by the Document Object Model (DOM). Thus, the UCL Software Systems Engineering Research Group explored whether the principles, methods and techniques for consistency management of software engineering artefacts could be realised more elegantly using the emerging family of standards on XML that were being defined at the same time by the World Wide Web Consortium (W3C)—thereby making them applicable to a broader application area and semi-structured documents that are managed in a decentralised manner. A decentralised setting necessitated managing the consistency relationships out of bounds from the documents being related. The XLink standard of the W3C enabled the management of such out-of-bound relationships and the research focused on how such XLink relationships could be defined and created in an effective and efficient manner.

Xlinkit defines a first order rule language, which combines universal and existential quantification with Boolean logic operators over path expressions defined using the XPath standard. Through further work, researchers under the supervision of Professors Emmerich and Finkelstein then developed three different interpretations for the xlinkit language. The first interpretation showed how the language determines whether two distributed semi-structured documents are consistent with each other ( R1). A second interpretation defines how out-of-bound links that capture consistency relationships between elements in two semi-structured distributed documents ( R1) are inferred. And the third interpretation defines how the language determines for two distributed documents that are inconsistent with each other all possible modifications that render them consistent again ( R3). The initial application of this research was to demonstrate how to manage the consistency of software engineering documents ( R2, R4).

Once the wide applicability of the basic research on consistency management using XML technologies became evident, UCL protected the IP of the underlying research by patenting it in the US and UK. UCL then created a spin-out company called Systemwire, appointed a CEO to run the company and moved to develop a commercial application of the research results. This application has been available since early 2002.

During Spring 2002, UBS, UCL and Systemwire proposed the creation of the FpML Validation Working Group to the International Swaps and Derivatives Association (ISDA), which was submitted in June 2002 ( R6). ISDA accepted the proposal and the FpML Validation working group was created in autumn 2003. It was chaired by Christian Nentwich, Wolfgang Emmerich’s PhD student and had wide industry participation from BNP Paribas, Deutsche Bank, Barclays Capital, UBS and JP Morgan. The Validation working group then used the xlinkit language called Constraint Language in XML (clix) to formulate consistency rules for derivative transactions defined using the FpML standard ( R5).

4. Details of the impact

Financial services institutions who trade in over-the-counter (OTC) derivatives have benefitted significantly from research conducted at the UCL Software Systems Engineering Research Group, as the notional outstanding contract value of OTC derivatives increased during the first half of 2019 to USD640,400,000,000,000, an 18% increase from the first half of 2017 ( S1). FpML remains an open standard maintained by ISDA for documenting, dealing and processing OTC derivatives ( R5), and FpML offers a cost-effective alternative for electronic communication of derivative contract information ( S2). This innovation has reduced the time required to confirm derivative transactions from up to ten days to one day at most, thereby significantly reducing risk and exposure for financial market participants. The FpML validation rules continue to be defined using UCL’s xlinkit technology, and the adoption of FpML continues to increase in the financial service sector.

FpML standard version 5.11 and validation rules

The current Version 5.11 of the FpML standard (recently revised in July 2020) was released in December 2019 to continuously support service users. Like previous FpML versions, the latest release is governed by validation rules developed by the UCL team ( R1, R6), which are “an integral part of the FpML standard providing business logic validation in addition to the schema validation,” according to the Senior Director and Co-head of Data Reporting and FpML at the International Swaps and Derivatives Association ( S3). Different parts of the trade cycle therefore benefit from “an additional layer of business logic validation that cannot be enforced through XML schema” ( S3). The validation architecture written as a result of the underpinning research ( R6) now defines some 500 validation rules for a large number of equities, interest rates, credit, energy and foreign exchange derivatives.

These rules help clarify the meaning of derivative transactions defined in FpML and provide precise and unambiguous means for market participants using FpML to electronically trade derivatives to validate the correctness of these transactions. The validation rules are included in the normative part of the standard, which means that the financial market participants that have adopted FpML will have to comply with these validation rules in their FpML messages. As such, Message Automation continues to provide a reference implementation of these rules using its xlinkit technology, as it continues to be included in Version 5.11 of the FpML standard. This latest version has informed the continued adoption of electronic processing implemented between August 2013 and December 2020. Thus, the substantial introduction of electronic confirmation with FpML (which can be validated automatically) has reduced manual effort and brought down the time required to confirm derivative transactions from up to 10 days to at most one day. This reduction means there is also a substantial reduction in the period during which a financial market participant is subject to market and credit risk because a contract is not yet confirmed. Given the value of these transactions confirmed by the FpML surveys, this risk reduction is very significant. Some financial market participants have stated these benefits publicly.

Continued use of xlinkit language for financial services

A recent survey on progress made by firms incorporating the FpML standard gathered data from 33 participating firms including dealers, asset managers, technology companies, assets and fund managers, trade repositories, clearing houses and execution facilities ( S4). These vendors benefit from the clarity and unambiguity introduced through the validation rules defined using UCL’s consistency-checking technology with xlinkit. FpML continues to be used for regulatory reporting in major jurisdictions and reporting systems in Asia, Europe and in the USA. As such, participating vendors reported 10,000,000 daily FpML messages, showing a “ large increase in message volume compared to the last survey” and continuous implementation of UCL technology ( S4). All messages recorded abide by the validation rules defined in the xlinkit underpinning research ( R6).

The ISDA found a significant level of adoption of electronic confirmations for different classes of OTC derivatives. The adoption rate of electronic confirmations for all market participants was 81% for interest rates derivatives, 16% foreign exchange and 3% distributed among credit derivatives, commodities, equity and other derivatives ( S1). The adoption of FpML by the financial services sector has become widespread, with 76% of the financial market participants dealing with recordkeeping view and 62% dealing with confirmation view ( S4). Moreover, 39% of dealers and 30% of technology and asset management participants implement the FpML validation rules defined with xlinkit ( S4). In addition, 64% of the firms that responded to the survey use tools to convert or translate FpML to other formats or from other formats to FpML depending on firms’ needs, while 45% specified using binding tools for FpML application development. This has provided an opportunity for technology firms to develop tools related to FpML: of the technology firms surveyed, 50% provide tools to create, transform or parse FpML, while 40% provide interfaces to and from systems and 20% offer system integration and/or validation services.

Xlinkit technology through Message Automation (Broadridge Financials)

The ability to check whether a trade meets all relevant constraints automatically, and therefore with minimal cost, significantly reduces the time it takes to confirm these transactions ( R1, R2), with fewer operations staff required. This advantage highlights the flexibility of the xlinkit solution through Message Automation, where there is a “ proven track record - 100% of clients happy,” in turn building “long-term relationships with a wide range of clients inducing Tier 1 and Tier 2 banks and buy-side organisations” ( S5). By using Message Automation, financial market participants are therefore exposed to market and credit risk during a shorter period between a trade being agreed and its confirmation.

In 2017, Message Automation was acquired by Broadridge ( S6) for about GBP45,000,000, attesting to the technology’s value for the financial services sector. The most recent revenue sharing report from 17 December 2019 states that UCL Business has received royalty income from this patent in the amount of GBP462,838.95 since 2003. Between May 2017 and December 2019, GBP150,000 of this income was received from Broadridge, suggesting the continuous impact from the licenced UCL patent. The President of Global Technology and Operations International for Broadbridge, said that “ Message Automation’s leading technology and expertise on derivatives processing models” was the key driver behind this acquisition, and acquiring this technology has “helped Broadridge establish a comprehensive suite of capabilities across asset classes globally” ( S7).

The General Manager, BRMA Head of Capital Markets Data & Regulatory Solutions, Broadridge Financial Solutions stated, “Xlinkit, or as we describe it ‘Message Automation Validator’ remains an intrinsic part of our software platform. It is one of the four core engines we use in all of our business solutions, and is used for much more than just its original purpose of FpML validation, it is the decisioning engine that our orchestration engine uses for [applying] routing rules” ( S8). Through Broadridge Financials, he said, xlinkit “is now in use by over 50 financial institutions globally for various business purposes” ( S8).

Impact case study database

Search and filter

Filter by

Automated Reasoning for Amazon Cloud Security

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Human-centred security policy

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Improving Data Transfer with Multipath Transmission Control Protocols

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Modelling infectious diseases using web search data

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Sapienz deployment at Facebook

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Synthesia: cheaper and more accessible presenter-led videos

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

xlinkit for fast, cheap, reliable and automated verification of over-the counter ...

1. Summary of the impact

2. Underpinning research

3. References to the research

4. Details of the impact

5. Sources to corroborate the impact

Filter by higher education institution

Filter by unit of assessment

Filter by continued case study

Filter by summary impact type

Filter by impact UK location

Filter by impact global location

Filter by underpinning research subject