Impact case study database
- Submitting institution
- Heriot-Watt University
- Unit of assessment
- 11 - Computer Science and Informatics
- Summary impact type
- Technological
- Is this case study continued from a case study submitted in 2014?
- No
1. Summary of the impact
An award-winning fleet planning company began exploiting Heriot-Watt University optimization research in 2014, leading to the following impacts:
Economic: GBP11,650,000 investment and B2B income, plus GBP9,600,000 estimated gains for end-users, four jobs created (15 person-years in the period).
Environmental: the associated cloud service was launched in August 2018 and, by the end of 2020, had reduced CO2 emissions by approx. 3,000 tonnes and started recruiting nationwide home delivery fleets.
Policy: founded on results of an associated project, the World Business Council for Sustainable Development now recommend asset-sharing platforms in their guidelines for freight procurement.
2. Underpinning research
Commercial users of optimisation almost invariably use standard single-objective tools, which treat optimisation as the requirement to minimise a single objective, such as ‘cost’. However, this is increasingly regarded as suboptimal, since the ‘single objective’ formalisation distorts the underlying problem, and the optimizer will often miss solutions that would have been preferable. Rather than a cost-minimizing solution, for example, a vehicle fleet manager may prefer a plan using fewer vehicles, or providing a fairer balance between drivers’ working hours, better utilization of electric vehicles, and so forth.
Much of our research has been on algorithms for multi-objective optimization, which avoid distorting the underlying task and deliver solutions that straddle the inherent trade-offs, providing more value and insight to the decision-maker. Early algorithms in this area were slower than their single-objective counterparts; however, we are responsible for some of the more widely cited, faster and more effective algorithms. In particular, we explore theoretical issues and algorithm design in the context of many-objective optimization [3.1,3.2]. While research in multi-objective optimization tends to focus on problems with typically 2 or 3 ‘headline’ objectives (e.g. cost and risk), the Many-Objective Optimization (MOO) area recognizes that (arguably) most optimization tasks involve 4-10, or even more, conflicting objectives.
Our MOO research has included theoretical work to build an understanding of the consequent algorithm design challenges [3.1], and algorithm design work, in particular reference [3.2], which presents an effective algorithm for problems with 5-20 objectives. The technical challenge in many-objective optimization relates largely to the difficulty of assigning relative quality among groups of candidate solutions, since they will often be ‘mutually non-dominated’; that is, none is at least as good as another on every objective and strictly better on at least one. The work in [3.2], published in 2007 at GECCO, the primary conference in evolutionary computation, found an approach to assigning quality in such circumstances that was relatively efficient and outperformed rival methods. The paper received the “2017 ACM SIGEVO Impact award”, which recognises “…papers published in the GECCO conference 10 years earlier, which are both highly cited and deemed to be seminal” (https://tinyurl.com/ds4xnvp4).
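The notion of ‘mutually non-dominated’ solutions can be illustrated with a minimal sketch. This is not the algorithm of [3.2]; it only shows the standard Pareto-dominance test (assuming every objective is minimised) and why it stops discriminating between candidates once several objectives conflict. The fleet plans and their objective values are hypothetical.

```python
from typing import List, Sequence


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """Return True if plan a is no worse than plan b on every objective
    and strictly better on at least one (all objectives minimised)."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def non_dominated(points: List[Sequence[float]]) -> List[Sequence[float]]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


# Hypothetical fleet plans: (cost, mileage, vehicles used, emissions)
plans = [
    (100, 50, 4, 30),
    (110, 45, 4, 32),  # more expensive, but fewer miles
    (120, 55, 3, 31),  # more expensive, but one vehicle fewer
    (130, 60, 5, 35),  # worse than the first plan on every objective
]

front = non_dominated(plans)
print(front)
```

Three of the four plans survive the filter, and dominance alone gives no way to rank them against one another. That is the gap that a quality-assignment method for non-dominated points has to fill.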
During our collaboration with Route Monkey Ltd (RML) [3.3], we developed further refinements of [3.2] for real-world vehicle fleet planning tasks, broadly known as Vehicle Routing Problems (VRPs). Real-world VRPs are inherently many-objective (e.g. cost, mileage, emissions, time, CO2, resources used, etc.), and we found that our approach reliably outperformed leading commercial software, even in terms of the standard single-objective targets of mileage or cost.
Finally, from late 2015, we worked with the World Business Council for Sustainable Development (WBCSD) on horizontal collaboration between freight operators. Such collaboration has the potential for substantial savings in CO2 emissions, but is hard to achieve in a business context. With WBCSD, we exploited the aforementioned research, with adaptations for business collaboration models. The findings [5.1] were the basis of an Innovate UK project [3.4], which went on to co-develop algorithms for multi-fleet asset sharing and an associated business model that could cope effectively with the often disproportionate utilisation of assets that arises in optimised multi-fleet solutions [3.5].
3. References to the research
The CS/logistics impact case study is underpinned by three research publications. [3.1] and [3.2] respectively underpin the theoretical and practical aspects of the research that led to all of the environmental and economic impacts. [3.1] is published in the primary international medium for its specialized research area; [3.2] is published in what is regarded as the top conference in the wider research area and is itself regarded as seminal (and explicitly recognised as such by an award). The study also has policy impact that arises from the same overall body of work, and is best represented (in terms of underpinning) by the research published in [3.5] (co-authored with a team from the 'Connected Places Catapult', and published in an open-access international journal).
[3.1] Knowles JD, Corne DW 2007, Quantifying the Effects of Objective Space Dimension in Evolutionary Multiobjective Optimization. in Obayashi S, Deb K, Poloni C, Hiroyasu T, Murata T (eds) Evolutionary Multi-Criterion Optimization. EMO 2007. Lecture Notes in Computer Science, vol. 4403, Springer, Berlin, Heidelberg.
https://doi.org/10.1007/978-3-540-70928-2_57
[3.2] Corne, DW & Knowles, JD 2007, Techniques for highly multiobjective optimisation: Some nondominated points are better than others. in Proceedings of GECCO 2007: Genetic and Evolutionary Computation Conference. pp. 773-780, 9th Annual Genetic and Evolutionary Computation Conference, London, United Kingdom, 7/07/07. https://doi.org/10.1145/1276958.1277115
[3.3] Innovate UK KTP partnership no. 9839 between Heriot-Watt University and Route Monkey Ltd, 10/2014 – 04/2018.
[3.4] FreightShare Lab (FSL), Innovate UK Project 103890, https://gtr.ukri.org/projects?ref=103890
[3.5] Vargas, A, Fuster, C & Corne, D 2020, 'Towards Sustainable Collaborative Logistics Using Specialist Planning Algorithms and a Gain-Sharing Business Model: A UK Case Study', Sustainability, vol. 12, no. 16, 6627.
4. Details of the impact
In 2014, Route Monkey Ltd (RML) was establishing a reputation for innovation in fleet software, developing business plans around the vision of a fast, flexible ‘online scheduler’ that would radically simplify fleets’ access to optimisation capability. RML adopted Corne’s research to help realise these plans, and a series of associated Innovate UK, EU and B2B projects began in October 2014 [3.5]. One project in this HWU/RML partnership explored multi-fleet collaboration, leading ultimately to policy impact; meanwhile, others transformed RML’s technology portfolio, setting the stage for RML’s acquisition in 2015/16 by Trakm8 PLC, who sustained the HWU partnership and launched ‘Vortex’ (incorporating Corne’s algorithms) in August 2018. Associated impacts are outlined below.
Environmental: Trakm8 released the ‘Vortex’ API in August 2018, incorporating the research and underpinning their optimization service (https://www.trakm8.com/optimisation). Vortex has been used for new clients since August 2018, while pre-existing clients are gradually being migrated. Daily emissions savings accumulate, and can be estimated as follows. Before Vortex, we estimate that fleets would have used, on average, 10% additional mileage. This is more conservative than the 12.5% gain estimated for top-tier optimization across diverse fleets [3.1], and allows for the fact that some fleets may previously have used optimization services. The mileage saved translates into reduced CO2 emissions, mitigating pollution and climate change, and contributing to the UK’s CO2 targets. In 2017, RML commissioned an estimate revealing savings of c. 2,200 tonnes per month (from 3,240,000 miles saved per month) across its customer base (https://tinyurl.com/rmlest). Since August 2018, around 5% of pre-existing RML customers have migrated to Vortex, suggesting further reduced emissions of 110 tonnes per month, accumulating to c. 3,000 tonnes by end 2020. Meanwhile, some very significant fleets will adopt Vortex in 2021 on the basis of proven benefits beyond their current service [5.1]. These include two top-10 supermarkets accounting for over 90,000,000 home-delivery miles p.a. The latter projects and associated product confidence contribute to pre-2021 Trakm8-based economic impact, as presented below.
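The emissions estimate above can be reproduced as a short worked calculation. The sketch assumes that the 110 tonnes per month applied for the whole period from August 2018 to December 2020 (29 months). This is a simplification, since migration to Vortex was gradual; all other figures are taken from the text.

```python
# Figures from the text
baseline_tonnes_per_month = 2200  # RML's 2017 estimate across its customer base
migrated_percent = 5              # share of pre-existing customers moved to Vortex

# Assumption: August 2018 to December 2020 inclusive (5 + 12 + 12 months)
months = 29

monthly_saving = baseline_tonnes_per_month * migrated_percent / 100
cumulative_saving = monthly_saving * months

print(f"Additional saving per month: {monthly_saving:.0f} tonnes")
print(f"Accumulated by end 2020:     {cumulative_saving:.0f} tonnes")
```

Under this assumption the total is of the same order as the c. 3,000 tonnes reported; a slower migration profile would give a somewhat lower figure.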
Economic Impacts
Route Monkey Ltd / The Algorithm People Ltd (TAP): In 2015, Trakm8 PLC acquired RML for GBP7,100,000 (for a consideration of up to GBP9,100,000). This investment was driven in large part by the distinctive and novel algorithm capabilities, and associated development roadmap, afforded by engagement with Corne’s group. The engagement also enabled RML to leverage further R&D and private funding summing to approx. GBP3,000,000, supporting ~15 additional person-years of employment in technical positions. RML’s CEO went on to found TAP, which raised GBP1,300,000 to develop its novel pay-as-you-go online scheduling platform ‘My Transport Planner’, which makes use of Vortex [5.2].
Trakm8 PLC: associated economic impacts for Trakm8 can be quantified in terms of jobs created in connection with the Vortex API service, and associated project income from a number of IUK and B2B projects, on topics ranging from integration/deployment of Vortex through to specific consultancy tasks that exploit Vortex. Trakm8 estimate these impacts (until end 2020) as: project income: GBP250,000; creating 4 jobs [5.1].
End Users: case studies by Trakm8 PLC on the impact of Vortex on individual customers report, for example, savings of GBP150,000 p.a. on a charity’s transport costs [5.3], and 10% savings on fuel costs along with 30% improved driver productivity for Iceland Foods Ltd [5.4]. Estimating economic benefits across all end users is confounded by the variety of ways in which end users exploit increased plan efficiency; however, a lower bound can be suggested, based on the cost per mile of the most fuel-efficient diesel vans (11p, https://tinyurl.com/fuelppm). Assuming 3,240,000 vehicle miles per month saved to December 2020 (RML estimate noted above), the resulting figure is GBP9,600,000 [5.1].
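The lower-bound figure can be checked with a short calculation. The text does not state the start month of the accumulation period, so the sketch below derives the number of months implied by the reported total rather than assuming one; the input figures are those given above.

```python
# Figures from the text
miles_saved_per_month = 3_240_000
pence_per_mile = 11                 # most fuel-efficient diesel vans
reported_total_gbp = 9_600_000

# Fuel cost avoided per month, in GBP
monthly_saving_gbp = miles_saved_per_month * pence_per_mile / 100

# Number of months of accumulation implied by the reported total
implied_months = reported_total_gbp / monthly_saving_gbp

print(f"Saving per month: GBP{monthly_saving_gbp:,.0f}")
print(f"Implied period:   {implied_months:.1f} months")
```

The implied period of roughly 27 months ending in December 2020 is consistent with savings accumulating from around the time of the Vortex launch in 2018.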
Policy Impacts
The WBCSD is an organization in Geneva, funded by businesses globally and by the World Bank, that advises businesses and influences policy globally on sustainable practices. Corne’s research on asset-sharing was central to two reports from the WBCSD’s Low-Carbon Freight working group, and also underpins (via the FSL project) procurement guidelines for freight operators published by the WBCSD’s ‘Transforming Heavy Transport’ project.
The first WBCSD report [5.5] was built around Corne’s research as part of the Low Carbon Freight working group, and was co-authored by the consortium, including Nestle, UPS, and Scania. It promotes non-trivial horizontal asset sharing among fleets (i.e. beyond simply ‘backhaul’) as one of the more significant measures to be recommended for reducing emissions. Meanwhile, the freight procurement guidelines (WBCSD report 2) are a deliverable of the WBCSD’s ‘Transforming Heavy Transport’ initiative, which brings together 20 global transport organizations to guide the sector towards zero emissions by 2050 [5.6].
Additionally, the WBCSD report from the World Business Council’s ‘Transforming Heavy Transport’ project (Sep 2019) “provides professionals engaged in logistics procurement, supply chain and logistics management, and logistics emissions management with action-based guidance on how to reduce greenhouse gas (GHG) emissions and air pollutants from their freight transport and logistics procurement practices”. The report describes the “Freightshare Lab Asset sharing platform”, an outcome of the Innovate UK project [3.4], as an exemplar and a signpost to best practice that is “applicable to all companies” [5.6].
5. Sources to corroborate the impact
[5.1] Group Director of Big Data, Trakm8 PLC will provide corroboration of economic impacts at Trakm8 PLC, and environmental impacts from the Vortex software.
[5.2] Chief Executive Officer, The Algorithm People Ltd (TAP), (formerly CEO of Route Monkey, 2014-18) will provide corroboration of economic impact regarding Route Monkey and TAP.
[5.3] The Challenge, Trakm8 customer case study
[5.4] Iceland Foods, Trakm8 customer case study
[5.5] “Demonstrating the GHG reduction potential of asset sharing, asset optimization and other measures”, World Business Council for Sustainable Development, first report of the Low Carbon Freight Working Group, focusing on research outcomes, November 2016. Contains and summarises Corne’s research on the benefits of asset sharing, https://tinyurl.com/wbcsdghg1
[5.6] “Smart Freight Procurement Guidelines”, by Smart Freight Centre (smartfreightcentre.org) and World Business Council for Sustainable Development (wbcsd.org), a publication from the World Business Council’s ‘Transforming Heavy Transport’ project, September 2019, https://tinyurl.com/wbcsdfpg
- Submitting institution
- Heriot-Watt University
- Unit of assessment
- 11 - Computer Science and Informatics
- Summary impact type
- Societal
- Is this case study continued from a case study submitted in 2014?
- No
1. Summary of the impact
JournalTOCs helps researchers gain timely access to ‘personalised’ new research by alerting them when new articles are published in their selected journals and by providing them with full-text links when the articles are Open Access (OA). These OA links were the result of JEMO, a technology developed by JournalTOCs in 2014 to resolve the problem of OA articles published in hybrid journals being erroneously kept behind paywalls. Since 2015, JEMO has been adopted by over 20,000 scholarly journals and publishing platforms including Atypon. This has led to JournalTOCs being used extensively worldwide by multiple research centres, libraries and multinationals. JournalTOCs also works with over 3,770 scholarly publishers and its effect reaches 78 licensed research centres in 19 countries.
2. Underpinning research
JournalTOCs is a university spin-off from research undertaken at the ICBL (Institute for Computer Based Learning) of Heriot-Watt University. JournalTOCs technology includes real-time data-mining software to discover latest content published in scholarly journals. The software first aggregates, normalises and enriches metadata and then makes it freely available for reuse. JournalTOCs uses JEMO to identify OA content in the metadata extracted from hybrid journals.
JEMO was the result of a project, funded by the EPSRC Impact Acceleration Account (IAA) (2015), to further develop JournalTOCs. The project had two objectives:
- to help publishers make their journal metadata readily available for systematic identification of OA articles, and
- to prevent OA articles from being labelled as non-OA across the production, discovery, and delivery chain of e-journals.
The JEMO project was a partnership formed by the ICBL with five publishers (Oxford University Press, Libertas Academica, Edinburgh University Press, IGI Global and Thieme), INASP (the International Network for the Availability of Scientific Publications) and a consortium of six NHS-England hospital libraries. JEMO includes a metadata schema adapted from the Dublin Core, PRISM and Creative Commons (CC) metadata schemas. JEMO showed publishers how they could make their OA content discoverable using a cost-effective and technically straightforward process. At that time, NISO produced its own specification (the NISO RP-22-2015 recommendation) and promoted its use among publishers. However, JEMO has proved to be much more effective than the NISO RP-22-2015 metadata specification and has been widely adopted by publishers and hosting platforms. The results of the JEMO project were extended by the MOOD Knowledge Transfer Project to cover online-first articles.
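The idea of identifying OA articles from article-level metadata can be illustrated with a minimal sketch. It does not reproduce the JEMO schema: the record structure, the `license_url` field and the sample records are all hypothetical, and the only assumption about the real world is that Creative Commons licences are identified by URLs under creativecommons.org.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ArticleRecord:
    """Hypothetical article-level metadata record from a hybrid journal."""
    title: str
    journal: str
    license_url: Optional[str] = None  # hypothetical field, not a JEMO element


def is_open_access(record: ArticleRecord) -> bool:
    """Treat an article as OA when its metadata carries a Creative Commons
    licence or public-domain dedication URL."""
    url = (record.license_url or "").lower()
    return (
        "creativecommons.org/licenses/" in url
        or "creativecommons.org/publicdomain/" in url
    )


# Hypothetical records from one hybrid journal issue
issue: List[ArticleRecord] = [
    ArticleRecord("Article A", "Example Journal",
                  "https://creativecommons.org/licenses/by/4.0/"),
    ArticleRecord("Article B", "Example Journal"),  # no licence metadata
]

oa_titles = [r.title for r in issue if is_open_access(r)]
print(oa_titles)
```

The point of the sketch is that OA status is decided per article rather than per journal, which is what makes OA content in hybrid journals discoverable.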
The application of these technologies developed by the ICBL for JournalTOCs resulted in a free service for hundreds of thousands of individual users. In addition, a Premium service, created to ensure the sustainability of the spin-off, has been licensed to more than 70 large and small research centres and libraries worldwide at very economical licence rates. Of special relevance to Open Access, and crucial to the Plan S initiative of the European Science Foundation, is the tagging of journals in the JournalTOCs database as OA or hybrid. In 2018, JournalTOCs included more selected OA journals than other services, and was unique in having identified and included individual OA articles from more than 12,000 hybrid journals. Coherent subject indexing further enhances the value of JournalTOCs.
3. References to the research
[3.1] Chumbe, S, Kelly, B & Macleod, R 2015, 'Hybrid Journals: Ensuring Systematic and Standard Discoverability of the Latest Open Access Articles', Serials Librarian, vol. 68, no. 1-4, pp. 143-155. https://doi.org/10.1080/0361526X.2015.1016856
[3.2] Chumbe, SS, MacLeod, RA & Kelly, B 2015, We should not light an Open Access lamp and then hide it under a bushel! in B Schmidt & M Dobreva (eds), New Avenues for Electronic Publishing in the Age of Infinite Collections and Citizen Science: Scale, Openness and Trust: Proceedings of the 19th International Conference on Electronic Publishing. IOS Press, pp. 102-112. https://doi.org/10.3233/978-1-61499-562-3-102
4. Details of the impact
Licensed research centres and libraries from hospitals, universities, governmental agencies, global organisations, banks, and biotechnology and pharmaceutical companies use JournalTOCs to discover critical research results for their researchers. The usefulness of JournalTOCs was enhanced when the results of the JEMO (2015) and MOOD projects were integrated within JournalTOCs, enabling it to provide users with full-text links for individual OA articles. JEMO has been adopted by over 20,000 scholarly journals and publishing platforms such as Atypon. As a result, JournalTOCs now works with over 3,770 scholarly publishers and its effect reaches 78 licensed research centres in 19 countries.
Research-driven biopharmaceutical companies such as NovoNordisk Pharma, Roche and Ferring Pharmaceuticals, as well as hospitals from the NHS, The Australian Health Service [5.1] and the New Zealand Police, are using JournalTOCs, thus saving considerable time and resources. The Information Resource Manager from Ferring Pharmaceuticals described how his company uses JournalTOCs: "The service is used as a one-stop-shop for signing up for TOC alerts. As knowledge workers in commercial organisations (based on R & D activities) we are in a combined situation of being extremely dependant on having exhaustive knowledge of new developments within our research field and having very little time to identify all relevant sources of information. A service that allows you to quickly and conveniently sign up for TOCs from any journal of potential relevance is a highly valuable tool." [5.2]
The Knowledge & Information Coordinator (New Zealand Police Library) confirmed that,
“Having an institutional licence to JournalTOCS over the last five years … has enabled us to provide a one-stop-shop approach to providing table of contents alerting to all our subscribed journals. If we did not have an affordable product like JournalTOCs, it would be a logistical nightmare trying to provide a table of contents service to our customers” [5.3].
The European University Institute Library described the “unified interface and the alerts’ service” as “particularly valuable” [5.4].
Over 70 companies and organisations bought licences for JournalTOCs Premium services and thousands of researchers and librarians use the free version of the technology every day. In addition, many research organisations from the UK, USA, France, Denmark, Canada, Australia, The Netherlands, New Zealand, Italy, Norway, Germany, Brazil and Spain, are accessing information tagged with JEMO elements through JournalTOCs web services (API) to integrate new research with OA identification in their own applications. In addition, important worldwide organisations such as the Food and Agriculture Organization (FAO), the International Monetary Fund's (IMF) Library Network, the International Labour Organization (ILO), and the European Commission are Premium partners of JournalTOCs.
The largest consortium was signed with the Indian Space Research Organisation (ISRO) to provide premium access to new research, identified as Open Access or non-OA regardless of its provenance. The ISRO consortium includes 17 large aerospace research centres located in different parts of India (https://www.isro.gov.in/about-isro/isro-centres). The leading research centre of ISRO is the Vikram Sarabhai Space Centre (VSSC). This consortium has been running since June 2017. Since the formation of the consortium, JournalTOCs has been introduced to all libraries under ISRO. Currently 17 libraries are using this service and the total number of users is about 8,000. The VSSC library has about 2,000 users, some of whom follow approximately 100 journals [5.5].
In addition, JournalTOCs established a partnership with the Quality Open Access Market (QOAM) service. QOAM was created by the CWTS (Centre for Science and Technology Studies) of Leiden University in The Netherlands. QOAM is a marketplace for scientific and scholarly journals which publish articles in Open Access. JournalTOCs has become critical to QOAM’s operation in matching author experiences with a journal against its publishing fees [5.6].
5. Sources to corroborate the impact
[5.1] Librarian at the Library and Information Service, Women and Newborn Health Service, King Edward Memorial Hospital, Australia, will confirm the use of JournalTOCs and its importance.
[5.2] The Information Resource Manager, Global Regulatory Affairs, Corporate Information Services, Ferring Pharmaceuticals, will confirm the use of JournalTOCs and its importance.
[5.3] The Knowledge & Information Coordinator, New Zealand Police Library, will confirm the benefits of using JournalTOCs.
[5.4] Letter from the European University Institute, Library confirming the use and benefits of using JournalTOCs.
[5.5] Letter from Indian Space Research Organisation confirming extensive use of JournalTOCs within the VSSC network.
[5.6] Letter from the Centre for Science and Technology Studies, Leiden University, confirming the partnership with QOAM and the importance of JournalTOCs to QOAM operations.
- Submitting institution
- Heriot-Watt University
- Unit of assessment
- 11 - Computer Science and Informatics
- Summary impact type
- Technological
- Is this case study continued from a case study submitted in 2014?
- No
1. Summary of the impact
Data has been an underutilised output of research projects for many years due to the challenges of finding, understanding, and reusing that data. Research at Heriot-Watt Computer Science by Dr Gray contributed substantially to the definition of the FAIR Data Principles (2016) and led to a global Health Care and Life Sciences community recommendation for describing datasets for discovery and reuse. This community recommendation has been adopted by several data providers including the RDF Platform of the European Bioinformatics Institute (EBI). The standard has also been adopted and used internally in major pharmaceutical companies, including AstraZeneca, so that datasets complying with the FAIR Data Principles can be more readily reused and exploited.
2. Underpinning research
From February 2011 until May 2015, the World Wide Web Consortium (W3C) Health Care and Life Sciences Interest Group (HCLS-IG) had an activity, co-led by Dr Gray as an invited expert, to develop a community profile for describing datasets, i.e. to provide machine readable metadata to make datasets more Findable, Accessible, Interoperable, and Reusable, cf. the FAIR Data Principles [3.1].
The group first identified use cases from a broad range of applications within the Health Care and the Life Sciences domains. One such use case was drawn from Dr Gray’s work on the Open PHACTS Data Platform [3.2] where there was a requirement to know which version of a dataset was used within the platform, and to identify in query responses where each item of data had been retrieved from.
A community profile that extended the Data Catalog Vocabulary (DCAT) was developed to meet the needs of the use cases. Dr Gray extended the core model of DCAT to support the abstract notion of a dataset distinct from versions and distributions of the dataset. The capability developed by Dr Gray enabled the model not just to reference a dataset but to attach the information on all of its version history. As an example, it enables references to the dataset ChEMBL or any of the specific versions or multiple distribution formats per version [3.3]. The community also agreed on which properties were to be mandatory, recommended, and optional, and detailed statistics needed to enable data reuse. The community profile was published as a W3C Interest Group Note in May 2015 [3.4] and subsequently described in [3.5].
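The three-level distinction described above (an abstract dataset, its versions, and the distributions of each version) can be sketched as a small data model. This is an illustrative sketch only, not the RDF vocabulary of the profile [3.4]; the class names, the version labels and the example.org URLs are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Distribution:
    """One concrete file format of a particular dataset version."""
    media_type: str
    download_url: str


@dataclass
class DatasetVersion:
    """A specific release of a dataset, with its available distributions."""
    version: str
    distributions: List[Distribution] = field(default_factory=list)


@dataclass
class Dataset:
    """The abstract dataset, independent of any version or file format."""
    title: str
    versions: List[DatasetVersion] = field(default_factory=list)

    def get_version(self, version: str) -> DatasetVersion:
        """Look up one release by its version label."""
        for candidate in self.versions:
            if candidate.version == version:
                return candidate
        raise KeyError(f"{self.title} has no version {version!r}")


# Hypothetical description: version labels and URLs are placeholders
chembl = Dataset(
    title="ChEMBL",
    versions=[
        DatasetVersion("1", [Distribution("text/turtle", "https://example.org/chembl/1.ttl")]),
        DatasetVersion("2", [
            Distribution("text/turtle", "https://example.org/chembl/2.ttl"),
            Distribution("application/rdf+xml", "https://example.org/chembl/2.rdf"),
        ]),
    ],
)

formats = [d.media_type for d in chembl.get_version("2").distributions]
print(formats)
```

A reference can then point at the dataset as a whole, at one version, or at one distribution, which is what allows a platform to report exactly which release and format of a source it ingested.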
To support the adoption of the community profile, Dr Gray drew on his earlier experience of adoption in the Open PHACTS project to provide concrete examples from a real-world dataset that could easily be adapted for other datasets. Additionally, Dr Gray’s team at HWU developed a validation tool, deployed by the W3C (https://www.w3.org/2015/03/ShExValidata/), that supported verifying the correctness of dataset descriptions against the profile [3.3]. The tool lets users provide their dataset description and then choose the level of conformance to validate against. A crucial aspect of the tool was providing meaningful, contextualised error messages when a dataset description deviated from the community profile.
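Validation against a chosen conformance level, with contextualised error messages, can be sketched as follows. This is not the deployed validation tool, which works on RDF dataset descriptions; the dictionary representation, the property names and their assignment to levels are hypothetical and do not reproduce the property lists of the profile.

```python
from typing import Dict, List

# Illustrative requirement levels only; NOT the profile's actual property lists
REQUIREMENTS: Dict[str, List[str]] = {
    "mandatory": ["title", "description"],
    "recommended": ["license", "version"],
}
LEVELS = ["mandatory", "recommended"]  # ordered from least to most strict


def validate(description: Dict[str, str], level: str) -> List[str]:
    """Check a description against every tier up to and including `level`
    and return one contextualised message per missing property."""
    if level not in LEVELS:
        raise ValueError(f"unknown conformance level: {level!r}")
    name = description.get("title", "<untitled>")
    errors = []
    for tier in LEVELS[: LEVELS.index(level) + 1]:
        for prop in REQUIREMENTS[tier]:
            if not description.get(prop):
                errors.append(
                    f"{tier} property '{prop}' is missing "
                    f"from the description of '{name}'"
                )
    return errors


# Hypothetical dataset description
desc = {"title": "ChEMBL", "description": "Bioactivity database"}

print(validate(desc, "mandatory"))
print(validate(desc, "recommended"))
```

The same description passes at the lower level and fails at the stricter one, and each message names both the missing property and the dataset it belongs to.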
Dr Gray, with his collaborator Prof Dumontier, fed the outcomes of the HCLS community profile work into the development of the FAIR Data Principles [3.1], in particular shaping principles F1-3, A1 and A2, I1-3, and R1 and its sub-clauses. Together with a wider team of collaborators, we demonstrated the ability to make data available following the FAIR Data Principles and the W3C HCLS Community Profile [3.6].
3. References to the research
[3.1] Wilkinson, MD, Dumontier, M, Aalbersberg, IJ, Appleton, G, Axton, M, Baak, A, Blomberg, N, Boiten, J-W, da Silva Santos, LB, Bourne, PE, Bouwman, J, Brookes, AJ, Clark, T, Crosas, M, Dillo, I, Dumon, O, Edmunds, S, Evelo, CT, Finkers, R, Gonzalez-Beltran, A, Gray, AJG, Groth, P, Goble, CA, Grethe, JS, Heringa, J, 't Hoen, PAC, Hooft, R, Kuhn, T, Kok, R, Kok, J, Lusher, SJ, Martone, M, Mons, A, Packer, AL, Persson, B, Rocca-Serra, P, Roos, M, van Schaik, R, Sansone, S-A, Schultes, E, Sengstag, T, Slater, T, Strawn, G, Swertz, MA, Thompson, M, van der Lei, J, van Mulligen, E, Velterop, J, Waagmeester, A, Wittenburg, P, Wolstencroft, K, Zhao, J & Mons, B 2016, 'The FAIR Guiding Principles for scientific data management and stewardship', Scientific Data, vol. 3, 160018. https://doi.org/10.1038/sdata.2016.18
[3.2] Groth, P, Loizou, A, Gray, AJG, Goble, C, Harland, L & Pettifer, S 2014, 'API-centric linked data integration: the Open PHACTS discovery platform case study', Journal of Web Semantics, vol. 29, pp. 12-18. https://doi.org/10.1016/j.websem.2014.03.003
[3.3] Hansen, JB, Beveridge, A, Farmer, R, Gehrmann, L, Gray, AJG, Khutan, S, Robertson, T & Val, J 2015, 'Validata: An online tool for testing RDF data conformance', Paper presented at 8th International Conference on Semantic Web Applications and Tools for Life Sciences 2015, Cambridge, United Kingdom, 7/12/15 - 10/12/15.
[3.4] Gray, AJG (ed.), Baran, J (ed.), Marshall, MS (ed.), Dumontier, M (ed.), Alexiev, V, Ansell, P, Bader, G, Bando, A, Bolleman, JT, Callahan, A, Cruz-Toledo, J, Gaudet, P, Gombocz, EA, Gonzalez-Beltran, A, Groth, P, Haendel, M, Ito, M, Jupp, S, Juty, N, Katayama, T, Kobayashi, N, Krishnaswami, K, Laibe, C, Le Novère, N, Lin, S, Malone, J, Miller, M, Mungall, CJ, Rietveld, L, Wimalaratne, SM & Yamaguchi, A 2015, Dataset Descriptions: HCLS Community Profile. W3C Interest Group Note, World Wide Web Consortium. < https://www.w3.org/TR/hcls-dataset/>
[3.5] Dumontier, M, Gray, AJG, Marshall, MS, Alexiev, V, Ansell, P, Bader, G, Baran, J, Bolleman, JT, Callahan, A, Cruz-Toledo, J, Gaudet, P, Gombocz, EA, Gonzalez-Beltran, A, Groth, P, Haendel, M, Ito, M, Jupp, S, Juty, N, Katayama, T, Kobayashi, N, Krishnaswami, K, Laibe, C, Le Novère, N, Lin, S, Malone, J, Miller, M, Mungall, CJ, Rietveld, L, Wimalaratne, SM & Yamaguchi, A 2016, 'The health care and life sciences community profile for dataset descriptions', PeerJ, vol. 4, e2331. https://doi.org/10.7717/peerj.2331
[3.6] Wilkinson, MD, Verborgh, R, Olavo Bonino da Silva Santos, L, Clark, T, Swertz, MA, Kelpin, FDL, Gray, AJG, Schultes, EA, van Mulligen, EM, Ciccarese, P, Kuzniar, A, Gavai, A, Thompson, M, Kaliyaperumal, R, Bolleman, JT & Dumontier, M 2017, 'Interoperability and FAIRness through a novel combination of Web technologies', PeerJ Computer Science, vol. 3, e110. https://doi.org/10.7717/peerj-cs.110
4. Details of the impact
The increasing use of computers to support researchers in gathering data, processing and analysing data, and publishing data and research results has led to a step-change in the way research is conducted. It was vitally important to pharmaceutical companies that the Open PHACTS system could provide provenance on the returned query answers, to state where the data originated (ChEMBL, Drugbank, UniProt, etc) and which version of the dataset was used. Dr Gray developed the Open PHACTS Dataset Description to provide the needed metadata about the data consumed, including important properties such as stating the version and format of the ingested data. This allowed specific provenance information to be returned on the platform’s query answers and increased trust in the analysis resulting from the data.
Building on the Open PHACTS Dataset Descriptions, subsequent research in the period 2013-2015 led to the FAIR Data Principles, published in March 2016, which set out desirable criteria to enable the discovery, retrieval, understanding, and reuse of data associated with research, particularly that funded by public bodies such as UK Research & Innovation (UKRI), the European Research Council (ERC), and the National Institutes of Health (NIH) [5.1].
The FAIR Data Principles built on Dr Gray’s work on dataset descriptions, particularly with respect to the definitions of principles F1, F2, F3, A1, A2, I1, I3, R1.1, R1.2, R1.3. Dr Gray collaborated in the development of the FAIR Principles and engaged in activities to publicise them and to train people to FAIRify their data. The FAIR Data Principles were endorsed by the G20 Leaders’ Communique of the Hangzhou Summit, September 2016, which stated:
“We support effort to promote voluntary knowledge diffusion and technology transfer on mutually agreed terms and conditions. Consistent with this approach, we support appropriate efforts to promote open science and facilitate appropriate access to publicly funded research results on findable, accessible, interoperable and reusable (FAIR) principles”. [5.2]
The Principles have subsequently led to interest within industry and academia in further exploiting data that has previously been collected, either internally by companies or publicly by academia. This has been particularly the case within the health care and life sciences community, where pharmaceutical companies have launched or funded initiatives to retroactively make existing datasets comply with the FAIR Data Principles so that they can be more readily reused and exploited.
The W3C HCLS Dataset Description Profile enables datasets to meet FAIR principle R1.3 and has been adopted internally in major pharmaceutical companies, including AstraZeneca, as a means to make their internal data more discoverable and reusable by a wider set of research labs across the world. AstraZeneca’s Director, Oncology Translational Medicine, Data Strategy Lead stated,
“We recognised the costs associated with continual curation and reshaping of data as new questions arise beyond the original collection intent and have found the alignment and implementation of the FAIR principles as a way to solve this challenge”, and, “we found the use of the DCAT standard and the W3C HCLC recommendations to be critical to implementing the FAIR data set management”. [5.3]
The profile has been deployed within major data repositories including the European Bioinformatics Institute’s (EBI) RDF Platform, the Swiss Institute of Bioinformatics (SIB), and the Japanese RIKEN MetaDatabase portal for life sciences data. At the EBI, the profile was used to automate the data ingestion pipeline for their RDF platform. The approach allowed them to perform various quality control checks on the metadata, which improved the quality of the data and also saved time [5.4].
An independent study recognised that implementing the FAIR Data Principles can improve Biopharma Research and Development (R&D) productivity and is an enabler for the digital transformation of Biopharma R&D. The study went on to highlight the impact for one company that had implemented a FAIR platform for 3,000 users across three main sites: “since running the FAIR platform for 2 months, the company collected usage activity data based on click counts per user. This FAIR platform had 900,000 page views in 60 days. The projection for the year gave an estimation of ∼5.5M page views. A very conservative assumption that each of these FAIR-enhanced views saved ∼5 s, by providing better search results with direct access to the target repository, led to a calculation of ∼3.5 full-time employees (FTEs) worth of time saved per year” [5.5].
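The arithmetic behind the quoted estimate can be retraced with a short sketch. The page-view count, the observation window, and the seconds saved per view come from the quotation above; the number of working hours in an FTE-year is not given in the source, so the value used here (2,080 hours, i.e. 52 weeks of 40 hours) is an assumption, and the result shifts with it.

```python
# Figures quoted from the study [5.5].
PAGE_VIEWS_OBSERVED = 900_000
OBSERVATION_DAYS = 60
SECONDS_SAVED_PER_VIEW = 5

# Assumption, not stated in the source: working hours in one FTE-year.
HOURS_PER_FTE_YEAR = 2_080  # 52 weeks x 40 hours


def projected_annual_views(views: int, days: int, days_per_year: int = 365) -> float:
    """Linearly extrapolate the observed page views to a full year."""
    return views / days * days_per_year


def fte_saved(annual_views: float, seconds_per_view: float,
              hours_per_fte_year: float) -> float:
    """Convert the total time saved per year into full-time equivalents."""
    hours_saved = annual_views * seconds_per_view / 3600
    return hours_saved / hours_per_fte_year


annual_views = projected_annual_views(PAGE_VIEWS_OBSERVED, OBSERVATION_DAYS)
ftes = fte_saved(annual_views, SECONDS_SAVED_PER_VIEW, HOURS_PER_FTE_YEAR)

print(f"Projected annual page views: {annual_views:,.0f}")
print(f"Estimated FTEs saved per year: {ftes:.2f}")
```

Under this assumption the projection is about 5.5 million page views and a saving of a little over 3.5 FTEs per year, in line with the figures quoted in the study; a longer assumed working year brings the result closer to 3.5.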
On 14 May 2020, in Boston, the Pistoia Alliance, a global, not-for-profit alliance that works to lower barriers to innovation in life sciences R&D, launched a freely accessible toolkit to help companies implement the FAIR (Findable, Accessible, Interoperable, Reusable) guiding principles for data management and stewardship. Collated by experts in the field, the toolkit contains numerous method tools, training and change management, as well as use cases, allowing organizations to learn from industry successes. The Alliance recognised that, as the life sciences industry continues to digitize, the FAIR guiding principles would help organizations realise their digital transformation.
“At Roche, we know that implementing the FAIR principles can be difficult for biotech and pharma organizations of every size, so we are very pleased to lead on this project and help make the process easier,” commented the Principal Scientist at Roche. “The toolkit will help to smooth the path to greater data sharing within and between industries, which is critical to future research efforts. We see the FAIR guiding principles as a worthy goal, and one which will help the industry realize the value of technologies like deep learning.” [5.6]
5. Sources to corroborate the impact
[5.1] National Institutes of Health (NIH), New Models of Data Stewardship – endorsement of FAIR Data Principles https://commonfund.nih.gov/commons/awardees
[5.2] G20 Leaders’ Communique, Hangzhou Summit, September 2016, point 12 – endorsing the FAIR Data Principles https://ec.europa.eu/commission/presscorner/detail/en/STATEMENT_16_2967
[5.3] Letter from AstraZeneca’s Director, Oncology Translational Medicine, Data Strategy Lead – confirming use of FAIR Data Principles.
[5.4] Head of Molecular Archival Resources, European Bioinformatics Institute (EMBL-EBI), who can be contacted to confirm implementation and impact of adopting the FAIR Data Principles.
[5.5] John Wise, Alexandra Grebe de Barron, Andrea Splendiani, Beeta Balali-Mood, Drashtti Vasant, Eric Little, Gaspare Mellino, Ian Harrow, Ian Smith, Jan Taubert, Kees van Bochove, Martin Romacker, Peter Walgemoed, Rafael C. Jimenez, Rainer Winnenburg, Tom Plasterer, Vibhor Gupta, Victoria Hedley, 2019. Implementation and relevance of FAIR data principles in biopharmaceutical R&D, Drug Discovery Today, Volume 24, Issue 4.
[5.6] The Pistoia Alliance announcement – Launch of toolkit to accelerate implementation of FAIR Data Principles https://www.pistoiaalliance.org/news/fair-toolkit-launch/