Reliable computing service in massive-scale systems through rapid low-cost failover
- Submitting institution
- The University of Lancaster
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 154337822
- Type
- D - Journal article
- DOI
- 10.1109/TSC.2016.2544313
- Title of journal
- IEEE Transactions on Services Computing
- Article number
- -
- First page
- 969
- Volume
- 10
- Issue
- 6
- ISSN
- 1939-1374
- Open access status
- Out of scope for open access requirements
- Month of publication
- March
- Year of publication
- 2016
- URL
-
-
- Supplementary information
-
-
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 7
- Research group(s)
- D - Distributed Systems
- Citation count
- 5
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- This research, conducted in collaboration with industry, designed and implemented a new form of user-transparent failover for massive-scale production Cloud datacentres, using soft-state inference to recover rapidly from correlated software and hardware failures. Evaluated in real-world systems at scale, the work has been integrated into Alibaba’s production datacentres underpinning its core services (Taobao, Alipay, etc.), which are used by hundreds of millions of users. The paper was published in the special issue on Security and Dependability of Cloud Systems and Services of IEEE Transactions on Services Computing, a top IEEE journal in Computer Science (Software Engineering).
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -