Learning URI selection criteria to improve the crawling of linked open data
- Submitting institution
- University of Greenwich
- Unit of assessment
- 11 - Computer Science and Informatics
- Output identifier
- 30839
- Type
- E - Conference contribution
- DOI
- 10.1007/978-3-030-21348-0_13
- Title of conference / published proceedings
- The Semantic Web: 16th International Conference / Proceedings. Lecture Notes in Computer Science
- First page
- 194
- Volume
- 11503
- Issue
- -
- ISSN
- 0302-9743
- Open access status
- Deposit exception
- Month of publication
- May
- Year of publication
- 2019
- URL
- -
- Supplementary information
- -
- Request cross-referral to
- -
- Output has been delayed by COVID-19
- No
- COVID-19 affected output statement
- -
- Forensic science
- No
- Criminology
- No
- Interdisciplinary
- No
- Number of additional authors
- 1
- Research group(s)
- -
- Citation count
- 0
- Proposed double-weighted
- No
- Reserve for an output with double weighting
- No
- Additional information
- Crawling RDF data is essential for harvesting the enormous Linked Open Data Web. This paper pioneers the use of the online learning algorithm FTRL (Follow-the-Regularised-Leader), in place of heuristic rules, to predict the RDF relevance of web resources. With the prediction model designed in this work, the performance (throughput and coverage) of an RDF data crawler improves significantly. In 2019 the acceptance rate of ESWC was 29.1%. This paper received a best paper award at the conference.
- Author contribution statement
- -
- Non-English
- No
- English abstract
- -