SpEnD: Linked Data SPARQL Endpoints Discovery Using Search Engines
Date
2017
Publisher
IEICE - Institute of Electronics, Information and Communication Engineers
Abstract
Linked data endpoints are online query gateways to semantically annotated linked data sources. The SPARQL query language is the standard for querying these data sources. Although a linked data endpoint (i.e., a SPARQL endpoint) is a basic Web service, it provides a platform for federated online querying and data linking methods. For linked data consumers, SPARQL endpoint availability and discovery are crucial for live querying and semantic information retrieval. Current studies show that the availability of linked datasets is very low, while the locations of linked data endpoints change frequently. There are linked data repositories that collect and list the available linked data endpoints or resources. It is observed that around half of the endpoints listed in existing repositories are not accessible (temporarily or permanently offline). These endpoint URLs are shared through repository websites such as Datahub.io; however, they are weakly maintained and revised only by their publishers. In this study, a novel metacrawling method is proposed for discovering and monitoring linked data sources on the Web. We implemented the method in a prototype system named SPARQL Endpoints Discovery (SpEnD). SpEnD starts with a "search keyword" discovery process for finding relevant keywords for the linked data domain and specifically SPARQL endpoints. Then, the collected search keywords are utilized to find linked data sources via popular search engines (Google, Bing, Yahoo, Yandex). By using this method, most of the currently listed SPARQL endpoints in existing endpoint repositories, as well as a significant number of new SPARQL endpoints, have been discovered. We analyze our findings in comparison to the Datahub collection in detail.
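The abstract does not say how a candidate URL returned by a search engine is confirmed to be a SPARQL endpoint. As a minimal sketch of one plausible check, and not the authors' implementation, the Python below sends a trivial ASK query as an HTTP GET parameter (as the SPARQL 1.1 Protocol allows) and accepts the URL only if the response carries a SPARQL results media type. The names build_probe_url, looks_like_sparql_response, filter_endpoints, and the injected fetch callable are hypothetical and introduced only for illustration.

```python
from urllib.parse import urlencode, urlsplit

# Trivial query that any SPARQL endpoint should be able to answer.
PROBE_QUERY = "ASK { ?s ?p ?o }"

# Standard media types for SPARQL ASK/SELECT results.
SPARQL_RESULT_TYPES = (
    "application/sparql-results+json",
    "application/sparql-results+xml",
)


def build_probe_url(candidate_url):
    """Append the probe query as a `query` GET parameter."""
    separator = "&" if urlsplit(candidate_url).query else "?"
    return candidate_url + separator + urlencode({"query": PROBE_QUERY})


def looks_like_sparql_response(status, content_type):
    """Assumed heuristic: HTTP 200 plus a SPARQL results media type."""
    media_type = content_type.split(";")[0].strip().lower()
    return status == 200 and media_type in SPARQL_RESULT_TYPES


def filter_endpoints(candidates, fetch):
    """Keep the candidate URLs that answer the probe like a SPARQL endpoint.

    `fetch(url)` is a hypothetical callable returning (status, content_type);
    it is injected so the sketch can be exercised without network access.
    """
    found = []
    for url in candidates:
        try:
            status, content_type = fetch(build_probe_url(url))
        except OSError:
            continue  # unreachable candidates are treated as offline
        if looks_like_sparql_response(status, content_type):
            found.append(url)
    return found
```

In practice, fetch could wrap an HTTP client that sends an Accept header listing the media types above. Running the same filter periodically over known endpoints would give a rough availability monitor, again under the assumptions stated here.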
Description
Yumusak, Semih/0000-0002-8878-4991; Vandenbussche, Pierre-Yves/0000-0003-0591-6109; Dogdu, Erdogan/0000-0001-5987-0164; Kodaz, Halife/0000-0001-8602-4262
Keywords
Linked Data, Semantic Web, SPARQL Endpoint, Endpoint Discovery, Metasearch, Knowledge Graph
WoS Q
Q4
Scopus Q
Q4

OpenCitations Citation Count
16
Source
8th Forum on Data Engineering and Information Management (DEIM) -- MAR, 2016 -- Fukuoka, JAPAN
Volume
E100D
Issue
4
Start Page
758
End Page
767
PlumX Metrics
Citations
CrossRef : 1
Scopus : 20
Captures
Mendeley Readers : 30
SCOPUS™ Citations
20
checked on Nov 25, 2025
Web of Science™ Citations
16
checked on Nov 25, 2025
OpenAlex FWCI
6.71780446
Sustainable Development Goals
3: GOOD HEALTH AND WELL-BEING
7: AFFORDABLE AND CLEAN ENERGY
11: SUSTAINABLE CITIES AND COMMUNITIES
12: RESPONSIBLE CONSUMPTION AND PRODUCTION
13: CLIMATE ACTION
15: LIFE ON LAND
17: PARTNERSHIPS FOR THE GOALS