Wikidata - Largest Crowdsourced Open Data Knowledge Graph

Kartik Arora (~kartik53)


Description:

Wikidata - Largest Crowdsourced Knowledge Graph - Open Data

Wikidata is one of the many sister projects of the Wikimedia Foundation.

What is a Knowledge Graph

Knowledge Base would be the more generic term, but since we tend to use a graph structure, the two terms are often used interchangeably. (Google, for instance, calls its own knowledge base the Knowledge Graph.) When we refer to these, the following concepts are what we have in mind:

  • Some sort of formalization of how we represent our data (Ontology!)
  • Data, in terms of entities, events, relationships or any other formalization defined
  • Some functions, for maintenance, cleaning and freshness
  • Some engine, a platform that provides these functions and runs them on the data we have

Some examples of these knowledge bases:

  • Wikidata
  • Google's Knowledge Graph
  • Freebase
  • Microsoft's Satori
  • Wolfram Alpha

Wikidata

Wikidata is a document-oriented database, focused on items. Each item represents a topic (or an administrative page used to maintain Wikipedia) and is identified by a unique number, prefixed with the letter Q — for example, the item for the topic Douglas Adams is Q42 — known as a "QID". This enables the basic information required to identify the topic the item covers to be translated without favouring any language.
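To make the QID idea concrete, here is a minimal Python sketch. It assumes a QID is the letter Q followed by a positive integer, and the helper names (`is_qid`, `item_url`, `entity_data_url`) are made up for this example; they are not part of any Wikidata library.

```python
import re

# Assumption: a QID is "Q" followed by a positive integer, e.g. "Q42".
QID_PATTERN = re.compile(r"Q[1-9][0-9]*")


def is_qid(value: str) -> bool:
    """Return True if value looks like a Wikidata item identifier."""
    return QID_PATTERN.fullmatch(value) is not None


def item_url(qid: str) -> str:
    """Build the human-readable page URL for an item."""
    if not is_qid(qid):
        raise ValueError(f"not a QID: {qid!r}")
    return f"https://www.wikidata.org/wiki/{qid}"


def entity_data_url(qid: str) -> str:
    """Build the URL that serves the item's data as JSON."""
    if not is_qid(qid):
        raise ValueError(f"not a QID: {qid!r}")
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


# Douglas Adams is Q42, as mentioned above.
print(item_url("Q42"))
print(entity_data_url("Q42"))
```

Because the identifier carries no language, the same URL works regardless of which language the reader prefers for labels.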


As of the last update on this page, Wikidata has 54,905,015 data items! If you want to get started yourself and learn more about the how, the what and even the why behind Wikidata, I would strongly recommend the Wikidata Tour, which is maintained by the community to help new people get started with Wikidata.

Can we query it?

SPARQL. I mean, yes. Yes, you can query Wikidata using SPARQL.

SPARQL = An RDF Query Language

You could go as far as saying it's like SQL for data that follows the RDF specifications.

Now what is RDF? RDF stands for Resource Description Framework. It is basically "Subject - Predicate - Object" triples.
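To get a feel for triples, here is a small Python sketch that stores a few of them and matches patterns against them, which is roughly what a SPARQL query asks the engine to do. This is an illustration only, not how Wikidata stores data. `Q42`, `P31`, `P26` and `Q5` are real Wikidata identifiers (Douglas Adams, instance of, spouse, human); `Q_PERSON_A` and `Q_PERSON_B` are hypothetical placeholders.

```python
from typing import List, NamedTuple, Optional


class Triple(NamedTuple):
    subject: str
    predicate: str
    obj: str


TRIPLES = [
    Triple("Q42", "P31", "Q5"),                 # Douglas Adams is an instance of human
    Triple("Q_PERSON_A", "P31", "Q5"),          # hypothetical person A is a human
    Triple("Q_PERSON_A", "P26", "Q_PERSON_B"),  # hypothetical: A's spouse is B
]


def match(
    triples: List[Triple],
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> List[Triple]:
    """Return the triples matching the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.obj == obj)
    ]


# "Which subjects are humans?" is the pattern (?, P31, Q5).
humans = [t.subject for t in match(TRIPLES, predicate="P31", obj="Q5")]
print(humans)
```

A wildcard in the sketch plays the role of a `?variable` in SPARQL.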

Wikidata provides a beautiful tool for querying, known as the Wikidata Query Service.

Let's build a query now.

  • For the first query, let's simply get a count of people that have a spouse listed (i.e. ever married).

SELECT (COUNT(?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5.      # the item is an instance of (P31) human (Q5)
  ?item wdt:P26 ?spouse.    # the item has a spouse (P26) listed
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
}
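The same query can also be sent from Python. The sketch below uses only the standard library and assumes the public SPARQL endpoint at `https://query.wikidata.org/sparql` with JSON results. The function names and the User-Agent string are made up for this example, and the label service line is left out because a plain count returns no labels. The network call is wrapped in a function and not run automatically, since a count over all humans can be slow.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the public Wikidata Query Service SPARQL endpoint.
ENDPOINT = "https://query.wikidata.org/sparql"

COUNT_QUERY = """
SELECT (COUNT(?item) AS ?count) WHERE {
  ?item wdt:P31 wd:Q5.
  ?item wdt:P26 ?spouse.
}
"""


def build_url(query: str) -> str:
    """Encode the SPARQL query into a GET URL that asks for JSON results."""
    params = urllib.parse.urlencode({"query": query, "format": "json"})
    return f"{ENDPOINT}?{params}"


def parse_count(results: dict) -> int:
    """Pull the ?count value out of a SPARQL JSON results document."""
    return int(results["results"]["bindings"][0]["count"]["value"])


def run_count_query(query: str = COUNT_QUERY, timeout: float = 60.0) -> int:
    """Send the query over the network and return the count."""
    request = urllib.request.Request(
        build_url(query),
        # Hypothetical User-Agent; replace it with one that identifies you.
        headers={"User-Agent": "wikidata-talk-example/0.1"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_count(json.load(response))


# Usage (needs network access):
#     print(run_count_query())
```

Keeping URL building and result parsing separate from the network call makes each piece easy to try out on its own.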

You can do a lot with these queries; here is a list of example queries. A must-visit page of the internet.

A live query building session to follow

Query for nuclear plants set up per country in the last 60 years

Thanks and further communication

Thanks for going through this material, hope I was able to help in some form or way.

Any help, whether updating this page or something broader, would be highly appreciated. The best places to communicate would be, in order:

Prerequisites:

It would be great if participants are aware of NoSQL-style data storage techniques and best practices. Any knowledge of other knowledge graphs would add to your experience.

Speaker Info:

Hi All, I am Kartik and am very glad to introduce myself to your esteemed community.


Just after graduating in 2015, I worked for the Bing Search Engine team for a couple of years, and since then I have been working on my own venture here in Noida. We work as a product house that also provides specialized services if and when needed, in the domain of Web Technologies and IoT.

I am an aspiring developer advocate, DIY enthusiast, a little extrovert and also a flexible opinionated guy.

Speaker Links:

Most of my writeups are shared here:

https://karx.github.io

Social profiles:

Section: Data Science, Machine Learning and AI
Type: Talks
Target Audience: Intermediate
Last Updated: