Shortcuts: WD:RAQ, w.wiki/LX

Wikidata:Request a query

From Wikidata
Jump to navigation Jump to search

Request a query
Fishing in the Wikidata river requires both an idea where to look for fish and a suitable fishing method. If you have the former, this page can help you find the latter.

This is a page where SPARQL 1.1 Query Language (Q32146616) queries can be requested. Please provide feedback if a query is written for you.

For sample queries, see Examples and Help:Dataset sizing. Property talk pages include also summary queries for these.

For help writing your own queries, or other questions about queries, see Wikidata talk:SPARQL query service/queries and Wikidata:SPARQL query service/query optimization.

Help resources about Wikidata Query Service (Q20950365) and SPARQL: Wikidata:SPARQL query service/Wikidata Query Help and Category:SPARQL.

To report an issue about the Query Service (interface, results views, export...) please see Wikidata:Contact the development team/Query Service and search.
On this page, old discussions are archived. An overview of all archives can be found at this page's archive index. The current archive is located at 2024/07.

VALUES slows down query hugely

[edit]
SELECT DISTINCT ?item WHERE {

  VALUES ?classes {
    wd:Q1785071  # fort
    wd:Q56344492 # artillery battery
  }

  ?item wdt:P31/wdt:P279* ?classes.

  ?item wdt:P17 wd:Q145.
}
Try it!

Takes 30 s, while

SELECT DISTINCT ?item WHERE {

  VALUES ?classes {
    wd:Q1785071  # fort
  #  wd:Q56344492 # artillery battery
  }

  ?item wdt:P31/wdt:P279* ?classes.

  ?item wdt:P17 wd:Q145.
}
Try it!

Takes 0.7 s. Obviously for this trivial case I could use UNION, but in my real case I want to parameterise with a VALUES array. Vicarage (talk) 21:54, 25 June 2024 (UTC)[]

Some sort of optimisation fail, I fear. Oddly, implementng VALUES in a named subquery seems to get over the problem.
--Tagishsimon (talk) 01:54, 26 June 2024 (UTC)[]

Making progress, but have 2 issues.

1) I'd like to only do the distance check if ?range1 were declared

2) I get a "bad aggregate" when trying to combine ?type1 and ?type2 preferentially

@Vicarage: The aggretation issue is sorted ut in the query above. The final GROUP BY compliments the initial SELECT, so ?itemLabel was required. Not sure where you're going with range: ?range1 is always BOUND in the first, %i, named subquery b/c it is being set by VALUES. The initial SELECT does not select ?range (or ?range1) so you're not getting the range from the lower query. And then, ?range 1 does not seem to be used in the second query at all, and all of the ?items found in the second query will have a ?range b/c they're being found by the wikibase:around function. So. Have a think, let me know. --Tagishsimon (talk) 13:29, 26 June 2024 (UTC)[]
@Tagishsimon: Thanks for the aggregate solution. I think I'm trying to be too clever, writing a generic back end that can filter on combinations of classes, distances and properties provided in a short source file, with the logic that if the filter values are mentioned, use them, otherwise skip the check entirely, to find items of many different types that are related to an item (geographically, because of subclass, because they are linked by a property of one item that is in the other's database) This requires the programatic 'IF ?thing not blank use ?thing in this check, otherwise skip the check' which doesn't seem available in the language. I think I will need to retreat to a wider range of separate back end queries. Vicarage (talk) 13:54, 26 June 2024 (UTC)[]
@Vicarage: BIND(IF(BOUND(?this),?this,?that) as ?whatever), or BIND(IF(!BOUND(?something_else),?this,?that) as ?whatever) or BIND(IF(?this<30),?this,?that) as ?whatever) sounds like what you may be after ... so you're not deciding whether or not to do the distance calculation, but rather deciding based on some condition whether to use the result? --Tagishsimon (talk) 21:56, 26 June 2024 (UTC)[]
Yes, but I can't do the distance calculation if the item doesn't have a P625, or check against a list of properties if the list is blank. Each time I come up with a generic solution it either won't accept nulls, or runs unreasonably slowly. Its very frustrating that a few dozen results can time out. Vicarage (talk) 22:17, 26 June 2024 (UTC)[]
For example, this, which uses 2 techniques you showed me, times out

Vicarage (talk) 22:25, 26 June 2024 (UTC)[]

@Vicarage: SPARQL will be the death of us. I made two changes: uncommented ?classes2 in the top query b/c otherwise that variable is unbound in the second query which would mean it was asking for a P31/P279* of everything; and then added a runfirst hint to the code which fetches ?loc1. Now 1.2 seconds runtime.
I stripped down the query and then added clauses to find out when it slowed down. I played with the runfirst b/c the sooner the number of possible values of a variable such as ?item can be restricted, the better. The query is making two calls to the label service, which is probably a bad idea. There may be scope for further hints down the line; particularly hint:Prior hint:gearing "forward". after each P31/P279* clause, within their {braces}, so the optimiser works from the ?item to the ?classes1 and ?classes2 rather than from ?classes1 to the ?item.
SELECT DISTINCT ?item ?type (SAMPLE (DISTINCT ?sta) as ?subtypeLabel) WITH {
  SELECT ?classes1 ?classes2 #?range ?props
  WHERE {
    VALUES ?classes1 {
      wd:Q1785071  # fort
    }
    VALUES ?classes2 {
      wd:Q91122    # bunker
    }
    VALUES ?range {"15"}
} } AS %i
WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en-GB,en,fr,de,es,pt,pl,nl,cs". }
  INCLUDE %i
  wd:Q1180981 wdt:P17 ?country. 
  ?item wdt:P17 ?country.
  {?item wdt:P31/wdt:P279* ?classes1 .} UNION 
  {?item wdt:P31/wdt:P279* ?classes2 .}
  { {wd:Q1180981 wdt:P625 ?loc1.} UNION 
    {wd:Q1180981 wdt:P276 ?loc. 
     ?loc wdt:P625 ?loc1} }  hint:Prior hint:runFirst true.
  SERVICE wikibase:around {
    ?item wdt:P625 ?loc2.
    bd:serviceParam wikibase:center ?loc1.
    bd:serviceParam wikibase:radius "15".
    bd:serviceParam wikibase:distance ?dist.
  } 
  MINUS {VALUES ?item {wd:Q1180981} } # itself
  OPTIONAL {?item wdt:P1448 ?labellist. FILTER (lang(?labellist) = "mul")} # official name
  {SERVICE wikibase:label {bd:serviceParam wikibase:language "en" . ?item rdfs:label ?label1} }
}
    #GROUP by ?item ?starts ?ends ?dist ?label1 ?type
    GROUP by ?item ?type ?label1
Try it!
--Tagishsimon (talk) 22:56, 26 June 2024 (UTC)[]
Argh, how could it be so dumb to not find a location before using it! Each time I prepare example code I strip out all the variants and hints I tried out, so commented out sections get through. It still slows if you multiply up the number of ?classes1 entries, but with judicial pruning, I've managed to get the Dreadnought which defeated us last week to under a minute. Vicarage (talk) 06:35, 27 June 2024 (UTC)[]

Hi! I was trying to build a query to obtain works (p31 = Q3331189) with an author (P50) whose nationality is argentinian (P27 = Q414). I believe i should use UNION..but im kind of lost, maybe with FIlTER? I appreciate any help :) Mauricio V. Genta (talk) 16:43, 26 June 2024 (UTC)[]

@Mauricio V. Genta: Right now, you'd be wanting something like this. Come back & tell me if you want the query to head in any other direction.
SELECT ?author ?authorLabel ?edition ?editionLabel WHERE { 

  ?author wdt:P27 wd:Q414.       # Argentinian author
  ?edition wdt:P50 ?author .     # is an author of something
  ?edition wdt:P31 wd:Q3331189 . # something is an edition, translation &c.
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],es,en". } 
}
Try it!
--Tagishsimon (talk) 21:48, 26 June 2024 (UTC)[]
Thanks! It was quite simple, i have to overthink less. Mauricio V. Genta (talk) 03:22, 27 June 2024 (UTC)[]

Please can we have a query to find people with a value for IPNI author ID (P586), but no article on Wikispecies? Andy Mabbett (Pigsonthewing); Talk to Andy; Andy's edits 19:49, 27 June 2024 (UTC)[]

@Pigsonthewing:
SELECT ?item ?itemLabel ?IPNI WHERE {   
  ?item wdt:P586 ?IPNI .
  filter not exists { ?article schema:about ?item ;
                                schema:isPartOf <https://species.wikimedia.org/> . }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". } 
}
Try it!
--Tagishsimon (talk) 20:26, 27 June 2024 (UTC)[]

Hi there! I'm trying to look for people & their birth details, like time & place. As the results for that query are of course too big, and times out, I tried adding some extra "filters", like being females or having an image. But even then, it's still failing. Any ideas to divide this into parts that I can later join to get the whole set? Here's a current query example (that include people article's URL for Spanish):

PREFIX schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
 SELECT
 DISTINCT
        ?item ?itemLabel ?itemDescription ?wdimage # WHAT
        ?Sdate ?SdatePrecision # WHEN
        ?where ?whereLabel # WHERE
        ?articleUrl
 WHERE {
 ?item wdt:P31 wd:Q5.
 ?item wdt:P21 wd:Q6581072. # female
 ?item wdt:P18 ?wdimage.
 ?item wdt:P19 ?where.
 ?item p:P569/psv:P569 ?SdateNode.
   ?SdateNode wikibase:timeValue ?Sdate.
   ?SdateNode wikibase:timePrecision ?SdatePrecision.
 OPTIONAL { ?articleUrl schema:about ?item .
?articleUrl schema:inLanguage "es" .
?articleUrl schema:isPartOf <https://es.wikipedia.org/> . }
SERVICE wikibase:label { bd:serviceParam wikibase:language "es" }
 }
Try it!
Pruna.ar (talk) 21:56, 29 June 2024 (UTC)[]
@Pruna.ar: Blazegraph, Wikidata's current report engine, has a slice service allowing the user to iterate through complete sets of triples, using an offset and limit pair of parameters to identify where in the set index to start, and how many triples to go through. So for your query, something like the below would be the way to go. You'll have to experiment with the two parameters; you can remove other clauses which you may have introduced only to cut down the number of qualifying items.
PREFIX schema: <http://schema.org/>
PREFIX wikibase: <http://wikiba.se/ontology#>
PREFIX wd: <http://www.wikidata.org/entity/>
PREFIX wdt: <http://www.wikidata.org/prop/direct/>
 SELECT
 DISTINCT
        ?item ?itemLabel ?itemDescription ?wdimage # WHAT
        ?Sdate ?SdatePrecision # WHEN
        ?where ?whereLabel # WHERE
        ?articleUrl
 WHERE {
   
 SERVICE bd:slice {
        ?item p:P569 [].
    bd:serviceParam bd:slice.offset 0 . # Start at item number (not to be confused with QID)
    bd:serviceParam bd:slice.limit 100000 . # List this many items
  }  
 
 ?item wdt:P31 wd:Q5.
 ?item wdt:P21 wd:Q6581072. # female
 ?item wdt:P18 ?wdimage.
 ?item wdt:P19 ?where.
 ?item p:P569/psv:P569 ?SdateNode.
   ?SdateNode wikibase:timeValue ?Sdate.
   ?SdateNode wikibase:timePrecision ?SdatePrecision.
 OPTIONAL { ?articleUrl schema:about ?item .
?articleUrl schema:inLanguage "es" .
?articleUrl schema:isPartOf <https://es.wikipedia.org/> . }
SERVICE wikibase:label { bd:serviceParam wikibase:language "es" }
 }
Try it!
--Tagishsimon (talk) 23:08, 29 June 2024 (UTC)[]

Do you have any suggestions how to speed this up, which given a list of 500 British castles tries to find out their heritage register properties. If I restrict the registers to UK only, it runs in 25s, but I ideally I want the register to be international, like ‎Burgenwelt ID (P12823)

SELECT DISTINCT ?item ?itemLabel ?type 
  (CONCAT("+",STR(COALESCE(?pointintime,?starttime))) AS ?start)
  ?description ?notes ?link ?title ?keywords WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en-GB,en,fr,de,es,pt,pl,nl,cs". }
  { 
    SELECT DISTINCT ?item ?type ?pointintime ?starttime ?description ?notes ?link ?title ?keywords WHERE {
# places that need tripadvisor entries to be notable
VALUES ?notable {wd:Q23413 wd:Q16748868} # castle, city walls

?item wdt:P31/wdt:P279* ?notable.
?item wdt:P3134 ?tripadvisor.

?item wdt:P17 wd:Q145.

VALUES ?registerclass {
 wd:Q18618628 # cultural heritage
  wd:Q19829908 # places
  wd:Q22964288 # military
  wd:Q23779665 # maritime
  wd:Q74568206 # archives
}
      {
          ?registerclass ^wdt:P31 ?register.

         ?register wikibase:claim ?claim .
          ?register wikibase:statementProperty ?value.

          ?item ?claim ?stat .
          ?stat ?value ?entry .
          OPTIONAL {
          ?register wdt:P2378 ?issuer.
          ?issuer rdfs:label ?issuerlabel. FILTER (LANG(?issuerlabel) = "en")
          ?register wdt:P1630 ?URLprefix.
                  {?stat pq:P1435 ?caveat.} # heritage designation
            UNION {?stat pq:P518  ?caveat}  # applies to part
            UNION {?stat pq:P1810 ?caveat}  # subject named as
            ?caveat rdfs:label ?caveatlabel1. FILTER (LANG(?caveatlabel1) = "en")
          }
          BIND (IF(BOUND(?caveatlabel1),?caveatlabel1,"entry") AS ?caveatlabel2)

          BIND(CONCAT("[[d:Special:EntityPage/",STR(?ISSUERLABEL),"|",STR(?ISSUERLABEL),"]] [[:Template:Website]]") AS ?description)
        }
    }
  }
}
Try it!

Vicarage (talk) 09:13, 2 July 2024 (UTC)[]

@Vicarage: This, for instance. A runfirst hint on tripadvisor b/c there are only ~28k uses of that property, so constraining the query to those items makes the rest of it smaller. Forward gearing hint on the P31/P279 property path b/c it's probably more sensible to work from subject to object than the other way around (although in fairness I've not checked: may make little difference).
SELECT DISTINCT ?item ?itemLabel ?type 
  (CONCAT("+",STR(COALESCE(?pointintime,?starttime))) AS ?start)
  ?description ?notes ?link ?title ?keywords WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en-GB,en,fr,de,es,pt,pl,nl,cs". }
  { 
    SELECT DISTINCT ?item ?type ?pointintime ?starttime ?description ?notes ?link ?title ?keywords WHERE {
# places that need tripadvisor entries to be notable
   VALUES ?notable {wd:Q23413 wd:Q16748868} # castle, city walls

  ?item wdt:P3134 ?tripadvisor. hint:Prior hint:runFirst true.
  ?item wdt:P17 wd:Q145.

  ?item wdt:P31/wdt:P279* ?notable. hint:Prior hint:gearing "forward".

VALUES ?registerclass {
 wd:Q18618628 # cultural heritage
  wd:Q19829908 # places
  wd:Q22964288 # military
  wd:Q23779665 # maritime
  wd:Q74568206 # archives
}
      {
          ?registerclass ^wdt:P31 ?register.

         ?register wikibase:claim ?claim .
          ?register wikibase:statementProperty ?value.

          ?item ?claim ?stat .
          ?stat ?value ?entry .
          OPTIONAL {
          ?register wdt:P2378 ?issuer.
          ?issuer rdfs:label ?issuerlabel. FILTER (LANG(?issuerlabel) = "en")
          ?register wdt:P1630 ?URLprefix.
                  {?stat pq:P1435 ?caveat.} # heritage designation
            UNION {?stat pq:P518  ?caveat}  # applies to part
            UNION {?stat pq:P1810 ?caveat}  # subject named as
            ?caveat rdfs:label ?caveatlabel1. FILTER (LANG(?caveatlabel1) = "en")
          }
          BIND (IF(BOUND(?caveatlabel1),?caveatlabel1,"entry") AS ?caveatlabel2)

          BIND(CONCAT("[[d:Special:EntityPage/",STR(?ISSUERLABEL),"|",STR(?ISSUERLABEL),"]] [[:Template:Website]]") AS ?description)
        }
    }
  }
}
Try it!
--Tagishsimon (talk) 11:28, 2 July 2024 (UTC)[]

Hello! I'm trying to get the population of all the city-municipalities in Baden-Württenberg. The code is fine, but it runs out of time every single time. Can it be optimized somehow?

SELECT DISTINCT ?germany ?germanyLabel ?pop WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?germany wdt:P31 wd:Q42744322.
  ?germany wdt:P1082 ?pop.
  ?germany wdt:P131* wd:Q985.
}
Try it!

Theklan (talk) 20:55, 2 July 2024 (UTC)[]

@Theklan: A hint sorts it out. The hint tells the optimiser to work from the subject (?germany) to the object (wd:Q985), rather than from object to subject. The set of values for ?germany is quite small, so quite quick to work subject to object. The set of items that are P131* wd:Q985 is very large, so slow.
SELECT DISTINCT ?germany ?germanyLabel ?pop WHERE {
  SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],en". }
  ?germany wdt:P31 wd:Q42744322.
  ?germany wdt:P1082 ?pop.
  ?germany wdt:P131* wd:Q985. hint:Prior hint:gearing "forward".
}
Try it!
--Tagishsimon (talk) 22:02, 2 July 2024 (UTC)[]
Thanks! Theklan (talk) 07:02, 3 July 2024 (UTC)[]

Hey folks. I tried to build my own query for my Pywikibot bot but I can't get it to work to what I want, so I need some help.

I got a query working to get television series episode (Q21191270) that doesn't have Trakt.tv ID (P8013) but the related television series (Q5398426) has one.

SELECT DISTINCT ?episode ?episodeLabel ?episodeTraktId ?series ?seriesLabel ?seriesTraktId
WHERE {  
  
  ?episode p:P31 ?instanceOf.
  ?instanceOf (ps:P31/(wdt:P279*)) wd:Q21191270.
  
  MINUS {
    {
        ?episode p:P8013 ?episodeTraktId.
        ?episodeTraktId (ps:P8013) _:anyValueP8013.
    }
  }
  
  ?episode wdt:P179 ?series.
  ?series wdt:P8013 ?seriesTraktId.
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". }
} LIMIT 10
Try it!

Now, I want to invert it, so I get television series (Q5398426) that doesn't have Trakt.tv ID (P8013) but the television series episode (Q21191270) within has one. I can't get it to work.

I tried to inverse the filter so it filters the television series (Q5398426) that doesn't have Trakt.tv ID (P8013) instead, and I can't get it to work. I tried to get television series episode (Q21191270) from television series (Q5398426) but I can't figure it out.

In the end, I want to do the following.

  1. Get all television series (Q5398426), television series season (Q3464665), and television series episode (Q21191270) by fetching television series (Q5398426) and its decendants, possibly using has part(s) (P527).
  2. Filter so that whats left is those that has at least one of the three has Trakt.tv ID (P8013) but not all three.
  3. Optional: Only show the television series (Q5398426) for feeding to the Pywikibot bot.
  4. Optional: Check additional entities at once such as Rotten Tomatoes ID (P1258) (for feeding only, so this would be a separate query, while leaving queries for each property as is so I can preview it)

I think that's it for now, and I appreciate the help. Thanks! Hans5958 (talk) 13:42, 7 July 2024 (UTC)[]

@Hans5958: Seems like your inverse is this:
SELECT DISTINCT ?episode ?episodeLabel ?episodeTraktId ?series ?seriesLabel ?seriesTraktId
WHERE {  
  
  ?episode p:P31 ?instanceOf.
  ?instanceOf ps:P31/wdt:P279* wd:Q21191270. 
  ?episode p:P8013 ?episodestat .
  ?episodestat ps:P8013 ?episodeTraktId .
  ?episode wdt:P179 ?series.
  MINUS { 
           ?series wdt:P8013 ?seriesTraktId.
        }
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". }
} LIMIT 10
Try it!
And then the one you want - which I understand to be, has one out of the three possible IDs, is in this direction ... you can cut down the elements in the SELECT to those fields you want to feed to pywikibot; you can add other IDs by optional clauses ... OPTIONAL {?edisode wdt:P1258 RT_ID.} .... &c. You can also BIND(COALESCE(ID1,ID2,ID3) as ?ID) if you want to present the singular ID in a singular field to Pywikibot.
It looks to me like the data is a bit dodgy; consider the series https://www.wikidata.org/wiki/Q237072#P8013 which seems to have TraktIds for two series_parts. And the query right now would give a row where the episode and series have no ID, but one of the series_parts (television series season) does have an ID. So, possibly some more work to do. Happy to play more, once you have seen this.
I've also moved it to look at wdt: rather than p: ps: because I expect all the values we're dealing with will be truthy.
Finally, there's more optimisation that could be done by way of selecting rows without labels first, and selecting labels second, so avoid timeouts which sporadically happen.
SELECT DISTINCT ?episode ?episodeLabel ?episodeTraktId ?series ?seriesLabel ?seriesTraktId  ?series_partTraktId
WHERE {  
  
  ?episode wdt:P31/wdt:P279* wd:Q21191270. 
  OPTIONAL {?episode wdt:P8013 ?episodeTraktId . }
  ?episode wdt:P179 ?series.
  OPTIONAL {?series wdt:P8013 ?seriesTraktId. 

  OPTIONAL {?series wdt:P527 ?series_part .
  ?series_part wdt:P8013 ?series_partTrakID . }
  }
   FILTER(!BOUND(?episodeTraktId) && !BOUND(?seriesTraktId)|| 
          !BOUND(?episodeTraktId) && !BOUND(?series_partTraktId) || 
          !BOUND(?seriesTraktId) && !BOUND(?series_partTraktId) && BOUND(?series_part)
         )
  
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en,[AUTO_LANGUAGE]". }
} LIMIT 10
Try it!
--Tagishsimon (talk) 15:48, 7 July 2024 (UTC)[]
@Tagishsimon: Thanks for the help. The inverse query seemed to work as I expect, so that's one.
For the second one, I have to be honest that I found it quite difficult to explain it at that time. I think it is best for me to clarify my intentions.
My bot will fill the missing Trakt.tv ID (P8013) of television series season (Q3464665) (series), television series episode (Q21191270) (season), and television series (Q5398426) (episode) items. Since they are related to each other (most likely scenario: a season is part of a series, a episode is part of a season), the best approach in my opinion is to use the values of Trakt.tv ID (P8013) found in either of these three itmes. The bot will iterate each series using has part(s) (P527) to get the seasons and do the same on each seasons to get the episodes, so the bot only gets the list of series items, and nothing else.
I was thinking to do the following.
  1. Get all series and its decendants on two levels (season and episode items) with has part(s) (P527).
  2. Exclude any series when...
    1. ...the series, all of the seasons of the series, and all of the episodes of all of the season has Trakt.tv ID (P8013).
    2. ...the series, all of the seasons of the series, and all of the episodes of all of the season doesn't have Trakt.tv ID (P8013).
    You can say that for all (truthy) values of Trakt.tv ID (P8013)... (no statement = False), XOR all of it, and exclude if it is 0.
    Some examples:
  3. Only show television series (Q5398426) for feeding to the Pywikibot bot (The way I did it is just SELECT ?series as ?item but I said it here in case there are any better ideas)
For the example you gave on Q237072, I checked every seasons of it and I don't see any Trakt.tv ID (P8013) on it. In any case, this should be included by the query since the series has one.
I hope this clears any confusion. Please ask away if there is still one. Sorry for the hold up. Hans5958 (talk) 07:12, 8 July 2024 (UTC)[]

Hello,

I tried to make a graph of people related to some specific organisations (associations or less formal groups with member of (P463) and political parties with member of political party (P102)), which I will call the “first people” or people from the first criteria, as well as the relatives they have in common (with a few different family properties).
The explanations are probably quite confusing to read, there is a shorter FILTERing formula closer to the end of this post which makes the logic more understandable.
As there are hundreds of people in these groups, I want to filter on several criteria. I want people connected to at least two groups or parties, excepted if it is only to political parties and no group, as several people are only changing parties and it is not interesting to include them. I also want some of the relatives of these people; the relatives appearing should be either connected to at least two of the “first people” themselves directly connected to groups (even if these “first people” are only connected to one group or party and therefore excluded from the first criteria), or be connected to one person connected to groups + connected to one group or party themselves.

I first went with a UNION to get people linked to groups and parties, and got different COUNTs on each, which we will call ?cG and ?cP, respectively for the number of groups and the number of political parties the person is connected with.

For the relatives’ part, I went with a subquery reproducing the first UNION, added the relatives (I filtered on the superior than 0 to eliminate a NULL value otherwise appearing), and BINDed the ?relative variable on the ?item variable in order to UNION this section with the first part and counted the connections to them with ?cR. The further FILTER operations with the COUNTs need to be done outside of the UNIONs, as it will be their different sums which will include or not the people in the final request.

However I failed to properly write the UNIONs. I succeeded getting a result by having the two first COUNTs separated from the relatives’ one, but this way it does not group/merge the items in one, and the people appearing in both subqueries get two lines. This is now my main problem, all the rest can be ignored, I should be able to continue after solving this merging problem.

The current request is:

SELECT DISTINCT ?item ?itemLabel (count (?memberOf) AS ?cG) (count (?memberOfPP) AS ?cP) ?cR WHERE {  
  {  
    VALUES ?memberOf { a bunch of wd:Q } .  
    ?item wdt:P463 ?memberOf .  
  }  
  UNION  
  {  
    VALUES ?memberOfPP { another bunch of wd:Q } .  
    ?item wdt:P102 ?memberOfPP .  
  }  
  UNION  
  {  
    SELECT ?item ?cR WHERE {  
      {  
        SELECT ?relative (count (?relative) AS ?cR) WHERE {  
          SELECT DISTINCT ?item ?relative WHERE {  
            {  
              VALUES ?memberOf { a bunch of wd:Q again } .  
              ?item wdt:P463 ?memberOf .  
            }  
            UNION  
            {  
              VALUES ?memberOfPP { another bunch of wd:Q again } .  
              ?item wdt:P102 ?memberOfPP .  
            }  
            OPTIONAL { ?item (wdt:P22|wdt:P25|wdt:P26|wdt:P40|wdt:P451|wdt:P1038|wdt:P3373|wdt:P3448) ?relative }.  
          }  
        } GROUP BY ?relative HAVING (?cR > 0) }  
      BIND (?relative AS ?item).  
    }
  }
  SERVICE wikibase:label {bd:serviceParam wikibase:language "fr,en" }  
} GROUP BY ?item ?itemLabel ?cR
Try it!

I have not written the FILTERing conditions in order to better visualise the results. It should look like the following conditions, but it will be quite trivial to write so it is of no importance for the moment:
(?cG > 1) OR (?cR > 1) OR ((?cG + ?cR) > 1) OR (?cG > 0 AND (?cG + ?cP) > 1) OR (?cR > 0 AND (?cR + ?cP) > 1)

Also in the end, it will be a graph between people, the groups and political parties they are belonging to and some other people connecting groups with others. This will also be done once the conditions can be applied, by joining the result with simple OPTIONAL queries or another way. This will not be complicated to write, so I do not need help on this either.

If this query can be made, I will then add a last criteria/UNION consisting of adding the people excluded from the first criteria, but connecting to people included in the other main query. For example someone who is member of only one targeted group, but who has a relative themself only included in the main query as being a relative of 2+ people from the first UNION. In other words, someone appearing in one of the two first groups/parties’ UNIONs without fulfilling the whole criteria (so normally not in the relatives’ UNION), but who is the cause of people in the relatives’ UNION to appear.

Thanks in advance! --Bischnu (talk) 23:33, 7 July 2024 (UTC)[]