DSL Reference
Complete reference for the py3plex DSL (Domain-Specific Language) for querying multilayer networks.
Note
For task-oriented usage, see How to Query Multilayer Graphs with the SQL-like DSL. This page is a complete reference with all syntax and operators.
Overview
The py3plex DSL provides two interfaces:
String syntax: SQL-like queries for quick exploration
Builder API: Type-safe Python interface for production code (autocomputes referenced metrics)
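As a quick orientation, the same query can be written in either interface. The sketch below assumes a multilayer network object network with a "friends" layer:
from py3plex.dsl import execute_query, Q, L
# String syntax: quick, ad-hoc exploration
hubs = execute_query(network, 'SELECT nodes FROM layer="friends" WHERE degree > 3')
# Builder API: the same query, composed programmatically
hubs = Q.nodes().from_layers(L["friends"]).where(degree__gt=3).execute(network)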
String Syntax Reference
Basic Structure
SELECT <target> [FROM <layers>] [WHERE <conditions>] [COMPUTE <metrics>] [ORDER BY <field>] [LIMIT <n>]
[AT <timestamp> | DURING <start> TO <end>]
Use AT for a single time point and DURING for closed intervals (ISO 8601 strings).
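Putting the pieces together, a single query can combine layer selection, filtering, metric computation, ordering, limiting, and an optional temporal clause. The sketch below follows the clause order in the grammar above and assumes network carries temporal edge attributes (see Temporal Queries below):
result = execute_query(
    network,
    'SELECT nodes FROM layer="friends" '
    'WHERE degree > 3 '
    'COMPUTE pagerank '
    'ORDER BY -pagerank '
    'LIMIT 5 '
    'AT "2024-01-15T10:00:00"'
)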
Targets
nodes — Select nodes
edges — Select edges (experimental; limited coverage support)
Layer Selection
FROM layer="layer_name"
FROM layers IN ("layer1", "layer2")
If FROM is omitted, all layers are considered.
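For example, restricting a query to two (hypothetical) layers:
result = execute_query(
    network,
    'SELECT nodes FROM layers IN ("friends", "work") WHERE degree > 2'
)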
Conditions
Operators:
= — Equal
> — Greater than
< — Less than
>= — Greater than or equal
<= — Less than or equal
!= — Not equal
Logical operators:
AND — Both conditions must be true
OR — Either condition must be true
NOT — Negate a condition
Examples:
WHERE degree > 5
WHERE layer="friends" AND degree > 3
WHERE degree > 5 OR betweenness_centrality > 0.1
WHERE NOT layer="spam"
String values must be wrapped in double quotes.
Compute Clause
Calculate metrics for selected nodes:
COMPUTE degree
COMPUTE degree betweenness_centrality
COMPUTE clustering pagerank
Available metrics:
degree — Node degree
betweenness_centrality — Betweenness centrality
closeness_centrality — Closeness centrality
clustering — Clustering coefficient
pagerank — PageRank score
layer_count — Number of layers a node appears in
Metrics are computed after filtering and layer selection. See Algorithm Roadmap for the complete metric list.
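Computed values travel with the result; a small sketch (see Working with Results below for the result layout):
result = execute_query(
    network,
    'SELECT nodes WHERE degree > 2 COMPUTE clustering pagerank'
)
print(result.attributes["pagerank"][:5])  # aligned with result.nodes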
Order By
ORDER BY degree
ORDER BY -degree # Descending (prefix with -)
ORDER BY betweenness_centrality
You can specify multiple keys in sequence; each key may be prefixed with - for descending order.
Limit
LIMIT 10
LIMIT 100
Complete Examples
from py3plex.dsl import execute_query
# Get high-degree nodes
result = execute_query(
network,
'SELECT nodes WHERE degree > 5'
)
# Get nodes from specific layer
result = execute_query(
network,
'SELECT nodes FROM layer="friends" '
'WHERE degree > 3 '
'COMPUTE betweenness_centrality '
'ORDER BY -betweenness_centrality '
'LIMIT 10'
)
Builder API Reference
Import
from py3plex.dsl import Q, L
Query Construction
Start a query:
Q.nodes() # Select nodes
Q.edges() # Select edges (experimental)
Layer Selection
Single layer:
Q.nodes().from_layers(L["friends"])
Multiple layers (union):
Q.nodes().from_layers(L["friends"] + L["work"])
# Or use the new LayerSet algebra:
Q.nodes().from_layers(L["friends | work"])
Layer intersection:
Q.nodes().from_layers(L["friends"] & L["work"])
Advanced Layer Set Algebra:
# All layers except coupling
Q.nodes().from_layers(L["* - coupling"])
# Complex expressions with set operations
Q.nodes().from_layers(L["(social | work) & ~bots"])
# Named groups for reuse
from py3plex.dsl import LayerSet
LayerSet.define_group("bio", LayerSet("ppi") | LayerSet("gene"))
Q.nodes().from_layers(LayerSet("bio"))
See also
For complete documentation on layer set algebra including all operators, string parsing, named groups, and real-world examples, see: Layer Set Algebra
Filtering
Comparison operators:
Q.nodes().where(degree__gt=5) # Greater than
Q.nodes().where(degree__gte=5) # Greater than or equal
Q.nodes().where(degree__lt=5) # Less than
Q.nodes().where(degree__lte=5) # Less than or equal
Q.nodes().where(degree__eq=5) # Equal
Q.nodes().where(degree__ne=5) # Not equal
Multiple conditions:
Q.nodes().where(
degree__gt=5,
layer_count__gte=2
)
Computing Metrics
Q.nodes().compute("degree")
Q.nodes().compute("degree", "betweenness_centrality")
Row-wise Transformations (Mutate)
Create new columns or transform existing ones with row-by-row operations:
# Simple transformation
Q.nodes().compute("degree").mutate(
doubled=lambda row: row.get("degree", 0) * 2
)
# Multiple transformations
Q.nodes().compute("degree", "clustering").mutate(
hub_score=lambda row: row.get("degree", 0) * row.get("clustering", 0),
is_hub=lambda row: row.get("degree", 0) > 2
)
# Conditional transformation
Q.nodes().compute("degree").mutate(
category=lambda row: "hub" if row.get("degree", 0) > 3 else "peripheral"
)
The lambda function receives a dictionary with all computed attributes and network properties
for each node/edge. Use row.get(attr_name, default) to safely access attributes.
Note
Use mutate() for row-by-row transformations. For group-level aggregations,
use summarize() or aggregate() instead.
Sorting
Q.nodes().order_by("degree") # Ascending
Q.nodes().order_by("-degree") # Descending
Limiting
Q.nodes().limit(10)
Execution
result = Q.nodes().execute(network)
Chaining
All methods can be chained:
result = (
Q.nodes()
.from_layers(L["friends"])
.where(degree__gt=5)
.compute("betweenness_centrality", "degree")
.mutate(
influence=lambda row: row.get("degree", 0) * row.get("betweenness_centrality", 0)
)
.order_by("-influence")
.limit(10)
.execute(network)
)
Temporal Queries
Filter by Time Point
# String syntax
result = execute_query(
network,
'SELECT nodes AT "2024-01-15T10:00:00"'
)
# Builder API
result = (
Q.nodes()
.at("2024-01-15T10:00:00")
.execute(network)
)
Filter by Time Range
# String syntax
result = execute_query(
network,
'SELECT nodes DURING "2024-01-01" TO "2024-01-31"'
)
# Builder API
result = (
Q.nodes()
.during("2024-01-01", "2024-01-31")
.execute(network)
)
Temporal Edge Attributes
Edges can have temporal attributes:
t — Point in time (ISO 8601 timestamp)
t_start and t_end — Time range
See Working with Networks for creating temporal networks.
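Schematically, the attribute dictionaries attached to temporal edges look like this (an illustration of the attribute names above, not a construction API):
{"t": "2024-01-15T10:00:00"}                      # single point in time
{"t_start": "2024-01-01", "t_end": "2024-01-31"}  # closed time range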
Grouping and Coverage Queries
Per-Layer Grouping
Group results by layer and apply per-group operations:
# Group by layer
result = (
Q.nodes()
.from_layers(L["*"])
.compute("degree")
.per_layer() # Sugar for .group_by("layer")
.top_k(5, "degree") # Top 5 per layer
.end_grouping()
.execute(network)
)
Top-K Per Group
Select top-k items per group (requires prior grouping):
# Top 10 highest-degree nodes per layer
result = (
Q.nodes()
.from_layers(L["*"])
.compute("degree", "betweenness_centrality")
.per_layer()
.top_k(10, "degree")
.end_grouping()
.execute(network)
)
Coverage Filtering
Filter based on presence across groups:
Mode: “all” — Keep items appearing in ALL groups (intersection)
# Nodes that are top-5 hubs in ALL layers
multi_hubs = (
Q.nodes()
.from_layers(L["*"])
.compute("betweenness_centrality")
.per_layer()
.top_k(5, "betweenness_centrality")
.end_grouping()
.coverage(mode="all")
.execute(network)
)
Mode: “any” — Keep items appearing in AT LEAST ONE group (union)
# Nodes that are top-5 in any layer
any_hubs = (
Q.nodes()
.from_layers(L["*"])
.compute("degree")
.per_layer()
.top_k(5, "degree")
.end_grouping()
.coverage(mode="any")
.execute(network)
)
Mode: “at_least” — Keep items appearing in at least K groups
# Nodes in top-10 of at least 2 layers
two_layer_hubs = (
Q.nodes()
.from_layers(L["*"])
.compute("degree")
.per_layer()
.top_k(10, "degree")
.end_grouping()
.coverage(mode="at_least", k=2)
.execute(network)
)
Mode: “exact” — Keep items appearing in exactly K groups
# Layer specialists: top-5 in exactly 1 layer
specialists = (
Q.nodes()
.from_layers(L["*"])
.compute("betweenness_centrality")
.per_layer()
.top_k(5, "betweenness_centrality")
.end_grouping()
.coverage(mode="exact", k=1)
.execute(network)
)
Wildcard Layer Selection
Use L["*"] to select all layers:
# All layers
Q.nodes().from_layers(L["*"])
# All layers except "bots"
Q.nodes().from_layers(L["*"] - L["bots"])
# Layer algebra still works
Q.nodes().from_layers((L["*"] - L["spam"]) & L["verified"])
General Grouping
Group by arbitrary attributes (not just layer):
# Group by multiple attributes
result = (
Q.nodes()
.compute("degree", "community")
.group_by("layer", "community")
.top_k(3, "degree")
.end_grouping()
.execute(network)
)
Limitations
Coverage filtering is currently supported only for node queries
Edge queries with coverage will raise a clear DslExecutionError
Grouping requires computed attributes or inherent node properties (like layer)
Explaining Results
The .explain() method adds interpretable explanations to query results or displays execution plans.
Execution Plan Mode
Call .explain() with no arguments to get the query execution plan:
plan = Q.nodes().compute("degree").where(degree__gt=5).explain()
print(plan) # Shows query stages and optimization
This mode does NOT execute the query or attach explanations.
Explanations Mode
Call .explain() with arguments to attach explanations to each result row:
result = (
Q.nodes()
.compute("pagerank")
.explain(
include=["community", "top_neighbors", "attribution"],
attribution={"metric": "pagerank", "seed": 42}
)
.execute(network)
)
# Access explanations
df = result.to_pandas(expand_explanations=True)
# df now has explanation columns with JSON-serialized dicts
Available Explanation Blocks
"community"— Community membership and size"top_neighbors"— Top neighbors by weight/degree"layer_footprint"— Layers where node/edge appears"attribution"— Shapley-based attribution explanations (see below)
Default blocks: ["community", "top_neighbors", "layer_footprint"]
Attribution Block
The "attribution" block provides Shapley value-based explanations for why nodes/edges have high metric values or rankings.
Purpose: Decompose metric contributions across layers and/or edges using game-theoretic attribution.
Basic Example:
result = (
Q.nodes()
.compute("pagerank")
.explain(
include=["attribution"],
attribution={
"metric": "pagerank",
"levels": ["layer"],
"method": "shapley_mc",
"seed": 42
}
)
.execute(network)
)
Configuration Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| metric | str or None | Auto-infer | Which computed metric to explain (required if multiple metrics) |
| objective | str | "value" | Attribution objective: "value" (metric value) or "rank" (ranking position) |
| levels | List[str] | ["layer"] | Attribution levels: "layer" and/or "edge" |
| method | str | "shapley_mc" | Attribution method |
| feature_space | str | "layers" | Feature space used for attribution |
| n_permutations | int | 128 | Monte Carlo sample count (≥16) |
| max_exact_features | int | 8 | Switch from exact to MC Shapley at this threshold |
| seed | int or None | None | Random seed for determinism (strongly recommended) |
| edge_scope | str | "incident" | Edge candidate scope ("incident" or "ego_k_hop") |
| k_hop | int | 2 | Ego network radius (for edge_scope="ego_k_hop") |
| max_edges | int | 40 | Maximum candidate edges to consider |
|  | int | 10 | Number of top layer contributions to return |
|  | int | 20 | Number of top edge contributions to return |
|  | bool | True | Include negative contributions |
|  | bool | True | Cache subset computations for performance |
| uq | str | "off" | UQ mode: "off", "propagate", or "summarize_only" |
|  | float | 0.95 | Confidence interval level for UQ propagation |
Output Structure:
{
"metric": "pagerank",
"objective": "value",
"utility_def": None, # or "margin_to_cutoff(k=10)" for rank
"levels": ["layer"],
"method": "shapley_mc",
"seed": 42,
"n_permutations": 128,
"feature_space": "layers",
"full_value": 0.1186,
"baseline_value": 0.0500,
"delta": 0.0686,
"residual": 1e-12, # sum(phi) ≈ delta
"layer_contrib": [
{"layer": "social", "phi": 0.0401},
{"layer": "work", "phi": 0.0285}
],
"edge_contrib": [], # populated if levels includes "edge"
"warnings": [],
"cache_hit_rate": 0.73
}
Advanced Examples:
# Edge attribution for betweenness
result = (
Q.nodes()
.compute("betweenness_centrality")
.order_by("-betweenness_centrality")
.limit(10)
.explain(
include=["attribution"],
attribution={
"objective": "rank", # Explain ranking position
"levels": ["layer", "edge"],
"edge_scope": "ego_k_hop",
"k_hop": 2,
"max_edges": 40,
"n_permutations": 128,
"seed": 42
}
)
.execute(network)
)
# With UQ propagation
result = (
Q.nodes()
.uq(method="perturbation", n_samples=30, seed=42)
.compute("pagerank")
.explain(
include=["attribution"],
attribution={
"metric": "pagerank",
"uq": "propagate", # Compute attribution per UQ replicate
"levels": ["layer"],
"seed": 42
}
)
.execute(network)
)
Determinism:
Setting seed ensures reproducible Shapley values across runs
Same seed produces identical attributions regardless of parallel execution settings
Different seeds produce statistically different but valid attributions
Performance Notes:
Exact Shapley: Only feasible for ≤ max_exact_features layers (default 8)
Monte Carlo Shapley: Scales to larger feature sets via sampling
Edge Attribution: More expensive than layer attribution; bounded by max_edges
Caching: Enabled by default to reuse subset metric computations
UQ Integration:
uq="off": No UQ (default), deterministic scalar Shapley valuesuq="propagate": Compute attribution per UQ replicate, aggregate mean/std/CIuq="summarize_only": Compute once on base network, wrap in UQ-like structure
Export:
Attribution data serializes to JSON strings when using expand_explanations=True:
df = result.to_pandas(expand_explanations=True)
# df["attribution"] contains JSON-serialized attribution dicts
Working with Results
Result Object
execute returns a QueryResult with items, computed attributes, and metadata:
result = Q.nodes().compute("degree").execute(network)
# Counts and identifiers
print(result.count) # -> number of nodes/edges
print(result.nodes[:3]) # first few nodes (use .edges for edge queries)
# Attributes are aligned lists keyed by metric name
degrees = result.attributes["degree"]
for node, deg in zip(result.nodes, degrees):
print(node, deg)
# Execution metadata (e.g., grouping, ordering)
print(result.meta)
Convert to Pandas
df = result.to_pandas(multiindex=True, include_grouping=True)
print(df.head())
Set expand_uncertainty=True to unpack UQ-aware metrics into multiple columns.
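For example (only meaningful if the query included a .uq(...) stage):
df_uq = result.to_pandas(expand_uncertainty=True)
print(df_uq.columns)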
Extensibility
Custom Operators
Register custom DSL operators:
from py3plex.dsl import register_operator
@register_operator('my_metric')
def my_custom_metric(context, node):
"""Compute custom metric for a node."""
# Your implementation
return value
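Once registered, the custom name can be referenced like a built-in metric (an assumption mirroring the COMPUTE usage shown earlier):
result = Q.nodes().compute("my_metric").execute(network)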
See Architecture and Design for plugin development.
Performance Considerations
Compute metrics once: Don’t recompute in multiple queries
Filter early: Use WHERE before COMPUTE
Limit results: Use LIMIT for large networks
Layer-specific: Query single layers when possible
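A sketch that follows these guidelines, restricting the query to a single layer, filtering before computing an expensive metric, and capping the output:
result = (
    Q.nodes()
    .from_layers(L["friends"])            # layer-specific
    .where(degree__gt=5)                  # filter early
    .compute("betweenness_centrality")    # compute once; reuse the result object
    .order_by("-betweenness_centrality")
    .limit(100)                           # bound the output size
    .execute(network)
)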
Next Steps
Learn by doing: How to Query Multilayer Graphs with the SQL-like DSL
See examples: Examples & Recipes
Understand implementation: Architecture and Design