How to Reproduce Common Analysis Workflows
Goal: Use ready-made recipes for common multilayer network analysis tasks.
Prerequisites: Basic understanding of py3plex (see Quick Start Tutorial).
About examples: Code and outputs below use the bundled synthetic datasets unless noted. Replace file paths with your own data; metrics will differ accordingly.
Complete Workflows
This guide links to detailed recipes and case studies. For step-by-step implementations, see:
Recipes: Analysis Recipes & Workflows — Focused solutions for specific tasks
Case Studies: Use Cases & Case Studies — Complete end-to-end analyses
Examples: Examples & Recipes — Runnable code examples
Quick Recipe Index
Network Construction
Building from edge lists → How to Load and Build Networks
Converting from NetworkX → Analysis Recipes & Workflows (Recipe 1)
Loading temporal networks → API Documentation (Temporal section)
Statistical Analysis
Computing multilayer statistics → How to Compute Network Statistics
Comparing layers → Analysis Recipes & Workflows (Recipe 3)
Node versatility analysis → Analysis Recipes & Workflows (Recipe 4)
Community Detection
Multilayer Louvain → How to Run Community Detection on Multilayer Networks
Cross-layer community comparison → Analysis Recipes & Workflows (Recipe 5)
Community stability analysis → Use Cases & Case Studies (Case Study 2)
Network Embeddings
Node2Vec for link prediction → How to Run Random Walk Algorithms
Embedding-based clustering → Analysis Recipes & Workflows (Recipe 7)
Layer-specific embeddings → How to Run Random Walk Algorithms
Visualization
Publication-ready plots → How to Visualize Multilayer Networks
Interactive visualizations → How to Visualize Multilayer Networks
Layer comparison plots → Analysis Recipes & Workflows (Recipe 9)
Domain-Specific Workflows
Biological Networks
Multi-omics integration:
# See: user_guide/case_studies.rst - Biological Network Case Study
# 1. Integrate protein-protein + gene regulation + metabolic pathways
# 2. Find key regulators using multilayer centrality
# 3. Detect functional modules
# 4. Prioritize disease genes
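A hedged sketch of steps 1-2 using the DSL introduced later in this guide; the input path is hypothetical, and module detection and gene prioritization are left to the case study:
from py3plex.core import multinet
from py3plex.dsl import Q

# Hypothetical combined multiedgelist built from PPI, regulatory,
# and metabolic edge lists (one layer each)
network = multinet.multi_layer_network(directed=False)
network.load_network(
    "data/multi_omics.multiedgelist",  # substitute your own file
    input_type="multiedgelist"
)

# Rank candidate regulators by multilayer centrality
regulators = (
    Q.nodes()
    .compute("degree", "betweenness_centrality")
    .order_by("betweenness_centrality", reverse=True)
    .execute(network)
)
for (node, layer), data in list(regulators.items())[:10]:
    print(f"{node} ({layer}): betw={data['betweenness_centrality']:.4f}")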
See Use Cases & Case Studies for complete implementation.
Transportation Networks
Multimodal route analysis:
# See: examples/index.rst - Transportation Example
# 1. Model different transportation modes as layers
# 2. Add transfer connections between layers
# 3. Compute optimal multimodal routes
# 4. Identify critical transfer points
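A hedged sketch of steps 3-4, assuming a multilayer network already loaded with one transport mode per layer and (stop_id, mode) node keys as elsewhere in this guide; the stop IDs are hypothetical:
import networkx as nx

# Optimal multimodal route on the flattened multilayer graph
route = nx.shortest_path(
    network.core_network,
    source=("stop_a", "bus"),      # hypothetical (node, layer) keys
    target=("stop_b", "subway"),
)

# Critical transfer points: hops where the route switches layers
transfers = [
    (u, v) for u, v in zip(route, route[1:])
    if u[1] != v[1]  # layer component changes -> inter-modal transfer
]
print(f"Route: {len(route)} stops, transfers: {len(transfers)}")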
See Examples & Recipes for runnable code.
Social Networks
Multi-platform social analysis:
See Use Cases & Case Studies for complete implementation.
Config-Driven Workflows
Use configuration files for reproducibility:
# workflow_config.yaml
network:
  input_file: "data.multiedgelist"
  input_type: "multiedgelist"

analysis:
  - name: "statistics"
    metrics: ["degree", "betweenness_centrality"]
  - name: "community_detection"
    algorithm: "louvain"
    params:
      resolution: 1.0
  - name: "visualization"
    output: "network.png"
    layout: "force_directed"
Execute workflow:
from py3plex.workflows import execute_workflow
results = execute_workflow("workflow_config.yaml")
See Analysis Recipes & Workflows for complete config-driven workflow examples.
The config captures what to run (input format, metrics, algorithms, visualization) so you can reuse the same analysis across datasets by swapping only the file path.
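For example, one way to swap datasets programmatically is to rewrite only the input path and re-run; this sketch assumes nothing beyond execute_workflow(path) shown above and PyYAML:
import yaml
from py3plex.workflows import execute_workflow

with open("workflow_config.yaml") as f:
    config = yaml.safe_load(f)

# Point the same pipeline at a different (hypothetical) dataset
config["network"]["input_file"] = "other_data.multiedgelist"
with open("workflow_config_other.yaml", "w") as f:
    yaml.safe_dump(config, f)

results = execute_workflow("workflow_config_other.yaml")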
DSL-Driven Analysis Workflows
Goal: Use py3plex’s DSL to create reproducible, declarative analysis pipelines.
The DSL expresses analysis workflows as queries rather than imperative code, keeping them readable and reproducible. Q builds a query, L is a layer helper, and execute_query returns dictionaries keyed by (node_id, layer) tuples.
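For orientation, a minimal query and the shape of its result (metric values shown are illustrative):
from py3plex.dsl import Q, L

result = (
    Q.nodes()
    .from_layers(L["layer1"])
    .compute("degree")
    .execute(network)  # assumes a loaded network, as in the template below
)
# result is a plain dict:
# {("node3", "layer1"): {"degree": 5},
#  ("node7", "layer1"): {"degree": 9},
#  ...}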
Basic DSL Workflow Pattern
Template for DSL-first analysis: Keep the pipeline linear—load, query, refine, export—so each step can be repeated with different parameters.
from py3plex.core import multinet
from py3plex.dsl import Q, L, execute_query
import pandas as pd

# 1. Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
    "py3plex/datasets/_data/synthetic_multilayer.edges",
    input_type="multiedgelist"
)

# 2. Query and filter nodes with DSL
high_degree_nodes = (
    Q.nodes()
    .compute("degree", "betweenness_centrality")
    .where(degree__gt=5)
    .order_by("betweenness_centrality", reverse=True)
    .execute(network)
)  # dict keyed by (node_id, layer) -> metrics

# 3. Extract subnetwork
subgraph = network.core_network.subgraph(high_degree_nodes.keys())

# 4. Analyze subnetwork
print("High-degree subnetwork:")
print(f"  Nodes: {len(high_degree_nodes)}")
print(f"  Edges: {subgraph.number_of_edges()}")

# 5. Export results
df = pd.DataFrame([
    {
        'node': node[0],
        'layer': node[1],
        'degree': data['degree'],
        'betweenness': data['betweenness_centrality']
    }
    for node, data in high_degree_nodes.items()
])
df.to_csv('high_degree_analysis.csv', index=False)
Expected output (synthetic_multilayer sample):
High-degree subnetwork:
  Nodes: 25
  Edges: 89
Multilayer Exploration Workflow
Systematic multilayer network analysis:
from py3plex.core import multinet
from py3plex.dsl import Q, L
from collections import Counter, defaultdict
import pandas as pd

# Load network
network = multinet.multi_layer_network(directed=False)
network.load_network(
    "py3plex/datasets/_data/synthetic_multilayer.edges",
    input_type="multiedgelist"
)

print("MULTILAYER NETWORK EXPLORATION")
print("=" * 70)

# Step 1: Per-layer statistics
print("\n1. Per-Layer Statistics:")
layer_stats = []  # collected rows, ready for pd.DataFrame(layer_stats) if needed
for layer in network.get_layers():
    # Query layer nodes and edges
    layer_nodes = Q.nodes().from_layers(L[layer]).execute(network)
    layer_edges = Q.edges().from_layers(L[layer]).execute(network)
    # Compute per-layer degrees
    result = (
        Q.nodes()
        .from_layers(L[layer])
        .compute("degree")
        .execute(network)
    )
    avg_degree = sum(d['degree'] for d in result.values()) / len(result) if result else 0
    layer_stats.append({
        'layer': layer,
        'nodes': len(layer_nodes),
        'edges': len(layer_edges),
        'avg_degree': avg_degree
    })
    print(f"  {layer}: {len(layer_nodes)} nodes, {len(layer_edges)} edges, avg_degree={avg_degree:.2f}")

# Step 2: Find versatile nodes (present in multiple layers)
print("\n2. Versatile Nodes (multilayer presence):")
node_layer_count = Counter()
for node, layer in network.get_nodes():
    node_layer_count[node] += 1
versatile_nodes = {
    node: count for node, count in node_layer_count.items()
    if count >= 2
}
print(f"  Total versatile nodes: {len(versatile_nodes)}")
print("  Top 5 most versatile:")
for node, count in sorted(versatile_nodes.items(), key=lambda x: x[1], reverse=True)[:5]:
    print(f"    {node}: {count} layers")

# Step 3: Layer comparison
print("\n3. Layer Overlap Analysis:")
layers = list(network.get_layers())  # materialize so we can slice
for i, layer1 in enumerate(layers):
    for layer2 in layers[i + 1:]:
        nodes1 = set(n[0] for n in Q.nodes().from_layers(L[layer1]).execute(network).keys())
        nodes2 = set(n[0] for n in Q.nodes().from_layers(L[layer2]).execute(network).keys())
        overlap = nodes1 & nodes2
        jaccard = len(overlap) / len(nodes1 | nodes2) if (nodes1 | nodes2) else 0
        print(f"  {layer1} ∩ {layer2}: {len(overlap)} nodes, Jaccard={jaccard:.3f}")

# Step 4: Hub identification across layers
print("\n4. Cross-Layer Hub Nodes:")
all_metrics = (
    Q.nodes()
    .compute("degree", "betweenness_centrality")
    .where(degree__gt=7)
    .execute(network)
)
print(f"  Hub nodes (degree > 7): {len(all_metrics)}")

# Group hubs by base node ID
hub_layers = defaultdict(set)
for (node, layer), data in all_metrics.items():
    hub_layers[node].add(layer)
print(f"  Unique hub node IDs: {len(hub_layers)}")
print("  Top 5 hub nodes:")
for node, layer_set in sorted(hub_layers.items(), key=lambda x: len(x[1]), reverse=True)[:5]:
    print(f"    {node}: present in {len(layer_set)} layers - {sorted(layer_set)}")
Expected output:
MULTILAYER NETWORK EXPLORATION
======================================================================

1. Per-Layer Statistics:
  layer1: 40 nodes, 95 edges, avg_degree=4.75
  layer2: 40 nodes, 87 edges, avg_degree=4.35
  layer3: 40 nodes, 102 edges, avg_degree=5.10

2. Versatile Nodes (multilayer presence):
  Total versatile nodes: 35
  Top 5 most versatile:
    node7: 3 layers
    node12: 3 layers
    node3: 3 layers
    node15: 3 layers
    node1: 3 layers

3. Layer Overlap Analysis:
  layer1 ∩ layer2: 35 nodes, Jaccard=0.875
  layer1 ∩ layer3: 32 nodes, Jaccard=0.800
  layer2 ∩ layer3: 33 nodes, Jaccard=0.825

4. Cross-Layer Hub Nodes:
  Hub nodes (degree > 7): 18
  Unique hub node IDs: 12
  Top 5 hub nodes:
    node7: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node12: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node3: present in 3 layers - ['layer1', 'layer2', 'layer3']
    node15: present in 2 layers - ['layer1', 'layer3']
    node8: present in 2 layers - ['layer2', 'layer3']
The thresholds (degree > 7, overlap counts) match the bundled synthetic_multilayer dataset. Adjust them for sparser or denser graphs so averages and Jaccard scores remain meaningful.
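One way to derive the cutoff from the data rather than hard-coding it is to take a percentile of the degree distribution; a sketch using numpy:
import numpy as np
from py3plex.dsl import Q

# Hub cutoff = 90th percentile of degrees across all (node, layer) pairs
degrees = [
    d["degree"]
    for d in Q.nodes().compute("degree").execute(network).values()
]
threshold = float(np.percentile(degrees, 90))

hubs = (
    Q.nodes()
    .compute("degree")
    .where(degree__gt=threshold)
    .execute(network)
)
print(f"Degree threshold (90th percentile): {threshold:.1f}, hubs: {len(hubs)}")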
Community Detection + DSL Workflow
Combine community detection with DSL queries:
from py3plex.algorithms.community_detection.community_wrapper import louvain_communities
from py3plex.dsl import Q, execute_query
from collections import Counter

# Detect communities
communities = louvain_communities(network)

# Attach as node attributes
for (node, layer), comm_id in communities.items():
    network.core_network.nodes[(node, layer)]['community'] = comm_id

print("COMMUNITY-BASED ANALYSIS")
print("=" * 70)

# Query each community
community_ids = set(communities.values())
for comm_id in sorted(community_ids):
    # Use DSL to get community members
    comm_nodes = execute_query(
        network,
        f'SELECT nodes WHERE community={comm_id}'
    )
    members = comm_nodes.get("nodes", [])

    # Compute community metrics
    comm_result = (
        Q.nodes()
        .where(community=comm_id)
        .compute("degree", "betweenness_centrality")
        .execute(network)
    )

    # Statistics
    if comm_result:
        avg_degree = sum(d['degree'] for d in comm_result.values()) / len(comm_result)
        avg_betw = sum(d['betweenness_centrality'] for d in comm_result.values()) / len(comm_result)
    else:
        avg_degree = avg_betw = 0.0

    # Layer composition
    layer_counts = Counter(layer for _, layer in members)

    print(f"\nCommunity {comm_id}:")
    print(f"  Size: {len(members)} nodes")
    print(f"  Avg degree: {avg_degree:.2f}")
    print(f"  Avg betweenness: {avg_betw:.6f}")
    print(f"  Layer composition: {dict(layer_counts)}")
Notes: execute_query returns a dictionary; access community members via result["nodes"] as shown. If a community has no nodes (rare with Louvain), averages safely fall back to 0.0. Community labels live on the NetworkX backing graph (network.core_network), so they persist across subsequent DSL queries.
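For example, a later query can combine the stored community label with centrality filters (the community ID here is illustrative):
# Potential bridge nodes inside one community
bridges = (
    Q.nodes()
    .where(community=0)  # illustrative community ID
    .compute("betweenness_centrality")
    .order_by("betweenness_centrality", reverse=True)
    .execute(network)
)
for (node, layer), data in list(bridges.items())[:3]:
    print(f"{node} ({layer}): betw={data['betweenness_centrality']:.4f}")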
Dynamics + DSL Workflow
Epidemic simulation with DSL-based analysis:
from py3plex.dynamics import SIRDynamics
from py3plex.dsl import Q, L
from collections import Counter

# Run SIR simulation
sir = SIRDynamics(
    network,
    beta=0.3,             # infection rate
    gamma=0.1,            # recovery rate
    initial_infected=0.05
)
sir.set_seed(42)
results = sir.run(steps=100)

# Attach final S/I/R state to each (node, layer)
final_state = results.trajectory[-1]
for node, state in final_state.items():
    network.core_network.nodes[node]['sir_state'] = state

print("EPIDEMIC ANALYSIS")
print("=" * 70)

# Per-layer infection analysis
for layer in network.get_layers():
    layer_nodes = Q.nodes().from_layers(L[layer]).execute(network)
    state_counts = Counter(
        network.core_network.nodes[node].get('sir_state', 'unknown')
        for node in layer_nodes.keys()
    )
    total = len(layer_nodes)

    def pct(count):
        return count / total * 100 if total else 0

    print(f"\n{layer}:")
    print(f"  S: {state_counts.get('S', 0)} ({pct(state_counts.get('S', 0)):.1f}%)")
    print(f"  I: {state_counts.get('I', 0)} ({pct(state_counts.get('I', 0)):.1f}%)")
    print(f"  R: {state_counts.get('R', 0)} ({pct(state_counts.get('R', 0)):.1f}%)")

# Identify superspreaders (infected nodes with high degree)
superspreaders = (
    Q.nodes()
    .where(sir_state='I')
    .compute("degree", "betweenness_centrality")
    .where(degree__gt=6)
    .order_by("degree", reverse=True)
    .execute(network)
)
print(f"\nSuperspreaders (infected, degree > 6): {len(superspreaders)}")
for node, data in list(superspreaders.items())[:5]:
    print(f"  {node}: degree={data['degree']}, betw={data['betweenness_centrality']:.4f}")
Notes: The SIR outcomes depend on beta (infection rate), gamma (recovery rate), and the random seed. Percentages are guarded against empty layers so the snippet can be reused on sparse networks, and sir_state is stored per node as S, I, or R for follow-up filtering.
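Because outcomes depend on beta and the seed, a small parameter sweep helps check robustness; a sketch reusing the SIRDynamics calls from above:
# Sweep infection rates; fixed seed so runs differ only in beta
for beta in (0.1, 0.2, 0.3, 0.4):
    sir = SIRDynamics(network, beta=beta, gamma=0.1, initial_infected=0.05)
    sir.set_seed(42)
    results = sir.run(steps=100)
    final_state = results.trajectory[-1]
    recovered = sum(1 for state in final_state.values() if state == 'R')
    print(f"beta={beta}: {recovered / len(final_state) * 100:.1f}% recovered")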
Reusable DSL Query Functions
Create reusable query templates:
from collections import Counter

import numpy as np

from py3plex.dsl import Q, L


def get_layer_hubs(network, layer, degree_threshold=5):
    """Get high-degree nodes in a specific layer."""
    return (
        Q.nodes()
        .from_layers(L[layer])
        .compute("degree")
        .where(degree__gt=degree_threshold)
        .order_by("degree", reverse=True)
        .execute(network)
    )


def get_versatile_nodes(network, min_layers=2):
    """Get nodes present in at least min_layers layers."""
    node_layer_count = Counter()
    for node, _layer in network.get_nodes():
        node_layer_count[node] += 1
    return {
        node: count for node, count in node_layer_count.items()
        if count >= min_layers
    }


def compare_layer_centrality(network, layer1, layer2):
    """Compare betweenness centrality distributions between two layers."""
    result1 = (
        Q.nodes()
        .from_layers(L[layer1])
        .compute("betweenness_centrality")
        .execute(network)
    )
    result2 = (
        Q.nodes()
        .from_layers(L[layer2])
        .compute("betweenness_centrality")
        .execute(network)
    )
    betw1 = [d['betweenness_centrality'] for d in result1.values()]
    betw2 = [d['betweenness_centrality'] for d in result2.values()]
    # Key the summary by the actual layer names for readable output
    return {
        layer1: {'mean': np.mean(betw1), 'std': np.std(betw1)},
        layer2: {'mean': np.mean(betw2), 'std': np.std(betw2)},
    }


# Use the reusable functions
hubs_layer1 = get_layer_hubs(network, 'layer1', degree_threshold=7)
versatile = get_versatile_nodes(network, min_layers=3)
centrality_comp = compare_layer_centrality(network, 'layer1', 'layer2')
print(f"Layer1 hubs: {len(hubs_layer1)}")
print(f"Highly versatile nodes: {len(versatile)}")
print(f"Centrality comparison: {centrality_comp}")
Why use DSL-driven workflows?
Declarative: Express what to analyze, not how to compute
Composable: Chain queries to build complex analyses
Reproducible: Queries are self-documenting and version-controllable
Efficient: DSL optimizes execution internally
Readable: SQL-like syntax is intuitive for data analysis
Next steps with DSL workflows:
Full DSL tutorial: How to Query Multilayer Graphs with the SQL-like DSL - Comprehensive DSL guide
Community detection: How to Run Community Detection on Multilayer Networks - Community + DSL workflows
Dynamics simulation: How to Simulate Multilayer Dynamics - Dynamics + DSL workflows
Batch Processing
Process multiple networks:
import glob

from py3plex.core import multinet

results = []
for filename in glob.glob("data/*.multiedgelist"):
    # Load network
    network = multinet.multi_layer_network()
    network.load_network(filename, input_type="multiedgelist")

    # Apply analysis pipeline
    stats = analyze_network(network)  # Your custom function
    results.append({
        'filename': filename,
        'stats': stats
    })

# Aggregate results
summary = aggregate_results(results)
analyze_network and aggregate_results are placeholders for your own reusable pipeline (e.g., computing summary stats, exporting community labels). Keep them pure functions so the batch loop stays predictable.
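As a sketch, minimal pure-function versions of the two placeholders might look like this (the metric names mirror the statistics used earlier in this guide):
import pandas as pd

def analyze_network(network):
    """Summarize one network; a pure function of its input."""
    g = network.core_network
    return {
        'nodes': g.number_of_nodes(),
        'edges': g.number_of_edges(),
        'layers': len(list(network.get_layers())),
    }

def aggregate_results(results):
    """Flatten per-network stats into one summary table."""
    return pd.DataFrame(
        [{'filename': r['filename'], **r['stats']} for r in results]
    )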
Complete Example Templates
The following locations contain complete, runnable examples:
User Guide Recipes (Analysis Recipes & Workflows)
Recipe-style solutions with code + explanation
Focused on single tasks
Case Studies (Use Cases & Case Studies)
End-to-end analyses
Real-world datasets
Publication-ready results
Examples Gallery (Examples & Recipes)
Standalone Python scripts
Minimal, focused examples
Easy to adapt
Next Steps
Learn fundamentals: Quick Start Tutorial
Detailed recipes: Analysis Recipes & Workflows
Complete case studies: Use Cases & Case Studies
Browse examples: Examples & Recipes