Performance and Scalability Best Practices
This guide shows how to tune py3plex for large multilayer networks: memory usage, runtime, visualization, and tool choices. The advice targets typical sparse, research-scale graphs; if your network is unusually dense, treat the size thresholds below as if they were one tier smaller.
Network Scale Guidelines
py3plex is optimized for research-scale networks. These ranges are rough heuristics to pick appropriate tactics; dense graphs behave like larger networks than their node count alone suggests because memory and runtime scale with edge count.
| Network Size | Performance | Visualization | Recommendations |
|---|---|---|---|
| Small (<100 nodes) | Excellent | Fast, detailed | Dense visualization is fine |
| Medium (100-1k nodes) | Good | Fast, balanced | Default settings work well |
| Large (1k-10k nodes) | Good | Slower, minimal | Use sparse matrices, sampling |
| Very Large (>10k nodes) | Variable | Very slow | Sampling required, consider igraph/graph-tool |
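To see which row applies to your data, check the node count, edge count, and density of a loaded network. A minimal sketch, assuming a py3plex network bound to ``network`` as in the examples below:
import networkx as nx
G = network.core_network
n, m = G.number_of_nodes(), G.number_of_edges()
print(f"{n} nodes, {m} edges, density {nx.density(G):.4f}")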
Sparse Matrix Backend
Why Sparse Matrices?
Most real-world networks are sparse (few edges compared to possible edges). Using sparse matrices keeps storage and computation proportional to the number of non-zero entries rather than the square of node count:
Cuts memory by an order of magnitude for typical networks
Speeds up matrix operations (multiplication, inversion)
Enables analysis of larger networks before hitting RAM limits
Example:
import numpy as np
from scipy.sparse import random as sparse_random
n = 10_000
# Dense representation (10k x 10k network): n^2 x 8 bytes = ~800 MB for float64
dense_mb = n * n * np.dtype(np.float64).itemsize / 1e6
# Sparse representation (1% density) stores only the non-zero entries:
# tens of MB (format-dependent), not hundreds
adj_sparse = sparse_random(n, n, density=0.01, format="csr", dtype=np.float64)
sparse_mb = (adj_sparse.data.nbytes + adj_sparse.indices.nbytes + adj_sparse.indptr.nbytes) / 1e6
print(f"Dense: ~{dense_mb:.0f} MB, sparse: ~{sparse_mb:.0f} MB")
Automatic Sparse Matrix Usage
py3plex automatically uses sparse matrices for:
Supra-adjacency matrix operations
Large network storage (>1000 nodes)
Matrix-based algorithms (PageRank, spectral methods)
If SciPy’s sparse support is unavailable, py3plex falls back to dense matrices—install scipy to keep large workloads memory-efficient and avoid silent slowdowns.
Verify sparse usage:
from py3plex.core import multinet
network = multinet.multi_layer_network()
network.load_network("large_network.csv", input_type="multiedgelist")
# Get sparse adjacency matrix
adj_sparse = network.get_sparse_adjacency_matrix()
print(f"Matrix size: {adj_sparse.shape}")
print(f"Non-zero entries: {adj_sparse.nnz}")
print(f"Sparsity: {1 - adj_sparse.nnz / (adj_sparse.shape[0]**2):.2%}")
Force Sparse Operations
For custom algorithms, explicitly use sparse operations and avoid converting back to dense arrays mid-computation:
# Assumes ``network`` is already loaded
from scipy.sparse import lil_matrix
from scipy.sparse import linalg as sparse_linalg
import numpy as np
# Map nodes to indices once
node_to_idx = {node: idx for idx, node in enumerate(network.core_network.nodes())}
n_nodes = len(node_to_idx)
# Create sparse adjacency matrix
adj = lil_matrix((n_nodes, n_nodes))
for u, v in network.core_network.edges():
i, j = node_to_idx[u], node_to_idx[v]
adj[i, j] = 1
adj[j, i] = 1 # Undirected
# Convert to efficient format for operations
adj_csr = adj.tocsr()
# Sparse matrix operations scale better than dense for large, sparse graphs
result = adj_csr @ adj_csr # Matrix multiplication
eigvals, eigvecs = sparse_linalg.eigsh(adj_csr, k=10)  # Top-10 eigenpairs (eigsh: symmetric matrix)
Network Sampling
When to Sample
Sampling helps when:
The network is too large to visualize (>5k nodes)
Algorithms stall or exceed your runtime budget (>10 minutes)
Memory usage is excessive (>8 GB RAM)
You need quick exploratory analysis before full runs
Skip sampling when you must preserve exact counts or connectivity-sensitive metrics (diameter, exact shortest paths); run the full network once resources allow.
Random Node Sampling
Use when you want an unbiased glimpse of the network at smaller scale.
import random
from py3plex.core import multinet
# Load the full network, then draw a uniform random sample of nodes
network = multinet.multi_layer_network()
network.load_network("large_network.csv", input_type="multiedgelist")
# Sample up to 1000 random nodes
all_nodes = list(network.get_nodes())
sample_nodes = random.sample(all_nodes, min(1000, len(all_nodes)))
# Create subnetwork
subnetwork = network.get_subnetwork(sample_nodes)
print(f"Original: {len(all_nodes)} nodes")
print(f"Sample: {len(sample_nodes)} nodes")
print(f"Sampling ratio: {len(sample_nodes)/len(all_nodes):.1%}")
Stratified Sampling (Preserve Layer Distribution)
Use when layer proportions matter (multiplex data).
import random
# Sample proportionally from each layer
layers = network.get_layer_names()
sample_nodes_per_layer = {}
for layer in layers:
layer_nodes = [n for n in network.get_nodes() if n[1] == layer]
if not layer_nodes:
continue
sample_size = max(1, len(layer_nodes) // 10) # 10% sample, at least 1
sample_nodes_per_layer[layer] = random.sample(layer_nodes, sample_size)
# Combine samples
all_samples = [node for nodes in sample_nodes_per_layer.values() for node in nodes]
subnetwork = network.get_subnetwork(all_samples)
Hub-Based Sampling (Keep Important Nodes)
Use when high-degree nodes dominate the phenomenon you study (traffic, influence).
# Uses the loaded ``network``; keep only high-degree nodes (hubs)
degrees = dict(network.core_network.degree())
# Sort by degree and take up to top 1000
sorted_nodes = sorted(degrees.items(), key=lambda x: x[1], reverse=True)
hub_nodes = [node for node, deg in sorted_nodes[:min(1000, len(sorted_nodes))]]
subnetwork = network.get_subnetwork(hub_nodes)
print(f"Sampled {len(hub_nodes)} hubs")
Algorithm Optimization
Choose Efficient Algorithms
Choose algorithms with complexity close to O(n + m) where possible. The table uses n for nodes and m for edges:
| Algorithm | Complexity | Recommendations |
|---|---|---|
| Degree centrality | O(n + m) | Fast, use freely |
| Betweenness centrality | O(nm) | Slow for large networks, sample first or approximate |
| PageRank | O(iterations x m) | Fast if sparse, limit iterations |
| Community detection (Louvain) | O(m log n) | Fast, recommended |
| Shortest paths (all pairs) | O(n(n + m)) | Very slow, use sampling or approximate |
| Force-directed layout | O(n^2) | Slow for >5k nodes, use alternatives |
Example - Fast centrality:
import networkx as nx
G = network.core_network # assumes network is already loaded
# FAST: Degree centrality
degree_cent = nx.degree_centrality(G) # O(n+m) - instant
# SLOW: Betweenness centrality
# For large networks, sample first or use approximate algorithm
if G.number_of_nodes() < 1000:
between_cent = nx.betweenness_centrality(G)
else:
# Use approximate algorithm
between_cent = nx.betweenness_centrality(G, k=100) # Sample 100 nodes
Limit Algorithm Iterations
Iteration limits prevent runaway runtimes on large, sparse graphs while keeping results stable enough for exploration.
Lower tol increases accuracy but may require more iterations; pair it with a sensible max_iter cap.
import networkx as nx
# PageRank with iteration limit
pagerank = nx.pagerank(
network.core_network,
max_iter=50, # Limit iterations
tol=1e-4 # Tolerance for convergence
)
# Community detection with resolution limit
from py3plex.algorithms.community_detection import community_louvain
communities = community_louvain.best_partition(
network.core_network,
resolution=1.0 # Adjust resolution parameter
)
Parallel Processing
Multi-Core Processing
Use joblib for embarrassingly parallel node/edge operations; cap n_jobs to avoid oversubscribing shared machines or container limits:
from joblib import Parallel, delayed
import networkx as nx
def compute_node_centrality(node, graph):
"""Compute centrality for a single node."""
# Custom centrality computation
neighbors = list(graph.neighbors(node))
return node, len(neighbors)
# Parallel processing
nodes = list(network.core_network.nodes())
results = Parallel(n_jobs=-1)( # -1 uses all cores; set a smaller value on shared machines
delayed(compute_node_centrality)(node, network.core_network)
for node in nodes
)
centralities = dict(results)
GPU Acceleration (Advanced)
For very large networks with a CUDA-capable GPU, try GPU acceleration. Keep data sparse to avoid blowing GPU memory and move only the operations that benefit from GPU throughput:
# Install CuPy for GPU NumPy operations first:
#   pip install cupy-cuda11x  (replace 11x with your CUDA version)
# Requires NVIDIA GPU with CUDA
try:
import cupy as cp
import numpy as np
from cupyx.scipy.sparse import csr_matrix as gpu_csr
# Keep sparse; convert SciPy CSR to CuPy CSR
adj_cpu = network.get_sparse_adjacency_matrix().astype(np.float32)
gpu_adj = gpu_csr(adj_cpu)
# GPU-accelerated sparse matrix operations
gpu_result = gpu_adj @ gpu_adj
# Transfer back to CPU if needed
result = gpu_result.get()
except ImportError:
print("CuPy not available, using CPU")
Keep GPU arrays in float32 and sparse formats to fit device memory; fall back to CPU if allocations fail.
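The CPU fallback mentioned above can look like the sketch below, reusing ``gpu_adj`` and ``adj_cpu`` from the previous example; ``cupy.cuda.memory.OutOfMemoryError`` is the exception CuPy raises when a device allocation fails:
try:
    gpu_result = gpu_adj @ gpu_adj  # May exceed device memory on large graphs
    result = gpu_result.get()       # Copy the result back to the host
except cp.cuda.memory.OutOfMemoryError:
    # Device allocation failed: redo the multiplication on the CPU instead
    result = adj_cpu @ adj_cpu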
Visualization Optimization
Reduce Visual Complexity
Favor minimal styling for large graphs to avoid overplotting:
from py3plex.visualization.multilayer import draw_multilayer_default
# For large networks (>1000 nodes)
draw_multilayer_default(
network.get_layers(),
node_size=3, # Tiny nodes
labels=False, # No labels
edge_size=0.3, # Thin edges
alphalevel=0.2, # Very transparent
remove_isolated_nodes=True # Remove disconnected
)
Save to File Instead of Display
import matplotlib.pyplot as plt
# Don't show interactively (faster)
fig, ax = plt.subplots(1, 1, figsize=(10, 8))
draw_multilayer_default(
network.get_layers(),
display=False, # Don't show
axis=ax
)
# Save directly to file
plt.savefig('network.png', dpi=150, bbox_inches='tight')
plt.close() # Free memory
Use Lower Resolution
import matplotlib.pyplot as plt
# For quick exploration, use low DPI
plt.figure(figsize=(8, 6), dpi=72) # Low resolution
# For publications, use high DPI
plt.figure(figsize=(10, 8), dpi=300) # High resolution
Memory Management
Monitor Memory Usage
# Requires ``psutil`` (pip install psutil)
import psutil
import os
from py3plex.core import multinet
def get_memory_usage():
"""Get current memory usage in MB."""
process = psutil.Process(os.getpid())
return process.memory_info().rss / 1024 / 1024
print(f"Memory before loading: {get_memory_usage():.1f} MB")
network = multinet.multi_layer_network()
network.load_network("large_network.csv", input_type="multiedgelist")
print(f"Memory after loading: {get_memory_usage():.1f} MB")
Free Memory When Done
# Delete network when no longer needed
del network
# Force garbage collection
import gc
gc.collect()
Use Generators Instead of Lists
# BAD: Loads all nodes into memory
all_nodes = list(network.get_nodes())
for node in all_nodes:
process(node)
# GOOD: Processes nodes one at a time
for node in network.get_nodes():
process(node)
Batch Processing
Process Networks in Batches
For multiple networks:
import gc
from py3plex.core import multinet
# ``analyze_network`` and ``save_results`` are user-defined helpers (sketched below)
network_files = ["net1.csv", "net2.csv", "net3.csv"]  # Extend with your files
results = []
for i, file in enumerate(network_files):
print(f"Processing {i+1}/{len(network_files)}: {file}")
# Load network
network = multinet.multi_layer_network()
network.load_network(file, input_type="multiedgelist")
# Compute statistics
result = analyze_network(network)
results.append(result)
# Free memory
del network
gc.collect()
# Save intermediate results every 10 networks
if (i + 1) % 10 == 0:
save_results(results, f"results_batch_{i+1}.json")
Persist partial outputs regularly so an interrupted job does not lose prior work.
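The loop above assumes two user-defined helpers, ``analyze_network`` and ``save_results``; a minimal sketch, assuming JSON-serializable summary statistics:
import json

def analyze_network(network):
    """Return basic summary statistics for one network."""
    G = network.core_network
    return {"nodes": G.number_of_nodes(), "edges": G.number_of_edges()}

def save_results(results, path):
    """Write the accumulated results to a JSON file."""
    with open(path, "w") as f:
        json.dump(results, f, indent=2)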
Benchmark Results
Illustrative Performance Benchmarks
Tested on: Intel i7-10700K, 32 GB RAM, Python 3.10. These figures illustrate order-of-magnitude behavior; disk speed, graph density, and algorithm parameters will change your results.
| Operation | 100 nodes | 1k nodes | 10k nodes | 100k nodes |
|---|---|---|---|---|
| Load from CSV | <1s | <1s | 2s | 20s |
| Basic statistics | <1s | <1s | <1s | 3s |
| Degree centrality | <1s | <1s | 1s | 10s |
| PageRank | <1s | <1s | 2s | 25s |
| Louvain communities | <1s | 1s | 5s | 60s |
| Visualization (sparse) | <1s | 2s | 15s | N/A* |
* Visualization not recommended for >10k nodes without sampling
Scaling Recommendations
Based on network size:
- <1k nodes:
  - Use any algorithms
  - Full visualization
  - No sampling needed
- 1k-10k nodes:
  - Use sparse matrices
  - Minimal visualization
  - Sample for some algorithms
- 10k-100k nodes:
  - Sparse matrices required
  - Sample for visualization
  - Use approximate algorithms
  - Consider igraph for speed
- >100k nodes:
  - Use specialized tools (igraph, graph-tool, NetworKit)
  - Sample heavily for py3plex operations
  - Focus on specific analyses
Alternative Tools for Scale
When py3plex Isn’t Enough
For networks >100k nodes, consider exporting to a faster backend for heavy algorithms and re-importing results:
igraph (C-based, very fast):
import igraph as ig
# Often 10-100x faster for large networks
g = ig.Graph.Read_GraphML("large_network.graphml")
communities = g.community_multilevel()
# Export back to py3plex if needed
# (via GraphML or edge list)
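One way to carry igraph's communities back is to write them out keyed by a vertex identifier; a sketch that uses the vertex ``id`` attribute when the GraphML file provides one and falls back to the vertex index otherwise:
import csv

with open("communities.csv", "w", newline="") as f:
    writer = csv.writer(f)
    for idx, comm in enumerate(communities.membership):
        label = g.vs[idx]["id"] if "id" in g.vs.attributes() else idx
        writer.writerow([label, comm])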
graph-tool (C++, fastest):
import graph_tool.all as gt
# Fastest for >1M edges
g = gt.load_graph("large_network.graphml")
state = gt.minimize_blockmodel_dl(g)
communities = state.get_blocks()
NetworKit (C++, parallel):
import networkit as nk
# Excellent for parallel algorithms
G = nk.readGraph("large_network.edgelist", nk.Format.EdgeListSpaceOne)  # Choose the EdgeList variant matching your file
communities = nk.community.detectCommunities(G)
Quick Performance Checklist
Before Running on Large Network
[ ] Enable sparse matrices (automatic for most ops)
[ ] Sample network if >10k nodes
[ ] Choose efficient algorithms (prefer degree over betweenness centrality)
[ ] Limit visualization detail
[ ] Monitor memory usage
[ ] Use batch processing for multiple networks
[ ] Consider alternative tools if >100k nodes
Optimization Order
1. Use sparse matrices (biggest impact, usually automatic)
2. Sample the network (if >10k nodes)
3. Choose efficient algorithms (avoid O(n^3) operations)
4. Parallelize (if multiple cores are available)
5. GPU acceleration (only if a CUDA-capable GPU is available)
Next Steps
I/O and Serialization - Efficient data loading
Visualization - Optimize visualizations
Docker Usage Guide - Docker deployment for production
If you run into performance problems, report them on GitHub Issues.