# Benchmarking & Performance

Performance characteristics and optimization strategies for py3plex.
## Network Scale Guidelines

py3plex is optimized for research-scale networks:

| Network Size | Performance | Visualization | Recommendations |
|---|---|---|---|
| Small (<100 nodes) | Excellent | Fast, detailed | Use dense visualization mode |
| Medium (100-1k nodes) | Good | Fast, balanced | Default settings work well |
| Large (1k-10k nodes) | Good | Slower, minimal | Use sparse matrices, sampling |
| Very Large (>10k nodes) | Variable | Very slow | Sampling required |
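For very large networks where sampling is required, exploratory analysis can run on a uniformly sampled induced subgraph. A minimal sketch using NetworkX directly; `sample_subgraph` is an illustrative helper, not a py3plex API:

```python
import random

import networkx as nx

def sample_subgraph(G, n_nodes, seed=42):
    """Return the subgraph induced by a uniform random sample of nodes."""
    rng = random.Random(seed)
    nodes = rng.sample(list(G.nodes()), min(n_nodes, G.number_of_nodes()))
    return G.subgraph(nodes).copy()

# A synthetic 20k-node network, reduced to 1k nodes for exploration
G = nx.barabasi_albert_graph(20_000, 5, seed=1)
H = sample_subgraph(G, 1_000)
print(H.number_of_nodes())  # 1000
```

Uniform sampling preserves node-level statistics on average but can thin out edges; for structure-sensitive analyses, consider snowball or random-walk sampling instead.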
## Performance Tips

### Use Sparse Matrices

For large networks, use sparse matrix representations:

```python
from py3plex.core import multinet

network = multinet.multi_layer_network(sparse=True)
```

This reduces memory usage by 10-100x for typical networks.
### Batch Operations

Process multiple operations together:

```python
from py3plex.dsl import Q

# Compute multiple metrics at once
result = (
    Q.nodes()
    .compute("degree", "betweenness_centrality", "clustering")
    .execute(network)
)
```

Avoid repeated single-metric computations.
### Use Arrow/Parquet for I/O

For large datasets:

```python
import pyarrow.parquet as pq

# Save (write_table returns None; it writes in place)
pq.write_table(edges_table, 'network.parquet')

# Load (much faster than CSV)
table = pq.read_table('network.parquet')
```
### Parallel Processing

For Node2Vec and other CPU-intensive algorithms:

```python
from py3plex.wrappers import train_node2vec

embeddings = train_node2vec(
    network,
    workers=8  # Use multiple CPU cores
)
```
## Benchmark Results

Performance benchmarks for common operations on synthetic multilayer networks. These results provide guidance for planning analyses and optimizing workflows.

**Test Environment:**

- CPU: Intel Core i7-9700K @ 3.6GHz (8 cores)
- RAM: 32 GB DDR4
- Python: 3.10
- py3plex: v1.0.0
### Algorithm Runtimes vs. Network Size

#### Community Detection (Louvain)

| Nodes | Edges | Layers | Runtime |
|---|---|---|---|
| 100 | 500 | 3 | 0.05s |
| 1,000 | 5,000 | 3 | 0.3s |
| 10,000 | 50,000 | 3 | 4.2s |
| 100,000 | 500,000 | 3 | 58s |
#### Centrality Computation (Betweenness)

| Nodes | Edges | Layers | Runtime |
|---|---|---|---|
| 100 | 500 | 3 | 0.12s |
| 1,000 | 5,000 | 3 | 8.5s |
| 10,000 | 50,000 | 3 | 1,240s (21 min) |
| 100,000 | 500,000 | 3 | N/A (too slow) |

Note: Exact betweenness (Brandes' algorithm) is O(nm), which approaches O(n³) on dense graphs. Use approximation methods for large networks.
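A common approximation is pivot sampling: run the shortest-path phase from only `k` randomly chosen source nodes instead of all `n`. NetworkX exposes this via the `k` parameter of `betweenness_centrality`; a sketch on a synthetic graph:

```python
import networkx as nx

G = nx.watts_strogatz_graph(5_000, 10, 0.1, seed=0)

# Exact betweenness runs Brandes' algorithm from all n sources;
# sampling k=200 pivots cuts the cost by roughly a factor of n/k.
approx = nx.betweenness_centrality(G, k=200, seed=0)
```

Larger `k` trades runtime for accuracy; values around 5-10% of `n` are often a reasonable starting point.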
#### Node2Vec Embeddings

| Nodes | Edges | Layers | Runtime |
|---|---|---|---|
| 100 | 500 | 3 | 2.3s |
| 1,000 | 5,000 | 3 | 18s |
| 10,000 | 50,000 | 3 | 245s (4 min) |
| 100,000 | 500,000 | 3 | 3,200s (53 min) |
#### Dynamics Simulation (SIR)

| Nodes | Edges | Layers | Runtime |
|---|---|---|---|
| 100 | 500 | 3 | 0.8s |
| 1,000 | 5,000 | 3 | 4.5s |
| 10,000 | 50,000 | 3 | 52s |
| 100,000 | 500,000 | 3 | 680s (11 min) |
### Memory Usage Profiles

| Nodes | Edges | Dense Storage | Sparse Storage |
|---|---|---|---|
| 100 | 500 | 2 MB | 0.5 MB |
| 1,000 | 5,000 | 24 MB | 2 MB |
| 10,000 | 50,000 | 2.4 GB | 18 MB |
| 100,000 | 500,000 | 240 GB | 180 MB |

**Key Insight:** Sparse storage reduces memory by 10-1000x for typical networks.
## Comparison with Other Tools

### Community Detection: py3plex vs. NetworkX

| Tool | Runtime | Notes |
|---|---|---|
| py3plex | 0.3s | Multilayer-aware |
| NetworkX + python-louvain | 0.2s | Single-layer only |
| graph-tool | 0.08s | C++ backend, faster |

**Verdict:** py3plex is competitive on single-layer networks and adds multilayer capability the others lack.
### Node Embeddings: py3plex vs. node2vec

| Tool | Runtime | Notes |
|---|---|---|
| py3plex | 18s | Python wrapper |
| node2vec (original) | 15s | C++ implementation |
| Gensim | 12s | Optimized Word2Vec |

**Verdict:** py3plex builds on established libraries (gensim), so performance is comparable.
**Benchmarking Notes:**

- Results vary based on network structure (density, clustering, layer coupling)
- Runtimes scale differently for different algorithms (linear, quadratic, cubic)
- Use these benchmarks as rough guidelines, not exact predictions
- For the most accurate estimates, run benchmarks on your specific hardware and data
## Running Benchmarks

Run benchmarks yourself:

```bash
cd benchmarks
python run_benchmarks.py
```

See the benchmarks/ directory in the repository for scripts to reproduce these results or run your own benchmarks.
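For quick ad-hoc timings outside the benchmark suite, `time.perf_counter` with a few repeats is usually enough. A minimal sketch; the `benchmark` helper is illustrative, not part of py3plex:

```python
import statistics
import time

def benchmark(fn, repeats=5):
    """Time fn() several times and return the median wall-clock seconds."""
    times = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        times.append(time.perf_counter() - start)
    return statistics.median(times)

# Example: time a simple workload; substitute your analysis function
runtime = benchmark(lambda: sorted(range(100_000, 0, -1)))
print(f"median runtime: {runtime:.4f}s")
```

The median is less sensitive than the mean to one-off slowdowns (garbage collection, OS scheduling), which matters for short runs.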
## Profiling Your Code

Use Python profiling tools:

```python
import cProfile
import pstats

# Profile your analysis
cProfile.run('your_analysis_function(network)', 'profile_stats')

# View results
stats = pstats.Stats('profile_stats')
stats.sort_stats('cumulative')
stats.print_stats(20)
```
### Memory Profiling

```bash
pip install memory_profiler
python -m memory_profiler your_script.py
```
## Optimization Strategies

### For Large Networks

- Sample the network for exploratory analysis
- Use layer-specific analysis instead of full multilayer
- Compute metrics incrementally rather than all at once
- Cache intermediate results
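Caching intermediate results can be as simple as pickling them to disk and checking for the file on subsequent runs. A minimal sketch; `cached_metric` is a hypothetical helper, not a py3plex API:

```python
import pickle
from pathlib import Path

def cached_metric(cache_path, compute_fn):
    """Load a previously computed result from disk, or compute and save it."""
    path = Path(cache_path)
    if path.exists():
        with path.open("rb") as f:
            return pickle.load(f)
    result = compute_fn()
    with path.open("wb") as f:
        pickle.dump(result, f)
    return result

# First call computes and saves; later calls load from disk instead.
degrees = cached_metric("degrees.pkl", lambda: {n: n % 7 for n in range(100)})
```

Remember to invalidate the cache (delete the file) whenever the underlying network or parameters change, or the stale result will be silently reused.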
### For Repeated Analysis

- Precompute and save expensive metrics
- Use config-driven workflows for reproducibility
- Batch process multiple networks

### For Production

- Use Docker containers for consistent environments
- Implement monitoring for long-running jobs
- Add checkpointing for crash recovery

See the Docker Usage Guide for deployment best practices.
## Hardware Recommendations

**Minimum:**

- 4 GB RAM
- 2 CPU cores
- Small networks (<1k nodes)

**Recommended:**

- 16 GB RAM
- 8 CPU cores
- Networks up to 10k nodes

**High-Performance:**

- 64+ GB RAM
- 16+ CPU cores
- Large networks (>10k nodes)
## Next Steps

- Optimize I/O: How to Export and Serialize Networks
- Deploy to production: Docker Usage Guide
- Read the original full guide: Performance and Scalability Best Practices