Architecture and Design
This document describes the system architecture, design patterns, and extension points of py3plex. It is a map of how the layers fit together, what invariants each layer keeps, and where to extend the library safely without breaking caller expectations.
System Overview
py3plex is built as a modular, layered architecture with clear separation of concerns. Each layer depends only on the one below it, which keeps algorithm code focused on logic instead of plumbing. All layers operate on encoded node-layer identifiers supplied by the core layer so that layer membership is always explicit.
┌─────────────────────────────────────────────────────┐
│          High-Level Interfaces (Wrappers)           │
│         node2vec_embedding, benchmark_nodes         │
└─────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────┐
│                  Algorithms Layer                   │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │ Community    │  Statistics   │  Multilayer     │ │
│  │ Detection    │               │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────┐
│                 Visualization Layer                 │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │ Multilayer   │  Drawing      │  Layout         │ │
│  │ Plots        │  Machinery    │  Algorithms     │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────┐
│                     Core Layer                      │
│  ┌──────────────┬───────────────┬─────────────────┐ │
│  │ multinet     │  Parsers      │  Converters     │ │
│  │ (MultiLayer  │               │                 │ │
│  │ Network)     │               │                 │ │
│  └──────────────┴───────────────┴─────────────────┘ │
└─────────────────────────────────────────────────────┘
                           ↓
┌─────────────────────────────────────────────────────┐
│                 NetworkX Foundation                 │
│        MultiDiGraph, MultiGraph, Algorithms         │
└─────────────────────────────────────────────────────┘
Architectural Layers
Each layer exposes a narrow surface to the one above it. When adding new features, decide which layer should own the responsibility first to avoid pushing implementation details upward.
Core Layer
Purpose: Fundamental data structures and I/O operations. All higher layers manipulate networks through this layer.
Key Components:
multinet.py - multi_layer_network class
parsers.py - Input/output for supported formats
converters.py - Format conversion helpers
random_generators.py - Synthetic network generators
HINMINE/ - Heterogeneous network decomposition
Responsibilities:
Network construction and manipulation
File I/O (GraphML, GML, GEXF, edge lists, etc.)
Layer management
Matrix representations (adjacency, supra-adjacency) with clear encoding rules
NetworkX integration so downstream algorithms work on familiar graphs
Encoding invariants that keep node-layer separation consistent across the stack
Design Pattern: Facade Pattern - multi_layer_network provides a unified interface to complex NetworkX operations and centralizes encoding/decoding rules
Algorithms Layer
Purpose: Network analysis algorithms optimized for multilayer networks.
Key Components:
community_detection/ - Community detection algorithms
statistics/ - Network statistics and metrics
multilayer_algorithms/ - Multilayer-specific algorithms
node_ranking/ - Centrality and ranking measures
general/ - General-purpose algorithms (random walks, etc.)
Responsibilities:
Community detection (Louvain, Infomap, Label Propagation)
Statistical analysis (density, overlap, correlation)
Centrality computation (degree, betweenness, PageRank, etc.)
Random walks and embeddings
Network decomposition using explicit layer boundaries or coupling rules
Design Pattern: Strategy Pattern - Different algorithms implement common interfaces
Visualization Layer
Purpose: Network plotting and rendering.
Key Components:
multilayer.py - High-level multilayer plotting
drawing_machinery.py - Core drawing primitives
layout_algorithms.py - Layout computation
colors.py - Color scheme generators
fa2/ - ForceAtlas2 layout
Responsibilities:
Diagonal projection plots (dense but layer-aware)
Force-directed layouts
Matrix visualizations
Color mapping and legends
Interactive plots (via Plotly)
Design Pattern: Template Method Pattern - Layout algorithms follow a common template and separate layout computation from rendering
Wrappers Layer
Purpose: High-level interfaces for common workflows.
Key Components:
node2vec_embedding.py - Node2Vec embedding generation
benchmark_nodes.py - Node classification benchmarking
Responsibilities:
Simplified interfaces for complex workflows
Integration with external tools
Benchmarking and evaluation
Design Pattern: Facade Pattern - Simplify complex multi-step operations
Core Data Structure
The multi_layer_network Class
Central to py3plex, this class manages multilayer network state and enforces encoding conventions.
import networkx as nx

class multi_layer_network:
    def __init__(self, directed=True, label_delimiter="---", coupling_weight=1.0):
        # Directed networks use MultiDiGraph, undirected ones MultiGraph
        self.core_network = nx.MultiDiGraph() if directed else nx.MultiGraph()
        self.layer_name_map = {}  # Bidirectional mapping
        self.label_delimiter = label_delimiter
        self.coupling_weight = coupling_weight
        self.embedding = None
        self.labels = None
Key Attributes:
core_network - Underlying NetworkX graph (MultiDiGraph if directed=True)
layer_name_map - Maps layer names to integer IDs
label_delimiter - Separator for node-layer encoding (default: "---"). Avoid using the delimiter inside raw node IDs.
coupling_weight - Default weight for inter-layer edges when none is provided
embedding - Cached node embedding matrix
labels - Node classification labels
Encoding Scheme:
Nodes are encoded as "{node_id}{delimiter}{layer_id}" before they are added to core_network.
Example: Node ‘A’ in layer ‘social’ becomes "A---social". This keeps the multilayer structure explicit while allowing standard NetworkX algorithms to run.
Encoded node keys remain opaque to callers. When returning results (e.g., centrality), keep them encoded to preserve layer information unless explicitly decoded for presentation. If raw node IDs are needed, split on label_delimiter consistently so delimiter changes do not silently corrupt parsing.
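The encoding rule above can be sketched with two small helpers. This is a minimal illustration, not the library's implementation: encode_node, decode_node, and DELIMITER are hypothetical names, and the sketch assumes string keys joined by the default "---" delimiter.

```python
DELIMITER = "---"  # assumed to match the default label_delimiter


def encode_node(node_id, layer_id, delimiter=DELIMITER):
    """Build the encoded key "{node_id}{delimiter}{layer_id}" (hypothetical helper)."""
    node_id, layer_id = str(node_id), str(layer_id)
    if delimiter in node_id or delimiter in layer_id:
        # Raw IDs containing the delimiter would make decoding ambiguous.
        raise ValueError(f"delimiter {delimiter!r} must not appear in raw IDs")
    return f"{node_id}{delimiter}{layer_id}"


def decode_node(encoded, delimiter=DELIMITER):
    """Split an encoded key back into (node_id, layer_id) (hypothetical helper)."""
    node_id, sep, layer_id = encoded.rpartition(delimiter)
    if not sep:
        raise ValueError(f"{encoded!r} is not an encoded node-layer key")
    return node_id, layer_id


key = encode_node("A", "social")
print(key)               # A---social
print(decode_node(key))  # ('A', 'social')
```

Passing the delimiter explicitly, rather than hard-coding "---" at each call site, keeps parsing consistent if a network is created with a different label_delimiter.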
Design Patterns
Facade Pattern
Used in: multi_layer_network, wrappers
Purpose: Provide simplified interface to complex subsystems
# Complex underlying operations hidden behind simple interface
network = multinet.multi_layer_network()
network.add_edges(edges, input_type='list') # Handles parsing, encoding, validation
network.basic_stats() # Aggregates multiple NetworkX calls and keeps caches coherent
Strategy Pattern
Used in: Algorithms, layout computation
Purpose: Interchangeable algorithms following common interface
# Different community detection strategies
def detect_communities(network, method='louvain'):
    strategies = {
        'louvain': community_louvain.best_partition,
        'infomap': community_wrapper.infomap_communities,
        'label_prop': label_propagation.propagate
    }
    return strategies[method](network.core_network)
Template Method Pattern
Used in: Visualization, layout algorithms
Purpose: Define algorithm skeleton, allow customization in subclasses
class LayoutAlgorithm:
    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError
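To show how a subclass fills in the template, here is a hedged sketch. CircularLayout is a hypothetical class invented for illustration and is not part of py3plex; the base class is repeated so the snippet runs on its own, and the graph is assumed to be any iterable of encoded nodes.

```python
import math


class LayoutAlgorithm:
    """Skeleton repeated from the template above so the sketch is self-contained."""

    def compute(self, graph):
        self.initialize(graph)
        self.iterate()
        return self.finalize()

    def initialize(self, graph):
        raise NotImplementedError

    def iterate(self):
        raise NotImplementedError

    def finalize(self):
        raise NotImplementedError


class CircularLayout(LayoutAlgorithm):
    """Hypothetical subclass: places nodes evenly on the unit circle."""

    def initialize(self, graph):
        # Only the node collection is needed; any iterable of nodes works.
        self.nodes = list(graph)
        self.positions = {}

    def iterate(self):
        # A circular layout needs one pass; iterative layouts would loop here.
        count = max(len(self.nodes), 1)
        for index, node in enumerate(self.nodes):
            angle = 2 * math.pi * index / count
            self.positions[node] = (math.cos(angle), math.sin(angle))

    def finalize(self):
        return self.positions


pos = CircularLayout().compute(["A---social", "B---social", "A---work"])
print(pos["A---social"])  # (1.0, 0.0)
```

Callers only ever invoke compute(), so rendering code does not need to know which layout produced the positions.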
Dependency Injection
Used in: Configuration, algorithm parameters
Purpose: Inject dependencies rather than hard-coding
# Configuration injected rather than hard-coded
from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS
def draw_network(network, colors=None, layout_params=None):
    colors = colors or DEFAULT_COLORS
    layout_params = layout_params or LAYOUT_PARAMS
    # Use injected configuration
Data Flow
The steps below show how a typical workflow moves through the layers: creation in the core, processing in algorithms, summarization, and presentation in visualization or wrappers. The examples use encoded nodes throughout so layer context is preserved.
Typical Workflow
Input: Load or create network
network = multinet.multi_layer_network()
network.load_network("data.graphml", input_type="graphml")
Processing: Apply algorithms
communities = community_louvain.best_partition(network.core_network)
centrality = calc.multilayer_degree_centrality(network)
Analysis: Compute statistics
density = mls.layer_density(network, 'layer1')
correlation = mls.inter_layer_degree_correlation(network, 'L1', 'L2')
Visualization: Render results
draw_multilayer_default([network], display=True)
Output: Export results
network.save_network("output.graphml", output_type="graphml")
State Management
Non-Mutating Operations: Most algorithms read the network without modifying it.
# These don't modify the network
centrality = calc.multilayer_degree_centrality(network)
communities = community_louvain.best_partition(network.core_network)
Mutable Operations: Some operations modify network state and should trigger cache invalidation (e.g., supra-adjacency, cached layouts).
# These modify the network
network.add_edges(new_edges, input_type='list')
network.aggregate_layers(['L1', 'L2'], 'combined')
When mutating, clear or recompute any cached matrices or embeddings that depend on structure before running further analysis. Mutations should be localized (e.g., via helper methods) so cache invalidation stays centralized and deterministic.
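One way to keep invalidation centralized is to route every mutation through a single helper. The sketch below illustrates the idea with a hypothetical CachedNetwork class, in which a cached degree table stands in for an expensive structure such as the supra-adjacency matrix. It is not the py3plex implementation.

```python
from collections import Counter


class CachedNetwork:
    """Hypothetical stand-in for a network object with one derived cache."""

    def __init__(self):
        self._edges = []
        self._degree_cache = None  # stands in for an expensive derived structure
        self.recomputations = 0    # counts how often the cache is rebuilt

    def _invalidate_caches(self):
        # Single place where structure-dependent caches are dropped.
        self._degree_cache = None

    def add_edges(self, edges):
        self._edges.extend(edges)
        self._invalidate_caches()

    @property
    def degrees(self):
        if self._degree_cache is None:
            self.recomputations += 1
            counts = Counter()
            for source, target in self._edges:
                counts[source] += 1
                counts[target] += 1
            self._degree_cache = dict(counts)
        return self._degree_cache


net = CachedNetwork()
net.add_edges([("A---L1", "B---L1")])
first = net.degrees
again = net.degrees                     # served from the cache
net.add_edges([("A---L1", "C---L1")])   # mutation drops the cache
print(net.degrees["A---L1"], net.recomputations)  # 2 2
```

Because add_edges is the only mutator, no caller has to remember which caches depend on structure.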
Extension Points
Custom Algorithms
Add new algorithms by following existing patterns. Prefer operating on network.core_network (already encoded) and return results keyed by encoded nodes or layers so downstream utilities work unchanged. Keep any cacheable intermediates (e.g., supra-adjacency) out of global scope to avoid cross-test leakage:
# py3plex/algorithms/my_module/my_algorithm.py
from py3plex.core.multinet import multi_layer_network
def my_centrality(network: multi_layer_network) -> dict:
    """
    Custom centrality measure.

    Parameters
    ----------
    network : multi_layer_network
        Input network

    Returns
    -------
    dict
        Centrality scores keyed by encoded node
    """
    G = network.core_network
    centrality = {}
    for node in G.nodes():
        # Implement custom logic; compute_score is a placeholder you supply
        centrality[node] = compute_score(node, G)
    return centrality
Custom Visualizations
Create custom plots using drawing machinery. Keep layout computation separate from drawing so alternative layouts can be swapped in:
from py3plex.visualization import drawing_machinery as dm
import matplotlib.pyplot as plt
def my_custom_plot(network):
    """Custom visualization."""
    fig, ax = plt.subplots(figsize=(10, 8))
    # Compute layout
    pos = dm.compute_layout(network.core_network, 'force')
    # Draw elements
    dm.draw_nodes(ax, network.core_network, pos, node_size=50)
    dm.draw_edges(ax, network.core_network, pos, edge_width=1)
    dm.draw_labels(ax, pos, labels=network.get_node_labels())
    plt.show()
Custom Parsers
Add support for new file formats. Keep parsing side effects minimal and reuse existing helpers for validation where possible. Preserve the node-layer encoding contract so downstream algorithms receive encoded nodes:
# py3plex/core/parsers.py
def parse_my_format(input_file, **kwargs):
    """
    Parse custom file format.

    Parameters
    ----------
    input_file : str
        Path to input file

    Returns
    -------
    multi_layer_network
        Parsed network
    """
    network = multi_layer_network()
    with open(input_file, 'r') as f:
        for line in f:
            # Parse line and add to network
            pass
    return network
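As an illustration of the line-parsing step left open above, the sketch below reads a made-up whitespace-separated format (source, source layer, target, target layer, optional weight) and yields edges between encoded nodes. The format, the function name parse_edge_lines, and the choice of ValueError are assumptions for this example; a real parser would add the edges to a multi_layer_network and would likely raise the library's own format exception.

```python
import io


def parse_edge_lines(handle, delimiter="---", default_weight=1.0):
    """Yield (encoded_source, encoded_target, weight) from a hypothetical text format."""
    for line_number, line in enumerate(handle, start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) not in (4, 5):
            raise ValueError(
                f"line {line_number}: expected 4 or 5 fields, got {len(fields)}"
            )
        src, src_layer, dst, dst_layer = fields[:4]
        weight = float(fields[4]) if len(fields) == 5 else default_weight
        # Encode both endpoints so downstream code sees node-layer keys.
        yield (
            f"{src}{delimiter}{src_layer}",
            f"{dst}{delimiter}{dst_layer}",
            weight,
        )


sample = io.StringIO(
    "# source layer target layer weight\n"
    "A social B social 2.0\n"
    "A social A work\n"
)
edges = list(parse_edge_lines(sample))
print(edges)
```

Taking an open file handle rather than a path keeps the parsing logic free of I/O side effects, so it can be tested against in-memory text.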
Configuration System
Centralized Configuration
py3plex/config.py provides centralized configuration:
# Default color palettes (8 options including colorblind-safe)
DEFAULT_COLORS = 'Set1'
COLORBLIND_SAFE = 'colorblind'
# Visualization defaults
DEFAULT_NODE_SIZE = 20
DEFAULT_EDGE_WIDTH = 1.0
DEFAULT_ALPHA = 0.7
# Layout parameters
LAYOUT_PARAMS = {
    'force': {'iterations': 500, 'optimal_distance': 1.0},
    'fa2': {'iterations': 1000, 'gravity': 1.0}
}
# Performance settings
SPARSE_THRESHOLD = 1000 # Use sparse matrices above this node count
MEMORY_WARNING_THRESHOLD = 10000 # Warn for large dense matrices
Usage:
from py3plex.config import DEFAULT_COLORS, LAYOUT_PARAMS
colors = DEFAULT_COLORS
iterations = LAYOUT_PARAMS['force']['iterations']
Override configuration at call sites instead of mutating globals to keep tests reproducible. If a global default must change, document the reason and expected downstream effects on visualization, performance, or memory.
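A call-site override can be sketched as a merge that copies the defaults before applying changes. The helper name resolve_layout_params is hypothetical, and LAYOUT_PARAMS is repeated here (with the values shown earlier) only so the snippet runs on its own.

```python
import copy

# Values mirror the LAYOUT_PARAMS shown earlier; repeated to keep the sketch runnable.
LAYOUT_PARAMS = {
    "force": {"iterations": 500, "optimal_distance": 1.0},
    "fa2": {"iterations": 1000, "gravity": 1.0},
}


def resolve_layout_params(layout, overrides=None, defaults=LAYOUT_PARAMS):
    """Return a fresh dict: the defaults for `layout`, updated with `overrides`."""
    params = copy.deepcopy(defaults[layout])  # never hand out the shared dict
    params.update(overrides or {})
    return params


fast = resolve_layout_params("force", {"iterations": 50})
print(fast)                                  # {'iterations': 50, 'optimal_distance': 1.0}
print(LAYOUT_PARAMS["force"]["iterations"])  # 500, the global default is untouched
```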
Testing Architecture
Test Organization
tests/
├── test_core_functionality.py # Core data structure tests
├── test_multilayer_*.py # Multilayer algorithm tests
├── test_random_walks.py # Random walk tests
├── test_io_*.py # I/O and parsing tests
├── test_config_api.py # Configuration tests
└── test_utils.py # Utility function tests
Test Patterns
Unit Tests: Test individual functions in isolation
def test_layer_density():
    network = create_test_network()
    density = mls.layer_density(network, 'layer1')
    assert 0 <= density <= 1
Integration Tests: Test workflows across modules
def test_community_detection_workflow():
    network = load_network("test_data.graphml")
    communities = community_louvain.best_partition(network.core_network)
    assert len(communities) > 0
Property-Based Tests: Test invariants
def test_centrality_normalization():
    network = create_random_network()
    centrality = calc.multilayer_degree_centrality(network)
    # Centrality values should be normalized
    assert all(0 <= v <= 1 for v in centrality.values())
Use small synthetic networks for speed, seed any randomness (e.g., layouts or random walks), and isolate file I/O behind fixtures. Validate invariants on encoded nodes (node-layer pairs) to catch subtle encoding regressions early.
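The advice above can be sketched as a seeded generator plus an invariant check on encoded nodes. The generator make_random_multilayer_edges is a hypothetical test helper written with the standard library only. It is not a py3plex function, and it assumes the default "---" delimiter.

```python
import random


def make_random_multilayer_edges(n_nodes=6, layers=("L1", "L2"), p=0.5,
                                 seed=0, delimiter="---"):
    """Return intra-layer edges between encoded nodes, reproducible via `seed`."""
    rng = random.Random(seed)  # local generator, so global random state is untouched
    edges = []
    for layer in layers:
        for i in range(n_nodes):
            for j in range(i + 1, n_nodes):
                if rng.random() < p:
                    edges.append((f"{i}{delimiter}{layer}", f"{j}{delimiter}{layer}"))
    return edges


def test_edges_are_reproducible_and_stay_in_layer():
    first = make_random_multilayer_edges(seed=42)
    second = make_random_multilayer_edges(seed=42)
    assert first == second  # same seed, same network
    for source, target in first:
        # Invariant: both endpoints are encoded and share a layer.
        assert source.rpartition("---")[2] == target.rpartition("---")[2]


test_edges_are_reproducible_and_stay_in_layer()
```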
Performance Considerations
Lazy Evaluation
Expensive operations are computed on-demand:
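class multi_layer_network:
    # Assumes __init__ sets self._supra_adj_cache = None
    @property
    def supra_adjacency(self):
        # Compute on first access, then reuse the cached matrix
        if self._supra_adj_cache is None:
            self._supra_adj_cache = self._compute_supra_adjacency()
        return self._supra_adj_cache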
Sparse Matrices
Use sparse representations for large networks:
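import numpy as np
import scipy.sparse

def get_supra_adjacency_matrix(self, sparse=True):
    # Note: this sketch still builds the dense matrix first; a memory-efficient
    # version would assemble the sparse matrix directly from the edge list.
    adj = self._compute_dense_adjacency()
    if sparse or len(self.get_nodes()) > SPARSE_THRESHOLD:
        return scipy.sparse.csr_matrix(adj)
    return np.array(adj)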
In this section, n is the number of encoded nodes (node-layer pairs) and m the number of edges in the core graph. A dense supra-adjacency matrix needs O(n^2) memory, and dense operations such as inversion or eigendecomposition cost O(n^3) time, whereas sparse storage needs only O(n + m). With N nodes in each of L layers, n = N·L, so dense storage grows quadratically in the layer count even though the matrix is block-structured and mostly zeros. Switch to sparse matrices once n reaches a few thousand.
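A back-of-the-envelope estimate makes the gap concrete. The sketch below assumes float64 values and 32-bit indices, and uses the standard CSR layout (data, column indices, row pointers). The function names are hypothetical and the numbers are rough estimates, not measurements.

```python
def dense_bytes(nodes_per_layer, n_layers, itemsize=8):
    """Bytes for a dense float64 supra-adjacency matrix."""
    n = nodes_per_layer * n_layers  # number of encoded nodes
    return n * n * itemsize


def csr_bytes(nodes_per_layer, n_layers, nnz, itemsize=8, indexsize=4):
    """Approximate bytes for CSR storage with `nnz` stored entries."""
    n = nodes_per_layer * n_layers
    # data + column indices for each stored entry, plus n + 1 row pointers
    return nnz * (itemsize + indexsize) + (n + 1) * indexsize


print(dense_bytes(1000, 5) / 1e6)        # 200.0 (MB)
print(csr_bytes(1000, 5, 50_000) / 1e6)  # about 0.62 (MB)
```

Doubling the layer count from 5 to 10 quadruples the dense estimate to 800 MB, while the CSR estimate grows only with the number of stored entries and rows.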
Vectorization
Prefer NumPy vectorized operations:
# Bad: Python loop
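degrees = [sum(1 for _ in G.neighbors(node)) for node in G.nodes()]
# Good: one degree call, then array operations on the result
# (on multigraphs G.degree() counts parallel edges, unlike the neighbor count above)
degrees = np.array([d for _, d in G.degree()])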
Logging Infrastructure
Centralized Logging
py3plex/logging_config.py provides structured logging:
import logging
from py3plex.logging_config import get_logger
logger = get_logger(__name__)
logger.info("Processing network with %d nodes", num_nodes)
logger.warning("Large network detected, using sparse matrices")
logger.error("Invalid layer: %s", layer_name)
Configure logging once per process; repeated basicConfig calls in library code lead to duplicated log lines.
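A configure-once guard can be sketched with the standard logging module alone. The functions below are stand-ins written for illustration, and the logger name "py3plex_sketch" is hypothetical; they do not reproduce py3plex/logging_config.py.

```python
import logging

PACKAGE_LOGGER = "py3plex_sketch"  # hypothetical logger name for this example


def configure_logging(level=logging.INFO):
    """Attach one stream handler to the package logger; later calls are no-ops."""
    package_logger = logging.getLogger(PACKAGE_LOGGER)
    if package_logger.handlers:
        return  # already configured, so avoid duplicated log lines
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    package_logger.addHandler(handler)
    package_logger.setLevel(level)


def get_logger(name):
    """Return a child logger that inherits the single package handler."""
    configure_logging()
    return logging.getLogger(f"{PACKAGE_LOGGER}.{name}")


get_logger("core").info("configured once")
get_logger("algorithms").info("still one handler")
```

Child loggers propagate records to the package logger, so each message is emitted once no matter how many modules call get_logger.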
Log Levels
DEBUG: Detailed diagnostic information
INFO: General informational messages
WARNING: Warning messages (e.g., performance concerns)
ERROR: Error messages
CRITICAL: Critical errors
Error Handling
Custom Exceptions
py3plex/exceptions.py defines domain-specific exceptions:
class NetworkError(Exception):
    """Base exception for network errors."""
    pass

class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""
    pass

class InvalidFormatError(NetworkError):
    """Raised when file format is invalid."""
    pass
Usage:
from py3plex.exceptions import LayerNotFoundError
def get_layer(self, layer_name):
    if layer_name not in self.layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return self.layer_name_map[layer_name]
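On the calling side, the hierarchy lets code catch either the specific error or the shared base class. The sketch below repeats two of the exception classes so it runs on its own; get_layer_id is a hypothetical free-function version of the lookup above.

```python
class NetworkError(Exception):
    """Base exception for network errors."""


class LayerNotFoundError(NetworkError):
    """Raised when layer doesn't exist."""


def get_layer_id(layer_name_map, layer_name):
    """Hypothetical free-function version of the lookup shown above."""
    if layer_name not in layer_name_map:
        raise LayerNotFoundError(f"Layer '{layer_name}' not found")
    return layer_name_map[layer_name]


layer_name_map = {"social": 0, "work": 1}
try:
    get_layer_id(layer_name_map, "family")
except NetworkError as exc:  # the base class also catches LayerNotFoundError
    message = str(exc)

print(message)  # Layer 'family' not found
```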
Future Architecture
Planned Improvements
Backend Registry: Support for igraph, cugraph backends
Streaming API: Process networks larger than memory
Distributed Computing: Dask/Ray integration for large-scale analysis
Plugin System: Easy addition of third-party algorithms
Type System: Expand type hints coverage and enforce mypy across core and algorithms
See Also
Contributing to py3plex - Contributing guidelines
Development Guide - Development workflow
Network Construction - Core API documentation