Random Walk Algorithms
Py3plex provides comprehensive random walk primitives for graph-based algorithms, including Node2Vec, DeepWalk, and diffusion processes. These implementations support weighted, directed, and multilayer networks with deterministic reproducibility.
Overview
Random walks are fundamental building blocks for many graph algorithms:
Node2Vec: Learn node embeddings using biased random walks
DeepWalk: Generate node embeddings through uniform random walks
Personalized PageRank: Compute node importance via random walks with restart
Diffusion processes: Model information or disease propagation
Community detection: Identify network structure through walk patterns
Key Features
✅ Basic random walks with proper edge weight handling
✅ Second-order (biased) random walks following Node2Vec logic
✅ Multiple simultaneous walks with deterministic seeding
✅ Support for directed, weighted, and multigraphs
✅ Multilayer network-aware walks with layer constraints
✅ Efficient sparse adjacency matrix handling
✅ Comprehensive test suite validating correctness properties
Basic Random Walk
The basic random walk samples the next node proportionally to normalized edge weights.
Simple Example
from py3plex.algorithms.general.walkers import basic_random_walk
import networkx as nx
# Create a simple graph
G = nx.karate_club_graph()
# Perform a random walk
walk = basic_random_walk(
G,
start_node=0,
walk_length=10,
seed=42 # for reproducibility
)
print(f"Walk: {walk}")
print(f"Length: {len(walk)}") # 11 nodes (includes start)
Weighted Graphs
Edge weights are automatically used when weighted=True
(default):
from py3plex.algorithms.general.walkers import basic_random_walk
import networkx as nx
# Create weighted graph
G = nx.Graph()
G.add_weighted_edges_from([
(0, 1, 10.0), # high weight -> more likely
(0, 2, 1.0), # low weight -> less likely
(1, 2, 5.0),
])
# Weighted walk (respects edge weights)
walk = basic_random_walk(G, 0, walk_length=20, weighted=True, seed=42)
# Unweighted walk (uniform transitions)
walk_uniform = basic_random_walk(G, 0, walk_length=20, weighted=False, seed=42)
Directed Graphs
Directed edges are handled correctly:
import networkx as nx
from py3plex.algorithms.general.walkers import basic_random_walk
# Create directed graph
G = nx.DiGraph()
G.add_edges_from([(0, 1), (1, 2), (2, 0), (1, 3)])
# Walk respects edge direction
walk = basic_random_walk(G, 0, walk_length=10, seed=42)
# Verify all transitions follow directed edges
for i in range(len(walk) - 1):
assert G.has_edge(walk[i], walk[i+1])
Node2Vec Biased Random Walk
Second-order random walks with return parameter p
and in-out parameter q
following the Node2Vec algorithm (Grover & Leskovec, 2016).
Parameters
When transitioning from node t → v → x
, the probability of choosing x
is biased:
If x == t (return to previous):
weight / p
If x is neighbor of t (stay close):
weight / 1
If x is not neighbor of t (explore):
weight / q
Parameter Effects:
p > 1: Less likely to return to previous node (explore forward)
p < 1: More likely to return (backtrack)
q > 1: Less likely to explore distant nodes (stay local, BFS-like)
q < 1: More likely to explore distant nodes (venture far, DFS-like)
p = q = 1: Equivalent to basic random walk
Basic Usage
from py3plex.algorithms.general.walkers import node2vec_walk
import networkx as nx
G = nx.karate_club_graph()
# Balanced walk (p=1, q=1)
walk_balanced = node2vec_walk(G, 0, walk_length=20, p=1.0, q=1.0, seed=42)
# BFS-like walk (stay local)
walk_bfs = node2vec_walk(G, 0, walk_length=20, p=1.0, q=2.0, seed=42)
# DFS-like walk (explore outward)
walk_dfs = node2vec_walk(G, 0, walk_length=20, p=1.0, q=0.5, seed=42)
Exploring vs Backtracking
from py3plex.algorithms.general.walkers import node2vec_walk
import networkx as nx
# Triangle graph
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0)])
# Low p, high q: tends to backtrack
walk_backtrack = node2vec_walk(G, 0, walk_length=20, p=0.1, q=10.0, seed=42)
# High p, low q: tends to explore outward
walk_explore = node2vec_walk(G, 0, walk_length=20, p=10.0, q=0.1, seed=42)
# Count backtracking patterns
backtracks = sum(1 for i in range(2, len(walk_backtrack))
if walk_backtrack[i] == walk_backtrack[i-2])
print(f"Backtracks: {backtracks}")
Multiple Walk Generation
Generate multiple walks efficiently with deterministic seeding.
Generate Walks from All Nodes
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
G = nx.karate_club_graph()
# Generate 10 walks from each node
walks = generate_walks(
G,
num_walks=10,
walk_length=10,
seed=42
)
print(f"Total walks: {len(walks)}") # 340 (34 nodes × 10 walks)
print(f"First walk: {walks[0]}")
Generate from Specific Nodes
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
G = nx.karate_club_graph()
# Generate walks only from nodes 0, 1, 2
walks = generate_walks(
G,
num_walks=5,
walk_length=15,
start_nodes=[0, 1, 2],
seed=42
)
print(f"Total walks: {len(walks)}") # 15 (3 nodes × 5 walks)
Node2Vec-Style Walks
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
G = nx.karate_club_graph()
# Generate biased walks with p=0.5, q=2.0
walks = generate_walks(
G,
num_walks=10,
walk_length=20,
p=0.5,
q=2.0,
seed=42
)
Edge Sequences
Return walks as edge sequences instead of node sequences:
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
G = nx.karate_club_graph()
# Generate edge sequences
edge_walks = generate_walks(
G,
num_walks=5,
walk_length=10,
return_edges=True,
seed=42
)
# First walk is a list of edges
print(edge_walks[0]) # [(0, 3), (3, 2), (2, 1), ...]
Multilayer Network Walks
Perform random walks on multilayer networks with layer constraints.
Layer-Constrained Walks
from py3plex.algorithms.general.walkers import layer_specific_random_walk
from py3plex.core import multinet
# Create multilayer network
network = multinet.multi_layer_network()
network.add_layer("social")
network.add_layer("biological")
network.add_nodes([
("A", "social"), ("B", "social"), ("C", "social"),
("A", "biological"), ("B", "biological")
])
network.add_edges([
(("A", "social"), ("B", "social")),
(("B", "social"), ("C", "social")),
(("A", "biological"), ("B", "biological")),
])
# Walk constrained to social layer
walk = layer_specific_random_walk(
network.core_network,
start_node="A---social",
walk_length=10,
layer="social",
cross_layer_prob=0.0, # never cross layers
seed=42
)
# All nodes in walk are in social layer
print(walk)
Cross-Layer Transitions
Allow occasional transitions between layers:
from py3plex.algorithms.general.walkers import layer_specific_random_walk
# Same network as above
# Walk with 10% probability of crossing layers
walk = layer_specific_random_walk(
network.core_network,
start_node="A---social",
walk_length=20,
layer="social",
cross_layer_prob=0.1, # 10% chance to cross
seed=42
)
# May contain nodes from both layers
print(walk)
Reproducibility
All walk functions support deterministic seeding for reproducibility.
Fixed Seed
from py3plex.algorithms.general.walkers import basic_random_walk
import networkx as nx
G = nx.karate_club_graph()
# Same seed produces identical walks
walk1 = basic_random_walk(G, 0, 50, seed=42)
walk2 = basic_random_walk(G, 0, 50, seed=42)
assert walk1 == walk2 # Identical
Different Seeds
# Different seeds produce different walks
walk1 = basic_random_walk(G, 0, 50, seed=42)
walk2 = basic_random_walk(G, 0, 50, seed=123)
assert walk1 != walk2 # Different
Batch Reproducibility
from py3plex.algorithms.general.walkers import generate_walks
G = nx.karate_club_graph()
# Same seed produces identical walk sets
walks1 = generate_walks(G, num_walks=100, walk_length=10, seed=42)
walks2 = generate_walks(G, num_walks=100, walk_length=10, seed=42)
assert walks1 == walks2
Performance Tips
Large Graphs
For large sparse graphs (>100k nodes), use basic walks for better performance:
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
# Large sparse graph
G = nx.erdos_renyi_graph(100000, 0.0001, seed=42)
# Use basic walk (p=1, q=1) for speed
walks = generate_walks(
G,
num_walks=10,
walk_length=100,
p=1.0, # No bias = faster
q=1.0,
seed=42
)
Parallel Processing
For generating many walks, consider parallel processing:
from py3plex.algorithms.general.walkers import generate_walks
import networkx as nx
from multiprocessing import Pool
G = nx.karate_club_graph()
def generate_batch(seed):
return generate_walks(G, num_walks=10, walk_length=20, seed=seed)
# Generate walks in parallel
with Pool(4) as pool:
batch_walks = pool.map(generate_batch, range(10))
# Flatten results
all_walks = [walk for batch in batch_walks for walk in batch]
Correctness Validation
The implementation has been validated against the following properties:
Uniformity Test
On unweighted regular graphs, transition probabilities are uniform:
import networkx as nx
from py3plex.algorithms.general.walkers import basic_random_walk
from collections import Counter
# Complete graph (regular)
G = nx.complete_graph(10)
# Count visits to neighbors from node 0
visits = Counter()
for i in range(10000):
walk = basic_random_walk(G, 0, 1, weighted=False, seed=i)
if len(walk) > 1:
visits[walk[1]] += 1
# All neighbors should be visited roughly equally
print(visits)
# Expected: ~1111 visits to each of 9 neighbors
Conservation Test
Sum of transition probabilities equals 1.0 (within machine epsilon):
import networkx as nx
import numpy as np
G = nx.karate_club_graph()
# For each node, verify probability conservation
for node in G.nodes():
neighbors = list(G.neighbors(node))
if len(neighbors) > 0:
weights = np.array([G[node][n].get('weight', 1.0) for n in neighbors])
probs = weights / weights.sum()
assert abs(probs.sum() - 1.0) < 1e-10 # Machine epsilon
Bias Consistency Test
Node2Vec parameters p and q produce expected biases:
import networkx as nx
from py3plex.algorithms.general.walkers import node2vec_walk
# Triangle graph
G = nx.Graph()
G.add_edges_from([(0, 1), (1, 2), (2, 0)])
# Low p should encourage backtracking
backtracks_low_p = 0
for i in range(1000):
walk = node2vec_walk(G, 0, 10, p=0.1, q=1.0, seed=i)
for j in range(2, len(walk)):
if walk[j] == walk[j-2]:
backtracks_low_p += 1
# High p should discourage backtracking
backtracks_high_p = 0
for i in range(1000):
walk = node2vec_walk(G, 0, 10, p=10.0, q=1.0, seed=i)
for j in range(2, len(walk)):
if walk[j] == walk[j-2]:
backtracks_high_p += 1
assert backtracks_low_p > backtracks_high_p * 2 # Significantly more
Edge Weight Test
Visit frequency matches edge weight ratios:
import networkx as nx
from py3plex.algorithms.general.walkers import basic_random_walk
from collections import Counter
G = nx.Graph()
G.add_weighted_edges_from([
(0, 1, 1.0),
(0, 2, 2.0),
(0, 3, 3.0),
])
visits = Counter()
for i in range(10000):
walk = basic_random_walk(G, 0, 1, weighted=True, seed=i)
if len(walk) > 1:
visits[walk[1]] += 1
# Expected ratio: 1:2:3
total = sum(visits.values())
ratios = [visits[1]/total, visits[2]/total, visits[3]/total]
expected = [1/6, 2/6, 3/6]
# Within 5% tolerance
for obs, exp in zip(ratios, expected):
assert abs(obs - exp) < 0.05
References
- Node2Vec:
Grover, A., & Leskovec, J. (2016). node2vec: Scalable feature learning for networks. Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 855-864. https://doi.org/10.1145/2939672.2939754
- DeepWalk:
Perozzi, B., Al-Rfou, R., & Skiena, S. (2014). DeepWalk: Online learning of social representations. Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 701-710. https://doi.org/10.1145/2623330.2623732
- Random Walks on Graphs:
Lovász, L. (1993). Random walks on graphs: A survey. Combinatorics, Paul Erdős is Eighty, 2, 1-46.
API Reference
See py3plex.algorithms.general.walkers
for complete API documentation.
Examples
Complete examples can be found in:
examples/example_random_walks.py
- Basic usage examplestests/test_random_walks.py
- Comprehensive test suite
Repository: https://github.com/SkBlaz/py3plex