AI Translated from Chinese
Introduction
What is Graph Theory?
Graph theory is a very interesting branch of discrete mathematics, with “graphs” as the main research object, studying the internal structure of graphs, exploring their combinatorial, algebraic, and topological properties, and finding “good graphs” that satisfy conditions. For example, the famous Seven Bridges problem (Euler circuit) belongs to graph theory.
So, what is a graph?
Simply put, a graph (Graph) is an object composed of multiple points and lines connecting these points. For specific definitions, see Wikipedia - Graph, OI Wiki - Graph Theory Concepts
In computers, graphs are also a very important data structure. Problems like shortest paths, minimum spanning trees, and network flows often require graphs to implement.
Graph Storage
This article uses $ n $ to refer to the number of nodes in a graph, $ m $ to refer to the number of edges, and $ d^+(u) $ to refer to the out-degree of node $ u $, i.e., the number of edges starting from $ u $.
Adjacency Matrix
Method
From the concept of graphs, we know that graphs consist of node sets and edge sets. We can use a 2D array G to store edges, where G[u][v] = 1 indicates an edge exists from u to v. If the edge has weight w, we can set G[u][v] = w.
#include <iostream>
#include <cstring>
using namespace std;
const int maxn = 1000;
int n, m;
bool vis[maxn];
int G[maxn][maxn];
// Check if edge (u, v) exists
bool find_edge(int u, int v) { return G[u][v]; }
void dfs(int u) {
if (vis[u]) return;
vis[u] = true;
cout << u << ' ';
for (int v = 1; v <= n; v++)
if (G[u][v]) dfs(v);
}
int main() {
// Total n nodes, m edges
cin >> n >> m;
for (int i = 1; i <= m; i++) {
int u, v, w;
cin >> u >> v >> w;
G[u][v] = w;
}
cout << endl;
for (int i = 1; i <= n; i++) {
memset(vis, false, sizeof(vis));
dfs(i);
cout << '\n';
}
cout << endl;
cout << find_edge(3, 1) << endl;
cout << find_edge(4, 5) << endl;
return 0;
}
/* Input example:
5 6
1 2 1
2 3 2
1 3 3
3 1 10
3 4 3
5 2 7
*/
Example problem: P1359 Renting a Yacht
Use adjacency matrix to store the graph. Note that in this problem, the adjacency matrix is a half-matrix (similar to an upper triangular matrix, but diagonal elements are 0, i.e., a[i][i] = 0). Note that plain DFS will time out, so pruning is needed.
Dynamic programming can also solve it, and the code is very concise (seems like it was originally a DP problem, but I hardcoded it into DFS).
Complexity
Query if an edge exists: $ O(1) $
Traverse all outgoing edges of a node: $ O(n) $
Traverse the entire graph: $ O(n^2) $
Space complexity: $ O(n^2) $
Applications
Adjacency matrix is only suitable when there are no multiple edges (or multiple edges can be ignored).
Its most significant advantage is being able to query whether an edge exists in $ O(1) $.
Since adjacency matrix has very low efficiency on sparse graphs (especially on graphs with many nodes where space is unaffordable), it’s generally only used on dense graphs.
(The above information about complexity and applications is from OI Wiki - Graph Storage)
Adjacency List
Method
We can store edges associated with each node separately, like:
1: -->a -->b -->c
2: -->b -->c
3: -->c -->d -->e
4: -->a -->d
As you can see, an adjacency list is a kind of linked list that stores edges in any order. But in algorithm competitions, we can use arrays or vector to simulate linked lists.
We can also store the endpoints of edges, as shown in the diagram:

1: [2, 3, 5]
2: [5]
3: [1]
4: [3]
5: []
For undirected edges, just store both endpoints once. For example, undirected edge (1, 2):
1: [2]
2: [1]
Here is the C++ implementation:
#include <iostream>
#include <vector>
using namespace std;
struct Node {
int v; // Endpoint of edge
int w; // Edge weight
};
int n, m;
vector<int> vis;
vector<vector<Node> > G; // Store all outgoing edge information of node u (endpoint, edge weight, etc.)
// Check if edge (u, v) exists
bool find_edge(int u, int v) {
for (int i = 0; i < (int)G[u].size(); i++)
if (G[u][i].v == v) return true;
return false;
}
void dfs(int u) {
if (vis[u]) return;
vis[u] = 1;
cout << u << ' ';
for (int i = 0; i < (int)G[u].size(); i++) dfs(G[u][i].v);
}
int main() {
// Total n nodes, m edges
cin >> n >> m;
vis.resize(n+1, false);
G.resize(n+1);
for (int i = 1; i <= m; i++) {
int u, v, w;
cin >> u >> v >> w;
G[u].push_back({v, w});
}
for (int i = 1; i <= n; i++) {
dfs(i);
cout << endl;
}
cout << find_edge(1, 2) << endl;
return 0;
}
Complexity
Query if edge from u to v exists: $ O(d^+(u)) $ (if sorted beforehand, binary search can achieve $ O(\log(d^+(u))) $)
Traverse all outgoing edges of node u: $ O(d^+(u)) $
Traverse the entire graph: $ O(n+m) $
Space complexity: $ O(m) $
Applications
Suitable for storing various graphs, unless there are special requirements (such as needing to quickly query if an edge exists with few nodes, adjacency matrix can be used).
Especially suitable when needing to sort all outgoing edges of a node.
Chain Forward Star
Method
An upgraded version of adjacency list, which is essentially a statically built adjacency list.
Its characteristic is numbering edges, with the first edge numbered 0.
The key is understanding these variables:
- Edge array:
e[i]stores the edge weight of the i-th edge ne, representing the number of the previous edge with the same starting point- Head node array:
h[i]represents the number of the last edge starting from i
The h array is generally initialized to -1, and when traversing the graph, ne = -1 is used as the termination condition.
#include <iostream>
#include <cstring>
#include <algorithm>
using namespace std;
const int N = 1e5 + 10;
int n, m, idx;
int h[N], ne[N], to[N], e[N], vis[N];
// h[u] initial value is -1
void add_edge(int u, int v, int w) {
to[idx] = v; // Endpoint
e[idx] = w; // Edge weight
ne[idx] = h[u]; // Previous edge number starting from u
h[u] = idx++; // Update the previous edge number starting from u
}
void dfs(int u) {
cout << u << ' ';
vis[u] = true;
for (int i = h[u]; ~i; i = ne[i]) { // ~i is equivalent to i != -1
if (!vis[to[i]]) dfs(to[i]);
}
}
void print() { // Traverse endpoints of outgoing edges for each node
for (int i = 1; i <= n; i++) {
cout << i << ": ";
for (int j = h[i]; ~j; j = ne[j])
cout << to[j] << ' ';
cout << "\n";
}
}
int main() {
memset(h, -1, sizeof(h)); // Initialize; ne[i] = -1 means reaching endpoint
// Store graph
cin >> n >> m;
for (int i = 1; i <= m; i++) {
int u, v, w;
cin >> u >> v >> w;
add_edge(u, v, w);
// add_edge(v, u, w); // Undirected edge
// Output storage process
cout<<"to["<<idx-1<<"] = "<<to[idx-1]<<"; "
<<"ne["<<idx-1<<"] = "<<ne[idx-1]<<"; "
<<"h["<<u<<"] = "<<h[u]<<";\n";
}
print();
for (int i = 1; i <= n; i++) {
memset(vis, 0, sizeof(vis));
dfs(i);
}
// Output storage results
cout << "\n";
for (int i = 0; i <= n; i++) {
cout << "h[" << i << "] = " << h[i] << ", ";
cout << "ne[" << i << "] = " << ne[i] << ", ";
cout << "to[" << i << "] = " << to[i] << ", ";
cout << "e[" << i << "] = " << e[i] << ",\n";
}
return 0;
}
/* Input example:
4 3
1 2 5
2 4 10
4 3 15
*/
Complexity
Query if edge from u to v exists: $ O(d^+(u)) $
Traverse all outgoing edges of node u: $ O(d^+(u)) $
Traverse the entire graph: $ O(n+m) $
Space complexity: $ O(m) $
Applications
Suitable for storing various graphs, but cannot quickly query if an edge exists, nor conveniently sort outgoing edges of a node.
The advantage is that edges are numbered, which can sometimes be very useful. Also, if idx’s initial value is odd, when storing bidirectional edges, i ^ 1 is the reverse edge of i (often used in network flow).
Example Problem
The trick for this problem is to store the graph in reverse, to find the largest numbered node each point can reach in reverse, i.e.:
cin >> u >> v >> w;
add_edge(v, u);
Reverse DFS to get the result:
for (int i = n; i > 0; i--)
if (!vis[i]) dfs(i, i);
Finally output vis[].
Graph Theory Related Problems
Topological Sort
In graph theory, a sequence of vertices of a directed acyclic graph is called a topological sort of the graph if and only if it satisfies the following conditions:
- The sequence contains each vertex exactly once;
- If A comes before B in the sequence, there is no path from B to A in the graph.
Topological sorting is obtaining a total order from a partial order. Simply put, it’s traversing all nodes of the graph in a specific order: before traversing a node, all nodes pointing to it must have been traversed. As you can see, only directed acyclic graphs (DAG) have topological order.
We can devise such an algorithm:
L = [] # Topological order
while there are nodes with in-degree 1:
randomly select a node u with in-degree 1
L.append(u)
for v in G[u]:
v's degree -= 1
delete node u from the graph
if len(L) == initial graph's total nodes:
return True # Topological order exists
else:
return False
For example, the diagram below:
# Its topological order can be:
1 2 3 4 5
# Or:
1 3 2 4 5
Can be implemented with BFS or DFS. Since only one traversal of nodes is needed, the complexity is quite reasonable, at $O(V+E)$.
BFS
int n, m, _in[N]; // Nodes, edges, in-degree
bool vis[N];
vector<vector<int>> G;
vector<int> topo;
bool toposort() {
queue<int> q;
for (int i = 1; i <= n; i++) {
if (!_in[i]) q.push(i);
}
while (!q.empty()) {
int u = q.front();
q.pop();
topo.emplace_back(u);
for (auto &v : G[u]) {
_in[v]--;
if (!_in[v]) q.push(v);
}
}
return (int)topo.size() == n;
}
DFS
void dfs(int i) {
if (!_in[i] && !vis[i] && !tmp[i]) {
topo.emplace_back(i);
vis[i] = true;
tmp[i] = true; // Temporary mark. If traversing to already marked node, not a DAG
for (auto &j : G[i]) {
_in[j]--;
dfs(j);
}
tmp[i] = false;
}
}
bool toposort() {
for (int i = 1; i <= n; i++) dfs(i);
return ((int)topo.size() == n);
}
Example Problem
Problem: Given an undirected tree, each operation deletes all leaf nodes. How many nodes remain after k operations?
Analogous to topological sort, each operation selects nodes with degree 1, traverses their outgoing edges and reduces the degree of the endpoint by 1, then deletes these nodes. Note: The given tree is undirected, so DFS cannot be used directly (for example, if the tree is a single chain, DFS will traverse from one end to the other directly). BFS is better to write, and the code is similar. The only difference is to check if a neighbor’s degree is 1 before deletion (meaning the neighbor is already in the queue).
#include <bits/stdc++.h>
#define IOS ios::sync_with_stdio(false);cin.tie(0);cout.tie(0);
#define _all(x) (x).begin(), (x).end()
#define _abs(x) ((x >= 0) ? (x) : (-x))
#define PII pair<int, int>
#define PLL pair<LL, LL>
#define PDD pair<double, double>
#define fi first
#define se second
typedef long long LL;
typedef long double LD;
typedef unsigned long long ULL;
const int INF = 0x3f3f3f3f;
const double PI = acos(-1.0);
using namespace std;
const int N = 4e5 + 10;
int d[N], vis[N], n, k; // d[] is node degree, vis[] is number of operations
vector<vector<int>> G;
void init() {
memset(d, 0, sizeof(d));
memset(vis, 0, sizeof(vis));
cin >> n >> k;
G.clear();
G.resize(n+1);
for (int i = 0; i < n-1; i++) {
int u, v;
cin >> u >> v;
G[u].emplace_back(v);
G[v].emplace_back(u);
d[u]++, d[v]++;
}
}
// BFS-based topological sort
void solve() {
queue<int> q;
for (int i = 1; i <= n; i++) {
if (d[i] == 1) {
q.push(i);
vis[i] = 1;
}
}
while (!q.empty()) {
int u = q.front();
q.pop();
for (auto &v : G[u]) {
// Neighbor degree is 1, meaning it's already in the queue
if (d[v] != 1) {
d[v]--;
// After deleting u, v's degree is 1, v enters queue
if (d[v] == 1) {
vis[v] = vis[u] + 1;
q.push(v);
}
}
}
}
int ans = 0;
for (int i = 1; i <= n; i++) {
if (vis[i] > k) ans++;
}
cout << ans << "\n";
}
int main() {
IOS
int test_case;
cin >> test_case;
while (test_case--) {
init();
solve();
}
return 0;
}
Euler Path, Euler Circuit
(To be filled…)
Minimum Spanning Tree
(To be filled…)
Appendix
References
[2]OI Wiki - Graph Theory Concepts
[4]Shuichi Miyazaki “Programmer’s Mathematics 4: Graph Theory Introduction”
[5]Chain Forward Star – The Most Easy-to-Understand Explanation