Algebra Basics: Groups, Fields, Vector Spaces, and Homomorphisms

Algebra is the study of mathematical structures and the relationships between them. It is a fundamental tool in many areas of mathematics and physics. Algebraic structures are abstract elements with operations between them that satisfy certain axioms. This essay aims to be a semi-formal, intuitive introduction to the fundamental algebraic structures. This introduction is in no way complete, but rather a starting point to build upon for other essays such as representation theory, Noether's theorem, and Wigner's theorem. Structurally, I will formally introduce the definitions, then follow up with intuitive explanations and examples. Note that with the intuitive formulation some mathematical rigor gets lost.

0. Sets

A set is a collection of distinct elements. We write sets with curly brackets: {1, 2, 3}. The order does not matter: {1, 2, 3} = {3, 2, 1} = {2, 1, 3}. Elements are either in the set or not, and duplicates are ignored: {3, 3, 1} = {3, 1} = {1, 3} and 4 is not in the set.

The notation x ∈ S means "x is an element of S". The notation S \ T means "the set S without the elements of T", for example {1, 2, 3} \ {2} = {1, 3}.

If any of the set notation in the following sections feels unfamiliar, a quick search for "set theory" should help.

1. Group

A group (G, ⋅G) is a set G with a binary operation ⋅G satisfying the following axioms:

  • Closure: for all g1, g2 ∈ G, g1 ⋅G g2 ∈ G
  • Identity: there exists an element eG ∈ G such that for all g ∈ G, g ⋅G eG = eG ⋅G g = g
  • Inverses: for all g ∈ G, there exists an element g⁻¹ ∈ G such that g ⋅G g⁻¹ = g⁻¹ ⋅G g = eG
  • Associativity: for all g1, g2, g3 ∈ G, (g1 ⋅G g2) ⋅G g3 = g1 ⋅G (g2 ⋅G g3)

If additionally g1 ⋅G g2 = g2 ⋅G g1 for all g1, g2 ∈ G, the group is called abelian (or commutative).

A binary operation just means it takes two arguments and produces one value. The elements can be numbers, coordinates, or whatever, and the operation can be addition, multiplication, or whatever. They just need to satisfy the axioms above.

What this list of axioms essentially describes: if I take two elements and apply the binary operation, I still end up in the group. There is one element that, when used in the binary operation, spits the other element out unchanged. For every element there is another element that transforms it into this neutral element. And lastly, if several elements and operations are chained together, the order in which I prioritize the operations doesn't matter. However, the order in which the elements are applied still does matter. If the order in which the elements are applied does not matter either, then the group is abelian, aka commutative.
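To make the axioms concrete, here is a brute-force check of all four (plus commutativity) for the integers {0, 1, 2, 3} under addition modulo 4 - a small illustrative Python sketch, not part of the formal definition:

```python
from itertools import product

G = [0, 1, 2, 3]                  # integers mod 4
op = lambda a, b: (a + b) % 4     # the group operation: addition mod 4

# Closure: combining any two elements stays inside G
assert all(op(a, b) in G for a, b in product(G, G))

# Identity: 0 leaves every element unchanged
e = 0
assert all(op(g, e) == g and op(e, g) == g for g in G)

# Inverses: every g has some h with g + h = 0 (mod 4)
assert all(any(op(g, h) == e and op(h, g) == e for h in G) for g in G)

# Associativity: how we group the operations does not matter
assert all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(G, G, G))

# This particular group is also abelian
assert all(op(a, b) == op(b, a) for a, b in product(G, G))
```

Swapping in a different set and operation (say, invertible matrices under multiplication) and rerunning the same checks is a quick way to test whether a candidate structure is a group.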

Examples:

The identity transformation on a set X is the map id: X → X defined by id(x) = x for all x ∈ X. It leaves every element unchanged.

Note that id stands for the identity transformation. (Other ways to write it are idX, IdX, or 1X, where X denotes the set on which it acts as the identity.)

The composition of two maps f: X → Y and g: Y → Z is the map g ∘ f: X → Z defined by (g ∘ f)(x) = g(f(x)). Apply f first, then g.

Composition essentially expresses taking the function of a function, giving us more convenient notation. For matrices, composition becomes matrix multiplication: if A and B are matrices, then (A ∘ B)(v) = A(B(v)), which is the same as the matrix product AB applied to v. For example, rotating by 90° then reflecting across the x-axis means: first apply the rotation matrix, then the reflection matrix. The result is a single matrix (their product) that does both in one step.
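This can be sketched in plain Python (2x2 matrices as nested lists; the helper names matmul and apply are mine, for illustration):

```python
# 2x2 matrices as nested lists; matmul composes the corresponding linear maps
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

R = [[0, -1], [1, 0]]    # rotation by 90° counterclockwise
S = [[1, 0], [0, -1]]    # reflection across the x-axis

v = (1, 0)
# (S ∘ R)(v): apply R first, then S ...
step_by_step = apply(S, apply(R, v))
# ... equals the single matrix product S·R applied to v
one_matrix = apply(matmul(S, R), v)
assert step_by_step == one_matrix == (0, -1)
```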

1.1 Subgroups and Order

A subgroup H of a group (G, ⋅G) is a subset H ⊆ G that is itself a group under the same operation. That is: H contains the identity, is closed under the group operation, and contains inverses of all its elements.

This is interesting to look at from the other direction: we can take two subgroups (two subsets that are each groups under the same operation) and combine them. The original groups automatically remain subgroups of the result. But because we can now combine elements from both via the operation, this can open up entirely new elements that were in neither of the original groups.

For example, the rotations {e, r, r², r³} form a subgroup of D4 (the full symmetry group of the square; note some textbooks call it D8 because its order is 8). The rotations alone satisfy all group axioms under composition. Now take a single reflection, say s (reflection across the line y = 0). This by itself forms a group under composition: {e, s}, since applying s twice gives the identity (s² = e).

Now combine the rotations with this single reflection. Under composition, entirely new elements appear: s ∘ r gives a reflection across the line y = -x, s ∘ r² gives a reflection across the y-axis, and s ∘ r³ gives a reflection across the line y = x. None of these were in our original two subgroups, but they emerge from combining elements across them.

Note that the reflections alone ({s, sr, sr², sr³}) do not form a subgroup: composing two reflections gives a rotation, not a reflection. They are not closed under composition. Think about it. From just one rotation r and one reflection s, we have generated all 8 elements of D4.
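This generation process can be brute-forced: a small Python sketch (representing r and s as 2x2 integer matrices, an illustrative choice) that keeps composing known elements until nothing new appears:

```python
# Represent each symmetry of the square as a 2x2 integer matrix (as tuples,
# so the matrices are hashable and can live in a set)
def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

r = ((0, -1), (1, 0))    # rotation by 90°
s = ((1, 0), (0, -1))    # reflection across the x-axis

# Repeatedly compose known elements until the set is closed
elements = {r, s}
while True:
    new = {matmul(a, b) for a in elements for b in elements} | elements
    if new == elements:
        break
    elements = new

assert len(elements) == 8   # r and s alone generate all of D4
```

Note that for a finite set of matrices like this, closure alone is enough to recover the identity and inverses (s·s already gives the identity on the first pass).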

The order of a group element g ∈ G, written ord(g), is the smallest positive integer n such that gⁿ = eG. If no such n exists, the element has infinite order.

In other words: ord(g) is how many times you need to apply the group operation before you get back to the identity.

For example, in D4: ord(r) = 4 (four 90° rotations = 360° = identity), ord(r²) = 2 (two 180° rotations = identity), and ord(s) = 2 (reflecting twice = identity).
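The definition translates directly into a loop (the helper name `order` is mine, chosen for illustration; shown here on the integers mod 6 under addition, where the identity is 0):

```python
def order(g, op, e):
    """Smallest n >= 1 such that g combined with itself n times equals e."""
    power, n = g, 1
    while power != e:
        power = op(power, g)
        n += 1
    return n

# Integers mod 6 under addition; the identity element is 0
add6 = lambda a, b: (a + b) % 6
assert order(2, add6, 0) == 3   # 2+2+2 = 6 = 0 (mod 6)
assert order(1, add6, 0) == 6
assert order(5, add6, 0) == 6
```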

2. Field

A field (F, +F, ⋅F) is a set F with two binary operations +F (field addition) and ⋅F (field multiplication) satisfying the following rules:

  • Closure: for all f1, f2 ∈ F, f1 +F f2 ∈ F and f1 ⋅F f2 ∈ F
  • Associativity: for all f1, f2, f3 ∈ F, (f1 +F f2) +F f3 = f1 +F (f2 +F f3) and (f1 ⋅F f2) ⋅F f3 = f1 ⋅F (f2 ⋅F f3)
  • Commutativity: for all f1, f2 ∈ F, f1 +F f2 = f2 +F f1 and f1 ⋅F f2 = f2 ⋅F f1
  • Distributivity: for all f1, f2, f3 ∈ F, f1 ⋅F (f2 +F f3) = (f1 ⋅F f2) +F (f1 ⋅F f3)
  • Additive identity: there exists an element 0F ∈ F such that for all f ∈ F, f +F 0F = f
  • Multiplicative identity: there exists an element 1F ∈ F (with 1F ≠ 0F) such that for all f ∈ F, f ⋅F 1F = f
  • Additive inverses: for all f ∈ F, there exists an element -f ∈ F such that f +F (-f) = 0F
  • Multiplicative inverses: for all f ∈ F with f ≠ 0F, there exists an element f⁻¹ ∈ F such that f ⋅F f⁻¹ = 1F

Structurally, a field combines two abelian groups tied together by the distributive law: (F, +F) is an abelian group with identity 0F, and (F \ {0F}, ⋅F) is an abelian group with identity 1F. 0F is excluded from the multiplicative group because multiplication by 0F collapses everything onto 0F: 0F ⋅F f = 0F for all f ∈ F. Thus this map is not bijective, and no element f can satisfy 0F ⋅F f = 1F, which means that the multiplicative inverse of 0F does not exist (this is why you cannot divide by 0F - as you might remember from school).

F* (also written F×) is the multiplicative group of the field F excluding the zero element: F* = (F \ {0F}, ⋅F). This is a group under field multiplication with identity 1F.

Reading the notation: {0F} is the set containing just the zero element, and F \ {0F} means "F without zero" (see section 0). So ℝ* = ℝ \ {0} = {..., -2, -1, 0.5, 1, 1.5, 2, ...}, all real numbers except zero.

Examples: real numbers, complex numbers, rational numbers, and finite fields like 𝔽2 (the field with two elements 0 and 1).

2.1 Characteristic

The characteristic of a field F is the smallest positive integer n such that adding 1F to itself n times gives 0F. If no such n exists, the characteristic is defined as 0.

Note that some also define the characteristic as ∞ (infinity) - both just meaning that the field has no characteristic. The characteristic of a field tells you how many times you need to add 1F to itself before you loop back to 0F. Similar to how a circle can be seen as a polygon with infinitely many corners or zero corners depending on perspective, a field where this never happens can be said to have characteristic ∞ or 0. By our convention it is 0.

Examples: ℝ, ℚ, and ℂ all have characteristic 0 (you can add 1 to itself forever and never reach 0). The finite field 𝔽2 has characteristic 2 because 1F +F 1F = 0F.
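For a finite field 𝔽p the "keep adding 1 until you hit 0" loop can be run literally - an illustrative Python sketch (assuming p is prime, so that the integers mod p really form a field):

```python
def characteristic(p):
    """Characteristic of the finite field F_p (p prime): count how many
    times 1 must be added to itself before the sum wraps around to 0."""
    total, n = 1 % p, 1
    while total != 0:
        total = (total + 1) % p
        n += 1
    return n

assert characteristic(2) == 2   # 1 + 1 = 0 in F2
assert characteristic(5) == 5
```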

3. Vector Space

Now getting onto vector spaces: a vector space V is a concept built on top of a field F. A vector space consists of things called vectors, equipped with a vector addition operation +V and a scalar multiplication operation ⋅V (multiplying a vector by a scalar from F), fulfilling the axioms below.

The standard way to express vectors is in terms of ordered tuples Fⁿ: (f1, f2, ..., fn) of field elements, where addition and scalar multiplication act component-wise. Ordered means that the order of elements matters: (1,2,3) is not the same as (2,1,3), and both are vectors in ℝ³. Think of such a tuple for intuition; the abstract definition allows more general constructs to be vectors (e.g., the set of all polynomials of degree at most n also forms a vector space).

Now back to the tuple model with v1 = (f1, f2, f3, ...) and v2 = (h1, h2, h3, ...) (where f1, f2, f3, ... ∈ F and h1, h2, h3, ... ∈ F are field elements):

Vector addition: v1 +V v2 = (f1 +F h1, f2 +F h2, f3 +F h3, ...)

Scalar multiplication: For a scalar c ∈ F, c ⋅V v1 = (c ⋅F f1, c ⋅F f2, c ⋅F f3, ...)
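The two component-wise operations above can be sketched directly in Python (the helper names vadd and smul are mine, for illustration):

```python
def vadd(v, w):
    """Component-wise vector addition."""
    return tuple(a + b for a, b in zip(v, w))

def smul(c, v):
    """Scalar multiplication: scale every component by the scalar c."""
    return tuple(c * a for a in v)

assert vadd((1, 2, 3), (4, 5, 6)) == (5, 7, 9)
assert smul(2, (2, 1, 3)) == (4, 2, 6)
```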

A vector space V over a field F with operations +V and ⋅V satisfies:

  • Closure: for all v1, v2 ∈ V, v1 +V v2 ∈ V, and for all c ∈ F, v ∈ V, c ⋅V v ∈ V
  • Commutativity of +V: v1 +V v2 = v2 +V v1
  • Associativity of +V: (v1 +V v2) +V v3 = v1 +V (v2 +V v3)
  • Compatibility: f1 ⋅V (f2 ⋅V v) = (f1 ⋅F f2) ⋅V v
  • Additive distributivity: f ⋅V (v1 +V v2) = (f ⋅V v1) +V (f ⋅V v2)
  • Scalar distributivity: (f1 +F f2) ⋅V v = (f1 ⋅V v) +V (f2 ⋅V v)
  • Additive identity: there exists 0V ∈ V such that v +V 0V = v
  • Scalar identity: 1F ⋅V v = v
  • Additive inverses: for all v ∈ V, there exists -v ∈ V such that v +V (-v) = 0V

Here 0V is the zero vector (the neutral element of vector addition) - all entries are 0F. And 1F is the multiplicative identity of the field (not an element of V). The additive inverse -v is the vector that "cancels out" v under addition.

Examples:

Note that in the context of vector spaces we often call the field elements scalars - probably because they scale the vector: 2 ⋅V (2,1,3) = (4,2,6) (the vector has 2 times the length of the initial one).

3.1 Span, Basis, and Dimension

The span of a set of vectors {v1, ..., vk} is the set of all vectors you can make by adding and scaling them: span{v1, ..., vk} = {c1 ⋅V v1 +V ... +V ck ⋅V vk : c1, ..., ck ∈ F}.

For example, span{(1,0)} = {(c, 0) : c ∈ ℝ} is the x-axis. span{(1,0), (0,1)} = ℝ² is the whole plane, because any (a, b) can be written as a ⋅ (1,0) + b ⋅ (0,1).

A set of vectors {v1, ..., vn} is called linearly independent if no vector in the set can be written as a combination of the others. Formally: c1 ⋅V v1 +V ... +V cn ⋅V vn = 0V implies c1 = ... = cn = 0F.

A basis of a vector space V is a set of vectors that is both linearly independent and spans all of V. Every vector in V can be written as a unique combination of basis vectors.

The dimension of a vector space is the number of vectors in any basis. For example, ℝ³ has dimension 3 (one standard basis is {(1,0,0), (0,1,0), (0,0,1)}).

Essentially, the basis is the minimal set of vectors we need to reach every single possible vector in the space (which is what the span expresses), and the dimension is just the number of these vectors. Every basis automatically has the minimum number of vectors needed to create the space opened up by the span. A spanning set, however, can also contain linearly dependent vectors and still open up the same space - it just means we used more vectors than necessary to open it up. Once you choose a basis, every vector becomes a tuple of numbers (its coordinates in that basis), and every linear map becomes a matrix. Different basis, different matrix, same linear map. The basis is the "coordinate system" that turns abstract vectors into concrete numbers.
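One concrete way to test linear independence is Gaussian elimination: the rank of a list of vectors equals the dimension of their span, and the set is independent exactly when the rank equals the number of vectors. A small sketch (using exact fractions to avoid floating-point issues; the function name rank is mine):

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors via Gaussian elimination.

    The set is linearly independent exactly when rank == len(vectors)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        # find a row (at or below r) with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # eliminate this column from every other row
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# {(1,0,0), (0,1,0), (0,0,1)} is a basis of R^3: independent, rank 3
assert rank([(1, 0, 0), (0, 1, 0), (0, 0, 1)]) == 3
# Adding (1,1,0) leaves the span unchanged but makes the set dependent:
# the rank stays 3 while the set now has 4 vectors
assert rank([(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]) == 3
```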


3.2 Subspaces

A subspace W of a vector space V is a subset W ⊆ V that is itself a vector space under the same operations. That is: W contains 0V, is closed under vector addition, and is closed under scalar multiplication.

This is the same concept as for subgroups, just applied to vector spaces. Note that V is always a subspace of itself. For example, in ℝ³: the x-axis {(a, 0, 0) : a ∈ ℝ} is a 1D subspace. The xy-plane {(a, b, 0) : a, b ∈ ℝ} is a 2D subspace. The set {(1, 0, 0)} alone is not a subspace (it does not contain the zero vector and is not closed under addition). Every span of vectors is a subspace.

3.3 Inner Product

An inner product on a vector space V over ℝ (or ℂ) is a function ⟨-,-⟩: V × V → F that assigns a scalar to each pair of vectors, satisfying:

  • Conjugate symmetry: ⟨v, w⟩ = ⟨w, v⟩* (over ℝ this is just ⟨v, w⟩ = ⟨w, v⟩)
  • Linearity in the second argument: ⟨v, c1w1 + c2w2⟩ = c1⟨v, w1⟩ + c2⟨v, w2⟩
  • Positive definiteness: ⟨v, v⟩ ≥ 0, with equality only if v = 0V

The inner product lets you measure lengths (||v|| = √⟨v, v⟩) and angles between vectors. The standard dot product on ℝⁿ is the most familiar example: ⟨(a1, a2), (b1, b2)⟩ = a1b1 + a2b2. (The concept of length and angle is more flexible in this context than what you know from geometry - it is dependent on how we define the inner product operation concretely.)

Think of an inner product as an additional structure on top of a vector space. Not every vector space comes with one.

Examples on ℝ²:
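As an illustration (the specific "weighted" inner product below is my own pick, not from the text): the standard Euclidean dot product and a second, still valid inner product on ℝ² assign the same vector different lengths:

```python
from math import sqrt

def euclid(v, w):
    """Standard dot product on R^2."""
    return v[0] * w[0] + v[1] * w[1]

def weighted(v, w):
    """A different (still valid) inner product: weight the first axis by 4."""
    return 4 * v[0] * w[0] + v[1] * w[1]

def length(v, inner):
    """||v|| = sqrt(<v, v>), relative to the chosen inner product."""
    return sqrt(inner(v, v))

# The same vector has different lengths under the two inner products:
assert length((1, 0), euclid) == 1.0
assert length((1, 0), weighted) == 2.0
```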

Different inner products on the same vector space lead to different notions of length and angle. Each inner product implicitly defines a way to measure distances, called a metric. The Euclidean inner product gives the Euclidean metric (straight-line distance), while the Minkowski inner product gives the Minkowski metric (spacetime intervals; strictly speaking it is a pseudo-inner product, since it is not positive definite). A metric is actually a more general concept that does not require an inner product at all:

3.4 Metric

A metric on a set X is a function d: X × X → ℝ satisfying:

  • Non-negativity: d(x, y) ≥ 0, with d(x, y) = 0 if and only if x = y
  • Symmetry: d(x, y) = d(y, x)
  • Triangle inequality: d(x, z) ≤ d(x, y) + d(y, z)

A metric is much more general than an inner product: it just needs a set, not a vector space. You can define metrics on graphs, strings, surfaces, anything where "distance" makes sense. Every inner product induces a metric via d(v, w) = ||v - w|| = √⟨v - w, v - w⟩, but not every metric comes from an inner product.
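A sketch of both sides of this point: the metric induced by the standard inner product on tuples, and the Hamming distance on strings - a metric with no vector space in sight (the example strings are arbitrary):

```python
from math import sqrt

def euclidean(v, w):
    """Metric induced by the standard inner product: d(v, w) = ||v - w||."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def hamming(x, y):
    """A metric on strings of equal length - no vector space needed:
    the number of positions where the strings differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

assert euclidean((0, 0), (3, 4)) == 5.0
assert hamming("karolin", "kathrin") == 3
```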

3.5 Hilbert Space

A Hilbert space is a vector space with an inner product that is complete: every Cauchy sequence of vectors converges to a vector in the space. (A Cauchy sequence is one where the elements get arbitrarily close to each other as the sequence progresses.)

Every finite-dimensional inner product space is automatically a Hilbert space (completeness is only a concern in infinite dimensions). The simplest example is ℝⁿ with the standard inner product. In physics, quantum mechanics uses infinite-dimensional Hilbert spaces (such as L², the space of square-integrable functions) to describe states with continuously many degrees of freedom. See Wigner's theorem for why Hilbert spaces are central to quantum mechanics (you will probably still need to do some research to fully understand it).

4. Linearity, Linear Maps, and GL(V)

A map T is called linear if it satisfies two properties: it preserves addition (T(a + b) = T(a) + T(b)) and it preserves scalar multiplication (T(c ⋅ a) = c ⋅ T(a)).

Intuitively, a linear map "plays nice" with the structure of a vector space. Scaling the input scales the output by the same amount, and adding two inputs then mapping gives the same result as mapping each input and then adding. Anything that bends, shifts, or warps in a way that breaks these rules is considered nonlinear.

Example: T(x) = 3x is linear (tripling respects addition and scaling). But T(x) = x² is not: T(2+3) = T(5) = 25, but T(2) + T(3) = 4 + 9 = 13 ≠ 25.

More precisely: let V and W be vector spaces over the same field F. A function T: V → W is a linear map if it satisfies:

  • For all v ∈ V and for all f ∈ F, T(f ⋅V v) = f ⋅W T(v) (note the V on the left vs W on the right: the scalar multiplications live in different vector spaces)
  • For all v1, v2 ∈ V, T(v1 +V v2) = T(v1) +W T(v2)

Matrices are the concrete way to write down linear maps once you have chosen a basis to represent the vectors. Every linear map between finite-dimensional vector spaces can be represented as a matrix, and every matrix defines a linear map.

4.1 Operators

An operator on a vector space V is a map T: V → V (from the space to itself). A linear operator is an operator that is also a linear map.

The distinction from a general linear map: a linear map goes from V to some other space W, while an operator maps V back to itself. This means you can apply an operator repeatedly under composition (T(T(v)), written as T²(v)) and compose operators with each other, which is not possible when the domain V and codomain W differ.

4.2 GL(V)

The general linear group GL(V) is the group of all invertible linear operators on V (equivalently, all invertible square matrices once a basis is chosen).

GL(V) is a group under composition: the product of two invertible linear operators is invertible, the identity operator is the neutral element, and every invertible operator has an inverse.

Note how GL(V) is specifically the group of invertible linear operators, not all operators.
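A concrete check of these points for 2x2 real matrices, where invertibility is equivalent to a nonzero determinant (a standard fact, used here for illustration):

```python
def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A 2x2 matrix represents an element of GL(R^2) iff its determinant is nonzero
A = [[1, 2], [3, 4]]     # det = -2, invertible
B = [[0, -1], [1, 0]]    # rotation by 90°, det = 1, invertible
C = [[1, 2], [2, 4]]     # det = 0, NOT invertible: not in GL(R^2)

assert det2(A) != 0 and det2(B) != 0 and det2(C) == 0
# Closure: the product of two invertible matrices is again invertible,
# since det(AB) = det(A)·det(B) is nonzero whenever both factors are
assert det2(matmul(A, B)) == det2(A) * det2(B) != 0
```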

5. Homomorphism, Isomorphism, Endomorphism, and Automorphism

A homomorphism φ is a function from a group (G, ⋅G) to a group (H, ⋅H) (reminder: the group operations can be addition, multiplication, and more): φ: G → H. The homomorphism φ has the property that for all g1, g2 ∈ G, φ(g1 ⋅G g2) = φ(g1) ⋅H φ(g2).

A map f: X → Y is called:

  • Injective (one-to-one): if f(x1) = f(x2) implies x1 = x2. No two distinct elements map to the same output.
  • Surjective (onto): if for every y ∈ Y there exists an x ∈ X such that f(x) = y. Every element in the target is hit.
  • Bijective: if it is both injective and surjective. Every element in Y is hit exactly once.

An isomorphism is a homomorphism φ: G → H that is also a bijection. It guarantees that each element in H is uniquely traceable back to exactly one element in G.

An endomorphism is a homomorphism from a group to itself: φ: G → G.

An automorphism is an endomorphism that is also a bijection (i.e., an isomorphism from a group to itself): φ: G → G with φ bijective. It maps each element of a group back onto an element of the same group (this does not mean that the element maps to itself!).

A general convention is to write products like g1 ⋅G g2 or φ(g1) ⋅H φ(g2) as g1g2 and φ(g1)φ(g2); the multiplicative group operation is implicit. Meanwhile we will keep writing the additive operations with an explicit + sign.

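Two classic homomorphisms, sketched as checks (the concrete examples are my own picks): the exponential map turning addition into multiplication, and reduction mod 2 from (ℤ, +) to (𝔽2, +F):

```python
from math import exp, isclose

# exp is a homomorphism from (R, +) to (R_{>0}, ·):
# it turns the group operation "addition" into "multiplication"
a, b = 1.5, 2.25
assert isclose(exp(a + b), exp(a) * exp(b))

# Reduction mod 2 is a homomorphism from (Z, +) to (F2, +):
# phi(g1 + g2) equals phi(g1) + phi(g2), computed in F2
phi = lambda n: n % 2
for g1, g2 in [(3, 5), (4, 7), (10, 12)]:
    assert phi(g1 + g2) == (phi(g1) + phi(g2)) % 2
```

Note that neither map is an isomorphism onto its stated codomain for free: exp is a bijection onto the positive reals specifically, while phi is surjective but far from injective.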

Appendix: How 𝔽2 Works

The finite field 𝔽2 = {0, 1} is the smallest possible field. It has just two elements, and all arithmetic is done modulo 2 (the remainder after dividing by 2).

Addition table (modulo 2):

  0 + 0 = 0    0 + 1 = 1
  1 + 0 = 1    1 + 1 = 0

Multiplication table (modulo 2):

  0 ⋅ 0 = 0    0 ⋅ 1 = 0
  1 ⋅ 0 = 0    1 ⋅ 1 = 1

You can verify all field axioms hold: 0 is the additive identity, 1 is the multiplicative identity, every element has an additive inverse (0 is its own inverse, 1 is its own inverse since 1 + 1 = 0), and the only nonzero element (1) has a multiplicative inverse (1⁻¹ = 1).

This field has characteristic 2 (section 2.1) because 1 + 1 = 0. It appears in error-correcting codes, cryptography, and computer science (binary arithmetic is essentially 𝔽2).
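The claims in this appendix can also be verified mechanically - a small Python sketch of 𝔽2 arithmetic:

```python
from itertools import product

F2 = [0, 1]
add = lambda a, b: (a + b) % 2
mul = lambda a, b: (a * b) % 2

# The arithmetic from the tables above:
assert add(1, 1) == 0 and add(0, 1) == 1
assert mul(1, 1) == 1 and mul(1, 0) == 0

# Additive inverses: every element is its own inverse
assert all(add(f, f) == 0 for f in F2)
# The only nonzero element, 1, is its own multiplicative inverse
assert mul(1, 1) == 1
# Distributivity holds for every triple of elements
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
           for a, b, c in product(F2, repeat=3))
```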