Algebra Basics: Groups, Fields, Vector Spaces, and Homomorphisms

Algebra is the study of mathematical structures and the relationships between them. It is a fundamental tool in many areas of mathematics and physics. Algebraic structures are abstract elements with operations between them that satisfy certain axioms. This essay aims to be a semi-formal, intuitive introduction to the fundamental algebraic structures. This introduction is in no way complete, but rather a starting point to build upon for other essays such as representation theory, Noether's theorem, and Wigner's theorem. Structurally, I will formally introduce the definitions, then follow up with intuitive explanations and examples. Note that with the intuitive formulation some mathematical rigor gets lost.

0. Sets

A set is a collection of distinct elements. We write sets with curly brackets: {1, 2, 3}. The order does not matter: {1, 2, 3} = {3, 2, 1} = {2, 1, 3}. Elements are either in the set or not, and duplicates are ignored: {3, 3, 1} = {3, 1} = {1, 3} and 4 is not in the set.

The notation x ∈ S means "x is an element of S". The notation S \ T means "the set S without the elements of T", for example {1, 2, 3} \ {2} = {1, 3}.

If any of the set notation in the following sections feels unfamiliar, a quick search for "set theory" should help.

1. Group

A group (G, ⋅G) is a set G with a binary operation ⋅G satisfying the following axioms:

  • Closure: for all g1, g2 ∈ G, g1 ⋅G g2 ∈ G
  • Identity: there exists an element eG ∈ G such that for all g ∈ G, g ⋅G eG = eG ⋅G g = g
  • Inverses: for all g ∈ G, there exists an element g⁻¹ ∈ G such that g ⋅G g⁻¹ = g⁻¹ ⋅G g = eG
  • Associativity: for all g1, g2, g3 ∈ G, (g1 ⋅G g2) ⋅G g3 = g1 ⋅G (g2 ⋅G g3)

If additionally g1 ⋅G g2 = g2 ⋅G g1 for all g1, g2 ∈ G, the group is called abelian (or commutative).

A binary operation just means it takes two arguments and produces one value. The elements can be numbers, coordinates, or whatever, and the operation can be addition, multiplication, or whatever. They just need to satisfy the axioms above.

What this list of axioms essentially describes: if I take two elements and apply the binary operation, I still end up in the group. There is one element that, when used in the binary operation, spits the other element out unchanged. For every element there is another element that transforms it into this neutral element. And lastly, if several elements and operations are chained together, the order in which I prioritize the operations doesn't matter. However, the order in which the elements are applied still does matter. If the order in which the elements are applied does not matter either, then the group is abelian, aka commutative.
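To make the axioms concrete, here is a brute-force check of all four (plus commutativity) for the integers {0, 1, 2, 3} under addition modulo 4 - a small illustrative Python sketch, not part of the formal definition:

```python
from itertools import product

G = [0, 1, 2, 3]                  # integers mod 4
op = lambda a, b: (a + b) % 4     # the group operation: addition mod 4

# Closure: combining any two elements stays inside G
assert all(op(a, b) in G for a, b in product(G, G))

# Identity: 0 leaves every element unchanged
e = 0
assert all(op(g, e) == g and op(e, g) == g for g in G)

# Inverses: every g has some h with g + h = 0 (mod 4)
assert all(any(op(g, h) == e and op(h, g) == e for h in G) for g in G)

# Associativity: how we group the operations does not matter
assert all(op(op(a, b), c) == op(a, op(b, c)) for a, b, c in product(G, G, G))

# This particular group is also abelian
assert all(op(a, b) == op(b, a) for a, b in product(G, G))
```

Swapping in a different set and operation (say, invertible matrices under multiplication) and rerunning the same checks is a quick way to test whether a candidate structure is a group.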

Examples:

The identity transformation on a set X is the map id: X → X defined by id(x) = x for all x ∈ X. It leaves every element unchanged.

Note that id stands for the identity transformation. (Other ways to write it are idX, IdX, or 1X, where X denotes the set on which it acts as the identity.)

The composition of two maps f: X → Y and g: Y → Z is the map g ∘ f: X → Z defined by (g ∘ f)(x) = g(f(x)). Apply f first, then g.

Composition essentially expresses taking the function of a function, giving us more convenient notation. For matrices, composition becomes matrix multiplication: if A and B are matrices, then (A ∘ B)(v) = A(B(v)), which is the same as the matrix product AB applied to v. For example, rotating by 90° then reflecting across the x-axis means: first apply the rotation matrix, then the reflection matrix. The result is a single matrix (their product) that does both in one step.
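This can be sketched in plain Python (2x2 matrices as nested lists; the helper names matmul and apply are mine, for illustration):

```python
# 2x2 matrices as nested lists; matmul composes the corresponding linear maps
def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def apply(A, v):
    return tuple(sum(A[i][k] * v[k] for k in range(2)) for i in range(2))

R = [[0, -1], [1, 0]]    # rotation by 90° counterclockwise
S = [[1, 0], [0, -1]]    # reflection across the x-axis

v = (1, 0)
# (S ∘ R)(v): apply R first, then S ...
step_by_step = apply(S, apply(R, v))
# ... equals the single matrix product S·R applied to v
one_matrix = apply(matmul(S, R), v)
assert step_by_step == one_matrix == (0, -1)
```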

1.1 Subgroups and Order

A subgroup H of a group (G, ⋅G) is a subset H ⊆ G that is itself a group under the same operation. That is: H contains the identity, is closed under the group operation, and contains inverses of all its elements.

This is interesting to look at from the other direction: we can take two subgroups (two subsets that are each groups under the same operation) and combine them. The original groups automatically remain subgroups of the result. But because we can now combine elements from both via the operation, this can open up entirely new elements that were in neither of the original groups.

For example, the rotations {e, r, r², r³} form a subgroup of D4 (the full symmetry group of the square; note some textbooks call it D8 because its order is 8). The rotations alone satisfy all group axioms under composition. Now take a single reflection, say s (reflection across the line y = 0). This by itself forms a group under composition: {e, s}, since applying s twice gives the identity (s² = e).

Now combine the rotations with this single reflection. Under composition, entirely new elements appear: s ∘ r gives a reflection across the line y = -x, s ∘ r² gives a reflection across the y-axis, and s ∘ r³ gives a reflection across the line y = x. None of these were in our original two subgroups, but they emerge from combining elements across them.

Note that the reflections alone ({s, sr, sr², sr³}) do not form a subgroup: composing two reflections gives a rotation, not a reflection. They are not closed under composition. Think about it. From just one rotation r and one reflection s, we have generated all 8 elements of D4.
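This generation process can be brute-forced: a small Python sketch (representing r and s as 2x2 integer matrices, an illustrative choice) that keeps composing known elements until nothing new appears:

```python
# Represent each symmetry of the square as a 2x2 integer matrix (as tuples,
# so the matrices are hashable and can live in a set)
def matmul(A, B):
    return tuple(tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

r = ((0, -1), (1, 0))    # rotation by 90°
s = ((1, 0), (0, -1))    # reflection across the x-axis

# Repeatedly compose known elements until the set is closed
elements = {r, s}
while True:
    new = {matmul(a, b) for a in elements for b in elements} | elements
    if new == elements:
        break
    elements = new

assert len(elements) == 8   # r and s alone generate all of D4
```

Note that for a finite set of matrices like this, closure alone is enough to recover the identity and inverses (s·s already gives the identity on the first pass).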

The order of a group element g ∈ G, written ord(g), is the smallest positive integer n such that gⁿ = eG. If no such n exists, the element has infinite order.

In other words: ord(g) is how many times you need to apply the group operation before you get back to the identity.

For example, in D4: ord(r) = 4 (four 90° rotations = 360° = identity), ord(r²) = 2 (two 180° rotations = identity), and ord(s) = 2 (reflecting twice = identity).
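The definition translates directly into a loop (the helper name `order` is mine, chosen for illustration; shown here on the integers mod 6 under addition, where the identity is 0):

```python
def order(g, op, e):
    """Smallest n >= 1 such that g combined with itself n times equals e."""
    power, n = g, 1
    while power != e:
        power = op(power, g)
        n += 1
    return n

# Integers mod 6 under addition; the identity element is 0
add6 = lambda a, b: (a + b) % 6
assert order(2, add6, 0) == 3   # 2+2+2 = 6 = 0 (mod 6)
assert order(1, add6, 0) == 6
assert order(5, add6, 0) == 6
```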

2. Field

A field (F, +F, ⋅F) is a set F with two binary operations +F (field addition) and ⋅F (field multiplication) satisfying the following rules:

  • Closure: for all f1, f2 ∈ F, f1 +F f2 ∈ F and f1 ⋅F f2 ∈ F
  • Associativity: for all f1, f2, f3 ∈ F, (f1 +F f2) +F f3 = f1 +F (f2 +F f3) and (f1 ⋅F f2) ⋅F f3 = f1 ⋅F (f2 ⋅F f3)
  • Commutativity: for all f1, f2 ∈ F, f1 +F f2 = f2 +F f1 and f1 ⋅F f2 = f2 ⋅F f1
  • Distributivity: for all f1, f2, f3 ∈ F, f1 ⋅F (f2 +F f3) = (f1 ⋅F f2) +F (f1 ⋅F f3)
  • Additive identity: there exists an element 0F ∈ F such that for all f ∈ F, f +F 0F = f
  • Multiplicative identity: there exists an element 1F ∈ F (with 1F ≠ 0F) such that for all f ∈ F, f ⋅F 1F = f
  • Additive inverses: for all f ∈ F, there exists an element -f ∈ F such that f +F (-f) = 0F
  • Multiplicative inverses: for all f ∈ F with f ≠ 0F, there exists an element f⁻¹ ∈ F such that f ⋅F f⁻¹ = 1F

Structurally, a field combines two abelian groups tied together by the distributive law: (F, +F) is an abelian group with identity 0F, and (F \ {0F}, ⋅F) is an abelian group with identity 1F. 0F is excluded from the multiplicative group because multiplication by 0F collapses everything onto 0F: 0F ⋅F f = 0F for all f ∈ F. Thus this map is not bijective, and no element f can satisfy 0F ⋅F f = 1F, which means that the multiplicative inverse of 0F does not exist (this is why you cannot divide by 0F - as you might remember from school).

F* (also written F×) is the multiplicative group of the field F excluding the zero element: F* = (F \ {0F}, ⋅F). This is a group under field multiplication with identity 1F.

Reading the notation: {0F} is the set containing just the zero element, and F \ {0F} means "F without zero" (see section 0). So ℝ* = ℝ \ {0} = {..., -2, -1, 0.5, 1, 1.5, 2, ...}, all real numbers except zero.

Examples: real numbers, complex numbers, rational numbers, and finite fields like 𝔽2 (the field with two elements 0 and 1).

2.1 Characteristic

The characteristic of a field F is the smallest positive integer n such that adding 1F to itself n times gives 0F. If no such n exists, the characteristic is defined as 0.

Note that some also define the characteristic as ∞ (infinity) - both just meaning that the field has no characteristic. The characteristic of a field tells you how many times you need to add 1F to itself before you loop back to 0F. Similar to how a circle can be seen as a polygon with infinitely many corners or zero corners depending on perspective, a field where this never happens can be said to have characteristic ∞ or 0. By our convention it is 0.

Examples: ℝ, ℚ, and ℂ all have characteristic 0 (you can add 1 to itself forever and never reach 0). The finite field 𝔽2 has characteristic 2 because 1F +F 1F = 0F.
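For a finite field 𝔽p the "keep adding 1 until you hit 0" loop can be run literally - an illustrative Python sketch (assuming p is prime, so that the integers mod p really form a field):

```python
def characteristic(p):
    """Characteristic of the finite field F_p (p prime): count how many
    times 1 must be added to itself before the sum wraps around to 0."""
    total, n = 1 % p, 1
    while total != 0:
        total = (total + 1) % p
        n += 1
    return n

assert characteristic(2) == 2   # 1 + 1 = 0 in F2
assert characteristic(5) == 5
```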

3. Vector Space

Now getting onto vector spaces: a vector space V is a concept built on top of a field F. A vector space consists of things called vectors, equipped with a vector addition operation +V and a scalar multiplication operation ⋅V (multiplying a vector by a scalar from F), fulfilling the axioms below.

The standard way to express vectors is in terms of ordered tuples Fⁿ: (f1, f2, ..., fn) of field elements, where addition and scalar multiplication act component-wise. Ordered means that the order of elements matters: (1,2,3) is not the same as (2,1,3), and both are vectors in ℝ³. Think of such a tuple for intuition; the abstract definition allows more general constructs to be vectors (e.g., the set of all polynomials of degree at most n also forms a vector space).

Now back to the tuple model with v1 = (f1, f2, f3, ...) and v2 = (h1, h2, h3, ...) (where f1, f2, f3, ... ∈ F and h1, h2, h3, ... ∈ F are field elements):

Vector addition: v1 +V v2 = (f1 +F h1, f2 +F h2, f3 +F h3, ...)

Scalar multiplication: For a scalar c ∈ F, c ⋅V v1 = (c ⋅F f1, c ⋅F f2, c ⋅F f3, ...)
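The two component-wise operations above can be sketched directly in Python (the helper names vadd and smul are mine, for illustration):

```python
def vadd(v, w):
    """Component-wise vector addition."""
    return tuple(a + b for a, b in zip(v, w))

def smul(c, v):
    """Scalar multiplication: scale every component by the scalar c."""
    return tuple(c * a for a in v)

assert vadd((1, 2, 3), (4, 5, 6)) == (5, 7, 9)
assert smul(2, (2, 1, 3)) == (4, 2, 6)
```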

A vector space V over a field F with operations +V and ⋅V satisfies:

  • Closure: for all v1, v2 ∈ V, v1 +V v2 ∈ V, and for all c ∈ F, v ∈ V, c ⋅V v ∈ V
  • Commutativity of +V: v1 +V v2 = v2 +V v1
  • Associativity of +V: (v1 +V v2) +V v3 = v1 +V (v2 +V v3)
  • Compatibility: f1 ⋅V (f2 ⋅V v) = (f1 ⋅F f2) ⋅V v
  • Additive distributivity: f ⋅V (v1 +V v2) = (f ⋅V v1) +V (f ⋅V v2)
  • Scalar distributivity: (f1 +F f2) ⋅V v = (f1 ⋅V v) +V (f2 ⋅V v)
  • Additive identity: there exists 0V ∈ V such that v +V 0V = v
  • Scalar identity: 1F ⋅V v = v
  • Additive inverses: for all v ∈ V, there exists -v ∈ V such that v +V (-v) = 0V

Here 0V is the zero vector (the neutral element of vector addition) - all entries are 0F. And 1F is the multiplicative identity of the field (not an element of V). The additive inverse -v is the vector that "cancels out" v under addition.

Examples:

Note that in the context of vector spaces we often call the field elements scalars - probably because they scale the vector: 2 ⋅V (2,1,3) = (4,2,6) (the vector has 2 times the length of the initial one).

3.1 Span, Basis, and Dimension

The span of a set of vectors {v1, ..., vk} is the set of all vectors you can make by adding and scaling them: span{v1, ..., vk} = {c1 ⋅V v1 +V ... +V ck ⋅V vk : c1, ..., ck ∈ F}.

For example, span{(1,0)} = {(c, 0) : c ∈ ℝ} is the x-axis. span{(1,0), (0,1)} = ℝ² is the whole plane, because any (a, b) can be written as a ⋅ (1,0) + b ⋅ (0,1).

A set of vectors {v1, ..., vn} is called linearly independent if no vector in the set can be written as a combination of the others. Formally: c1 ⋅V v1 +V ... +V cn ⋅V vn = 0V implies c1 = ... = cn = 0F.

A basis of a vector space V is a set of vectors that is both linearly independent and spans all of V. Every vector in V can be written as a unique combination of basis vectors.

The dimension of a vector space is the number of vectors in any basis. For example, ℝ³ has dimension 3 (one standard basis is {(1,0,0), (0,1,0), (0,0,1)}).

Essentially, the basis is the minimal set of vectors we need to reach every single possible vector in the space (which is what the span expresses), and the dimension is just the number of these vectors. Every basis automatically has the minimum number of vectors needed to create the space opened up by the span. A spanning set, however, can also contain linearly dependent vectors and still open up the same space - it just means we used more vectors than necessary to open it up. Once you choose a basis, every vector becomes a tuple of numbers (its coordinates in that basis), and every linear map becomes a matrix. Different basis, different matrix, same linear map. The basis is the "coordinate system" that turns abstract vectors into concrete numbers.
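One concrete way to test linear independence is Gaussian elimination: the rank of a list of vectors equals the dimension of their span, and the set is independent exactly when the rank equals the number of vectors. A small sketch (using exact fractions to avoid floating-point issues; the function name rank is mine):

```python
from fractions import Fraction

def rank(vectors):
    """Rank of a list of vectors via Gaussian elimination.

    The set is linearly independent exactly when rank == len(vectors)."""
    rows = [[Fraction(x) for x in v] for v in vectors]
    r = 0
    for col in range(len(rows[0])):
        # find a row (at or below r) with a nonzero entry in this column
        pivot = next((i for i in range(r, len(rows)) if rows[i][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        # eliminate this column from every other row
        for i in range(len(rows)):
            if i != r and rows[i][col] != 0:
                f = rows[i][col] / rows[r][col]
                rows[i] = [a - f * b for a, b in zip(rows[i], rows[r])]
        r += 1
    return r

# {(1,0,0), (0,1,0), (0,0,1)} is a basis of R^3: independent, rank 3
assert rank([(1, 0, 0), (0, 1, 0), (0, 0, 1)]) == 3
# Adding (1,1,0) leaves the span unchanged but makes the set dependent:
# the rank stays 3 while the set now has 4 vectors
assert rank([(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0)]) == 3
```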


3.2 Subspaces

A subspace W of a vector space V is a subset W ⊆ V that is itself a vector space under the same operations. That is: W contains 0V, is closed under vector addition, and is closed under scalar multiplication.

This is the same concept as for subgroups, just applied to vector spaces. Note that V is always a subspace of itself. For example, in ℝ³: the x-axis {(a, 0, 0) : a ∈ ℝ} is a 1D subspace. The xy-plane {(a, b, 0) : a, b ∈ ℝ} is a 2D subspace. The set {(1, 0, 0)} alone is not a subspace (it does not contain the zero vector and is not closed under addition). Every span of vectors is a subspace.

3.3 Inner Product

An inner product on a vector space V over ℝ (or ℂ) is a function ⟨-,-⟩: V × V → F that assigns a scalar to each pair of vectors, satisfying:

  • Conjugate symmetry: ⟨v, w⟩ = ⟨w, v⟩* (over ℝ this is just ⟨v, w⟩ = ⟨w, v⟩)
  • Linearity in the second argument: ⟨v, c1w1 + c2w2⟩ = c1⟨v, w1⟩ + c2⟨v, w2⟩
  • Positive definiteness: ⟨v, v⟩ ≥ 0, with equality only if v = 0V

The inner product lets you measure lengths (||v|| = √⟨v, v⟩) and angles between vectors. The standard dot product on ℝⁿ is the most familiar example: ⟨(a1, a2), (b1, b2)⟩ = a1b1 + a2b2. (The concept of length and angle is more flexible in this context than what you know from geometry - it is dependent on how we define the inner product operation concretely.)

Think of an inner product as an additional structure on top of a vector space. Not every vector space comes with one.

Examples on ℝ²:
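As an illustration (the specific "weighted" inner product below is my own pick, not from the text): the standard Euclidean dot product and a second, still valid inner product on ℝ² assign the same vector different lengths:

```python
from math import sqrt

def euclid(v, w):
    """Standard dot product on R^2."""
    return v[0] * w[0] + v[1] * w[1]

def weighted(v, w):
    """A different (still valid) inner product: weight the first axis by 4."""
    return 4 * v[0] * w[0] + v[1] * w[1]

def length(v, inner):
    """||v|| = sqrt(<v, v>), relative to the chosen inner product."""
    return sqrt(inner(v, v))

# The same vector has different lengths under the two inner products:
assert length((1, 0), euclid) == 1.0
assert length((1, 0), weighted) == 2.0
```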

Different inner products on the same vector space lead to different notions of length and angle. Each inner product implicitly defines a way to measure distances, called a metric. The Euclidean inner product gives the Euclidean metric (straight-line distance), while the Minkowski inner product gives the Minkowski metric (spacetime intervals; strictly speaking it is a pseudo-inner product, since it is not positive definite). A metric is actually a more general concept that does not require an inner product at all:

3.4 Metric

A metric on a set X is a function d: X × X → ℝ satisfying:

  • Non-negativity: d(x, y) ≥ 0, with d(x, y) = 0 if and only if x = y
  • Symmetry: d(x, y) = d(y, x)
  • Triangle inequality: d(x, z) ≤ d(x, y) + d(y, z)

A metric is much more general than an inner product: it just needs a set, not a vector space. You can define metrics on graphs, strings, surfaces, anything where "distance" makes sense. Every inner product induces a metric via d(v, w) = ||v - w|| = √⟨v - w, v - w⟩, but not every metric comes from an inner product.
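A sketch of both sides of this point: the metric induced by the standard inner product on tuples, and the Hamming distance on strings - a metric with no vector space in sight (the example strings are arbitrary):

```python
from math import sqrt

def euclidean(v, w):
    """Metric induced by the standard inner product: d(v, w) = ||v - w||."""
    return sqrt(sum((a - b) ** 2 for a, b in zip(v, w)))

def hamming(x, y):
    """A metric on strings of equal length - no vector space needed:
    the number of positions where the strings differ."""
    assert len(x) == len(y)
    return sum(a != b for a, b in zip(x, y))

assert euclidean((0, 0), (3, 4)) == 5.0
assert hamming("karolin", "kathrin") == 3
```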

3.5 Hilbert Space

A Hilbert space is a vector space with an inner product that is complete: every Cauchy sequence of vectors converges to a vector in the space. (A Cauchy sequence is one where the elements get arbitrarily close to each other as the sequence progresses.)

Every finite-dimensional inner product space is automatically a Hilbert space (completeness is only a concern in infinite dimensions). The simplest example is ℝⁿ with the standard inner product. In physics, quantum mechanics uses infinite-dimensional Hilbert spaces (such as L², the space of square-integrable functions) to describe states with continuously many degrees of freedom. See Wigner's theorem for why Hilbert spaces are central to quantum mechanics (you will probably still need to do some research to fully understand it).

4. Linearity, Linear Maps, and GL(V)

A map T is called linear if it satisfies two properties: it preserves addition (T(a + b) = T(a) + T(b)) and it preserves scalar multiplication (T(c ⋅ a) = c ⋅ T(a)).

Intuitively, a linear map "plays nice" with the structure of a vector space. Scaling the input scales the output by the same amount, and adding two inputs then mapping gives the same result as mapping each input and then adding. Anything that bends, shifts, or warps in a way that breaks these rules is considered nonlinear.

Example: T(x) = 3x is linear (tripling respects addition and scaling). But T(x) = x² is not: T(2+3) = T(5) = 25, but T(2) + T(3) = 4 + 9 = 13 ≠ 25.

More precisely: let V and W be vector spaces over the same field F. A function T: V → W is a linear map if it satisfies:

  • For all v ∈ V and for all f ∈ F, T(f ⋅V v) = f ⋅W T(v) (note the V on the left vs W on the right: the scalar multiplications live in different vector spaces)
  • For all v1, v2 ∈ V, T(v1 +V v2) = T(v1) +W T(v2)

Matrices are the concrete way to write down linear maps once you have chosen a basis to represent the vectors. Every linear map between finite-dimensional vector spaces can be represented as a matrix, and every matrix defines a linear map.

4.1 Operators

An operator on a vector space V is a map T: V → V (from the space to itself). A linear operator is an operator that is also a linear map.

The distinction from a general linear map: a linear map goes from V to some other space W, while an operator maps V back to itself. This means you can apply an operator repeatedly under composition (T(T(v)), written as T²(v)) and compose operators with each other, which is not possible when the domain V and codomain W differ.

4.2 GL(V)

The general linear group GL(V) is the group of all invertible linear operators on V (equivalently, all invertible square matrices once a basis is chosen).

GL(V) is a group under composition: the product of two invertible linear operators is invertible, the identity operator is the neutral element, and every invertible operator has an inverse.

Note how GL(V) is specifically the group of invertible linear operators, not all operators.
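A concrete check of these points for 2x2 real matrices, where invertibility is equivalent to a nonzero determinant (a standard fact, used here for illustration):

```python
def det2(A):
    """Determinant of a 2x2 matrix."""
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

# A 2x2 matrix represents an element of GL(R^2) iff its determinant is nonzero
A = [[1, 2], [3, 4]]     # det = -2, invertible
B = [[0, -1], [1, 0]]    # rotation by 90°, det = 1, invertible
C = [[1, 2], [2, 4]]     # det = 0, NOT invertible: not in GL(R^2)

assert det2(A) != 0 and det2(B) != 0 and det2(C) == 0
# Closure: the product of two invertible matrices is again invertible,
# since det(AB) = det(A)·det(B) is nonzero whenever both factors are
assert det2(matmul(A, B)) == det2(A) * det2(B) != 0
```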

5. Homomorphism, Isomorphism, Endomorphism, and Automorphism

A homomorphism φ is a function from a group (G, ⋅G) to a group (H, ⋅H) (reminder: the group operations can be addition, multiplication, and more): φ: G → H. The homomorphism φ has the property that for all g1, g2 ∈ G, φ(g1 ⋅G g2) = φ(g1) ⋅H φ(g2).

A map f: X → Y is called:

  • Injective (one-to-one): if f(x1) = f(x2) implies x1 = x2. No two distinct elements map to the same output.
  • Surjective (onto): if for every y ∈ Y there exists an x ∈ X such that f(x) = y. Every element in the target is hit.
  • Bijective: if it is both injective and surjective. Every element in Y is hit exactly once.

An isomorphism is a homomorphism φ: G → H that is also a bijection. It guarantees that each element in H is uniquely traceable back to exactly one element in G.

An endomorphism is a homomorphism from a group to itself: φ: G → G.

An automorphism is an endomorphism that is also a bijection (i.e., an isomorphism from a group to itself): φ: G → G with φ bijective. It maps each element of a group back onto an element of the same group (this does not mean that the element maps to itself!).

A general convention is to write products like g1 ⋅G g2 or φ(g1) ⋅H φ(g2) as g1g2 and φ(g1)φ(g2); the multiplicative group operation is implicit. Meanwhile we will keep writing the additive operations with an explicit + sign.

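Two classic homomorphisms, sketched as checks (the concrete examples are my own picks): the exponential map turning addition into multiplication, and reduction mod 2 from (ℤ, +) to (𝔽2, +F):

```python
from math import exp, isclose

# exp is a homomorphism from (R, +) to (R_{>0}, ·):
# it turns the group operation "addition" into "multiplication"
a, b = 1.5, 2.25
assert isclose(exp(a + b), exp(a) * exp(b))

# Reduction mod 2 is a homomorphism from (Z, +) to (F2, +):
# phi(g1 + g2) equals phi(g1) + phi(g2), computed in F2
phi = lambda n: n % 2
for g1, g2 in [(3, 5), (4, 7), (10, 12)]:
    assert phi(g1 + g2) == (phi(g1) + phi(g2)) % 2
```

Note that neither map is an isomorphism onto its stated codomain for free: exp is a bijection onto the positive reals specifically, while phi is surjective but far from injective.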

Appendix: How 𝔽2 Works

The finite field 𝔽2 = {0, 1} is the smallest possible field. It has just two elements, and all arithmetic is done modulo 2 (the remainder after dividing by 2).

Addition table (modulo 2):

  0 + 0 = 0    0 + 1 = 1
  1 + 0 = 1    1 + 1 = 0

Multiplication table (modulo 2):

  0 ⋅ 0 = 0    0 ⋅ 1 = 0
  1 ⋅ 0 = 0    1 ⋅ 1 = 1

You can verify all field axioms hold: 0 is the additive identity, 1 is the multiplicative identity, every element has an additive inverse (0 is its own inverse, 1 is its own inverse since 1 + 1 = 0), and the only nonzero element (1) has a multiplicative inverse (1⁻¹ = 1).

This field has characteristic 2 (section 2.1) because 1 + 1 = 0. It appears in error-correcting codes, cryptography, and computer science (binary arithmetic is essentially 𝔽2).
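The claims in this appendix can also be verified mechanically - a small Python sketch of 𝔽2 arithmetic:

```python
from itertools import product

F2 = [0, 1]
add = lambda a, b: (a + b) % 2
mul = lambda a, b: (a * b) % 2

# The arithmetic from the tables above:
assert add(1, 1) == 0 and add(0, 1) == 1
assert mul(1, 1) == 1 and mul(1, 0) == 0

# Additive inverses: every element is its own inverse
assert all(add(f, f) == 0 for f in F2)
# The only nonzero element, 1, is its own multiplicative inverse
assert mul(1, 1) == 1
# Distributivity holds for every triple of elements
assert all(mul(a, add(b, c)) == add(mul(a, b), mul(a, c))
           for a, b, c in product(F2, repeat=3))
```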