How to Understand Calculus: A Beginner's Guide to Limits and Differentiation
What Is Calculus?
Calculus is a study of rates of change of functions and accumulation of infinitesimally small quantities. It can be broadly divided into two branches:
- Differential Calculus. This concerns rates of change of quantities and slopes of curves or surfaces in 2D or multidimensional space.
- Integral Calculus. This involves summing infinitesimally small quantities.
What's Covered in this Tutorial
In this first part of a two-part tutorial you will learn about:
- Limits of a function
- How the derivative of a function is derived
- Rules of differentiation
- Derivatives of common functions
- What the derivative of a function means
- Working out derivatives from first principles
- 2nd and higher order derivatives
- Applications of differential calculus
- Worked examples
Who Invented Calculus?
Calculus was invented by the English mathematician, physicist and astronomer Isaac Newton and German mathematician Gottfried Wilhelm Leibniz independently of each other in the 17th century.
What Is Calculus Used For?
Calculus is used widely in mathematics, science, economics and the various fields of engineering.
Introduction to Limits of Functions
To understand calculus, we first need to grasp the concept of limits of a function.
Imagine we have a continuous function whose graph is a straight line, with the equation f(x) = x + 1, as in the graph below.
The value of f(x) is simply the value of the x coordinate plus 1.
The function is continuous, and f(x) has a value for every value of x: not just the integers ..., -2, -1, 0, 1, 2, 3, ... and so on, but all the intervening real numbers too, i.e. decimal numbers like 7.23452 and irrational numbers like π and √3.
So if x = 0, f(x) = 1
if x = 2, f(x) = 3
if x = 2.3, f(x) = 3.3
if x = 3.1, f(x) = 4.1 and so on.
Let's concentrate on the value x = 3, where f(x) = 4.
As x gets closer and closer to 3, f(x) gets closer and closer to 4.
So we could make x = 2.999999 and f(x) would be 3.999999.
We can make f(x) as close to 4 as we want. In fact we can choose any arbitrarily small difference between f(x) and 4 and there will be a correspondingly small difference between x and 3. But there will always be a smaller distance between x and 3 that produces a value of f(x) closer to 4.
So What's the Limit of a Function Then?
Referring to the graph again, the limit of f(x) at x = 3 is the value f(x) approaches as x gets closer to 3. Not the value of f(x) at x=3, but the value it approaches. As we'll see later, the value of a function f(x) may not exist at a certain value of x, or it may be undefined.
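As a rough numerical illustration (not a proof), the short Python sketch below evaluates f(x) = x + 1 at points that close in on x = 3 from both sides. The function name f and the particular gaps are illustrative choices only.

```python
def f(x):
    """The line used in the text: f(x) = x + 1."""
    return x + 1

# Approach x = 3 from the left and from the right with shrinking gaps.
for gap in (0.1, 0.01, 0.001, 0.0001):
    left = f(3 - gap)
    right = f(3 + gap)
    print(f"gap={gap}: f(3 - gap)={left:.4f}, f(3 + gap)={right:.4f}")
```

Both columns close in on 4 as the gap shrinks, even though the code never evaluates f at exactly x = 3.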
Formal Definition of a Limit
The (ε, δ) Cauchy definition of a limit:
The formal definition of a limit was specified by the mathematicians Augustin-Louis Cauchy and Karl Weierstrass.
Let f(x) be a function defined on a subset D of the real numbers R.
c is a limit point of the set D. (The value of f(x) at x = c may not necessarily exist.)
L is a real number.
lim x → c f(x) = L
- if for every arbitrarily small distance ε > 0 there exists a value δ > 0 such that, for all x belonging to D with 0 < | x - c | < δ, we have | f(x) - L | < ε
- This condition covers values of x on both sides of c, so the limits approaching c from the left and from the right must be equal.
In plain English, this says that the limit of f(x) as x approaches c is L if, for every ε greater than 0, there exists a value δ such that all values of x strictly between c - δ and c + δ (excluding c itself) produce a value of f(x) within L ± ε.
....in other words we can make f(x) as close to L as we want by making x sufficiently close to c.
This definition is known as a deleted limit because the limit omits the point x = c.
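The definition can be explored numerically. The sketch below is only a sampling check under stated assumptions, not a proof: it uses the earlier example f(x) = x + 1 with c = 3 and L = 4, and a hypothetical helper stays_within that tests a finite number of points with 0 < |x - c| < δ.

```python
def f(x):
    return x + 1

def stays_within(f, c, L, epsilon, delta, samples=1000):
    """Sample points with 0 < |x - c| < delta and test |f(x) - L| < epsilon."""
    for k in range(1, samples + 1):
        offset = delta * k / (samples + 1)  # strictly between 0 and delta
        for x in (c - offset, c + offset):  # both sides of c, never c itself
            if abs(f(x) - L) >= epsilon:
                return False
    return True

for epsilon in (0.5, 0.01, 0.0001):
    delta = epsilon  # for f(x) = x + 1, choosing delta = epsilon is enough
    print(f"epsilon={epsilon}, delta={delta}: {stays_within(f, 3, 4, epsilon, delta)}")
```

For this particular line, δ = ε works because |f(x) - 4| = |x - 3|. Other functions generally need a different choice of δ for each ε.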
Intuitive Concept of a Limit
We can make f(x) as close as possible to L by making x sufficiently close to c, but not equal to c.
Continuous and Discontinuous Functions
A function is continuous at a point x = c on the real line if it is defined at c and the limit equals the value of f(x) at x = c, i.e.
lim x → c f(x) = L = f(c)
A continuous function f(x) is a function that is continuous at every point over a specified interval.
Examples of continuous functions:
- Temperature in a room versus time.
- The speed of a car as it changes over time.
A function that is not continuous is said to be discontinuous. Examples of discontinuous functions are:
- Your bank balance. It changes instantly as you lodge or withdraw money.
- A digital signal. It's either 1 or 0 and never in between these values.
Limits of Common Functions
- 1/x tends to 0 as x tends to infinity
- a/(a + x) tends to 1 as x tends to 0 (for a ≠ 0)
- sin x/x tends to 1 as x tends to 0
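These three limits can be checked numerically. The sketch below is illustrative only: the constant A = 2.0 is a hypothetical value for a, and the function names are not from the original text. The standard results are 0, 1 and 1 respectively.

```python
import math

A = 2.0  # hypothetical non-zero constant standing in for "a"

def reciprocal(x):
    return 1 / x

def ratio(x):
    return A / (A + x)

def sin_over_x(x):
    return math.sin(x) / x  # x is in radians and must not be 0

# 1/x as x grows large
for x in (10.0, 1000.0, 100000.0):
    print(f"x={x}: 1/x={reciprocal(x)}")

# a/(a + x) and sin(x)/x as x shrinks towards 0
for x in (0.1, 0.001, 0.00001):
    print(f"x={x}: a/(a+x)={ratio(x):.6f}, sin(x)/x={sin_over_x(x):.10f}")
```

Note that sin x/x is undefined at x = 0 itself, which is exactly the situation where a limit is useful.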
Calculating the Velocity of a Vehicle
Imagine we record the distance a car travels over a period of one hour. Next we plot all the points and join the dots, drawing a graph of the results (as shown below). On the horizontal axis, we have the time in minutes and on the vertical axis we have the distance in miles. Time is the independent variable and distance is the dependent variable. In other words, the distance travelled by the car depends on the time which has passed.
If the car travels at a constant velocity, the graph will be a straight line, and we can easily work out its velocity by calculating the slope or gradient of the graph. To do this in the simple case where the line passes through the origin, we pick a point on the line and divide its ordinate (the vertical distance from the point to the horizontal axis) by its abscissa (the horizontal distance from the point to the vertical axis).
So if it travels 25 miles in 30 minutes,
Velocity = 25 miles/30 minutes = 25 miles / 0.5 hour = 50 mph
Similarly if we take the point at which it has travelled 50 miles, the time is 60 minutes, so:
Velocity is 50 miles/60 minutes = 50 miles / 1 hour = 50 mph
Note: In physics we normally speak of the "velocity" of a body. Technically, the definition of velocity is speed in a given direction, so it is a vector quantity. Speed is the magnitude of the velocity vector.
Average Velocity and Instantaneous Velocity
Ok, so this is all fine if the vehicle is travelling at a steady velocity. We just divide distance by time taken to get velocity. But this is the average velocity over the 50 mile journey. Imagine if the vehicle was speeding up and slowing down as in the graph below. Dividing distance by time still gives the average velocity over the journey, but not the instantaneous velocity which changes continuously. In the new graph, the vehicle accelerates mid way through the journey and travels a much greater distance in a short period of time before slowing down again. Over this period, its velocity is much higher.
In the graph below, if we denote the small distance travelled by Δs and the time taken as Δt, again we can calculate velocity over this distance by working out the slope of this section of the graph.
So average velocity over interval Δs = slope of graph = Δs/Δt
However the problem is that this still only gives us an average. It's more accurate than working out velocity over the full hour, but it's still not the instantaneous velocity. The car travels faster at the start of the interval Δs (we know this because distance changes more rapidly and the graph is steeper). Then the velocity starts to decrease midway and reduces all the way to the end of the interval Δs.
What we're aiming to do is find a way of determining the instantaneous velocity.
We can do this by making Δs and Δt smaller and smaller so we can work out the instantaneous velocity at any point on the graph.
See where this is heading? We're going to use the concept of limits we learned about before.
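To make the idea concrete, here is a minimal sketch that assumes a made-up distance function s(t) = 50t^2 (t in hours, s in miles). This function is a hypothetical stand-in and is not the curve in the graph. The code computes the average velocity Δs/Δt starting at t = 0.5 hours over shorter and shorter intervals.

```python
def s(t):
    """Hypothetical distance travelled (miles) after t hours."""
    return 50 * t ** 2

def average_velocity(t, dt):
    """Slope of the distance graph between t and t + dt, i.e. Δs/Δt."""
    return (s(t + dt) - s(t)) / dt

for dt in (0.5, 0.1, 0.01, 0.001):
    print(f"Δt={dt} h: average velocity = {average_velocity(0.5, dt):.3f} mph")
```

For this assumed function the averages settle towards 50 mph as Δt shrinks, which is the instantaneous velocity at t = 0.5 hours.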
What is Differential Calculus?
Differential calculus is one of the two branches of calculus, the other being integral calculus. It is the study of the rate at which quantities change.
In the example above we saw how we could attempt to determine a more accurate measurement of velocity by working out the slope of a graph over a shorter interval. We can do this using limits.
Slope of a graph
In the graph below we have a generalised function y = f (x).
x is a point on the horizontal axis and Δx is a small change in x.
The value of the function at x is f (x)
As x changes to x + Δx , f (x) changes by Δy to f (x + Δx)
You may remember from coordinate geometry if we know two points on a graph: (x1, y1) and (x2, y2), the slope of a line joining the two points is:
(y2 - y1) / ( x2 - x1)
So the slope of this line is ( f (x + Δx) - f (x) ) / (x + Δx - x) = Δy / Δx
The slope Δy / Δx is approximately the slope of a tangent to the graph for small Δx.
What Happens When Δx Becomes Smaller and Smaller?
The red line that intersects the graph at two points in the diagram above is called a secant.
If we now make Δx and Δy smaller and smaller, the red line eventually becomes a tangent to the curve. The slope of the tangent is the instantaneous rate of change of f (x) at the point x.
Derivative of a function
If we take the limit of the value of the slope as Δx tends to zero, the result is called the derivative of y = f (x).
lim Δx → 0 (Δy / Δx) = lim Δx → 0 ( f (x + Δx) - f (x) ) / (x + Δx - x) = dy/dx
The derivative of y = f (x) with respect to (wrt) x is written as dy/dx or f '(x) or just f ' and is also a function of x. I.e. it varies as x changes.
If the independent variable is time, the derivative is sometimes denoted by the variable with a dot superimposed on top.
E.g. suppose a variable x represents position and x is a function of time, i.e. x(t).
The derivative of x wrt t is dx/dt = ẋ (ẋ or dx/dt is speed, the rate of change of position)
We can also denote the derivative of f (x) wrt x as d/dx(f (x))
What Is the Derivative of a Function?
The derivative of a function f(x) is the rate of change of that function with respect to the independent variable x.
If y = f(x), dy/dx is the rate of change of y as x changes.
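The limit in the definition can be approximated by simply choosing a small Δx. The sketch below assumes nothing beyond the definition itself; difference_quotient is an illustrative name. It is applied to sin(x), whose derivative is cos(x), a standard result that is used later in this tutorial.

```python
import math

def difference_quotient(f, x, dx):
    """Slope of the secant through (x, f(x)) and (x + dx, f(x + dx))."""
    return (f(x + dx) - f(x)) / dx

x = 1.0
for dx in (0.1, 0.01, 0.001, 0.0001):
    slope = difference_quotient(math.sin, x, dx)
    print(f"Δx={dx}: secant slope={slope:.6f}, cos(x)={math.cos(x):.6f}")
```

The secant slope approaches the tangent slope as Δx shrinks. In floating point arithmetic Δx can't be made arbitrarily small, because rounding error eventually dominates.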
Differentiating Functions from First Principles
To find the derivative of a function, we differentiate it wrt the independent variable. There are several identities and rules to make this easier, but first let's try to work out an example from first principles.
Example: Evaluate the derivative of x^2
So f (x) = x^2
f (x + Δx) = (x + Δx)^2
d/dx( f (x)) = lim Δx → 0 ( f (x + Δx) - f (x) ) / ((x + Δx) - x)
= lim Δx → 0 ( (x + Δx)^2 - x^2 ) / Δx
Expand out (x + Δx)^2
= lim Δx → 0 ( x^2 + 2xΔx + (Δx)^2 - x^2 ) / Δx
The two x^2 terms cancel out so:
= lim Δx → 0 ( 2xΔx + (Δx)^2 ) / Δx
Dividing by Δx gives:
= lim Δx → 0 (2x + Δx)
As Δx tends to 0, this tends to 2x, so d/dx(x^2) = 2x
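The algebra above can be sanity-checked numerically. This sketch, with illustrative names and an arbitrarily chosen point x = 3, compares the raw quotient ((x + Δx)^2 - x^2)/Δx with the simplified form 2x + Δx.

```python
def f(x):
    return x ** 2

def difference_quotient(x, dx):
    """The quotient (f(x + Δx) - f(x)) / Δx before any simplification."""
    return (f(x + dx) - f(x)) / dx

x = 3.0
for dx in (1.0, 0.1, 0.01, 0.001):
    raw = difference_quotient(x, dx)
    simplified = 2 * x + dx
    print(f"Δx={dx}: quotient={raw:.6f}, 2x + Δx={simplified:.6f}")
```

The two columns agree (up to rounding), and both approach 2x = 6 as Δx tends to 0.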
Using Rules to Work out Derivatives
Rather than working out the derivatives of functions from first principles, we normally use a set of rules to make things easier.
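In the list below, f and g are two functions of x.
f ' is the derivative of f
g ' is the derivative of g
Rules of Differentiation
- Constant factor rule: the derivative of af is af ' (a is a constant)
- Power rule for polynomials: the derivative of x^n is nx^(n - 1)
- Sum rule: the derivative of f + g is f ' + g '
- Difference rule: the derivative of f - g is f ' - g '
- Product rule: the derivative of fg is fg ' + gf '
- Reciprocal of a function: the derivative of 1/f is - f ' / f^2
- Quotient rule: the derivative of f / g is (f 'g - g ' f ) / g^2
- Chain rule (function of a function rule): the derivative of f(g), where g is a function of x, is f ' (g) g ' (x)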
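One way to build confidence in the product, quotient and chain rules is to compare them with a numerical estimate of the derivative. The sketch below is illustrative: the pair f(x) = sin(x), g(x) = x^2 + 1 and the test point x = 0.7 are arbitrary choices, and numerical_derivative is a hypothetical helper based on a central difference.

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

f, f_prime = math.sin, math.cos   # f(x) = sin(x), f '(x) = cos(x)

def g(x):
    return x ** 2 + 1

def g_prime(x):
    return 2 * x

x = 0.7
product_rule = f(x) * g_prime(x) + g(x) * f_prime(x)
quotient_rule = (f_prime(x) * g(x) - g_prime(x) * f(x)) / g(x) ** 2
chain_rule = f_prime(g(x)) * g_prime(x)

product_numeric = numerical_derivative(lambda t: f(t) * g(t), x)
quotient_numeric = numerical_derivative(lambda t: f(t) / g(t), x)
chain_numeric = numerical_derivative(lambda t: f(g(t)), x)

print(f"product:  rule={product_rule:.6f}, numeric={product_numeric:.6f}")
print(f"quotient: rule={quotient_rule:.6f}, numeric={quotient_numeric:.6f}")
print(f"chain:    rule={chain_rule:.6f}, numeric={chain_numeric:.6f}")
```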
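Derivatives of Common Functions
- Line with y axis intercept: the derivative of mx + c is m
- The derivative of x^2 is 2x
- The derivative of x^3 is 3x^2
- The derivative of x^(1/2) is (1/2) x^(-1/2)
- The derivative of cos (x) is - sin (x)
- The derivative of tan (x) is 1 + (tan (x))^2 = (sec (x))^2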
Examples of Working Out Derivatives
What is the derivative of 20?
Derivative of a constant is 0, so d/dx(20) = 0
What is the derivative of 6x^3?
Using the multiplication by a constant rule, d/dx(6x^3) = 6 ( d/dx(x^3) )
From the power rule, d/dx(x^3) = 3x^2
So d/dx(6x^3) = 6 ( d/dx(x^3) ) = 6 (3x^2) = 18x^2
Evaluate the derivative of 5sin (x) + 6x^5
We use the sum rule: find the derivatives of 5sin (x) and 6x^5 separately and then add the results together.
So d/dx (5sin (x)) = 5d/dx(sin (x)) = 5cos (x)
d/dx(6x^5) = 6d/dx(x^5) = 6(5x^4) = 30x^4
Adding the results together
d/dx (5sin (x) + 6x^5) = 5cos (x) + 30x^4
What is the derivative of x^3 sin (x)?
Use the product rule, so:
d/dx(x^3 sin (x)) = x^3 d/dx(sin(x)) + sin(x) d/dx(x^3)
d/dx (sin(x)) = cos(x)
d/dx(x^3) = 3x^2 (from the power rule)
so d/dx(x^3 sin (x)) = x^3 d/dx(sin(x)) + sin(x) d/dx(x^3) = x^3 cos(x) + 3x^2 sin(x)
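The three answers above can be spot-checked against a numerical estimate. This is a rough sketch only: the test point x = 1.3 is arbitrary, and numerical_derivative is a hypothetical central-difference helper rather than anything from the original text.

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

# (label, function, derivative worked out by hand in the examples)
examples = [
    ("6x^3", lambda x: 6 * x ** 3, lambda x: 18 * x ** 2),
    ("5sin(x) + 6x^5",
     lambda x: 5 * math.sin(x) + 6 * x ** 5,
     lambda x: 5 * math.cos(x) + 30 * x ** 4),
    ("x^3 sin(x)",
     lambda x: x ** 3 * math.sin(x),
     lambda x: x ** 3 * math.cos(x) + 3 * x ** 2 * math.sin(x)),
]

x = 1.3
for label, func, derivative in examples:
    numeric = numerical_derivative(func, x)
    print(f"{label}: by hand={derivative(x):.6f}, numeric={numeric:.6f}")
```

Agreement at one point doesn't prove a formula, but a mismatch is a quick way to catch a slip in the algebra.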
Evaluate the derivative of tan (x)
tan (x) = sin (x) / cos (x)
We can use the quotient rule to work this out:
d/dx (f(x)/g(x)) = (f '(x)g(x) - g '(x)f (x)) / g(x)^2
so d/dx(sin (x) / cos (x)) = (d/dx(sin (x))cos (x) - d/dx(cos (x))sin (x)) / cos^2(x)
d/dx(sin (x)) = cos (x)
d/dx(cos (x)) = - sin (x)
(d/dx(sin (x))cos (x) - d/dx(cos (x))sin (x)) / cos^2(x)
= (cos (x)cos (x) - (-sin (x))sin (x)) / cos^2(x)
= (cos^2(x) + sin^2(x)) / cos^2(x)
= 1 + tan^2(x) = sec^2(x)
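As a quick check of this result, the sketch below compares a numerical estimate of the derivative of tan(x) with both forms of the answer. The test point x = 0.8 and the helper name are illustrative assumptions.

```python
import math

def numerical_derivative(func, x, h=1e-6):
    """Central-difference estimate of the derivative of func at x."""
    return (func(x + h) - func(x - h)) / (2 * h)

x = 0.8  # radians, away from the points where cos(x) = 0
numeric = numerical_derivative(math.tan, x)
one_plus_tan_squared = 1 + math.tan(x) ** 2
sec_squared = 1 / math.cos(x) ** 2

print(f"numeric        = {numeric:.6f}")
print(f"1 + tan^2(x)   = {one_plus_tan_squared:.6f}")
print(f"sec^2(x)       = {sec_squared:.6f}")
```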
What is the derivative of ln(5x^3)?
We use the chain rule to work this out.
For two functions f(g) and g(x)
df/dx = (df/dg)(dg/dx)
Let g(x) = 5x^3
and f(g) = ln(g)
df/dg = 1/g
dg/dx = 5(3x^2) = 15x^2
df/dx = (df/dg)(dg/dx)
Substituting for g:
= (1/(5x^3))(15x^2) = 3/x
We could also have evaluated the derivative by first using the rules of logarithms to simplify the expression.
So ln(5x^3) = ln(5) + ln(x^3) = ln(5) + 3ln(x) ............(product rule and power rule of logarithms)
d/dx (ln(5) + 3ln(x)) = d/dx (ln(5)) + d/dx(3ln(x)) = 0 + 3d/dx(ln(x))
= 0 + 3(1/x) = 3/x
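The chain rule steps can be mirrored directly in code. In this sketch the function names g, dg_dx and df_dg are illustrative labels for the pieces used above, and the result is compared with 3/x at a few positive values of x.

```python
def g(x):
    """Inner function g(x) = 5x^3."""
    return 5 * x ** 3

def dg_dx(x):
    """Derivative of the inner function: 15x^2."""
    return 15 * x ** 2

def df_dg(g_value):
    """Derivative of the outer function ln(g) with respect to g: 1/g."""
    return 1 / g_value

def chain_rule_derivative(x):
    """df/dx = (df/dg)(dg/dx), evaluated at x."""
    return df_dg(g(x)) * dg_dx(x)

for x in (0.5, 1.0, 2.0, 10.0):
    print(f"x={x}: chain rule={chain_rule_derivative(x):.6f}, 3/x={3 / x:.6f}")
```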
Positive and Negative Values of the Derivative
The animation below shows the function sin(Ө) and its derivative cos(Ө). At Ө = 0, the value of the derivative is cos(Ө) = cos(0) = 1. As Ө increases, the value of cos(Ө) decreases, i.e. the slope of the tangent to sin(Ө) becomes smaller. Eventually at Ө = π/2, the slope is zero. This is an important point because as we'll see later, we can use this fact to find the maxima and minima of functions.
As Ө exceeds π/2, the value of the derivative becomes negative.
Remember the definition of the derivative?
Δx is positive, but the change Δy is negative since f (x +Δx) - f (x) is negative because f (x +Δx) < f (x)......(the function is decreasing in value)
In the limit as Δx and Δy tend to zero, the derivative is also less than zero.
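The sign of the derivative can be tabulated for a few angles. This is a small illustrative sketch: the describe helper and its tolerance are assumptions, needed because cos(π/2) is not exactly zero in floating point arithmetic.

```python
import math

def slope_of_sine(theta):
    """Derivative of sin(theta), i.e. cos(theta)."""
    return math.cos(theta)

def describe(theta, tolerance=1e-12):
    """Classify sin(theta) as increasing, stationary or decreasing."""
    slope = slope_of_sine(theta)
    if slope > tolerance:
        return "increasing"
    if slope < -tolerance:
        return "decreasing"
    return "stationary"

angles = (("0", 0.0), ("π/4", math.pi / 4), ("π/2", math.pi / 2),
          ("3π/4", 3 * math.pi / 4))
for label, theta in angles:
    print(f"Ө={label}: slope={slope_of_sine(theta):+.4f} ({describe(theta)})")
```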
Sin(Ө) and Its Derivative Cos(Ө)
2nd and Higher Order Derivatives
What happens if we take the derivative of a derivative?
Consider the function y = f (x)
The derivative of y is f '(x) or dy/dx
The derivative of dy/dx is known as the second derivative or second order derivative and is denoted by d^2y/dx^2 or f '' (x) ......(f with a double dash). We can have third and higher order derivatives, so for instance the third order derivative of y is d^3y/dx^3
(1) If y = sin (x), what is d^2y/dx^2 ?
dy/dx = cos (x)
d^2y/dx^2 = d/dx (cos (x)) = -sin (x)
(2) What is the second derivative of ln (x)?
y = ln (x)
So dy/dx = 1/x = x^(-1)
d^2y/dx^2 = d/dx (x^(-1)) = - x^(-2) = -1/x^2
Since dy/dx is the rate of change of a function, roughly speaking we can think of d2y/dx2 as being the rate at which dy/dx itself is changing.
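Both worked examples can be checked with a numerical estimate of the second derivative. The sketch below uses a standard central second-difference formula, which is an addition for illustration and not something derived in this tutorial. The step size h and the test point x = 2 are arbitrary choices.

```python
import math

def second_derivative(func, x, h=1e-4):
    """Central second-difference estimate of the second derivative at x."""
    return (func(x + h) - 2 * func(x) + func(x - h)) / h ** 2

x = 2.0
sin_numeric = second_derivative(math.sin, x)
log_numeric = second_derivative(math.log, x)  # math.log is the natural log, ln

print(f"sin(x): numeric={sin_numeric:.6f}, -sin(x)={-math.sin(x):.6f}")
print(f"ln(x):  numeric={log_numeric:.6f}, -1/x^2={-1 / x ** 2:.6f}")
```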
Returning to the car example:
s(t) is a function describing how distance travelled changes with time.
ds/dt is the rate of change of position, called speed or velocity.
d^2s/dt^2 is the rate of change of velocity, which is called acceleration.
If v is the velocity of the vehicle:
v = ds/dt
d^2s/dt^2 = d/dt(ds/dt) = d/dt(v) = dv/dt
So the second derivative of distance which is acceleration is equal to the first derivative of velocity.
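The same relationship shows up when working with sampled data. The sketch below assumes hypothetical distance readings taken one time unit apart, generated from s(t) = 2t^2 purely for illustration. Differencing once gives average velocities and differencing again gives average accelerations.

```python
dt = 1.0                                   # time between samples
times = [0, 1, 2, 3, 4, 5]
distances = [2 * t ** 2 for t in times]    # hypothetical readings: 0, 2, 8, 18, 32, 50

# First differences: average velocity over each interval (Δs/Δt).
velocities = [(distances[i + 1] - distances[i]) / dt
              for i in range(len(distances) - 1)]

# Second differences: average acceleration between intervals (Δv/Δt).
accelerations = [(velocities[i + 1] - velocities[i]) / dt
                 for i in range(len(velocities) - 1)]

print("velocities:   ", velocities)
print("accelerations:", accelerations)
```

For this made-up data the acceleration comes out constant, as expected for a distance that grows with the square of time.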
We can go up to the third derivative of s, d^3s/dt^3, which is the rate of change of acceleration (often called jerk).
Using the Derivative to Find the Maxima, Minima and Turning Points of Functions
We can use the derivative to find the maxima and minima of a function (the points at which the function has maximum and minimum values). These points are called turning points because the derivative changes sign from positive to negative or vice versa. For a function f (x), we do this by:
- differentiating f (x) wrt x
- equating f ' (x) to 0
- and finding the roots of the equation, i.e. the values of x that make f '(x) = 0
Example 1: Find the maxima or minima of the quadratic function f (x) = 3x^2 + 2x + 7 (the graph of a quadratic function is called a parabola; see image below).
- f (x) = 3x^2 + 2x + 7
- f '(x) = 3(2x^1) + 2(1x^0) + 0 = 6x + 2
- Set f '(x) = 0
6x + 2 = 0
- Solve 6x + 2 = 0
6x = -2
giving x = - 1/3
and f(x) = 3x^2 + 2x + 7 = 3(-1/3)^2 + 2(-1/3) + 7 = 6 2/3
A quadratic function has a maximum when the coefficient of x^2 is less than 0 and a minimum when the coefficient is greater than 0. In this case, since the coefficient of x^2 is 3, we have worked out the minimum and it occurs at the point (- 1/3, 6 2/3).
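The turning point can be verified with a few lines of Python. The names below are illustrative, and the check that nearby points give larger values is only a spot check that this is a minimum, not a general test.

```python
def f(x):
    return 3 * x ** 2 + 2 * x + 7

def f_prime(x):
    return 6 * x + 2

x_turn = -2 / 6          # root of 6x + 2 = 0
y_turn = f(x_turn)

print(f"turning point at x={x_turn:.4f}, f(x)={y_turn:.4f}")
print(f"f '(x) there = {f_prime(x_turn):.2e}")
print(f"neighbours: f(x - 0.1)={f(x_turn - 0.1):.4f}, f(x + 0.1)={f(x_turn + 0.1):.4f}")
```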
Example 2: In the diagram above, a looped piece of string of length p is stretched into the shape of a rectangle. The sides of the rectangle are of length a and b. Depending on how the string is arranged, a and b can be varied and different areas of rectangle can be enclosed by the string. What is the maximum area that can be enclosed and what will be the relationship between a and b in this scenario?
p is the length of the string
The perimeter p = 2a + 2b (the sum of the 4 side lengths)
Call the area y
and y = ab
We need to find an equation for y in terms of one of the sides a or b, so we need to eliminate either of these variables.
Let's try to find b in terms of a:
So p = 2a + 2b
2b = p - 2a
b = (p - 2a)/2
y = ab
Substituting for b gives:
y = ab = a(p - 2a)/2 = ap/2 - a^2
Work out the derivative dy/da and set it to 0:
dy/da = p/2 - 2a
Set to 0:
p/2 - 2a = 0
2a = p/2
so a = p/4
Substituting a = p/4 into the perimeter equation gives b = (p - 2a)/2 = (p - p/2)/2 = p/4, so a = b. In other words, maximum area occurs when all sides are equal, i.e. when the enclosed area is a square.
So area y = (p/4)(p/4) = p^2/16
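A brute-force search gives the same answer. This sketch assumes a hypothetical string length p = 20 and simply tries many values of a, so it illustrates the result rather than replacing the calculus.

```python
p = 20.0  # hypothetical string length

def area(a):
    """Area of the rectangle with side a and perimeter p: a(p - 2a)/2."""
    return a * (p - 2 * a) / 2

# Candidate side lengths strictly between 0 and p/2.
candidates = [(p / 2) * k / 1000 for k in range(1, 1000)]
best_a = max(candidates, key=area)

print(f"best a = {best_a}, p/4 = {p / 4}")
print(f"max area = {area(best_a)}, p^2/16 = {p ** 2 / 16}")
```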
Example 3 (Max Power Transfer Theorem or Jacobi's Law):
The image above shows the simplified electrical schematic of a power supply. All power supplies have an internal resistance (RINT) which limits how much current they can supply to a load (RL). Calculate in terms of RINT the value of RL at which maximum power transfer occurs.
The current I through the circuit is given by Ohm's Law:
So I = V/(RINT+ RL)
Power = Current squared x resistance
So power dissipated in the load RL is given by the expression:
P = I^2 RL
Substituting for I:
= (V/(RINT + RL))^2 RL
= V^2 RL / (RINT + RL)^2
Expanding the denominator:
= V^2 RL / (RINT^2 + 2 RINT RL + RL^2)
and dividing above and below by RL gives:
P = V^2 / (RINT^2 / RL + 2 RINT + RL)
Rather than finding when this is a maximum, it's easier to find when the denominator is a minimum and this gives us the point at which maximum power transfer occurs, i.e. P is a maximum.
So the denominator is RINT^2 / RL + 2 RINT + RL
Differentiate it wrt RL giving:
d/dRL (RINT^2 / RL + 2 RINT + RL) = -RINT^2 / RL^2 + 0 + 1
Set it to 0:
-RINT^2 / RL^2 + 0 + 1 = 0
RINT^2 / RL^2 = 1
and solving gives RL = RINT.
So max power transfer occurs when RL = RINT.
This is called the max power transfer theorem.
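The result can be illustrated by scanning over load resistances. The supply voltage V = 12 and internal resistance R_INT = 4 ohms below are hypothetical values chosen for the sketch. The load power is computed from the expression P = V^2 RL / (RINT + RL)^2 derived above.

```python
V = 12.0      # hypothetical supply voltage in volts
R_INT = 4.0   # hypothetical internal resistance in ohms

def load_power(r_load):
    """Power dissipated in the load: V^2 * RL / (RINT + RL)^2."""
    return V ** 2 * r_load / (R_INT + r_load) ** 2

# Try load resistances from 0.1 to 20.0 ohms in steps of 0.1 ohm.
loads = [k / 10 for k in range(1, 201)]
best_load = max(loads, key=load_power)

print(f"best load = {best_load} ohms (R_INT = {R_INT} ohms)")
print(f"power at best load = {load_power(best_load):.3f} W")
for r in (1.0, 2.0, 4.0, 8.0, 16.0):
    print(f"RL={r}: P={load_power(r):.3f} W")
```

Loads on either side of R_INT receive less power, which matches the condition RL = RINT.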
Up Next!
The second part of this two-part tutorial covers integral calculus and applications of integration.
© 2019 Eugene Brennan