DSC2606 SG 1 E 2018

DSC2606/1

Study Guide Nonlinear Mathematical Programming DSC2606

Department of Decision Sciences

IMPORTANT INFORMATION Please register on myUnisa, activate your myLife e-mail address and make sure that you have regular access to the myUnisa module website, as well as your group website.


© 2017 Department of Decision Sciences, University of South Africa. All rights reserved. Printed and distributed by the University of South Africa, Muckleneuk, Pretoria. DSC2606/1

Contents

Chapter 1 Basics
1.1 Historical background
1.2 The scientific approach
1.3 What is a model?
1.4 What does a mathematical model look like?
1.5 Mathematical programming and linear programming
1.6 Properties and assumptions of linear programming
1.7 Other operations research techniques

Chapter 2 Introducing linear programming models
2.1 Types of problem
2.2 Mark’s LP model
2.3 Christine’s LP model
2.4 Components of an LP model
2.5 General LP model

Chapter 3 Graphical representation
3.1 A graphical approach
3.2 Finding the feasible area
3.3 Identifying the optimal solution
3.4 Solving Christine’s problem graphically
3.5 Types of solution
3.5.1 Infeasible LPs
3.5.2 Unbounded solutions
3.5.3 Multiple optimal solutions
3.5.4 Degenerate solutions
3.6 Types of constraint
3.6.1 Redundant constraints
3.6.2 Binding and nonbinding constraints
3.7 Exercises
3.8 Solutions to exercises

Chapter 4 Computer solutions
4.1 Introduction
4.2 Using LINGO
4.3 Solving Mark’s problem with LINGO
4.4 Solving Christine’s problem with LINGO
4.5 Exercises
4.6 Solutions to exercises

Chapter 5 Introductory concepts
5.1 Introduction
5.2 Linear versus nonlinear
5.3 Examples of NLP categories
5.4 Assumptions of NLP
5.5 Solutions to models – basic concepts
5.6 Exercises
5.7 Solutions to exercises

Chapter 6 Formulating NLP models and computer solutions
6.1 Introduction
6.2 A dimensions problem
6.3 An inventory problem
6.4 LINGO
6.5 Formulation and LINGO solution of NLP models
6.6 Exercises
6.7 Solutions to exercises
6.8 Looking ahead at the remaining study units

Chapter 7 Limits and continuity
7.1 Introduction
7.2 The limit of a function
7.3 Infinite limits
7.4 Limits when x tends to infinity
7.5 Continuity
7.6 Exercises
7.7 Solutions to exercises

Chapter 8 The derivative of a function
8.1 Introduction
8.2 The slope of a tangent line
8.3 The rate of change
8.4 The derivative of a function
8.5 Differentiability and continuity
8.6 Exercises
8.7 Solutions to exercises

Chapter 9 The rules of differentiation
9.1 Four basic rules
9.2 The product rule
9.3 The derivative of the exponential function
9.4 The derivative of the logarithmic function
9.5 Higher-order derivatives
9.6 The chain rule
9.7 Exercises
9.8 Solutions to exercises

Chapter 10 Properties of functions and sketching graphs
10.1 Introduction
10.2 Increasing and decreasing functions
10.3 Relative and absolute extrema
10.4 Concavity
10.5 The second derivative test
10.6 Asymptotes
10.7 Sketching graphs
10.8 Exercises
10.9 Solutions to exercises

Chapter 11 Zeros of functions or roots of equations
11.1 Introduction
11.2 Locating roots of equations
11.3 Bisection method
11.3.1 Computer algorithms
11.4 Newton’s method
11.4.1 Computer algorithms
11.5 Exercises
11.6 Solutions to exercises

Chapter 12 Marginal analysis
12.1 Introduction
12.2 Cost functions
12.3 Average cost functions
12.4 Revenue functions
12.5 Profit functions
12.6 Exercises
12.7 Solution to exercises

Chapter 13 Optimisation of NLPs in one variable
13.1 Introduction
13.2 Vital theorems
13.3 Solving NLPs in one variable by differential calculus
13.4 Returning to LINGO
13.5 Exercises
13.6 Solutions to exercises

Chapter 14 Golden section search
14.1 Introduction
14.2 The golden section search
14.3 Exercises
14.4 Solutions to exercises

Chapter 15 Integration
15.1 Introduction
15.2 Antiderivatives
15.3 The indefinite integral
15.4 The basic rules of integration
15.5 Integration by substitution
15.6 The area under the graph of a function
15.7 The fundamental theorem of calculus
15.8 The method of substitution for definite integrals
15.9 Consumers’ and producers’ surplus
15.9.1 Consumers’ surplus
15.9.2 Producers’ surplus
15.10 Numerical integration
15.10.1 Trapezoidal rule
15.10.2 Computer algorithm for trapezoidal rule
15.11 Exercises
15.12 Solutions to exercises

Chapter 16 Partial differentiation
16.1 Introduction
16.2 Partial derivatives
16.3 Second-order partial derivatives
16.4 The Cobb-Douglas production function
16.5 Exercises
16.6 Solutions to exercises

Chapter 17 Optimisation of NLPs in several variables
17.1 Introduction
17.2 Convex and concave functions
17.3 Stationary points and the nature of stationary points
17.4 Solving NLPs in several variables by differential calculus
17.5 Exercises
17.6 Solutions to exercises

Chapter 18 Method of steepest ascent
18.1 Introduction
18.2 The method of steepest ascent
18.3 The method of steepest descent
18.4 Exercises
18.5 Solutions to exercises

Chapter 19 Lagrange multipliers
19.1 The method of Lagrange multipliers
19.2 Verification
19.3 Exercises
19.4 Solutions to exercises

Chapter 20 Kuhn-Tucker conditions
20.1 Kuhn-Tucker conditions
20.2 Exercises
20.3 Solutions to exercises

Part 1 Linear Programming

This part introduces the concept of mathematical programming. The notes are a summary of the part on linear programming in the study guide of the module DSC2605. This background is necessary for placing nonlinear programming in context and for understanding the use of LINGO to solve mathematical programming problems. You must familiarise yourself with this background to mathematical programming, but you will not be examined explicitly on the study units in this part.


Chapter 1 Basics

Contents
1.1 Historical background
1.2 The scientific approach
1.3 What is a model?
1.4 What does a mathematical model look like?
1.5 Mathematical programming and linear programming
1.6 Properties and assumptions of linear programming
1.7 Other operations research techniques

Sections from prescribed book, Winston: Chapter 1; Chapter 3, Section 3.1

Learning objectives

After completing this study unit you should be able to
• explain what “operations research” is
• describe how the scientific approach is applied to Operations Research/Quantitative Management
• explain what a mathematical model is
• explain what “mathematical programming” is
• explain what “linear programming” is
• give the properties and assumptions of linear programming
• give examples of operations research techniques and their areas of application.

1.1 Historical background

In the period between the two great wars, military technology developed by leaps and bounds without military leaders having the chance to apply all the new weapons together or to experiment with new weapon systems. When the German air attacks on England started during World War II, some of the British military managers and leaders recognised this fact and realised that they were confronted with problems for which there were no parallels in history.

In order to maximise their war effort, the British government organised teams of scientific and engineering personnel to assist field commanders in solving perplexing strategic and tactical problems. They found that technically trained men could solve problems outside their normal professional competency. They asked biologists to examine problems in electronics, physicists to think in terms of the movement of people rather than the movement of molecules, mathematicians to apply probability theory to improve soldiers’ chances of survival, and chemists to study equilibria in systems other than in chemicals. Teams made up of specialists from these different disciplines studied problems ranging from the evaluation of cost and effectiveness of complete military systems (such as the defence system of a country) to the best placement of depth charges in anti-submarine warfare.

The success of the British operational research teams led the United States to institute a similar effort in 1942 (small-scale projects dated back to 1937). The initial project involved the deployment of merchant marine convoys to minimise losses from enemy submarines. These mathematical and scientific approaches to military operations were then called operations research. The success of operations research techniques during the Second World War led to the techniques being extended and applied to other fields after the war. Today the term “operations research” means a scientific approach to decision making, which seeks to determine how best to design and operate a system, usually under conditions requiring the allocation of scarce resources. Operations research is now applied so extensively in many areas that a separate discipline has been established. The diversity of this discipline has led to its being known not only as operations research, but also by terms such as management science, decision science, decision analysis and quantitative management.

1.2 The scientific approach

Descartes, a French philosopher and mathematician of the 17th century, emphasised the use of reason as the chief tool of enquiry. The scientific approach, also known as the scientific method, is a formalised reasoning process. It consists of the following steps:
(a) The problem for analysis is defined, and the conditions for observation are determined.
(b) Observations are made under different conditions to determine the behaviour of the system containing the problem.
(c) Based on the observations, a hypothesis that describes how the factors involved are thought to interact, or what the best solution to the problem is, is conceived.
(d) An experiment is designed to test the hypothesis.
(e) The experiment is carried out, and measurements are obtained and recorded.
(f) The results of the experiment are analysed, and the hypothesis is either accepted or rejected.

The six steps of the scientific method can be applied to decision making in general and to Operations Research/Quantitative Management in particular, and can be adapted for this purpose as follows:
• identify the problem
• collect the relevant data
• construct a mathematical model to represent the problem
• select a solution method
• derive a solution to the model
• test the model and evaluate the solution
• implement and maintain the solution.

There can be interaction between these steps.

1.3 What is a model?

A model is a representation of a real entity, and may be constructed in order to gain some understanding of, or insight into, that entity. A model should be realistic enough to incorporate the important characteristics of the real-life system it represents, but not so complex as to hide those characteristics. Three types of model are generally distinguished:
• Iconic models are physical representations of real objects, designed to resemble those objects in appearance. For example, when a tall building is planned, engineers may construct a small-scale model of that building, and of its surrounding area and environment, in order to conduct stress tests in a wind tunnel.
• Analogue models are also physical models, but represent the entities under study by analogue rather than by replica. An example is a graph showing the movements over time of stock market prices; this type of model provides a pictorial representation of numerical data.
• Mathematical models, or symbolic models, are more abstract representations than iconic or analogue models. They attempt to provide, for example, through an equation or system of equations, a description of a real-life system.

Operations research makes extensive use of mathematical models. To be used successfully, a mathematical model must meet the following criteria:
(a) The model should be as simple and understandable as possible.
(b) The model should be reasonable. Its structures should constrain answers to a reasonable range of values and make it difficult for unrealistic answers to result from inputs.
(c) The model should be easy to maintain.
(d) The model should be adaptive. The parameters and structures of the model should be easy to change as new insights and information evolve.
(e) The model should be complete on important issues.

1.4 What does a mathematical model look like?

Consider a business that manufactures and sells a product. The product costs R5 to manufacture and sells for R20. A model that computes the total profit that will accrue from the items sold is given by the following equation:

Z = 20x − 5x.

Here
x represents the number of units of the product that are manufactured and sold, and
Z represents the total profit in rand that will result from the sale of the product.

The symbols, x and Z, are referred to as variables. The term variable is used because no set numerical value has been specified for these items. The number of units manufactured and sold and the resulting profit can be any amount (within limits) – they can vary. These two variables can be further distinguished as follows:
• The variable Z is known as a dependent variable because its value is dependent on the number of units manufactured and sold.
• The variable x is an independent variable, since the number of units manufactured and sold is not dependent upon anything else (in this model). It is also called the decision variable, because its value is usually determined by a conscious decision made by someone with the authority to do so.

The numbers 20 and 5 in the equation are referred to as parameters. Parameters are constant values that are generally coefficients of the variables in an equation. Parameters may change in the longer term but usually remain constant for the duration of solving a specific problem, for example, the price of the product may change over time but is fixed for the time being.

The equation as a whole is known as a functional relationship. This term is derived from the fact that profit, Z, is a function of the number of units manufactured and sold, x. Since only one functional relationship exists in this example, it is also the model.

This model does not represent a real problem – it merely states a functional relationship in a mathematical form – and we expand our example to create a problem situation. Let us assume that the product is made from steel and that 500 grams of steel is needed to make one unit of the product, and that a total of 100 kilograms of steel is available for production.


If one unit of the product uses 500 grams of steel, then the total number of products manufactured, x, uses 500 × x grams of steel, or 0,5x kilograms of steel. The steel used to manufacture the products may obviously not exceed the steel available for production. A mathematical inequality representing this relationship between steel used and steel available can now be developed and is as follows:

0,5x ≤ 100 (steel utilisation in kilograms).

The “less than or equal to” sign, ≤, is used as no more than 100 kilograms may be used. A negative number of products can obviously not be manufactured and this can be expressed mathematically as

x ≥ 0.

The “greater than or equal to” sign, ≥, is used as at least zero units of the product must be manufactured. The model now consists of three relationships:

Z = 20x − 5x
0,5x ≤ 100
x ≥ 0.

To add some flavour to the model, we now assume that a manager, say the production manager, must decide on the number of units of the product to manufacture. What objective do you think he will have in mind when having to decide on this? Surely to achieve as much profit as possible. The ideal is an infinite profit, but this cannot be reached in practice because of the limited availability of steel.

Based on this observation we can now make a distinction between the relationships in the model. The equation Z = 20x − 5x represents profit and is called the objective function of the model. The inequality 0,5x ≤ 100 represents the steel utilisation and is called the resource constraint. The inequality x ≥ 0 specifies the numerical values that the variable x may assume. It is also a constraint and is called the sign restriction.

To emphasise the distinction between the objective function and the constraints, the model is written as follows:

Maximise Z = 20x − 5x
subject to 0,5x ≤ 100
and x ≥ 0.

This is an example of a very simple model.
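The simple model above can be checked with a short Python sketch. It is an illustration only and not part of the prescribed material: the function and constant names are assumptions chosen for readability, and the decimal comma of the text (0,5) is written as a decimal point (0.5), as Python requires. Because each extra unit adds R15 of profit, the sketch simply takes the largest x that the steel constraint allows.

```python
# Illustrative sketch of the model in Section 1.4; all names are assumptions.
PRICE = 20.0             # selling price per unit, in rand
COST = 5.0               # manufacturing cost per unit, in rand
STEEL_PER_UNIT = 0.5     # kilograms of steel per unit (500 grams)
STEEL_AVAILABLE = 100.0  # kilograms of steel available


def profit(x):
    """Objective function: Z = 20x - 5x."""
    return PRICE * x - COST * x


def is_feasible(x):
    """Resource constraint 0.5x <= 100 and sign restriction x >= 0."""
    return x >= 0 and STEEL_PER_UNIT * x <= STEEL_AVAILABLE


def solve():
    """Profit increases with x, so the best feasible x uses all the steel."""
    best_x = STEEL_AVAILABLE / STEEL_PER_UNIT
    return best_x, profit(best_x)


best_x, best_z = solve()
print(best_x, best_z)  # 200.0 3000.0
```

Under these figures the manager would make 200 units for a profit of R3 000; any larger x violates the resource constraint.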


1.5 Mathematical programming and linear programming

In the world of quantitative analysis, the word “programming” refers to the modelling and solving of practical problems. “Modelling” refers to the construction of a model to represent a problem situation. Our interest is in mathematical models. “Mathematical programming” therefore means the construction of mathematical models of real-life problem situations and their solutions.

All mathematical programming models consist of an objective function that has to be maximised or minimised, and a set of constraints.

“Linear programming” is one category of mathematical programming. Linear programming models are distinguished by the fact that the objective function and the constraints are linear. This means that each term in an equation or inequality is either a number or a number multiplied by a symbol. There are no squares, cubes, cross-products or other nonlinear terms present. Linear programming is often denoted by LP and in this guide we will follow this convention.

From the above we see that the word “programming” has a special meaning and should not be confused with computer programming. Computer programming has, however, played an important role in the advancement and use of operations research techniques. Most real-life problems are too complex to solve by hand or even with a calculator, and require the use of computer packages.
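The definition of linearity given in this section (each term is a number, or a number multiplied by a symbol) can be made concrete with a small Python sketch. The representation of a term as a pair (coefficient, powers of each variable) is an assumption made purely for illustration; it is not how LINGO or any other package stores a model.

```python
# A term is modelled as (coefficient, {variable: exponent}).
# This representation is hypothetical and used for illustration only.

def is_linear_term(term):
    """True if the term is a number, or a number times a single symbol."""
    _coefficient, powers = term
    used = {var: p for var, p in powers.items() if p != 0}
    if not used:
        return True  # a plain number
    return len(used) == 1 and list(used.values()) == [1]


def is_linear_expression(terms):
    """An expression is linear when every one of its terms is linear."""
    return all(is_linear_term(t) for t in terms)


profit_terms = [(20, {"x": 1}), (-5, {"x": 1})]  # 20x - 5x
square_terms = [(3, {"x": 2})]                   # 3x^2 (a square)
cross_terms = [(2, {"x": 1, "y": 1})]            # 2xy (a cross-product)

print(is_linear_expression(profit_terms))  # True
print(is_linear_expression(square_terms))  # False
print(is_linear_expression(cross_terms))   # False
```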

1.6 Properties and assumptions of linear programming

Linear programming has been applied extensively in the past to military, industrial, financial, marketing, accounting and agricultural problems. Today LP continues to be used in many different fields and even though the applications are so diverse, all LP problems have four properties in common:
(a) All problems seek to optimise (maximise or minimise) some quantity (usually profit or cost). This property is referred to as the objective of the problem. This objective must be clearly stated and mathematically defined in the form of an objective function.
(b) There are restrictions, called constraints, on the problem. These constraints limit the degree to which the objective of the problem can be pursued.
(c) There are alternative courses of action to choose from. Say, for example, a company manufactures three different products. Then management must decide how to allocate its limited production resources between the three products.
(d) The objective and constraints of a problem are expressed as equations and inequalities and each of these is linear.

Linear programming is based on the following assumptions:
(a) Certainty. This means that all the parameters (numerical values) in the objective function and constraints are known with certainty and do not change during the period being studied.
(b) Proportionality. Proportionality exists in the objective function and constraints. Say, for example, that the production of one unit of a product uses three hours of a resource. Then making 10 units of the product will use 10 × 3 hours of the resource.
(c) Additivity. This means that the total of all activities equals the sum of the individual activities. Say, for example, that the objective is to maximise the profit resulting from the sale of two products, and that the profit contribution of the first product is R5 and that of the second product is R7. Then the total profit resulting from the manufacture of one unit of each product will be R5 + R7 = R12.
(d) Divisibility. This means that the actual values of the decision variables need not be whole numbers (integers); they may take on fractional values, that is, they are divisible. For example, the production of 10,4 computers per day is quite acceptable since the production process is continuous. The rest of the eleventh computer can be completed on the following day. If integer answers are required for practical purposes, rounding off the values to the nearest integers may yield reasonable results.

Sometimes a fractional solution is not acceptable and an integer solution must be forced. In this case, the model must specify that an integer solution is required and this is called integer programming. Integer programming will not be discussed in this module.
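The proportionality, additivity and divisibility assumptions can be illustrated with the figures used in this section (three resource hours per unit, profit contributions of R5 and R7, and 10,4 computers per day). The Python sketch below is illustrative only; the function names are assumptions, and combining the three examples in one script is a choice made here, not something the guide prescribes.

```python
# Illustrative check of three LP assumptions; names are assumptions.
HOURS_PER_UNIT = 3        # resource hours used by one unit of a product
PROFIT_PER_UNIT = (5, 7)  # rand per unit of product 1 and product 2


def hours_used(units):
    """Resource use is proportional to the number of units made."""
    return HOURS_PER_UNIT * units


def total_profit(x1, x2):
    """Total profit is the sum of the individual profit contributions."""
    return PROFIT_PER_UNIT[0] * x1 + PROFIT_PER_UNIT[1] * x2


# Proportionality: 10 units use 10 times the hours of one unit.
print(hours_used(10), 10 * hours_used(1))

# Additivity: one unit of each product earns R5 + R7.
print(total_profit(1, 1), total_profit(1, 0) + total_profit(0, 1))

# Divisibility: a fractional production level such as 10.4 is allowed;
# rounding to the nearest integer is one practical workaround.
fractional_level = 10.4
print(total_profit(fractional_level, 0), round(fractional_level))
```

Note that rounding a fractional solution does not guarantee the best integer solution, which is why integer programming exists as a separate technique.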

1.7 Other operations research techniques

LP is just one of the many techniques for problem solving and decision making that form part of the tool kit provided by operations research. A few others are inventory control techniques, network techniques (network flow, CPM/PERT), probabilistic techniques (game theory, Markov analysis, simulation, forecasting, etc.), nonlinear programming and other linear techniques such as integer and goal programming.

A schematic representation of some of the techniques of operations research and their areas of application is given in Figure 1.1.

[Figure 1.1: A schematic representation. Title: Operations research techniques. Areas of application shown: general management, operations management, financial management, marketing management, environmental management, quality management, strategic management. Techniques shown: forecasting, queuing theory, stochastic modelling, simulation, inventory control, project scheduling, networks, routing, expert systems, multi-criteria decision making, decision analysis, dynamic programming, goal programming, integer programming, nonlinear programming, linear programming, mathematical programming.]

Chapter 2 Introducing linear programming models

Contents
2.1 Types of problem
2.2 Mark’s LP model
2.3 Christine’s LP model
2.4 Components of an LP model
2.5 General LP model

Sections from prescribed book, Winston Chapter 3, Section 3.1

Learning objectives

After completing this study unit you should be able to
• identify all the components of an LP model
• write down the general LP model.

2.1 Types of problem

Mark and Christine, a young, up-and-coming professional couple, are discussing the events of the day.

Mark is an engineer whose team has developed a new television projection system. Final tests have just been successfully completed and the results are encouraging. The system will be installed on two models, the VH200 and the SB150. Although very excited, Mark is also a bit worried. “I wish we had a larger work force, more machine time and better marketing capabilities. I am sure we could make a fat profit. But as it is, we don’t even know how many of each model to manufacture.”

Christine has problems of her own at the Unique Paint Company. A new expensive special purpose paint, Sungold, is becoming very popular. The production manager has asked Christine to see if she can find a combination of two new ingredients, code-named Alpha and Beta, that will result in the same brilliance and tint as the original ingredients, but at a lower cost. Christine feels confident that she can.

Christine does not realise that her problem, a typical blending problem, is in many ways equivalent to Mark’s, a typical product-mix problem.

Resource allocation problems appear in several forms. Examples are given below.

Product-mix problems
Most manufacturing companies are capable of producing more than one product and have the ability to adjust, to some extent at least, the proportions in which products are made. The objective is to choose the product mix that is most profitable. In making this choice, the firm will be constrained by its resources of equipment and labour, material, finances, etc.

Blending problems
Many products, for example those of the chemical, petroleum, pharmaceutical and processed food industries, contain mixtures of basic ingredients. The finished product must meet certain specifications. Subject to these being met, the manufacturer is free to choose the blend of basic ingredients, being constrained by their availability. The choice made aims at producing a satisfactory product at minimum cost.

Transportation problems
A company, for example a textbook publisher, may have warehouses at various places in the country. All orders received from bookshops must be supplied by these warehouses. For example, copies of textbooks must be shipped to campus bookstores to meet the demand there. The company will want to minimise the distribution costs while meeting this demand.

Purchasing problems
The purchasing department of a company has access to different raw materials in various quantities and qualities. They can buy these in a number of different combinations, subject to production requirements and budget restrictions. They want to find the combination with the lowest cost.

Portfolio selection problems
An investor must decide how to distribute his investments among alternative assets, such as common stocks and bonds, in order to maximise expected return.

Advertising media mix problems
The marketing department of a company must decide how much to spend on advertising in newspapers and magazines and on radio and television. They are restricted by the given budget for product promotion, and the available media space and time. The objective is to maximise the exposure of the product to potential customers.

Production and inventory scheduling problems
Manufacturing companies face variations in the market demand for their products. As it is generally costly to make changes to production schedules, inventory is carried to help meet fluctuations in demand. The problem is to minimise production and inventory holding costs, while meeting anticipated product demand.
The range of applications is diverse, but we can trace the following two common threads:
• Each of the applications involves optimisation, whether it is to maximise or to minimise something.
• Optimisation is always subject to constraints on what is possible.

Many practical management problems can be characterised as problems of constrained optimisation.


2.2 Mark’s LP model

The television manufacturing company is in the market to make money: its objective is to maximise profit. A profit of R300 is made on each set of model VH200 sold and R250 on each set of model SB150 sold. We can see that the more VH200 sets manufactured and sold, the better. However, there are certain limitations which prevent the company from manufacturing and selling thousands of VH200 models. These limitations are as follows:
• there are only 40 hours of labour time available per day for production
• there are only 45 hours of machine time available per day
• there is an inability to sell more than 12 sets of model VH200 per day.

To manufacture one set of model VH200, two hours of labour time and one hour of machine time are required. To manufacture one set of model SB150, one hour of labour time and three hours of machine time are required.

Mark’s problem is to determine how many sets of each model to manufacture each day so that the total profit will be as large as possible. Let us formulate his problem as an LP model.

Decision variables
What decisions must be made? Mark must decide how many model VH200 and how many model SB150 television sets to manufacture each day. These decisions can be represented by the following decision variables:
    VH = number of model VH200 sets to manufacture daily,
    SB = number of model SB150 sets to manufacture daily.
The decision variables used when formulating models should completely describe the decisions to be made.

Objective function
The objective is to maximise profit. This objective can be expressed in the form of a function of the decision variables and is then called an objective function. Proportionality is assumed, and this means that since the profit contribution of one set of model VH200 is R300, the profit contribution of VH units of this model is 300 × VH. Similarly, the profit resulting from the manufacture of SB units of model SB150 is 250 × SB. The total profit is therefore 300VH + 250SB. The objective function is then
    Maximise PROFIT = 300VH + 250SB.

Constraints
What limitations, or restrictions, are there on the problem? The labour and machine time is limited and there are restricted marketing capabilities. This means that there are limited resources. These resources restrict the number of television sets that can be manufactured. These restrictions are called constraints.

Labour constraint
One model VH200 set requires two hours of labour time and the unknown quantity VH requires 2 × VH hours. Similarly, 1 × SB hours of labour time is required to manufacture the model SB150 sets. The total labour time required is therefore 2VH + SB. There are 40 labour hours available for manufacturing the television sets. The labour time used may not be more than the available labour hours. This can be expressed as
    labour hours required ≤ labour hours available
    labour hours VH200 + labour hours SB150 ≤ labour hours available
    2VH + SB ≤ 40.
Note the ≤ sign. The total number of labour hours available, 40, need not necessarily all be used.

Machine constraint
The machine constraint can be deduced in a similar way as
    VH + 3SB ≤ 45.

Marketing constraint
The marketing capabilities are restricted and the result of this is that it is impossible to sell more than 12 sets of model VH200 daily. This constraint is represented as
    VH ≤ 12.


Sign restrictions
It is impossible to manufacture a negative number of television sets and so VH and SB must be nonnegative. These constraints are expressed as
    VH ≥ 0; SB ≥ 0.

The LP model for Mark’s problem is
    Maximise PROFIT = 300VH + 250SB
    subject to
        2VH +  SB ≤ 40    (Labour time)
         VH + 3SB ≤ 45    (Machine time)
         VH       ≤ 12    (Marketing)
    and VH; SB ≥ 0.

NOTE: Since production continues day after day, it is not necessary to complete all sets at the end of the day. A fractional number of sets is permissible (e.g. 5,7 sets). However, had it been necessary to complete all sets by the end of the day, additional constraints limiting VH and SB to whole numbers would have been added. Such an addition changes the problem to one of integer programming, which will not be discussed in this module.
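As a rough illustration, Mark's model can be written down as plain data and used to check a candidate production plan. The Python sketch below only evaluates plans; it does not solve the model. The names PROFIT_COEFFS, CONSTRAINTS, profit and violated_constraints are illustrative assumptions and do not come from the study guide or the prescribed book.

```python
# Mark's LP model (Section 2.2) written as plain data.
# All names here are illustrative assumptions, not part of the study guide.

PROFIT_COEFFS = {"VH": 300, "SB": 250}  # profit in rand per set

# Each entry: (label, activity coefficients, right-hand side), read as
# sum(coefficient * variable) <= right-hand side.
CONSTRAINTS = [
    ("Labour time", {"VH": 2, "SB": 1}, 40),
    ("Machine time", {"VH": 1, "SB": 3}, 45),
    ("Marketing", {"VH": 1, "SB": 0}, 12),
]


def profit(plan):
    """Objective function value 300VH + 250SB for a production plan."""
    return sum(coeff * plan[name] for name, coeff in PROFIT_COEFFS.items())


def violated_constraints(plan):
    """Labels of the constraints and sign restrictions that the plan breaks."""
    broken = []
    for label, coeffs, rhs in CONSTRAINTS:
        used = sum(coeff * plan[name] for name, coeff in coeffs.items())
        if used > rhs:
            broken.append(label)
    for name in PROFIT_COEFFS:
        if plan[name] < 0:
            broken.append(name + " >= 0")
    return broken


if __name__ == "__main__":
    plan = {"VH": 10, "SB": 10}
    print(profit(plan), violated_constraints(plan))
    print(violated_constraints({"VH": 15, "SB": 5}))
```

A plan is feasible exactly when the returned list is empty. Fractional plans such as VH = 5,7 (written 5.7 in Python) are accepted, in line with the NOTE above.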

2.3 Christine’s LP model

In preparing Sungold paint it is required that the paint has a brilliance rating of at least 300 degrees and a tint level of at least 250 degrees. Brilliance and tint levels are determined by the two ingredients, Alpha and Beta. Each gram of Alpha produces one degree of brilliance in a tin of paint. Likewise for Beta. However, the tint is controlled entirely by the amount of Alpha, one gram of it producing three degrees of tint in one tin of paint. The cost of Alpha is 45 cents per gram and the cost of Beta is 12 cents per gram.

Assuming that the objective is to minimise the cost of the ingredients, the problem is to find the quantity of Alpha and Beta to be included in the preparation of each tin of paint. An optimal answer for one tin will remain optimal for any number of tins as long as the relationships are linear. The total quantity of paint to be produced is of course more than one tin, and it is determined mainly by the demand and the manufacturing technology formulation.


Decision variables
The decision variables are
    ALPHA = quantity (in grams) of Alpha in each tin of paint,
    BETA = quantity (in grams) of Beta in each tin of paint.

Objective function
The objective is to minimise the total cost of the ingredients. Since the cost of Alpha is 45 cents per gram and since ALPHA grams are to be used in each tin, the cost per tin is 45 × ALPHA. Similarly for Beta, the cost is 12 × BETA. The total cost is 45ALPHA + 12BETA, and the objective function is given by
    Minimise COST = 45ALPHA + 12BETA.

Constraints

Brilliance constraint
Each gram of Alpha produces one degree of brilliance in a tin of paint and so ALPHA grams of Alpha produce 1 × ALPHA degrees of brilliance. Similarly, BETA grams of Beta produce 1 × BETA degrees of brilliance in a tin of paint. The total brilliance produced in a tin of paint by the two ingredients is then ALPHA + BETA. A brilliance rating of at least 300 degrees per tin is required. The brilliance produced must be at least the brilliance required. This can be expressed as
    brilliance produced ≥ brilliance required
    brilliance from Alpha + brilliance from Beta ≥ brilliance required
    ALPHA + BETA ≥ 300.

Tint constraint
The tint constraint can be deduced in a similar way as
    tint produced ≥ tint required
    tint from Alpha + tint from Beta ≥ tint required
    3 × ALPHA + 0 × BETA ≥ 250
    3ALPHA ≥ 250.


Sign restrictions
It is impossible to use negative amounts of the ingredients and so
    ALPHA ≥ 0; BETA ≥ 0.

The LP model for Christine’s problem is
    Minimise COST = 45ALPHA + 12BETA
    subject to
        ALPHA + BETA ≥ 300    (Brilliance)
        3ALPHA       ≥ 250    (Tint)
    and ALPHA; BETA ≥ 0.

2.4 Components of an LP model

Formulation is actually more of an art than a science, and there is no foolproof “recipe” that can be followed to formulate a problem as an LP model. However, by formally defining the components of a model and following some broad directives, the process can be made easier. So let us take another look at the components of a model and use Mark and Christine’s problems as a reference.

An LP model consists of the components as given below:

Decision variables
The sole purpose of formulating an LP model is to get answers to a problem. Decision variables are, as the name indicates, those variables that represent the decisions that have to be made. When the LP model is solved, the values obtained for the decision variables will be the answer to the problem. To make sure that we know exactly what it is we have to decide about and what the answers will actually mean, we have to define the decision variables as clearly and completely as possible.

Mark’s problem is solved when he can tell the company how many of each model television set to manufacture each day to maximise profit. The decision variables are therefore
    VH = the number of units of model VH200 to manufacture daily,
    SB = the number of units of model SB150 to manufacture daily.

Christine’s problem is solved when she can tell the Unique Paint Company how much of each ingredient to put into a tin of Sungold paint. The decision variables for her problem are therefore
    ALPHA = quantity, in grams, of Alpha in each tin of paint,
    BETA = quantity, in grams, of Beta in each tin of paint.


Objective function
The objective function is a mathematical expression, given as a linear function, that shows the relationship between the decision variables and a single goal (or objective) under consideration. Linear programming attempts to either maximise or minimise the value of the objective function.
Mark’s objective function is
    Maximise PROFIT = 300VH + 250SB.
Christine’s objective function is
    Minimise COST = 45ALPHA + 12BETA.

Profit or cost coefficients
The coefficients in the objective function are either profit or cost coefficients. In Mark’s problem the objective is to maximise profit and the objective function is
    Maximise PROFIT = 300VH + 250SB.
The 300 and 250 in this objective function are profit coefficients. In Christine’s problem the objective is to minimise cost and the objective function is
    Minimise COST = 45ALPHA + 12BETA.
The 45 and 12 in this objective function are cost coefficients.

Constraints
Optimisation is always performed subject to a set of constraints. Therefore, linear programming can be defined as a constrained optimisation technique. These constraints are expressed in the form of linear inequalities or, sometimes, equalities. They reflect the fact that resources are limited (in Mark’s product-mix problem) or they specify the product requirements (in Christine’s blending problem).

Activity (or technology) coefficients
The coefficients of the decision variables in the constraints are called the activity coefficients. They indicate the rate at which a given resource is depleted (Mark) or at which a requirement is met (Christine). They appear on the left-hand side of the constraints.

Right-hand elements
On the right-hand side of a constraint either the capacity, or availability, of a resource (as in Mark’s problem) or the minimum requirements (as in Christine’s problem) are given.

Sign restrictions
Only nonnegative (zero or positive) values of the decision variables are considered. This requirement merely specifies the fact that negative values of physical quantities do not exist.

2.5 General LP model

The general LP model can be presented in the following mathematical terms:

\[
\begin{array}{ll}
\text{Maximise} & F(x_1; x_2; \ldots; x_n) = c_1 x_1 + c_2 x_2 + \cdots + c_j x_j + \cdots + c_n x_n \\
\text{subject to} & a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \le b_1 \\
 & a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \le b_2 \\
 & \qquad \vdots \\
 & a_{i1} x_1 + a_{i2} x_2 + \cdots + a_{in} x_n \le b_i \\
 & \qquad \vdots \\
 & a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \le b_m \\
\text{and} & x_1; x_2; \ldots; x_n \ge 0.
\end{array}
\]

Here
    $a_{ij}$ = the activity coefficients,
    $c_j$ = the profit (cost) coefficients,
    $b_i$ = the right-hand elements,
    $x_j$ = the decision variables,
with
    $i = 1; 2; \ldots; m$,
    $j = 1; 2; \ldots; n$,
    $n$ = number of decision variables,
    $m$ = number of constraints.

The notation is interpreted as follows: The entry a22 is read as a-two-two. When the index consists of two figures, the first one denotes the row number and the second one the column number. The entry a32 (a-three-two) denotes the coefficient in the third row, second column.
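To make the index notation concrete, the general model can be stored as a list of profit coefficients, a table of activity coefficients and a list of right-hand elements. The Python sketch below fills these in with Mark's model (m = 3 constraints, n = 2 decision variables). The helper names objective, is_feasible and entry are assumptions made for this illustration only.

```python
# General LP data in the notation of Section 2.5, filled in with Mark's model.
# The helper names (objective, is_feasible, entry) are illustrative assumptions.

c = [300, 250]            # c_j: profit coefficients, j = 1..n
A = [[2, 1],              # a_ij: activity coefficients, row i, column j
     [1, 3],
     [1, 0]]
b = [40, 45, 12]          # b_i: right-hand elements, i = 1..m


def objective(c, x):
    """F(x) = c_1 x_1 + ... + c_n x_n."""
    return sum(c_j * x_j for c_j, x_j in zip(c, x))


def is_feasible(A, b, x):
    """True when every row i satisfies a_i1 x_1 + ... + a_in x_n <= b_i
    and every x_j is nonnegative."""
    rows_ok = all(
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) <= b_i
        for row, b_i in zip(A, b)
    )
    return rows_ok and all(x_j >= 0 for x_j in x)


def entry(A, i, j):
    """a_ij using the 1-based row and column numbering of the text."""
    return A[i - 1][j - 1]


if __name__ == "__main__":
    print(objective(c, [5, 5]), is_feasible(A, b, [5, 5]))
    print(entry(A, 3, 2))  # a-three-two: third row, second column
```

Python lists are numbered from zero, which is why entry subtracts one from each index. A constraint of the ≥ type, such as those in Christine's model, can be brought into this ≤ form by multiplying both sides by −1, for example −ALPHA − BETA ≤ −300.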


Chapter 3

Graphical representation

Contents
3.1 A graphical approach
3.2 Finding the feasible area
3.3 Identifying the optimal solution
3.4 Solving Christine’s problem graphically
3.5 Types of solution
    3.5.1 Infeasible LPs
    3.5.2 Unbounded solutions
    3.5.3 Multiple optimal solutions
    3.5.4 Degenerate solutions
3.6 Types of constraint
    3.6.1 Redundant constraints
    3.6.2 Binding and nonbinding constraints
3.7 Exercises
3.8 Solutions to exercises


Sections from prescribed book, Winston
Chapter 3, Section 3.1
Chapter 3, Section 3.2
Chapter 3, Section 3.3

Learning objectives

After completing this study unit you should be able to
• use the graphical approach to determine the optimal solution to an LP model in two variables
• identify infeasibility, unbounded solutions, multiple solutions and degeneracy from a graph
• identify redundant, binding and nonbinding constraints from a graph.

3.1 A graphical approach

We have now spent a considerable time “transforming” problems into linear programming models. We will also spend a considerable time solving them using a solution method called the simplex method. However, before doing this, we are going to use a graphical representation of simple LP models to illustrate certain characteristics of LP models. A graphical approach may also be used to solve small problems in two decision variables with only a few constraints. The nonnegativity constraints of linear programming problems restrict the graphical representation to the first quadrant only.

The graphical approach consists of two phases:
• graphical representation of the feasible area
• identifying the optimal solution.

As an illustration we return to Mark’s problem. This problem was formulated in Section 2.2 as follows:
    Maximise PROFIT = 300VH + 250SB
    subject to
        2VH +  SB ≤ 40    (Labour time)
         VH + 3SB ≤ 45    (Machine time)
         VH       ≤ 12    (Marketing)
    and VH; SB ≥ 0.
Here
    VH = number of sets of model VH200 produced daily,
    SB = number of sets of model SB150 produced daily.

3.2 Finding the feasible area

The feasible area is established by graphing all of the constraints, which are in the form of inequalities and/or equations.

The two decision variables under consideration in Mark’s problem are VH and SB. It does not matter which axis is used to represent which variable. We arbitrarily choose the horizontal axis to represent VH. Both VH and SB are ≥ 0. We therefore use only the first quadrant for the graphical representation.

We have to represent the three constraints, which in this case are inequalities, on the graph. We start by considering the equality part only (= sign), dropping the less than part (< signs):
    2VH + SB = 40    (1)
    VH + 3SB = 45    (2)
    VH = 12.         (3)

In order to draw a line, we need two points. The easiest way to do this is as follows: set the one variable equal to zero and find the value for the other variable. This gives a point on one axis. Repeat the process for the other variable to find a point on the other axis. Connect the points to draw the line.

Consider equation (1): 2VH + SB = 40.
Let VH = 0, then SB = 40. The point on the SB-axis is (0; 40).
Let SB = 0, then VH = 20. The point on the VH-axis is (20; 0).
Plot the two points on the axes and draw the line connecting these points.

Consider equation (2): VH + 3SB = 45.
Let VH = 0, then SB = 15. The point on the SB-axis is (0; 15).
Let SB = 0, then VH = 45. The point on the VH-axis is (45; 0).
Plot the two points on the axes and draw the line.

Consider equation (3): VH = 12.
Any line consisting of one variable only is a line parallel to the axis representing the other variable. In this case, draw a line parallel to the SB-axis, passing through the point (12; 0).

These three lines are drawn and are given in Figure 3.1. But these lines do not tell us anything about the inequalities.
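The two-point recipe above (set one variable to zero and solve for the other) can be sketched in Python as follows. The function name axis_intercepts and its argument order are assumptions made for this illustration.

```python
def axis_intercepts(a, b, rhs):
    """Axis intercepts of the line a*VH + b*SB = rhs.

    Returns (point on the VH-axis, point on the SB-axis). An entry is None
    when the line is parallel to that axis and so never crosses it.
    """
    vh_axis_point = (rhs / a, 0.0) if a != 0 else None   # set SB = 0
    sb_axis_point = (0.0, rhs / b) if b != 0 else None   # set VH = 0
    return vh_axis_point, sb_axis_point


if __name__ == "__main__":
    for a, b, rhs in [(2, 1, 40), (1, 3, 45), (1, 0, 12)]:
        print((a, b, rhs), axis_intercepts(a, b, rhs))
```

For equation (3), VH = 12, the coefficient of SB is zero, so only the point on the VH-axis is returned. This matches the remark that the line is parallel to the SB-axis.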


Figure 3.1: Equality constraints (the lines 2VH + SB = 40, VH + 3SB = 45 and VH = 12, with VH on the horizontal axis and SB on the vertical axis)

Consider the first constraint of Mark’s model, the inequality 2VH + SB ≤ 40. This inequality states that the value of 2VH + SB must be less than or equal to 40. The equality part (= sign) is easy: all the points on the line 2VH + SB = 40 will satisfy this requirement.

To deal with the “less than” part (< sign), we choose any point not on the line. If this point satisfies the inequality, then all points on the same side of the line will also satisfy the inequality. Conversely, if the point does not satisfy the inequality, then all points on the opposite side of the line will satisfy the inequality.

The easiest way to determine on which side of the line the inequality is satisfied is to substitute the point (0; 0) into the inequality. Then
    2VH + SB = 2(0) + 0 = 0 < 40.
The point (0; 0) therefore satisfies the inequality and all points on the same side of the line as (0; 0) will satisfy the inequality. The area satisfying the inequality is represented by the horizontal lines on Figure 3.2. All the points in the lined area as well as the points on the line 2VH + SB = 40 satisfy the inequality 2VH + SB ≤ 40. We call this the solution set of the inequality.

NOTE: The equality sign (=) is part of the inequality 2VH + SB ≤ 40 and therefore the line 2VH + SB = 40 is represented by a solid line on the graph. If the inequality were in fact a strict inequality, say 2VH + SB < 40, where the equality sign is not part of the inequality, then the points on the line 2VH + SB = 40 would not be part of the solution set of the inequality, and this fact should be represented on a graph by drawing the line 2VH + SB = 40 as a dotted line.
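The test-point rule described above can be sketched in a few lines of Python. This is a minimal sketch for inequalities of the form a·VH + b·SB ≤ rhs only; the function name solution_side and the returned strings are assumptions made for this illustration.

```python
def solution_side(a, b, rhs, test_point=(0, 0)):
    """Apply the test-point rule to the inequality a*VH + b*SB <= rhs.

    Reports whether the solution set lies on the same side of the line
    a*VH + b*SB = rhs as the test point, or on the opposite side.
    """
    vh, sb = test_point
    value = a * vh + b * sb
    if value == rhs:
        # A point on the line cannot tell the two sides apart.
        raise ValueError("test point lies on the line; choose another point")
    return "same side as test point" if value < rhs else "opposite side"


if __name__ == "__main__":
    print(solution_side(2, 1, 40))            # test point (0; 0)
    print(solution_side(2, 1, 40, (30, 0)))   # test point (30; 0)
```

The origin is usually the most convenient test point, but the rule works for any point that is not on the line, which is why the sketch rejects points lying exactly on it.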


Figure 3.2: Labour time inequality (VH on the horizontal axis, SB on the vertical axis)

Now consider the second constraint of Mark’s model, VH + 3SB ≤ 45, and substitute the point (0; 0) into this inequality. Then
    VH + 3SB = 0 + 3(0) = 0 < 45.
The point (0; 0) satisfies the inequality and all points on the same side of the line satisfy the inequality. The solution set of this inequality is represented by the solid line VH + 3SB = 45 and the positively slanted lines on Figure 3.3.

Figure 3.3: Labour and machine time inequalities (VH on the horizontal axis, SB on the vertical axis)

Now consider the third constraint of Mark’s model, VH ≤ 12.


The points satisfying this inequality lie on the line VH = 12 and to the left of this line. The solution set of this inequality is represented by the solid line VH = 12 and the negatively slanted lines on Figure 3.4.

Figure 3.4: Labour and machine time and marketing inequalities (VH on the horizontal axis, SB on the vertical axis)

The solution sets of all the inequalities are now determined. In Figure 3.5 we show the area where these solution sets overlap as a shaded area.

Figure 3.5: Feasible area (VH on the horizontal axis, SB on the vertical axis)

This shaded area where all the constraints overlap is called the feasible area or feasible region of the LP model.


NOTE: The line 2VH + SB = 40 lies entirely outside the feasible area, so the constraint 2VH + SB ≤ 40 makes no contribution to the feasible area. Any point in the feasible area automatically satisfies this constraint. Such a “non-contributing” constraint is called a redundant constraint and can be omitted from the model.

3.3 Identifying the optimal solution

Any point in the feasible area satisfies all the constraints and is therefore a feasible solution to the LP model. Since there are an infinite number of points in this area, there are an infinite number of feasible solutions for this problem. To find an optimal solution, it is necessary to identify a solution (point) in the feasible area that maximises the profit (objective function). How can this be accomplished?

Let us examine the point (6; 3), which is a feasible solution since it lies inside the feasible region. The profit associated with this point is
    PROFIT = 300VH + 250SB = 300(6) + 250(3) = 2 550.
This means that if the company manufactures six sets of model VH200 and three sets of model SB150 daily, then the profit will be R2 550.

We now move from the point (6; 3) to the point (12; 3), which is a point on the boundary of the feasible area, but still part of the feasible area. The profit associated with point (12; 3) is R4 350, which is larger than the profit associated with point (6; 3). Similarly, we move from point (6; 3) to another boundary point (6; 13) and find that the associated profit is R5 050, which is larger than the profit associated with point (6; 3). Figure 3.6 illustrates this.

By repeating this process with other points, we can show that there will be at least one point somewhere on the boundary of the feasible area that is better than the point inside the feasible area. Therefore, the optimal solution to a linear programming model can never be inside the feasible area, but must be on the boundary.

If we follow the same argument as before, we can show that the optimal solution to an LP model will always be at a corner point (an extreme point) of the feasible area, that is, where two constraints intersect.

The feasible area of Mark’s problem has four corner points.
The first one is the origin (0; 0), the second one is the intercept on the VH-axis (12; 0) and the third one is the intercept on the SB-axis (0; 15). The fourth corner point is at the intersection of the two constraints. If we substitute VH = 12 in VH + 3SB = 45, we obtain SB = 11. The fourth corner point is (12; 11).

Figure 3.6: Searching for the optimal solution (the points (6; 3) with PROFIT = 2 550, (12; 3) with PROFIT = 4 350 and (6; 13) with PROFIT = 5 050, shown with the lines VH = 12 and VH + 3SB = 45)

Let us calculate the value of the objective function at the corner points of the feasible region:

    Corner point (VH; SB)    Value of objective function PROFIT = 300VH + 250SB
    (0; 0)                   0
    (12; 0)                  3 600
    (12; 11)                 6 350
    (0; 15)                  3 750
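The corner-point evaluation in the table can be reproduced with a short brute-force sketch in Python. It intersects every pair of boundary lines, keeps the intersections that satisfy all the constraints, and evaluates the profit at each. This is only practical for small two-variable models; the function names and the way the sign restrictions are encoded are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import combinations

# Every restriction is written as a*VH + b*SB <= rhs.
# The sign restrictions VH >= 0 and SB >= 0 become -VH <= 0 and -SB <= 0.
CONSTRAINTS = [
    (2, 1, 40),    # labour time
    (1, 3, 45),    # machine time
    (1, 0, 12),    # marketing
    (-1, 0, 0),    # VH >= 0
    (0, -1, 0),    # SB >= 0
]


def profit(point):
    vh, sb = point
    return 300 * vh + 250 * sb


def intersection(first, second):
    """Intersection of two boundary lines (Cramer's rule), or None if parallel."""
    a1, b1, r1 = first
    a2, b2, r2 = second
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (Fraction(r1 * b2 - r2 * b1, det), Fraction(a1 * r2 - a2 * r1, det))


def is_feasible(point, constraints):
    vh, sb = point
    return all(a * vh + b * sb <= rhs for a, b, rhs in constraints)


def corner_points(constraints):
    """Feasible intersections of pairs of boundary lines."""
    corners = set()
    for first, second in combinations(constraints, 2):
        point = intersection(first, second)
        if point is not None and is_feasible(point, constraints):
            corners.add(point)
    return corners


corners = corner_points(CONSTRAINTS)
best = max(corners, key=profit)
```

Exact fractions are used so that points on a boundary are not rejected through rounding. Calling corner_points without the labour time row gives the same four corner points, which agrees with the earlier NOTE that this constraint is redundant.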

From this we see that the point (12; 11) gives the maximum profit. Therefore, the optimal solution is to produce 12 sets of model VH200 and 11 sets of model SB150 for a profit of R6 350.

This method of calculating the value of the objective function at the corner points is not very efficient and may be a very lengthy exercise. A more efficient method for obtaining the optimal solution must be found.

The optimal solution is the solution that optimises (maximises in this case) the objective function. Therefore, it is obvious that we should use the objective function to find the optimal solution. The objective function of Mark’s problem is
    Maximise PROFIT = 300VH + 250SB.


Let us examine a point, say (5; 5), in the feasible area. Here five units of each type of model are produced and the associated profit is
    PROFIT = 300(5) + 250(5) = 2 750.
The objective function can now be written as
    300VH + 250SB = 2 750.
This is the equation of a straight line and each of the points on this line will have an objective function value (profit) of R2 750. This line is called an isoprofit line because all points on it have the same profit (in Greek “iso” means “same”).

We now want to draw this line and we proceed as follows: We already have one point, namely (5; 5). To find another point, we calculate the intercept on the SB-axis. If VH = 0 then
    300VH + 250SB = 2 750
    300(0) + 250SB = 2 750
    SB = 11.
We can use the points (5; 5) and (0; 11) to draw the isoprofit line 300VH + 250SB = 2 750. This is shown in Figure 3.7 by the dashed line. We use a dashed line so that we can distinguish the isoprofit line from the constraints.

Choose another point in the feasible region, say (5; 10). An isoprofit line going through this point will have a corresponding profit of
    PROFIT = 300(5) + 250(10) = 4 000.
The SB-intercept of 300VH + 250SB = 4 000 is at VH = 0, and is SB = 16. We can use the points (5; 10) and (0; 16) to draw the isoprofit line 300VH + 250SB = 4 000.

Likewise, points (10; 10) and (0; 22) can be used to draw the isoprofit line 300VH + 250SB = 5 500. And points (10; 15) and (0; 27) can be used to draw the isoprofit line 300VH + 250SB = 6 750.
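The arithmetic behind these four isoprofit lines can be checked with a small Python sketch. The function name isoprofit_line and its default coefficients are assumptions made for this illustration.

```python
def isoprofit_line(point, c_vh=300, c_sb=250):
    """Profit at `point` and the SB-axis intercept of the isoprofit line
    c_vh*VH + c_sb*SB = profit that passes through it."""
    vh, sb = point
    value = c_vh * vh + c_sb * sb
    sb_intercept = (0, value / c_sb)   # set VH = 0 and solve for SB
    return value, sb_intercept


if __name__ == "__main__":
    for point in [(5, 5), (5, 10), (10, 10), (10, 15)]:
        value, intercept = isoprofit_line(point)
        print(point, value, intercept)
```

Each chosen point together with its SB-axis intercept gives the two points needed to draw the corresponding isoprofit line.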


Figure 3.7: Isoprofit line = R2 750 (the dashed line through (0; 11) and (5; 5), shown with the lines VH = 12 and VH + 3SB = 45)

Figure 3.8: Isoprofit lines (PROFIT = 2 750, 4 000, 5 500 and 6 750, shown with the lines VH = 12 and VH + 3SB = 45 and the point X)

The four isoprofit lines can be shown on our graph of the feasible area and appear in Figure 3.8. We see that the four isoprofit lines are parallel and that the profit gets larger as the isoprofit lines move further away from the origin (point (0; 0)) in a direction upwards and to the right. The first three isoprofit lines pass through the feasible area and therefore contain feasible solutions. The fourth isoprofit line, 300VH + 250SB = 6 750, lies outside the feasible area. Therefore, the points on this line will give infeasible solutions. An infeasible solution is an impossible solution. In Mark’s case, this means that those combinations of the two model television sets cannot be produced given the current constraints.

Figure 3.8 shows four isoprofit lines. There are, however, an infinite number of isoprofit lines parallel to these lines. Since the isoprofit lines are parallel, there is no need for us to draw any more lines. The tendency will remain the same, namely, the profit increases as the lines move upwards and to the right.

We are looking for the isoprofit line with the maximum profit that satisfies the constraints. This means that the isoprofit line must still touch the feasible area. The isoprofit line with the maximum profit will be the one that is as far as possible from the origin, upwards and to the right, but still touches the feasible area.

The easiest way to determine this maximum isoprofit line is to place a ruler on one of the isoprofit lines and, keeping it parallel to the line, move it upwards and to the right until we find the last isoprofit line that still touches the feasible area, in other words, the last isoprofit line before the one that lies outside the feasible area.

If we go back to Figure 3.8 and do this, we find that the last isoprofit line is the one that passes through point X. This point (12; 11) is the point of intersection of the lines that represent two boundaries of the feasible area, namely VH + 3SB = 45 and VH = 12. This maximum isoprofit line contains just this one point in the feasible area and this point is therefore the solution to the model. The profit associated with this point is
    PROFIT = 300VH + 250SB = 300(12) + 250(11) = 6 350.
The optimal solution is therefore to produce 12 sets of model VH200 and 11 sets of model SB150 for a profit of R6 350.
This is the same solution as was obtained by means of evaluating the corner points. In solving Mark’s problem we drew four isoprofit lines. This was done to determine the direction in which the isoprofit lines should be moved to increase profit. However, it is unnecessary to draw so many isoprofit lines; in fact, we require just two isoprofit lines. If we draw two isoprofit lines, we can compare their objective function values to determine the direction of increasing profit. Once we have determined this, we can use a ruler and move it in the required direction until we find the point where the isoprofit line last touches the feasible area. This point will be the optimal solution.


The fact that the isoprofit lines are parallel, and therefore have the same slope, can be applied to derive yet another method of drawing the isoprofit lines. The objective function can be rewritten as follows:

\[
\begin{aligned}
300\,VH + 250\,SB &= \text{PROFIT} \\
250\,SB &= -300\,VH + \text{PROFIT} \\
SB &= -\frac{300}{250}\,VH + \frac{\text{PROFIT}}{250} \\
SB &= -\frac{6}{5}\,VH + \frac{\text{PROFIT}}{250}.
\end{aligned}
\]

(The function is rewritten by writing SB in terms of VH and PROFIT since SB is represented on the vertical axis of the graph.)

The slope of all the isoprofit lines will then be −6/5. The slope is negative, which means that the slant is \, in contrast to a positive slope which has slant /.

To draw the first isoprofit line, we measure six units on the vertical axis and five units on the horizontal axis and draw a line through these points. In other words, we draw a line through points (0; 6) and (5; 0). To determine the value of this isoprofit line, we substitute either of these points into the objective function. Then
    PROFIT = 300VH + 250SB = 300(0) + 250(6) = 1 500.

A second isoprofit line can be drawn by measuring 2 × 6 units on the vertical axis and 2 × 5 units on the horizontal axis and drawing a line through these two points, namely (0; 12) and (10; 0). The corresponding profit value is
    PROFIT = 300(10) + 250(0) = 3 000.

The graphical representation showing these two isoprofit lines is given in Figure 3.9.

These two isoprofit lines once again show us that the profit increases as we move upwards and towards the right. If we place a ruler on one of these isoprofit lines and continue moving it parallel to the isoprofit lines in the direction of increasing profit, we will find that the maximum profit is obtained when the isoprofit line passes through point (12; 11). This is exactly the same result as was obtained before.
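The slope method can be sketched in Python with exact fractions. The sketch reduces the slope to lowest terms and then steps k times the rise and run along the axes, as was done above with k = 1 and k = 2. The function names isoprofit_slope and stepped_isoprofit_line are assumptions made for this illustration, and positive objective coefficients are assumed.

```python
from fractions import Fraction


def isoprofit_slope(c_horizontal, c_vertical):
    """Slope of c_horizontal*x + c_vertical*y = constant,
    with y plotted on the vertical axis."""
    return Fraction(-c_horizontal, c_vertical)


def stepped_isoprofit_line(c_horizontal, c_vertical, k=1):
    """Axis points of the isoprofit line found by stepping k times the
    reduced slope, plus the objective value shared by points on that line.
    Assumes both coefficients are positive."""
    slope = isoprofit_slope(c_horizontal, c_vertical)
    rise, run = -slope.numerator, slope.denominator
    on_vertical_axis = (0, k * rise)
    on_horizontal_axis = (k * run, 0)
    value = c_vertical * k * rise   # objective value at (0; k*rise)
    return on_vertical_axis, on_horizontal_axis, value


if __name__ == "__main__":
    print(isoprofit_slope(300, 250))
    print(stepped_isoprofit_line(300, 250, k=1))
    print(stepped_isoprofit_line(300, 250, k=2))
```

Because the rise and run come from the reduced fraction, both axis points lie on the same isoprofit line, so substituting either one into the objective function gives the same value.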

DSC2606 3.4. SOLVING CHRISTINE’S PROBLEM GRAPHICALLY

Figure 3.9: Two isoprofit lines (horizontal axis VH, vertical axis SB; lines VH = 12, VH + 3SB = 45, PROFIT = 1 500 and PROFIT = 3 000; optimal point (12; 11))

3.4 Solving Christine’s problem graphically

The graphical approach can also be used for minimisation problems. As an illustration, we return to Christine’s LP model (see Section 2.3):

Minimise COST = 45ALPHA + 12BETA
subject to
ALPHA + BETA ≥ 300 (Brilliance)
3ALPHA ≥ 250 (Tint)
and ALPHA; BETA ≥ 0.

Here
ALPHA = quantity (in grams) of Alpha in each tin of paint,
BETA = quantity (in grams) of Beta in each tin of paint.
The cost is given in cents per gram.

Let us decide that ALPHA will be represented on the horizontal axis and BETA on the vertical axis. The first constraint is rewritten as an equality as ALPHA + BETA = 300. Let ALPHA = 0, then BETA = 300. The point on the BETA-axis is (0; 300).


Let BETA = 0, then ALPHA = 300. The point on the ALPHA-axis is (300; 0). Plot these two points on the axes and draw the line connecting them. Now substitute point (0; 0) into the constraint. Then

ALPHA + BETA ≥ 300
0 + 0 ≱ 300.

The point (0; 0) does not satisfy the inequality and so (0; 0) is not part of the feasible area. All points on the opposite side of the line ALPHA + BETA = 300 will satisfy the inequality.

Now rewrite the second constraint as an equation. Then

3ALPHA = 250
ALPHA = 83,33.

This is a straight line parallel to the BETA-axis, passing through the point (83,33; 0). The point (0; 0) does not satisfy the inequality 3ALPHA ≥ 250 and therefore it does not form part of the feasible area.

The objective function is now rewritten as

$$\text{COST} = 45\,\text{ALPHA} + 12\,\text{BETA}$$
$$12\,\text{BETA} = -45\,\text{ALPHA} + \text{COST}$$
$$\text{BETA} = -\frac{45}{12}\,\text{ALPHA} + \frac{\text{COST}}{12} = -\frac{15}{4}\,\text{ALPHA} + \frac{\text{COST}}{12}.$$

The slope of the isocost lines is $-\frac{15}{4}$. We can draw the first isocost line by measuring, say, 10 × 15 = 150 units on the vertical axis and 10 × 4 = 40 units on the horizontal axis, and then drawing a line through these points (0; 150) and (40; 0). The cost associated with this isocost line is

COST = 45ALPHA + 12BETA = 45(40) + 12(0) = 1 800.

A second isocost line can be drawn by measuring 30 × 15 = 450 units on the vertical axis and 30 × 4 = 120 units on the horizontal axis, and then drawing a line through these points (0; 450) and (120; 0). The associated cost is 5 400. (Both cost values are in cents.)

Figure 3.10: Christine’s problem (horizontal axis ALPHA, vertical axis BETA; lines ALPHA + BETA = 300 and 3ALPHA = 250; isocost lines COST = 1 800 and COST = 5 400; optimal point A)

The graphical representation of the LP model is given in Figure 3.10. The cost increases as the isocost lines move upwards and to the right, and decreases as they move downwards and to the left. Since the objective is to minimise cost, the optimal solution is the point where an isocost line moving downwards and to the left just touches the feasible area before it moves out of the feasible area. Therefore, the optimal solution is at point A.

Point A is at the intersection of the two constraints. Here ALPHA = 83,33 and this can be substituted into the first constraint to obtain the value of BETA. Then

ALPHA + BETA = 300
83,33 + BETA = 300
BETA = 216,67.

The optimal solution to Christine’s problem is to use 83,33 grams of Alpha and 216,67 grams of Beta in each tin of paint. The minimum cost will be

COST = 45ALPHA + 12BETA = 45(83,33) + 12(216,67) = 6 350.

The cost is given in cents and therefore the minimum cost is R63,50 per tin of paint.
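The calculation at point A can be repeated with exact fractions, which avoids the rounding in 83,33 and 216,67. The sketch below is a check of the arithmetic only; the variable names are our own and decimal points replace decimal commas.

```python
from fractions import Fraction

# Point A lies on both constraint lines:
#   ALPHA + BETA = 300   (first constraint)
#   3*ALPHA = 250        (second constraint)
alpha = Fraction(250, 3)
beta = 300 - alpha

cost_cents = 45 * alpha + 12 * beta   # objective value in cents

print(float(alpha), float(beta))      # about 83.33 and 216.67
print(float(cost_cents) / 100)        # cost in rand per tin
```

With the unrounded values the objective works out to exactly 6 350 cents.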

3.5 Types of solution

We have now solved Mark’s and Christine’s problems by means of the graphical approach. Each of these models had a single optimal solution. This may not always be the case. It may happen that an LP model has no solution, or even has many optimal solutions. Let us consider the different types of solution that can be obtained for LP models.

3.5.1 Infeasible LPs

Consider the following LP model:

Maximise Z = 5x + 3y
subject to
2x + y ≤ 4 (1)
x ≥ 4 (2)
y ≥ 6 (3)
and x; y ≥ 0.

The graphical representation of the constraints is in Figure 3.11. The solution sets of the three constraints do not overlap and as such there is no point that satisfies all three constraints simultaneously (at the same time). Therefore, a feasible solution area does not exist, and there is no solution to the problem. Such a situation where no feasible area exists is called an infeasible LP, and it has no solution.
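The same conclusion can be reached without a graph. Constraints (2) and (3) together force 2x + y to be at least 2(4) + 6 = 14, which contradicts constraint (1). The sketch below states this and, purely as an illustration and not as a proof, also searches an integer grid for feasible points; the function name and grid size are our own choices.

```python
def feasible(x, y):
    """True if (x; y) satisfies every constraint of the LP in Section 3.5.1."""
    return 2 * x + y <= 4 and x >= 4 and y >= 6 and x >= 0 and y >= 0

# Smallest value 2x + y can take when x >= 4 and y >= 6.
smallest_lhs = 2 * 4 + 6

# Brute-force search on a 21 x 21 integer grid (an illustration, not a proof).
grid_hits = [(x, y) for x in range(21) for y in range(21) if feasible(x, y)]

print(smallest_lhs, grid_hits)
```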

Figure 3.11: Infeasible LP (horizontal axis x, vertical axis y; constraint lines (1), (2) and (3))

3.5.2 Unbounded solutions

Consider the following LP model:

Maximise Z = 4M + 2N
subject to
−M + 2N ≤ 6 (1)
−M + N ≤ 2 (2)
and M; N ≥ 0.

The graphical representation of this model is in Figure 3.12.

Figure 3.12: Unbounded solution (horizontal axis M, vertical axis N; constraint lines (1) and (2); isoprofit lines Z = 4, Z = 8 and Z = 12)

The value of the objective function increases as the isoprofit lines move upwards and to the right across the feasible area. As the feasible area has no boundary on the right, the value of the objective function can increase indefinitely without ever reaching a maximum. This LP model is said to have an unbounded solution.
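One way to see the unboundedness numerically is to walk along the M-axis. Every point (t; 0) with t ≥ 0 satisfies both constraints, and Z = 4t grows without limit. The following sketch assumes nothing beyond the model above; the function names are illustrative.

```python
def feasible(m, n):
    """Constraints of the LP in Section 3.5.2."""
    return -m + 2 * n <= 6 and -m + n <= 2 and m >= 0 and n >= 0

def z(m, n):
    """Objective function Z = 4M + 2N."""
    return 4 * m + 2 * n

# Points (t; 0) stay feasible for every t >= 0 while Z keeps increasing.
for t in (0, 10, 1_000, 1_000_000):
    print(t, feasible(t, 0), z(t, 0))
```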

3.5.3 Multiple optimal solutions

Consider the following LP model:

Maximise P = 2x + 3y
subject to
2x + 3y ≤ 30 (1)
−x + y ≤ 5 (2)
x + y ≥ 5 (3)
x ≤ 10 (4)
and x; y ≥ 0.

The graphical representation of this model is in Figure 3.13.

Figure 3.13: Multiple optimal solutions (horizontal axis x, vertical axis y; constraint lines (1) to (4); isoprofit lines P = 18, P = 24 and P = 30; points A and B)

From the graph it is clear that the isoprofit lines are parallel to constraint (1). Both have a slope of $-\frac{2}{3}$. As we move the isoprofit lines to the right, the whole line segment AB, and not just a single extreme point, will be touched before the isoprofit lines leave the feasible area. Every point on this line segment will result in the same maximum profit, and so every point on this line segment is optimal. The end points, A and B, are referred to as the alternate end point optimal solutions with the understanding that these points represent the endpoints of a range of optimal solutions.

This LP model is said to have multiple optimal solutions.
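The claim that every point on AB gives the same profit can be checked numerically. The coordinates of A and B are not stated in the text; in the sketch below they are computed by us as the intersections of constraint (1) with constraints (2) and (4), giving A = (3; 8) and B = (10; 10/3). The helper names are our own.

```python
from fractions import Fraction

def feasible(x, y):
    """Constraints of the LP in Section 3.5.3."""
    return (2 * x + 3 * y <= 30 and -x + y <= 5
            and x + y >= 5 and x <= 10 and x >= 0 and y >= 0)

def p(x, y):
    """Objective function P = 2x + 3y."""
    return 2 * x + 3 * y

# A is where (1) meets (2); B is where (1) meets (4) (computed, not quoted).
A = (Fraction(3), Fraction(8))
B = (Fraction(10), Fraction(10, 3))

def segment_point(t):
    """Point A + t*(B - A) on the segment AB, for 0 <= t <= 1."""
    return (A[0] + t * (B[0] - A[0]), A[1] + t * (B[1] - A[1]))

for t in (Fraction(0), Fraction(1, 4), Fraction(1, 2), Fraction(1)):
    x, y = segment_point(t)
    print(x, y, feasible(x, y), p(x, y))
```

Each sampled point is feasible and gives P = 30, the value of the highest isoprofit line in Figure 3.13.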

3.5.4 Degenerate solutions

Consider the following LP model:

Maximise z = 3x + 9y
subject to
x + 4y ≤ 8 (1)
x + 2y ≤ 4 (2)
and x; y ≥ 0.

The graphical representation of this model is in Figure 3.14.

Figure 3.14: Degenerate solution (horizontal axis x, vertical axis y; constraint lines (1) and (2); isoprofit lines z = 9 and z = 18; point A)

The optimal solution is at point A. There are three lines going through point A, namely x = 0, x + 4y = 8 and x + 2y = 4. Only two lines are needed to define a point in a two-dimensional environment. Point A is therefore overdetermined and one of the lines (constraints) is redundant. From the graph it is clear that constraint (1) is the culprit. The feasible area will not change if we remove this constraint from the model. A constraint like this is called a redundant constraint and can be removed from the model without changing the solution. This LP model is said to have a degenerate solution.
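Degeneracy can be detected by counting how many boundary lines pass through the optimal point. The three lines named above meet at (0; 2), a coordinate we derive from x = 0 and x + 2y = 4 rather than quote from the text. The sketch below lists the lines that hold with equality at A; the dictionary layout is an illustrative choice.

```python
# Boundary lines of the LP in Section 3.5.4, written as lhs(x, y) == rhs.
lines = {
    "x = 0":      (lambda x, y: x,         0),
    "y = 0":      (lambda x, y: y,         0),
    "x + 4y = 8": (lambda x, y: x + 4 * y, 8),
    "x + 2y = 4": (lambda x, y: x + 2 * y, 4),
}

A = (0, 2)  # derived from x = 0 and x + 2y = 4

# Lines that pass through A (more than two signals a degenerate point in 2-D).
active = [name for name, (lhs, rhs) in lines.items() if lhs(*A) == rhs]
z_at_A = 3 * A[0] + 9 * A[1]

print(active)
print(z_at_A)
```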


3.6 Types of constraint

3.6.1 Redundant constraints

Let us refer back to Mark’s problem. The labour time constraint, 2VH + SB ≤ 40, falls outside the feasible area and makes no contribution to the feasible area. Therefore, it is a redundant constraint. The redundant constraint can be removed from the model. In fact, it should be removed, since it only complicates the model and does not make any positive contribution.

3.6.2 Binding and nonbinding constraints

Consider the following LP model:

Maximise PROFIT = 2X + 3Y
subject to
5X + 6Y ≤ 60 (Labour time)
X + 2Y ≤ 16 (Machine time)
X ≤ 10 (Demand)
and X; Y ≥ 0.

The graphical representation of this model is given in Figure 3.15.

Figure 3.15: Binding and nonbinding constraints (horizontal axis X, vertical axis Y; constraint lines Labour, Machine and Demand; isoprofit lines PROFIT = 12 and PROFIT = 24; optimal point G = (6; 5))

DSC2606 3.7. EXERCISES

The optimal solution is found at point G where the labour time and machine time constraints intersect, that is, where X = 6 and Y = 5. We now substitute these optimal values into the constraints. If we substitute X = 6 and Y = 5 into the labour time constraint, we find

left-hand side = 5X + 6Y
= 5(6) + 6(5)
= 60
= right-hand side.

Likewise the left-hand side of the machine time constraint equals the right-hand side at the optimal point (X; Y) = (6; 5). This means that the available labour time and machine time are completely used up. Constraints like these where the resources are fully utilised are called binding constraints.

We now substitute the optimal values into the demand constraint and find

left-hand side = X = 6 < 10 = right-hand side.

From this we see that the optimal quantity of X produced is less than the maximum number demanded, that is, it is not optimal to produce the maximum number demanded. The difference between the left-hand side and the right-hand side of a constraint is called the surplus or slack, depending on the context of the problem. A constraint where the resource is not utilised fully, in other words, where there is surplus or slack, is called a nonbinding constraint.

From the graph we see that a constraint is binding if the optimal solution falls on the line representing the constraint and a constraint is nonbinding if the optimal solution does not fall on the line representing the constraint.
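The substitution carried out above is easy to automate. The sketch below computes the slack on each ≤ constraint at the optimal point (6; 5) and labels the constraint accordingly. The data layout and the function name `slack_report` are our own illustrative choices.

```python
# Each constraint of the model in Section 3.6.2:
# (name, coefficients of X and Y, right-hand side), all of the <= type.
constraints = [
    ("Labour time",  (5, 6), 60),
    ("Machine time", (1, 2), 16),
    ("Demand",       (1, 0), 10),
]

def slack_report(x, y):
    """Return {name: (slack, 'binding' or 'nonbinding')} at the point (x; y)."""
    report = {}
    for name, (a, b), rhs in constraints:
        slack = rhs - (a * x + b * y)
        report[name] = (slack, "binding" if slack == 0 else "nonbinding")
    return report

for name, result in slack_report(6, 5).items():
    print(name, result)
```

The labour time and machine time constraints have zero slack, while the demand constraint has a slack of 4.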

3.7 Exercises

1. Holiday Meal Turkey Ranch buys two different brands of turkey feed and blends them to provide a good, low-cost diet for its turkeys. Each brand of feed contains some or all of the three nutritional ingredients essential for fattening turkeys. Each kilogram of brand 1 contains 6 grams of ingredient A, 4 grams of ingredient B and 0,5 grams of ingredient C. Each kilogram of brand 2 contains 9 grams of ingredient A, 3 grams of ingredient B, but nothing of ingredient C. The brand 1 feed costs the Ranch 75c a kilogram, while the brand 2 feed costs R1 a kilogram. The minimum monthly requirement per turkey is 90 grams of ingredient A, 48 grams of ingredient B and 1,5 grams of ingredient C. Formulate an LP model to decide how to mix the two brands of turkey feed so that the minimum monthly intake requirement for each nutritional ingredient is met at minimum cost. Use the graphical approach to solve this model.

2. Solve the following LP models graphically. For each solution state clearly what type of solution it is and identify the binding, nonbinding or redundant constraints, giving reasons for your answers.

(a) Maximise Z = P + Q
subject to
P + 2Q ≤ 6 (1)
2P + Q ≤ 8 (2)
P ≥ 7 (3)
Q ≥ 0.

(b) Maximise Z = M + 3N
subject to
M + N ≤ 25 (1)
2M + N ≤ 30 (2)
N ≤ 35 (3)
and M; N ≥ 0.

3.8 Solutions to exercises

1. Holiday Meal Turkey Ranch data can be summarised as follows:

Ingredient     Composition of each kg of feed (grams)    Minimum monthly
               Brand 1        Brand 2                    requirement (grams)
A              6              9                          90
B              4              3                          48
C              0,5            0                          1,5
Cost per kg    75c            100c

Let
ONE = number of kilograms of brand 1 feed bought monthly,
TWO = number of kilograms of brand 2 feed bought monthly.

DSC2606 3.8. SOLUTIONS TO EXERCISES

The LP model is

Minimise COST = 75ONE + 100TWO
subject to
6ONE + 9TWO ≥ 90 (Ingredient A)
4ONE + 3TWO ≥ 48 (Ingredient B)
0,5ONE ≥ 1,5 (Ingredient C)
and ONE; TWO ≥ 0.

The graphical solution is given in Figure 3.16.

Figure 3.16: Study Unit 4, Exercise 1 (horizontal axis ONE, vertical axis TWO; constraint lines Ingr A, Ingr B and Ingr C; isocost lines COST = 900 and COST = 1 200; optimal point X)

We see that COST decreases as the isocost lines move downwards to the left over the feasible area. The last point that the isocost line touches before leaving the feasible area is point X = (9; 4). This represents the solution resulting in the minimum cost.

Therefore, the optimal solution is to buy 9 kilograms of brand 1 feed and 4 kilograms of brand 2 feed monthly and the associated minimum cost per turkey is

COST = 75ONE + 100TWO
= 75(9) + 100(4)
= 1 075 (cents)
= R10,75.

2. (a) The graphical representation of the constraints is given in Figure 3.17.

Figure 3.17: Study Unit 4, Exercise 2(a) (horizontal axis P, vertical axis Q; constraint lines (1), (2) and (3))

There is no area satisfying all the inequalities simultaneously and so no feasible area exists. We say the LP model (or the problem) is infeasible. There is no solution to this LP model.

(b) The graphical solution to the LP model is given in Figure 3.18. We see that the objective function value Z increases as the isoprofit lines are moved upwards and to the right across the feasible region. The last point where an isoprofit line touches the feasible area before moving out of the feasible area is at point K = (0; 25), and this is the optimal solution. The optimal solution is at M = 0 and N = 25 and the associated maximum value of the objective function is

Z = M + 3N = 0 + 3(25) = 75.

Figure 3.18: Study Unit 4, Exercise 2(b) (horizontal axis M, vertical axis N; constraint lines (1), (2) and (3); isoprofit lines Z = 15 and Z = 30; point K marked as the maximum Z-value)

The optimal solution lies on constraint (1) and so it is a binding constraint. The optimal solution does not lie on constraint (2) and so it is a nonbinding constraint. Constraint (3) does not influence the feasible area and so it is a redundant constraint.


Chapter 4
Computer solutions

Contents
4.1 Introduction
4.2 Using LINGO
4.3 Solving Mark’s problem with LINGO
4.4 Solving Christine’s problem with LINGO
4.5 Exercises
4.6 Solutions to exercises

DSC2606 CHAPTER 4 COMPUTER SOLUTIONS

Sections from prescribed book, Winston
Chapter 4, Section 4.9
Chapter 4, Section 4.12
Chapter 4, Appendices A, B and C

Learning objectives
After completing this study unit you should be able to
• use the computer package LINGO to solve LP models

4.1 Introduction

The computer package LINGO, contained in the prescribed book Wayne Winston: Operations Research, Applications and Algorithms, will be used to solve our problems.

4.2 Using LINGO

Now we want to learn how to use the computer package LINGO. Turn to Winston for instructions on the use of LINGO. Refer to Tutorial Letter 101 for the exact page references. The following hints will help you when keying a model into LINGO:

1. The objective function is keyed in by typing “max” or “min”, followed by the “=” sign and then the equation.
2. LINGO constraints do not have to be preceded by “subject to”, “s.t.” or “st”.
3. Mathematical operators must be added and an asterisk “ * ” is used for multiplication.
4. All LINGO statements must end with a semicolon “ ; ”.
5. A constraint name may be given at the beginning of the line containing the constraint and the name must be enclosed in square brackets “[ ]”.
6. Variable and constraint names may consist of a maximum of 32 characters.
7. A title cannot be given, but comments may be included anywhere in the model provided they are preceded by an exclamation mark “ ! ” and end with a semicolon “ ; ”.

DSC2606 4.3. SOLVING MARK’S PROBLEM WITH LINGO

4.3 Solving Mark’s problem with LINGO

You must be sitting at your computer with the LINGO computer package open. Key Mark’s model into the LINGO “untitled” window and it should appear as shown in Figure 4.1.

Figure 4.1: Mark’s LINGO model

To solve the LINGO model, select the SOLVE command from the LINGO menu. A dialog box appears which should be closed. The solution then appears in the Reports Window, which lies behind the window containing the model. Select the option “2. Reports Window” from the WINDOW menu to see the solution. It should appear as given in Figure 4.2.

Figure 4.2: Mark’s LINGO solution

The optimal solution results are briefly discussed below.

The objective function
The optimal value of the objective function is given in the first part of the output. In Mark’s case, the maximum profit that can be made is R6 350.

The variables
The variables and their values are given in the second part of the output. The decision variables are given in the first column and their optimal values in the second column. The optimal product mix for Mark’s problem is to produce 12 model VH200 and 11 model SB150 television sets daily.

The constraints
The constraints and the value of their slack or surplus are given in the third part of the output. The constraints are given in the first column of the output. It is always advisable to give names to the constraints, as these constraint names will then be printed here. If you do not give names to the constraints, row numbers instead of constraint names will be printed in this first column. This makes it rather difficult to interpret the output.

The second column gives the slack or surplus on constraints. Slacks are associated with ≤ constraints and surpluses with ≥ constraints. In Mark’s case, the labour constraint is a ≤ constraint and therefore the 5 in the second column is a slack. This means that five labour hours are not utilised each day. The labour constraint does not use up all the available resource and is a nonbinding constraint. The machine constraint has a slack of zero, indicating that no machine hours are left over and all the hours are utilised. The market constraint also has a slack of zero, indicating that the maximum number of sets that can be sold daily is produced. The latter two constraints with no slack are binding constraints.

Binding constraints are called “binding” as they “bind” the solution to a problem, that is, they determine the optimal values of the decision variables and the objective function. In Mark’s case, they prevent the objective function value, profit, from increasing. The factors that determine the machine and market constraints must be examined if a larger profit is required.

The columns containing “Reduced Cost” and “Dual Price” in the LINGO output are ignored as they are outside the scope of this module.
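For readers without LINGO at hand, the reported values can be reproduced with a small corner-point search in Python. This is a sketch of a different technique (enumerating the corners of the feasible area), not a description of how LINGO works. The labour constraint 2VH + SB ≤ 40 is quoted in Section 3.6.1; the machine constraint VH + 3SB ≤ 45 and the market constraint VH ≤ 12 are read off the labels of Figure 3.9 and are therefore assumptions, as are the row names used below.

```python
from fractions import Fraction
from itertools import combinations

# Each row is (name, a, b, rhs), meaning a*VH + b*SB <= rhs.
ROWS = [
    ("LABOUR",  2, 1, 40),
    ("MACHINE", 1, 3, 45),   # assumed from Figure 3.9
    ("MARKET",  1, 0, 12),   # assumed from Figure 3.9
    ("VH>=0",  -1, 0, 0),
    ("SB>=0",   0, -1, 0),
]

def intersection(r1, r2):
    """Intersect two boundary lines with Cramer's rule; None if they are parallel."""
    _, a1, b1, c1 = r1
    _, a2, b2, c2 = r2
    det = a1 * b2 - a2 * b1
    if det == 0:
        return None
    return (Fraction(c1 * b2 - c2 * b1, det), Fraction(a1 * c2 - a2 * c1, det))

def is_feasible(point):
    vh, sb = point
    return all(a * vh + b * sb <= rhs for _, a, b, rhs in ROWS)

def profit(point):
    return 300 * point[0] + 250 * point[1]

def solve():
    """Return the best feasible corner, its profit and the slack on each resource row."""
    corners = set()
    for r1, r2 in combinations(ROWS, 2):
        p = intersection(r1, r2)
        if p is not None and is_feasible(p):
            corners.add(p)
    best = max(corners, key=profit)
    slacks = {name: rhs - (a * best[0] + b * best[1]) for name, a, b, rhs in ROWS[:3]}
    return best, profit(best), slacks

print(solve())
```

Under these assumptions the search returns the point (12; 11), a profit of 6 350 and slacks of 5, 0 and 0, in agreement with the output discussed above.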

DSC2606 4.4. SOLVING CHRISTINE’S PROBLEM WITH LINGO

4.4 Solving Christine’s problem with LINGO

The LINGO model and its solution are given in Figure 4.3 and Figure 4.4 respectively.

Figure 4.3: Christine’s LINGO model

Figure 4.4: Christine’s LINGO solution

The optimal solution is to use 83,33 grams of ALPHA and 216,67 grams of BETA in each tin of Sungold paint for a minimum cost of R63,50 per tin (cost was given in cents). Both constraints are ≥ constraints and therefore the SLACK or SURPLUS values given in the output are in fact surpluses. Since these surplus values are zero, both constraints are binding, indicating that the requirements for brilliance and tint have been met exactly.


4.5 Exercises

1. Solve the following LP model with LINGO:

Minimise z = x1 − x2 − 3x3 − 5x4
subject to
3x1 − 4x2 + x3 − x4 ≤ 2
5x2 − 5x3 − 2x4 ≤ 8
x1 − x2 + 2x3 + x4 ≤ 7
and x1; x2; x3; x4 ≥ 0.

2. Photo Chemicals produces two types of picture-developing fluid. Product 1 costs Photo Chemicals R2 per litre to produce and product 2 costs Photo Chemicals R2,25 per litre to produce. At least 30 litres of product 1 and at least 20 litres of product 2 must be produced during the next two weeks. The perishable raw material needed to produce these two products will spoil within the next two weeks if not used. Product 1 requires one kilogram and product 2 requires two kilograms of this raw material per litre. Management requires that at least 80 kilograms of the raw material must be used in the next two weeks. Formulate this problem as an LP model. Solve with LINGO.

4.6 Solutions to exercises

1. The LINGO model and its solution are given in Figure 4.5 and Figure 4.6 respectively.

Figure 4.5: Study Unit 4, Exercise 1 LINGO model

The optimal solution that results from LINGO is x1 = 0; x2 = 7,333; x3 = 0; x4 = 14,333; z = −79.
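These values can be checked by substituting them back into the model. The sketch below reads the rounded values 7,333 and 14,333 as the fractions 22/3 and 43/3, which is our assumption, and then evaluates the objective and the slack on each constraint. It confirms that the point is feasible with the stated objective value; it does not by itself prove optimality.

```python
from fractions import Fraction

# Reported solution, with 7,333 and 14,333 read as 22/3 and 43/3 (an assumption).
x1, x2, x3, x4 = Fraction(0), Fraction(22, 3), Fraction(0), Fraction(43, 3)

z = x1 - x2 - 3 * x3 - 5 * x4

# (left-hand side, right-hand side) of the three <= constraints.
rows = [
    (3 * x1 - 4 * x2 + x3 - x4, 2),
    (5 * x2 - 5 * x3 - 2 * x4, 8),
    (x1 - x2 + 2 * x3 + x4, 7),
]
slacks = [rhs - lhs for lhs, rhs in rows]

print(z)                           # objective value
print([float(s) for s in slacks])  # slack on each constraint
```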

DSC2606 4.6. SOLUTIONS TO EXERCISES

Figure 4.6: Study Unit 4, Exercise 1 LINGO solution

The first constraint has a slack of 45,667 and is a nonbinding constraint. The second and third constraints have no slack and are therefore binding constraints.

2. Let
P1 = number of litres of product 1 produced in the next two weeks,
P2 = number of litres of product 2 produced in the next two weeks.

The LP model is

Minimise COST = 2P1 + 2,25P2
subject to
P1 ≥ 30
P2 ≥ 20
P1 + 2P2 ≥ 80
and P1; P2 ≥ 0.

The LINGO model and its solution are shown in Figure 4.7 and Figure 4.8 respectively. Note that decimal commas are keyed in as points in LINGO; we key in 2.25 instead of 2,25.


Figure 4.7: Photo Chemicals LINGO model

Figure 4.8: Photo Chemicals LINGO solution

The optimal solution is to produce 30 litres of product 1 and 25 litres of product 2 in the next two weeks. The minimum cost will be R116,25. Since there is no surplus on the first constraint, the required minimum quantity is produced. Since there is a surplus of five on the second constraint, five litres more than the required minimum quantity is produced. Since there is no surplus on the third constraint, the minimum quantity that must be used, 80 kilograms, is in fact used. The first and third constraints are binding and the second constraint is nonbinding.


Part 2 Nonlinear Programming

This part introduces the concept of nonlinear mathematical programming. The concepts of a limit and the derivative of a function form background knowledge. These concepts are discussed in Study unit 7 and in Study unit 8. You must familiarise yourself with this background to limits and differentiation, but you will not be examined on these two study units explicitly. You will, however, be examined explicitly on all of the material in all the other study units in this part.


Chapter 5
Introductory concepts

Contents
5.1 Introduction
5.2 Linear versus nonlinear
5.3 Examples of NLP categories
5.4 Assumptions of NLP
5.5 Solutions to models – basic concepts
5.6 Exercises
5.7 Solutions to exercises

DSC2606 CHAPTER 5 INTRODUCTORY CONCEPTS

Sections from prescribed book, Winston
Chapter 11, Section 11.2

Learning objectives
After completing this study unit you should be able to
• understand the definitions of linear and nonlinear functions
• determine whether a function is linear or nonlinear
• distinguish between LP and NLP models
• identify the objective function and the constraints of an NLP model
• distinguish between constrained and unconstrained NLPs
• understand why NLPs do not necessarily satisfy the assumptions of LPs
• check whether a particular point is in the feasible area of an NLP or not (the same as for LP models)
• explain differences between LPs and NLPs concerning the positioning of the optimal solution on the feasible area
• explain what is meant by local extremum.

5.1 Introduction

In the first part of this study guide we studied linear programming (LP). Our goal was to optimise (maximise or minimise) the objective function subject to constraints. The objective function and all the constraints were linear equations and/or linear inequalities.

In many optimisation problems, however, the objective function may not be a linear function and/or some or all of the constraints need not be linear equations and/or inequalities. An optimisation problem containing nonlinear equations and/or inequalities is called a nonlinear programming (NLP) problem. In this second part of the study guide, we discuss techniques that are used to solve NLP problems.

DSC2606 5.2. LINEAR VERSUS NONLINEAR

5.2 Linear versus nonlinear

In the LP part of this study guide we saw that a linear function is one where each term is either a number or a number multiplied by a symbol. The terms do not contain squares, exponents, cross-products, square roots etc. The definition of a linear function as given in the prescribed textbook, Winston, is as follows:

Definition 5.1 A function $f(x_1; x_2; \dots; x_n)$ of $x_1; x_2; \dots; x_n$ is a linear function if and only if for some set of constants $c_1; c_2; \dots; c_n$,
$$f(x_1; x_2; \dots; x_n) = c_1 x_1 + c_2 x_2 + \dots + c_n x_n.$$

A function where some terms contain exponents, cross-products, square roots etc. is by implication a nonlinear function. For example, $f(x_1; x_2) = 3x_1 + 2x_2$ is a linear function and $f(x_1; x_2) = x_1 x_2^3$ is a nonlinear function of $x_1$ and $x_2$.

Definition 5.2 For any linear function $f(x_1; x_2; \dots; x_n)$ and any number $b$, the inequalities $f(x_1; x_2; \dots; x_n) \le b$ and $f(x_1; x_2; \dots; x_n) \ge b$ are linear inequalities.

For example, $4x_1 + 3x_2 \le 3$ and $2x_1 + 5x_2 \ge 4$ are linear inequalities, but $x_1^{1/2} x_2^3 \ge 5$ is a nonlinear inequality.

The general form of an LP model was given in Section 2.5. Winston gives the general form of an NLP as follows:


Definition 5.3 A general nonlinear programming problem (NLP) (labelled (1)) can be expressed as follows: Find the values of decision variables $x_1; x_2; \dots; x_n$ that

$$\begin{aligned}
\text{max (or min)}\quad & z = f(x_1; x_2; \dots; x_n) \\
\text{subject to}\quad & g_1(x_1; x_2; \dots; x_n)\ (\le, =, \text{or} \ge)\ b_1 \\
& g_2(x_1; x_2; \dots; x_n)\ (\le, =, \text{or} \ge)\ b_2 \\
& \qquad\vdots \\
& g_m(x_1; x_2; \dots; x_n)\ (\le, =, \text{or} \ge)\ b_m.
\end{aligned}$$

As in LP, the function $z = f(x_1; x_2; \dots; x_n)$ is the NLP’s objective function, and

$$\begin{aligned}
& g_1(x_1; x_2; \dots; x_n)\ (\le, =, \text{or} \ge)\ b_1 \\
& \qquad\vdots \\
& g_m(x_1; x_2; \dots; x_n)\ (\le, =, \text{or} \ge)\ b_m
\end{aligned}$$

are the NLP’s constraints. Such an NLP is called a constrained NLP. If an NLP does not have any constraints, it is called an unconstrained NLP. Obviously if $f; g_1; g_2; \dots; g_m$ are all linear functions, then (1) is a linear programming model.

5.3 Examples of NLP categories

Nonlinear programming models fall into one of two categories: constrained NLPs or unconstrained NLPs. Winston gives the following example of an unconstrained NLP:

Example 5.1 It costs a company c rand to produce one unit of a product. The demand for the product depends on the price charged per unit, p rand, and this relationship is D(p). The company wants to know what price per unit to charge for the product so as to maximise its profit.


DSC2606 5.4. ASSUMPTIONS OF NLP

Solution
The company’s decision variable is p. The profit function is

P(p) = Profit per unit × number demanded = (p − c) × D(p).

Therefore, the company must maximise the following unconstrained NLP:

Maximise P(p) = (p − c)D(p).

Winston gives the following example of a constrained NLP:

Example 5.2 A company can produce KL units of a product if K units of capital and L units of labour are used. Capital can be purchased at R4 per unit and labour at R1 per unit. A total of R8 is available to purchase capital and labour. The company wants to maximise the quantity of products manufactured.

Solution
Let K and L represent the quantities of capital and labour purchased, respectively. Then K and L must satisfy 4K + L ≤ 8, K ≥ 0 and L ≥ 0. Therefore, the company must maximise the following constrained NLP:

Maximise Z = KL
subject to 4K + L ≤ 8
and K; L ≥ 0.

5.4 Assumptions of NLP

In the first part of this study guide we saw that for an LP model to be an appropriate representation of a real-life situation, certain assumptions must be satisfied. Unlike an LP, an NLP may not satisfy the Proportionality and Additivity assumptions.

If we refer back to Example 5.2, we see that if we increase L by one unit, then Z increases by K units. Therefore, the effect on the objective function value, Z, of increasing L by one depends on K. This means that the Additivity assumption is not satisfied.

Consider the following NLP:

Maximise $z = x^{1/3} + y^{1/3}$
subject to x + y = 1
and x; y ≥ 0.

Doubling the value of x does not double the contribution of x to the objective function value, z. Therefore, the Proportionality assumption is not satisfied.
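Both failures can be seen with a few numbers. The sketch below uses the objective of Example 5.2 to show that the gain from one extra unit of L equals K, and the cube-root objective above to show that doubling x multiplies its contribution by the cube root of 2 rather than by 2. The sample values are arbitrary choices of ours.

```python
def z_kl(K, L):
    """Objective of Example 5.2: Z = KL."""
    return K * L

# Additivity fails: the gain from one extra unit of L depends on the level of K.
gains = [z_kl(K, 5) - z_kl(K, 4) for K in (1, 2, 3)]
print(gains)

def contribution(x):
    """Contribution of x to z = x^(1/3) + y^(1/3)."""
    return x ** (1 / 3)

# Proportionality fails: doubling x from 0.25 to 0.5 does not double its contribution.
ratio = contribution(0.5) / contribution(0.25)
print(round(ratio, 4))
```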

5.5 Solutions to models – basic concepts

Refer to Winston, Section 11.2 and study the following:
• a feasible region
• an optimal solution
• differences between LPs and NLPs concerning the positioning of the optimal solution on the feasible region
• local extremum.

You will find the exact page references in Tutorial Letter 101.

5.6 Exercises

1. For each of the following functions, state whether the function is linear or nonlinear:

(a) $f(x) = e^x$
(b) $g(y) = y^2 + 2y + 4$
(c) $f(K; L) = 4K + L$
(d) $F(K; L) = KL$
(e) $g(x; y) = x^{1/3} + y^{1/3}$
(f) $h(x; y) = x + y$
(g) $Z(A; A_1; A_2) = \dfrac{0{,}04A_1 + 0{,}01A_2}{A}$
(h) $Z(A_1; A_2) = \dfrac{0{,}04A_1 + 0{,}01A_2}{0{,}01}$
(i) $D(X; Y) = (X - 5)^2 + (Y - 5)^2$
(j) $d(X; Y) = \sqrt{(X - 5) + (Y - 5)}$

2. Consider the following NLP model:

Minimise $Z = (x - 1)^2 + (y - 1)^2$
subject to
x − 2y ≥ −2
x + 2y ≤ 10
x − y ≤ 4
x ≥ 2
y ≥ 0.

(a) Are the following points in the feasible area or not? (6; 1), (6; 2), (3; 1), (3; 3), (2; 1), (2; 2)
(b) For each of the feasible points in (a), calculate the objective function value.
(c) With the information available from (a) and (b), can we say that the point (2; 1) is a local minimum of the NLP?

5.7 Solutions to exercises

1. (a) Nonlinear
(b) Nonlinear
(c) Linear
(d) Nonlinear
(e) Nonlinear
(f) Linear
(g) Nonlinear
(h) Linear
(i) Nonlinear
(j) Nonlinear
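Some of these classifications can be spot-checked numerically. A function of the form in Definition 5.1 satisfies f(u + v) = f(u) + f(v) and f(ku) = kf(u) for all points u, v and numbers k, so a single violation proves nonlinearity, while passing many random trials only suggests linearity. The helper `looks_linear` below is our own illustrative sketch, with decimal points in place of decimal commas.

```python
import math
import random

def looks_linear(f, n_vars, trials=200, tol=1e-9):
    """Spot-check additivity and homogeneity of f at random points.

    A failure proves f is nonlinear; passing every trial only suggests linearity.
    """
    rng = random.Random(0)  # fixed seed so the check is repeatable
    for _ in range(trials):
        u = [rng.uniform(1, 10) for _ in range(n_vars)]
        v = [rng.uniform(1, 10) for _ in range(n_vars)]
        k = rng.uniform(1, 5)
        add_ok = math.isclose(f(*[a + b for a, b in zip(u, v)]),
                              f(*u) + f(*v), rel_tol=tol)
        hom_ok = math.isclose(f(*[k * a for a in u]), k * f(*u), rel_tol=tol)
        if not (add_ok and hom_ok):
            return False
    return True

checks = {
    "(b) y^2 + 2y + 4": (lambda y: y ** 2 + 2 * y + 4, 1),
    "(c) 4K + L": (lambda K, L: 4 * K + L, 2),
    "(d) KL": (lambda K, L: K * L, 2),
    "(f) x + y": (lambda x, y: x + y, 2),
    "(h) (0.04A1 + 0.01A2)/0.01": (lambda A1, A2: (0.04 * A1 + 0.01 * A2) / 0.01, 2),
}
for label, (f, n) in checks.items():
    print(label, "linear" if looks_linear(f, n) else "nonlinear")
```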

2. (a) For a point to be in the feasible area, it must satisfy all five of the constraints.

Consider point (6; 1).
Constraint 1: x − 2y = 6 − 2(1) = 4 > −2 ⇒ satisfied
Constraint 2: x + 2y = 6 + 2(1) = 8 < 10 ⇒ satisfied
Constraint 3: x − y = 6 − 1 = 5 ≰ 4 ⇒ not satisfied
The third constraint is not satisfied. Therefore, point (6; 1) is not in the feasible area.

Consider point (6; 2).
Constraint 1: 6 − 2(2) = 2 > −2 ⇒ satisfied
Constraint 2: 6 + 2(2) = 10 = 10 ⇒ satisfied
Constraint 3: 6 − 2 = 4 = 4 ⇒ satisfied
Constraint 4: 6 > 2 ⇒ satisfied
Constraint 5: 2 > 0 ⇒ satisfied
All five constraints are satisfied. Therefore, point (6; 2) is in the feasible area.

If we apply this method to the remaining points, we obtain the following result:
The points in the feasible area are (6; 2), (3; 1), (2; 2) and (2; 1).
The points not in the feasible area are (6; 1) and (3; 3).

(b) The objective function values of the feasible points are as follows:
Point (6; 2): $Z = (x - 1)^2 + (y - 1)^2 = (6 - 1)^2 + (2 - 1)^2 = 5^2 + 1^2 = 26$
Point (3; 1): $Z = (3 - 1)^2 + (1 - 1)^2 = 4$
Point (2; 2): $Z = (2 - 1)^2 + (2 - 1)^2 = 2$
Point (2; 1): $Z = (2 - 1)^2 + (1 - 1)^2 = 1$

(c) No. Even though the objective function value at the point (2; 1) is less than that at any other point considered, we have not looked at all feasible points in the neighbourhood of (2; 1).
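The point-by-point checks in (a) and (b) can be written as two small functions. The sketch below is one possible way of doing so; the function names are ours.

```python
def feasible(x, y):
    """True if (x; y) satisfies all five constraints of the NLP in Exercise 2."""
    return (x - 2 * y >= -2 and x + 2 * y <= 10
            and x - y <= 4 and x >= 2 and y >= 0)

def objective(x, y):
    """Objective function Z = (x - 1)^2 + (y - 1)^2."""
    return (x - 1) ** 2 + (y - 1) ** 2

points = [(6, 1), (6, 2), (3, 1), (3, 3), (2, 1), (2, 2)]
for pt in points:
    if feasible(*pt):
        print(pt, "feasible, Z =", objective(*pt))
    else:
        print(pt, "not feasible")
```

Evaluating a handful of points in this way cannot settle question (c), since a local minimum must be compared with every feasible point in its neighbourhood.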


Chapter 6
Formulating NLP models and computer solutions

Contents
6.1 Introduction
6.2 A dimensions problem
6.3 An inventory problem
6.4 LINGO
6.5 Formulation and LINGO solution of NLP models
6.6 Exercises
6.7 Solutions to exercises
6.8 Looking ahead at the remaining study units

DSC2606 CHAPTER 6 FORMULATING NLP MODELS AND COMPUTER SOLUTIONS

Sections from prescribed book, Winston
Chapter 11, Section 11.2

Learning objectives
After completing this study unit you should be able to
• formulate an NLP model for a given problem
• solve an NLP model with LINGO.

6.1 Introduction

In Part 1 of this study guide, the formulation (modelling) process, that is, the construction of models to represent problem situations, was described in detail and many examples were given. Only linear functions were considered there. The principles and guidelines given there for modelling are, however, also applicable when modelling nonlinear functions. You should refer back to the relevant study units to refresh your memory.

We will now illustrate the modelling of problem situations where nonlinear functions are involved by means of two problems: a dimensions problem and an inventory problem.

6.2 A dimensions problem

A rectangular piece of cardboard, which is 16 centimetres long and 10 centimetres wide, must be turned into an open box. This can be done by cutting away identical squares from each corner of the cardboard and folding up the resultant flaps. We want to find the dimensions of the box that will yield the maximum volume for the box.

Decision variable
Let x denote the length (in centimetres) of one side of each of the identical squares to be cut out of the cardboard. This problem is illustrated in Figure 6.1.

Figure 6.1: Dimensions of a box problem (a 16 by 10 rectangle with a square of side x marked at each corner, leaving sides of 16 − 2x and 10 − 2x)

We see that the dimensions of the box are as follows:
length = (16 − 2x) cm
width = (10 − 2x) cm
height = x cm.

Objective function
The volume of the box must be maximised. If we let V(x) represent the volume (in cubic centimetres) of the box, we have
\[
\begin{aligned}
V(x) &= \text{length} \times \text{width} \times \text{height} \\
&= (16 - 2x) \times (10 - 2x) \times x \\
&= (160 - 52x + 4x^2)x \\
&= 160x - 52x^2 + 4x^3 \\
&= 4(x^3 - 13x^2 + 40x).
\end{aligned}
\]
The objective function is then
Maximise $V(x) = 4(x^3 - 13x^2 + 40x)$.

Constraints
Each side of the box must be nonnegative. Therefore,
$16 - 2x \geq 0 \Rightarrow 2x \leq 16 \Rightarrow x \leq 8$
$10 - 2x \geq 0 \Rightarrow 2x \leq 10 \Rightarrow x \leq 5$
$x \geq 0$.
These three inequalities must be satisfied simultaneously. So $0 \leq x \leq 5$.

The NLP model is
Maximise $V(x) = 4(x^3 - 13x^2 + 40x)$
subject to $0 \leq x \leq 5$.
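To get a feel for this model before any solver is used, the volume can simply be evaluated at many values of x in the feasible interval. The sketch below is an illustration only: it is a brute-force grid search in Python, not the method used by LINGO, and the step size of 0,001 cm is an assumption.

```python
def volume(x):
    """Volume (cubic cm) of the open box when squares of side x cm are cut out."""
    return (16 - 2 * x) * (10 - 2 * x) * x


# Assumption: a grid with step 0.001 cm over the feasible interval 0 <= x <= 5.
STEP_COUNT = 5000
grid = [5 * i / STEP_COUNT for i in range(STEP_COUNT + 1)]

best_x = max(grid, key=volume)   # grid point with the largest volume
best_volume = volume(best_x)

print(f"x = {best_x:.3f} cm, V = {best_volume:.3f} cubic cm")
```

A grid search only compares the sampled points, so it suggests where the maximum lies without proving it.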

6.3 An inventory problem

One problem faced by many companies is that of controlling the inventory of goods carried. Ideally, the manager must ensure that the company has sufficient stock to meet customer demand at all times. At the same time he must make sure that this is accomplished without overstocking (incurring unnecessary storage costs) and also without having to place orders too frequently (incurring reordering costs). We now consider an inventory problem.

The Dixie Company is the sole agent for the Excalibur 250 cc motorcycle. Management estimates that the annual demand for these motorcycles is 10 000 and that they will sell at a uniform rate throughout the year. The cost incurred in ordering each shipment of motorcycles is R10 000 and the cost per year of storing each motorcycle is R200.

Dixie’s management faces the following problem. Ordering too many motorcycles at one time ties up valuable storage space and increases the storage cost. On the other hand, placing orders too frequently increases the ordering costs. They obviously want to minimise ordering and storage costs, and need to determine the size of each order and how often the orders should be placed to achieve this.

Decision variable
Let x denote the number of motorcycles in each order (the lot size). Then, assuming that each shipment arrives just as the previous shipment has been sold, the average number of motorcycles in storage during the year is $\frac{x}{2}$. This is illustrated in Figure 6.2.

Figure 6.2: Inventory problem (inventory level against time, with the levels x and x/2 marked; x/2 is the average inventory)

Objective function
The annual costs, ordering and storage costs, must be minimised.

Dixie’s annual storage cost is given by
\[
\text{storage cost per motorcycle} \times \text{number of motorcycles in storage} = 200 \times \frac{x}{2} = 100x.
\]
Since the annual demand is 10 000 motorcycles and each lot size (order size) is x motorcycles, the number of orders per year will be $\frac{10\,000}{x}$.

Dixie’s annual ordering cost is given by
\[
\text{ordering cost per order} \times \text{number of orders} = 10\,000 \times \frac{10\,000}{x} = \frac{100\,000\,000}{x}.
\]
Therefore, Dixie’s total annual cost is
\[
\text{ordering cost} + \text{storage cost} = \frac{100\,000\,000}{x} + 100x.
\]
The objective function is then
Minimise $COSTS = 100\,000\,000x^{-1} + 100x$.

Constraints
The symbol x represents the number of motorcycles in each order. The maximum value that x can assume will be the annual demand of 10 000 motorcycles. This will result if only one order is placed annually. Therefore, $x \leq 10\,000$. Obviously x cannot be negative, so $x \geq 0$.

The NLP model is
Minimise $COSTS = 100\,000\,000x^{-1} + 100x$
subject to $0 \leq x \leq 10\,000$.

NOTE: An actual inventory process does not necessarily have a straight line depletion graph as illustrated in this example. More sophisticated mathematics can be used to model it as a stochastic process.
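The trade-off between ordering cost and storage cost can be seen by evaluating the total cost for every whole-number lot size. The following Python sketch is an illustration under two assumptions that are not part of the model above: lot sizes are restricted to whole motorcycles, and x = 0 is skipped because the cost is undefined there.

```python
def annual_cost(x):
    """Total annual cost (rand): ordering cost plus storage cost for lot size x."""
    ordering_cost = 100_000_000 / x   # 10 000 per order times 10 000 / x orders
    storage_cost = 100 * x            # 200 per motorcycle times x / 2 in storage
    return ordering_cost + storage_cost


# Assumption: only whole-number lot sizes from 1 to the annual demand are tried.
lot_sizes = range(1, 10_001)
best_lot = min(lot_sizes, key=annual_cost)
best_cost = annual_cost(best_lot)
orders_per_year = 10_000 / best_lot

print(best_lot, best_cost, orders_per_year)
```

Small lot sizes make the ordering term very large and large lot sizes make the storage term large, so the minimum lies where the two terms balance.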

6.4 LINGO

The computer package LINGO can be used to solve NLP models. (LINDO cannot solve NLP models.)

In Part 1 of this study guide, you learnt how to use LINGO. You should turn back to Study unit 4.2 to refresh your memory. We now want to use LINGO to solve the two problems already formulated in this study unit. Before doing so, however, here are some additional hints that will help you when keying the model into LINGO:

(a) The symbol “*” is used for multiplication. For example, to type 4x, we key in 4*x.
(b) The symbol “^” is used to indicate raising to a power. For example, to type $x^2$, we key in x^2.
(c) There is no symbol that can be used for division. You must do the division on a calculator and then key in the resulting decimal. Use the point “.” to represent the decimal comma. For example, to type 5,4, we key in 5.4.
(d) To avoid confusion, enclose terms in round brackets whenever necessary. For example, to type $x^{y+2}$, we key in x^(y + 2). If the brackets are omitted and x^y + 2 is keyed in, LINGO will interpret it as $x^y + 2$.
(e) LINGO assumes that all variables are nonnegative; so we do not need to key this into LINGO.
(f) LINGO treats “<” and “>” as non-strict inequalities; so we can key in “<” and “>” to represent ≤ and ≥.

We now return to the dimensions problem of Section 6.2 and solve it with LINGO. The LINGO model is given in Figure 6.3 and its solution in Figure 6.4.

Figure 6.3: Dimensions model

From the solution in Figure 6.4, we find x = 2 and objective value = 144. A square with sides of two centimetres must therefore be cut out from each corner of the cardboard.

Figure 6.4: Dimensions solution

The dimensions of the box can then be calculated as
length = (16 − 2x) = 16 − 2(2) = 12 cm
width = (10 − 2x) = 10 − 2(2) = 6 cm
height = 2 cm.
The maximum volume of this box is then 144 cm³.

We now return to the inventory problem of Section 6.3 and solve it with LINGO. The LINGO model is given in Figure 6.5 and its solution in Figure 6.6.

Figure 6.5: Inventory model

The solution in Figure 6.6 gives x = 1 000 and optimal value of 200 000. The number of motorcycles in each order should be 1 000 and the resulting minimum costs will be R200 000. The number of orders to be placed annually can now be calculated as
\[
\frac{\text{annual demand}}{\text{order size}} = \frac{10\,000}{1\,000} = 10.
\]

Figure 6.6: Inventory solution

We therefore conclude that ten orders should be placed annually and each order should contain 1 000 motorcycles. The associated minimum ordering and storage costs will then be R200 000.

6.5 Formulation and LINGO solution of NLP models

We now need to practise our formulation skills and see whether we can solve our models with LINGO. Refer to Winston, Section 11.2 and work through all the given examples. You must try to formulate the models yourself before referring to the given models. You must also practise solving these models with LINGO by actually doing so on your own computer.

Winston states that there is no guarantee that the LINGO solution found is in fact the optimal solution to the model. It may be a solution associated with a local extremum, which is not necessarily the overall absolute extremum of the function. There are, however, criteria that must apply for us to be sure that the LINGO solution is the optimal solution. We will have to extend our study of functions to enable us to understand these criteria. This we will do in the following study units.

6.6 Exercises

1. The area of a triangle with sides of length a, b, and c is
\[
\sqrt{s(s - a)(s - b)(s - c)},
\]
where s is half the perimeter of the triangle.

A triangular area must be fenced and 60 metres of fencing is available for this purpose.
(a) Formulate an NLP model that will maximise the fenced area.
(b) Use LINGO to solve the model and explain the solution.

6.7 Solutions to exercises

1. (a) The triangular area is illustrated in Figure 6.7.

Figure 6.7: Triangular area (a triangle with sides a, b and c)

Decision variables
The decision variables are
a = length in metres of first side of triangle,
b = length in metres of second side of triangle,
c = length in metres of third side of triangle,
s = half the perimeter of the triangle.

Objective function
The area of the triangle must be maximised. The objective function is
Maximise $AREA = \sqrt{s(s - a)(s - b)(s - c)}$.

Constraints
The variable s is half the perimeter of the triangle; $s = \frac{1}{2}(a + b + c)$.
Sixty metres of fencing is available. The perimeter of the area may not exceed the available fencing. Therefore, $a + b + c \leq 60$.

Sign restrictions
Obviously the lengths of the sides of the triangle may not be negative. So $a; b; c \geq 0$.

The NLP model is
Maximise $AREA = \sqrt{s(s - a)(s - b)(s - c)}$
subject to
$s = \frac{1}{2}(a + b + c)$
$a + b + c \leq 60$
and $a; b; c \geq 0$.

(b) The LINGO model and its solution are given in Figure 6.8 and Figure 6.9 respectively.

Figure 6.8: Triangular model

Figure 6.9: Triangular solution

From the solution we see that a = b = c = 20 and the objective value is 173,205.

This means that a maximum area of 173,205 square metres will be fenced if the sides of the triangle are all of equal length, that is, 20 metres.
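The reported objective value can be checked directly from the area formula. The Python sketch below is an illustration only; the function name `heron_area` and the three comparison triangles (each using all 60 metres of fencing) are choices made here, not part of the exercise.

```python
import math


def heron_area(a, b, c):
    """Area of a triangle with sides a, b, c, where s is half the perimeter."""
    s = (a + b + c) / 2
    product = s * (s - a) * (s - b) * (s - c)
    if product < 0:
        raise ValueError("the side lengths do not form a triangle")
    return math.sqrt(product)


equilateral_area = heron_area(20, 20, 20)

# Hypothetical comparison triangles, each with a perimeter of 60 metres.
other_triangles = [(25, 25, 10), (15, 20, 25), (10, 24, 26)]
other_areas = {sides: heron_area(*sides) for sides in other_triangles}

print(f"equilateral: {equilateral_area:.3f}")
for sides, area in other_areas.items():
    print(f"{sides}: {area:.3f}")
```

Each comparison triangle encloses less area than the equilateral one, which is consistent with the LINGO solution, although three examples do not by themselves establish optimality.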

6.8 Looking ahead at the remaining study units

The objective functions we encountered when formulating NLP models were either functions of one variable only (eg the dimensions of the box problem and the inventory problem), or functions of several variables (eg Oilco’s problem in Winston and the triangular area problem). We also saw that we can conclude that the solution obtained with LINGO is indeed the optimal solution to the model only if certain criteria apply, and that we will have to extend our study of functions, of one and several variables, to understand these criteria.

In the following study units we will return to the basics of functions and how to solve them. To do this we need differential calculus. We start off by introducing differential calculus by considering functions of one variable only. After covering the basic concepts (Study units 7–10), we will proceed to the practical applications (Study units 11–14). We will also introduce integral calculus and its practical applications (Study unit 15). In later study units we proceed to functions of several variables; first covering the basic concepts of partial differential calculus (Study unit 16) and then the practical applications (Study units 17–20).

Chapter 7

Limits and continuity

Contents
7.1 Introduction
7.2 The limit of a function
7.3 Infinite limits
7.4 Limits when x tends to infinity
7.5 Continuity
7.6 Exercises
7.7 Solutions to exercises

Sections from prescribed book, Winston
Chapter 11, Section 11.1

Learning objectives
After completing this study unit you should be able to
• explain the concept of the limit of a function
• understand how limits are evaluated
• explain the difference between continuous and discontinuous functions.

7.1 Introduction

We now begin the study of differential calculus. These notes were compiled with the following book as reference: Mathematics for the Managerial, Life and Social Sciences by S.T. Tan.

Historically, the calculus of differentiation was developed in response to the problem of finding the tangent line to an arbitrary curve. But it quickly became apparent that solving this problem provided mathematicians with a method for solving many practical problems involving the rate of change of one quantity with respect to another. The basic tool used in differential calculus is the derivative of a function. The concept of the derivative is based, in turn, on a more fundamental notion, that of the limit of a function.

This study unit introduces the concept of a limit which is necessary for understanding the derivative of a function. You must familiarise yourself with this background to differentiation, but you will not be examined on this study unit explicitly.

7.2 The limit of a function

The concept of the limit of a function bridges the gap between the mathematics of algebra and geometry and the mathematics of calculus. Although the idea of a limit is somewhat difficult to understand at first, the evaluation of limits is fairly easy as you will see in this section.

Consider the function
\[
f(x) = \frac{x^2 - x - 2}{x - 2}.
\]
It is clear that the function is not defined at x = 2, since division by zero is undefined. However, what happens when x gets close to 2? Let us tabulate a few points close to 2 and see what happens to the value of the function f(x). This is shown in Figure 7.1.

x        f(x)
1,85     2,85
1,90     2,90
1,95     2,95
1,99     2,99
1,999    2,999
→ 2      → 3
2,001    3,001
2,01     3,01
2,05     3,05
2,10     3,10
2,15     3,15

Figure 7.1: Values of f(x) close to x = 2

We notice that the value of f(x) gets closer to 3 as the value of x gets closer to 2 (from both sides). When this is the case, we say that the limit of f(x) when x approaches 2 is equal to 3 and we use the following notation:
\[
\lim_{x \to 2} f(x) = 3.
\]
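The table in Figure 7.1 is easy to regenerate. The short Python sketch below is an illustration only; it evaluates f at the same x-values and shows that the values cluster around 3 even though f(2) itself cannot be computed.

```python
def f(x):
    """f(x) = (x^2 - x - 2) / (x - 2); undefined at x = 2."""
    return (x * x - x - 2) / (x - 2)


X_VALUES = [1.85, 1.90, 1.95, 1.99, 1.999, 2.001, 2.01, 2.05, 2.10, 2.15]

for x in X_VALUES:
    print(f"x = {x:<6} f(x) = {f(x):.3f}")

# Evaluating at x = 2 itself fails, because the denominator is zero there.
try:
    f(2)
    defined_at_two = True
except ZeroDivisionError:
    defined_at_two = False
```

A table of values only suggests the limit; the algebraic simplification of f is what justifies it.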

We can simplify the function and obtain the following:
\[
f(x) = \frac{x^2 - x - 2}{x - 2} = \frac{(x - 2)(x + 1)}{(x - 2)} = x + 1 \quad \text{for } x \neq 2.
\]
The graph of this function is given in Figure 7.2.

Figure 7.2: $f(x) = x + 1$ for $x \neq 2$ (a straight line with an open circle at x = 2)

From Figure 7.2 we can clearly see that $\lim_{x \to 2} f(x) = 3$.

If we follow the line representing f(x) from the left and from the right of x = 2, it is obvious that f(x) tends to 3. Note that there is a gap in the line as f(x) is not defined at x = 2.

Let us now consider a function that is defined at the point under consideration and whose graph is regular at this point, say $f(x) = x + 1$ for all $x \in \mathbb{R}$. The graph of this function is given in Figure 7.3.

Figure 7.3: $f(x) = x + 1$ for all $x \in \mathbb{R}$

Here the function is defined at point x = 2 and $\lim_{x \to 2} f(x) = 3$.

We now substitute x = 2 into the function $f(x) = x + 1$ and we get $f(2) = 2 + 1 = 3$. Therefore, we can say that
\[
\lim_{x \to 2} f(x) = 3 = f(2).
\]

Consider the following function:
\[
f(x) = \begin{cases} x + 1 & \text{for } x \neq 2 \\ 5 & \text{for } x = 2. \end{cases}
\]
From the graph of this function in Figure 7.4 we see that, although f(2) = 5, the value of f(x) tends towards 3 as x approaches 2. Once again $\lim_{x \to 2} f(x) = 3$.

Figure 7.4: $f(x) = x + 1$ for $x \neq 2$ and $f(x) = 5$ for $x = 2$

We now consider another function and its graph is in Figure 7.5:
\[
f(x) = \begin{cases} 1 & \text{for } x \geq 0 \\ -1 & \text{for } x < 0. \end{cases}
\]

Figure 7.5: $f(x) = 1$ for $x \geq 0$ and $-1$ for $x < 0$

It is clear that f(x) tends towards 1 if x approaches 0 from the right. We write this as
\[
\lim_{x \to 0^+} f(x) = 1,
\]
and call it the limit from the right. Now, when x approaches 0 from the left, f(x) tends to −1. This is written as
\[
\lim_{x \to 0^-} f(x) = -1,
\]
and is called the limit from the left. In this case $\lim_{x \to 0^+} f(x) \neq \lim_{x \to 0^-} f(x)$ and we say that the limit does not exist.

This leads to a theorem.

Theorem 7.1
$\lim_{x \to a} f(x)$ exists if and only if $\lim_{x \to a^-} f(x) = \lim_{x \to a^+} f(x)$.

The limit of a function at a point exists if and only if both the limit from the left and the limit from the right (at that point) exist and are equal.
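The idea of approaching a point from one side at a time can be imitated numerically. The Python sketch below is an illustration only: the helper `one_sided_values` and its step sizes are assumptions made here, and sampled values can suggest a one-sided limit but cannot prove that it exists.

```python
def step(x):
    """The function of Figure 7.5: 1 for x >= 0 and -1 for x < 0."""
    return 1 if x >= 0 else -1


def line_with_jump(x):
    """The function of Figure 7.4: x + 1 for x != 2 and 5 at x = 2."""
    return 5 if x == 2 else x + 1


def one_sided_values(func, a, side, steps=6):
    """Evaluate func at a + 10^-k (right) or a - 10^-k (left), k = 1..steps."""
    sign = 1 if side == "right" else -1
    return [func(a + sign * 10.0 ** (-k)) for k in range(1, steps + 1)]


step_right = one_sided_values(step, 0.0, "right")
step_left = one_sided_values(step, 0.0, "left")
line_right = one_sided_values(line_with_jump, 2.0, "right")
line_left = one_sided_values(line_with_jump, 2.0, "left")

print("step, from the right:", step_right)
print("step, from the left: ", step_left)
print("line, from the right:", line_right)
print("line, from the left: ", line_left)
```

For the step function the two sides settle on different values, so by Theorem 7.1 the limit at 0 does not exist; for the second function both sides approach 3, regardless of the value 5 assigned at x = 2.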

An important corollary of this theorem is as follows:

Corollary 7.1
If $\lim_{x \to a^-} f(x) \neq \lim_{x \to a^+} f(x)$, then $\lim_{x \to a} f(x)$ does not exist.

If the limit from the left and the limit from the right of a function at a certain point are not equal, then the limit at that point does not exist.

Example 7.1
Consider the graphs in Figure 7.6.

Figure 7.6: Example of limits that exist and limits that do not exist. Panels: (a) $\lim_{x \to x_0} f(x)$ exists; (b) $\lim_{x \to x_0} f(x)$ exists; (c) $\lim_{x \to x_0} f(x)$ does not exist; (d) $\lim_{x \to x_0} f(x)$ does not exist.

Theorem 7.2
If $f(x) = k$ for all $x \in \mathbb{R}$, then $\lim_{x \to a} f(x) = k$.

The limit of a constant function is equal to the constant.

Example 7.2
Consider the function $f(x) = 4$ for all $x \in \mathbb{R}$. The graph is given in Figure 7.7.

Figure 7.7: Graph of $f(x) = 4$

Here $\lim_{x \to 5} f(x) = 4$, $\lim_{x \to 10} f(x) = 4$, etc.

Theorem 7.3
For any polynomial function $f(x) = c_0 + c_1 x + c_2 x^2 + \cdots + c_n x^n$ where $c_0; \ldots; c_n$ are real numbers and n is a natural number,
\[
\lim_{x \to a} f(x) = c_0 + c_1 a + c_2 a^2 + \cdots + c_n a^n = f(a).
\]

The limit of a polynomial function at a certain point is equal to the value of the function at that point.

Example 7.3
Consider the function $f(x) = x^2 - 2x + 6$. The graph of this function is given in Figure 7.8.

Figure 7.8: Graph of $f(x) = x^2 - 2x + 6$

Here $\lim_{x \to 3} f(x) = 9$.

If we substitute x = 3 into the function $f(x) = x^2 - 2x + 6$, we obtain $f(3) = 3^2 - 2(3) + 6 = 9$. Therefore,
\[
\lim_{x \to 3} f(x) = \lim_{x \to 3} (x^2 - 2x + 6) = 9 = f(3).
\]

Theorem 7.4
If c is any real number, then
\[
\lim_{x \to a} c f(x) = c \lim_{x \to a} f(x).
\]

The limit of a function multiplied by a constant is the constant multiplied by the limit of the function.

Example 7.4
Consider the function $f(x) = 2x + 3$. Then
\[
\lim_{x \to 1} 5 f(x) = \lim_{x \to 1} 5(2x + 3) = \lim_{x \to 1} (10x + 15) = 25.
\]
Compare this to
\[
\lim_{x \to 1} 5 f(x) = 5 \lim_{x \to 1} f(x) = 5 \lim_{x \to 1} (2x + 3) = 5(5) = 25.
\]
Therefore, we see that
\[
\lim_{x \to 1} 5(2x + 3) = 5 \lim_{x \to 1} (2x + 3).
\]

Theorem 7.5
If $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist, then
\[
\lim_{x \to a} [f(x) + g(x)] = \lim_{x \to a} f(x) + \lim_{x \to a} g(x).
\]

The limit of the sum of two (or more) functions is equal to the sum of their limits.

Example 7.5
Consider the functions $f(x) = 2x + 3$ and $g(x) = x + 1$. Then
\[
\begin{aligned}
\lim_{x \to 1} [5 f(x) + g(x)] &= \lim_{x \to 1} [5(2x + 3) + (x + 1)] \\
&= \lim_{x \to 1} (10x + 15 + x + 1) \\
&= \lim_{x \to 1} (11x + 16) \\
&= 27.
\end{aligned}
\]
Compare this to
\[
\begin{aligned}
\lim_{x \to 1} [5 f(x) + g(x)] &= \lim_{x \to 1} 5 f(x) + \lim_{x \to 1} g(x) \\
&= 5 \lim_{x \to 1} f(x) + \lim_{x \to 1} g(x) \\
&= 5 \lim_{x \to 1} (2x + 3) + \lim_{x \to 1} (x + 1) \\
&= 5(5) + 2 \\
&= 27.
\end{aligned}
\]
Therefore, we see that
\[
\lim_{x \to 1} [5(2x + 3) + (x + 1)] = 5 \lim_{x \to 1} (2x + 3) + \lim_{x \to 1} (x + 1).
\]

Theorem 7.6
If $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist, then
\[
\lim_{x \to a} [f(x) \times g(x)] = \left[\lim_{x \to a} f(x)\right] \times \left[\lim_{x \to a} g(x)\right].
\]

The limit of the product of two (or more) functions is equal to the product of the limits of the functions.

Example 7.6
Consider the functions $f(x) = 2x + 3$ and $g(x) = \dfrac{x^2 - 1}{x - 1}$. Then
\[
\begin{aligned}
\lim_{x \to 1} 5 f(x) g(x) &= \lim_{x \to 1} \frac{5(2x + 3)(x^2 - 1)}{x - 1} \\
&= \lim_{x \to 1} \frac{5(2x + 3)(x - 1)(x + 1)}{(x - 1)} \\
&= \lim_{x \to 1} 5(2x + 3)(x + 1) \\
&= 5(5)(2) \\
&= 50.
\end{aligned}
\]
Compare this to
\[
\begin{aligned}
\lim_{x \to 1} 5 f(x) g(x) &= \lim_{x \to 1} 5 f(x) \times \lim_{x \to 1} g(x) \\
&= 5 \lim_{x \to 1} f(x) \times \lim_{x \to 1} g(x) \\
&= 5 \lim_{x \to 1} (2x + 3) \times \lim_{x \to 1} \left(\frac{x^2 - 1}{x - 1}\right) \\
&= 5 \lim_{x \to 1} (2x + 3) \times \lim_{x \to 1} \frac{(x - 1)(x + 1)}{(x - 1)} \\
&= 5 \lim_{x \to 1} (2x + 3) \times \lim_{x \to 1} (x + 1) \\
&= 5(5) \times 2 \\
&= 50.
\end{aligned}
\]
Therefore, we see that
\[
\lim_{x \to 1} \frac{5(2x + 3)(x^2 - 1)}{(x - 1)} = 5 \lim_{x \to 1} (2x + 3) \times \lim_{x \to 1} \frac{(x^2 - 1)}{(x - 1)}.
\]

Theorem 7.7
If $\lim_{x \to a} f(x)$ exists, then
\[
\lim_{x \to a} [f(x)]^n = \left[\lim_{x \to a} f(x)\right]^n.
\]

The limit of a function to the power n is equal to the limit of the function, to the power n.

Example 7.7
Consider the function $f(x) = x^2 + 1$. Then
\[
\lim_{x \to 2} [f(x)]^4 = \lim_{x \to 2} (x^2 + 1)^4 = (2^2 + 1)^4 = 625.
\]
Compare this to
\[
\lim_{x \to 2} [f(x)]^4 = \left[\lim_{x \to 2} f(x)\right]^4 = \left[\lim_{x \to 2} (x^2 + 1)\right]^4 = (2^2 + 1)^4 = 625.
\]
Therefore, we see that
\[
\lim_{x \to 2} (x^2 + 1)^4 = \left[\lim_{x \to 2} (x^2 + 1)\right]^4.
\]

Theorem 7.8
If $\lim_{x \to a} f(x)$ and $\lim_{x \to a} g(x)$ exist and $\lim_{x \to a} g(x) \neq 0$, then
\[
\lim_{x \to a} \frac{f(x)}{g(x)} = \frac{\lim_{x \to a} f(x)}{\lim_{x \to a} g(x)}.
\]

The limit of the quotient of two functions is equal to the quotient of the limits.

Example 7.8
Consider the functions $f(x) = x + 1$ and $g(x) = x^2 - 2$. Then
\[
\lim_{x \to 3} \frac{f(x)}{g(x)} = \lim_{x \to 3} \left(\frac{x + 1}{x^2 - 2}\right) = \frac{3 + 1}{3^2 - 2} = \frac{4}{7}.
\]
Compare this to
\[
\lim_{x \to 3} \frac{f(x)}{g(x)} = \frac{\lim_{x \to 3} f(x)}{\lim_{x \to 3} g(x)} = \frac{\lim_{x \to 3} (x + 1)}{\lim_{x \to 3} (x^2 - 2)} = \frac{4}{7}.
\]
Therefore, we see that
\[
\lim_{x \to 3} \left(\frac{x + 1}{x^2 - 2}\right) = \frac{\lim_{x \to 3} (x + 1)}{\lim_{x \to 3} (x^2 - 2)}.
\]
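The limit theorems used in Examples 7.6 to 7.8 can be spot-checked numerically. The Python sketch below is an illustration only: the helper `approx_limit`, which averages the function values a small distance h to either side of the point, is an assumption made here and gives an estimate of a limit, not a proof.

```python
def approx_limit(func, a, h=1e-6):
    """Estimate the limit of func at a from values just left and right of a."""
    return (func(a - h) + func(a + h)) / 2


def f(x):
    return 2 * x + 3


def g(x):
    return (x * x - 1) / (x - 1)   # undefined at x = 1 itself


# Example 7.6: limit of 5 f(x) g(x) at x = 1, computed in two ways.
product_direct = approx_limit(lambda x: 5 * f(x) * g(x), 1.0)
product_by_theorem = 5 * approx_limit(f, 1.0) * approx_limit(g, 1.0)

# Example 7.7: limit of (x^2 + 1)^4 at x = 2, computed in two ways.
power_direct = approx_limit(lambda x: (x * x + 1) ** 4, 2.0)
power_by_theorem = approx_limit(lambda x: x * x + 1, 2.0) ** 4

# Example 7.8: limit of (x + 1) / (x^2 - 2) at x = 3, computed in two ways.
quotient_direct = approx_limit(lambda x: (x + 1) / (x * x - 2), 3.0)
quotient_by_theorem = (approx_limit(lambda x: x + 1, 3.0)
                       / approx_limit(lambda x: x * x - 2, 3.0))

print(product_direct, product_by_theorem)
print(power_direct, power_by_theorem)
print(quotient_direct, quotient_by_theorem)
```

In each pair the two estimates agree to within rounding error, and they match the values 50, 625 and 4/7 obtained in the examples.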

7.3 Infinite limits

Consider the function $f(x) = \dfrac{1}{x^2}$.

The function f(x) is not defined at x = 0, since division by zero is not defined. To find $\lim_{x \to 0} f(x)$ we again tabulate some points close to 0 and show it in Figure 7.9.

x        f(x)
−1       1
−0,5     4
−0,1     100
−0,01    10 000
→ 0      → ∞
0,005    40 000
0,05     400
0,25     16
2        0,25

Figure 7.9: Values of f(x) close to x = 0

The value of f(x) increases very fast as x approaches 0 from the left as well as from the right and certainly does not tend to any specific real number. From the graph of this function in Figure 7.10 we see that $\lim_{x \to 0} f(x)$ does not exist. To express the behaviour of f(x) around x = 0 in a case like this, we write
\[
\lim_{x \to 0^-} f(x) = \infty \quad \text{and} \quad \lim_{x \to 0^+} f(x) = \infty
\]
to indicate that f(x) increases without bound immediately to the left and to the right of x = 0. Since these two limits behave in the same way, we write symbolically $\lim_{x \to 0} f(x) = \infty$.

Figure 7.10: Graph of $f(x) = \dfrac{1}{x^2}$

Now consider the function $f(x) = -\dfrac{1}{x^2}$.

From the graph of this function in Figure 7.11 we see that as x approaches 0 from the left and from the right, f(x) decreases without bound on both sides. We denote this symbolically by
\[
\lim_{x \to 0^-} f(x) = \lim_{x \to 0^+} f(x) = -\infty
\]
and can also write
\[
\lim_{x \to 0} f(x) = -\infty.
\]

Figure 7.11: Graph of $f(x) = -\dfrac{1}{x^2}$

Let us now determine $\lim_{x \to 0} f(x)$ if the function is defined by $f(x) = \dfrac{1}{x}$.

Once again f(x) is not defined at x = 0 and we use a table to see what happens to f(x) when x approaches 0 from the left as well as from the right. This is shown in Figure 7.12.

x        f(x)
−0,5     −2
−0,1     −10
−0,01    −100
−0,001   −1 000
→ 0      ?
0,001    1 000
0,01     100
0,1      10
0,5      2

Figure 7.12: Values of f(x) close to x = 0

We see that as x approaches 0 from the left, the value of f(x) decreases without bound. This is written as
\[
\lim_{x \to 0^-} f(x) = -\infty.
\]
On the other hand, if x approaches 0 from the right, the value of f(x) increases without bound. We write this as
\[
\lim_{x \to 0^+} f(x) = +\infty.
\]
In this case, we can only say that $\lim_{x \to 0} f(x)$ does not exist and no special notation applies. From the graph in Figure 7.13, we can see clearly that no limit exists if x tends to 0.

Figure 7.13: Graph of $f(x) = \dfrac{1}{x}$
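The contrast between $1/x^2$ and $1/x$ near 0 shows up clearly when both are evaluated at the same small distances on either side of 0. The following Python sketch is an illustration only; the function names and the chosen distances are assumptions made here.

```python
def inv_square(x):
    """f(x) = 1 / x^2, which grows without bound on both sides of 0."""
    return 1 / x ** 2


def inv(x):
    """f(x) = 1 / x, which has opposite signs on the two sides of 0."""
    return 1 / x


DISTANCES = [0.5, 0.1, 0.01, 0.001]

print("distance   1/x^2 left   1/x^2 right   1/x left   1/x right")
for d in DISTANCES:
    print(f"{d:<10} {inv_square(-d):<12.0f} {inv_square(d):<13.0f} "
          f"{inv(-d):<10.0f} {inv(d):.0f}")
```

For $1/x^2$ the left and right columns agree and keep growing, which is why the symbolic notation ∞ is used; for $1/x$ the columns grow in opposite directions, so no single symbol describes the behaviour at 0.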

7.4 Limits when x tends to infinity

In our discussion of $\lim_{x \to a} f(x)$ so far, we have considered a to be a real number. It is sometimes necessary to determine the behaviour of a function when x tends to plus (or minus) infinity. If the value of x increases (or decreases) without bound, we write it as
\[
\lim_{x \to +\infty} f(x) \quad \text{and} \quad \lim_{x \to -\infty} f(x).
\]

Let us go back to the previous function $f(x) = \dfrac{1}{x}$, which was graphed in Figure 7.13. From the graph of f(x) we see that as the x-values tend to plus infinity (move towards the right along the x-axis), the values of f(x) become smaller (move down towards the x-axis) and get arbitrarily close to zero. We write this as
\[
\lim_{x \to +\infty} f(x) = 0,
\]
and read it as follows: the limit of f(x) as x tends to plus infinity is 0.

As x, on the other hand, tends to minus infinity (moves towards the left along the x-axis) the values of f(x) become larger, that is, “less negative” (move up towards the x-axis) and get arbitrarily close to zero. We write this as
\[
\lim_{x \to -\infty} f(x) = 0,
\]
which is read as: the limit of f(x) as x tends to minus infinity is 0.

Let us now consider the function
\[
f(x) = \frac{2x^2}{1 + x^2}.
\]
Suppose we have to find $\lim_{x \to \infty} f(x)$. We set up a table of values for f(x) where x becomes very large and this is shown in Figure 7.14.

x        f(x)
0        0
1        1
2        1,6
5        1,923
10       1,98
100      1,9998
1 000    1,999998

Figure 7.14: Values of f(x) as x increases

We see that f(x) gets closer to 2 as x increases. We write this as $\lim_{x \to \infty} f(x) = 2$. The graph of this function is given in Figure 7.15. The value of f(x) gets closer to 2 as x increases and will never become larger than 2. The line f(x) = 2 is called a horizontal asymptote.

Figure 7.15: Graph of $f(x) = \dfrac{2x^2}{1 + x^2}$
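The table in Figure 7.14 can be regenerated with a few lines of code. The Python sketch below is an illustration only; it also reports the gap between f(x) and 2, which makes the approach to the horizontal asymptote visible.

```python
def f(x):
    """f(x) = 2x^2 / (1 + x^2)."""
    return 2 * x ** 2 / (1 + x ** 2)


X_VALUES = [0, 1, 2, 5, 10, 100, 1000]

for x in X_VALUES:
    value = f(x)
    print(f"x = {x:<5} f(x) = {value:.6f}   2 - f(x) = {2 - value:.6f}")
```

The gap 2 − f(x) equals $2/(1 + x^2)$, so it shrinks towards zero as x grows but is never negative.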

7.5 Continuity

Continuous functions will play an important role throughout our study of calculus. Loosely speaking, a function is continuous at a point if the graph of the function at that point is devoid of holes, gaps, jumps or breaks. In other words, the graph is an uninterrupted curve. Consider the graphs of each of the functions given in Figure 7.16.

Figure 7.16: Continuous functions

In each case, the graph of the function can be drawn without lifting your pencil. These are examples of continuous functions.

Consider the graph of the function f(x) in Figure 7.17.

Figure 7.17: Examples of discontinuity (at the points a, b, c and d on the x-axis)

At the point x = a: f(x) is not defined at the point. (There is a “hole” in the graph.)
At the point x = b: Here f(b) is not equal to $\lim_{x \to b} f(x)$. (There is a “jump” in the graph.)
At the point x = c: $\lim_{x \to c} f(x)$ does not exist since $\lim_{x \to c^+} f(x) \neq \lim_{x \to c^-} f(x)$. (There is a “jump”.)
At the point x = d: $\lim_{x \to d} f(x)$ does not exist. (There is a break in the graph.)

The function f(x) is discontinuous at each of these points and continuous at all other points.

Definition 7.1
A function f(x) is continuous at the point x = a if the following conditions are satisfied:
• f(a) is defined,
• $\lim_{x \to a} f(x)$ exists, and
• $\lim_{x \to a} f(x) = f(a)$.

Consider the graph of the function f(x) in Figure 7.18.

Figure 7.18: Continuous function on (a; b)

Here f(x) is defined at every point in the interval (a; b) and there are no jumps or gaps in the graph. We can therefore say that f(x) is continuous over the interval (a; b).

Definition 7.2
A function f(x) is said to be continuous over an interval if it is continuous at every point x in that interval.

A function f(x) is said to be a continuous function or everywhere continuous if f(x) is continuous at all points $x \in \mathbb{R}$.

Let us refer back to Figure 7.13 where the graph of the function $f(x) = \dfrac{1}{x}$ is given. We know that $\lim_{x \to 0} f(x)$ does not exist. This implies that f(x) is discontinuous at x = 0, as can be seen from the graph. However, the function f(x) is continuous at every other point. Therefore, we can say that f(x) is continuous over the intervals (−∞; 0) and (0; +∞).

Definition 7.3
A polynomial $p(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_n x^n$ is continuous at every point x.
A rational function $R(x) = \dfrac{p(x)}{q(x)}$ is continuous at every point where $q(x) \neq 0$.

Let us look at a practical example of a discontinuous function.

Example 7.9
The graph in Figure 7.19 depicts the learning curve associated with a certain individual. Beginning with no knowledge of the subject, the individual makes steady progress towards understanding it over the time interval $0 \leq t < t_1$. The individual’s progress slows down as time $t_1$ is approached because he fails to grasp a particularly difficult concept. All of a sudden, a breakthrough occurs at time $t_1$, propelling his knowledge of the subject to a higher level. The curve is discontinuous at $t_1$.

Figure 7.19: Learning curve (f(t) against t, with a jump at $t_1$)

We close this study unit by pointing out a property of continuous functions that will play a very important role in calculus.

Property 7.1
Suppose that a continuous function f(x) assumes the values f(a) and f(b) at two points x = a and x = b with a < b. If f(a) and f(b) have opposite signs, then there must be at least one point x = c, with a < c < b, where f(c) = 0.

If f(x) is a continuous function and f(a) and f(b) have opposite signs and $a \neq b$, then the function must cross the x-axis at least once between a and b.

The graphs in Figure 7.20 illustrate this property.

Figure 7.20: Continuous functions crossing x-axis

Geometrically, this property states that if the graph of a continuous function goes from above the x-axis to below the x-axis or vice versa, it must cross the x-axis. This, of course, is not necessarily true if the function is not continuous as can be seen from the graph in Figure 7.21.
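Property 7.1 is the basis of a simple numerical technique for locating a point where a continuous function is zero: repeatedly halve an interval on which the function changes sign. The Python sketch below illustrates this bisection idea; it is not taken from the study guide, and the function `bisect_root`, its tolerance and the example cubic are assumptions chosen for illustration.

```python
def bisect_root(func, a, b, tol=1e-10, max_iter=200):
    """Find c in [a, b] with func(c) close to 0, given a sign change on [a, b]."""
    fa, fb = func(a), func(b)
    if fa == 0:
        return a
    if fb == 0:
        return b
    if fa * fb > 0:
        raise ValueError("func(a) and func(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = func(c)
        if fc == 0 or (b - a) / 2 < tol:
            return c
        # Keep the half-interval on which the sign change still occurs.
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return (a + b) / 2


def cubic(x):
    """Hypothetical continuous example: negative at x = 1, positive at x = 2."""
    return x ** 3 - x - 2


root = bisect_root(cubic, 1.0, 2.0)
print(f"root is approximately {root:.6f}, cubic(root) = {cubic(root):.2e}")
```

The method relies on continuity: for a function with a jump, such as the one in Figure 7.21, the interval can shrink onto the jump without the function ever being zero there.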

7.6 Exercises 1. For each of the following, find lim f (x) if it exists: x→a

97

DSC2606 CHAPTER 7 LIMITS AND CONTINUITY

f (x) f (b) b

a

b bc

x

f (a) Figure 7.21: Discontinuous function (a) Calculate the limit at a = −2 in Figure 7.22. f (x) 5 b

4 bc

3 2 1 −4 −3 −2 −1

1

2

3

x

Figure 7.22: Study Unit 7, Exercise 1(a) (b) Calculate the limit at a = 1 in Figure 7.23. f (x) 4

2

−2

−1

1

2

x

Figure 7.23: Study Unit 7, Exercise 1(b) (c) Calculate the limit at a = 1 in Figure 7.24. 98

DSC2606 7.6. EXERCISES

f (x) 3 bc

2 b

1 −3 −2 −1

1

2

x

3

Figure 7.24: Study Unit 7, Exercise 1(c) 2. Calculate the limits of the following functions by using the theorems for limits: (a) lim (2s2 − 1)(2s + 4). s→2

(b) lim

x→−2

√ 3

5x + 2.

x2 − 6x − 8 . x→0 2x2 − 5x + 2 3. For each of the following, study the graph of the function and find the indicated limit: 1 (a) The graph of the function f (x) = is given in Figure 7.25. x−3 (c) lim

f (x) 2

−1

1

2

3

4

x

−2 −4 Figure 7.25: Study Unit 7, Exercise 3(a) There is a vertical asymtote at x = 3.   1 Find lim . x→3− x − 3

(b) The graph of the function $f(x) = \dfrac{1}{x} + 1$ is given in Figure 7.26. There is a horizontal asymptote at f(x) = 1.

[Figure 7.26: Study Unit 7, Exercise 3(b)]

Find

(i) $\lim_{x \to \infty} \left( \dfrac{1}{x} + 1 \right)$.

(ii) $\lim_{x \to -\infty} \left( \dfrac{1}{x} + 1 \right)$.

(c) The graph in Figure 7.27 represents the function
\[
f(x) =
\begin{cases}
|x| & \text{for } x > -4 \\
4 & \text{for } x \le -4.
\end{cases}
\]

[Figure 7.27: Study Unit 7, Exercise 3(c)]

Find

(i) $\lim_{x \to \infty} f(x)$.

(ii) $\lim_{x \to -\infty} f(x)$.

4. Find the values of x for which each of the following functions is continuous:

(a) $f(x) = 3x^3 + 2x^2 - x + 10$.

(b) $h(x) = \dfrac{x^2 - 6x + 9}{x - 3}$.

5. Determine all the values of x at which the following function is discontinuous:
\[
f(x) = \frac{x^2 - 3x + 2}{x^2 - 2x}.
\]

7.7 Solutions to exercises

1. (a) $\lim_{x \to -2} f(x) = 4$.

(b) $\lim_{x \to 1} f(x) = 2$.

(c) $\lim_{x \to 1} f(x)$ does not exist.

2. (a) The limit is
\[
\begin{aligned}
\lim_{s \to 2} (2s^2 - 1)(2s + 4) &= \lim_{s \to 2} (2s^2 - 1) \times \lim_{s \to 2} (2s + 4) \\
&= (2 \cdot 2^2 - 1) \times (2 \cdot 2 + 4) \\
&= 7 \times 8 \\
&= 56.
\end{aligned}
\]

(b) The limit is
\[
\begin{aligned}
\lim_{x \to -2} \sqrt[3]{5x + 2} &= \lim_{x \to -2} (5x + 2)^{\frac{1}{3}} \\
&= \left[ \lim_{x \to -2} (5x + 2) \right]^{\frac{1}{3}} \\
&= [5(-2) + 2]^{\frac{1}{3}} \\
&= (-8)^{\frac{1}{3}} \\
&= -2.
\end{aligned}
\]

(c) The limit is
\[
\begin{aligned}
\lim_{x \to 0} \frac{x^2 - 6x - 8}{2x^2 - 5x + 2} &= \frac{\lim_{x \to 0} (x^2 - 6x - 8)}{\lim_{x \to 0} (2x^2 - 5x + 2)} \\
&= \frac{-8}{2} \\
&= -4.
\end{aligned}
\]

3. (a) $\lim_{x \to 3^-} \left( \dfrac{1}{x - 3} \right) = -\infty$.

(b) (i) $\lim_{x \to \infty} \left( \dfrac{1}{x} + 1 \right) = 1$.

(ii) $\lim_{x \to -\infty} \left( \dfrac{1}{x} + 1 \right) = 1$.

(c) (i) $\lim_{x \to \infty} f(x) = \infty$.

(ii) $\lim_{x \to -\infty} f(x) = 4$.

4. (a) The function f(x) is a polynomial function and is therefore continuous for all values of x.

(b) The function h(x) is a rational function and is continuous at every point where the denominator is nonzero, that is, where $x - 3 \neq 0$. The function h(x) is therefore continuous everywhere except at x = 3.

5. The function f(x) is discontinuous where $x^2 - 2x = x(x - 2) = 0$, that is, at x = 0 and x = 2.

Chapter 8

The derivative of a function

Contents
8.1 Introduction
8.2 The slope of a tangent line
8.3 The rate of change
8.4 The derivative of a function
8.5 Differentiability and continuity
8.6 Exercises
8.7 Solutions to exercises

Learning objectives

After completing this study unit you should be able to
• explain the concept of a tangent line to the curve of a function
• explain the concept of the average rate of change of a function over an interval
• explain the concept of the instantaneous rate of change of a function at a point
• explain the concept of the derivative of a function
• understand how the derivative of a function is evaluated by means of a limit
• understand the relationship between continuity and differentiability.

8.1 Introduction

The derivative of a function will be defined in this study unit. You must understand the concepts discussed, but you will not be examined on this study unit explicitly. We will show that the derivative of a function is equal to the slope of the tangent line to the function.

We mentioned in the previous study unit that the problem of finding the rate of change of one quantity with respect to another is mathematically equivalent to finding the slope of the tangent line to a curve at a given point on the curve. In other words, since the slope of a tangent line and the derivative are the same thing, a rate of change problem can be solved by finding the derivative of the function.

Before going on to define the derivative, let us show the relationship between the rate of change and the slope of the tangent line by means of a graphical example. Consider the speed of a car as given by the function $f(t) = 2t^2$ for $t \ge 0$, where t represents time. The graph of this function is given in Figure 8.1.

[Figure 8.1: Speed of a car]

Observe that the graph of f(t) rises slowly at first but more rapidly as t increases, reflecting the fact that the speed of the car increases faster and faster with time. This observation suggests a relationship between how fast the speed of the car is changing at any time t and the steepness of the curve at the point corresponding to this value of t. It would therefore appear that we can find the rate at which the speed is changing at any time if we can find a way of measuring the steepness of the curve at any point on the curve. The steepness of the curve is in fact the rate at which the speed function, f(t), increases with respect to time, t. If we can find a yardstick with which to measure the steepness of a curve, we will have solved this rate of change problem.

Consider the graph of a function f(x) as given in Figure 8.2.

[Figure 8.2: Steepness of curve]

Think of this curve as representing a stretch of roller coaster track, as illustrated in Figure 8.3. When the car is at the point P on the curve (position P on the roller coaster track), a passenger sitting erect in the car and looking straight ahead will have a line of sight that is parallel to the line T, which is the tangent to the curve at P. From this we see that the steepness of the curve at P is given by the slope of the tangent line to the curve at point P. This example shows that a rate of change problem can be solved by finding the slope of the tangent line to the curve.

[Figure 8.3: Roller coaster track]

We will now continue by investigating the slope of a tangent line and defining the term "derivative".

8.2 The slope of a tangent line

Consider the graph of the function f(x) as given in Figure 8.4.

[Figure 8.4: Secant lines and tangent line]

Let P be a fixed point on the curve of f(x) and let Q be another point on the curve, distinct from P. The straight line passing through points P and Q is called a secant line. Now as point Q is allowed to move closer to P along the curve, the secant line through P and Q rotates about the fixed point P and approaches a line T. This line touches the curve at the point P and is the tangent line to f(x) at the point P.

To express the above in mathematical terms, we say that P is the point (x; f(x)) and Q is the point (x + h; f(x + h)), where h is some appropriate nonzero number. This is illustrated in Figure 8.5.

[Figure 8.5: Slope of secant line PQ]

Now, using the formula for the slope of a line, we can write the slope of the secant line passing through P and Q as
\[
\begin{aligned}
\text{slope} &= \frac{\text{vertical change}}{\text{horizontal change}} \\
&= \frac{f(x + h) - f(x)}{(x + h) - x} \\
&= \frac{f(x + h) - f(x)}{h}.
\end{aligned}
\]

If h now becomes smaller (approaches zero), the secant line through P and Q approaches the tangent line T. This is illustrated in Figure 8.6.

[Figure 8.6: Point Q approaches point P]

This leads to the definition of the slope of a tangent line to the graph of f(x).

Definition 8.1 The slope of the tangent line to the graph of f(x) at the point (x; f(x)) is given by
\[
\lim_{h \to 0} \frac{f(x + h) - f(x)}{h},
\]
if the limit exists.


8.3 The rate of change

Let us now return to the function f(x) of Section 8.2 and consider two points P = (x; f(x)) and Q = (x + h; f(x + h)). The graphical representation is given in Figure 8.7. Here f(x + h) − f(x) measures the change in f(x) that corresponds to a change h in x.

[Figure 8.7: Change in f(x) corresponding to change in x]

The slope of the secant line through P and Q is given by the difference quotient
\[
\frac{f(x + h) - f(x)}{h},
\]
and this measures the average rate of change of f(x) over the interval [x; x + h].

The slope of the tangent line through point (x; f(x)) is given by
\[
\lim_{h \to 0} \frac{f(x + h) - f(x)}{h},
\]
and this measures the rate of change of f(x) at x. This is also called the instantaneous rate of change of f(x) at x to distinguish it from the average rate of change, which is computed over an interval. The problem of finding the rate of change of one quantity with respect to another is therefore equivalent to finding the slope of the tangent line to a curve at a point.
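The passage from the average rate of change to the instantaneous rate of change can be watched numerically. The following sketch is an illustration only: it uses the function f(t) = 2t^2 from Section 8.1, and the helper name difference_quotient and the sample point x = 3 are assumptions made here.

```python
def difference_quotient(f, x, h):
    """Slope of the secant line through (x; f(x)) and (x + h; f(x + h))."""
    return (f(x + h) - f(x)) / h


def f(t):
    return 2 * t**2


x = 3.0
# Average rates of change over [x; x + h] for shrinking h.
slopes = [difference_quotient(f, x, h) for h in (1.0, 0.1, 0.01, 0.001)]
# For this f the quotient simplifies to 4x + 2h, so the slopes approach 4x = 12.
```

Each entry of slopes is the slope of a secant line; as h shrinks, the values settle towards the slope of the tangent line at x = 3.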

8.4 The derivative of a function

The limit which measures both
• the slope of the tangent line to the graph of f(x) at P, and
• the instantaneous rate of change of f(x) at x,
is given a special name, namely the derivative of f(x) at x.

Definition 8.2 We define the derivative of a function f(x) with respect to x as the new function
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.
\]

From these definitions it follows that the derivative of a function is equal to the slope of the tangent line to the function.

Different notations are used for the derivative of f(x) with respect to x. These notations are
• $f'(x)$ (f prime x)
• $\dfrac{dy}{dx}$ (dee y dee x) if y = f(x)
• $\dfrac{d}{dx} f(x)$ (dee dee x of f(x)).

To calculate the derivative of a function f(x), the following four steps are applied:

Step 1: Calculate $f(x + h)$.
Step 2: Calculate the difference $f(x + h) - f(x)$.
Step 3: Calculate the quotient $\dfrac{f(x + h) - f(x)}{h}$.
Step 4: Calculate $f'(x) = \lim_{h \to 0} \dfrac{f(x + h) - f(x)}{h}$.

We illustrate this by means of some examples.

Example 8.1
Find the slope of the tangent line to the graph of f(x) = 3x + 5 at any point (x; f(x)).

Solution
The slope of the tangent line to the graph of f(x) at point (x; f(x)) is given by the derivative of f(x) at x. To find the derivative, we use the four-step process.

Step 1:
\[
f(x + h) = 3(x + h) + 5 = 3x + 3h + 5.
\]

Step 2:
\[
f(x + h) - f(x) = 3x + 3h + 5 - (3x + 5) = 3h.
\]

Step 3:
\[
\frac{f(x + h) - f(x)}{h} = \frac{3h}{h} = 3.
\]

Step 4:
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} 3 = 3.
\]

We could have expected this result, since the tangent line to a straight line at any point must coincide with the line itself and, therefore, must have the same slope as the line. In this case, the graph of f(x) is a straight line with slope 3.

Example 8.2
Consider the function $f(x) = x^2 - 4x$.

(a) Calculate $f'(x)$.
(b) Find the point on the graph of f(x) where the tangent line to the curve is horizontal.
(c) Sketch the graph of f(x) and the tangent line to the curve at the point found in (b).
(d) What is the rate of change of f(x) at the point found in (b)?

Solution
(a) To find $f'(x)$ we use the four-step process.

Step 1:
\[
f(x + h) = (x + h)^2 - 4(x + h) = x^2 + 2xh + h^2 - 4x - 4h.
\]

Step 2:
\[
\begin{aligned}
f(x + h) - f(x) &= x^2 + 2xh + h^2 - 4x - 4h - (x^2 - 4x) \\
&= 2xh + h^2 - 4h \\
&= h(2x + h - 4).
\end{aligned}
\]

Step 3:
\[
\frac{f(x + h) - f(x)}{h} = \frac{h(2x + h - 4)}{h} = 2x + h - 4.
\]

Step 4:
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} (2x + h - 4) = 2x - 4.
\]

(b) At the point on the graph of f(x) where the tangent line to the curve is horizontal, the tangent line has a slope of zero. So at this point, the derivative of f(x) is zero. To find such point(s), we set the derivative equal to zero and find
\[
f'(x) = 0 \;\Rightarrow\; 2x - 4 = 0 \;\Rightarrow\; x = 2.
\]
The corresponding value of f(x) is given by $f(2) = 2^2 - 4(2) = -4$. The required point is (2; −4).

(c) The graphical representation is given in Figure 8.8.

[Figure 8.8: Graph of function $f(x) = x^2 - 4x$, with the horizontal tangent line at (2; −4)]

(d) The rate of change of f(x) at point (2; −4) is the derivative of f(x) at x = 2 and this is zero.
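The four-step process of Example 8.2 works the same way for any quadratic function. As a hedged sketch, the code below records the outcome of the four steps for a general quadratic a·x^2 + b·x + c (the function names and the coefficient-based representation are assumptions made for this illustration) and then locates the point where the tangent line is horizontal.

```python
def quadratic_derivative(a, b, c):
    """Return (slope, intercept) of f'(x) for f(x) = a*x**2 + b*x + c.

    Step 1: f(x + h) = a*x**2 + 2*a*x*h + a*h**2 + b*x + b*h + c
    Step 2: f(x + h) - f(x) = 2*a*x*h + a*h**2 + b*h
    Step 3: dividing by h gives 2*a*x + a*h + b
    Step 4: letting h approach 0 leaves 2*a*x + b
    """
    return 2 * a, b


def horizontal_tangent_point(a, b, c):
    """Point (x; f(x)) where f'(x) = 0, assuming a is nonzero."""
    slope, intercept = quadratic_derivative(a, b, c)
    x = -intercept / slope
    return x, a * x**2 + b * x + c


# f(x) = x**2 - 4x from Example 8.2
point = horizontal_tangent_point(1, -4, 0)
```

For the function of Example 8.2 this reproduces the derivative 2x − 4 and the point (2; −4) found in part (b).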

Example 8.3
Consider the function $f(x) = \dfrac{1}{x}$.

(a) Calculate $f'(x)$.
(b) Find the slope of the tangent line to the graph of f(x) at the point x = 1.
(c) Find the equation of the tangent line at point x = 1.

Solution
(a) Step 1:
\[
f(x + h) = \frac{1}{x + h}.
\]

Step 2:
\[
\begin{aligned}
f(x + h) - f(x) &= \frac{1}{x + h} - \frac{1}{x} \\
&= \frac{x - (x + h)}{x(x + h)} \\
&= \frac{-h}{x(x + h)}.
\end{aligned}
\]

Step 3:
\[
\frac{f(x + h) - f(x)}{h} = \frac{-h}{x(x + h)} \cdot \frac{1}{h} = \frac{-1}{x(x + h)}.
\]

Step 4:
\[
f'(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} \frac{-1}{x(x + h)} = -\frac{1}{x^2}.
\]

(b) The slope of the tangent line to the graph of f(x) at x = 1 is
\[
f'(1) = -\frac{1}{1^2} = -1.
\]

(c) The tangent line is a straight line. At point x = 1 we have f(1) = 1. The tangent line therefore passes through the point (1; 1). From (b) we know that the slope of the tangent line is −1. Now remember that if the slope of a straight line is m and the line passes through a point $(x_1; y_1)$, then
\[
m = \frac{y - y_1}{x - x_1}.
\]
Therefore, we have
\[
\begin{aligned}
-1 &= \frac{y - 1}{x - 1} \\
-x + 1 &= y - 1 \\
y &= -x + 2.
\end{aligned}
\]
The equation of the tangent line at point x = 1 is therefore y = −x + 2. The graphical representation of this is given in Figure 8.9.

[Figure 8.9: Graph of function $f(x) = \dfrac{1}{x}$ and its tangent line y = −x + 2 at (1; 1)]
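Part (c) of Example 8.3 follows a recipe that applies to any differentiable function: evaluate the derivative for the slope, evaluate the function for a point on the line, and rearrange the slope formula. A minimal sketch, assuming the derivative is supplied as a separate function (tangent_line is a name chosen here for illustration):

```python
def tangent_line(f, f_prime, x1):
    """Return (m, c) so that the tangent line at x1 is y = m*x + c."""
    m = f_prime(x1)   # slope of the tangent line
    y1 = f(x1)        # the line passes through (x1; y1)
    # Rearranging m = (y - y1)/(x - x1) gives y = m*x + (y1 - m*x1).
    return m, y1 - m * x1


# f(x) = 1/x with f'(x) = -1/x**2, as derived in Example 8.3(a)
m, c = tangent_line(lambda x: 1 / x, lambda x: -1 / x**2, 1)
```

With x1 = 1 this gives slope −1 and intercept 2, which is the line y = −x + 2 obtained in the example.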

8.5 Differentiability and continuity

Certain functions in practical applications fail to be differentiable; that is, they do not have a derivative at certain values in the domain of the function. Here are two examples of such cases:

(a) A continuous function f(x) fails to be differentiable at a point x = a if the graph of f(x) makes an abrupt change of direction at that point. We call such a point a "corner". See Figure 8.10(a).

(b) A function fails to be differentiable at a point if the tangent line at that point is vertical. This is so since the slope of a vertical line is undefined. See Figure 8.10(b).

[Figure 8.10: Nondifferentiable functions, (a) and (b)]

Consider the graph of a function f(x) as given in Figure 8.11. The graph has a corner at x = 8 and so it is not differentiable at x = 8. It is clear that f(x) is continuous everywhere and, in particular, at x = 8.

[Figure 8.11: Continuous but nondifferentiable function, with a corner at (8; 48)]

This shows that, in general, the continuity of a function at a point x = a does not necessarily imply the differentiability of the function at that point. However, the converse is true: if a function is differentiable at x = a, then it is also continuous at x = a.
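A corner can be detected numerically by comparing the difference quotient just to the left of a point with the one just to the right. The function in Figure 8.11 is not given by a formula, so the sketch below uses the absolute value function |x|, which has a corner at x = 0, as a stand-in; the helper name and the step size are assumptions made for illustration.

```python
def one_sided_slopes(f, a, h=1e-6):
    """Difference quotients just to the left and just to the right of a."""
    left = (f(a - h) - f(a)) / (-h)
    right = (f(a + h) - f(a)) / h
    return left, right


# |x| is continuous at 0, but the slopes on either side of 0 differ.
left, right = one_sided_slopes(abs, 0.0)
```

The two slopes come out as −1 and 1. Since they do not agree, the limit in Definition 8.2 does not exist at x = 0, even though the function is continuous there.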

8.6 Exercises

1. Consider the function $f(x) = -x^2 - 2x + 3$. The graphical representation of this function and its tangent line at point (0; 3) are given in Figure 8.12.

[Figure 8.12: Study Unit 8, Exercise 1]

(a) Calculate the derivative of f(x).
(b) Find the slope of the tangent line at point (0; 3).
(c) Find the rate of change of f(x) at point x = 0.
(d) Find the equation of the tangent line.

2. The weekly demand function of Super Titan tyres is given by
\[
p = f(x) = 144 - x^2,
\]
where x represents the number of tyres demanded in thousands and p represents the unit price of tyres in rand. The graphical representation is given in Figure 8.13.

[Figure 8.13: Study Unit 8, Exercise 2]

(a) Calculate the average rate of change in the unit price of tyres if the number demanded is between 5 000 and 6 000 tyres. Interpret the result.
(b) Calculate the instantaneous rate of change in the unit price of tyres if 5 000 tyres are demanded. Interpret the result.

8.7 Solutions to exercises

1. (a) The derivative of f(x) is
\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{-(x + h)^2 - 2(x + h) + 3 - (-x^2 - 2x + 3)}{h} \\
&= \lim_{h \to 0} \frac{-x^2 - 2xh - h^2 - 2x - 2h + 3 + x^2 + 2x - 3}{h} \\
&= \lim_{h \to 0} \frac{-2xh - h^2 - 2h}{h} \\
&= \lim_{h \to 0} (-2x - h - 2) \\
&= -2x - 2.
\end{aligned}
\]

(b) The slope of the tangent line at x = 0 is equal to the derivative of f(x) at x = 0. Therefore,
\[
f'(0) = -2(0) - 2 = -2.
\]

(c) The rate of change of f(x) at x = 0 is equal to the derivative of f(x) at x = 0 and is −2.

(d) We know that the slope of the tangent line at point (0; 3) is −2. Therefore,
\[
\begin{aligned}
-2 &= \frac{y - 3}{x - 0} \\
-2x &= y - 3 \\
y &= -2x + 3.
\end{aligned}
\]
The equation of the tangent line is y = −2x + 3.

2. (a) The average rate of change over the interval [x; x + h] is
\[
\begin{aligned}
\frac{f(x + h) - f(x)}{h} &= \frac{144 - (x + h)^2 - (144 - x^2)}{h} \\
&= \frac{144 - x^2 - 2xh - h^2 - 144 + x^2}{h} \\
&= \frac{h(-2x - h)}{h} \\
&= -2x - h.
\end{aligned}
\]
This average rate of change must be calculated for a demand of between 5 000 and 6 000 tyres. Therefore, x = 5 and x + h = 6, which means that h = 1. The average rate of change if between 5 000 and 6 000 tyres are demanded is then
\[
-2(5) - 1 = -11.
\]
This means that the unit price decreases, on average, by R11 per thousand tyres if between 5 000 and 6 000 tyres are demanded.

(b) The instantaneous rate of change if x thousand tyres are demanded is
\[
\lim_{h \to 0} \frac{f(x + h) - f(x)}{h} = \lim_{h \to 0} (-2x - h) = -2x.
\]
The instantaneous rate of change if 5 000 tyres are demanded is then −2(5) = −10. This means that at the level where 5 000 tyres are demanded, the unit price of a tyre is dropping at the rate of R10 per thousand tyres.
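The two answers to Exercise 2 can be cross-checked with a few lines of code. This is a sketch for checking the arithmetic only; the function names are chosen here, and the small value of h stands in for the limit.

```python
def price(x):
    """Unit price in rand when x thousand tyres are demanded."""
    return 144 - x**2


def average_rate(f, x, h):
    """Average rate of change of f over the interval [x; x + h]."""
    return (f(x + h) - f(x)) / h


avg = average_rate(price, 5, 1)               # between 5 000 and 6 000 tyres
near_instant = average_rate(price, 5, 1e-6)   # h very close to 0
```

Here avg is the average rate of −11 from part (a), and near_instant lies very close to the instantaneous rate of −10 from part (b).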

Chapter 9

The rules of differentiation

Contents
9.1 Four basic rules
9.2 The product rule
9.3 The derivative of the exponential function
9.4 The derivative of the logarithmic function
9.5 Higher-order derivatives
9.6 The chain rule
9.7 Exercises
9.8 Solutions to exercises

Sections from prescribed book, Winston Chapter 11, Section 11.1

Learning objectives

After completing this study unit you should be able to
• calculate the derivatives of functions by applying the rules of differentiation
• use derivatives to solve rate of change problems.

9.1 Four basic rules

When we calculated the derivative of a function in the previous study unit, we based the method on a faithful interpretation of the definition of the derivative as the limit of a quotient. To find the derivative $f'(x)$ of a function f(x), we first calculated the difference quotient
\[
\frac{f(x + h) - f(x)}{h},
\]
and then evaluated its limit as h approached 0. As you probably observed, this method is tedious even for relatively simple functions.

The main purpose of this study unit is to give certain rules that will simplify the process of finding the derivative of a function. We will use the notation
\[
\frac{d}{dx}[f(x)]
\]
[read "dee, dee x of f of x"] to mean "the derivative of the function with respect to x". In stating the rules of differentiation, we assume that the functions f(x) and g(x) are differentiable.

Rule Derivative of a constant
\[
\frac{d}{dx}(c) = 0, \text{ where } c \text{ is a constant.}
\]
The derivative of a constant function is equal to zero.

[Figure 9.1: Graph of f(x) = c]

The graph of a constant function is a straight line parallel to the x-axis, as given in Figure 9.1. Since the tangent line to a straight line at any point on the line coincides with the straight line itself, the slope of f(x) (as given by the derivative of f(x) = c) must be zero.

Example 9.1
(a) If f(x) = 28, then $f'(x) = \dfrac{d}{dx}(28) = 0$.
(b) If f(x) = −2, then $f'(x) = \dfrac{d}{dx}(-2) = 0$.

Rule The power rule
If n is any real number, and $n \neq 0$, then
\[
\frac{d}{dx}(x^n) = nx^{n-1}.
\]
The derivative of x to the power n is equal to n multiplied by x to the power n − 1.

Let us verify the power rule for the special case n = 2, that is, for $f(x) = x^2$. The derivative is
\[
\begin{aligned}
f'(x) &= \lim_{h \to 0} \frac{f(x + h) - f(x)}{h} \\
&= \lim_{h \to 0} \frac{(x + h)^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{x^2 + 2xh + h^2 - x^2}{h} \\
&= \lim_{h \to 0} \frac{2xh + h^2}{h} \\
&= \lim_{h \to 0} (2x + h) \\
&= 2x.
\end{aligned}
\]

Example 9.2
(a) If f(x) = x, then $f'(x) = \dfrac{d}{dx}(x) = 1x^{1-1} = x^0 = 1$.
(b) If $f(x) = x^8$, then $f'(x) = \dfrac{d}{dx}(x^8) = 8x^{8-1} = 8x^7$.
(c) If $f(x) = x^{\frac{5}{2}}$, then $f'(x) = \dfrac{d}{dx}\left(x^{\frac{5}{2}}\right) = \dfrac{5}{2}x^{\frac{5}{2}-1} = \dfrac{5}{2}x^{\frac{3}{2}}$.

In order to differentiate functions containing radicals, like square roots, cube roots, etc, we first rewrite the functions using fractional powers. These functions can then be differentiated using the power rule.

Example 9.3
(a) The function $f(x) = \sqrt{x}$ can be rewritten as $f(x) = x^{\frac{1}{2}}$, and then
\[
f'(x) = \frac{d}{dx}\left(x^{\frac{1}{2}}\right) = \frac{1}{2}x^{\frac{1}{2}-1} = \frac{1}{2}x^{-\frac{1}{2}} = \frac{1}{2x^{\frac{1}{2}}} = \frac{1}{2\sqrt{x}}.
\]

(b) The function $g(x) = \dfrac{1}{\sqrt[3]{x}}$ can be rewritten as $g(x) = x^{-\frac{1}{3}}$, and then
\[
g'(x) = \frac{d}{dx}\left(x^{-\frac{1}{3}}\right) = -\frac{1}{3}x^{-\frac{1}{3}-1} = -\frac{1}{3}x^{-\frac{4}{3}} = -\frac{1}{3x^{\frac{4}{3}}}.
\]

Rule Derivative of a constant multiple of a function
\[
\frac{d}{dx}[c f(x)] = c \frac{d}{dx} f(x), \text{ where } c \text{ is a constant.}
\]
The derivative of a constant times a function is equal to the constant times the derivative of the function.

Example 9.4
(a) If $f(x) = 5x^3$, then
\[
f'(x) = \frac{d}{dx}(5x^3) = 5\frac{d}{dx}(x^3) = 5(3x^2) = 15x^2.
\]

(b) If $f(x) = \dfrac{3}{\sqrt{x}}$, then rewriting this as $3x^{-\frac{1}{2}}$ we have
\[
f'(x) = \frac{d}{dx}\left(3x^{-\frac{1}{2}}\right) = 3\left(-\frac{1}{2}x^{-\frac{3}{2}}\right) = -\frac{3}{2x^{\frac{3}{2}}}.
\]

Rule The sum/difference rule
\[
\frac{d}{dx}[f(x) \pm g(x)] = \frac{d}{dx} f(x) \pm \frac{d}{dx} g(x).
\]
The derivative of the sum (or difference) of two differentiable functions is equal to the sum (or difference) of their derivatives.

This result may be extended to the sum and difference of any finite number of differentiable functions.

Example 9.5
(a) If $f(x) = 4x^5 + 3x^4 - 8x^2 + x + 3$, then
\[
\begin{aligned}
f'(x) &= \frac{d}{dx}(4x^5 + 3x^4 - 8x^2 + x + 3) \\
&= \frac{d}{dx}(4x^5) + \frac{d}{dx}(3x^4) - \frac{d}{dx}(8x^2) + \frac{d}{dx}(x) + \frac{d}{dx}(3) \\
&= 20x^4 + 12x^3 - 16x + 1.
\end{aligned}
\]

(b) If $g(t) = \dfrac{t^2}{5} + \dfrac{5}{t^3}$, then rewriting $\dfrac{5}{t^3}$ as $5t^{-3}$, we have
\[
\begin{aligned}
g'(t) &= \frac{d}{dt}\left(\frac{1}{5}t^2 + 5t^{-3}\right) \\
&= \frac{2}{5}t - 15t^{-4} \\
&= \frac{2}{5}t - \frac{15}{t^4}.
\end{aligned}
\]
Here the independent variable is t instead of x, so we differentiate with respect to t.
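For polynomials, the four basic rules reduce differentiation to simple bookkeeping on the coefficients. The sketch below illustrates this for Example 9.5(a); the dictionary representation {power: coefficient} and the function name differentiate are assumptions made for this illustration.

```python
def differentiate(coeffs):
    """Differentiate a polynomial given as {power: coefficient}.

    Each term c*x**n becomes (c*n)*x**(n - 1) by the constant multiple and
    power rules, and constant terms (n = 0) drop out. The sum/difference rule
    allows the terms to be treated one at a time.
    """
    return {n - 1: c * n for n, c in coeffs.items() if n != 0}


# f(x) = 4x^5 + 3x^4 - 8x^2 + x + 3 from Example 9.5(a)
f = {5: 4, 4: 3, 2: -8, 1: 1, 0: 3}
f_prime = differentiate(f)
```

The result {4: 20, 3: 12, 1: -16, 0: 1} corresponds to 20x^4 + 12x^3 − 16x + 1, in agreement with the example.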


9.2 The product rule

Rule The product rule
\[
\frac{d}{dx}[f(x)g(x)] = f(x)g'(x) + g(x)f'(x).
\]
The derivative of the product of two functions is the first function times the derivative of the second plus the second function times the derivative of the first.

The product rule may be extended to the case involving the product of any finite number of differentiable functions.

Example 9.6
(a) If $f(x) = (2x^2 - 1)(x^3 + 3)$ then, by using the product rule,
\[
\begin{aligned}
f'(x) &= (2x^2 - 1)\frac{d}{dx}(x^3 + 3) + (x^3 + 3)\frac{d}{dx}(2x^2 - 1) \\
&= (2x^2 - 1)(3x^2) + (x^3 + 3)(4x) \\
&= 6x^4 - 3x^2 + 4x^4 + 12x \\
&= 10x^4 - 3x^2 + 12x.
\end{aligned}
\]

(b) The function $f(x) = x^3(\sqrt{x} + 1)$ can be rewritten as $f(x) = x^3\left(x^{\frac{1}{2}} + 1\right)$ and by the product rule,
\[
\begin{aligned}
f'(x) &= x^3\frac{d}{dx}\left(x^{\frac{1}{2}} + 1\right) + \left(x^{\frac{1}{2}} + 1\right)\frac{d}{dx}(x^3) \\
&= x^3\left(\frac{1}{2}x^{-\frac{1}{2}}\right) + \left(x^{\frac{1}{2}} + 1\right)(3x^2) \\
&= \frac{1}{2}x^{\frac{5}{2}} + 3x^{\frac{5}{2}} + 3x^2 \\
&= \frac{7}{2}x^{\frac{5}{2}} + 3x^2.
\end{aligned}
\]

9.3 The derivative of the exponential function

To analyse mathematical models involving exponential and logarithmic functions in greater detail, we need to develop rules for calculating the derivative of these functions.

Rule Derivative of the exponential function
\[
\frac{d}{dx}(e^x) = e^x.
\]
The derivative of the exponential function is equal to the function itself.

Example 9.7
If $f(x) = x^2e^x$, then by the product rule
\[
\begin{aligned}
f'(x) &= x^2\frac{d}{dx}(e^x) + e^x\frac{d}{dx}(x^2) \\
&= x^2e^x + e^x(2x) \\
&= xe^x(x + 2).
\end{aligned}
\]
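The product rule can be written once as a small helper and reused. The sketch below is an illustration under stated assumptions: the helper name product_rule and the sample point x = 1,5 are chosen here, and the derivative of each factor is supplied by hand. It applies the rule to the function of Example 9.7 and compares the value with the simplified answer.

```python
import math


def product_rule(f, f_prime, g, g_prime):
    """Return a function computing f(x)*g'(x) + g(x)*f'(x)."""
    return lambda x: f(x) * g_prime(x) + g(x) * f_prime(x)


# f(x) = x^2 with f'(x) = 2x, and g(x) = e^x with g'(x) = e^x
derivative = product_rule(lambda x: x**2, lambda x: 2 * x, math.exp, math.exp)

x = 1.5
closed_form = x * math.exp(x) * (x + 2)   # simplified result of Example 9.7
```

At the chosen point, derivative(x) and closed_form agree up to rounding, as the algebra in the example predicts.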

9.4 The derivative of the logarithmic function

Rule Derivative of ln x
\[
\frac{d}{dx}(\ln x) = \frac{1}{x} \text{ for } x > 0.
\]
The derivative of the ln of x is equal to one over x.

Example 9.8
If $f(x) = x \ln x$, then by the product rule
\[
\begin{aligned}
f'(x) &= x\frac{d}{dx}(\ln x) + (\ln x)\frac{d}{dx}(x) \\
&= x\left(\frac{1}{x}\right) + \ln x \\
&= 1 + \ln x.
\end{aligned}
\]

9.5 Higher-order derivatives

The derivative $f'(x)$ of a function f(x) is also a function. As such, the differentiability of $f'(x)$ may be considered. Therefore, the function $f'(x)$ has a derivative $f''(x)$ at a point x in the domain of $f'(x)$ if the limit of the quotient
\[
\frac{f'(x + h) - f'(x)}{h}
\]
exists as h approaches 0. In other words, $f''(x)$ is the derivative of the first derivative.

The function $f''(x)$ obtained in this manner is called the second derivative of the function f(x), just as the derivative $f'(x)$ is often called the first derivative of f(x). Continuing in this fashion, we are led to considering the third, fourth, and higher-order derivatives of f(x) whenever they exist.

Notations for the first, second, third, and, in general, n-th derivatives of a function f(x) at a point x are
\[
f'(x),\ f''(x),\ f'''(x),\ \ldots,\ f^{(n)}(x)
\]
respectively. If f(x) is written in the form y = f(x), then the notations for its derivatives are
\[
\frac{dy}{dx},\ \frac{d^2y}{dx^2},\ \frac{d^3y}{dx^3},\ \ldots,\ \frac{d^ny}{dx^n}.
\]

Example 9.9
The derivatives of all orders of the polynomial function
\[
f(x) = x^5 - 3x^4 + 4x^3 - 2x^2 + x - 8
\]
are as follows:
\[
\begin{aligned}
f'(x) &= 5x^4 - 12x^3 + 12x^2 - 4x + 1, \\
f''(x) &= \frac{d}{dx}f'(x) = 20x^3 - 36x^2 + 24x - 4, \\
f'''(x) &= \frac{d}{dx}f''(x) = 60x^2 - 72x + 24, \\
f^{(4)}(x) &= \frac{d}{dx}f'''(x) = 120x - 72, \\
f^{(5)}(x) &= \frac{d}{dx}f^{(4)}(x) = 120, \\
f^{(n)}(x) &= 0 \text{ for all } n > 5.
\end{aligned}
\]

Example 9.10
To find the third derivative of the function $f(x) = x^{\frac{2}{3}}$, we proceed as follows:
\[
\begin{aligned}
f'(x) &= \frac{2}{3}x^{-\frac{1}{3}} \\
f''(x) &= \frac{d}{dx}f'(x) \\
&= \frac{2}{3}\left(-\frac{1}{3}\right)x^{-\frac{4}{3}} \\
&= -\frac{2}{9}x^{-\frac{4}{3}},
\end{aligned}
\]
and the required derivative is then
\[
f'''(x) = \left(-\frac{2}{9}\right)\left(-\frac{4}{3}\right)x^{-\frac{7}{3}} = \frac{8}{27}x^{-\frac{7}{3}}.
\]

Just as the derivative of a function f(x) at a point x measures the rate of change of the function at that point, the second derivative of f(x) (the derivative of $f'(x)$) measures the rate of change of the derivative $f'(x)$ of the function f(x). The third derivative of the function, $f'''(x)$, measures the rate of change of $f''(x)$, and so on.
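Higher-order derivatives of a single power of x are obtained by applying the power rule repeatedly. The sketch below retraces Example 9.10 with exact fractions; representing a term c·x^n as the pair (c, n) and the name power_rule are assumptions made for this illustration.

```python
from fractions import Fraction


def power_rule(coefficient, exponent):
    """Differentiate c*x**n once, returning the new (coefficient, exponent)."""
    return coefficient * exponent, exponent - 1


# f(x) = x^(2/3), stored as (coefficient, exponent)
term = (Fraction(1), Fraction(2, 3))
for _ in range(3):
    # Each pass yields the next derivative: f', then f'', then f'''.
    term = power_rule(*term)
```

After three applications, term holds the coefficient 8/27 and the exponent −7/3, matching the third derivative found in the example.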

9.6 The chain rule

Consider the function $k(x) = (x^2 + x + 1)^2$. If we were to compute $k'(x)$ using only the rules of differentiation from the previous sections, then our approach might be to expand k(x). Therefore,
\[
\begin{aligned}
k(x) &= (x^2 + x + 1)^2 \\
&= (x^2 + x + 1)(x^2 + x + 1) \\
&= x^4 + 2x^3 + 3x^2 + 2x + 1,
\end{aligned}
\]
from which we find
\[
k'(x) = 4x^3 + 6x^2 + 6x + 2.
\]

But what about the function $H(x) = (x^2 + x + 1)^{100}$? The same technique may be used to find the derivative of the function H(x), but the amount of work involved in this case would be prodigious! Consider also the function $G(x) = \sqrt{x^2 + 1}$. For each of the two functions H(x) and G(x), the rules of differentiation from the previous sections cannot be applied directly to calculate the derivatives $H'(x)$ and $G'(x)$.

Observe that both H(x) and G(x) are composite functions; that is, each is composed of, or built up from, simpler functions.

Let us consider the function $H(x) = (x^2 + x + 1)^{100}$. If we let $u = f(x) = x^2 + x + 1$ and $y = g(u) = u^{100}$, then
\[
\begin{aligned}
H(x) &= (x^2 + x + 1)^{100} \\
&= [f(x)]^{100} \\
&= u^{100} \\
&= g(u).
\end{aligned}
\]
We can therefore say that H(x) is composed of two functions, namely f(x) and g(u), or equivalently that H(x) is composed of the two functions u and y.

Now consider the function $G(x) = \sqrt{x^2 + 1}$. If we let $u = f(x) = x^2 + 1$ and $y = g(u) = \sqrt{u}$, then
\[
\begin{aligned}
G(x) &= \sqrt{x^2 + 1} \\
&= \sqrt{f(x)} \\
&= \sqrt{u} \\
&= g(u).
\end{aligned}
\]
G(x) is therefore composed of the two functions f(x) and g(u), or equivalently the functions u and y.

We now consider a general composite function y = h(x) = g[f(x)], which can also be represented as y = g(u) and u = f(x). We want to determine the derivative of this composite function.

Since the composite function h(x) is composed of two functions y = g(u) and u = f(x), we suspect that the derivative of h(x), $h'(x)$, will be given by an expression that involves the derivative of y with respect to u, $\dfrac{dy}{du} = g'(u)$, and the derivative of u with respect to x, $\dfrac{du}{dx} = f'(x)$. But how do we combine these derivatives to find the derivative of h(x)?

This question can be answered by the fact that the derivative of each function represents the rate of change of that function. For example, suppose that y = g(u) changes twice as fast as u, that is
\[
g'(u) = \frac{dy}{du} = 2,
\]
and that u = f(x) changes three times as fast as x, that is
\[
f'(x) = \frac{du}{dx} = 3.
\]
Then we would expect y = h(x) to change six times as fast as x, that is
\[
h'(x) = g'(u)f'(x) = 2 \times 3 = 6,
\]
or equivalently
\[
\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}.
\]
This observation suggests the following result:

Rule The chain rule
If h(x) = g[f(x)], then
\[
h'(x) = \frac{d}{dx}g[f(x)] = g'[f(x)]f'(x).
\]
Equivalently, if we write h(x) as y = g(u), where u = f(x), then
\[
\frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx}.
\]
The derivative of a composite function is equal to the derivative of the outer function with respect to the inner function multiplied by the derivative of the inner function, where in g[f(x)] the function f is the inner function and g is the outer function.

Many composite functions h(x) = g[f(x)] have the special form where g is a function to the power n and n is a real number; that is, $h(x) = [f(x)]^n$. In other words, the function h(x) is given by the power of a function f(x). The functions
\[
k(x) = (x^2 + x + 1)^2, \quad H(x) = (x^2 + x + 1)^{100}, \quad G(x) = \sqrt{x^2 + 1}
\]
are examples of this type of composite function. By using the following corollary of the chain rule, we are able to find the derivative of this type of function much more easily than by using the chain rule directly:

Corollary 9.1 The general power rule
If $h(x) = [f(x)]^n$ (n a real number, $n \neq 0$), and f(x) is differentiable, then
\[
h'(x) = \frac{d}{dx}[f(x)]^n = n[f(x)]^{n-1}f'(x).
\]
If h(x) is a composite function consisting of a function to a power, then the derivative of h(x) is the power of the function times the function to the power minus one multiplied by the derivative of the function.

Example 9.11
(a) For $H(x) = (x^2 + x + 1)^{100}$ the derivative is
\[
\begin{aligned}
H'(x) &= 100(x^2 + x + 1)^{100-1}\frac{d}{dx}(x^2 + x + 1) \\
&= 100(x^2 + x + 1)^{99}(2x + 1).
\end{aligned}
\]

(b) The function $G(x) = \sqrt{x^2 + 1}$ can be rewritten as $G(x) = (x^2 + 1)^{\frac{1}{2}}$ and the derivative is
\[
\begin{aligned}
G'(x) &= \frac{1}{2}(x^2 + 1)^{\frac{1}{2}-1}\frac{d}{dx}(x^2 + 1) \\
&= \frac{1}{2}(x^2 + 1)^{-\frac{1}{2}}(2x) \\
&= \frac{x}{\sqrt{x^2 + 1}}.
\end{aligned}
\]

(c) Consider the function $f(x) = x^2(2x + 3)^5$. Applying the product rule, we find
\[
\begin{aligned}
f'(x) &= x^2\frac{d}{dx}(2x + 3)^5 + (2x + 3)^5\frac{d}{dx}(x^2) \\
&= (x^2)5(2x + 3)^4 \cdot \frac{d}{dx}(2x + 3) + (2x + 3)^5(2x) \\
&= 5x^2(2x + 3)^4(2) + 2x(2x + 3)^5 \\
&= 10x^2(2x + 3)^4 + 2x(2x + 3)^5 \\
&= 2x(2x + 3)^4(5x + 2x + 3) \\
&= 2x(7x + 3)(2x + 3)^4.
\end{aligned}
\]

(d) The function $f(x) = \dfrac{x^2 + 1}{x^2 - 1}$ can be rewritten as $f(x) = (x^2 + 1)(x^2 - 1)^{-1}$. Applying the product rule, we find
\[
\begin{aligned}
f'(x) &= (x^2 + 1)\frac{d}{dx}(x^2 - 1)^{-1} + (x^2 - 1)^{-1}\frac{d}{dx}(x^2 + 1) \\
&= (x^2 + 1)\left[-(x^2 - 1)^{-2}\frac{d}{dx}(x^2 - 1)\right] + (x^2 - 1)^{-1}(2x) \\
&= (x^2 + 1)\left[-(x^2 - 1)^{-2}(2x)\right] + (x^2 - 1)^{-1}(2x) \\
&= -2x(x^2 + 1)(x^2 - 1)^{-2} + 2x(x^2 - 1)^{-1} \\
&= -2x(x^2 - 1)^{-1}\left[(x^2 + 1)(x^2 - 1)^{-1} - 1\right] \\
&= -\frac{4x}{(x^2 - 1)^2}.
\end{aligned}
\]
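The chain rule can be checked against a case where the answer is known by other means. For the function k(x) = (x^2 + x + 1)^2 discussed at the start of this section, the derivative obtained by expanding first should agree with the chain rule result. A minimal sketch, with the helper name chain_rule chosen here for illustration:

```python
def chain_rule(g_prime, f, f_prime):
    """Return h'(x) = g'[f(x)] * f'(x) for h(x) = g[f(x)]."""
    return lambda x: g_prime(f(x)) * f_prime(x)


# k(x) = (x^2 + x + 1)^2: inner function f(x) = x^2 + x + 1, outer g(u) = u^2
k_prime = chain_rule(
    lambda u: 2 * u,            # g'(u)
    lambda x: x**2 + x + 1,     # f(x)
    lambda x: 2 * x + 1,        # f'(x)
)


def k_prime_expanded(x):
    """Derivative found by expanding k(x) before differentiating."""
    return 4 * x**3 + 6 * x**2 + 6 * x + 2
```

Both functions return the same value for every x tried, for instance 18 at x = 1. This is the general power rule with n = 2.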

Once again we use the chain rule to enable us to differentiate composite exponential functions of the form $h(x) = e^{f(x)}$.

Rule The chain rule for exponential functions
If $h(x) = e^{f(x)}$ and f(x) is differentiable, then
\[
h'(x) = \frac{d}{dx}\left(e^{f(x)}\right) = e^{f(x)}f'(x).
\]
The derivative of a composite exponential function is the same exponential function multiplied by the derivative of the exponent.

Example 9.12
(a) If $g(t) = e^{(2t^2 + t)}$, then
\[
\begin{aligned}
g'(t) &= e^{(2t^2 + t)} \cdot \frac{d}{dt}(2t^2 + t) \\
&= e^{(2t^2 + t)} \cdot (4t + 1) \\
&= (4t + 1)e^{(2t^2 + t)}.
\end{aligned}
\]

(b) If $y = xe^{-2x}$ we use the product rule followed by the chain rule to find
\[
\begin{aligned}
\frac{dy}{dx} &= x\frac{d}{dx}(e^{-2x}) + e^{-2x}\frac{d}{dx}(x) \\
&= xe^{-2x}\frac{d}{dx}(-2x) + e^{-2x}(1) \\
&= xe^{-2x}(-2) + e^{-2x} \\
&= e^{-2x}(1 - 2x).
\end{aligned}
\]

132

DSC2606 9.7. EXERCISES

Once again we use the chain rule to differentiate composite logarithmic functions of the form h(x) = ln f (x), where f (x) is a positive, differentiable function.

Rule The chain rule for logarithmic functions If h(x) = ln f (x) and f (x) is a differentiable function, then h′ (x) =

d f ′ (x) [ln f (x)] = for f (x) > 0. dx f (x)

The derivative of a logarithmic function containing a function is the derivative of this function divided by the function.

Example 9.13
(a) If $f(x) = \ln(x^2 + 1)$, then

$$f'(x) = \frac{\frac{d}{dx}(x^2 + 1)}{x^2 + 1} = \frac{2x}{x^2 + 1}.$$

(b) If $h(t) = 2\ln t^5$, then

$$\begin{aligned}
h'(t) &= 2 \cdot \frac{\frac{d}{dt}(t^5)}{t^5} \\
&= 2 \cdot \frac{5t^4}{t^5} \\
&= \frac{10}{t}.
\end{aligned}$$
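As a cross-check on part (b), the logarithm can be simplified before differentiating. This is only an alternative route, valid for t > 0 where the logarithm is defined, and it should agree with the chain-rule result:

```latex
\begin{align*}
h(t)  &= 2\ln t^{5} = 10\ln t \qquad (t > 0),\\
h'(t) &= \frac{10}{t}.
\end{align*}
```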

9.7 Exercises

1. Find the derivative of each of the following functions using the rules of differentiation:
(a) $f(x) = 1{,}5x^2 + 2x^{1{,}5}$
(b) $g(x) = 2\sqrt{x} + \dfrac{3}{\sqrt{x}}$
(c) $f(t) = -(2t^2 - 1)^{-1/2}$
(d) $k(x) = (1 - 2x^3)\,e^{(x^2 - 3)}$

2. What is the slope of the tangent line to the graph of $f(x) = (x^2 + 1)(2x^3 - 3x^2 + 1)$ at the point (2; 25)? How fast is the function $f(x)$ changing at $x = 2$?

3. The sales, in millions of rand, of a laser disc recording of a hit movie $t$ years from the date of release are given by

$$S(t) = \frac{5t}{t^2 + 1}.$$

(a) Find the rate at which sales are changing at time $t$.
(b) (i) How fast are sales changing at the time the laser discs are released ($t = 0$)?
(ii) How fast are sales changing two years from the date of release?

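9.8 Solutions to exercises

1. (a) The derivative of $f(x)$ is

$$\begin{aligned}
f'(x) &= \frac{d}{dx}(1{,}5x^2) + \frac{d}{dx}(2x^{1{,}5}) \\
&= 1{,}5(2x) + 2(1{,}5x^{0{,}5}) \\
&= 3x + 3\sqrt{x}.
\end{aligned}$$

(b) The derivative of $g(x)$ is

$$\begin{aligned}
g'(x) &= \frac{d}{dx}(2x^{1/2}) + \frac{d}{dx}(3x^{-1/2}) \\
&= 2\left(\tfrac{1}{2}x^{-1/2}\right) + 3\left(-\tfrac{1}{2}x^{-3/2}\right) \\
&= x^{-1/2} - \tfrac{3}{2}x^{-3/2} \\
&= \frac{1}{\sqrt{x}} - \frac{3}{2x^{3/2}}.
\end{aligned}$$

(c) The derivative of $f(t)$ is

$$\begin{aligned}
f'(t) &= \frac{d}{dt}\left[-(2t^2 - 1)^{-1/2}\right] \\
&= -\left[-\tfrac{1}{2}(2t^2 - 1)^{-3/2}\right]\frac{d}{dt}(2t^2 - 1) \\
&= \tfrac{1}{2}(2t^2 - 1)^{-3/2}(4t) \\
&= \frac{2t}{(2t^2 - 1)^{3/2}}.
\end{aligned}$$

(d) The derivative of $k(x)$ is

$$\begin{aligned}
k'(x) &= (1 - 2x^3)\,\frac{d}{dx}\left(e^{(x^2 - 3)}\right) + e^{(x^2 - 3)}\,\frac{d}{dx}(1 - 2x^3) \\
&= (1 - 2x^3)\,e^{(x^2 - 3)}\,\frac{d}{dx}(x^2 - 3) + e^{(x^2 - 3)}(-6x^2) \\
&= (1 - 2x^3)\,e^{(x^2 - 3)}(2x) - 6x^2 e^{(x^2 - 3)} \\
&= 2x\,e^{(x^2 - 3)}(1 - 2x^3 - 3x).
\end{aligned}$$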

2. The slope of the tangent line to the graph of $f(x)$ at any point $x$ is given by

$$\begin{aligned}
f'(x) &= (x^2 + 1)\,\frac{d}{dx}(2x^3 - 3x^2 + 1) + (2x^3 - 3x^2 + 1)\,\frac{d}{dx}(x^2 + 1) \\
&= (x^2 + 1)(6x^2 - 6x) + (2x^3 - 3x^2 + 1)(2x).
\end{aligned}$$

The slope of the tangent line to the graph of $f(x)$ at $x = 2$ is

$$f'(2) = \left(2^2 + 1\right)\left[6(2^2) - 6(2)\right] + \left[2(2^3) - 3(2^2) + 1\right]\left[2(2)\right] = 60 + 20 = 80.$$

We conclude that the function $f(x)$ is changing at the rate of 80 units per unit change in $x$ at $x = 2$. (Note that it is not necessary to simplify the expression for $f'(x)$, since we are required only to evaluate the expression at $x = 2$.)

3.

(a) The rate at which sales are changing at time $t$ is given by $S'(t)$. We rewrite the function as $S(t) = 5t(t^2 + 1)^{-1}$ and use the product rule to differentiate. Then

$$\begin{aligned}
S'(t) &= 5t\,\frac{d}{dt}(t^2 + 1)^{-1} + (t^2 + 1)^{-1}\,\frac{d}{dt}(5t) \\
&= 5t\left[-(t^2 + 1)^{-2} \times 2t\right] + (t^2 + 1)^{-1} \times 5 \\
&= \frac{-10t^2}{(t^2 + 1)^2} + \frac{5}{t^2 + 1} \\
&= \frac{-10t^2 + 5(t^2 + 1)}{(t^2 + 1)^2} \\
&= \frac{5(1 - t^2)}{(t^2 + 1)^2}.
\end{aligned}$$

(b) (i) The rate at which sales are changing at the time the laser discs are released is given by

$$S'(0) = \frac{5(1 - 0)}{(0 + 1)^2} = 5,$$

that is, sales are increasing at the rate of R5 million per year.

(ii) Two years from the date of release, sales are changing at the rate of

$$S'(2) = \frac{5(1 - 4)}{(4 + 1)^2} = -\frac{3}{5} = -0{,}6,$$

that is, decreasing at the rate of R600 000 per year.
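The two rates in solution 3(b) can be reproduced by evaluating the derived formula directly. The sketch below is illustrative only; the function names `sales` and `sales_rate` are assumptions, and the extra evaluation at t = 1 is added to show where the rate changes sign.

```python
# S(t) = 5t / (t^2 + 1), in millions of rand, t in years since release.
def sales(t):
    return 5 * t / (t**2 + 1)

# S'(t) = 5(1 - t^2) / (t^2 + 1)^2, as derived in solution 3(a).
def sales_rate(t):
    return 5 * (1 - t**2) / (t**2 + 1) ** 2

# Evaluate the rate at release, after one year and after two years.
for t in (0, 1, 2):
    print(t, sales_rate(t))
```

The rate is positive at t = 0, zero at t = 1 and negative at t = 2, which is consistent with sales rising at first and declining afterwards.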

Chapter 10

Properties of functions and sketching graphs

Contents
10.1 Introduction
10.2 Increasing and decreasing functions
10.3 Relative and absolute extrema
10.4 Concavity
10.5 The second derivative test
10.6 Asymptotes
10.7 Sketching graphs
10.8 Exercises
10.9 Solutions to exercises

Sections from prescribed book, Winston
Chapter 11, Section 11.3
Chapter 11, Section 11.4

Learning objectives
After completing this study unit you should be able to
• determine where a function is increasing/decreasing
• determine the stationary points of a function
• determine relative and absolute extrema of a function
• understand what is meant by a concave/convex function
• determine where a function is concave/convex
• understand the concept of an inflection point of a function
• determine horizontal and vertical asymptotes
• sketch the graph of a function.

10.1 Introduction

This study unit further explores the power of the derivative. The derivative is used to analyse the properties of functions. The information obtained can then be used to sketch graphs of functions.

10.2 Increasing and decreasing functions

The graph in Figure 10.1 shows the fuel economy of a car as a function $f(x)$ of its speed $x$. Observe that the fuel economy, $f(x)$, of the car improves as the speed of the car, $x$, increases from 0 to 60, and then drops as the speed increases beyond 60. We use the terms increasing and decreasing to describe the behaviour of this function as we move from left to right along its graph.

[Figure 10.1: Fuel economy of a car]

A function $f(x)$ is increasing on an interval $(a; b)$ if, for any two numbers $x_1$ and $x_2$ in $(a; b)$, $f(x_1) < f(x_2)$ for $x_1 < x_2$. This is illustrated in Figure 10.2.

[Figure 10.2: Increasing function on (a; b)]

A function $f(x)$ is decreasing on an interval $(a; b)$ if, for any two numbers $x_1$ and $x_2$ in $(a; b)$, $f(x_1) > f(x_2)$ for $x_1 < x_2$. This is illustrated in Figure 10.3.

[Figure 10.3: Decreasing function on (a; b)]

We say that $f(x)$ is increasing at a point $x = c$ if there exists an interval $(a; b)$ containing $c$ such that $f(x)$ is increasing on $(a; b)$. Similarly, we say that $f(x)$ is decreasing at a point $x = c$ if there exists an interval $(a; b)$ containing $c$ such that $f(x)$ is decreasing on $(a; b)$.

Since the rate of change of a function at a point $x = c$ is given by the derivative of the function at that point, the derivative can be used to determine the intervals where a differentiable function is increasing or decreasing. Indeed, as we saw earlier, the derivative of a function at a point measures both the slope of the tangent line to the graph of the function at that point and the rate of change of the function at the same point. In fact, at a point where the derivative is positive, the slope of the tangent line to the graph is positive and the function is increasing. This is illustrated in Figure 10.4(a). At a point where the derivative is negative, the slope of the tangent line to the graph is negative and the function is decreasing. This is illustrated in Figure 10.4(b).

[Figure 10.4: Increasing/decreasing function and the slope of the tangent line, panels (a) and (b)]

These observations lead to the following important theorem:

Theorem 10.1
1. If $f'(x) > 0$ for each value of $x$ in an interval $(a; b)$, then $f(x)$ is increasing on $(a; b)$.
2. If $f'(x) < 0$ for each value of $x$ in an interval $(a; b)$, then $f(x)$ is decreasing on $(a; b)$.
3. If $f'(x) = 0$ for each value of $x$ in an interval $(a; b)$, then $f(x)$ is constant on $(a; b)$.

Example 10.1
Determine the interval where the function $f(x) = x^2$ is increasing and the interval where it is decreasing.

Solution
The derivative is $f'(x) = 2x$. Since $f'(x) > 0$ for $x > 0$ and $f'(x) < 0$ for $x < 0$, we can say that $f(x)$ is increasing on the interval $(0; \infty)$ and decreasing on the interval $(-\infty; 0)$. This is confirmed by the graph in Figure 10.5.

[Figure 10.5: Graph of $f(x) = x^2$]

Example 10.2
Determine the intervals where the function $f(x) = x^3 - 3x^2 - 24x + 32$ is increasing and where it is decreasing.

Solution
The derivative is $f'(x) = 3x^2 - 6x - 24 = 3(x + 2)(x - 4)$. If we set $f'(x) = 0$, then we find $x = -2$ and $x = 4$. These points divide the real line into the intervals $(-\infty; -2)$, $(-2; 4)$ and $(4; \infty)$. To determine the signs of $f'(x)$ in these intervals, we calculate the value of $f'(x)$ at a convenient test point in each interval. The results are shown in the following table:

Interval       Test point c    f'(c)    Sign of f'(x)
(−∞; −2)       −3              21       +
(−2; 4)        0               −24      −
(4; ∞)         5               21       +

Using these results, we conclude that $f(x)$ is increasing on the intervals $(-\infty; -2)$ and $(4; \infty)$ and decreasing on the interval $(-2; 4)$. This is confirmed by the graph in Figure 10.6.
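The test-point procedure of Example 10.2 can be sketched in a few lines of code. This is an illustrative sketch rather than a prescribed method; the names `f_prime`, `classify` and `sign_chart` are assumptions, and the test points are the ones chosen in the example.

```python
# f'(x) = 3x^2 - 6x - 24 = 3(x + 2)(x - 4), from Example 10.2.
def f_prime(x):
    return 3 * x**2 - 6 * x - 24

def classify(slope):
    """Translate the sign of the derivative into the behaviour of f."""
    if slope > 0:
        return "increasing"
    if slope < 0:
        return "decreasing"
    return "stationary"

def sign_chart(derivative, test_points):
    """Evaluate the derivative at one test point per interval."""
    return [(c, derivative(c), classify(derivative(c))) for c in test_points]

# One test point in each of (-inf; -2), (-2; 4) and (4; inf).
for row in sign_chart(f_prime, [-3, 0, 5]):
    print(row)
```

Because a continuous derivative can only change sign at a stationary point, a single test point per interval is enough to fix the sign on the whole interval.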

[Figure 10.6: Graph of $f(x) = x^3 - 3x^2 - 24x + 32$]

10.3 Relative and absolute extrema

In addition to helping us determine where the graph of a function is increasing and decreasing, the first derivative may be used to help us locate certain “high points” and “low points” on the graph of $f(x)$. Knowing these points is invaluable in sketching the graphs of functions and solving optimisation problems. These “high points” and “low points” correspond to the relative (local) maxima and relative minima of a function. They are so called because they are the highest or the lowest points when compared to points nearby.

Definition 10.1 A function f (x) has a relative maximum at x = c if there exists an open interval (a; b) containing c such that f (x) ≤ f (c) for all x in (a; b).

Geometrically, this means that there is some interval containing $x = c$ such that no point on the graph of $f(x)$ with its $x$-coordinate in that interval lies higher than the point $(c; f(c))$; that is, $f(c)$ is the largest value of $f(x)$ in some interval around $x = c$. The graph of a function $f(x)$ is given in Figure 10.7 and it shows that $f(x)$ has a relative maximum at $x = x_1$ and another at $x = x_3$.

Definition 10.2 A function $f(x)$ has a relative minimum at $x = c$ if there exists an open interval $(a; b)$ containing $c$ such that $f(x) \geq f(c)$ for all $x$ in $(a; b)$.

From Figure 10.7 we see that $f(x)$ has a relative minimum at $x = x_2$ and $x = x_4$.

[Figure 10.7: Relative maxima and minima of function f(x)]

We can use the derivative of a function to determine where these relative extrema occur. Examine the graph of a relative maximum point of the differentiable function $f(x)$ as illustrated in Figure 10.8. We see that the tangent lines on the interval $(a; c)$ have positive slopes, that is, $f'(x) > 0$ on $(a; c)$, while the tangent lines on the interval $(c; b)$ have negative slopes, that is, $f'(x) < 0$ on $(c; b)$. Therefore, the tangent line has to be horizontal at the point $x = c$, that is, $f'(c) = 0$.

[Figure 10.8: Relative maximum of function f(x) at x = c]

The graph in Figure 10.9 shows a relative minimum of the differentiable function $f(x)$.

[Figure 10.9: Relative minimum of function f(x) at x = c]

Using the same argument as above, we conclude that the slope of the tangent line at the relative minimum point $x = c$ can only be zero. Therefore, $f'(c) = 0$ at any relative extreme point $x = c$ of a differentiable function $f(x)$.

However, the converse of this statement is not true. The fact that $f'(c) = 0$ does not necessarily imply that there is a relative extremum at $x = c$. For example, consider the function $f(x) = x^3$. Here $f'(x) = 3x^2$, so $f'(0) = 0$. If we look at the graph of $f(x) = x^3$ in Figure 10.10, we see that $f(x)$ has neither a relative maximum nor a relative minimum at $x = 0$, but has an inflection point at $x = 0$. This will be discussed later on.

[Figure 10.10: Graph of function $f(x) = x^3$]

So far we have assumed that the function under consideration is differentiable at the point that gives rise to a relative extremum. The function $f(x) = |x|$ demonstrates that a relative extremum may exist at a point where the derivative does not exist. The graph of $f(x) = |x|$ is given in Figure 10.11. This function is not differentiable at the point $x = 0$, but there is obviously a relative minimum at $x = 0$.

[Figure 10.11: Graph of function $f(x) = |x|$]

Any point in the domain of a function $f(x)$ that may give rise to a relative extremum is called a stationary (or critical) point. Observe the graph of a function $f(x)$ in Figure 10.12. The function is differentiable at points $a$, $b$ and $c$, and $f'(x) = 0$ at $a$, $b$ and $c$. The function is not differentiable at point $x = d$ as the graph has a corner at this point. The function is not differentiable at $x = e$ as the tangent line is vertical at this point. The points $x = a$, $x = b$ and $x = d$ give rise to relative extrema while the points $x = c$ and $x = e$ do not. All these points are called stationary points of the function.

[Figure 10.12: Stationary points of function f(x), showing horizontal tangents, a corner and a vertical tangent]

Definition 10.3 A stationary (or critical) point of a function f (x) is any point x in the domain of f (x) where f ′ (x) = 0, or where f ′ (x) does not exist.

The graph of the function $f(t)$ in Figure 10.13 shows the average age of cars in use in the United States from the beginning of 1946 ($t = 0$) to the beginning of 1990 ($t = 44$).

[Figure 10.13: Age of cars in USA from 1946 to 1990, with the absolute maximum and absolute minimum marked]

Observe that the highest average age of cars during this period is nine years, whereas the lowest average age of cars in use during the same period is $5\frac{1}{2}$ years. The number 9, the largest value of $f(t)$ for all values of $t$ in the interval $[0; 44]$ (the domain of $f(t)$), is called the absolute maximum value of $f(t)$ on that interval. The number $5\frac{1}{2}$, the smallest value of $f(t)$ for all values of $t$ in $[0; 44]$, is called the absolute minimum value of $f(t)$ on that interval. Notice that the absolute maximum is attained at the endpoint $t = 0$ of the interval, whereas the absolute minimum value is attained at the two interior points $t = 12$ and $t = 23$ (corresponding to 1958 and 1969 respectively).


Definition 10.4
• If $f(x) \leq f(c)$ for all $x$ in the domain of $f(x)$, then $f(c)$ is called the absolute maximum value of $f(x)$.
• If $f(x) \geq f(c)$ for all $x$ in the domain of $f(x)$, then $f(c)$ is called the absolute minimum value of $f(x)$.

A continuous function defined on an arbitrary interval does not always have an absolute maximum or an absolute minimum. However, an important case often arises in practical applications where both the absolute maximum and the absolute minimum of a function are guaranteed to exist. This occurs when a continuous function is defined on a closed interval.

Property 10.1 If a function f (x) is continuous on a closed interval [a; b], then f (x) has both an absolute maximum value and an absolute minimum value on [a; b].

The graph in Figure 10.14 illustrates a typical situation.

[Figure 10.14: Relative and absolute extrema of function f(x) on [a; b]]

Here $x_1$, $x_2$ and $x_3$ are stationary points of $f(x)$. The absolute minimum of $f(x)$ occurs at $x_3$. The absolute maximum of $f(x)$ occurs at $b$, an endpoint.

Example 10.3
Find the absolute extrema of the function $f(x) = x^3 - 2x^2 - 4x + 4$, defined on the interval $[0; 3]$.


Solution
The function is a polynomial and is therefore continuous on the closed interval $[0; 3]$. The derivative is

$$f'(x) = 3x^2 - 4x - 4 = (3x + 2)(x - 2).$$

The stationary points are obtained from $f'(x) = 0$. Then

$$f'(x) = 0 \;\Rightarrow\; (3x + 2)(x - 2) = 0 \;\Rightarrow\; x = -\frac{2}{3} \text{ and } x = 2.$$

Since the point $x = -\frac{2}{3}$ lies outside the interval $[0; 3]$, it is dropped from further consideration. Now we evaluate the function at the stationary point ($x = 2$) and at the endpoints ($x = 0$ and $x = 3$), obtaining

$$\begin{aligned}
f(2) &= 2^3 - 2(2)^2 - 4(2) + 4 = -4 \\
f(0) &= 4 \\
f(3) &= 1.
\end{aligned}$$

From these results we conclude that the absolute minimum of $f(x)$ occurs at $x = 2$ and the absolute maximum occurs at $x = 0$. The graph in Figure 10.15 confirms our results.
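The procedure used in Example 10.3 (evaluate the function at the endpoints and at the stationary points inside the interval, then compare) can be sketched as follows. The function name `absolute_extrema` and its signature are assumptions made for illustration; the stationary points are supplied by hand, as in the worked solution.

```python
# f(x) = x^3 - 2x^2 - 4x + 4 on [0; 3], from Example 10.3.
def f(x):
    return x**3 - 2 * x**2 - 4 * x + 4

def absolute_extrema(func, stationary_points, a, b):
    """Compare func at the endpoints and at stationary points inside (a, b).

    Assumes func is continuous on [a, b] so that both extrema exist.
    Returns ((x_min, f_min), (x_max, f_max)).
    """
    candidates = [a, b] + [c for c in stationary_points if a < c < b]
    values = {c: func(c) for c in candidates}
    x_min = min(values, key=values.get)
    x_max = max(values, key=values.get)
    return (x_min, values[x_min]), (x_max, values[x_max])

# Stationary points from f'(x) = (3x + 2)(x - 2) = 0; -2/3 is filtered out.
print(absolute_extrema(f, [-2 / 3, 2], 0, 3))
```

The filter `a < c < b` plays the role of dropping $x = -\frac{2}{3}$ from further consideration.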

[Figure 10.15: Graph of function $f(x) = x^3 - 2x^2 - 4x + 4$ on [0; 3], with the absolute maximum and absolute minimum marked]

10.4 Concavity

Consider the graphs of the population of the world and the USA over a 50-year time period as shown in Figure 10.16.

[Figure 10.16: Population of world and USA over 50 years, 1950 to 2000. Panel (a): World population in billions. Panel (b): USA population in millions.]

In Figure 10.16(a), the graph opens upwards, whereas the graph in Figure 10.16(b) opens downwards. What is the significance of this? Let us look at the slopes of tangent lines to points on each graph as shown in Figure 10.17.

[Figure 10.17: Slope of tangent lines to population graphs, panels (a) and (b)]

We see that the slopes of the tangent lines to the graph in Figure 10.17(a) are increasing as we move from left to right. Since the slope of the tangent line to the graph at a point on the graph measures the rate of change of the function at that point, we conclude that the world population is not only increasing till the year 2000, but is increasing at an increasing pace. A similar analysis of the graph in Figure 10.17(b) reveals that the USA population is increasing, but at a decreasing pace.

The shape of a curve can be described using the notion of concavity. If a function $f(x)$ is differentiable on an interval $(a; b)$, then $f(x)$ is concave upward (also called convex) on $(a; b)$ if $f'(x)$ is increasing on $(a; b)$. Geometrically, a curve is concave upward (convex) if it lies above its tangent lines, as can be seen from the graph in Figure 10.18.

[Figure 10.18: Graph of a convex function f(x)]

Similarly, if a function $f(x)$ is differentiable on an interval $(a; b)$, then $f(x)$ is concave downward (also called concave) on $(a; b)$ if $f'(x)$ is decreasing on $(a; b)$. A curve is concave downward (concave) if it lies below its tangent lines, as can be seen from the graph in Figure 10.19.

[Figure 10.19: Graph of a concave function f(x)]

We can use the second derivative $f''(x)$ to determine the concavity of a function $f(x)$. Recall that $f''(x)$ measures the rate of change of $f'(x)$ at the point $x$. Therefore, if $f''(x) > 0$ on an interval $(a; b)$, the slopes of the tangent lines to the graph of $f(x)$ are increasing on $(a; b)$ and we say that $f(x)$ is convex on $(a; b)$. Similarly, if $f''(x) < 0$ on $(a; b)$, then $f(x)$ is concave on $(a; b)$.

Theorem 10.2
1. If $f''(x) > 0$ for each value of $x$ in an interval $(a; b)$, then $f(x)$ is convex on $(a; b)$.
2. If $f''(x) < 0$ for each value of $x$ in an interval $(a; b)$, then $f(x)$ is concave on $(a; b)$.

Example 10.4
Determine the intervals where the function $f(x) = x^3 - 3x^2 - 24x + 32$ is convex and where it is concave.


Solution
The derivative is $f'(x) = 3x^2 - 6x - 24$.

The second derivative is $f''(x) = 6x - 6$. If we set $f''(x) = 0$, then we find $x = 1$. This point divides the real line into two intervals, $(-\infty; 1)$ and $(1; \infty)$. We determine the signs of $f''(x)$ in these intervals by calculating $f''(x)$ at a convenient test point in each interval. The results are as follows:

Select $x = 0$ in the interval $(-\infty; 1)$. Then $f''(0) = -6 < 0 \;\Rightarrow\; f(x)$ is concave.

Select $x = 2$ in the interval $(1; \infty)$. Then $f''(2) = 6(2) - 6 = 6 > 0 \;\Rightarrow\; f(x)$ is convex.

Therefore, the function is concave over the interval $(-\infty; 1)$ and convex over the interval $(1; \infty)$. The graph in Figure 10.20 confirms this.

[Figure 10.20: Graph of function $f(x) = x^3 - 3x^2 - 24x + 32$]

From this example we see that at the point x = 1 the function changes from being concave to convex. This point is called an inflection point of the function.

Definition 10.5 The point on the graph of a differentiable function f (x) at which the concavity changes, is called an inflection point.


We saw earlier that the graph of a convex function lies above its tangent lines and the graph of a concave function lies below its tangent lines. At a point of inflection, the graph of the function crosses its tangent line. This is illustrated by the graphs in Figure 10.21.

[Figure 10.21: Inflection points, showing graphs that change between convex and concave]

Inflection points may be identified by setting $f''(x) = 0$. The resulting points are only candidates for inflection points, but are not necessarily inflection points. This means that a point that is not an inflection point, say point $c$ in the domain of a function, may also have $f''(c) = 0$. For example, $f(x) = x^4$ has $f''(0) = 0$, but the function is convex on both sides of $x = 0$, so there is no inflection point there.

Example 10.5
The total sales, in thousands of rand, of the Arctic Air Corporation are related to the amount of money $x$, in thousands of rand, the company spends on advertising its products by the function

$$S(x) = -0{,}01x^3 + 1{,}5x^2 + 200, \quad \text{where } 0 \leq x \leq 100.$$

Find the inflection points of the function.

Solution
The first and second derivatives are

$$\begin{aligned}
S'(x) &= -0{,}03x^2 + 3x \\
S''(x) &= -0{,}06x + 3.
\end{aligned}$$

If we set $S''(x) = 0$, then we find $x = 50$. This is the only candidate for an inflection point. Now $S''(x) > 0$ for $x < 50$ and $S''(x) < 0$ for $x > 50$, which means that $S(x)$ changes from convex to concave at $x = 50$. The point $(x; S(x)) = (50; 2\,700)$ is therefore an inflection point of the function $S(x)$.

[Figure 10.22: Inflection point of function $S(x) = -0{,}01x^3 + 1{,}5x^2 + 200$, with the point (50; 2 700) marked]

The graph of function $S(x)$ is given in Figure 10.22. To understand the significance of this inflection point, observe that the total sales increase rather slowly at first, but as more money is spent on advertising, the total sales increase rapidly. This rapid increase reflects the effectiveness of the company’s advertisements. However, a point is soon reached after which any additional advertising expenditure results in increased sales but at a slower rate of increase. This point, commonly known as the point of diminishing returns, is the point of inflection of the function $S(x)$.
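The idea that a zero of the second derivative is only a candidate for an inflection point can be illustrated numerically. The sketch below checks whether the second derivative changes sign across the candidate; the helper name `changes_sign` and the offset `delta` are assumptions, and the second test function (with second derivative 12x², that of x⁴) is included only as a contrasting case.

```python
# S(x) = -0.01x^3 + 1.5x^2 + 200 and S''(x) = -0.06x + 3, from Example 10.5.
def sales(x):
    return -0.01 * x**3 + 1.5 * x**2 + 200

def sales_second(x):
    return -0.06 * x + 3

def changes_sign(second_derivative, c, delta=1e-3):
    """Return True if the second derivative has opposite signs just left and right of c."""
    left = second_derivative(c - delta)
    right = second_derivative(c + delta)
    return left * right < 0

# Candidate x = 50 for S(x): concavity changes, so it is an inflection point.
print(changes_sign(sales_second, 50), sales(50))

# Candidate x = 0 for x^4, whose second derivative is 12x^2: no sign change.
print(changes_sign(lambda x: 12 * x**2, 0))
```

A numerical check like this depends on the choice of `delta` and is no substitute for the sign analysis in the worked solution; it is only a quick way to test a candidate.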

Convex and concave functions play an important role in the study of nonlinear programming, and it is essential that you understand these concepts. Refer to Winston, Section 11.3 and study the parts that deal with functions of one variable (omit functions of several variables). You will find the exact page references in Tutorial Letter 101.

10.5 The second derivative test

We will now show how the second derivative $f''(x)$ of a function $f(x)$ can be used to help us determine whether a stationary point of $f(x)$ is a relative extremum. Examine the graphs in Figure 10.23.

[Figure 10.23: Concavity and relative extrema, panels (a) and (b)]

The function in Figure 10.23(a) has a relative maximum at point $x = c$. We see that $f(x)$ is concave at this point; therefore, we know that $f''(c) < 0$. The function in Figure 10.23(b) has a relative minimum at point $x = c$; $f(x)$ is convex at this point, and $f''(c) > 0$.

Based on these observations, we formulate the second derivative test for functions with continuous second derivatives on an interval around $c$ as follows:

Step 1: Calculate $f'(x)$ and $f''(x)$.
Step 2: Find all the stationary points of $f(x)$ by solving $f'(x) = 0$.
Step 3: Evaluate $f''(c)$ for each such stationary point $c$, and
• if $f''(c) < 0$, then $f(x)$ has a relative maximum at $c$
• if $f''(c) > 0$, then $f(x)$ has a relative minimum at $c$
• if $f''(c) = 0$, further investigation is necessary to determine what the nature of the stationary point is.

Example 10.6
Determine the relative extrema of the function $f(x) = x^3 - 3x^2 - 24x + 32$ using the second derivative test.

Solution
The first and second derivatives are

$$\begin{aligned}
f'(x) &= 3x^2 - 6x - 24 \\
f''(x) &= 6x - 6.
\end{aligned}$$

The stationary points are where $f'(x) = 0$. Then

$$\begin{aligned}
3x^2 - 6x - 24 &= 0 \\
3(x + 2)(x - 4) &= 0 \\
x = -2 \text{ and } x &= 4.
\end{aligned}$$

Now we evaluate $f''(x)$ at the stationary points. Then

$$\begin{aligned}
f''(-2) &= 6(-2) - 6 = -18 < 0 \;\Rightarrow\; \text{relative maximum} \\
f''(4) &= 6(4) - 6 = 18 > 0 \;\Rightarrow\; \text{relative minimum.}
\end{aligned}$$

The corresponding $f(x)$ values at the stationary points are

$$\begin{aligned}
f(-2) &= (-2)^3 - 3(-2)^2 - 24(-2) + 32 = 60 \\
f(4) &= -48.
\end{aligned}$$

We conclude that the point $(-2; 60)$ is a relative maximum and the point $(4; -48)$ is a relative minimum of $f(x)$.
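The three steps of the second derivative test can be sketched in code, using the function from Example 10.6. This is a minimal illustration: the function name `second_derivative_test` is an assumption, and the stationary points are entered by hand rather than solved for.

```python
# f(x) = x^3 - 3x^2 - 24x + 32 and f''(x) = 6x - 6, from Example 10.6.
def f(x):
    return x**3 - 3 * x**2 - 24 * x + 32

def f_second(x):
    return 6 * x - 6

def second_derivative_test(second_derivative, stationary_points):
    """Classify each stationary point by the sign of the second derivative."""
    results = {}
    for c in stationary_points:
        curvature = second_derivative(c)
        if curvature < 0:
            results[c] = "relative maximum"
        elif curvature > 0:
            results[c] = "relative minimum"
        else:
            results[c] = "inconclusive"  # further investigation is necessary
    return results

# Stationary points from f'(x) = 3(x + 2)(x - 4) = 0.
for c, kind in second_derivative_test(f_second, [-2, 4]).items():
    print(c, f(c), kind)
```

For a function such as $x^3$, whose second derivative $6x$ vanishes at the stationary point $x = 0$, the same helper returns "inconclusive", which matches the third bullet of Step 3.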


The following table shows the general characteristics of the function $f(x)$ for various possible combinations of the signs of $f'(x)$ and $f''(x)$ on the interval $(a; b)$:

Signs of f'(x) and f''(x)       Properties of the graph of f(x)
f'(x) > 0, f''(x) > 0           f(x) increasing, f(x) convex
f'(x) > 0, f''(x) < 0           f(x) increasing, f(x) concave
f'(x) < 0, f''(x) > 0           f(x) decreasing, f(x) convex
f'(x) < 0, f''(x) < 0           f(x) decreasing, f(x) concave

[Column "General shape of the graph of f(x)": sketches not reproduced]

To emphasise the importance of concepts learned in this section, you should see how they are applied by studying Winston, Section 11.4.

10.6 Asymptotes

It may be helpful to review the material on limits from the left and right and infinite limits covered in Study unit 7 before continuing.

Consider the graph of $f(x) = \dfrac{x + 1}{x - 1}$, given in Figure 10.24. Observe that $f(x)$ increases without bound (tends to infinity) as $x$ approaches 1 from the right; that is

$$\lim_{x \to 1^+} \frac{x + 1}{x - 1} = \infty.$$

You can verify this by taking a sequence of values of $x$ approaching $x = 1$ from the right and looking at the corresponding values of $f(x)$.

Here is another way of looking at the situation. Observe that if $x$ is a number that is a little larger than 1, then both $(x + 1)$ and $(x - 1)$ are positive, so $\dfrac{x + 1}{x - 1}$ is also positive. As $x$ approaches $x = 1$, the numerator $(x + 1)$ approaches the number 2, but the denominator $(x - 1)$ approaches zero, so the quotient $\dfrac{x + 1}{x - 1}$ tends to infinity, as observed earlier.

The line $x = 1$ is called a vertical asymptote of the graph of $f(x)$.

[Figure 10.24: Graph of function $f(x) = \dfrac{x + 1}{x - 1}$]

It can also be shown that

$$\lim_{x \to 1^-} \frac{x + 1}{x - 1} = -\infty,$$

and this tells us that $f(x)$ decreases without bound as $x$ approaches the asymptote $x = 1$ from the left.

Definition 10.6 The line $x = a$ is a vertical asymptote of the graph of a function $f(x)$ if either

$$\lim_{x \to a^+} f(x) = \infty \text{ or } -\infty$$

or

$$\lim_{x \to a^-} f(x) = \infty \text{ or } -\infty.$$

For any rational function $f(x)$ defined by

$$f(x) = \frac{g(x)}{h(x)},$$

where $g(x)$ and $h(x)$ are polynomial functions, the line $x = a$ is a vertical asymptote of the graph of $f(x)$ if $h(a) = 0$ but $g(a) \neq 0$.

For the rational function $f(x) = \dfrac{x + 1}{x - 1}$ considered earlier, $g(x) = x + 1$ and $h(x) = x - 1$. Observe that $h(1) = 0$ but $g(1) = 2 \neq 0$, so $x = 1$ is a vertical asymptote of the graph of $f(x)$.

We now consider the function $f(x) = \dfrac{x^2}{4 - x^2}$. This is a rational function with $g(x) = x^2$ and $h(x) = 4 - x^2$. To find the points where $h(x) = 0$, we solve

$$\begin{aligned}
4 - x^2 &= 0 \\
(2 - x)(2 + x) &= 0 \\
x = -2 \text{ and } x &= 2.
\end{aligned}$$

These are candidates for the vertical asymptotes of the graph of $f(x)$. We now consider $x = -2$, then

$$g(-2) = (-2)^2 = 4 \neq 0 \;\Rightarrow\; x = -2 \text{ is a vertical asymptote.}$$

Similarly, we find

$$g(2) = 4 \neq 0 \;\Rightarrow\; x = 2 \text{ is also a vertical asymptote.}$$
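The check described above (a root of the denominator is a vertical asymptote when the numerator is non-zero there) can be sketched as a small filter. The function name `vertical_asymptotes` is an assumption, and the roots of the denominator are supplied by hand. The second call uses a contrasting rational function, $(x^2 - 1)/(x - 1)$, which is not from the study guide.

```python
def vertical_asymptotes(numerator, denominator_roots):
    """Keep the roots a of h(x) for which g(a) != 0.

    numerator: the polynomial g(x) as a Python function.
    denominator_roots: the solutions of h(x) = 0, found by hand.
    """
    return [a for a in denominator_roots if numerator(a) != 0]

# f(x) = x^2 / (4 - x^2): g(x) = x^2 and h(x) = 0 at x = -2 and x = 2.
print(vertical_asymptotes(lambda x: x**2, [-2, 2]))

# Contrasting case: (x^2 - 1)/(x - 1). Here g(1) = 0, so x = 1 is rejected.
print(vertical_asymptotes(lambda x: x**2 - 1, [1]))
```

In the contrasting case the numerator and denominator share the factor $(x - 1)$, so the graph has a gap at $x = 1$ rather than a vertical asymptote.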

The graph of the function is given in Figure 10.25.

[Figure 10.25: Graph of function $f(x) = \dfrac{x^2}{4 - x^2}$]

We see that $f(x)$ approaches the horizontal line $f(x) = -1$ from below as $x$ tends to infinity, as well as when $x$ tends to minus infinity. The line $f(x) = -1$ is called a horizontal asymptote of the graph of $f(x)$.

Definition 10.7 The line $f(x) = b$ is a horizontal asymptote of the graph of a function $f(x)$ if either

$$\lim_{x \to \infty} f(x) = b \quad \text{or} \quad \lim_{x \to -\infty} f(x) = b.$$
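The limits in Definition 10.7 can be explored numerically by evaluating the function at values of x with large absolute value. The sketch below does this for the function $x^2/(4 - x^2)$ discussed above; the sample points are arbitrary choices, and a table of values only suggests a limit rather than proving it.

```python
# f(x) = x^2 / (4 - x^2), the function shown in Figure 10.25.
def f(x):
    return x**2 / (4 - x**2)

# Evaluate f at increasingly large |x| in both directions.
for x in (10, 100, 1000, -10, -100, -1000):
    print(x, f(x))
```

The printed values stay slightly below -1 and move closer to it as |x| grows, which is consistent with the graph approaching the line $f(x) = -1$ from below.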


For the function $f(x) = \dfrac{x + 1}{x - 1}$ we see that

$$\lim_{x \to \infty} \frac{x + 1}{x - 1} = \lim_{x \to \infty} \frac{1 + \frac{1}{x}}{1 - \frac{1}{x}} = 1,$$

if we divide the numerator and the denominator by $x$. Also,

$$\lim_{x \to -\infty} \frac{x + 1}{x - 1} = \lim_{x \to -\infty} \frac{1 + \frac{1}{x}}{1 - \frac{1}{x}} = 1.$$

In either case, we conclude that $f(x) = 1$ is a horizontal asymptote of the graph of $f(x)$, as observed earlier.

Example 10.7
Calculate the horizontal asymptotes of the graph of the function $f(x) = \dfrac{x^2}{4 - x^2}$.

Solution
We calculate

$$\lim_{x \to \infty} \frac{x^2}{4 - x^2} = \lim_{x \to \infty} \frac{1}{\frac{4}{x^2} - 1} = \frac{1}{-1} = -1.$$

The same value is obtained as $x \to -\infty$. Therefore, $f(x) = -1$ is a horizontal asymptote.

NOTE: A polynomial function of degree one or higher has no horizontal or vertical asymptotes.

10.7 Sketching graphs

We have now seen how the first and second derivatives of a function are used to reveal various properties of the graph of the function. We will now show how this information can be used to help us sketch the graph of the function. We begin by giving a general procedure for sketching graphs of functions $f(x)$ as follows:

Step 1: Determine the domain of $f(x)$.
Step 2: Find the intercepts of $f(x)$ on the axes. The equation $f(x) = 0$ may be difficult to solve, in which case one may need to search for the $x$-intercepts numerically. Numerical search algorithms such as the bisection method and Newton’s method are discussed in the next study unit.
Step 3: Determine the behaviour of $f(x)$ for large absolute values of $x$.
Step 4: Find all horizontal and vertical asymptotes of $f(x)$.
Step 5: Determine the intervals where $f(x)$ is increasing and where it is decreasing.
Step 6: Find the relative extrema of $f(x)$.
Step 7: Determine the concavity of $f(x)$ and find the inflection points of $f(x)$.
Step 8: Plot a few additional points to help further identify the shape of the graph of $f(x)$, and sketch the graph.

Example 10.8
Sketch the graph of the function $y = f(x) = \dfrac{x + 1}{x - 1}$.

Solution

Step 1: The function is not defined for $x = 1$ since division by zero is not allowed. The domain of $f(x)$ is the set of all real numbers other than $x = 1$.

Step 2: The $y$-intercept is found by setting $x = 0$. Then $y = f(0) = -1$. The $x$-intercept is found by setting $y = f(x) = 0$. Then

$$\begin{aligned}
\frac{x + 1}{x - 1} &= 0 \\
x + 1 &= 0 \\
x &= -1.
\end{aligned}$$

Step 3: The behaviour of $f(x)$ for large absolute values of $x$. We saw earlier that

$$\lim_{x \to \infty} \frac{x + 1}{x - 1} = 1 \quad \text{and} \quad \lim_{x \to -\infty} \frac{x + 1}{x - 1} = 1.$$

Therefore, $f(x)$ approaches the line $y = 1$ as $|x|$ becomes large.

Step 4: From the results of step 3, we conclude that $y = 1$ is a horizontal asymptote. We saw earlier that

$$\lim_{x \to 1^+} \frac{x + 1}{x - 1} = \infty \quad \text{and} \quad \lim_{x \to 1^-} \frac{x + 1}{x - 1} = -\infty.$$

Therefore, $x = 1$ is a vertical asymptote.


Step 5: The derivative is

$$\begin{aligned}
f'(x) &= (x + 1)\,\frac{d}{dx}(x - 1)^{-1} + (x - 1)^{-1}\,\frac{d}{dx}(x + 1) \\
&= (x + 1)(-1)(x - 1)^{-2} + (x - 1)^{-1}(1) \\
&= \frac{-(x + 1)}{(x - 1)^2} + \frac{1}{x - 1} \\
&= \frac{-x - 1 + x - 1}{(x - 1)^2} \\
&= \frac{-2}{(x - 1)^2}.
\end{aligned}$$

On the interval $(-\infty; 1)$: $f'(x) < 0 \;\Rightarrow\; f(x)$ is decreasing.
On the interval $(1; \infty)$: $f'(x) < 0 \;\Rightarrow\; f(x)$ is decreasing.

Step 6: We see that $f'(x) \neq 0$ for all values of $x$ in the intervals $(-\infty; 1)$ and $(1; \infty)$. This means that $f(x)$ has no stationary points.

Step 7: The second derivative is

$$\begin{aligned}
f''(x) &= \frac{d}{dx}\left[-2(x - 1)^{-2}\right] \\
&= -2(-2)(x - 1)^{-3} \\
&= \frac{4}{(x - 1)^3}.
\end{aligned}$$

On the interval $(-\infty; 1)$: $f''(x) < 0 \;\Rightarrow\; f(x)$ is concave.
On the interval $(1; \infty)$: $f''(x) > 0 \;\Rightarrow\; f(x)$ is convex.

We see that $f''(x) \neq 0$ for all values of $x$ in the intervals $(-\infty; 1)$ and $(1; \infty)$. This means that there are no inflection points.

Step 8: Points on $f(x)$ are as follows:

x       −5      −4     −3     −2      0,5    1,5    2    3    4       5
f(x)    0,667   0,6    0,5    0,333   −3     5      3    2    1,667   1,5

The graph of $f(x)$ is given in Figure 10.26.
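The table of points in Step 8 can be regenerated with a short script, which is a convenient way to produce extra points when sketching by hand. The list of x-values is the one used in the example; the rounding to three decimals is an assumption made to match the table. Python prints decimal points where the study guide uses decimal commas.

```python
# f(x) = (x + 1) / (x - 1), from Example 10.8 (undefined at x = 1).
def f(x):
    return (x + 1) / (x - 1)

# The x-values used in Step 8; x = 1 is deliberately excluded.
xs = [-5, -4, -3, -2, 0.5, 1.5, 2, 3, 4, 5]

# Pair each x with f(x) rounded to three decimals, as in the table.
table = [(x, round(f(x), 3)) for x in xs]

for x, y in table:
    print(x, y)
```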

Figure 10.26: Graph of function $f(x) = \frac{x+1}{x-1}$

10.8 Exercises

1. Find the intervals where the following function is increasing and where it is decreasing:
$$f(x) = \tfrac{2}{3}x^3 - x^2 - 12x + 3.$$

2. Determine where the following function is concave and where it is convex:
$$f(x) = 4x^3 - 3x^2 + 6.$$

3. Using the second derivative test, find the relative extrema of the function
$$f(x) = 2x^3 - \tfrac{1}{2}x^2 - 12x - 10.$$

4. Find the horizontal and vertical asymptotes of the graph of the function
$$f(x) = \frac{2x^2}{x^2 - 1}.$$

5. Sketch the graph of the function
$$y = f(x) = x^3 - 6x^2 + 9x + 2.$$

10.9 Solutions to exercises

1. The derivative is $f'(x) = 2x^2 - 2x - 12$. If we set f′(x) = 0, we have
$$2x^2 - 2x - 12 = 0 \;\Rightarrow\; x^2 - x - 6 = 0 \;\Rightarrow\; (x+2)(x-3) = 0,$$
so x = −2 and x = 3.

These points divide the real line into the intervals (−∞; −2), (−2; 3) and (3; ∞). To determine the signs of f′(x) in these intervals, we calculate f′(x) at a convenient test point in each interval. The results are as follows:

Select x = −3 in the interval (−∞; −2). Then $f'(-3) = 2(-3)^2 - 2(-3) - 12 = 12 > 0$ ⇒ f(x) is increasing.
Select x = 0 in the interval (−2; 3). Then f′(0) = −12 < 0 ⇒ f(x) is decreasing.
Select x = 4 in the interval (3; ∞). Then f′(4) = 12 > 0 ⇒ f(x) is increasing.

Therefore, f(x) is increasing over (−∞; −2) and (3; ∞), and it is decreasing over (−2; 3).

2. The first and second derivatives are
$$f'(x) = 12x^2 - 6x, \qquad f''(x) = 24x - 6.$$
If we set f′′(x) = 0, we have
$$24x - 6 = 0 \;\Rightarrow\; x = \tfrac{1}{4}.$$
This point divides the real line into two intervals (−∞; 1/4) and (1/4; ∞).

Select x = 0 in the interval (−∞; 1/4). Then f′′(0) = 24(0) − 6 = −6 < 0 ⇒ f(x) is concave.

Select x = 1 in the interval (1/4; ∞). Then f′′(1) = 24(1) − 6 = 18 > 0 ⇒ f(x) is convex.

Therefore, f(x) is concave over (−∞; 1/4), and it is convex over (1/4; ∞).

3. The first and second derivatives are
$$f'(x) = 6x^2 - x - 12, \qquad f''(x) = 12x - 1.$$
The stationary points are obtained from f′(x) = 0. Then
$$6x^2 - x - 12 = 0 \;\Rightarrow\; (3x+4)(2x-3) = 0,$$
so $x = -\tfrac{4}{3}$ and $x = \tfrac{3}{2}$, that is, x = −1,333 and x = 1,5.

Now for x = −1,333 we have f′′(−1,333) = 12(−1,333) − 1 = −17 < 0 ⇒ relative maximum.
The corresponding value of f(x) at x = −1,333 is
$$f(-1{,}333) = 2(-1{,}333)^3 - 0{,}5(-1{,}333)^2 - 12(-1{,}333) - 10 = 0{,}37.$$
Therefore, point (−1,333; 0,37) is a relative maximum.

Now for x = 1,5 we have f′′(1,5) = 12(1,5) − 1 = 17 > 0 ⇒ relative minimum.
The corresponding value of f(x) at x = 1,5 is
$$f(1{,}5) = 2(1{,}5)^3 - 0{,}5(1{,}5)^2 - 12(1{,}5) - 10 = -22{,}375.$$
Therefore, point (1,5; −22,375) is a relative minimum.

4. Let $g(x) = 2x^2$ and $h(x) = x^2 - 1$, then $f(x) = \frac{2x^2}{x^2-1}$ is a rational function.

To find the points where h(x) = 0, we solve
$$x^2 - 1 = 0 \;\Rightarrow\; x = 1 \text{ and } x = -1.$$
These are candidates for the vertical asymptotes. We consider x = 1. Then $g(1) = 2(1)^2 = 2 \neq 0$ ⇒ x = 1 is a vertical asymptote.
If we consider x = −1, then g(−1) = 2 ≠ 0, which means that x = −1 is also a vertical asymptote.

We can write
$$\lim_{x\to\infty} \frac{2x^2}{x^2 - 1} = \lim_{x\to\infty} \frac{2}{1 - \frac{1}{x^2}} = 2.$$
This means that the line y = 2 is a horizontal asymptote. Therefore, the graph of the function has vertical asymptotes at x = ±1 and a horizontal asymptote at y = 2.

5. The following information is obtained:

Step 1: The domain of f(x) is the interval (−∞; ∞).

Step 2: The y-intercept is found by setting x = 0 and this is at y = 2. The x-intercept is found by setting y = f(x) = 0. In this case, it is a cubic equation. The solution to this equation can be found by applying a numerical search procedure. This is discussed in the next study unit.

Step 3: Since
$$\lim_{x\to-\infty} f(x) = \lim_{x\to-\infty} (x^3 - 6x^2 + 9x + 2) = -\infty$$
and
$$\lim_{x\to\infty} f(x) = \lim_{x\to\infty} (x^3 - 6x^2 + 9x + 2) = \infty,$$
we see that f(x) decreases without bound as x tends to minus infinity, and f(x) increases without bound as x tends to infinity.

Step 4: Since f(x) is a polynomial, there are no asymptotes.

Step 5: The first derivative is $f'(x) = 3x^2 - 12x + 9$. If we set f′(x) = 0, we have
$$3x^2 - 12x + 9 = 0 \;\Rightarrow\; 3(x^2 - 4x + 3) = 0 \;\Rightarrow\; 3(x-1)(x-3) = 0,$$
so x = 1 and x = 3. These points divide the line into the intervals (−∞; 1), (1; 3) and (3; ∞).

Select x = 0 in the interval (−∞; 1). Then f′(0) = 9 > 0 ⇒ f(x) is increasing.
Select x = 2 in the interval (1; 3). Then $f'(2) = 3(2)^2 - 12(2) + 9 = -3 < 0$ ⇒ f(x) is decreasing.
Select x = 4 in the interval (3; ∞). Then $f'(4) = 3(4)^2 - 12(4) + 9 = 9 > 0$ ⇒ f(x) is increasing.

Step 6: The second derivative is f′′(x) = 6x − 12. The stationary points, x = 1 and x = 3, were found by setting f′(x) = 0 in Step 5 above. Evaluating the second derivative at the stationary points gives
f′′(1) = 6(1) − 12 = −6 < 0 ⇒ relative maximum
f′′(3) = 6(3) − 12 = 6 > 0 ⇒ relative minimum.
The corresponding f(x) values are
$$f(1) = 1^3 - 6(1)^2 + 9(1) + 2 = 6, \qquad f(3) = 3^3 - 6(3)^2 + 9(3) + 2 = 2.$$
The point (1; 6) is a relative maximum and the point (3; 2) is a relative minimum.

Step 7: If we set f′′(x) = 0, then we have 6x − 12 = 0 ⇒ x = 2.
This point is a candidate for an inflection point and it divides the real line into two intervals (−∞; 2) and (2; ∞).
Select x = 0 in the interval (−∞; 2). Then f′′(0) = 6(0) − 12 = −12 < 0 ⇒ f(x) is concave.
Select x = 3 in the interval (2; ∞). Then f′′(3) = 6(3) − 12 = 6 > 0 ⇒ f(x) is convex.
We see that f(x) changes from a concave to a convex function at x = 2; therefore, it is an inflection point. The corresponding f(x) value is
$$f(2) = 2^3 - 6(2)^2 + 9(2) + 2 = 4.$$
Point (2; 4) is an inflection point.

Step 8: Points on f(x) are as follows:

x      −1    −0,5    4   4,5     5
f(x)   −14   −4,13   6   12,13   22

The graph of the function is given in Figure 10.27.

Figure 10.27: Graph of function $y = f(x) = x^3 - 6x^2 + 9x + 2$
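The stationary points, the inflection point and the extra points found in solution 5 can be cross-checked with a few lines of code. This is a minimal Python sketch, not part of the original guide; the helper name `classify` is hypothetical and the derivatives are typed in by hand from Steps 5 and 6.

```python
def f(x):
    return x**3 - 6 * x**2 + 9 * x + 2


def f_prime(x):
    # First derivative from Step 5.
    return 3 * x**2 - 12 * x + 9


def f_second(x):
    # Second derivative from Step 6.
    return 6 * x - 12


def classify(x):
    """Second derivative test at a stationary point x."""
    if f_prime(x) != 0:
        return "not a stationary point"
    if f_second(x) < 0:
        return "relative maximum"
    if f_second(x) > 0:
        return "relative minimum"
    return "test inconclusive"


for x in (1, 3):
    print(x, f(x), classify(x))

# Inflection candidate from Step 7 and the extra points from Step 8.
print(2, f(2), f_second(2))
print([(x, round(f(x), 2)) for x in (-1, -0.5, 4, 4.5, 5)])
```

The sketch only confirms values that were already derived by hand; it does not replace the sign analysis on each interval.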

Chapter 11

Zeros of functions or roots of equations

Contents
11.1 Introduction
11.2 Locating roots of equations
11.3 Bisection method
11.3.1 Computer algorithms
11.4 Newton’s method
11.4.1 Computer algorithms
11.5 Exercises
11.6 Solutions to exercises

Learning objectives
After completing this study unit you should be able to
• determine zeros of functions or roots of equations algebraically, where possible, or else numerically
• apply numerical search algorithms, such as the bisection method and Newton’s method, to approximate zeros of functions or roots of equations
• implement iterative procedures of numerical search algorithms to approximate zeros of functions or roots of equations in a computer package such as Maxima

11.1 Introduction When we sketched the graphs of functions in the previous study unit, it was sometimes difficult to determine the x-intercepts algebraically. We mentioned that we might need to search for them numerically. In this study unit we discuss two numerical search algorithms for determining the x-intercepts, namely the bisection method and Newton’s method. We also discuss how to implement the iterative procedures of these two algorithms in Maxima. If you have not done so yet, you should now install Maxima on your computer and work through the Maxima Programming Tutorial in Tutorial Letter 101.

11.2 Locating roots of equations The notes in this study unit have been developed by referring to the textbooks Numerical Mathematics and Computing by Ward Cheney and David Kincaid, and Numerical methods for Mathematics, Science and Engineering by John H Mathews.

Definition 11.1
Assume f(x) is a continuous function. Any real number r for which f(r) = 0 is called a root of the equation f(x) = 0 or a zero of the function f.


The zeros of the quadratic function $f(x) = ax^2 + bx + c$, or the roots of the equation $ax^2 + bx + c = 0$, can be easily calculated with the formula
$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.$$

Example 11.1
The equation $6x^2 - 7x + 2 = 0$ has two real roots $r_1 = \tfrac{1}{2}$ and $r_2 = \tfrac{2}{3}$. Hence the function $f(x) = 6x^2 - 7x + 2$ has 1/2 and 2/3 as zeros. The zeros could be calculated by factorising
$$6x^2 - 7x + 2 = (2x - 1)(3x - 2) = 0,$$
which means that either 2x − 1 = 0 or 3x − 2 = 0, or by applying the formula
$$r_i = \frac{-(-7) \pm \sqrt{(-7)^2 - 4(6)(2)}}{2(6)}.$$

In the case of a quadratic equation (a parabola, or polynomial of degree 2), we can calculate the roots without any trouble. However, how do you determine the solution(s) of equations such as
$$0{,}5x^5 - 7{,}32x^4 + 31{,}344x^3 - 18{,}464x^2 - 79{,}104x - 32{,}256 = 0,$$
$$3\,560 = 1\,000\left[\frac{(1+i)^5 - 1}{i(1+i)^5}\right]$$
or
$$2x(1 - x^2 + x)\ln x = x^2 - 1\,?$$

All the previous equations can be written in the form f(x) = 0, but they cannot readily be solved algebraically. We need computer-based methods built on iterative procedures (sometimes called trial-and-error methods) to determine the solutions. Quite a number of methods are available for locating the zeros of a nonlinear function. In this study unit we discuss two methods (the bisection method and Newton’s method) to explain iterative procedures to approximate zeros. We also explain how these methods can be implemented using Maxima. However, you may implement the algorithms using any other computer language of your choice (such as Java, C or Visual Basic). The methods could also be implemented on any programmable pocket calculator. Maxima cannot be considered the most suitable package for numerical algorithms, but the package is also used in other modules offered by the Department of Decision Sciences, and can be used interactively to simplify numerical calculations.
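To illustrate what "written in the form f(x) = 0" means in practice, the annuity equation above can be rearranged so that the unknown interest rate i is a zero of a function. The following is a minimal Python sketch, not part of the original guide (which uses Maxima); the function name `annuity_residual` and its parameter names are assumptions made here for illustration.

```python
def annuity_residual(i, payment=1000.0, periods=5, present_value=3560.0):
    """f(i) = 1000 * [((1 + i)^5 - 1) / (i (1 + i)^5)] - 3560.

    A root of f(i) = 0 is an interest rate that satisfies the equation.
    """
    factor = ((1 + i) ** periods - 1) / (i * (1 + i) ** periods)
    return payment * factor - present_value


# Evaluate f at two trial rates; opposite signs suggest a root in between.
for rate in (0.10, 0.15):
    print(rate, annuity_residual(rate))
```

Evaluating the function at a few trial values in this way is the starting point for both numerical methods discussed in this study unit.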


11.3 Bisection method The bisection method is a bracketing algorithm to calculate a zero of a continuous function. The method is simple and will eventually converge to a zero if we start with an interval [a; b] where f (a) and f (b) have opposite signs.

Property 11.1 If a < b and f (a) f (b) < 0, so that f (a) and f (b) have opposite signs, the intermediate-value theorem (see introductory calculus textbooks) implies that there exists a point r such that a < r < b and f (r) = 0.

The graph in Figure 11.1 illustrates this property.

Figure 11.1: Zero r of function f where a < b and f(a) f(b) < 0
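Property 11.1 suggests a simple way of locating starting intervals: evaluate f on a coarse grid and keep every subinterval on which the sign changes. The following is a minimal Python sketch, not part of the original guide; the helper name `sign_change_intervals` and the grid from −2 to 2 are assumptions, and the cubic is the one used in the examples later in this study unit.

```python
def f(x):
    return x**3 - 3 * x + 1


def sign_change_intervals(func, start, stop, step=1):
    """Return the grid intervals [a, a + step] with func(a) * func(a + step) < 0."""
    intervals = []
    a = start
    while a + step <= stop:
        b = a + step
        if func(a) * func(b) < 0:
            intervals.append((a, b))
        a = b
    return intervals


brackets = sign_change_intervals(f, -2, 2)
print(brackets)
```

A coarse grid can miss roots that lie close together (two sign changes inside one subinterval cancel out), so a plot of the function remains a useful check.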

Rule
Bisection algorithm
• Suppose f is continuous on the interval [a; b] and f(a) · f(b) < 0.
• Construct the midpoint $c = \frac{a+b}{2}$ and calculate f(c).
• Analyse the following possibilities that might arise:
  1. If f(c) = 0, then c is the zero of f. In practice the value of c will be accepted if the absolute value of f(c) is less than a prescribed tolerance ε, that is, if |f(c)| < ε.
  2. If f(a) and f(c) have opposite signs, the zero lies in [a; c].
  3. If f(c) and f(b) have opposite signs, the zero lies in [c; b].
• Construct a new smaller interval containing the zero:
  1. If f(a) · f(c) < 0, then the new interval containing the zero is [a; c].
  2. If f(c) · f(b) < 0, then the new interval containing the zero is [c; b].
• Repeat the process with the new bisected interval.
• The iteration steps are repeated until the interval is smaller than a prescribed tolerance (or precision) ε and f(c) ≈ 0.

After the first iteration, the interval containing the zero r of function f has been halved. The new interval is [a; c], as illustrated in Figure 11.2.

Figure 11.2: Interval [a; c] contains zero r of function f

Example 11.2
Use the bisection method to determine the zero (accurate to 3 decimal digits) of $f(x) = x^3 - 3x + 1$ on the interval [0; 1].

Solution

Drawing a graph using Maxima
To draw the graph of f using Maxima, input

wxplot2d([x^3-3*x+1], [x,-2,3], [y,-2,3.5])$

or click on [plot2d] and complete the menu. Maxima then displays the graph of f for −2 ≤ x ≤ 3.

Applying the bisection algorithm
• Tolerance or precision ε = 0,01.
• On the initial interval [0; 1], f(0) = 1 > 0 and f(1) = −1 < 0. The interval will therefore contain a zero of f.
• $c = \frac{0+1}{2} = 0{,}5$ and f(0,5) = −0,375.
• Then:

iteration  a   c     b   f(a)   f(c)     f(b)   abs(a − b)
1          0   0,5   1   1      −0,375   −1     1

• Since f(0) and f(0,5) have opposite signs, the zero lies in the interval [0; 0,5]. Hence the new interval is [a; b] = [0; 0,5].
• $c = \frac{0+0{,}5}{2} = 0{,}25$ and f(c) = 0,266.
• Then:

iteration  a   c      b     f(a)   f(c)     f(b)     abs(a − b)
1          0   0,50   1,0   1,00   −0,375   −1       1
2          0   0,25   0,5   1,00   0,266    −0,375   0,5

Note that since we only care about the signs of f we do not need to keep all the digits for f(a), f(b) and f(c). Use your pocket calculator or Maxima to help with the calculations and repeat the iteration steps. Then verify the following values:

iteration  a         c         b         f(a)    f(c)      f(b)     abs(a − b)
1          0         0,50      1,0       1,00    −0,375    −1       1
2          0         0,25      0,5       1,00    0,266     −0,375   0,5
3          0,25      0,375     0,5       0,266   −0,072    −0,375   0,25
4          0,25      0,3125    0,375     0,266   0,093     −0,072   0,125
5          0,3125    0,34375   0,375     0,093   0,009     −0,072   0,0625
6          0,34375   0,35937   0,375     0,009   −0,032    −0,072   0,03125
7          0,34375   0,35156   0,35937   0,009   −0,011    −0,032   0,015
8          0,34375   0,34765   0,35156   0,009   −9,5E−4   −0,011   0,0078

Then the zero of f ≈ 0,34765 and abs(f(0,34765)) ≈ 9,5 × 10⁻⁴. Note that abs(a − b) < 0,01. After 20 iterations the approximation will be c ≈ 0,347296 with f(c) ≈ 3,48 × 10⁻⁷. Of course this is a more accurate solution.
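Because every bisection step halves the interval, the number of halvings needed to bring the interval width below a tolerance can be counted in advance. The following is a minimal Python sketch, not part of the original guide; the function name `halvings_needed` is hypothetical.

```python
def halvings_needed(a, b, eps):
    """Smallest n such that (b - a) / 2**n < eps."""
    n = 0
    width = b - a
    while width >= eps:
        width /= 2
        n += 1
    return n


n = halvings_needed(0, 1, 0.01)
print(n, 1 / 2**n)
```

For the interval [0; 1] and ε = 0,01 this gives seven halvings, which agrees with the table: the width abs(a − b) first drops below 0,01 in the row for iteration 8, that is, after seven halvings of the initial interval.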

At this stage you should either welcome a computer program to do all the calculations or prefer a method that converges faster to the root.

11.3.1 Computer algorithms
In this section the Maxima instructions to determine the root(s) are explained. The interactive use of Maxima to find a zero is described, followed by an example of the Maxima code to implement your own program. Before attempting to write your own program, work through the programming hints in the Maxima Programming Tutorial in Tutorial Letter 101.

Maxima standard function
Maxima has a standard function to determine the root(s) of an equation numerically. Choose [Equations] on the top menu, then [Solve numerically...]. Complete the pop-up menu.

Example 11.3
Determine a root of the equation $f(x) = x^3 - 3x + 1 = 0$ in the interval [0; 1] by using the standard Maxima function for solving an equation numerically. (Initial intervals for the location of roots of f(x) = 0 can be determined from the graph of f drawn in Example 11.2.)

Solution
Choosing [Equations] on the top menu, then [Solve numerically...], and by completing the pop-up menu, the Maxima inputs and outputs for solving the equation over the interval [0; 1] are:

(%i1) f(x):=x^3-3*x+1;
(%o1) f(x):=x^3-3*x+1
(%i3) find_root(f(x)=0, x, 0, 1);
(%o3) 0.34729635533386

Using Maxima interactively In Maxima all input instructions are indicated by %i followed by a number, and all output responses by %o followed by the same number.


Example 11.4
Determine a root of the equation $f(x) = x^3 - 3x + 1 = 0$ in the interval [0; 1] by implementing the bisection algorithm in Maxima.

Solution
The Maxima inputs and outputs for the bisection algorithm are as follows:

(%i5) a:0;
(%o5) 0
(%i6) b:1;
(%o6) 1
(%i7) f(x):=x^3-3*x+1;
(%o7) f(x):=x^3-3*x+1
(%i8) fa:f(a);
(%o8) 1
(%i9) fb:f(b);
(%o9) -1
(%i10) c:(a+b)/2.0;
(%o10) 0.5
(%i11) fc:f(c);
(%o11) -0.375
(%i12) b:c;
(%o12) 0.5
(%i13) fb:fc;
(%o13) -0.375

Input instructions 10–13 must be performed repeatedly, and the signs of the function values compared manually in each repetition. Quite a tedious task!

A program can simplify the process. Pseudocode A pseudocode explains how a computer program can be constructed, using any available computer language or package. The notation c ← 10 implies that the value 10 is assigned to a variable c. Also: f c ← f (c) implies that the function f is evaluated in the point c. In this case the value f (10) is calculated and the function value is stored in the variable f c.


Rule
Pseudocode for the bisection algorithm

Input f, a, b, ε
Test: Is f(a) · f(b) < 0?
MaxIter = 100                                          [Comment 1]
fa ← f(a)
fb ← f(b)
Iteration steps:
for i = 0 to MaxIter do
    c ← (a + b)/2.0
    fc ← f(c)
    Output i, a, c, b, fa, fc, fb, abs(a − b)          [Comment 2]
    if abs(a − b) < ε or |fc| < ε exit loop
    else choose new interval:
        if fa · fc < 0 then (b ← c and fb ← fc)        [Comment 3]
        if fc · fb < 0 then (a ← c and fa ← fc)        [Comment 4]
end do
Output c, fc

Comments
1. Specifying the maximum number of iterations is useful to prevent a program getting stuck in a loop, should an error occur. Start with a small number.
2. Output only necessary to understand the iteration steps.
3. New interval [a; c]
4. New interval [c; b]
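The pseudocode above can be translated almost line by line into most programming languages. The following is a minimal Python sketch, not the guide's own Maxima program; the function name `bisection`, its default arguments and the returned tuple are choices made here for illustration, and the second `if` of the pseudocode is written as an `else` branch.

```python
def bisection(func, a, b, eps=1e-4, max_iter=100):
    """Bisection following the pseudocode: returns (c, f(c), iterations used)."""
    fa, fb = func(a), func(b)
    if fa * fb >= 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c, fc, i = a, fa, 0
    for i in range(1, max_iter + 1):
        c = (a + b) / 2.0
        fc = func(c)
        # Stop when the interval or the function value is small enough.
        if abs(a - b) < eps or abs(fc) < eps:
            break
        if fa * fc < 0:
            b, fb = c, fc  # new interval [a; c]
        else:
            a, fa = c, fc  # new interval [c; b]
    return c, fc, i


def f(x):
    return x**3 - 3 * x + 1


print(bisection(f, 0, 1, eps=0.01))
print(bisection(f, 0, 1, eps=1e-6))
```

With ε = 0,01 the test |f(c)| < ε in this stopping rule is already satisfied at iteration 5 (c = 0,34375), whereas the hand calculation in Example 11.2 continues until the interval width is below ε. A smaller tolerance gives a value close to the root 0,347296 reported by Maxima.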

Maxima program code for the bisection algorithm Choose [Edit] on the top menu and then [Long input] and type the instructions. Then [Enter] the input. Make sure you remember all the brackets. (Every left bracket should be closed with a right bracket.) If you make a mistake and Maxima complains, [Copy] the input instruction from the screen and [Paste] it in [Long input], then correct the code before entering the input again. Initially it may take a few attempts. Example 11.5 One possible code to determine the zero of the function is:


block(f(x):=x^3-3*x+1, a:0, b:1, fa:f(a),fb:f(b),fc:1000, for i:1 thru 100 step 1 while abs(fc) > 0.0001 do ( c:(a+b)/2.0, fc:f(c),print (i,a,b,c,fa,fb,fc), if (fa*fc) < 0 then (b:c, fb:fc), if (fb*fc) 0.0001 do block(fx:x^2-1, fderv:2*x, x:x-fx/fderv, display (x,fx));

Note that the previous instructions may be grouped in a [block] instruction to write your own Maxima program code. The Maxima output is:

x=1.083333333333333
fx=1.25
x=1.003205128205128
fx=0.17361111111111
x=1.000005120013107
fx=0.0064205292570676
x=1.000000000013107
fx=1.0240052429111302*10^-5
(%o7) done

Maxima program code for Newton’s method
You may implement the previous instructions directly or develop a recursive function. A recursive function calls itself and is a more advanced technique to use. If you want to improve your Maxima programming ability, then try the following program code. Choose [Edit] on the top menu, then [Long input] to type the code, then [Enter] the code.

Example 11.8
Code for function myNew in Maxima

myNew(f,guess,prec):=block([fx,derx,xnew],
    fx: f(guess),
    derx: subst(guess, x, diff(f(x),x)),
    xnew: guess-fx/derx,
    if abs(xnew-guess)< prec then return(guess)
    else myNew(f,xnew,prec));

The code was obtained from A Maxima Guide for Calculus Students by Moses Glasner. It is available on http://www.math/psu.edu/glasner/Max$_doc
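The iterative Newton loop whose output is listed earlier in this section (the update x ← x − fx/fderv for f(x) = x² − 1) can be mirrored in Python. This is a minimal sketch, not the guide's own program: the starting value 1,5 is an assumption inferred from the first output line (fx = 1,25), and the function name `newton_steps` is hypothetical.

```python
def newton_steps(f, fprime, x, tol=0.0001, max_iter=100):
    """Repeat x <- x - f(x)/f'(x) while the last computed |f(x)| exceeds tol.

    Returns a list of (new x, f evaluated at the previous x) pairs,
    the same pair that each pass of the Maxima loop displays.
    """
    history = []
    fx = 1000.0  # dummy value so that the loop is entered
    for _ in range(max_iter):
        if abs(fx) <= tol:
            break
        fx = f(x)
        x = x - fx / fprime(x)
        history.append((x, fx))
    return history


steps = newton_steps(lambda x: x**2 - 1, lambda x: 2 * x, 1.5)
for x_new, fx in steps:
    print("x =", x_new, " fx =", fx)
```

Under the stated assumption the sketch takes four passes, and the printed values can be compared with the Maxima output above.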

Example 11.9
Determine $\sqrt{2}$, that is, determine the positive root of $x^2 - 2 = 0$ by implementing the function myNew in Maxima.

Solution

(%i2) f(x):=x^2-2;
(%o2) f(x):=x^2-2
(%i3) myNew(f,1.7,0.00001);
(%o3) 1.414213576599356
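The recursive structure of myNew carries over directly to other languages. The following is a minimal Python sketch, not part of the original guide; because plain Python has no symbolic differentiation, the derivative is passed in as a second function, which is a departure from the Maxima code. The name `my_new` is hypothetical.

```python
def my_new(f, fprime, guess, prec):
    """Recursive Newton iteration mirroring the Maxima function myNew.

    Like the original, it returns the previous guess once two successive
    approximations differ by less than prec.
    """
    xnew = guess - f(guess) / fprime(guess)
    if abs(xnew - guess) < prec:
        return guess
    return my_new(f, fprime, xnew, prec)


root = my_new(lambda x: x**2 - 2, lambda x: 2 * x, 1.7, 0.00001)
print(root)
```

Returning `guess` rather than `xnew` explains why the result agrees with √2 only to about seven decimal digits; returning `xnew` would give one further Newton step at no extra cost.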

Maxima standard function for Newton’s method
Maxima contains a standard built-in function to approximate the zero of a function f using Newton’s method. The function must first be loaded ([load(newton)]) and is called using the instruction [newton(f, initial guess)].

Example 11.10
Determine the root of $e^x - 4x = 0$ close to 2,1 by implementing the standard function for Newton’s method in Maxima.

Solution

(%i4) load(newton);
(%o4) C:/PROGRA~1/MAXIMA~1.0/share/maxima/5.14.0/share/numeric/newton.mac
(%i5) newton(exp(x)-4*x, 2.1);
(%o5) 2.153292364162442b0


11.5 Exercises

1. Determine the other two roots of $x^3 - 3x + 1 = 0$ using the bisection algorithm and Newton’s method. Do at least 3 iterations using your pocket calculator, and then use a computer program to find a more accurate solution. Use the graph drawn in Example 11.2 to choose initial intervals containing a root or to determine an initial estimate of a root.

2. Determine all the zeros of the previous polynomial using the following standard function in Maxima: Choose [Equations] on the top task bar, then [Roots of polynomial]. This function determines all the roots (real and complex roots). If you are only interested in the real-valued roots, choose [Roots of polynomial(real)] and complete the pop-up menu.

3. Determine the zeros of $f(x) = e^x - 4x$ accurately to 4 decimal digits. Use Maxima to plot the function and verify that 0,35 and 2,1 are good initial estimates for the two zeros.

11.6 Solutions to exercises

1. Initial intervals for the location of all three roots of f(x) = 0 can be determined by referring to the graph of f drawn in Example 11.2. The Maxima inputs and outputs for numerically solving the equation over the intervals [0; 1], [1; 2], and [−2; −1], respectively, are obtained by choosing [Equations] on the top menu, then [Solve numerically...], and by completing the pop-up menu.

(%i1) f(x):=x^3-3*x+1;
(%o1) f(x):=x^3-3*x+1
(%i3) find_root(f(x)=0, x, 0, 1);
(%o3) 0.34729635533386
(%i4) find_root(f(x)=0, x, 1, 2);
(%o4) 1.532088886237956
(%i5) find_root(f(x)=0, x, -2, -1);
(%o5) -1.879385241571817

Implementing the standard function for Newton’s method in Maxima. Note that the session below was run on $x^3 - 3x - 1$ rather than $x^3 - 3x + 1$. Replacing x by −x turns one equation into the other, so the values reported are the negatives of the roots of $x^3 - 3x + 1 = 0$ found above.

(%i4) load(newton);
(%o4) C:/PROGRA~1/MAXIMA~1.0/share/maxima/5.14.0/share/numeric/newton
(%i6) newton(x^3-3*x-1,-1.5);
Warning: Float to bigfloat conversion of -1.5
(%o6) -1.532088886241467b0
(%i7) newton(x^3-3*x-1,0.5);
Warning: Float to bigfloat conversion of 0.5
(%o7) -3.472963553338599b-1
(%i8) newton(x^3-3*x-1,1.5);
Warning: Float to bigfloat conversion of 1.5
(%o8) 1.879385241571822b0

2. Choosing [Equations] and [Roots of polynomial] or [Roots of polynomial(real)] in Maxima (this session also uses $x^3 - 3x - 1$, so the roots of $x^3 - 3x + 1 = 0$ are the negatives of the values shown):

(%i17) kill(all);
(%o0) done
(%i1) x^3-3*x-1=0;
(%o1) x^3-3*x-1=0
(%i2) allroots(%);
(%o2) [x=-0.34729635533386,x=-1.532088886237956,x=1.879385241571817]
(%i3) realroots(%);
(%o3) [[x=-0.34729635533386,x=-1.532088886237956,x=1.879385241571817]=0]

3. By clicking on [plot2d] and completing the menu, a graph of %e^x-4*x for 0 ≤ x ≤ 3 is obtained.

Implementing the standard function for Newton’s method in Maxima:

(%i4) load(newton);
(%o4) C:/PROGRA~1/MAXIMA~1.0/share/maxima/5.14.0/share/numeric/newton.mac
(%i5) newton(exp(x)-4*x, 2.1);
(%o5) 2.153292364162442b0
(%i6) newton(exp(x)-4*x,0.35);
Warning: Float to bigfloat conversion of 0.34999999999999998
(%o6) 3.574029561179523b-1
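Any value reported by a numerical root finder can be checked by substituting it back into the function: the residual f(r) should be close to zero. The following is a minimal Python sketch, not part of the original guide; the numbers are copied from the Maxima sessions above and the function names are chosen here for illustration.

```python
import math


def f(x):
    return x**3 - 3 * x + 1


def h(x):
    # The polynomial that was actually entered in the newton sessions.
    return x**3 - 3 * x - 1


def g(x):
    return math.exp(x) - 4 * x


cubic_roots = [0.34729635533386, 1.532088886237956, -1.879385241571817]
exp_roots = [0.3574029561179523, 2.153292364162442]

for r in cubic_roots:
    # r should be a root of f, and -r a root of h.
    print(r, f(r), h(-r))

for r in exp_roots:
    print(round(r, 4), g(r))
```

Residuals that are tiny compared with the size of the terms in f support the reported roots; a residual that is not small would point to a typing error or to a root finder that stopped too early.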

Chapter 12

Marginal analysis

Contents
12.1 Introduction
12.2 Cost functions
12.3 Average cost functions
12.4 Revenue functions
12.5 Profit functions
12.6 Exercises
12.7 Solution to exercises

Learning objectives
After completing this study unit you should be able to
• explain the concepts of marginal analysis, cost, revenue and profit
• derive the revenue function from a demand function
• derive the profit function
• determine the marginal cost, marginal revenue and marginal profit functions, and explain their meaning
• determine the average cost function and the marginal average cost function, and explain their meaning
• use differential calculus to solve marginal analysis problems.

12.1 Introduction Marginal analysis is the study of the rate of change of economic quantities. For example, an economist is not merely concerned with the value of an economy’s gross domestic product (GDP) at a given time, but is equally concerned with the rate at which it is growing or declining. In the same vein, a manufacturer is not only interested in the total cost corresponding to a certain level of production of a commodity, but is also interested in the rate of change of the total cost with respect to the level of production, and so on.

12.2 Cost functions
The total weekly cost (in rand) incurred by the Polarair Company is given by
$$C(x) = 8\,000 + 200x - 0{,}2x^2 \quad\text{for}\quad 0 \le x \le 400,$$
where x represents the number of refrigerators produced per week. Suppose we want to know how much it actually costs to produce the 251st refrigerator. This can be found by calculating the difference between the total cost of producing the first 251 and the total cost of producing the first 250 refrigerators, as follows:
$$\begin{aligned}
C(251) - C(250) &= [8\,000 + 200(251) - 0{,}2(251)^2] - [8\,000 + 200(250) - 0{,}2(250)^2] \\
&= 45\,599{,}8 - 45\,500 \\
&= \text{R}99{,}80.
\end{aligned}$$
Now, if we want to know how fast total cost will change with respect to production, we have to find the derivative of C(x). Then
$$C'(x) = 200 - 0{,}4x.$$
This is the rate of change of the total cost function C(x) with respect to x. At a level of production of 250 refrigerators, we have
$$C'(250) = 200 - 0{,}4(250) = \text{R}100.$$
This means that total cost is increasing at a rate of R100 per refrigerator when 250 refrigerators are produced, which in turn implies that the next refrigerator will cost approximately R100 to produce.

Let us now compare these two results. The difference C(251) − C(250) may be written in the form
$$C(251) - C(250) = \frac{C(250 + 1) - C(250)}{1} = \frac{C(250 + h) - C(250)}{h},$$
where h = 1. This is the average rate of change of the total cost function C(x) over the interval [250; 251]. On the other hand, C′(250) is the instantaneous rate of change of C(x) at x = 250. Now, when h is very small, the average rate of change of the function C(x) is a good approximation of the instantaneous rate of change of C(x). We can therefore say that
$$\frac{C(250 + h) - C(250)}{h} \approx \lim_{h\to 0} \frac{C(250 + h) - C(250)}{h} = C'(250),$$
which means that C′(250) also approximates the actual cost incurred in producing the 251st refrigerator. This gives rise to the following definition:

Definition 12.1
The marginal cost function at a particular point is the derivative of the total cost function evaluated at that point, and this approximates the actual cost incurred in producing an additional unit, given that production is at a level corresponding to the point under consideration.
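The comparison between the actual cost of the 251st refrigerator and the marginal cost at x = 250 can be reproduced in a few lines. The following is a minimal Python sketch, not part of the original guide; the function names are chosen here for illustration and the derivative is entered by hand.

```python
def total_cost(x):
    """Polarair weekly total cost C(x) = 8000 + 200x - 0.2x^2 (rand)."""
    return 8000 + 200 * x - 0.2 * x**2


def marginal_cost(x):
    """C'(x) = 200 - 0.4x."""
    return 200 - 0.4 * x


actual = total_cost(251) - total_cost(250)  # cost of the 251st unit
approx = marginal_cost(250)                 # marginal cost at x = 250
print(round(actual, 2), round(approx, 2))
```

The two values differ by 20 cents, which illustrates why the derivative is described as an approximation of the cost of the next unit rather than its exact value.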

From this definition it follows that the marginal cost function, C′(x), evaluated at point x, provides us with a good approximation of the actual cost incurred in producing the (x + 1)st unit of a product, assuming that x units are already produced.

Example 12.1
A subsidiary of the Elektra Electronics Company manufactures a programmable pocket calculator. The daily total cost (in rand) of producing these calculators is given by
$$C(x) = 0{,}0001x^3 - 0{,}08x^2 + 40x + 5\,000,$$
where x represents the number of calculators produced.
(a) Find the marginal cost function.
(b) Calculate the marginal cost for x = 100, 200, 300, 400, 600 and 700.
(c) Interpret the results.

Solution
(a) The marginal cost function is given by the derivative of the total cost function
$$C'(x) = 0{,}0003x^2 - 0{,}16x + 40.$$

(b) The required marginal costs are
$$\begin{aligned}
C'(100) &= 0{,}0003(100)^2 - 0{,}16(100) + 40 = 27 \\
C'(200) &= 0{,}0003(200)^2 - 0{,}16(200) + 40 = 20 \\
C'(300) &= 19 \\
C'(400) &= 24 \\
C'(600) &= 52 \\
C'(700) &= 75.
\end{aligned}$$

(c) The marginal cost at a certain production level, given by the derivative of the total cost function at that level, approximates the actual cost of producing one additional calculator. We see that the cost incurred in producing one additional calculator at a production level of 100 calculators is approximately R27; at a production level of 200 calculators, it is approximately R20, and so on. The cost of producing an additional calculator drops at first as the production level increases. This is true up to a production level of about 267 calculators, where $C''(x) = 0{,}0006x - 0{,}16 = 0$. Thereafter the cost of producing an additional unit increases. Observe that at a production level of 700 calculators, the cost of producing one additional calculator is approximately R75. The higher cost of producing one additional unit at this production level may be the result of several factors, among them excessive costs incurred because of overtime or higher maintenance, production breakdown caused by greater stress and strain on equipment, and so on.
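The marginal costs in part (b) and the production level at which the marginal cost is lowest can be recomputed as follows. This is a minimal Python sketch, not part of the original guide; the function name `marginal_cost` and the variable names are assumptions made here for illustration.

```python
def marginal_cost(x):
    """C'(x) = 0.0003x^2 - 0.16x + 40 for the Elektra calculator example."""
    return 0.0003 * x**2 - 0.16 * x + 40


levels = [100, 200, 300, 400, 600, 700]
values = [round(marginal_cost(x), 2) for x in levels]
print(dict(zip(levels, values)))

# C''(x) = 0.0006x - 0.16 = 0 gives the level where the marginal cost is lowest.
x_min = 0.16 / 0.0006
print(round(x_min, 1), round(marginal_cost(x_min), 2))
```

The marginal cost falls until roughly 267 calculators per day and rises thereafter, which matches the pattern in the tabulated values.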


12.3 Average cost functions
Let us now introduce another marginal concept closely related to the marginal cost. If C(x) denotes the total cost incurred in producing x units of a certain product, then the average cost of producing x units of the product is obtained by dividing the total production cost by the number of units produced.

Definition 12.2
The average cost of producing one unit of a product is given by the function
$$\bar{C}(x) = \frac{C(x)}{x},$$
where C(x) is the total cost of producing x units. $\bar{C}(x)$ is called the average cost function.

The marginal average cost function, $\bar{C}'(x)$, measures the rate of change of the average cost function with respect to the number of units produced. This means that the average cost function gives the average cost of producing one unit of a product and the marginal average cost function gives the rate at which this average cost changes.

Example 12.2
The total cost (in rand) of producing x units of a certain product is given by C(x) = 400 + 20x.
(a) Find the average cost function.
(b) Find the marginal average cost function.
(c) Interpret the results obtained in (a) and (b).

Solution
(a) The average cost function is given by
$$\bar{C}(x) = \frac{C(x)}{x} = \frac{400 + 20x}{x} = 20 + \frac{400}{x}.$$

(b) The marginal average cost function is
$$\bar{C}'(x) = -\frac{400}{x^2}.$$

(c) The average cost function $\bar{C}(x)$ calculated in (a) gives the average cost of producing one unit of the product. This average cost function,
$$\bar{C}(x) = 20 + \frac{400}{x},$$
is the sum of the variable cost per unit (20) and the fixed cost per unit ($\frac{400}{x}$). If we calculate
$$\lim_{x\to\infty}\left(20 + \frac{400}{x}\right) = 20,$$
we see that as the production level increases, the fixed cost per unit, represented by the term $\frac{400}{x}$, drops steadily with the result that the average cost of producing one unit of the product approaches the variable cost per unit, 20.

The marginal average cost function $\bar{C}'(x)$ calculated in (b) gives the rate of change in the average cost function. Since $\bar{C}'(x)$ is negative for all values of x > 0, we see that the average cost function decreases as x increases. This means that the average cost of producing one unit of the product decreases as the production level increases. The graph of the average cost function $\bar{C}(x)$ is given in Figure 12.1.

Figure 12.1: Average cost function $\bar{C}(x) = 20 + \frac{400}{x}$

Note that the line $\bar{C}(x) = 20$ is a horizontal asymptote. This was apparent from $\lim_{x\to\infty} \bar{C}(x) = 20$ calculated earlier. The line $\bar{C}(x) = 20$ represents the variable cost per unit.


Example 12.3
Once again consider the subsidiary of the Elektra Electronics Company. The daily total cost (in rand) of producing its programmable calculators is given by
$$C(x) = 0{,}0001x^3 - 0{,}08x^2 + 40x + 5\,000,$$
where $x$ represents the number of calculators produced.
(a) Find the marginal average cost function and calculate $\bar{C}'(500)$.
(b) Interpret the results obtained in (a).

Solution
(a) The average cost function is
$$\bar{C}(x) = \frac{C(x)}{x} = 0{,}0001x^2 - 0{,}08x + 40 + \frac{5\,000}{x}.$$
The marginal average cost function is
$$\bar{C}'(x) = 0{,}0002x - 0{,}08 - \frac{5\,000}{x^2}.$$
Therefore
$$\bar{C}'(500) = 0{,}0002(500) - 0{,}08 - \frac{5\,000}{(500)^2} = 0.$$

(b) The fact that $\bar{C}'(500) = 0$ means that the average cost function $\bar{C}(x)$ has a critical point at $x = 500$. The second derivative of $\bar{C}(x)$ is
$$\bar{C}''(x) = 0{,}0002 + \frac{10\,000}{x^3}.$$
Evaluating the second derivative at $x = 500$ gives
$$\bar{C}''(500) = 0{,}00028 > 0 \;\Rightarrow\; \text{relative minimum}.$$
Now at $x = 500$, the value of the average cost function is
$$\bar{C}(500) = 0{,}0001(500)^2 - 0{,}08(500) + 40 + \frac{5\,000}{500} = 35.$$
The point (500; 35) is a relative minimum point of $\bar{C}(x)$.

The fact that $\bar{C}''(x) > 0$ for all $x > 0$ means that the average cost function is a convex function for $x > 0$. Therefore, the point (500; 35) is the absolute minimum point of the average cost function. The graph of the average cost function is shown in Figure 12.2.

Figure 12.2: $\bar{C}(x) = 0{,}0001x^2 - 0{,}08x + 40 + \frac{5\,000}{x}$, with the absolute minimum marked at (500; 35)

We see that, as expected, the average cost of producing one unit of the product drops as the level of production increases. In this case, however, the average cost reaches a minimum value of R35, corresponding to a production level of 500, and increases thereafter.
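The result of Example 12.3 can be checked numerically. The following is a minimal Python sketch, not part of the prescribed material; the function names `average_cost` and `marginal_average_cost` are chosen here for illustration only, and the scan over whole production levels from 1 to 1 000 is an assumed search range.

```python
def average_cost(x):
    """Average cost per calculator in rand: C(x) / x from Example 12.3."""
    return 0.0001 * x**2 - 0.08 * x + 40 + 5000 / x


def marginal_average_cost(x):
    """Derivative of the average cost function."""
    return 0.0002 * x - 0.08 - 5000 / x**2


# Scan whole production levels (assumed range 1..1000) and keep the
# level with the lowest average cost.
best_level = min(range(1, 1001), key=average_cost)

print("production level with lowest average cost:", best_level)
print("average cost at that level:", round(average_cost(best_level), 2))
print("marginal average cost at 500:", round(marginal_average_cost(500), 10))
```

The scan should agree with the calculus result: the lowest average cost occurs at a production level of 500, where the marginal average cost is zero.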

12.4 Revenue functions

We now introduce another marginal concept, the marginal revenue function. The revenue function $R(x)$ is given by
$$R(x) = px,$$
where $x$ represents the number of units of a certain product sold and $p$ represents the unit selling price of the product. In general, the unit selling price of a product is related to the quantity of the product demanded, and this relationship can be represented by the demand function $p = p(x)$. The revenue function then becomes
$$R(x) = p(x) \cdot x.$$

The derivative, R′ (x), of the revenue function is called the marginal revenue function and it measures the rate of change of the revenue function with respect to the number of units demanded. A definition for marginal revenue follows easily from the definition for marginal cost given in Section 12.2.

Definition 12.3 The marginal revenue function at a particular point is the derivative of the revenue function evaluated at that point, and this gives the actual revenue earned from selling an additional unit, given that sales are already at a level corresponding to the point under consideration.

The marginal revenue function, R′ (x), evaluated at point x, provides us with a good approximation of the actual revenue earned from the sale of the (x + 1)st unit of a product, assuming that x units have already been sold.

Example 12.4
The relationship between the unit price $p$ (in rand) and the quantity demanded $x$ of the Acrosonic model F loudspeaker system is given by the function
$$p(x) = -0{,}02x + 400 \quad \text{for } 0 \le x \le 20\,000.$$
(a) Find the revenue function.
(b) Find the marginal revenue function.
(c) Calculate $R'(2\,000)$ and explain the meaning of this.

Solution
(a) The revenue function is
$$R(x) = p(x) \cdot x = (-0{,}02x + 400) \cdot x = -0{,}02x^2 + 400x.$$
(b) The marginal revenue function is
$$R'(x) = -0{,}04x + 400.$$
(c) The required calculation is
$$R'(2\,000) = -0{,}04(2\,000) + 400 = 320.$$
This means that the actual revenue earned from the sale of the 2 001st loudspeaker system is approximately R320.
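To see how close "approximately R320" is, the marginal revenue can be compared with the exact revenue from the 2 001st unit, $R(2\,001) - R(2\,000)$. The short Python sketch below does this for the revenue function of Example 12.4; the function names are illustrative and are not taken from the study guide.

```python
def revenue(x):
    """Revenue in rand from selling x loudspeaker systems (Example 12.4)."""
    return -0.02 * x**2 + 400 * x


def marginal_revenue(x):
    """Derivative of the revenue function."""
    return -0.04 * x + 400


approximation = marginal_revenue(2000)
exact_extra_revenue = revenue(2001) - revenue(2000)

print("marginal revenue at 2000:", round(approximation, 2))
print("exact revenue from unit 2001:", round(exact_extra_revenue, 2))
```

The two values differ by about two cents, which is why the derivative is described as a good approximation of the revenue from one additional unit.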


12.5 Profit functions

Since profit is calculated as the difference between revenue and cost, the profit function $P(x)$ is given by
$$P(x) = R(x) - C(x),$$
where $R(x)$ is the revenue function, $C(x)$ is the cost function and $x$ represents the number of units of a product produced and sold. The marginal profit function, $P'(x)$, measures the rate of change of the profit function $P(x)$.

Definition 12.4 The marginal profit function at a particular point is the derivative of the profit function evaluated at that point, and this gives the actual profit realised from selling an additional unit, given that sales are already at a level corresponding to the point under consideration.

The marginal profit function, $P'(x)$, evaluated at point $x$, provides us with a good approximation of the actual profit or loss realised from the sale of the $(x + 1)$st unit of a product, assuming that $x$ units have already been sold.

Example 12.5
Refer to Example 12.4. Suppose that the cost (in rand) of producing $x$ units of the Acrosonic model F loudspeaker is
$$C(x) = 100x + 200\,000.$$
(a) Find the profit function.
(b) Find the marginal profit function.
(c) Calculate $P'(2\,000)$ and explain the meaning of this.

Solution
(a) The revenue function was calculated as $R(x) = -0{,}02x^2 + 400x$. The required profit function is
$$P(x) = R(x) - C(x) = -0{,}02x^2 + 400x - (100x + 200\,000) = -0{,}02x^2 + 300x - 200\,000.$$
(b) The marginal profit function is
$$P'(x) = -0{,}04x + 300.$$
(c) The required calculation is
$$P'(2\,000) = -0{,}04(2\,000) + 300 = 220.$$
This means that the actual profit realised from the sale of the 2 001st loudspeaker system is approximately R220.


12.6 Exercises

1. The weekly demand for Pulsar VCRs (videocassette recorders) is given by the demand function
$$p = -0{,}02x + 300 \quad \text{for } 0 \le x \le 15\,000,$$
where $p$ represents the wholesale unit price in rand and $x$ represents the quantity demanded. The weekly total cost function (in rand) associated with manufacturing these VCRs is given by
$$C(x) = 0{,}000003x^3 - 0{,}04x^2 + 200x + 70\,000.$$
(a) Find the revenue function and the profit function.
(b) Find the marginal cost function, the marginal revenue function and the marginal profit function.
(c) Calculate the marginal cost function, the marginal revenue function and the marginal profit function at the point $x = 3\,000$ and explain the results.
(d) Calculate the marginal cost function at the points $x = 30$ and $x = 300$ and compare these results with the result of the marginal cost function evaluated at $x = 3\,000$ in (c) above.
(e) Find the average cost function and the marginal average cost function.
(f) Calculate the average cost function and the marginal average cost function at the point $x = 3\,000$ and explain the results.

12.7 Solution to exercises

1. (a) The revenue function is
$$R(x) = px = (-0{,}02x + 300) \cdot x = -0{,}02x^2 + 300x.$$
The profit function is
$$P(x) = R(x) - C(x) = -0{,}02x^2 + 300x - (0{,}000003x^3 - 0{,}04x^2 + 200x + 70\,000) = -0{,}000003x^3 + 0{,}02x^2 + 100x - 70\,000.$$

(b) The marginal functions are
$$\begin{aligned}
C'(x) &= 0{,}000009x^2 - 0{,}08x + 200 \\
R'(x) &= -0{,}04x + 300 \\
P'(x) &= -0{,}000009x^2 + 0{,}04x + 100.
\end{aligned}$$

(c) The marginal cost function evaluated at $x = 3\,000$ is
$$C'(3\,000) = 0{,}000009(3\,000)^2 - 0{,}08(3\,000) + 200 = 41.$$
The marginal revenue function evaluated at $x = 3\,000$ is
$$R'(3\,000) = -0{,}04(3\,000) + 300 = 180.$$
The marginal profit function evaluated at $x = 3\,000$ is
$$P'(3\,000) = -0{,}000009(3\,000)^2 + 0{,}04(3\,000) + 100 = 139.$$
This means that at a production level of 3 000 VCRs, the actual cost of producing one additional unit is approximately R41, the actual revenue from selling the 3 001st VCR is approximately R180 and the actual profit from selling the 3 001st VCR is approximately R139.

(d) The marginal cost function evaluated at $x = 30$ is
$$C'(30) = 0{,}000009(30)^2 - 0{,}08(30) + 200 = 197{,}61.$$
The marginal cost function evaluated at $x = 300$ is
$$C'(300) = 176{,}81.$$
From (c), the marginal cost function evaluated at $x = 3\,000$ was 41. This means that the cost of producing an additional VCR is approximately R197,61 at a production level of 30, R176,81 at a production level of 300 and R41 at a production level of 3 000. This illustrates clearly that the cost of producing an additional VCR decreases as the production level increases from 30 to 3 000 units.

(e) The average cost function is
$$\bar{C}(x) = \frac{C(x)}{x} = 0{,}000003x^2 - 0{,}04x + 200 + \frac{70\,000}{x}.$$
The marginal average cost function is
$$\bar{C}'(x) = 0{,}000006x - 0{,}04 - \frac{70\,000}{x^2}.$$

(f) The average cost function evaluated at $x = 3\,000$ is
$$\bar{C}(3\,000) = 0{,}000003(3\,000)^2 - 0{,}04(3\,000) + 200 + \frac{70\,000}{3\,000} = 130{,}33.$$
The marginal average cost function evaluated at $x = 3\,000$ is
$$\bar{C}'(3\,000) = 0{,}000006(3\,000) - 0{,}04 - \frac{70\,000}{(3\,000)^2} = -0{,}03.$$
This means that if 3 000 VCRs have already been produced, the average cost of producing one such VCR is R130,33 and the average cost of producing one VCR is decreasing at the rate of R0,03 per unit.

Chapter 13
Optimisation of NLPs in one variable

Contents
13.1 Introduction
13.2 Vital theorems
13.3 Solving NLPs in one variable by differential calculus
13.4 Returning to LINGO
13.5 Exercises
13.6 Solutions to exercises

Learning objectives
After completing this study unit you should be able to
• solve NLPs in one variable by means of differential calculus
• test whether the solution obtained by means of differential calculus is indeed the optimal solution to the NLP
• test whether the solution obtained by means of LINGO is indeed the optimal solution to the NLP.

13.1 Introduction

In this study guide we see that real-world problems can be represented as models, and these models may be LP or NLP models depending on the nature of the problem. Various techniques are used to find solutions to the models.

In Part 1 of the study guide we used the graphical method to solve LP models. We used the computer, specifically the LINGO package, to simplify this solution process. In Study unit 6 we used the LINGO computer package to solve NLP models. We warned that the solution thus obtained is not necessarily the optimal solution and that certain criteria must apply before we may conclude that the solution is optimal. We needed more information on functions and their properties and turned to differential calculus for this.

Winston gives the criteria necessary for a solution to be the optimal solution to an NLP model. This is related to the concavity of functions and is given in two theorems. These theorems are of the utmost importance and are therefore repeated in the next section.

13.2 Vital theorems

Theorem 13.1 If we are maximising a concave function over a convex domain, then any local maximum point will be the absolute maximum point of the function, and therefore the optimal solution.

Theorem 13.2 If we are minimising a convex function over a convex domain, then any local minimum point will be the absolute minimum point of the function, and therefore the optimal solution.


These theorems can be applied in practical problems to determine whether the solution obtained is in fact the optimal solution. Some practical problems will now be solved by means of differential calculus.
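In the examples of this study unit, concavity and convexity are checked with the second derivative. As a reminder, the following is the standard calculus criterion for a twice-differentiable function $f$ on an interval $I$; it is stated here as background and is not a quotation from Winston.

```latex
f''(x) \le 0 \ \text{for all } x \in I \iff f \text{ is concave on } I,
\qquad
f''(x) \ge 0 \ \text{for all } x \in I \iff f \text{ is convex on } I.
```

Since every interval of the real line is a convex set, checking the sign of $f''(x)$ on the feasible interval is enough to decide whether Theorem 13.1 or Theorem 13.2 applies to an NLP in one variable.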

13.3 Solving NLPs in one variable by differential calculus

Many real-world applications call for finding the absolute maximum value or the absolute minimum value of a function. For example, management is interested in finding the level of production that will yield the maximum profit for the company; a farmer is interested in finding the right amount of fertiliser to maximise crop yield; a doctor is interested in finding the maximum concentration of a drug in a patient's body and the time at which it occurs; and an engineer is interested in finding the dimensions of a container with a specified shape and volume that can be constructed at a minimum cost.

Example 13.1
The Acrosonic Company's total profit (in rand) from producing and selling $x$ units of their Model F loudspeaker systems is given by
$$P(x) = -0{,}02x^2 + 300x - 200\,000.$$
How many units of the loudspeaker system must Acrosonic Company produce to maximise its profit?

Solution
The stationary points are found by setting the derivative of $P(x)$ equal to zero. Then
$$P'(x) = -0{,}04x + 300 = 0 \;\Rightarrow\; x = 7\,500.$$
The second derivative is
$$P''(x) = -0{,}04 < 0 \ \text{for all values of } x \;\Rightarrow\; \text{a relative maximum}.$$
This also implies that $P(x)$ is concave for all values of $x$. The domain of $P(x)$ is the real line, $x \in \mathbb{R}$, which is a convex set. We can conclude that the stationary point is also the absolute maximum of the function. The total profit associated with $x = 7\,500$ is
$$P(7\,500) = -0{,}02(7\,500)^2 + 300(7\,500) - 200\,000 = 925\,000.$$
The absolute maximum is at the point (7 500; 925 000). Therefore, the maximum profit of R925 000 is realised by producing 7 500 units of the loudspeaker system.


The graph of P(x) in Figure 13.1 (where x is measured in thousands of units and P(x) is measured in thousands of rand) confirms these findings.

Figure 13.1: Acrosonic Company's profit function, with the absolute maximum marked at (7,5; 925)

Example 13.2
The present value of the market price of the Blakely Office Building at time $t$ is given by
$$P(t) = 300\,000\,e^{-0{,}09t + \frac{\sqrt{t}}{2}} \quad \text{for } 0 \le t \le 10.$$
Find the optimal present value of the market price of the building.

Solution
The derivative is
$$\begin{aligned}
P'(t) &= 300\,000\,e^{-0{,}09t + \frac{\sqrt{t}}{2}} \cdot \frac{d}{dt}\left(-0{,}09t + \tfrac{1}{2}t^{\frac{1}{2}}\right) \\
&= 300\,000\,e^{-0{,}09t + \frac{\sqrt{t}}{2}} \left(-0{,}09 + \tfrac{1}{4}t^{-\frac{1}{2}}\right).
\end{aligned}$$
The stationary points follow from setting $P'(t) = 0$. Since $e^{-0{,}09t + \frac{\sqrt{t}}{2}}$ is never zero for any value of $t$, we must have
$$-0{,}09 + \frac{1}{4t^{\frac{1}{2}}} = 0.$$
Solving this equation, we find
$$\begin{aligned}
\frac{1}{4t^{\frac{1}{2}}} &= 0{,}09 \\
t^{\frac{1}{2}} &= \frac{1}{4(0{,}09)} \\
t &= \left(\frac{1}{0{,}36}\right)^2 = 7{,}72.
\end{aligned}$$
The stationary point therefore occurs at $t = 7{,}72$. We evaluate $P(t)$ at the stationary point as well as at the endpoints of [0; 10]:

t        P(t)
0        300 000
7,72     600 779
10       592 838

We conclude, accordingly, that the optimal present value of the market price of the building is R600 779 and this will occur 7,72 years from now.
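The table of Example 13.2 can be reproduced with a few lines of Python. This is a hedged sketch only: `present_value` is an illustrative name, and the candidate list simply contains the two endpoints of [0; 10] and the stationary point found above.

```python
import math


def present_value(t):
    """Present value in rand of the building at time t (Example 13.2)."""
    return 300_000 * math.exp(-0.09 * t + math.sqrt(t) / 2)


# Stationary point from -0,09 + 1 / (4 * sqrt(t)) = 0.
t_star = (1 / 0.36) ** 2

# On a closed interval the optimum lies at a stationary point or an endpoint.
candidates = [0, t_star, 10]
best_t = max(candidates, key=present_value)

for t in candidates:
    print(round(t, 2), round(present_value(t)))
```

Comparing the stationary point with both endpoints is what justifies the conclusion in the example, because the function is only defined on the closed interval [0; 10].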

In the previous examples the functions to be optimised were given. Now we consider a problem where we are required to first find the appropriate function. Once the function has been found, it may be optimised.

The annual total cost function (in rand) of a maize farmer is given by
$$C(x) = \frac{x^2}{10} + 10\,000,$$
where $x$ is the number of tons of mealies produced. Mealies are sold at R250 per ton. Obviously, the farmer wants to maximise his profit.

If the farmer produces $x$ tons of mealies per year, his revenue is given by
$$R(x) = p \cdot x = 250x.$$
Now, his profit function is given by profit = revenue − cost, that is,
$$\begin{aligned}
P(x) &= R(x) - C(x) \\
&= 250x - \left(\frac{x^2}{10} + 10\,000\right) \\
&= -\frac{x^2}{10} + 250x - 10\,000.
\end{aligned}$$
To find the value of $x$ that will maximise $P(x)$, we set the derivative of the profit function equal to zero. Then
$$\begin{aligned}
P'(x) = -\frac{x}{5} + 250 &= 0 \\
\frac{x}{5} &= 250 \\
x &= 1\,250.
\end{aligned}$$
To determine whether this point is a maximum, we use the second derivative. Then
$$P''(x) = -\frac{1}{5} < 0 \ \text{for all } x \;\Rightarrow\; \text{relative maximum}.$$
This also indicates that $P(x)$ is a concave function for all values of $x$, and we can conclude that the stationary point is the absolute maximum of the function. Profit is therefore indeed a maximum at the production level of 1 250 tons. The maximum profit is
$$P(1\,250) = -\frac{(1\,250)^2}{10} + 250(1\,250) - 10\,000 = \text{R}146\,250.$$
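The maize farmer's calculation above can be checked with a short Python sketch. The names `profit` and `marginal_profit` are illustrative, and the comparison with the neighbouring production levels 1 249 and 1 251 is an extra check added here, not a step from the study guide.

```python
def profit(x):
    """Annual profit in rand from producing x tons of mealies."""
    return -x**2 / 10 + 250 * x - 10_000


def marginal_profit(x):
    """Derivative of the profit function."""
    return -x / 5 + 250


# Solving -x/5 + 250 = 0 gives x = 250 * 5.
x_star = 250 * 5

print("stationary point:", x_star)
print("marginal profit there:", marginal_profit(x_star))
print("profit there:", profit(x_star))
print("profit one ton lower and higher:", profit(x_star - 1), profit(x_star + 1))
```

Because the profit function is concave, producing one ton fewer or one ton more than the stationary level gives a slightly lower profit.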

In general, we can say that for maximum profit
$$P'(x) = R'(x) - C'(x) = 0 \;\Rightarrow\; R'(x) = C'(x),$$
which means marginal revenue = marginal cost. This is the basis for the classical economic criterion, which states that profit is a maximum at the production level where marginal revenue equals marginal cost (provided that the profit function is concave, as it is in the examples of this section). Beyond this point (ie for larger $x$ values) the cost of producing additional units exceeds the revenue derived from the extra units.

The maize farmer's problem can also be solved using this criterion. We have
$$\begin{aligned}
R(x) &= 250x &&\Rightarrow\; R'(x) = 250 \\
C(x) &= \frac{x^2}{10} + 10\,000 &&\Rightarrow\; C'(x) = \frac{x}{5}.
\end{aligned}$$
For maximum profit
$$\begin{aligned}
R'(x) &= C'(x) \\
250 &= \frac{x}{5} \\
x &= 1\,250 \text{ tons}.
\end{aligned}$$
In Study unit 6 we modelled two problems, the dimensions of a box problem and an inventory problem, and we solved them with LINGO. Let us now see how these models are solved with differential calculus.

Example 13.3
Return to Section 6.2 for the dimensions problem. The NLP model found there is
Maximise $V(x) = 4(x^3 - 13x^2 + 40x)$
subject to $0 \le x \le 5$.

The derivatives are
$$\begin{aligned}
V'(x) &= 4(3x^2 - 26x + 40) \\
V''(x) &= 4(6x - 26).
\end{aligned}$$
Now the stationary points follow from setting $V'(x) = 0$. Then
$$\begin{aligned}
4(3x^2 - 26x + 40) &= 0 \\
4(3x - 20)(x - 2) &= 0 \\
x = \tfrac{20}{3} \ &\text{and} \ x = 2 \\
x = 6{,}67 \ &\text{and} \ x = 2.
\end{aligned}$$
The point $x = 6{,}67$ lies outside the interval [0; 5], so it is discarded. Therefore, the stationary point is at $x = 2$. Now, the second derivative at $x = 2$ is
$$V''(2) = 4[6(2) - 26] = 4(-14) = -56 < 0 \;\Rightarrow\; \text{maximum point}.$$
Therefore, a relative maximum exists at $x = 2$. To determine the absolute maximum, we must evaluate the function at the stationary point and at the endpoints. At the endpoints of the interval [0; 5] we have $V(0) = V(5) = 0$ (we will not have a box if $x = 0$ or $x = 5$), so we conclude that the stationary point $x = 2$ is the absolute maximum. The dimensions of the box will then be
length = 16 − 2x = 16 − 2(2) = 12 cm
width = 10 − 2x = 6 cm
height = 2 cm.
The maximum volume will be
$$V(2) = 4[2^3 - 13(2)^2 + 40(2)] = 144 \text{ cm}^3.$$
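The reasoning of Example 13.3 (keep the stationary points that lie in the feasible interval, then compare them with the endpoints) can be sketched in Python as follows. The variable names are illustrative; the roots of $V'(x) = 0$ are taken from the factorisation in the example rather than computed by the program.

```python
def volume(x):
    """Volume in cubic centimetres of the box when squares of side x are cut out."""
    return 4 * (x**3 - 13 * x**2 + 40 * x)


# Roots of V'(x) = 4(3x - 20)(x - 2) = 0, taken from the factorisation.
stationary_points = [20 / 3, 2]

lower, upper = 0, 5
feasible_points = [x for x in stationary_points if lower <= x <= upper]

# Candidates for the absolute maximum: endpoints plus feasible stationary points.
candidates = [lower, upper] + feasible_points
best_x = max(candidates, key=volume)

print("feasible stationary points:", feasible_points)
print("best x:", best_x, "with volume", volume(best_x))
print("dimensions:", 16 - 2 * best_x, 10 - 2 * best_x, best_x)
```

The endpoint values are both zero, so the comparison confirms that the stationary point inside the interval gives the largest volume.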


Example 13.4
Return to Section 6.3 for the inventory problem. The NLP model found there is
Minimise $C(x) = 100\,000\,000x^{-1} + 100x$
subject to $0 \le x \le 10\,000$.

The derivatives are
$$\begin{aligned}
C'(x) &= -100\,000\,000x^{-2} + 100 \\
C''(x) &= 200\,000\,000x^{-3}.
\end{aligned}$$
Now the stationary points follow from setting $C'(x) = 0$. Then
$$\begin{aligned}
-\frac{100\,000\,000}{x^2} + 100 &= 0 \\
100x^2 &= 100\,000\,000 \\
x^2 &= 1\,000\,000 \\
x &= \pm 1\,000.
\end{aligned}$$
Since $x = -1\,000$ is outside the domain of the function $C(x)$, it is discarded and $x = 1\,000$ is then the only stationary point. Now since $C''(x) > 0$ for all $x > 0$, we can deduce that $C''(1\,000) > 0$; therefore $x = 1\,000$ is a relative minimum point. Also, since $C''(x) > 0$, the function $C(x)$ is convex for $x > 0$; therefore $x = 1\,000$ is the absolute minimum point of $C(x)$.

Therefore, the order size should be 1 000 and the number of orders per year should be
$$\frac{\text{demand}}{\text{order size}} = \frac{10\,000}{1\,000} = 10.$$
The accompanying cost will be
$$C(1\,000) = \frac{100\,000\,000}{1\,000} + 100(1\,000) = \text{R}200\,000.$$
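The inventory result can be checked numerically with the Python sketch below. It assumes only the cost function of Example 13.4; the name `annual_cost` and the comparison with order sizes of 999 and 1 001 are illustrative additions.

```python
import math


def annual_cost(x):
    """Annual cost in rand when x motorcycles are ordered at a time (Example 13.4)."""
    return 100_000_000 / x + 100 * x


# Positive root of C'(x) = -100 000 000 / x^2 + 100 = 0.
order_size = math.sqrt(100_000_000 / 100)
orders_per_year = 10_000 / order_size

print("order size:", order_size)
print("orders per year:", orders_per_year)
print("annual cost:", annual_cost(order_size))
print("cost at 999 and 1001:", annual_cost(999), annual_cost(1001))
```

Because the cost function is convex for positive order sizes, any order size other than the stationary one gives a higher annual cost.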

13.4 Returning to LINGO

We now use the inventory problem to compare the solutions obtained by means of differential calculus and by means of LINGO (in Study unit 6).

Both methods yielded R200 000 as the minimum cost and this is obtained by having 1 000 motorcycles in each order. When solving by means of differential calculus, we tested that the solution obtained was indeed the optimal solution. At the time of solving the problem by means of LINGO, we accepted the solution and were warned that it might not be optimal. Subsequently we have acquired the skills needed to test the solution and will now proceed to do so.

By using Theorem 13.2, we can conclude that the LINGO solution is optimal if we can show that the function to be optimised, $C(x) = 100\,000\,000x^{-1} + 100x$, is a convex function on a convex domain. The feasible values of $x$ form the interval (0; 10 000] (the point $x = 0$ is excluded because $C(x)$ is undefined there). This interval is a straight line segment and is therefore a convex set. The derivatives of $C(x)$ are
$$\begin{aligned}
C'(x) &= -100\,000\,000x^{-2} + 100 \\
C''(x) &= 200\,000\,000x^{-3}.
\end{aligned}$$
Since $C''(x) > 0$ for all $x > 0$, we conclude that $C(x)$ is a convex function for $x > 0$, and therefore for $x \in (0; 10\,000]$. We have now shown that $C(x)$ is a convex function on a convex domain. Therefore, it follows that the solution obtained is the optimal solution.

NOTE: You must know whether the solution obtained for an NLP is the optimal solution to the problem or not. To make this deduction you must test whether the criteria of Theorem 13.1 (or Theorem 13.2) apply.

13.5 Exercises

1. The operating rate (expressed as a percentage) of factories, mines and utilities in a certain region of the country on the $t$th day of the year 2003 is given by the function
$$f(t) = 80 + \frac{1\,200t}{t^2 + 40\,000} \quad \text{for } 0 \le t \le 250.$$
On which day of the first 250 days of 2003 was the operating rate the highest? (You may assume that $f(t)$ is a concave function on [0; 250].)

2. Ship Shape produces a small cabin cruiser, the Neptune. The cost function is given by
$$C(x) = 100\,000 + 6\,000x + 4x^2,$$
where $x$ is the number of Neptunes produced per year. The unit price of a Neptune is related to demand by the following demand function:
$$p(x) = 30\,000 - 2x.$$
How many Neptunes should be produced to maximise profit? Find the maximum annual profit and the optimal selling price.

3. Titan Tyres produces a certain brand of tyres. The set-up cost for each of their production runs is R4 000. The production cost per tyre is R20. The annual storage cost per tyre is R2. The annual demand for the tyres is 1 000 000. Assuming uniformity of demand throughout the year and instantaneous production, determine how many tyres should be manufactured per production run in order to keep the total annual costs to a minimum.

13.6 Solutions to exercises

1. The first derivative is
$$\begin{aligned}
f'(t) &= 1\,200t \,\frac{d}{dt}\left[(t^2 + 40\,000)^{-1}\right] + (t^2 + 40\,000)^{-1}\,\frac{d}{dt}(1\,200t) \\
&= 1\,200t(-1)(t^2 + 40\,000)^{-2}(2t) + (t^2 + 40\,000)^{-1}(1\,200) \\
&= \frac{-2\,400t^2}{(t^2 + 40\,000)^2} + \frac{1\,200}{t^2 + 40\,000} \\
&= \frac{-2\,400t^2 + 1\,200(t^2 + 40\,000)}{(t^2 + 40\,000)^2} \\
&= \frac{1\,200(-t^2 + 40\,000)}{(t^2 + 40\,000)^2}.
\end{aligned}$$
The stationary points follow from $f'(t) = 0$. Then
$$\begin{aligned}
\frac{1\,200(-t^2 + 40\,000)}{(t^2 + 40\,000)^2} &= 0 \\
-t^2 + 40\,000 &= 0 \\
t^2 &= 40\,000 \\
t &= \pm 200.
\end{aligned}$$
Since $t = -200$ is outside the domain of the function $f(t)$, the only stationary point is at $t = 200$. It is given that $f(t)$ is a concave function on [0; 250]. Therefore, it follows that $t = 200$ is a maximum point.

The function is now evaluated at the endpoints of the interval and at the stationary point. Then
$$\begin{aligned}
f(0) &= 80 \\
f(200) &= 80 + \frac{1\,200(200)}{200^2 + 40\,000} = 83 \\
f(250) &= 82{,}93.
\end{aligned}$$

From this it follows that the point (200; 83) is the absolute maximum of $f(t)$. We conclude that the operating rate is at its highest, 83%, on the 200th day of 2003.

2. The marginal cost function is
$$C'(x) = 6\,000 + 8x.$$
The revenue function is
$$R(x) = p(x) \cdot x = (30\,000 - 2x)x = 30\,000x - 2x^2.$$
The marginal revenue function is
$$R'(x) = 30\,000 - 4x.$$
For maximum profit, we must have
$$\begin{aligned}
C'(x) &= R'(x) \\
6\,000 + 8x &= 30\,000 - 4x \\
12x &= 24\,000 \\
x &= 2\,000.
\end{aligned}$$
The optimal selling price can be calculated from the demand function as
$$p(2\,000) = 30\,000 - 2(2\,000) = 26\,000.$$
The profit function is
$$P(x) = R(x) - C(x) = 30\,000x - 2x^2 - (100\,000 + 6\,000x + 4x^2) = -6x^2 + 24\,000x - 100\,000,$$
and evaluated at $x = 2\,000$ gives
$$P(2\,000) = -6(2\,000)^2 + 24\,000(2\,000) - 100\,000 = 23\,900\,000.$$
Since $P''(x) = -12 < 0$ for all $x$, the profit function is concave and $x = 2\,000$ is the absolute maximum. We conclude that Ship Shape should produce 2 000 Neptunes per year and sell them for R26 000 each, and the maximum profit realised will be R23,9 million.

3. Let $x$ denote the number of tyres produced per production run. The annual demand is 1 000 000 tyres and this demand must be met. The number of production runs per year is therefore
$$\frac{\text{total demand per year}}{\text{number of tyres produced per production run}} = \frac{1\,000\,000}{x}.$$
The total annual cost is set-up cost + production cost + storage cost.

Set-up cost
The set-up cost per production run is R4 000. Therefore, the annual set-up cost is
$$4\,000 \times \text{the number of runs} = 4\,000\left(\frac{1\,000\,000}{x}\right) = \frac{4\,000\,000\,000}{x}.$$

Production cost
The production cost per tyre is R20. Therefore, the production cost per production run is $20x$ and the annual production cost is
$$20x \times \text{the number of runs} = 20x\left(\frac{1\,000\,000}{x}\right) = 20\,000\,000$$
OR
$$20 \times \text{demand} = 20 \times 1\,000\,000 = 20\,000\,000.$$

Storage cost
The number of tyres produced per production run is $x$ and so the average number of tyres in storage is $\frac{x}{2}$. This is the average number of tyres in storage during the year. The storage cost is R2 per tyre and the annual storage cost is therefore
$$2 \times \frac{x}{2} = x.$$

The total annual cost is
$$C(x) = \frac{4\,000\,000\,000}{x} + 20\,000\,000 + x = 4\,000\,000\,000x^{-1} + 20\,000\,000 + x.$$
Now
$$C'(x) = -4\,000\,000\,000x^{-2} + 1.$$
Stationary points follow from $C'(x) = 0$, that is,
$$\begin{aligned}
-\frac{4\,000\,000\,000}{x^2} + 1 &= 0 \\
x^2 &= 4\,000\,000\,000 \\
x &= \pm 63\,245{,}55 \approx 63\,246 \ \text{(ignore the negative root)}.
\end{aligned}$$
Now
$$C''(x) = 8\,000\,000\,000x^{-3} > 0 \ \text{for all } x > 0.$$
Therefore, $C$ is convex for all $x > 0$ and $x = 63\,246$ is the absolute minimum of $C$.

The company should manufacture 63 246 tyres in each of the
$$\frac{1\,000\,000}{63\,246} = 15{,}8 \approx 16$$
production runs during the year. The minimum annual cost is
$$C(63\,246) = \frac{4\,000\,000\,000}{63\,246} + 20\,000\,000 + 63\,246 = \text{R}20\,126\,491{,}11.$$


Chapter 14
Golden section search

Contents
14.1 Introduction
14.2 The golden section search
14.3 Exercises
14.4 Solutions to exercises

Sections from prescribed book, Winston Chapter 11, Section 11.5

Learning objectives
After completing this study unit you should be able to
• determine whether a function is unimodal
• use the golden section search to optimise a unimodal function of one variable.

14.1 Introduction

It may, for various reasons, be difficult to solve an NLP model by means of differential calculus and in such cases alternative solution techniques must be used. In this study unit we concentrate on NLP models where the function to be optimised is a special type of function, namely a unimodal function. We then solve such models by means of the golden section search, a solution technique that is based on a search algorithm.

14.2 The golden section search

Consider the following NLP model:
Maximise $z = f(x)$
subject to $a \le x \le b$.

The golden section search can be used to solve this model if the objective function $z$ is a unimodal function. The unimodality of the function is required to ensure that the interval [a; b] contains only one local maximum.

Study Winston, Section 11.5 for a description of the golden section search technique (omit the section on spreadsheets). Refer to Tutorial Letter 101 for the exact page references.

The strategy behind the golden section search is to obtain successive intervals of uncertainty, each smaller than the previous one. The golden section search is based on an algorithm that is repeated until the required accuracy is obtained. The golden section search involves the following two steps:

Step 1: Let [L; R] be an interval of uncertainty. (We usually choose $L = a$ and $R = b$ for the first iteration. The symbol $L$ refers to the left endpoint of the interval and $R$ to the right endpoint of the interval.) Calculate two points, $X$ and $Y$, using the golden section relation, $r = 0{,}618$ (the rounded value of $\frac{\sqrt{5} - 1}{2}$):
$$\begin{aligned}
X &= R - r(R - L) \\
Y &= L + r(R - L).
\end{aligned}$$

Step 2: Calculate the function values $f(X)$ and $f(Y)$. Determine which one of the following cases holds:
Case 1: If $f(X) < f(Y)$, then the smaller interval of uncertainty is (X; R].
Case 2: If $f(X) = f(Y)$, then the smaller interval of uncertainty is [L; Y).
Case 3: If $f(X) > f(Y)$, then the smaller interval of uncertainty is [L; Y).

We have obtained a new (smaller) interval of uncertainty. (Either $L$ moved in to $X$ or $R$ moved in to $Y$.) If the length of this new interval of uncertainty satisfies the required accuracy, then the algorithm stops. If the length of this new interval of uncertainty does not satisfy the required accuracy, then another iteration must be performed. To perform another iteration, think of the left endpoint of the new (most current) interval of uncertainty as $L$ and the right endpoint as $R$. Repeat steps 1 and 2.

By required accuracy we mean the following: The exact solution will usually not be reached after any finite number of iterations. Therefore, we decide beforehand on an acceptable level of accuracy, called the tolerance. This value is usually indicated by the symbol $\varepsilon$ (epsilon). Suppose we chose 0,5 as tolerance for a specific model. This tolerance implies that the iterations must be repeated until the length of the interval of uncertainty is less than or equal to the tolerance, $\varepsilon = 0{,}5$.

Assume we have completed an iteration resulting in the length of the interval of uncertainty being less than or equal to 0,5. A point from this interval of uncertainty can be selected and used as an approximation of the solution. If this is done, then
$$|\text{approximate solution} - \text{exact solution}| \le 0{,}5.$$
If we use the approximate solution instead of the exact solution, the error we make is less than or equal to 0,5.
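The two steps above translate almost directly into a loop. The following Python function is a minimal sketch of the procedure for a maximisation model, written for illustration only; the name `golden_section_search` and its parameters are assumptions made here and do not come from Winston or LINGO. For simplicity the sketch recalculates both points in every iteration instead of reusing the point carried over from the previous iteration. The objective used to try it out, $g(x) = -x^4 - x^2 + 4x$ on [0; 2] with tolerance 0,5, is the maximisation form of the model in Example 14.1.

```python
def golden_section_search(f, a, b, tolerance, r=0.618):
    """Shrink [a, b] around the maximum of a unimodal function f.

    Returns the endpoints of the final interval of uncertainty, whose
    length is less than or equal to the tolerance.
    """
    left, right = a, b
    while right - left > tolerance:
        # Step 1: two interior points from the golden section relation.
        x = right - r * (right - left)
        y = left + r * (right - left)
        # Step 2: keep the part of the interval that contains the maximum.
        if f(x) < f(y):
            left = x    # Case 1: new interval is (X; R]
        else:
            right = y   # Cases 2 and 3: new interval is [L; Y)
    return left, right


def g(x):
    """Maximisation form of the objective in Example 14.1."""
    return -x**4 - x**2 + 4 * x


low, high = golden_section_search(g, 0, 2, tolerance=0.5)
print(round(low, 3), round(high, 3))
```

Any point in the returned interval, for example its midpoint, can serve as the approximate solution. A smaller tolerance only increases the number of passes through the loop.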


The smaller the $\varepsilon$ that is chosen, the more accurate the approximation. The model will lead us in our choice of the tolerance. (You are not expected to be able to choose a tolerance.)

The golden section search is now illustrated by means of an example.

Example 14.1
Use the golden section search to determine (to within an interval of length 0,5) the optimal solution to the following NLP model:
Minimise $f(x) = x^4 + x^2 - 4x$
subject to $0 \le x \le 2$.

Solution
We transform this model into a maximisation model by setting $g(x) = -f(x)$, and the resultant model is
Maximise $g(x) = -x^4 - x^2 + 4x$
subject to $0 \le x \le 2$.

For this problem $a = 0$, $b = 2$. So initially the interval of uncertainty is
$$I_0 = [a; b] = [0; 2],$$
and the length of the interval of uncertainty is
$$L_0 = b - a = 2 - 0 = 2.$$
Here $L_0 = 2 > 0{,}5$, the tolerance; so we proceed with the algorithm.

Iteration 1
The first point is
$$x_1 = b - r(b - a) = 2 - 0{,}618(2) = 0{,}764.$$
Its function value is $g(x_1) = g(0{,}764) = 2{,}132$.
The second point is
$$x_2 = a + r(b - a) = 0 + 0{,}618(2) = 1{,}236.$$
Its function value is $g(x_2) = g(1{,}236) = 1{,}082$.
Now $g(x_1) > g(x_2)$, which means
$$I_1 = [a; x_2) = [0; 1{,}236), \qquad L_1 = 1{,}236 - 0 = 1{,}236 > 0{,}5.$$

Iteration 2
Determine two points:
$$\begin{aligned}
x_3 &= x_2 - r(x_2 - a) = 1{,}236 - 0{,}618(1{,}236) = 0{,}472, \\
x_4 &= a + r(x_2 - a) = 0 + 0{,}618(1{,}236) = 0{,}764.
\end{aligned}$$
The function values are
$$g(x_3) = 1{,}616; \qquad g(x_4) = 2{,}132.$$
Now $g(x_3) < g(x_4)$, which means
$$I_2 = (x_3; x_2) = (0{,}472; 1{,}236), \qquad L_2 = 1{,}236 - 0{,}472 = 0{,}764 > 0{,}5.$$

Iteration 3
Determine two points:
$$\begin{aligned}
x_5 &= 1{,}236 - 0{,}618(0{,}764) = 0{,}764, \\
x_6 &= 0{,}472 + 0{,}618(0{,}764) = 0{,}944.
\end{aligned}$$
The function values are
$$g(x_5) = 2{,}132; \qquad g(x_6) = 2{,}091.$$
Now $g(x_5) > g(x_6)$, which means
$$I_3 = (x_3; x_6) = (0{,}472; 0{,}944), \qquad L_3 = 0{,}944 - 0{,}472 = 0{,}472 < 0{,}5.$$
The length of this interval of uncertainty is less than the required tolerance. So the algorithm stops. The maximum of $g(x)$, and therefore the minimum of $f(x)$, occurs in the interval (0,472; 0,944).

Example 14.2
Use the golden section search to determine (to within an interval of length 0,5) the optimal solution to the following NLP model:
Minimise $f(x) = \frac{1}{2} + x^5 - \frac{4}{5}x$
subject to $0 \le x \le 1$.

Solution
We transform this model into a maximisation model by setting $g(x) = -f(x)$, and the resultant model is
Maximise $g(x) = -\frac{1}{2} - x^5 + \frac{4}{5}x$
subject to $0 \le x \le 1$.

For this problem $a = 0$, $b = 1$. So initially the interval of uncertainty is
$$I_0 = [a; b] = [0; 1],$$
and the length of the interval of uncertainty is
$$L_0 = b - a = 1 - 0 = 1.$$
Here $L_0 = 1 > 0{,}5$, the tolerance; so we proceed with the algorithm.

Iteration 1
The first point is
$$x_1 = b - r(b - a) = 1 - 0{,}618(1) = 0{,}382.$$
Its function value is $g(x_1) = g(0{,}382) = -0{,}203$.
The second point is
$$x_2 = a + r(b - a) = 0 + 0{,}618(1) = 0{,}618.$$
Its function value is $g(x_2) = g(0{,}618) = -0{,}096$.
Now $g(x_1) < g(x_2)$, which means
$$I_1 = (x_1; b] = (0{,}382; 1], \qquad L_1 = 1 - 0{,}382 = 0{,}618 > 0{,}5.$$

Iteration 2
Determine two points:
$$\begin{aligned}
x_3 &= b - r(b - x_1) = 1{,}000 - 0{,}618(0{,}618) = 0{,}618, \\
x_4 &= x_1 + r(b - x_1) = 0{,}382 + 0{,}618(0{,}618) = 0{,}764.
\end{aligned}$$
The function values are
$$g(x_3) = -0{,}096; \qquad g(x_4) = -0{,}149.$$
Now $g(x_3) > g(x_4)$, which means
$$I_2 = (x_1; x_4) = (0{,}382; 0{,}764), \qquad L_2 = 0{,}764 - 0{,}382 = 0{,}382 < 0{,}5.$$
The length of this interval of uncertainty is less than the required tolerance. So the algorithm stops. The maximum of $g(x)$, and therefore the minimum of $f(x)$, occurs in the interval (0,382; 0,764).

Of course, in these examples, the local minimum of f (x) could also have been determined by applying the bisection method or Newton’s method for determining the zeros of the function f ′ (x) or the roots of the equation f ′ (x) = 0 in the relevant intervals. The algorithm for the golden section search can also be implemented in Maxima.
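The iterations of Example 14.2 can be sketched in a few lines of Python. This is only an illustrative sketch, not the Maxima implementation referred to above: the function name `golden_section_max` is hypothetical, the rounded value r = 0,618 is taken from the worked examples, and both trial points are recomputed in every iteration rather than reused.

```python
def golden_section_max(g, a, b, tol, r=0.618):
    """Narrow [a, b] around the maximiser of g until its length is below tol.

    Intended for a function g that is unimodal on [a, b]; r = 0.618 is the
    rounded golden-section ratio used in the worked examples.
    """
    while b - a >= tol:
        x1 = b - r * (b - a)   # left trial point
        x2 = a + r * (b - a)   # right trial point
        if g(x1) < g(x2):
            a = x1             # discard [a, x1]: keep the right-hand part
        else:
            b = x2             # discard [x2, b]: keep the left-hand part
    return a, b


# Example 14.2: maximise g(x) = -1/2 - x^5 + (4/5)x on [0, 1], tolerance 0.5
low, high = golden_section_max(lambda x: -0.5 - x**5 + 0.8 * x, 0.0, 1.0, 0.5)
print(round(low, 3), round(high, 3))  # prints 0.382 0.764
```

The returned pair is the final interval of uncertainty; a smaller tolerance simply runs more iterations.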

14.3 Exercises
1. Use the golden section search to determine (within an interval of 0,8) the optimal solution to

Maximise $f(x) = x^2 + 2x$ subject to $-3 \le x \le 5$.

(This problem is from Winston, Section 11.5.)

14.4 Solutions to exercises
1. For this problem $a = -3$ and $b = 5$. So initially the interval of uncertainty is $I_0 = [a; b] = [-3; 5]$, and the length of the interval of uncertainty is $L_0 = b - a = 5 - (-3) = 8$. Here $L_0 = 8 > 0{,}8$; so we proceed with the algorithm.

Iteration 1
Determine two points:
$$x_1 = b - r(b - a) = 5 - 0{,}618(8) = 0{,}056,$$
$$x_2 = a + r(b - a) = -3 + 0{,}618(8) = 1{,}944.$$
The function values are $f(x_1) = f(0{,}056) = 0{,}115$ and $f(x_2) = f(1{,}944) = 7{,}667$.

Now $f(x_1) < f(x_2)$, which means $I_1 = (x_1; b] = (0{,}056; 5]$ and $L_1 = 5 - 0{,}056 = 4{,}944 > 0{,}8$.

Iteration 2
Determine two points:
$$x_3 = 5 - 0{,}618(4{,}944) = 1{,}944,$$
$$x_4 = 0{,}056 + 0{,}618(4{,}944) = 3{,}111.$$
The function values are $f(x_3) = 7{,}667$ and $f(x_4) = 15{,}900$.

Now $f(x_3) < f(x_4)$, which means $I_2 = (x_3; b] = (1{,}944; 5]$ and $L_2 = 5 - 1{,}944 = 3{,}056 > 0{,}8$.

Iteration 3
Determine two points:
$$x_5 = 5 - 0{,}618(3{,}056) = 3{,}111,$$
$$x_6 = 1{,}944 + 0{,}618(3{,}056) = 3{,}833.$$
The function values are $f(x_5) = 15{,}900$ and $f(x_6) = 22{,}358$.

Now $f(x_5) < f(x_6)$, which means $I_3 = (x_5; b] = (3{,}111; 5]$ and $L_3 = 5 - 3{,}111 = 1{,}889 > 0{,}8$.

Iteration 4
Determine two points:
$$x_7 = 5 - 0{,}618(1{,}889) = 3{,}833,$$
$$x_8 = 3{,}111 + 0{,}618(1{,}889) = 4{,}278.$$
The function values are $f(x_7) = 22{,}358$ and $f(x_8) = 26{,}857$.

Now $f(x_7) < f(x_8)$, which means $I_4 = (x_7; b] = (3{,}833; 5]$ and $L_4 = 5 - 3{,}833 = 1{,}167 > 0{,}8$.

Iteration 5
Determine two points:
$$x_9 = 5 - 0{,}618(1{,}167) = 4{,}279,$$
$$x_{10} = 3{,}833 + 0{,}618(1{,}167) = 4{,}554.$$
The function values are $f(x_9) = 26{,}868$ and $f(x_{10}) = 29{,}847$.

Now $f(x_9) < f(x_{10})$, which means $I_5 = (x_9; b] = (4{,}279; 5]$ and $L_5 = 5 - 4{,}279 = 0{,}721 < 0{,}8$. The length of this interval of uncertainty is less than the required tolerance; so the algorithm stops. Therefore, $f(x) = x^2 + 2x$ is maximised in the interval $(4{,}279; 5]$.


Chapter 15 Integration

Contents
15.1 Introduction
15.2 Antiderivatives
15.3 The indefinite integral
15.4 The basic rules of integration
15.5 Integration by substitution
15.6 The area under the graph of a function
15.7 The fundamental theorem of calculus
15.8 The method of substitution for definite integrals
15.9 Consumers’ and producers’ surplus
15.9.1 Consumers’ surplus
15.9.2 Producers’ surplus
15.10 Numerical integration
15.10.1 Trapezoidal rule
15.10.2 Computer algorithm for trapezoidal rule
15.11 Exercises
15.12 Solutions to exercises

DSC2606 CHAPTER 15 INTEGRATION

Sections from prescribed book, Winston Chapter 12, Section 12.1

Learning objectives
After completing this study unit you should be able to
• explain the concepts of antiderivative of a function, integral of a function, integration, indefinite integral, integrand, constant of integration, definite integral, consumers’ surplus and producers’ surplus
• calculate integrals of functions by applying the rules of integration
• explain the fundamental theorem of calculus and apply it
• use integral calculus in rate of change problems and in marginal analysis
• approximate a definite integral by applying a numerical integration technique such as the trapezoidal rule
• implement the algorithm for the trapezoidal rule using a computer package such as Maxima.

15.1 Introduction Differential calculus is concerned with the problem of finding the rate of change of one quantity with respect to another. In this study unit we begin the study of the other branch of calculus, known as integral calculus. Here we are interested in precisely the opposite problem: if we know the rate of change of one quantity with respect to another, can we find the relationship between these two quantities? The principal tool used in the study of integral calculus is the antiderivative of a function, and we develop rules for antidifferentiation, or integration, as the process of finding the antiderivative is called.

15.2 Antiderivatives
In previous study units we discussed the concepts of cost and marginal cost functions. Assume the total cost function associated with the production of x units of a product is represented by the function C(x). We are interested in finding the rate of change in the cost when x units are produced. This is given by the marginal cost function C′(x), which is the derivative of C(x) with respect to x.

Now we consider the opposite problem: given the rate of change in cost when x units are produced, can we find the total cost function? This boils down to the question: if we know the marginal cost function C′(x), can we find the cost function C(x)? In order to solve this problem, we need the concept of an antiderivative of a function.

Definition 15.1 A function F(x) is an antiderivative of f (x) on an interval I if F ′ (x) = f (x) for all x ∈ I. An antiderivative of a function f (x) is a function F(x) whose derivative is f (x).

This is illustrated by means of examples.

Example 15.1
Consider the function $F(x) = x^3 + 2x + 1$. The derivative of $F(x)$ is
$$F'(x) = \frac{d}{dx}\left(x^3 + 2x + 1\right) = 3x^2 + 2 = f(x).$$
Then $F(x) = x^3 + 2x + 1$ is an antiderivative of $f(x) = 3x^2 + 2$.

Example 15.2
Consider the function $F(x) = \frac{1}{3}x^3 - 2x^2 + x - 1$. We now want to find the function $f(x)$ so that $F(x)$ is an antiderivative of $f(x)$. We differentiate $F(x)$ and find $F'(x) = x^2 - 4x + 1$. Therefore, the required function is $f(x) = x^2 - 4x + 1$.

Consider three functions, $F(x) = x$, $G(x) = x + 2$ and $H(x) = x + C$, where C is a constant. The derivatives of these functions are
$$F'(x) = \frac{d}{dx}(x) = 1, \qquad G'(x) = \frac{d}{dx}(x + 2) = 1, \qquad H'(x) = \frac{d}{dx}(x + C) = 1.$$

If we set f (x) = 1, we see that its antiderivatives are F(x), G(x) and H(x). This shows that once an antiderivative F(x) of a function f (x) is known, another antiderivative may be found by adding an arbitrary constant to the function F(x).

Definition 15.2 Let G(x) be an antiderivative of a function f (x). Then every antiderivative F(x) of f (x) must be of the form F(x) = G(x) +C, where C is a constant.

We see that there are infinitely many antiderivatives of the function f(x) = 1. We obtain each one by specifying the constant C in the function F(x) = x + C. The graph in Figure 15.1 shows some of these antiderivatives for selected values of C. These graphs constitute part of a family of infinitely many parallel straight lines, each having a slope equal to 1.

Figure 15.1: Antiderivatives of f(x) = 1 (the lines F(x) = x + C for C = 3, 2, 1, 0 and −1)

Example 15.3
Prove that the function $G(x) = x^2$ is an antiderivative of the function $f(x) = 2x$. Give a general expression for the antiderivatives of $f(x) = 2x$.

Solution
Since $G'(x) = 2x = f(x)$, we can say that $G(x) = x^2$ is an antiderivative of $f(x) = 2x$. We know that every antiderivative of the function $f(x) = 2x$ has the form $F(x) = x^2 + C$, where C is some constant. The graph in Figure 15.2 shows a few of these antiderivatives.

Figure 15.2: Antiderivatives of f(x) = 2x (the curves $F(x) = x^2 + 1$, $F(x) = x^2$ and $F(x) = x^2 - \frac{1}{2}$, for C = 1, 0 and −1/2)

15.3 The indefinite integral
The process of finding all antiderivatives of a function is called antidifferentiation, or integration. We use the symbol $\int$, called an integral sign, to indicate that the operation of integration is to be performed on some function f(x); therefore,
$$\int f(x)\,dx = F(x) + C.$$
Read this as: “the indefinite integral of f(x) with respect to x equals F(x) plus C.” This tells us that the indefinite integral of f(x) is the family of functions given by F(x) + C, where F′(x) = f(x).

The function f(x) to be integrated is called the integrand and the constant C is called the constant of integration. The expression dx following the integrand f(x) reminds us that the operation is performed with respect to x. If the independent variable is t, we write $\int f(t)\,dt$ instead.

Using this notation, some previous results may be written as
$$\int 1\,dx = x + C \qquad\text{and}\qquad \int 2x\,dx = x^2 + K,$$
where C and K are arbitrary constants.

15.4 The basic rules of integration In this section we give basic rules for finding the indefinite integral, F(x), of a given function f (x), and then verify them by showing that F ′ (x) = f (x).

Rule
$$\int k\,dx = kx + C, \qquad k \text{ a constant}.$$
Observe that
$$\frac{d}{dx}(kx + C) = k.$$

Example 15.4
(a) $\int 2\,dx = 2x + C$.
(b) $\int \pi^2\,dx = \pi^2 x + C$.

Rule
The power rule is
$$\int x^n\,dx = \frac{1}{n+1}x^{n+1} + C, \qquad n \neq -1.$$
An antiderivative of a power function is another power function obtained from the integrand by increasing its power by one and dividing the resulting expression by the new power.

Observe that
$$\frac{d}{dx}\left[\frac{1}{n+1}x^{n+1} + C\right] = \frac{n+1}{n+1}x^n = x^n.$$

Example 15.5
(a) $\int x^3\,dx = \frac{1}{4}x^4 + C$.
(b) $\int x^{3/2}\,dx = \frac{1}{5/2}x^{5/2} + C = \frac{2}{5}x^{5/2} + C$.
(c) $\int \frac{1}{x^{3/2}}\,dx = \int x^{-3/2}\,dx = \frac{1}{-1/2}x^{-1/2} + C = -\frac{2}{\sqrt{x}} + C$.

Rule
$$\int c\,f(x)\,dx = c\int f(x)\,dx, \qquad c \text{ a constant}.$$

The indefinite integral of a constant multiple of a function is equal to the constant multiple of the indefinite integral of the function.

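Example 15.6
(a)
$$\begin{aligned}
\int 2t^3\,dt &= 2\int t^3\,dt \\
&= 2\left(\frac{1}{4}t^4 + k\right) \\
&= \frac{1}{2}t^4 + 2k \\
&= \frac{1}{2}t^4 + C, \quad\text{where } C = 2k.
\end{aligned}$$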

NOTE: From now on we write the constant of integration as C, since any nonzero constant multiple of an arbitrary constant is an arbitrary constant.

(b)
$$\begin{aligned}
\int -3x^{-2}\,dx &= -3\int x^{-2}\,dx \\
&= (-3)(-1)x^{-1} + C \\
&= \frac{3}{x} + C.
\end{aligned}$$

Rule
The sum or difference rule
$$\int \left[f(x) + g(x)\right]dx = \int f(x)\,dx + \int g(x)\,dx$$
and
$$\int \left[f(x) - g(x)\right]dx = \int f(x)\,dx - \int g(x)\,dx.$$

The indefinite integral of a sum (or difference) of two integrable functions is equal to the sum (or difference) of their integrals.

This result is easily extended to the case involving the sum (or difference) of any finite number of functions.

Example 15.7
$$\begin{aligned}
\int \left(3x^5 + 4x^{3/2} - 2x^{-1/2}\right)dx
&= \int 3x^5\,dx + \int 4x^{3/2}\,dx - \int 2x^{-1/2}\,dx \\
&= 3\int x^5\,dx + 4\int x^{3/2}\,dx - 2\int x^{-1/2}\,dx \\
&= 3\cdot\frac{1}{6}x^6 + 4\cdot\frac{2}{5}x^{5/2} - 2\cdot 2x^{1/2} + C \\
&= \frac{1}{2}x^6 + \frac{8}{5}x^{5/2} - 4\sqrt{x} + C.
\end{aligned}$$

Rule
$$\int e^x\,dx = e^x + C.$$
The indefinite integral of the exponential function is equal to the function itself plus the constant of integration.

Example 15.8
$$\int 2e^x\,dx = 2\int e^x\,dx = 2e^x + C.$$

Rule
$$\int x^{-1}\,dx = \int \frac{1}{x}\,dx = \ln|x| + C.$$
The indefinite integral of x to the power −1, or one over x, is equal to ln |x| plus the constant of integration.

NOTE: The power rule has an exception, namely that it is not applicable to $x^{-1}$. Observe that
$$\frac{d}{dx}\ln|x| = \frac{1}{x}$$
(Rule 9.4 of differentiation).

Example 15.9
$$\int \frac{3}{x}\,dx = 3\int \frac{1}{x}\,dx = 3\ln|x| + C.$$

We now illustrate these rules of integration by means of some examples.

Example 15.10
Find the function $f(x)$ if it is known that $f'(x) = 3x^2 - 4x + 8$ and $f(1) = 9$.

Solution
Integrating the function $f'(x)$ we find
$$\begin{aligned}
f(x) &= \int f'(x)\,dx \\
&= \int \left(3x^2 - 4x + 8\right)dx \\
&= x^3 - 2x^2 + 8x + C.
\end{aligned}$$
Using the condition $f(1) = 9$, we have
$$f(1) = 1^3 - 2(1)^2 + 8(1) + C = 7 + C = 9 \;\Rightarrow\; C = 2.$$
Therefore, the required function is given by $f(x) = x^3 - 2x^2 + 8x + 2$.

Example 15.11
The management of the Staedtler Office Company has determined that the daily marginal revenue function associated with producing and selling their battery-operated pencil sharpeners is given by
$$R'(x) = -0{,}0006x + 6,$$
where x denotes the number of units produced and sold and $R'(x)$ is measured in rand per unit. Determine the revenue function $R(x)$ associated with producing and selling these pencil sharpeners.

Solution
The revenue function is found by integrating the marginal revenue function:
$$\begin{aligned}
R(x) &= \int R'(x)\,dx \\
&= \int (-0{,}0006x + 6)\,dx \\
&= -0{,}0003x^2 + 6x + C.
\end{aligned}$$
To determine the value of the constant C, observe that the revenue realised by Staedtler is zero when the production and sales level is zero; that is, $R(0) = 0$. This condition implies that
$$R(0) = -0{,}0003(0)^2 + 6(0) + C = 0 \quad\text{or}\quad C = 0.$$
The required revenue function is given by $R(x) = -0{,}0003x^2 + 6x$.

15.5 Integration by substitution
In this section we introduce a method of integration called the method of substitution, which is related to the chain rule for differentiation. When used in conjunction with the basic rules of integration, the method of substitution is a powerful tool for integrating a large class of functions.

Consider the indefinite integral
$$\int 2(2x + 4)^5\,dx.$$
One way of evaluating this integral is to expand the expression $(2x + 4)^5$ and then integrate the resulting integrand term by term. Let us see if we can use an alternative approach to simplify this process. We do so by making a change in variable. Let $u = 2x + 4$ and, since it is a function of x, we write $u = 2x + 4 = g(x)$. The derivative of this is
$$\frac{du}{dx} = 2 \quad\text{or}\quad g'(x) = 2.$$
We rewrite this as
$$\frac{du}{dx} = 2 \;\Rightarrow\; du = 2\,dx \;\Rightarrow\; du = g'(x)\,dx, \quad\text{since } g'(x) = 2.$$
We now return to our original indefinite integral and substitute $u = 2x + 4$ and $du = 2\,dx$ into it. Then
$$\int 2(2x + 4)^5\,dx = \int (2x + 4)^5\,(2\,dx) = \int u^5\,du.$$
Now the last integral involves a power function and is easily evaluated using the power rule. Then
$$\int u^5\,du = \frac{1}{6}u^6 + C.$$
Using this result and replacing u with $2x + 4$, we obtain
$$\int 2(2x + 4)^5\,dx = \frac{1}{6}(2x + 4)^6 + C.$$

We can verify that the foregoing result is indeed correct by computing the following by means of the chain rule:
$$\frac{d}{dx}\left[\frac{1}{6}(2x + 4)^6\right] = \frac{1}{6}\cdot 6(2x + 4)^5(2) = 2(2x + 4)^5,$$
which is just the integrand.

Let us summarise the steps involved in integration by substitution:
Step 1: Let $u = g(x)$, where $g(x)$ is part of the integrand, usually the “inside function” of the composite function $f[g(x)]$.
Step 2: Calculate $g'(x)$ and derive $du = g'(x)\,dx$ (from $g'(x) = \frac{du}{dx}$).
Step 3: Use the substitution $u = g(x)$ and $du = g'(x)\,dx$ to convert the entire integrand into one involving only u.
Step 4: Evaluate the resulting integral.
Step 5: Replace u with $g(x)$ to obtain the final solution as a function of x.

Example 15.12
Evaluate
$$\int t^2\left(t^3 + 1\right)^{3/2}dt.$$

Solution
Step 1: The integrand contains the composite function $(t^3 + 1)^{3/2}$ with “inside function” $g(t) = t^3 + 1$. Let $u = t^3 + 1$.
Step 2: Now
$$g'(t) = \frac{du}{dt} = 3t^2 \;\Rightarrow\; du = 3t^2\,dt \quad\text{or}\quad t^2\,dt = \frac{1}{3}\,du.$$
Step 3: Substitute $u = t^3 + 1$ and $t^2\,dt = \frac{1}{3}\,du$; then
$$\int t^2\left(t^3 + 1\right)^{3/2}dt = \int u^{3/2}\left(\frac{1}{3}\,du\right) = \frac{1}{3}\int u^{3/2}\,du.$$
Step 4: Integrate:
$$\frac{1}{3}\int u^{3/2}\,du = \frac{1}{3}\cdot\frac{2}{5}u^{5/2} + C = \frac{2}{15}u^{5/2} + C.$$
Step 5: Replace u with $t^3 + 1$. Then
$$\int t^2\left(t^3 + 1\right)^{3/2}dt = \frac{2}{15}\left(t^3 + 1\right)^{5/2} + C.$$
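A result obtained by substitution can always be checked by differentiating it, as was done for the first integral of this section. The following Python sketch does this numerically for Example 15.12. It is only an illustration: the helper `central_difference` and the sample points are assumptions made for this check, and a numerical derivative confirms the answer only approximately.

```python
def integrand(t):
    """Integrand of Example 15.12: t^2 (t^3 + 1)^(3/2)."""
    return t**2 * (t**3 + 1) ** 1.5


def antiderivative(t):
    """Result of Example 15.12 with C = 0: (2/15)(t^3 + 1)^(5/2)."""
    return (2 / 15) * (t**3 + 1) ** 2.5


def central_difference(func, t, h=1e-5):
    """Approximate the derivative of func at t with a symmetric difference."""
    return (func(t + h) - func(t - h)) / (2 * h)


# Compare the numerical derivative of the answer with the integrand.
sample_points = [0.0, 0.5, 1.0, 2.0]
max_error = max(
    abs(central_difference(antiderivative, t) - integrand(t))
    for t in sample_points
)
print(max_error < 1e-6)  # the derivative matches the integrand at these points
```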

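In the remaining examples, we drop the practice of labelling the steps involved in evaluating each integral.

Example 15.13
Evaluate
$$\int \frac{a}{3a^2 + 1}\,da.$$

Solution
Let $u = 3a^2 + 1$. Then $\frac{du}{da} = 6a$ or $du = 6a\,da$ or $a\,da = \frac{1}{6}\,du$. Making the appropriate substitutions we have
$$\begin{aligned}
\int \frac{a}{3a^2 + 1}\,da &= \int \frac{1}{u}\cdot\frac{1}{6}\,du \\
&= \frac{1}{6}\int \frac{1}{u}\,du \\
&= \frac{1}{6}\ln|u| + C \\
&= \frac{1}{6}\ln\left(3a^2 + 1\right) + C \qquad (3a^2 + 1 > 0).
\end{aligned}$$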

Example 15.14
Evaluate
$$\int \frac{(\ln x)^2}{2x}\,dx.$$

Solution
Let $u = \ln x$. Then $\frac{du}{dx} = \frac{1}{x}$ or $du = \frac{1}{x}\,dx$. Making the appropriate substitutions we have
$$\begin{aligned}
\int \frac{(\ln x)^2}{2x}\,dx &= \int \frac{(\ln x)^2}{2}\cdot\frac{1}{x}\,dx \\
&= \frac{1}{2}\int u^2\,du \\
&= \frac{1}{6}u^3 + C \\
&= \frac{1}{6}(\ln x)^3 + C.
\end{aligned}$$

Example 15.15
A study prepared by the marketing department of the Universal Instruments Company forecasts that, after its new line of home computers is introduced onto the market, sales will grow at the rate of
$$f(t) = 2\,000 - 1\,500e^{-0{,}05t} \quad\text{for } 0 \le t \le 60,$$
units per month.
(a) Find an expression that gives the total number of computers that will be sold t months after they become available on the market.
(b) How many computers will be sold in the first year they are on the market?

Solution
(a) Let $N(t)$ denote the total number of computers that will be sold t months after their introduction into the market. We have $N'(t) = 2\,000 - 1\,500e^{-0{,}05t}$. Now
$$N(t) = \int \left(2\,000 - 1\,500e^{-0{,}05t}\right)dt = 2\,000t - 1\,500\int e^{-0{,}05t}\,dt.$$
Now let $u = -0{,}05t$; then $\frac{du}{dt} = -0{,}05$ and $du = -0{,}05\,dt$. We have
$$\begin{aligned}
\int e^{-0{,}05t}\,dt &= \int e^u\left(\frac{du}{-0{,}05}\right) \\
&= -\frac{1}{0{,}05}\int e^u\,du \\
&= -\frac{e^u}{0{,}05} + C \\
&= -\frac{e^{-0{,}05t}}{0{,}05} + C.
\end{aligned}$$
Therefore,
$$N(t) = 2\,000t - 1\,500\left(-\frac{e^{-0{,}05t}}{0{,}05}\right) + C = 2\,000t + 30\,000e^{-0{,}05t} + C.$$
To determine the value of C, note that the number of computers sold at the end of month 0 is zero; so $N(0) = 0$. Now
$$N(0) = 30\,000e^0 + C = 30\,000 + C = 0 \;\Rightarrow\; C = -30\,000.$$
The required expression is
$$N(t) = 2\,000t + 30\,000e^{-0{,}05t} - 30\,000.$$
(b) The expression $N(12)$ gives the number of computers sold in the first year, and
$$N(12) = 2\,000(12) + 30\,000e^{-0{,}05(12)} - 30\,000 = 10\,464{,}35 \approx 10\,464.$$
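The arithmetic of Example 15.15 is easy to reproduce with a short script. The sketch below only evaluates the closed-form expression derived above; the function name `total_sales` is an assumption chosen for this illustration.

```python
import math


def total_sales(t):
    """N(t) = 2000 t + 30000 e^(-0.05 t) - 30000 from Example 15.15."""
    return 2000 * t + 30000 * math.exp(-0.05 * t) - 30000


start_value = total_sales(0)       # initial condition N(0) = 0
first_year = total_sales(12)       # computers sold in the first 12 months

print(round(first_year, 2))  # prints 10464.35
```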

15.6 The area under the graph of a function
Suppose that a country’s annual rate of petrol consumption over a four-year period is constant and is given by the function
$$f(t) = 1{,}2 \quad\text{for } 0 \le t \le 4,$$
where t is measured in years and $f(t)$ in millions of barrels per year. The country’s total petrol consumption (in million barrels) over the period of time in question is

Rate of consumption × time lapsed = 1,2(4 − 0) = 4,8.

If we examine the graph in Figure 15.3, we see that this total is the area of the rectangular region bounded above by the graph of $f(t)$, below by the t-axis, and to the left and right by the vertical lines t = 0 and t = 4, respectively.

Figure 15.3: Area of constant petrol consumption

However, the rate of petrol consumption is not constant. The actual rate of petrol consumption of the country over a four-year period is given by a function $f(t)$ as illustrated in Figure 15.4.

Figure 15.4: Area of fluctuating petrol consumption

It seems fair to say that the country’s total petrol consumption over the four-year period is given by the area of the region bounded above by the graph of $f(t)$, below by the t-axis, and to the left and right by the vertical lines t = 0 and t = 4, respectively.

This example touches on a fundamental problem of calculus, namely to calculate the area of the region bounded by the graph of a function $f(x)$, the x-axis and the vertical lines x = a and x = b as shown in Figure 15.5. This area is called the area under the graph of $f(x)$ on the interval [a; b] or from a to b.

How do we find this area? Let us start with something familiar. We know that the area of a rectangle is given by the product of the lengths of two adjacent sides. Consider the function $f(x) = x^2$, and suppose we are interested in the region R under the graph of $f(x)$ on the interval [0; 1] as shown in Figure 15.6. In order to obtain an approximation of the area of R, we construct four nonoverlapping rectangles (n = 4), each with a width of $\Delta x$, as in Figure 15.7.

Figure 15.5: Area under f(x) on [a; b]

Figure 15.6: Area under f(x) = x² on [0; 1]

Figure 15.7: Area under f(x) = x² for n = 4

The sum of the areas of these rectangles, that is
$$\sum_{i=1}^{4} \Delta x\, f(x_i) = \Delta x\, f(x_1) + \Delta x\, f(x_2) + \Delta x\, f(x_3) + \Delta x\, f(x_4),$$
gives an approximation of the area of R.

We now increase the number of rectangles to 8 (n = 8) and then to 16 (n = 16), as shown in Figure 15.8.

Figure 15.8: Area under f(x) = x² for (a) n = 8 and (b) n = 16

These graphs suggest that the approximation of the area of R, denoted by A, improves as n increases. This is confirmed by the following table of values generated for increasing values of n:

Number of rectangles n    4          8          16         32         64
Approximation of A        0,328125   0,332031   0,333008   0,333252   0,333313

From these values we see that the approximation approaches the number $\frac{1}{3}$ as n gets larger and larger. In general, we can say that the required area is given by a Riemann sum as follows:
$$A \approx \sum_{i=1}^{n} \Delta x\, f(x_i) = \Delta x\, f(x_1) + \Delta x\, f(x_2) + \cdots + \Delta x\, f(x_n),$$
which will approach a unique number as n becomes arbitrarily large. We can therefore say that the area of the region under the graph of a continuous function is given by
$$A = \lim_{n\to\infty}\left(\Delta x\, f(x_1) + \Delta x\, f(x_2) + \cdots + \Delta x\, f(x_n)\right).$$
This limit is called the definite integral of $f(x)$ from a to b and is denoted by
$$\int_a^b f(x)\,dx,$$
where a is called the lower limit of integration and b the upper limit of integration. We conclude that the area of the region under the graph of $f(x)$ on the interval [a; b] is given by $\int_a^b f(x)\,dx$ and is illustrated in Figure 15.9.

Figure 15.9: Graphical representation of the definite integral, $A = \int_a^b f(x)\,dx$
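The approximations of the area under f(x) = x² on [0; 1] can be regenerated with a few lines of Python. The sketch below assumes that each point $x_i$ is the midpoint of its subinterval; the text does not state which points were used, but this choice gives the tabulated values for n = 4, 8, 16, 32 and 64. The function name `midpoint_riemann_sum` is hypothetical.

```python
def midpoint_riemann_sum(func, a, b, n):
    """Sum of n rectangle areas on [a, b], heights taken at subinterval midpoints."""
    width = (b - a) / n  # this is the width delta x of every rectangle
    return sum(width * func(a + (i + 0.5) * width) for i in range(n))


def square(x):
    return x**2


for n in (4, 8, 16, 32, 64):
    approximation = midpoint_riemann_sum(square, 0.0, 1.0, n)
    print(n, round(approximation, 6))
```

As n grows, the printed values move towards 1/3, in line with the limit that defines the definite integral.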

15.7 The fundamental theorem of calculus
The following theorem shows how to evaluate the definite integral of a continuous function provided we can find an antiderivative of that function. Because of its importance in establishing the relationship between differentiation and integration, this theorem, discovered independently by Sir Isaac Newton (1642-1727) in England and Gottfried Wilhelm Leibniz (1646-1716) in Germany, is called the fundamental theorem of calculus.

Theorem 15.1
If a function $f(x)$ is continuous on [a; b], then
$$\int_a^b f(x)\,dx = F(b) - F(a),$$
where $F(x)$ is any antiderivative of $f(x)$, that is, $F'(x) = f(x)$.

We use the following notation when applying this theorem:
$$\int_a^b f(x)\,dx = F(x)\Big|_a^b = F(b) - F(a).$$

Example 15.16 Let R be the region under the graph of f (x) = x on the interval [1; 3]. Use the fundamental theorem of calculus to find the area A of R and verify your result by elementary means.

Figure 15.10: Area under f(x) = x on [1; 3]

Solution
The region R is shown in Figure 15.10. The area of R is given by the definite integral of $f(x)$ from 1 to 3, that is,
$$A = \int_1^3 x\,dx.$$
In order to evaluate this definite integral, observe that an antiderivative of $f(x)$ is
$$F(x) = \frac{1}{2}x^2 + C,$$
where C is an arbitrary constant. Therefore,
$$\begin{aligned}
A &= \int_1^3 x\,dx \\
&= F(x)\Big|_1^3 \\
&= \left[\frac{1}{2}x^2 + C\right]_1^3 \\
&= \left(\frac{9}{2} + C\right) - \left(\frac{1}{2} + C\right) \\
&= 4 \text{ square units}.
\end{aligned}$$
To verify this result by elementary means, we divide the region R into two subregions: a rectangle $R_1$ and a triangle $R_2$. This is illustrated in Figure 15.11. The area A of R is the area of rectangle $R_1$ plus the area of triangle $R_2$, that is
$$\begin{aligned}
\text{area } R_1 + \text{area } R_2 &= (\text{length} \times \text{breadth}) + \left(\tfrac{1}{2}\,\text{base} \times \text{height}\right) \\
&= (2 \times 1) + \left(\tfrac{1}{2}\cdot 2 \times 2\right) = 4 \text{ square units},
\end{aligned}$$
which agrees with the result obtained by means of the definite integral.

Figure 15.11: Subdivision of area R

Observe that in evaluating the definite integral, the constant of integration “dropped out”. This is true in general, for if $F(x) + C$ denotes an antiderivative of some function $f(x)$, then
$$\left(F(x) + C\right)\Big|_a^b = [F(b) + C] - [F(a) + C] = F(b) + C - F(a) - C = F(b) - F(a).$$
With this fact in mind, we may, in all future computations involving the evaluation of a definite integral, drop the constant of integration.

Example 15.17
Use the fundamental theorem of calculus to show that the area under the graph of $f(x) = x^2$ on the interval [0; 1] is indeed $\frac{1}{3}$ square units.

Solution
The graphical representation of this is in Figure 15.6. The area of R is given by
$$\int_0^1 x^2\,dx = \frac{1}{3}x^3\Big|_0^1 = \frac{1}{3}(1) - \frac{1}{3}(0) = \frac{1}{3},$$
which corresponds to the $\frac{1}{3}$ square units indicated earlier.

Example 15.18
Evaluate
$$\int_1^3 \left(3x^2 + e^x\right)dx.$$

Solution
The definite integral is
$$\begin{aligned}
\int_1^3 \left(3x^2 + e^x\right)dx &= \int_1^3 3x^2\,dx + \int_1^3 e^x\,dx \\
&= x^3\Big|_1^3 + e^x\Big|_1^3 \\
&= \left(3^3 - 1^3\right) + \left(e^3 - e^1\right) \\
&= 26 + e^3 - e^1 \approx 43{,}37.
\end{aligned}$$

Example 15.19
The management of Staedtler Office Equipment has determined that the daily marginal cost function associated with producing battery-operated pencil sharpeners is given by
$$C'(x) = 0{,}000006x^2 - 0{,}006x + 4,$$
where $C'(x)$ is measured in rand per unit and x denotes the number of units produced. Management has also determined that the daily fixed cost incurred in producing pencil sharpeners is R100.
(a) What is Staedtler’s daily total cost if 500 pencil sharpeners are produced?
(b) What is the additional daily cost incurred if the production of pencil sharpeners increases from 200 to 400 units?

Solution
(a) The daily total cost function is
$$\begin{aligned}
C(x) &= \int C'(x)\,dx \\
&= \int \left(0{,}000006x^2 - 0{,}006x + 4\right)dx \\
&= \frac{0{,}000006}{3}x^3 - \frac{0{,}006}{2}x^2 + 4x + k \\
&= 0{,}000002x^3 - 0{,}003x^2 + 4x + k.
\end{aligned}$$
We must determine the value of the constant of integration k. The daily fixed cost is R100. This means that a cost of R100 is incurred even if no pencil sharpeners are produced; so $C(0) = 100$. Substituting this into $C(x)$, we have
$$C(0) = 0{,}000002(0)^3 - 0{,}003(0)^2 + 4(0) + k = 100 \;\Rightarrow\; k = 100.$$
The daily total cost function is
$$C(x) = 0{,}000002x^3 - 0{,}003x^2 + 4x + 100.$$
We must determine the daily total cost if 500 pencil sharpeners are produced, and this is
$$C(500) = 0{,}000002(500)^3 - 0{,}003(500)^2 + 4(500) + 100 = 1\,600 \text{ (rand)}.$$
An alternative method of calculating the daily total cost may be used and is as follows: We are required to calculate $C(500)$. Let us calculate $C(500) - C(0)$, which is the change in the total cost function $C(x)$ over the interval [0; 500]. Using the fundamental theorem of calculus, we find
$$\begin{aligned}
C(500) - C(0) &= \int_0^{500} C'(x)\,dx \\
&= \left(0{,}000002x^3 - 0{,}003x^2 + 4x\right)\Big|_0^{500} \\
&= 1\,500 - 0 = 1\,500.
\end{aligned}$$
From this it follows that
$$C(500) = 1\,500 + C(0) = 1\,500 + 100 = 1\,600 \text{ (rand)},$$
which is the same result as before. Therefore, the daily total cost if 500 pencil sharpeners are produced is R1 600.

(b) The additional daily cost incurred if the production of pencil sharpeners is increased from 200 to 400 units is given by
$$\begin{aligned}
C(400) - C(200) &= \int_{200}^{400} C'(x)\,dx \\
&= \left(0{,}000002x^3 - 0{,}003x^2 + 4x\right)\Big|_{200}^{400} \\
&= 1\,248 - 696 = 552 \text{ (rand)}.
\end{aligned}$$
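Both parts of Example 15.19 amount to evaluating one antiderivative at two points and subtracting. A minimal Python sketch of this use of the fundamental theorem of calculus follows; the names `cost_antiderivative`, `definite_integral` and `FIXED_COST` are assumptions introduced for this illustration only.

```python
FIXED_COST = 100  # daily fixed cost in rand, so C(0) = 100


def cost_antiderivative(x):
    """An antiderivative of C'(x) = 0.000006 x^2 - 0.006 x + 4 (constant omitted)."""
    return 0.000002 * x**3 - 0.003 * x**2 + 4 * x


def definite_integral(antiderivative, a, b):
    """Fundamental theorem of calculus: F(b) - F(a)."""
    return antiderivative(b) - antiderivative(a)


# (a) Total daily cost of 500 units: C(500) = C(0) + integral of C' over [0; 500]
total_cost_500 = FIXED_COST + definite_integral(cost_antiderivative, 0, 500)

# (b) Additional cost of raising production from 200 to 400 units
additional_cost = definite_integral(cost_antiderivative, 200, 400)

print(round(total_cost_500, 2), round(additional_cost, 2))
```

Up to floating-point rounding, the two printed values should agree with the R1 600 and R552 found in the example.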

15.8 The method of substitution for definite integrals
Suppose we have to evaluate the following definite integral:
$$\int_0^4 x\sqrt{9 + x^2}\,dx.$$
We can do this by using one of two methods.

Method 1
We first find the corresponding indefinite integral
$$I = \int x\sqrt{9 + x^2}\,dx.$$
Let $u = 9 + x^2$. Then $\frac{du}{dx} = 2x$ or $du = 2x\,dx$ or $x\,dx = \frac{1}{2}\,du$. Now
$$\begin{aligned}
I = \int x\sqrt{9 + x^2}\,dx &= \int \left(9 + x^2\right)^{1/2}(x\,dx) \\
&= \int u^{1/2}\cdot\frac{1}{2}\,du \\
&= \frac{1}{2}\int u^{1/2}\,du \\
&= \frac{1}{2}\cdot\frac{2}{3}u^{3/2} + C \\
&= \frac{1}{3}\left(9 + x^2\right)^{3/2} + C.
\end{aligned}$$
Using this result, we now evaluate the given definite integral. Then
$$\begin{aligned}
\int_0^4 x\sqrt{9 + x^2}\,dx &= \frac{1}{3}\left(9 + x^2\right)^{3/2}\Big|_0^4 \\
&= \frac{1}{3}\left[(9 + 16)^{3/2} - 9^{3/2}\right] \\
&= \frac{1}{3}(125 - 27) \\
&= \frac{98}{3} = 32\tfrac{2}{3}.
\end{aligned}$$

Method 2
As before, we make the substitution $u = 9 + x^2$, so that $\frac{du}{dx} = 2x$ or $du = 2x\,dx$ or $x\,dx = \frac{1}{2}\,du$. Now observe that the given definite integral is evaluated with respect to x, with the range of integration given by the interval [0; 4]. If we perform the integration with respect to u via the substitution $u = 9 + x^2$, then we must adjust the range of integration to reflect the fact that the integration is being performed with respect to the new variable u. We now determine the proper range of integration as follows: When x = 0, $u = 9 + 0^2 = 9$, the required lower limit of integration with respect to u, and when x = 4, $u = 9 + 4^2 = 25$, the required upper limit of integration with respect to u. The range of integration when the integration is performed with respect to u is given by the interval [9; 25]. Therefore, we have
$$\begin{aligned}
\int_0^4 x\sqrt{9 + x^2}\,dx &= \int_0^4 \left(9 + x^2\right)^{1/2}(x\,dx) \\
&= \int_9^{25} u^{1/2}\left(\frac{1}{2}\,du\right) \\
&= \frac{1}{2}\int_9^{25} u^{1/2}\,du \\
&= \frac{1}{3}u^{3/2}\Big|_9^{25} \\
&= \frac{1}{3}\left(25^{3/2} - 9^{3/2}\right) \\
&= \frac{1}{3}(125 - 27) \\
&= \frac{98}{3} = 32\tfrac{2}{3},
\end{aligned}$$
which agrees with the result obtained using Method 1.
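The change of limits in Method 2 can also be checked numerically: the integral in x over [0; 4] and the integral in u over [9; 25] should give the same number. The Python sketch below approximates both with a simple midpoint sum. The helper `midpoint_sum` and the choice of 2 000 rectangles are assumptions made for this illustration, and the results are approximations of 98/3 rather than exact values.

```python
import math


def midpoint_sum(func, a, b, n=2000):
    """Approximate the integral of func over [a, b] using n midpoint rectangles."""
    width = (b - a) / n
    return width * sum(func(a + (i + 0.5) * width) for i in range(n))


# Original integral: x * sqrt(9 + x^2) over 0 <= x <= 4
integral_in_x = midpoint_sum(lambda x: x * math.sqrt(9 + x**2), 0.0, 4.0)

# After substituting u = 9 + x^2: (1/2) * sqrt(u) over 9 <= u <= 25
integral_in_u = midpoint_sum(lambda u: 0.5 * math.sqrt(u), 9.0, 25.0)

exact_value = 98 / 3
print(round(integral_in_x, 4), round(integral_in_u, 4), round(exact_value, 4))
```

If the limits 0 and 4 were kept for the integral in u, the second value would no longer match the first, which is why the range of integration must be adjusted.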

Example 15.20
Evaluate
$$\int_0^2 xe^{2x^2}\,dx.$$

Solution
Let $u = 2x^2$. Then $\frac{du}{dx} = 4x$ or $du = 4x\,dx$ or $x\,dx = \frac{1}{4}\,du$.

If x = 0, then u = 0 and if x = 2, then u = 8. Therefore,
$$\begin{aligned}
\int_0^2 xe^{2x^2}\,dx &= \int_0^2 e^{2x^2}(x\,dx) \\
&= \int_0^8 e^u\left(\frac{1}{4}\,du\right) \\
&= \frac{1}{4}\int_0^8 e^u\,du \\
&= \frac{1}{4}e^u\Big|_0^8 \\
&= \frac{1}{4}\left(e^8 - e^0\right) \\
&\approx 745.
\end{aligned}$$

15.9 Consumers’ and producers’ surplus
15.9.1 Consumers’ surplus
Suppose $p = D(x)$ is the demand function that relates the unit price p of a commodity to the quantity x demanded of it. And suppose that a fixed unit market price $\bar{p}$ has been established for the commodity and that, corresponding to this unit price, the quantity demanded is $\bar{x}$ units. Then those customers who would be willing to pay a unit price higher than $\bar{p}$ for the commodity would in effect experience a saving. This difference between what the consumers would be willing to pay for $\bar{x}$ units of the commodity and what they actually pay is called the consumers’ surplus. The shaded area on the graph in Figure 15.12 represents the consumers’ surplus.

Figure 15.12: Consumers’ surplus

The consumers’ surplus is calculated by subtracting the area of the rectangle, $\bar{p} \times \bar{x}$, from the total area under the graph of $D(x)$ on the interval $[0; \bar{x}]$.

Definition 15.3
The consumers’ surplus is given by
$$CS = \int_0^{\bar{x}} D(x)\,dx - \bar{p}\cdot\bar{x},$$
where $D(x)$ is the demand function, $\bar{p}$ is the fixed unit market price, $\bar{x}$ is the quantity sold and CS is in rand.

15.9.2 Producers’ surplus

Suppose $p = S(x)$ is the supply function that relates the unit price $p$ of a commodity to the quantity $x$ that the supplier will make available in the market at that price. Again, suppose that a fixed unit market price $\bar{p}$ has been established for the commodity and that, corresponding to this unit price, a quantity of $\bar{x}$ units will be made available in the market by the supplier. The suppliers who would be willing to make the commodity available at a lower price stand to gain from the fact that the market price is set as such. The difference between what the suppliers actually receive and what they would be willing to receive is called the producers’ surplus. The shaded area on the graph in Figure 15.13 represents the producers’ surplus.

[Figure 15.13: Producers’ surplus (graph of $S(x)$ against $x$)]

The producers’ surplus is calculated by subtracting the area under the graph of $S(x)$ over $[0; \bar{x}]$ from the area of the rectangle, $\bar{p} \times \bar{x}$.

Definition 15.4 The producers’ surplus is given by
\[ PS = \bar{p} \cdot \bar{x} - \int_0^{\bar{x}} S(x)\,dx, \]
where $S(x)$ is the supply function, $\bar{p}$ is the fixed unit market price, $\bar{x}$ is the quantity supplied and $PS$ is in rand.


Example 15.21

The demand function for a certain make of compact disk is given by
\[ p = D(x) = -0{,}001x^2 + 250, \]
where $p$ is the unit price in rand and $x$ is the quantity demanded in units of a thousand. The supply function for these disks is given by
\[ p = S(x) = 0{,}0006x^2 + 0{,}02x + 100, \]
where $p$ is the unit price in rand and $x$ is the number of disks that the supplier will put on the market in units of a thousand. Determine the consumers’ surplus and producers’ surplus if the market price of a disk is set at the equilibrium price.

Solution

The equilibrium price is the unit price of the commodity when market equilibrium occurs. We determine the equilibrium price by determining the point of intersection of the demand function and the supply function. We find this point by solving the demand and supply functions simultaneously, that is
\[ p = -0{,}001x^2 + 250 \quad \text{and} \quad p = 0{,}0006x^2 + 0{,}02x + 100. \]
Therefore, we have
\begin{align*}
0{,}0006x^2 + 0{,}02x + 100 &= -0{,}001x^2 + 250 \\
0{,}0016x^2 + 0{,}02x - 150 &= 0 \\
16x^2 + 200x - 1\,500\,000 &= 0 \\
2x^2 + 25x - 187\,500 &= 0 \\
(2x + 625)(x - 300) &= 0,
\end{align*}
so that $x = -\frac{625}{2}$ or $x = 300$.

A negative quantity is impossible; therefore, we are left with the solution $x = 300$. The corresponding $p$-value is
\[ p = -0{,}001(300)^2 + 250 = 160. \]
The equilibrium point is $(300; 160)$; that is, the equilibrium quantity is 300 000 and the equilibrium price is R160.

Setting the market price at R160 per unit, we find that the consumers’ surplus is given by
\begin{align*}
CS &= \int_0^{\bar{x}} D(x)\,dx - \bar{p} \cdot \bar{x} \\
&= \int_0^{300} (-0{,}001x^2 + 250)\,dx - (160)(300) \\
&= \left[ -\frac{1}{3\,000}x^3 + 250x \right]_0^{300} - 48\,000 \\
&= 66\,000 - 48\,000 \\
&= 18\,000.
\end{align*}
The consumers’ surplus is therefore R18 000 000. (Recall that $x$ is measured in units of a thousand.)

The producers’ surplus is given by
\begin{align*}
PS &= \bar{p} \cdot \bar{x} - \int_0^{\bar{x}} S(x)\,dx \\
&= (160)(300) - \int_0^{300} (0{,}0006x^2 + 0{,}02x + 100)\,dx \\
&= 48\,000 - \left[ 0{,}0002x^3 + 0{,}01x^2 + 100x \right]_0^{300} \\
&= 48\,000 - 36\,300 \\
&= 11\,700,
\end{align*}
or R11 700 000.

A graphical representation of this example is given in Figure 15.14.

[Figure 15.14: CS and PS for compact disks (the regions CS and PS lie above and below the line $p = \bar{p} = 160$, up to $x = 300$)]
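The arithmetic of Example 15.21 can be checked with a few lines of Python. This is only a sketch for verification, not part of the prescribed method: the function and variable names are our own, and the antiderivatives are typed in by hand from the example.

```python
import math

# Demand and supply functions from Example 15.21 (x in thousands, p in rand).
def demand(x):
    return -0.001 * x**2 + 250

def supply(x):
    return 0.0006 * x**2 + 0.02 * x + 100

# Antiderivatives of the two functions, entered by hand (hypothetical helper names).
def demand_antiderivative(x):
    return -0.001 * x**3 / 3 + 250 * x

def supply_antiderivative(x):
    return 0.0002 * x**3 + 0.01 * x**2 + 100 * x

# Equilibrium: supply(x) - demand(x) = 0.0016 x^2 + 0.02 x - 150 = 0.
a, b, c = 0.0016, 0.02, -150.0
x_bar = (-b + math.sqrt(b**2 - 4 * a * c)) / (2 * a)  # positive root only
p_bar = demand(x_bar)

# Definitions 15.3 and 15.4.
cs = demand_antiderivative(x_bar) - demand_antiderivative(0) - p_bar * x_bar
ps = p_bar * x_bar - (supply_antiderivative(x_bar) - supply_antiderivative(0))

print("equilibrium quantity:", round(x_bar, 4))
print("equilibrium price:", round(p_bar, 4))
print("consumers' surplus:", round(cs, 2))
print("producers' surplus:", round(ps, 2))
```

Because $x$ is measured in thousands, the printed surpluses must still be multiplied by 1 000 to obtain the rand amounts.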

15.10 Numerical integration

The notes in this section have been developed using the textbook Numerical Mathematics and Computing by Ward Cheney and David Kincaid.

Elementary calculus focuses on methods to determine the definite integral
\[ \int_a^b f(x)\,dx = F(b) - F(a) \]
using the anti-derivative of $f$ and integration rules to identify this anti-derivative $F$. Recall that $F'(x) = f(x)$. The definite integral of the function $f$ can be interpreted as the area under the curve $f$ on the interval $[a; b]$. However, it is not always possible to find the anti-derivative in closed form. For example,
\[ \int_0^1 e^{x^2}\,dx \]
cannot be solved using the anti-derivative of $f$, since there is no elementary function $F(x)$ such that $F'(x) = e^{x^2}$. Hence a numerical technique is necessary to approximate the definite integral.

In Numerical Analysis a number of methods are available to approximate a definite integral. In this module we explain only one elementary method to give you an idea of how a numerical integration algorithm can be implemented. For more advanced techniques you should consult Numerical Analysis textbooks.

15.10.1 Trapezoidal rule

The trapezoidal rule for estimating the value of
\[ \int_a^b f(x)\,dx \]
is based on approximating the area between the graph of $f$ and the $x$-axis with the sum of $n$ trapezoids of equal width, as shown in Figure 15.15.

[Figure 15.15: Graphical representation of the trapezoidal rule (graph of $y = f(x)$ with points $P_0, P_1, \ldots, P_n$ above $x_0 = a, x_1, \ldots, x_n = b$; the first trapezoid has area $\frac{1}{2}(y_0 + y_1)h$)]

The trapezoids have the common base length
\[ h = \frac{b - a}{n}, \]
and the side of each trapezoid runs from the $x$-axis to the curve. The area of a typical trapezoid with base $h$ and vertical sides $y_i$ and $y_{i+1}$ is given by
\[ \text{area} = h \times \frac{y_i + y_{i+1}}{2}. \]


Rule: Trapezoidal Rule

Divide the interval $[a; b]$ into $n$ subintervals $[x_i; x_{i+1}]$ of width $h = (b - a)/n$, with $x_0 = a$ and $x_n = b$. Hence $x_i = a + ih$, for $i = 0; 1; \ldots; n$. On each subinterval $[x_i; x_{i+1}]$, $i = 0; 1; \ldots; n - 1$, the area under the curve $f(x)$ can be approximated by the area of the trapezoid
\[ T(i) = \frac{h}{2}(y_i + y_{i+1}) = \frac{h}{2}[f(x_i) + f(x_{i+1})]. \]
The total area underneath the curve can be approximated by the sum of the areas of the $n$ trapezoids. Hence:
\begin{align*}
\int_a^b f(x)\,dx &\approx T(0) + T(1) + \ldots + T(n-1) \\
&= \frac{h}{2}(y_0 + y_1) + \frac{h}{2}(y_1 + y_2) + \ldots + \frac{h}{2}(y_{n-1} + y_n) \\
&= \frac{h}{2}y_0 + h(y_1 + y_2 + \ldots + y_{n-1}) + \frac{h}{2}y_n \\
&= \frac{h}{2}[f(x_0) + f(x_n)] + h\sum_{i=1}^{n-1} f(x_i) \\
&= \frac{h}{2}[f(a) + f(b)] + h\sum_{i=1}^{n-1} f(x_i).
\end{align*}


Example 15.22

Apply the trapezoidal rule with $n = 4$ to estimate $\int_1^2 x^2\,dx$.

Solution

To apply the trapezoidal rule, note that $a = x_0 = 1$, $b = x_4 = 2$ and $h = \frac{2 - 1}{4} = 0{,}25$. Thus:

i    x_i     f(x_i)
0    1       1
1    1,25    1,5625
2    1,5     2,25
3    1,75    3,0625
4    2       4

Then
\begin{align*}
\int_1^2 x^2\,dx &\approx \frac{0{,}25}{2}[f(1) + f(2)] + 0{,}25\sum_{i=1}^{3} f(x_i) \\
&= 0{,}125[1 + 4] + 0{,}25(1{,}5625 + 2{,}25 + 3{,}0625) \\
&= 2{,}34375.
\end{align*}
Comparing this result to the exact value of the integral according to the fundamental theorem,
\[ \int_1^2 x^2\,dx = \left[ \frac{x^3}{3} \right]_1^2 = \frac{8}{3} - \frac{1}{3} = \frac{7}{3} = 2{,}3333\ldots, \]
we see that an error of $|2{,}33333 - 2{,}34375| \approx 0{,}01042$ is made.

Activity

Apply the trapezoidal rule with $n = 8$. The answer is 2,33594. Notice that the trapezoidal approximation with $n = 8$ is more accurate: its error is about 0,0026, roughly a quarter of the error for $n = 4$.

The larger the value of n (or the smaller the value of h), the more accurate the trapezoidal approximation will be.

15.10.2 Computer algorithm for trapezoidal rule

The trapezoidal rule can be easily implemented for a prescribed value of $n$ using a computer package.

Pseudocode for trapezoidal rule

    Input n, f(x), a, b
    h ← (b − a)/n
    sum ← [f(a) + f(b)]/2
    for i = 1 to n − 1 do
        x ← a + ih
        sum ← sum + f(x)
    end do
    Tn ← h × sum
    Output Tn

In practice it is usual to start the process with $n = 4$, calculate the approximation $T_4$, and then set $n = 8$ and calculate $T_8$. The difference between the two approximations is then compared to the prescribed tolerance $\varepsilon$. If $|T_8 - T_4| < \varepsilon$, then the value $T_8$ is the approximation to the definite integral. Otherwise the number of subintervals $n$ is doubled and the process repeated until the difference between two successive approximations is less than the prescribed tolerance.

In Maxima the rule can be applied using the following instructions:

    block(f(x):=x^2,
          n:4, a:1, b:2,
          h:(b-a)/n,
          sum:(f(a)+f(b))/2.0,
          for i:1 thru (n-1) step 1 do (sum:sum+f(a+i*h)),
          trap:sum*h,
          print("integral", trap) )

Maxima has a built-in function that uses Romberg’s method to approximate a definite integral. Romberg’s method starts with trapezoidal approximations and then forms linear combinations using these values to recalculate improved approximations (see any elementary Numerical Analysis textbook). This function should be loaded with the command load(romberg) before it is called as romberg(expression, x, a, b).

Maxima instructions

    (%i5) load(romberg);
    (%o5) C:/PROGRA~1/MAXIMA~1.0/share/maxima/5.14.0/share/numeric/romberg.lis
    (%i6) romberg(x^2,x,1.0,2.0);
    (%o6) 2.333333333333334
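For readers who prefer a general-purpose language, the trapezoidal rule and the interval-doubling procedure described above might be sketched in Python as follows. This is a minimal illustration under our own assumptions: the function names, the default starting value n = 4 and the cap on the number of doublings are hypothetical choices, not part of the study guide or of Maxima.

```python
def trapezoid(f, a, b, n):
    """Trapezoidal estimate of the integral of f over [a; b] with n subintervals."""
    h = (b - a) / n
    total = (f(a) + f(b)) / 2.0          # end points carry weight 1/2
    for i in range(1, n):                # interior points x_1, ..., x_{n-1}
        total += f(a + i * h)
    return total * h

def trapezoid_to_tolerance(f, a, b, eps, n=4, max_doublings=20):
    """Double n until two successive estimates differ by less than eps."""
    previous = trapezoid(f, a, b, n)
    for _ in range(max_doublings):
        n *= 2
        current = trapezoid(f, a, b, n)
        if abs(current - previous) < eps:
            return current, n
        previous = current
    raise RuntimeError("tolerance not reached within max_doublings")

def square(x):
    return x**2

t4 = trapezoid(square, 1.0, 2.0, 4)      # Example 15.22
t8 = trapezoid(square, 1.0, 2.0, 8)      # Activity after Example 15.22
estimate, n_used = trapezoid_to_tolerance(square, 1.0, 2.0, 1e-4)
print(t4, t8, estimate, n_used)
```

Note that the stopping test compares two successive estimates; it does not measure the true error directly, so the tolerance should be read as a practical guide rather than a guarantee.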


15.11 Exercises

1. Evaluate the following indefinite integrals:

(a) $\displaystyle\int \left( \frac{1}{\sqrt{x}} - \frac{2}{x} + 3e^x \right) dx$.

(b) $\displaystyle\int \frac{x^2}{(2x^3 + 1)^{3/2}}\,dx$.

2. The current circulation of Investor’s Digest is 2 000 per week. Circulation is expected to grow at the rate of $5 + 2t^{2/3}$ copies per week, $t$ weeks from now. Determine what the Digest’s circulation will be 125 weeks from now.

3. Evaluate the following definite integrals:

(a) $\displaystyle\int_0^2 (x + e^x)\,dx$.

(b) $\displaystyle\int_1^2 \left( \frac{1}{x} - \frac{1}{x^2} \right) dx$.

4. An efficiency study conducted for the Electra Electronics Company showed that the rate at which walkie-talkies are assembled by the average worker $t$ hours after starting work at 8h00 is given by the function
\[ f(t) = -3t^2 + 12t + 15 \quad \text{for} \quad 0 \le t \le 4. \]
Determine how many walkie-talkies can be assembled by the average worker in

(a) the first hour of the morning shift.

(b) the second hour of the morning shift.

5. The demand function for a certain exercise bicycle that is sold exclusively through a television advertising campaign is
\[ p = d(x) = \sqrt{9 - 0{,}02x}, \]
where $p$ is the unit price in hundreds of rand and $x$ is the quantity demanded per week. The corresponding supply function is given by
\[ p = s(x) = \sqrt{1 + 0{,}02x}, \]
where $x$ is the number of exercise bicycles the supplier will make available at the fixed price $p$. Determine the consumers’ surplus if the unit price is set at the equilibrium price.

6. Estimate the value of the following definite integrals accurately to four digits by applying the trapezoidal rule with $n = 4$ and $n = 8$. Use the Romberg function in Maxima to check the accuracy of the estimates.

(a) $\displaystyle\int_0^1 e^{x^2}\,dx$

(b) $\displaystyle\int_0^1 \frac{1}{1 + x^2}\,dx$

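15.12 Solutions to exercises

1. (a) The indefinite integral is
\begin{align*}
\int \left( \frac{1}{\sqrt{x}} - \frac{2}{x} + 3e^x \right) dx &= \int \left( x^{-1/2} - \frac{2}{x} + 3e^x \right) dx \\
&= \int x^{-1/2}\,dx - 2\int \frac{1}{x}\,dx + 3\int e^x\,dx \\
&= 2x^{1/2} - 2\ln|x| + 3e^x + C \\
&= 2\sqrt{x} - 2\ln|x| + 3e^x + C.
\end{align*}

(b) Let $u = 2x^3 + 1$, then
\[ \frac{du}{dx} = 6x^2 \;\Rightarrow\; du = 6x^2\,dx \;\Rightarrow\; \frac{1}{6}\,du = x^2\,dx. \]
Now
\begin{align*}
\int \frac{x^2}{(2x^3 + 1)^{3/2}}\,dx &= \int \frac{1}{(2x^3 + 1)^{3/2}}\,(x^2\,dx) \\
&= \int \frac{1}{u^{3/2}} \cdot \frac{1}{6}\,du \\
&= \frac{1}{6}\int u^{-3/2}\,du \\
&= \frac{1}{6}\left( -2u^{-1/2} \right) + C \\
&= -\frac{1}{3\sqrt{u}} + C \\
&= -\frac{1}{3\sqrt{2x^3 + 1}} + C.
\end{align*}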

2. Let $S(t)$ denote the Digest’s circulation $t$ weeks from now. Then $S'(t)$, the rate of change in the circulation per week, is given by
\[ S'(t) = 5 + 2t^{2/3}. \]
Therefore,
\begin{align*}
S(t) &= \int \left( 5 + 2t^{2/3} \right) dt \\
&= 5t + 2 \times \frac{3}{5}t^{5/3} + C \\
&= 5t + \frac{6}{5}t^{5/3} + C.
\end{align*}
To determine the value of $C$, observe that the current circulation ($t = 0$) is 2 000. This means that $S(0) = 2\,000$ and
\[ 5(0) + \frac{6}{5}(0) + C = 2\,000 \;\Rightarrow\; C = 2\,000. \]
Then
\[ S(t) = 5t + \frac{6}{5}t^{5/3} + 2\,000. \]
Therefore, 125 weeks from now the circulation will be
\[ S(125) = 5(125) + \frac{6}{5}(125)^{5/3} + 2\,000 = 6\,375 \text{ copies per week.} \]

3. (a) The definite integral is
\begin{align*}
\int_0^2 (x + e^x)\,dx &= \left[ \frac{1}{2}x^2 + e^x \right]_0^2 \\
&= \left( \frac{1}{2}(2)^2 + e^2 \right) - \left( \frac{1}{2}(0)^2 + e^0 \right) \\
&= 8{,}389.
\end{align*}

(b) The definite integral is
\begin{align*}
\int_1^2 \left( \frac{1}{x} - \frac{1}{x^2} \right) dx &= \int_1^2 \left( \frac{1}{x} - x^{-2} \right) dx \\
&= \left[ \ln|x| - \frac{x^{-1}}{-1} \right]_1^2 \\
&= \left[ \ln|x| + \frac{1}{x} \right]_1^2 \\
&= \left( \ln 2 + \frac{1}{2} \right) - (\ln 1 + 1) \\
&= \ln 2 - \frac{1}{2} \qquad (\ln 1 = 0) \\
&= 0{,}193.
\end{align*}

4. Let $N(t)$ denote the number of walkie-talkies assembled by the average worker $t$ hours after starting work on the morning shift. We have
\[ N'(t) = f(t) = -3t^2 + 12t + 15. \]

(a) The number of units assembled by the average worker in the first hour of the morning shift is
\begin{align*}
N(1) - N(0) &= \int_0^1 N'(t)\,dt \\
&= \int_0^1 (-3t^2 + 12t + 15)\,dt \\
&= \left[ -t^3 + 6t^2 + 15t \right]_0^1 \\
&= -1 + 6 + 15 \\
&= 20 \text{ units.}
\end{align*}

(b) The number of units assembled by the average worker in the second hour of the morning shift is
\begin{align*}
N(2) - N(1) &= \int_1^2 N'(t)\,dt \\
&= \left[ -t^3 + 6t^2 + 15t \right]_1^2 \\
&= 46 - 20 \\
&= 26 \text{ units.}
\end{align*}

5. We find the equilibrium price and quantity by solving the following two equations:
\[ p = \sqrt{9 - 0{,}02x}, \qquad p = \sqrt{1 + 0{,}02x}. \]
Then we have
\begin{align*}
\sqrt{9 - 0{,}02x} &= \sqrt{1 + 0{,}02x} \\
9 - 0{,}02x &= 1 + 0{,}02x \\
0{,}04x &= 8 \\
x &= 200.
\end{align*}
From this it follows that
\[ p = \sqrt{9 - 0{,}02x} = \sqrt{9 - 0{,}02(200)} = \sqrt{5} \approx 2{,}24. \]
The equilibrium quantity is therefore $\bar{x} = 200$ and the equilibrium price is $\bar{p} = 2{,}24$ (in hundreds of rand).

The consumers’ surplus is
\[ CS = \int_0^{200} \sqrt{9 - 0{,}02x}\,dx - \bar{p} \cdot \bar{x} = \int_0^{200} (9 - 0{,}02x)^{1/2}\,dx - \bar{p} \cdot \bar{x}. \]
Let us start by integrating $\int (9 - 0{,}02x)^{1/2}\,dx$ by substitution. Let $u = 9 - 0{,}02x$, then $\frac{du}{dx} = -0{,}02$, and
\begin{align*}
\int (9 - 0{,}02x)^{1/2}\,dx &= \int u^{1/2} \cdot \frac{du}{-0{,}02} \\
&= -\frac{1}{0{,}02}\int u^{1/2}\,du \\
&= -\frac{1}{0{,}02} \cdot \frac{2}{3}u^{3/2} + C \\
&= -\frac{1}{0{,}03}u^{3/2} + C \\
&= -\frac{1}{0{,}03}(9 - 0{,}02x)^{3/2} + C.
\end{align*}
The consumers’ surplus is then
\begin{align*}
CS &= \int_0^{200} (9 - 0{,}02x)^{1/2}\,dx - \bar{p} \cdot \bar{x} \\
&= \left[ -\frac{1}{0{,}03}(9 - 0{,}02x)^{3/2} \right]_0^{200} - 2{,}24(200) \\
&= -\frac{1}{0{,}03} \cdot 5^{3/2} + \frac{1}{0{,}03} \cdot 9^{3/2} - 448 \\
&= 527{,}32 - 448 \\
&= 79{,}32,
\end{align*}
and since the price is in hundreds of rand, this means that the consumers’ surplus is R7 932.


6. Determine the function values for $n = 8$ subintervals and apply the trapezoidal rule.

i    x        x^2         f(x) = e^{x^2}    f(x) = 1/(1 + x^2)
0    0,000    0,000000    1,000000          1,000000
1    0,125    0,015625    1,015748          0,984615
2    0,250    0,062500    1,064494          0,941176
3    0,375    0,140625    1,150993          0,876712
4    0,500    0,250000    1,284025          0,800000
5    0,625    0,390625    1,477904          0,719101
6    0,750    0,562500    1,755055          0,640000
7    0,875    0,765625    2,150338          0,566372
8    1,000    1,000000    2,718282          0,500000

(a) The estimate of the value of the definite integral:

i. $\int_0^1 e^{x^2}\,dx \approx 1{,}4907$ when $n = 4$ with the trapezoidal rule

ii. $\int_0^1 e^{x^2}\,dx \approx 1{,}4697$ when $n = 8$ with the trapezoidal rule

iii. $\int_0^1 e^{x^2}\,dx \approx 1{,}4627$ with the Romberg function in Maxima.

(b) The estimate of the value of the definite integral:

i. $\int_0^1 \frac{1}{1 + x^2}\,dx \approx 0{,}7828$ when $n = 4$ with the trapezoidal rule

ii. $\int_0^1 \frac{1}{1 + x^2}\,dx \approx 0{,}7848$ when $n = 8$ with the trapezoidal rule

iii. $\int_0^1 \frac{1}{1 + x^2}\,dx \approx 0{,}7854$ with the Romberg function in Maxima.

The Romberg function values for the two definite integrals were calculated as follows in Maxima:

    (%i44) load(romberg);
    (%o44) C:/PROGRA~1/MAXIMA~1.0/share/maxima/5.13.0/share/numeric/rombe
    (%i45) romberg(exp(x^2),x,0.0,1.0);
    (%o45) 1.462651757343229
    (%i46) romberg(1/(1+x^2),x,0.0,1.0);
    (%o46) 0.7853981595992


Chapter 16

Partial differentiation

Contents
16.1 Introduction
16.2 Partial derivatives
16.3 Second-order partial derivatives
16.4 The Cobb-Douglas production function
16.5 Exercises
16.6 Solutions to exercises


Sections from prescribed book, Winston: Chapter 11, Section 11.1

Learning objectives

After completing this study unit you should be able to
• understand the concept of functions of several variables
• visualise a function of two variables
• understand the concept of partial derivatives of functions
• calculate first and second-order partial derivatives of functions of several variables
• apply partial differentiation to the Cobb-Douglas production function.

16.1 Introduction

In the last few study units we have dealt with functions involving one variable. In many real-life situations, however, we encounter quantities that depend on two or more variables. For example, the Consumer Price Index (CPI) depends on the price of more than 95 000 items from petrol to groceries. In order to study such relationships, we need the notion of a function of several variables. The general form of a function of several variables is $y = f(x_1; x_2; \ldots; x_n)$.

We will start off by studying the special case of functions of two variables as this can be visually represented. We then draw on the experience gained from this to help us understand the concepts and results of the more general case, which, by and large, is just a simple extension of the special case.

A function of two variables is denoted by $z = f(x; y)$. The domain of this function is a set of ordered pairs of real numbers $(x; y)$. The function associates each ordered pair $(x; y)$ with a specific real number $z = f(x; y)$. The variables $x$ and $y$ are called independent variables and the variable $z$, which depends on the values of $x$ and $y$, is called the dependent variable.

In order to graph a function of two variables, we need a three-dimensional coordinate system. This is readily constructed by adding a third axis to the usual flat Cartesian coordinate system in such a way that the three resulting axes are mutually perpendicular and intersect at zero. Observe that, by construction, the zeros of the three number scales coincide at the origin of the three-dimensional Cartesian coordinate system. This is shown in Figure 16.1.

[Figure 16.1: Three-dimensional coordinate system (axes $x$, $y$ and $z$)]

A point in three-dimensional space can now be represented uniquely in this coordinate system by an ordered triple of numbers $(x; y; z)$. Conversely, every ordered triple of real numbers $(x; y; z)$ represents a point in three-dimensional space. This is illustrated in Figure 16.2.

[Figure 16.2: A point in three-dimensional space (the point $(x; y; z)$ plotted on the axes $x$, $y$ and $z$)]

The points $A(2; 3; 4)$, $B(1; -2; -2)$, $C(3; 4; 0)$ and $D(0; 0; 4)$ are shown in Figure 16.3.

[Figure 16.3: Four points in three-dimensional space (the points $A(2; 3; 4)$, $B(1; -2; -2)$, $C(3; 4; 0)$ and $D(0; 0; 4)$)]

The domain of a function of two variables, $z = f(x; y)$, is a subset of the $xy$-plane. Each point $(x; y)$ in the domain has a unique point $(x; y; z)$ associated with it. All these points $(x; y; z)$ together make up the graph of the function, which is, except for certain degenerate functions, a surface in three-dimensional space. This is illustrated in Figure 16.4.

In general, it is quite difficult to draw the graph of a function of two variables. But techniques have been developed that enable us to generate such graphs with minimum effort using a computer. The computer-generated graphs of two functions are shown in Figure 16.5. We will not sketch graphs of functions of two variables in this module.

[Figure 16.4: Graph of $z = f(x; y)$ (a surface with a point $(x; y; z)$ marked above the $xy$-plane)]

16.2 Partial derivatives

For a function of one variable, $y = f(x)$, the rate of change with respect to $x$ means the change in the function for movement along the $x$-axis. As seen in previous study units, this is measured by the slope of the tangent line to the curve $y = f(x)$ through the point $(x; y)$ and is called the derivative of $f(x)$ with respect to $x$.

For a function of two variables, $z = f(x; y)$, the rate of change of $f(x; y)$ can be either with respect to $x$, where the change is along the $x$-axis, or with respect to $y$, where the change is along the $y$-axis.

[Figure 16.5: Two computer generated functions: (a) $f(x; y) = -x^2 - y^2 + 10$; (b) $f(x; y) = \ln(x^2 - 2y^2 + 1)$]

As seen before, the graph of the function $z = f(x; y)$ is a surface in three-dimensional space. Suppose we want to study the rate of change of this function with respect to $x$. Keep $y$ constant, say at $y = b$; then $z = f(x; b)$ is the equation of a curve $C$ on the surface, and this curve is formed by the intersection of the surface and the plane $y = b$. This is illustrated in Figure 16.6.

The rate of change of $f(x; y)$ in the direction of $x$ is measured by the slope of the tangent line $T$ to the curve $C$ at point $P = (x; y; f(x; y))$ and is called the partial derivative of $f(x; y)$ with respect to $x$. It is given by
\[ \lim_{h \to 0} \frac{f(x + h; y) - f(x; y)}{h}, \]
provided the limit exists. The notation used for the partial derivative with respect to $x$ is
\[ \frac{\partial z}{\partial x} \quad \text{or} \quad \frac{\partial f(x; y)}{\partial x} \quad \text{or} \quad f_x(x; y). \]
The latter two are often abbreviated as
\[ \frac{\partial f}{\partial x} \quad \text{and} \quad f_x. \]

[Figure 16.6: Rate of change of $z = f(x; y)$ in the direction of $x$ (curve $C$ in the plane $y = b$, with tangent line $T$ at $P = (x; y; f(x; y))$)]

If the partial derivative is to be evaluated at a certain point $(x; y) = (a; b)$, it is denoted by
\[ \left. \frac{\partial z}{\partial x} \right|_{(a;b)} = f_x(a; b). \]

It follows that the partial derivative of $f(x; y)$ with respect to $x$ is obtained by keeping the variable $y$ fixed and differentiating the resulting function with respect to $x$.

A similar argument holds for the rate of change in the direction of $y$ and is illustrated in Figure 16.7.

[Figure 16.7: Rate of change of $z = f(x; y)$ in the direction of $y$ (curve $C$ in the plane $x = a$, with tangent line $T$ at $P = (x; y; f(x; y))$)]

The partial derivative of $f(x; y)$ with respect to $y$ is given by
\[ \lim_{k \to 0} \frac{f(x; y + k) - f(x; y)}{k}, \]
provided the limit exists. The notation used for the partial derivative with respect to $y$ is
\[ \frac{\partial z}{\partial y} \quad \text{or} \quad \frac{\partial f(x; y)}{\partial y} \quad \text{or} \quad f_y(x; y). \]
The latter two are often abbreviated as
\[ \frac{\partial f}{\partial y} \quad \text{and} \quad f_y. \]
If the partial derivative is to be evaluated at a certain point $(x; y) = (a; b)$, it is denoted by
\[ \left. \frac{\partial z}{\partial y} \right|_{(a;b)} = f_y(a; b). \]

To calculate the partial derivative of a function of several variables with respect to one variable, say $x$, we think of the other variables as if they were constants and differentiate the resulting function with respect to $x$, using the rules for differentiation as discussed in Study unit 9.

Example 16.1

Consider the function $f(x; y) = x^2 - xy^2 + y^3$.

(a) Calculate the partial derivatives of $f(x; y)$.

(b) What is the rate of change of the function $f(x; y)$ in the $x$-direction at point $(1; 2)$?

(c) What is the rate of change of the function $f(x; y)$ in the $y$-direction at point $(1; 2)$?

Solution

(a) To calculate $\frac{\partial f}{\partial x}$, think of the variable $y$ as a constant and differentiate the resulting function of $x$ with respect to $x$. Then
\[ \frac{\partial f}{\partial x} = 2x - y^2 + 0 = 2x - y^2. \]
To calculate $\frac{\partial f}{\partial y}$, think of the variable $x$ as a constant and differentiate the resulting function of $y$ with respect to $y$. Then
\[ \frac{\partial f}{\partial y} = 0 - x \times 2y + 3y^2 = -2xy + 3y^2. \]

(b) The rate of change of the function $f(x; y)$ in the $x$-direction at point $(1; 2)$ is given by
\[ f_x(1; 2) = \left. \frac{\partial f}{\partial x} \right|_{(1;2)} = 2(1) - 2^2 = -2, \]
that is, $f(x; y)$ decreases by two units for each unit increase in the $x$-direction, $y$ being kept constant at $y = 2$.

(c) The rate of change of the function $f(x; y)$ in the $y$-direction at point $(1; 2)$ is given by
\[ f_y(1; 2) = \left. \frac{\partial f}{\partial y} \right|_{(1;2)} = -2(1)(2) + 3(2)^2 = 8, \]
that is, $f(x; y)$ increases by 8 units for each unit increase in the $y$-direction, $x$ being kept constant at $x = 1$.
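The limit definitions of the partial derivatives can also be explored numerically. The sketch below approximates $f_x(1; 2)$ and $f_y(1; 2)$ for the function of Example 16.1 with a small step. It uses a central difference quotient, which is our own choice for better accuracy rather than the one-sided quotient in the definition, and the helper names and the step size are assumptions made for illustration.

```python
def f(x, y):
    # Function from Example 16.1.
    return x**2 - x * y**2 + y**3

def partial_x(func, x, y, h=1e-6):
    # Central difference in x with y held fixed (assumed step h).
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, k=1e-6):
    # Central difference in y with x held fixed (assumed step k).
    return (func(x, y + k) - func(x, y - k)) / (2 * k)

fx_at_point = partial_x(f, 1.0, 2.0)   # analytic value: 2(1) - 2^2 = -2
fy_at_point = partial_y(f, 1.0, 2.0)   # analytic value: -2(1)(2) + 3(2)^2 = 8
print(fx_at_point, fy_at_point)
```

The numerical values agree with the analytic results only up to a small rounding error, so they serve as a check rather than a replacement for the differentiation rules.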

Example 16.2

Calculate the partial derivatives of each of the following functions:

(a) $f(x; y) = \dfrac{xy}{x^2 + y^2}$

(b) $g(s; t) = (s^2 - st + t^3)^5$

(c) $h(u; v) = e^{u^2 - 2v^2}$

(d) $f(x; y) = \ln(x^2 + 2y^2)$

(e) $f(x; y; z) = xyz - xe^{yz} + x\ln y$

Solution

(a) Now
\[ f(x; y) = \frac{xy}{x^2 + y^2} = xy(x^2 + y^2)^{-1}. \]
To calculate $\frac{\partial f}{\partial x}$, think of the variable $y$ as if it were a constant. Then by the product rule
\begin{align*}
\frac{\partial f}{\partial x} &= xy[-(x^2 + y^2)^{-2} \times 2x] + (x^2 + y^2)^{-1}(y) \\
&= \frac{-2x^2y}{(x^2 + y^2)^2} + \frac{y}{x^2 + y^2} \\
&= \frac{-2x^2y + yx^2 + y^3}{(x^2 + y^2)^2} \\
&= \frac{y(y^2 - x^2)}{(x^2 + y^2)^2}.
\end{align*}
To calculate $\frac{\partial f}{\partial y}$, think of the variable $x$ as if it were a constant. Then
\begin{align*}
\frac{\partial f}{\partial y} &= xy[-(x^2 + y^2)^{-2} \times 2y] + (x^2 + y^2)^{-1}(x) \\
&= \frac{-2xy^2}{(x^2 + y^2)^2} + \frac{x}{x^2 + y^2} \\
&= \frac{-2xy^2 + x^3 + xy^2}{(x^2 + y^2)^2} \\
&= \frac{x(x^2 - y^2)}{(x^2 + y^2)^2}.
\end{align*}

(b) To calculate $\frac{\partial g}{\partial s}$, we treat the variable $t$ as if it were a constant. Then by the power rule
\[ \frac{\partial g}{\partial s} = 5(s^2 - st + t^3)^4 \cdot (2s - t) = 5(2s - t)(s^2 - st + t^3)^4. \]
To calculate $\frac{\partial g}{\partial t}$, we treat the variable $s$ as if it were a constant. Then
\[ \frac{\partial g}{\partial t} = 5(s^2 - st + t^3)^4 \cdot (-s + 3t^2) = 5(3t^2 - s)(s^2 - st + t^3)^4. \]

(c) To calculate $\frac{\partial h}{\partial u}$, we treat the variable $v$ as if it were a constant. Then by the chain rule for exponential functions
\[ \frac{\partial h}{\partial u} = e^{u^2 - 2v^2} \times 2u = 2ue^{u^2 - 2v^2}. \]
To calculate $\frac{\partial h}{\partial v}$, we treat the variable $u$ as if it were a constant. Then
\[ \frac{\partial h}{\partial v} = e^{u^2 - 2v^2} \cdot [-2(2v)] = -4ve^{u^2 - 2v^2}. \]

(d) To calculate $\frac{\partial f}{\partial x}$, we treat the variable $y$ as if it were a constant. Then by the chain rule for logarithmic functions
\[ \frac{\partial f}{\partial x} = \frac{2x}{x^2 + 2y^2}. \]
To calculate $\frac{\partial f}{\partial y}$, we treat the variable $x$ as if it were a constant. Then
\[ \frac{\partial f}{\partial y} = \frac{4y}{x^2 + 2y^2}. \]

(e) Here we have a function of three variables $x$, $y$ and $z$, and we are required to find all three partial derivatives $f_x$, $f_y$ and $f_z$.

To calculate $f_x$, we think of the other two variables $y$ and $z$ as if they were constants. Then
\[ f_x = yz - e^{yz} + \ln y. \]
To calculate $f_y$, we think of the other two variables $x$ and $z$ as if they were constants. Then
\[ f_y = xz - xze^{yz} + \frac{x}{y}. \]
To calculate $f_z$, we think of the other two variables $x$ and $y$ as if they were constants. Then
\[ f_z = xy - xye^{yz}. \]

16.3 Second-order partial derivatives

The partial derivatives that we have considered up till now are also known as first-order partial derivatives.

If we consider a function of two variables $x$ and $y$, $z = f(x; y)$, then the first-order partial derivatives, $f_x(x; y)$ and $f_y(x; y)$, are also functions of the two variables $x$ and $y$. We may differentiate each of these functions, $f_x$ and $f_y$, to obtain the second-order partial derivatives of $f(x; y)$.

Therefore, differentiating the function $f_x(x; y)$ with respect to $x$ leads to the second-order partial derivative
\[ f_{xx}(x; y) = \frac{\partial}{\partial x}[f_x(x; y)]. \]
This is often abbreviated to
\[ f_{xx} \quad \text{or} \quad \frac{\partial}{\partial x}(f_x) \quad \text{or} \quad \frac{\partial^2 f}{\partial x^2}. \]
Differentiating the function $f_x(x; y)$ with respect to $y$ leads to the second-order partial derivative
\[ f_{xy}(x; y) = \frac{\partial}{\partial y}[f_x(x; y)], \]
which is abbreviated to
\[ f_{xy} \quad \text{or} \quad \frac{\partial}{\partial y}(f_x) \quad \text{or} \quad \frac{\partial^2 f}{\partial y \partial x}. \]
Also, differentiation of $f_y(x; y)$ with respect to $y$ leads to
\[ f_{yy}(x; y) = \frac{\partial}{\partial y}[f_y(x; y)], \]
abbreviated to
\[ f_{yy} \quad \text{or} \quad \frac{\partial}{\partial y}(f_y) \quad \text{or} \quad \frac{\partial^2 f}{\partial y^2}. \]
Similarly, differentiation of $f_y(x; y)$ with respect to $x$ leads to
\[ f_{yx}(x; y) = \frac{\partial}{\partial x}[f_y(x; y)], \]
abbreviated to
\[ f_{yx} \quad \text{or} \quad \frac{\partial}{\partial x}(f_y) \quad \text{or} \quad \frac{\partial^2 f}{\partial x \partial y}. \]

NOTE: In most practical applications, $f_{xy}$ and $f_{yx}$ are equal. (This is guaranteed whenever both mixed partial derivatives are continuous.)

It follows that since the second-order partial derivatives are also functions of $x$ and $y$, these functions can again be differentiated to obtain third-order partial derivatives. The same argument can be extended for higher-order derivatives. We will not consider higher-order derivatives here.

Example 16.3

Find the second-order partial derivatives of the function $f(x; y) = x^3 - 3x^2y + 5xy^2 + y^2$.

Solution

The first-order partial derivatives are
\begin{align*}
f_x &= \frac{\partial}{\partial x}(x^3 - 3x^2y + 5xy^2 + y^2) = 3x^2 - 6xy + 5y^2 \quad (y \text{ constant}), \\
f_y &= \frac{\partial}{\partial y}(x^3 - 3x^2y + 5xy^2 + y^2) = -3x^2 + 10xy + 2y \quad (x \text{ constant}).
\end{align*}
The second-order partial derivatives are
\begin{align*}
f_{xx} &= \frac{\partial}{\partial x}(f_x) = \frac{\partial}{\partial x}(3x^2 - 6xy + 5y^2) = 6x - 6y, \\
f_{xy} &= \frac{\partial}{\partial y}(f_x) = \frac{\partial}{\partial y}(3x^2 - 6xy + 5y^2) = -6x + 10y, \\
f_{yx} &= \frac{\partial}{\partial x}(f_y) = \frac{\partial}{\partial x}(-3x^2 + 10xy + 2y) = -6x + 10y, \\
f_{yy} &= \frac{\partial}{\partial y}(f_y) = \frac{\partial}{\partial y}(-3x^2 + 10xy + 2y) = 10x + 2.
\end{align*}

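Example 16.4

Find the second-order partial derivatives of the function $f(x; y) = e^{xy^2}$.

Solution

The first-order partial derivatives are
\begin{align*}
f_x &= \frac{\partial}{\partial x}(e^{xy^2}) = e^{xy^2} \cdot \frac{\partial}{\partial x}(xy^2) = e^{xy^2} \times y^2 = y^2e^{xy^2}, \\
f_y &= \frac{\partial}{\partial y}(e^{xy^2}) = e^{xy^2} \cdot \frac{\partial}{\partial y}(xy^2) = 2xye^{xy^2}.
\end{align*}
The second-order partial derivatives are
\begin{align*}
f_{xx} &= \frac{\partial}{\partial x}(f_x) = \frac{\partial}{\partial x}(y^2e^{xy^2}) \\
&= y^2e^{xy^2} \cdot \frac{\partial}{\partial x}(xy^2) \\
&= y^2e^{xy^2}(y^2) \\
&= y^4e^{xy^2}.
\end{align*}
\begin{align*}
f_{xy} &= \frac{\partial}{\partial y}(f_x) = \frac{\partial}{\partial y}(y^2e^{xy^2}) \\
&= y^2 \cdot \frac{\partial}{\partial y}(e^{xy^2}) + e^{xy^2} \cdot \frac{\partial}{\partial y}(y^2) \\
&= y^2 \cdot e^{xy^2} \cdot \frac{\partial}{\partial y}(xy^2) + e^{xy^2}(2y) \\
&= y^2e^{xy^2}(2xy) + 2ye^{xy^2} \\
&= 2ye^{xy^2}(xy^2 + 1).
\end{align*}
\begin{align*}
f_{yy} &= \frac{\partial}{\partial y}(f_y) = \frac{\partial}{\partial y}(2xye^{xy^2}) \\
&= 2xy \cdot \frac{\partial}{\partial y}(e^{xy^2}) + e^{xy^2} \cdot \frac{\partial}{\partial y}(2xy) \\
&= 2xye^{xy^2}(2xy) + e^{xy^2}(2x) \\
&= 2xe^{xy^2}(2xy^2 + 1).
\end{align*}
\begin{align*}
f_{yx} &= \frac{\partial}{\partial x}(f_y) = \frac{\partial}{\partial x}(2xye^{xy^2}) \\
&= 2xy \cdot \frac{\partial}{\partial x}(e^{xy^2}) + e^{xy^2} \cdot \frac{\partial}{\partial x}(2xy) \\
&= 2xye^{xy^2}(y^2) + e^{xy^2}(2y) \\
&= 2ye^{xy^2}(xy^2 + 1).
\end{align*}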
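The equality of the mixed partial derivatives in Example 16.4 can be illustrated numerically. The sketch below differentiates the analytic first-order partial derivatives once more by a central difference and compares both results with the closed form $2ye^{xy^2}(xy^2 + 1)$. The evaluation point $(1; 1)$, the step size and all helper names are assumptions chosen only for illustration.

```python
import math

# First-order partial derivatives of f(x; y) = exp(x*y^2), from Example 16.4.
def f_x(x, y):
    return y**2 * math.exp(x * y**2)

def f_y(x, y):
    return 2 * x * y * math.exp(x * y**2)

def mixed_closed_form(x, y):
    # Closed form obtained for both f_xy and f_yx in the example.
    return 2 * y * math.exp(x * y**2) * (x * y**2 + 1)

h = 1e-5          # assumed step size
x0, y0 = 1.0, 1.0  # assumed evaluation point

# f_xy: differentiate f_x with respect to y (x held fixed).
f_xy_numeric = (f_x(x0, y0 + h) - f_x(x0, y0 - h)) / (2 * h)
# f_yx: differentiate f_y with respect to x (y held fixed).
f_yx_numeric = (f_y(x0 + h, y0) - f_y(x0 - h, y0)) / (2 * h)

print(f_xy_numeric, f_yx_numeric, mixed_closed_form(x0, y0))
```

At this point the closed form equals $4e$, and both numerical values should agree with it to several decimal places.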

16.4 The Cobb-Douglas production function

For an economic interpretation of the partial derivatives of a function of two variables, let us turn our attention to the function
\[ f(x; y) = ax^by^{1-b}, \]
where $a$ and $b$ are positive constants with $0 < b < 1$. This function is called the Cobb-Douglas production function. Here $x$ represents the amount of money spent on labour, $y$ represents the cost of capital equipment (buildings, machinery and other tools for production), and the function $f(x; y)$ measures the output of the finished product (in suitable units) and is called, accordingly, the production function.

The partial derivative $f_x$ is called the marginal product of labour. It measures the rate of change in production with respect to the amount of money spent on labour, with the level of capital expenditure held constant. Similarly, the partial derivative $f_y$, called the marginal product of capital, measures the rate of change in production with respect to the amount spent on capital, with the level of labour expenditure held constant.


Example 16.5

A certain country’s production in the early years following the Second World War is defined by the function
\[ f(x; y) = 30x^{2/3}y^{1/3}, \]
where $x$ represents the units spent on labour and $y$ represents the units spent on capital.

(a) Calculate the first-order partial derivatives of $f(x; y)$.

(b) What is the marginal product of labour and the marginal product of capital when the amounts spent on labour and capital are 125 units and 27 units, respectively? Interpret the results.

(c) To increase the country’s production, should the government have encouraged capital investment rather than increased expenditure on labour?

Solution

(a) The first-order partial derivatives are
\begin{align*}
f_x &= 30 \cdot \frac{2}{3}x^{-1/3}y^{1/3} = 20\left( \frac{y}{x} \right)^{1/3}, \\
f_y &= 30x^{2/3} \cdot \frac{1}{3}y^{-2/3} = 10\left( \frac{x}{y} \right)^{2/3}.
\end{align*}

(b) The required marginal product of labour is given by
\[ f_x(125; 27) = 20\left( \frac{27}{125} \right)^{1/3} = 20\left( \frac{3}{5} \right) = 12 \text{ units.} \]
Production increases by 12 units for each unit increase in labour expenditure (capital expenditure is held constant at 27 units).

The required marginal product of capital is given by
\[ f_y(125; 27) = 10\left( \frac{125}{27} \right)^{2/3} = 10\left( \frac{25}{9} \right) \approx 27{,}778 \text{ units.} \]
Production increases by approximately 27,778 units for each unit increase in capital expenditure (labour expenditure is held constant at 125 units).

(c) From the results in (b), we see that a unit increase in capital expenditure resulted in a much faster increase in production than a unit increase in labour expenditure. Therefore, the government should have encouraged increased spending on capital rather than on labour during the early years of reconstruction.
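The marginal products of a Cobb-Douglas function $f(x; y) = ax^by^{1-b}$ follow the same pattern for any $a$ and $b$, so they are easy to evaluate in code. The sketch below is an illustration only: the function name and argument order are our own, and the formulas are the partial derivatives $f_x = abx^{b-1}y^{1-b}$ and $f_y = a(1-b)x^by^{-b}$ obtained by the power rule.

```python
def marginal_products(a, b, x, y):
    """Return (f_x, f_y) for the Cobb-Douglas function f(x; y) = a * x**b * y**(1 - b)."""
    marginal_labour = a * b * x**(b - 1) * y**(1 - b)   # f_x, capital held constant
    marginal_capital = a * (1 - b) * x**b * y**(-b)     # f_y, labour held constant
    return marginal_labour, marginal_capital

# Data of Example 16.5: a = 30, b = 2/3, labour = 125 units, capital = 27 units.
mpl, mpc = marginal_products(30, 2 / 3, 125, 27)
print("marginal product of labour:", round(mpl, 3))
print("marginal product of capital:", round(mpc, 3))
```

Comparing the two returned values gives the same conclusion as part (c): at this point an extra unit of capital raises production more than an extra unit of labour.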


16.5 Exercises

1. Find the first-order partial derivatives of $f(x; y) = x\ln y + ye^x - x^2$ at point $(0; 2)$ and interpret the results.

2. Find the second-order partial derivatives of the following function and show that the mixed partial derivatives $f_{xy}$ and $f_{yx}$, $f_{xz}$ and $f_{zx}$, $f_{yz}$ and $f_{zy}$ are equal:
\[ f(x; y; z) = 3xyz^2 + x^2yz. \]

3. A country’s production function is
\[ f(x; y) = 60x^{1/3}y^{2/3}, \]
where $x$ represents the units spent on labour and $y$ represents the units spent on capital.

(a) What is the marginal product of labour and the marginal product of capital when the amounts spent on labour and capital are 125 units and 8 units, respectively? Interpret the results.

(b) To increase the country’s production, should the government encourage capital investment rather than increased expenditure on labour at this time?

16.6 Solutions to exercises

1. The first-order partial derivatives are
\begin{align*}
f_x &= \frac{\partial}{\partial x}(x\ln y + ye^x - x^2) = \ln y + ye^x - 2x, \\
f_y &= \frac{\partial}{\partial y}(x\ln y + ye^x - x^2) = \frac{x}{y} + e^x.
\end{align*}
We now evaluate $f_x$ at $(0; 2)$, then
\[ f_x(0; 2) = \ln 2 + 2e^0 - 2(0) = 2{,}693. \]
This means that at point $(x; y) = (0; 2)$, the function increases by 2,693 units for each unit increase in the $x$-direction if $y$ is held constant.

We now evaluate $f_y$ at $(0; 2)$, then
\[ f_y(0; 2) = \frac{0}{2} + e^0 = 1. \]
This means that at point $(x; y) = (0; 2)$, the function increases by 1 unit for each unit increase in the $y$-direction if $x$ is held constant.

2. The first-order partial derivatives are
\begin{align*}
f_x &= \frac{\partial}{\partial x}(3xyz^2 + x^2yz) = 3yz^2 + 2xyz, \\
f_y &= \frac{\partial}{\partial y}(3xyz^2 + x^2yz) = 3xz^2 + x^2z, \\
f_z &= \frac{\partial}{\partial z}(3xyz^2 + x^2yz) = 6xyz + x^2y.
\end{align*}
The second-order partial derivatives with respect to $x$ are
\begin{align*}
f_{xx} &= \frac{\partial}{\partial x}(f_x) = \frac{\partial}{\partial x}(3yz^2 + 2xyz) = 2yz, \\
f_{yx} &= \frac{\partial}{\partial x}(f_y) = \frac{\partial}{\partial x}(3xz^2 + x^2z) = 3z^2 + 2xz, \\
f_{zx} &= \frac{\partial}{\partial x}(f_z) = \frac{\partial}{\partial x}(6xyz + x^2y) = 6yz + 2xy.
\end{align*}
The second-order partial derivatives with respect to $y$ are
\begin{align*}
f_{yy} &= \frac{\partial}{\partial y}(f_y) = \frac{\partial}{\partial y}(3xz^2 + x^2z) = 0, \\
f_{xy} &= \frac{\partial}{\partial y}(f_x) = \frac{\partial}{\partial y}(3yz^2 + 2xyz) = 3z^2 + 2xz, \\
f_{zy} &= \frac{\partial}{\partial y}(f_z) = \frac{\partial}{\partial y}(6xyz + x^2y) = 6xz + x^2.
\end{align*}
The second-order partial derivatives with respect to $z$ are
\begin{align*}
f_{zz} &= \frac{\partial}{\partial z}(f_z) = \frac{\partial}{\partial z}(6xyz + x^2y) = 6xy, \\
f_{xz} &= \frac{\partial}{\partial z}(f_x) = \frac{\partial}{\partial z}(3yz^2 + 2xyz) = 6yz + 2xy, \\
f_{yz} &= \frac{\partial}{\partial z}(f_y) = \frac{\partial}{\partial z}(3xz^2 + x^2z) = 6xz + x^2.
\end{align*}
Compare the mixed partial derivatives $f_{xy}$ and $f_{yx}$. Then
\[ f_{xy} = 3z^2 + 2xz = f_{yx}. \]
Therefore, $f_{xy}$ and $f_{yx}$ are equal.

Compare the mixed partial derivatives $f_{xz}$ and $f_{zx}$. Then
\[ f_{xz} = 6yz + 2xy = f_{zx}. \]
Therefore, $f_{xz}$ and $f_{zx}$ are equal.

Compare the mixed partial derivatives $f_{yz}$ and $f_{zy}$. Then
\[ f_{yz} = 6xz + x^2 = f_{zy}. \]
Therefore, $f_{yz}$ and $f_{zy}$ are equal.

3.

(a) The marginal product of labour is
\[ f_x = \frac{\partial}{\partial x}\left(60x^{1/3}y^{2/3}\right) = 60\left(\tfrac{1}{3}\right)x^{-2/3}y^{2/3} = 20\left(\frac{y}{x}\right)^{2/3}. \]
If the amounts spent on labour and capital are 125 and 8 units respectively, then the point $(x; y) = (125; 8)$ results. The marginal product of labour at this point is
\[ f_x(125; 8) = 20\left(\frac{8}{125}\right)^{2/3} = 20\left(\frac{2}{5}\right)^{2} = 3{,}2. \]
This means that production increases by 3,2 units for each unit increase in labour expenditure if capital expenditure is held constant.

The marginal product of capital is
\[ f_y = \frac{\partial}{\partial y}\left(60x^{1/3}y^{2/3}\right) = 60x^{1/3}\left(\tfrac{2}{3}\right)y^{-1/3} = 40\left(\frac{x}{y}\right)^{1/3}. \]
The marginal product of capital at $(x; y) = (125; 8)$ is
\[ f_y(125; 8) = 40\left(\frac{125}{8}\right)^{1/3} = 40\left(\frac{5}{2}\right) = 100. \]
This means that production increases by 100 units for each unit increase in capital expenditure if labour expenditure is held constant.

(b) The marginal product of capital is much greater than the marginal product of labour at point $(125; 8)$. So we can deduce that the government should encourage capital investment rather than increased labour expenditure.
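The two marginal products can be checked numerically. The following is a minimal sketch, not part of the prescribed material: it approximates each partial derivative of the production function with a central finite difference. The helper name `partial` and the step size `h` are assumptions chosen for illustration.

```python
def f(x, y):
    """Production function f(x; y) = 60 x^(1/3) y^(2/3)."""
    return 60 * x ** (1 / 3) * y ** (2 / 3)


def partial(func, point, index, h=1e-5):
    """Central-difference estimate of the partial derivative of func
    with respect to the variable at position `index`, evaluated at `point`."""
    up = list(point)
    down = list(point)
    up[index] += h
    down[index] -= h
    return (func(*up) - func(*down)) / (2 * h)


# Marginal products at labour = 125 units and capital = 8 units.
f_x = partial(f, (125.0, 8.0), 0)  # marginal product of labour
f_y = partial(f, (125.0, 8.0), 1)  # marginal product of capital
print(round(f_x, 3), round(f_y, 3))
```

The estimates agree with the analytical values 3,2 and 100 to well within the rounding used above.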

Chapter 17

Optimisation of NLPs in several variables

Contents
17.1 Introduction
17.2 Convex and concave functions
17.3 Stationary points and the nature of stationary points
17.4 Solving NLPs in several variables by differential calculus
17.5 Exercises
17.6 Solutions to exercises

Sections from prescribed book, Winston
Chapter 2, Section 2.1
Chapter 2, Section 2.6
Chapter 11, Section 11.3
Chapter 11, Section 11.6

Learning objectives
After completing this study unit you should be able to
• explain the concepts of the Hessian, the principal minors, the leading principal minors and the determinant of a matrix
• set up the Hessian matrix of a function of several variables
• determine the principal minors and the leading principal minors of a matrix
• determine the convexity/concavity of a function
• determine the stationary points of a function of several variables
• determine the nature of the stationary points: local (or relative) minimum, maximum or saddle points
• use the convexity/concavity of the objective function to determine whether a local extremum is an absolute extremum; in other words, whether the solution obtained is in fact the optimal solution
• solve NLP models in several variables by means of differential calculus.

17.1 Introduction

In previous study units we saw that many practical problems were formulated as functions of one variable. Such problems were then solved by determining the extreme values of these functions. For example, the problem of finding a firm's production level, x, that will yield maximum profit was solved by finding the absolute maximum value of the profit function P(x).

The notion of an extreme value of a function plays an equally important role in the case of a function of several variables. As in the case of a function of one variable, it is important to distinguish between the concept of a relative maximum (relative minimum) and that of an absolute maximum (absolute minimum) of the function. A relative extremum (relative maximum or relative minimum) may or may not be an absolute extremum.

Just as the first and second-order derivatives play an important role in determining the relative extrema of a function of one variable, the first and second-order partial derivatives are powerful tools for locating and classifying the relative extrema of functions of several variables, as we shall see later in this study unit.

We also saw in previous study units that there is no guarantee that the solution obtained to an NLP model (whether by hand or by computer) is actually the optimal solution to the model. This is one reason why NLP models are much harder to solve than LP models. It is important that we know the criteria that must apply before we can deduce that the solution obtained is indeed the optimal solution. The convexity/concavity of a function gives such criteria. We discussed this for functions of one variable in Study unit 13. We saw that the second-order derivative of a function is used to determine the convexity/concavity of the function. Now we extend our study to functions of several variables.

17.2 Convex and concave functions

Functions of several variables do not have only one second-order derivative. A function of $n$ variables has $n^{2}$ second-order partial derivatives. For example, a function of two variables, say $g(x_1; x_2)$, has four second-order partial derivatives, namely
\[ \frac{\partial^{2} g}{\partial x_1^{2}}, \qquad \frac{\partial^{2} g}{\partial x_2 \partial x_1}, \qquad \frac{\partial^{2} g}{\partial x_2^{2}}, \qquad \frac{\partial^{2} g}{\partial x_1 \partial x_2}. \]

Second-order partial derivatives as well as the following concepts are used in the theorems that establish the convexity/concavity of functions of several variables:
• the Hessian matrix of a function
• the ith principal minor of a matrix
• the kth leading principal minor of a matrix.

Refer to the second half of Winston, Section 11.3 for the definitions of these concepts and the theorems on convexity/concavity. Examples are also given to illustrate these definitions and theorems. To understand the definitions of these concepts, you may need to refresh your memory on basic algebra, especially on matrices and determinants. Refer to Winston, Sections 2.1 and 2.6 for this.
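For quick reference, the standard definitions and the two convexity/concavity tests can be summarised as below. This is only a sketch in generic notation (the symbols $H$ and $a_{ij}$ are labels chosen here), assuming $f$ has continuous second-order partial derivatives on a convex set $S$; Winston, Section 11.3 remains the authoritative statement.

```latex
% Hessian of f(x_1; ...; x_n): the n x n matrix of second-order partial derivatives
H(x_1; \ldots; x_n) = \left[ a_{ij} \right], \qquad
a_{ij} = \frac{\partial^{2} f}{\partial x_i \, \partial x_j}.

% k-th principal minor: the determinant of any k x k matrix obtained by
% deleting n - k rows and the corresponding n - k columns of H.

% k-th leading principal minor: the determinant of the k x k matrix obtained by
% deleting the last n - k rows and the last n - k columns of H.

% Convexity test: f is convex on S if and only if, for every point of S,
% all principal minors of H are nonnegative.

% Concavity test: f is concave on S if and only if, for every point of S and
% k = 1, ..., n, all nonzero k-th principal minors of H have the sign of (-1)^k.
```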


Determining the convexity/concavity of functions is important and we illustrate this by means of additional examples. These examples are taken from the problems at the end of Winston, Section 11.3.

Example 17.1
Determine whether the function $f(x) = \frac{1}{x}$ is convex, concave or neither on the set $S = (0; \infty)$.

Solution
The derivatives are
\[ f'(x) = \frac{\partial}{\partial x}\left(x^{-1}\right) = -x^{-2} = -\frac{1}{x^{2}}, \]
\[ f''(x) = \frac{\partial}{\partial x}\left(-x^{-2}\right) = 2x^{-3} = \frac{2}{x^{3}}. \]
Now $f''(x) = \frac{2}{x^{3}} > 0$ for $x \in S = (0; \infty)$. Therefore, $f(x) = \frac{1}{x}$ is a convex function on S.

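Example 17.2
Determine whether the function $f(x_1; x_2) = x_1^{3} + 3x_1x_2 + x_2^{2}$ is convex, concave or neither on the set $S = \mathbb{R}^{2}$.

Solution
The first-order partial derivatives are
\[ \frac{\partial f}{\partial x_1} = 3x_1^{2} + 3x_2, \qquad \frac{\partial f}{\partial x_2} = 3x_1 + 2x_2. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x_1^{2}} = \frac{\partial}{\partial x_1}\left(3x_1^{2} + 3x_2\right) = 6x_1, \]
\[ \frac{\partial^{2} f}{\partial x_2 \partial x_1} = \frac{\partial}{\partial x_2}\left(3x_1^{2} + 3x_2\right) = 3, \]
\[ \frac{\partial^{2} f}{\partial x_2^{2}} = \frac{\partial}{\partial x_2}\left(3x_1 + 2x_2\right) = 2, \]
\[ \frac{\partial^{2} f}{\partial x_1 \partial x_2} = \frac{\partial}{\partial x_1}\left(3x_1 + 2x_2\right) = 3. \]
The Hessian matrix is
\[ H(x_1; x_2) = \begin{bmatrix} 6x_1 & 3 \\ 3 & 2 \end{bmatrix}. \]
The first principal minors are $6x_1$ and 2. Now $6x_1 > 0$ for $x_1 > 0$ and $6x_1 < 0$ for $x_1 < 0$. Therefore, the two theorems for convexity/concavity do not hold for $x \in S = \mathbb{R}^{2}$. We conclude that the function is neither convex nor concave on $S = \mathbb{R}^{2}$.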

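Example 17.3
Determine whether the function $f(x_1; x_2; x_3) = -x_1^{2} - x_2^{2} - 2x_3^{2} + 0{,}5x_1x_2$ is convex, concave or neither on the set $S = \mathbb{R}^{3}$.

Solution
The first-order partial derivatives are
\[ \frac{\partial f}{\partial x_1} = -2x_1 + 0{,}5x_2, \qquad \frac{\partial f}{\partial x_2} = -2x_2 + 0{,}5x_1, \qquad \frac{\partial f}{\partial x_3} = -4x_3. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x_1^{2}} = -2, \qquad \frac{\partial^{2} f}{\partial x_2 \partial x_1} = 0{,}5, \qquad \frac{\partial^{2} f}{\partial x_3 \partial x_1} = 0, \]
\[ \frac{\partial^{2} f}{\partial x_2^{2}} = -2, \qquad \frac{\partial^{2} f}{\partial x_1 \partial x_2} = 0{,}5, \qquad \frac{\partial^{2} f}{\partial x_3 \partial x_2} = 0, \]
\[ \frac{\partial^{2} f}{\partial x_3^{2}} = -4, \qquad \frac{\partial^{2} f}{\partial x_1 \partial x_3} = 0, \qquad \frac{\partial^{2} f}{\partial x_2 \partial x_3} = 0. \]
Notice that the following mixed partial derivatives are equal:
\[ f_{x_1x_2} = 0{,}5 = f_{x_2x_1}, \qquad f_{x_1x_3} = 0 = f_{x_3x_1}, \qquad f_{x_2x_3} = 0 = f_{x_3x_2}. \]
The Hessian matrix is
\[ H(x_1; x_2; x_3) = \begin{bmatrix} -2 & 0{,}5 & 0 \\ 0{,}5 & -2 & 0 \\ 0 & 0 & -4 \end{bmatrix}. \]
The first principal minors are −2, −2 and −4, all with the same sign as $(-1)^{1} = -1$.

The second principal minors are
\[ \begin{vmatrix} -2 & 0 \\ 0 & -4 \end{vmatrix} = (-2)(-4) - 0 = 8 > 0, \]
\[ \begin{vmatrix} -2 & 0 \\ 0 & -4 \end{vmatrix} = 8 > 0, \]
\[ \begin{vmatrix} -2 & 0{,}5 \\ 0{,}5 & -2 \end{vmatrix} = (-2)(-2) - (0{,}5)(0{,}5) = 3{,}75 > 0. \]
All have the same sign as $(-1)^{2} = +1$.

The third principal minor is the determinant of the Hessian matrix. Let
\[ A = \begin{bmatrix} -2 & 0{,}5 & 0 \\ 0{,}5 & -2 & 0 \\ 0 & 0 & -4 \end{bmatrix}. \]
The determinant of A can be obtained by expanding row 1 by means of cofactors. Then
\[ \det A = (-1)^{1+1}a_{11}\det A_{11} + (-1)^{1+2}a_{12}\det A_{12} + (-1)^{1+3}a_{13}\det A_{13} \]
\[ = (-1)^{2}(-2)\begin{vmatrix} -2 & 0 \\ 0 & -4 \end{vmatrix} + (-1)^{3}(0{,}5)\begin{vmatrix} 0{,}5 & 0 \\ 0 & -4 \end{vmatrix} + (-1)^{4}(0)\begin{vmatrix} 0{,}5 & -2 \\ 0 & 0 \end{vmatrix} \]
\[ = -2(8) - 0{,}5(-2) + 0 = -16 + 1 = -15. \]
(Refer to Winston, Section 2.6 for determinants.)

This means that the third principal minor is −15 < 0, and this has the same sign as $(-1)^{3} = -1$. Therefore, we conclude that the function is a concave function on $S = \mathbb{R}^{3}$.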
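The principal-minor test in Example 17.3 can also be carried out mechanically. The sketch below is an illustration only: the function names `det` and `principal_minors` are assumptions, exact fractions are used so that 0,5 is represented without rounding, and the cofactor expansion is only sensible for small matrices such as this one.

```python
from fractions import Fraction
from itertools import combinations


def det(m):
    """Determinant by cofactor expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, a in enumerate(m[0]):
        # Delete row 1 and column j+1 to form the minor matrix.
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * a * det(minor)
    return total


def principal_minors(m, k):
    """All k-th principal minors: keep the same k rows and columns."""
    n = len(m)
    return [det([[m[i][j] for j in keep] for i in keep])
            for keep in combinations(range(n), k)]


half = Fraction(1, 2)
# Hessian of Example 17.3 (constant, so the test covers all of S).
H = [[-2, half, 0],
     [half, -2, 0],
     [0, 0, -4]]

minors = {k: principal_minors(H, k) for k in (1, 2, 3)}

# Concavity test: every nonzero k-th principal minor has the sign of (-1)^k.
is_concave = all(m == 0 or (m > 0) == (k % 2 == 0)
                 for k, values in minors.items() for m in values)
print(minors, is_concave)
```

Because this Hessian does not depend on the point, a single evaluation suffices; for a Hessian such as the one in Example 17.2 the test would have to hold at every point of S.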

We now know how to determine the convexity/concavity of functions of several variables. This can be used to determine whether the solution to an NLP model is in fact the optimal solution. The criteria for this are given in two theorems in the first part of Winston, Section 11.3. These theorems are repeated as two vital theorems in Section 13.2 of this study guide. Make sure that you are familiar with these theorems.

17.3 Stationary points and the nature of stationary points

Stationary points and the nature of stationary points, that is, whether a stationary point is a local maximum, a local minimum or a saddle point, are discussed in Winston, Section 11.6. Turn to this section and study these concepts, the three important theorems and the examples. Determining the stationary points of a function and the nature of these stationary points is illustrated by means of examples.

Example 17.4
Find the relative extrema of the function $f(x; y) = 3x^{2} - 4xy + 4y^{2} - 4x + 8y + 4$.

Solution
The first-order partial derivatives are
\[ f_x = \frac{\partial f}{\partial x} = 6x - 4y - 4, \qquad f_y = \frac{\partial f}{\partial y} = -4x + 8y + 8. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x^{2}} = \frac{\partial}{\partial x}(f_x) = \frac{\partial}{\partial x}(6x - 4y - 4) = 6, \]
\[ \frac{\partial^{2} f}{\partial y \partial x} = \frac{\partial}{\partial y}(f_x) = \frac{\partial}{\partial y}(6x - 4y - 4) = -4, \]
\[ \frac{\partial^{2} f}{\partial y^{2}} = \frac{\partial}{\partial y}(f_y) = \frac{\partial}{\partial y}(-4x + 8y + 8) = 8, \]
\[ \frac{\partial^{2} f}{\partial x \partial y} = \frac{\partial}{\partial x}(f_y) = \frac{\partial}{\partial x}(-4x + 8y + 8) = -4. \]
To find the stationary points, we set $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$ and solve the resulting equations simultaneously. Then
\[ 6x - 4y - 4 = 0 \qquad (1) \]
\[ -4x + 8y + 8 = 0. \qquad (2) \]
Multiply equation (1) by 2. Then
\[ 12x - 8y = 8 \qquad (3) \]
\[ -4x + 8y = -8. \qquad (4) \]
Add equations (3) and (4). Then
\[ 8x = 0 \quad \Rightarrow \quad x = 0. \]
Substitute this into (3). Then
\[ 12(0) - 8y = 8 \quad \Rightarrow \quad y = -1. \]
The point $(x; y) = (0; -1)$ is the only stationary point of $f(x; y)$.

The Hessian matrix is
\[ H(x; y) = \begin{bmatrix} 6 & -4 \\ -4 & 8 \end{bmatrix}. \]
The first leading principal minor is $H_1(x; y) = 6 > 0$.
The second leading principal minor is $H_2(x; y) = 6(8) - (-4)(-4) = 32 > 0$.
Both leading principal minors are > 0. Therefore, the stationary point $(0; -1)$ is a local minimum point.

Example 17.5
Find the relative extrema of the function $f(x; y) = 4y^{3} + x^{2} - 12y^{2} - 36y + 2$.

Solution
The first-order partial derivatives are
\[ f_x = \frac{\partial f}{\partial x} = 2x, \qquad f_y = \frac{\partial f}{\partial y} = 12y^{2} - 24y - 36. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x^{2}} = 2, \qquad \frac{\partial^{2} f}{\partial y \partial x} = 0, \qquad \frac{\partial^{2} f}{\partial y^{2}} = 24y - 24, \qquad \frac{\partial^{2} f}{\partial x \partial y} = 0. \]
To find the stationary points, we set $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$. Then
\[ 2x = 0 \quad \Rightarrow \quad x = 0, \]
and
\[ 12y^{2} - 24y - 36 = 0 \]
\[ y^{2} - 2y - 3 = 0 \]
\[ (y + 1)(y - 3) = 0 \]
\[ y = -1 \quad \text{or} \quad y = 3. \]
There are two stationary points: $(0; -1)$ and $(0; 3)$.

The Hessian matrix is
\[ H(x; y) = \begin{bmatrix} 2 & 0 \\ 0 & 24y - 24 \end{bmatrix}. \]
Consider point $(x; y) = (0; -1)$.
The first leading principal minor is $H_1(x; y) = 2 > 0$.
The second leading principal minor is $H_2(x; y) = 2(24y - 24) - 0 = 48y - 48$, and this evaluated at $(0; -1)$ is $H_2(0; -1) = 48(-1) - 48 = -96 < 0$.
The two theorems on local extrema do not hold. Since the leading principal minors are nonzero, we can conclude that $(0; -1)$ is a saddle point.

Consider point $(x; y) = (0; 3)$.
The first leading principal minor is $H_1(x; y) = 2 > 0$.
The second leading principal minor evaluated at $(0; 3)$ is $H_2(0; 3) = 48(3) - 48 = 96 > 0$.
Now both leading principal minors are > 0. Therefore, the stationary point $(0; 3)$ is a local minimum point.
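The classification used in Examples 17.4 and 17.5 can be sketched for functions of two variables as follows. This is a hedged illustration rather than a general routine: `classify` and `hessian_entries` are hypothetical helper names, and the rule encoded is the one applied above (both leading principal minors positive gives a local minimum; signs $(-1)^{1}$ and $(-1)^{2}$ give a local maximum; otherwise, if both are nonzero, a saddle point).

```python
def classify(fxx, fxy, fyy):
    """Classify a stationary point of f(x; y) from its Hessian entries."""
    h1 = fxx                      # first leading principal minor
    h2 = fxx * fyy - fxy * fxy    # second leading principal minor (det H)
    if h1 > 0 and h2 > 0:
        return "local minimum"
    if h1 < 0 and h2 > 0:
        return "local maximum"
    if h1 != 0 and h2 != 0:
        return "saddle point"
    return "inconclusive"


def hessian_entries(x, y):
    """Second-order partial derivatives of Example 17.5 at (x; y)."""
    return 2, 0, 24 * y - 24


# Example 17.5: the two stationary points found above.
results = {p: classify(*hessian_entries(*p)) for p in [(0, -1), (0, 3)]}

# Example 17.4: constant Hessian with entries 6, -4 and 8.
example_17_4 = classify(6, -4, 8)
print(results, example_17_4)
```

When a leading principal minor is zero, as at a degenerate stationary point, the function returns "inconclusive" and the point has to be examined by other means.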

17.4 Solving NLPs in several variables by differential calculus

To solve an NLP in several variables by differential calculus, we must carry out the following steps:
Step 1: Find the stationary points of the function under consideration.
Step 2: Determine the nature of the stationary points.
Step 3: Determine the convexity/concavity of the function.

Example 17.6
The Acrosonic Company produces and sells loudspeakers. Its total weekly revenue (in rand) is
\[ R(x; y) = -\tfrac{1}{4}x^{2} - \tfrac{3}{8}y^{2} - \tfrac{1}{4}xy + 300x + 240y. \]
Here x represents the number of fully assembled units and y represents the number of assemble-it-yourself kits. The total weekly cost (in rand) for the production of these loudspeakers is
\[ C(x; y) = 180x + 140y + 5\,000. \]
Acrosonic wants to determine how many assembled units and how many kits to produce per week to maximise its profit.

Solution

Decision variables
The decision variables are given as
x = number of fully assembled units produced weekly,
y = number of assemble-it-yourself kits produced weekly.

Objective function
Profit must be maximised. Now
profit = revenue − cost
\[ P(x; y) = R(x; y) - C(x; y) \]
\[ = -\tfrac{1}{4}x^{2} - \tfrac{3}{8}y^{2} - \tfrac{1}{4}xy + 300x + 240y - (180x + 140y + 5\,000) \]
\[ = -\tfrac{1}{4}x^{2} - \tfrac{3}{8}y^{2} - \tfrac{1}{4}xy + 120x + 100y - 5\,000. \]
The objective function is
\[ \text{Maximise } P(x; y) = -\tfrac{1}{4}x^{2} - \tfrac{3}{8}y^{2} - \tfrac{1}{4}xy + 120x + 100y - 5\,000. \]

Constraints
The number of units produced cannot be negative; so $x; y \ge 0$.

The NLP model is
\[ \text{Maximise } P(x; y) = -\tfrac{1}{4}x^{2} - \tfrac{3}{8}y^{2} - \tfrac{1}{4}xy + 120x + 100y - 5\,000 \]
\[ \text{subject to } x; y \ge 0. \]

Solution

Step 1: Find the stationary points
The stationary points are determined from $P_x = 0$ and $P_y = 0$ as
\[ P_x = -\tfrac{1}{2}x - \tfrac{1}{4}y + 120 = 0, \]
\[ P_y = -\tfrac{3}{4}y - \tfrac{1}{4}x + 100 = 0. \]
From this it follows that
\[ 2x + y = 480 \qquad (1) \]
\[ x + 3y = 400. \qquad (2) \]
Multiply equation (2) by −2. Then
\[ -2x - 6y = -800. \qquad (3) \]
Add equations (1) and (3). Then
\[ -5y = -320 \quad \Rightarrow \quad y = 64. \qquad (4) \]
Substitute (4) into (2). Then
\[ x + 3(64) = 400 \quad \Rightarrow \quad x = 208. \]
The stationary point is at $(x; y) = (208; 64)$.

Step 2: Determine the nature of the stationary points
The second-order partial derivatives are
\[ P_{xx} = -\tfrac{1}{2}, \qquad P_{yx} = -\tfrac{1}{4}, \qquad P_{yy} = -\tfrac{3}{4}, \qquad P_{xy} = -\tfrac{1}{4}. \]
The Hessian matrix is
\[ H(x; y) = \begin{bmatrix} -\tfrac{1}{2} & -\tfrac{1}{4} \\ -\tfrac{1}{4} & -\tfrac{3}{4} \end{bmatrix}. \]
The first leading principal minor is $-\tfrac{1}{2} < 0$, which has the same sign as $(-1)^{1} = -1$.
The second leading principal minor is $(-\tfrac{1}{2})(-\tfrac{3}{4}) - (-\tfrac{1}{4})(-\tfrac{1}{4}) = \tfrac{3}{8} - \tfrac{1}{16} = \tfrac{5}{16} > 0$, which has the same sign as $(-1)^{2} = +1$.
This means that $(208; 64)$ is a local maximum point.

Step 3: Determine convexity/concavity
The first principal minors are $-\tfrac{1}{2}$ and $-\tfrac{3}{4}$; both are negative and have the same sign as $(-1)^{1} = -1$.
The second principal minor is $\tfrac{5}{16} > 0$, which has the same sign as $(-1)^{2} = +1$.
From this it follows that the profit function is a concave function.

Conclusion
The point $(x; y) = (208; 64)$ is the optimal solution. Acrosonic will maximise weekly profit if it produces 208 fully assembled loudspeakers and 64 assemble-it-yourself kits per week. The associated maximum profit is
\[ P(208; 64) = -\tfrac{1}{4}(208)^{2} - \tfrac{3}{8}(64)^{2} - \tfrac{1}{4}(208)(64) + 120(208) + 100(64) - 5\,000 = \text{R}10\,680. \]
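The arithmetic in Step 1 and in the conclusion can be verified with exact fractions. The following sketch solves equations (1) and (2) by Cramer's rule, which is a substitute for the elimination used above and is chosen here only for brevity; the variable names are assumptions for illustration.

```python
from fractions import Fraction as F


def profit(x, y):
    """Weekly profit P(x; y) from Example 17.6, in rand."""
    return (-F(1, 4) * x**2 - F(3, 8) * y**2 - F(1, 4) * x * y
            + 120 * x + 100 * y - 5000)


# P_x = 0 and P_y = 0 rewritten as 2x + y = 480 and x + 3y = 400.
a11, a12, b1 = 2, 1, 480
a21, a22, b2 = 1, 3, 400

d = a11 * a22 - a12 * a21           # determinant of the coefficient matrix
x = F(b1 * a22 - a12 * b2, d)       # Cramer's rule for x
y = F(a11 * b2 - b1 * a21, d)       # Cramer's rule for y
best = profit(x, y)

# Because P is concave, moving one unit in any direction lowers the profit.
neighbours = [profit(x + dx, y + dy)
              for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]]
print(x, y, best, max(neighbours) < best)
```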

Example 17.7
A television relay station will serve towns A, B and C whose relative locations are shown in Figure 17.1. Determine a site for the location of the station if the sum of the squares of the distances from each town to the site is minimised.

[Figure 17.1: Television relay station. The figure shows towns A (30; 20), B (−20; 10) and C (10; −10) and the site P (x; y) plotted on x- and y-axes.]

Solution
Suppose the required site is located at point $P(x; y)$, where x and y represent the coordinates on the x-axis and y-axis respectively. The square of the distance from town A to the site is
\[ (x - 30)^{2} + (y - 20)^{2}. \]
The squares of the distances from towns B and C to the site are found in a similar manner. The sum of the squares of the distances from each town to the site is given by
\[ f(x; y) = (x - 30)^{2} + (y - 20)^{2} + (x + 20)^{2} + (y - 10)^{2} + (x - 10)^{2} + (y + 10)^{2}, \]
and this must be minimised.

The first-order partial derivatives are obtained by means of the chain rule as
\[ \frac{\partial f}{\partial x} = 2(x - 30) + 2(x + 20) + 2(x - 10) = 6x - 40, \]
\[ \frac{\partial f}{\partial y} = 2(y - 20) + 2(y - 10) + 2(y + 10) = 6y - 40. \]
Setting $\frac{\partial f}{\partial x} = 0$ and $\frac{\partial f}{\partial y} = 0$ gives
\[ 6x - 40 = 0 \ \Rightarrow\ x = \tfrac{20}{3}, \qquad 6y - 40 = 0 \ \Rightarrow\ y = \tfrac{20}{3}. \]
The point $(x; y) = (\tfrac{20}{3}; \tfrac{20}{3})$ is the only stationary point.

The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x^{2}} = 6, \qquad \frac{\partial^{2} f}{\partial y \partial x} = 0, \qquad \frac{\partial^{2} f}{\partial y^{2}} = 6, \qquad \frac{\partial^{2} f}{\partial x \partial y} = 0. \]
The Hessian matrix is
\[ H(x; y) = \begin{bmatrix} 6 & 0 \\ 0 & 6 \end{bmatrix}. \]
The first leading principal minor is 6 > 0.
The second leading principal minor is 6(6) − 0 = 36 > 0.
Both leading principal minors are positive. Therefore, the point $(\tfrac{20}{3}; \tfrac{20}{3})$ is a local minimum.

The first principal minors are 6 and 6; both > 0. The second principal minor is 36 > 0. All principal minors are > 0. Therefore, the function is convex.

We can conclude that the point $(x; y) = (\tfrac{20}{3}; \tfrac{20}{3})$ is the optimal solution. This means that the relay station should be located at a site with coordinates $x = \tfrac{20}{3}$ and $y = \tfrac{20}{3}$. This will minimise the sum of the squared distances from each of the three towns to the site.
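A short check of this result: setting each first-order partial derivative to zero amounts to averaging the town coordinates, so the optimal site is the mean (centroid) of the three towns. The sketch below confirms this with exact fractions; the dictionary `towns` and the helper `sum_sq` are names assumed for illustration.

```python
from fractions import Fraction

towns = {"A": (30, 20), "B": (-20, 10), "C": (10, -10)}


def sum_sq(x, y):
    """Sum of squared distances from (x; y) to the three towns."""
    return sum((x - a) ** 2 + (y - b) ** 2 for a, b in towns.values())


n = len(towns)
# Mean of the x-coordinates and mean of the y-coordinates.
site = (Fraction(sum(a for a, _ in towns.values()), n),
        Fraction(sum(b for _, b in towns.values()), n))

# Compare with a few nearby candidate sites.
nearby = [sum_sq(site[0] + dx, site[1] + dy)
          for dx, dy in [(1, 0), (-1, 0), (0, 1), (0, -1)]]
print(site, sum_sq(*site), min(nearby) > sum_sq(*site))
```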

17.5 Exercises

1. Determine whether the function
\[ f(x_1; x_2) = -x_1^{2} - x_1x_2 - 2x_2^{2} \]
is convex, concave or neither on the set $S = \mathbb{R}^{2}$. (This problem is from Winston, Section 11.3.)

2. Find all local maxima, local minima and saddle points for
\[ f(x_1; x_2) = x_1^{3} - 3x_1x_2^{2} + x_2^{4}. \]
(This problem is from Winston, Section 11.6.)

3. A company can sell all it produces of a given output for R2 per unit. The output is produced by combining two inputs. If $q_1$ units of input 1 and $q_2$ units of input 2 are used, the company can produce $q_1^{1/3} + q_2^{2/3}$ units of the output. If it costs R1 to purchase a unit of input 1 and R1,50 to purchase a unit of input 2, how can the company maximise its profit? (This problem is from Winston, Section 11.6.)

17.6 Solutions to exercises

1. The first-order partial derivatives are
\[ \frac{\partial f}{\partial x_1} = -2x_1 - x_2, \qquad \frac{\partial f}{\partial x_2} = -x_1 - 4x_2. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x_1^{2}} = -2, \qquad \frac{\partial^{2} f}{\partial x_2 \partial x_1} = -1, \qquad \frac{\partial^{2} f}{\partial x_2^{2}} = -4, \qquad \frac{\partial^{2} f}{\partial x_1 \partial x_2} = -1. \]
The Hessian matrix is
\[ H(x_1; x_2) = \begin{bmatrix} -2 & -1 \\ -1 & -4 \end{bmatrix}. \]
The first principal minors are −2 and −4; both are negative and have the same sign as $(-1)^{1} = -1$.
The second principal minor is $(-2)(-4) - 1 = 7 > 0$; the same sign as $(-1)^{2} = +1$.
Therefore, the function is a concave function on $S = \mathbb{R}^{2}$.

2. The first-order partial derivatives are
\[ \frac{\partial f}{\partial x_1} = 3x_1^{2} - 3x_2^{2}, \qquad \frac{\partial f}{\partial x_2} = -6x_1x_2 + 4x_2^{3}. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial x_1^{2}} = 6x_1, \qquad \frac{\partial^{2} f}{\partial x_2 \partial x_1} = -6x_2, \qquad \frac{\partial^{2} f}{\partial x_2^{2}} = -6x_1 + 12x_2^{2}, \qquad \frac{\partial^{2} f}{\partial x_1 \partial x_2} = -6x_2. \]
Set $\frac{\partial f}{\partial x_1} = 0$, then
\[ 3x_1^{2} - 3x_2^{2} = 0 \]
\[ 3(x_1 - x_2)(x_1 + x_2) = 0 \]
\[ x_1 - x_2 = 0 \quad \text{or} \quad x_1 + x_2 = 0 \]
\[ x_1 = x_2 \quad \text{or} \quad x_1 = -x_2. \]
Now substitute $x_1 = x_2$ into $\frac{\partial f}{\partial x_2} = 0$. Then
\[ -6x_1x_2 + 4x_2^{3} = 0 \]
\[ -6x_2^{2} + 4x_2^{3} = 0 \]
\[ -2x_2^{2}(3 - 2x_2) = 0 \]
\[ x_2^{2} = 0 \quad \text{or} \quad 3 - 2x_2 = 0 \]
\[ x_2 = 0 \quad \text{or} \quad x_2 = \tfrac{3}{2}. \]
Therefore, $(0; 0)$ and $(\tfrac{3}{2}; \tfrac{3}{2})$ are stationary points.

Now substitute $x_1 = -x_2$ into $\frac{\partial f}{\partial x_2} = 0$. Then
\[ -6x_1x_2 + 4x_2^{3} = 0 \]
\[ 6x_2^{2} + 4x_2^{3} = 0 \]
\[ 2x_2^{2}(3 + 2x_2) = 0 \]
\[ x_2^{2} = 0 \quad \text{or} \quad 3 + 2x_2 = 0 \]
\[ x_2 = 0 \quad \text{or} \quad x_2 = -\tfrac{3}{2}. \]
Therefore, $(0; 0)$ and $(\tfrac{3}{2}; -\tfrac{3}{2})$ are stationary points.

The given function has three stationary points: $(0; 0)$, $(\tfrac{3}{2}; \tfrac{3}{2})$ and $(\tfrac{3}{2}; -\tfrac{3}{2})$.

The Hessian matrix is
\[ H(x_1; x_2) = \begin{bmatrix} 6x_1 & -6x_2 \\ -6x_2 & -6x_1 + 12x_2^{2} \end{bmatrix}. \]
The first leading principal minor is $6x_1$.
The second leading principal minor is
\[ 6x_1(-6x_1 + 12x_2^{2}) - (-6x_2)(-6x_2) = -36x_1^{2} + 72x_1x_2^{2} - 36x_2^{2}. \]
The leading principal minors must be evaluated at each of the stationary points.

Consider point $(\tfrac{3}{2}; \tfrac{3}{2})$.
Now the first leading principal minor is $H_1(\tfrac{3}{2}; \tfrac{3}{2}) = 6(\tfrac{3}{2}) = 9 > 0$.
The second leading principal minor is
\[ H_2(\tfrac{3}{2}; \tfrac{3}{2}) = -36(\tfrac{3}{2})^{2} + 72(\tfrac{3}{2})(\tfrac{3}{2})^{2} - 36(\tfrac{3}{2})^{2} = 81 > 0. \]
All leading principal minors are positive. Therefore, $(\tfrac{3}{2}; \tfrac{3}{2})$ is a local minimum.

Consider point $(\tfrac{3}{2}; -\tfrac{3}{2})$.
We find $H_1(\tfrac{3}{2}; -\tfrac{3}{2}) = 9 > 0$ and $H_2(\tfrac{3}{2}; -\tfrac{3}{2}) = 81 > 0$. Therefore, $(\tfrac{3}{2}; -\tfrac{3}{2})$ is also a local minimum.

Consider point $(0; 0)$.
We find $H_1(0; 0) = 0$ and $H_2(0; 0) = 0$. Therefore, we cannot determine the nature of the stationary point $(0; 0)$ from the leading principal minors.

3. Decision variables
The decision variables are given as
$q_1$ = number of units of input 1 to be used,
$q_2$ = number of units of input 2 to be used.

Objective function
Profit must be maximised. Now
\[ \text{profit} = \text{revenue} - \text{cost} = 2\left(q_1^{1/3} + q_2^{2/3}\right) - (q_1 + 1{,}5q_2). \]
The objective function is then
\[ \text{Maximise } f(q_1; q_2) = 2q_1^{1/3} + 2q_2^{2/3} - q_1 - 1{,}5q_2. \]

Constraints
A negative number of units of the two inputs cannot be used; so $q_1; q_2 \ge 0$. There are no other constraints.

The NLP model is
\[ \text{Maximise } f(q_1; q_2) = 2q_1^{1/3} + 2q_2^{2/3} - q_1 - 1{,}5q_2 \]
\[ \text{subject to } q_1; q_2 \ge 0. \]

Solution
The first-order partial derivatives are
\[ \frac{\partial f}{\partial q_1} = \tfrac{2}{3}q_1^{-2/3} - 1, \qquad \frac{\partial f}{\partial q_2} = \tfrac{4}{3}q_2^{-1/3} - \tfrac{3}{2}. \]
The second-order partial derivatives are
\[ \frac{\partial^{2} f}{\partial q_1^{2}} = -\tfrac{4}{9}q_1^{-5/3}, \qquad \frac{\partial^{2} f}{\partial q_2 \partial q_1} = 0, \qquad \frac{\partial^{2} f}{\partial q_2^{2}} = -\tfrac{4}{9}q_2^{-4/3}, \qquad \frac{\partial^{2} f}{\partial q_1 \partial q_2} = 0. \]
Set $\frac{\partial f}{\partial q_1} = 0$, then
\[ \tfrac{2}{3}q_1^{-2/3} = 1 \]
\[ \frac{1}{q_1^{2/3}} = \frac{3}{2} \]
\[ q_1^{2/3} = \frac{2}{3} \]
\[ q_1 = \left(\frac{2}{3}\right)^{3/2}. \]
Set $\frac{\partial f}{\partial q_2} = 0$, then
\[ \tfrac{4}{3}q_2^{-1/3} = \frac{3}{2} \]
\[ \frac{1}{q_2^{1/3}} = \frac{9}{8} \]
\[ q_2^{1/3} = \frac{8}{9} \]
\[ q_2 = \left(\frac{8}{9}\right)^{3}. \]
The function has the following stationary point:
\[ (q_1; q_2) = \left(\left(\tfrac{2}{3}\right)^{3/2}; \left(\tfrac{8}{9}\right)^{3}\right). \]
The Hessian matrix is
\[ H(q_1; q_2) = \begin{bmatrix} -\tfrac{4}{9}q_1^{-5/3} & 0 \\ 0 & -\tfrac{4}{9}q_2^{-4/3} \end{bmatrix}. \]
The first leading principal minor is $H_1(q_1; q_2) = -\tfrac{4}{9}q_1^{-5/3} < 0$, for $q_1 > 0$, which has the same sign as $(-1)^{1} = -1$.
The second leading principal minor is $H_2(q_1; q_2) = \tfrac{16}{81}q_1^{-5/3}q_2^{-4/3} > 0$, for $q_1; q_2 > 0$, which has the same sign as $(-1)^{2} = +1$.
This means that the stationary point $(q_1; q_2) = \left(\left(\tfrac{2}{3}\right)^{3/2}; \left(\tfrac{8}{9}\right)^{3}\right)$ is a local maximum point.

To determine the convexity/concavity of the function, we need the principal minors.
The first principal minors are $-\tfrac{4}{9}q_1^{-5/3} < 0$ and $-\tfrac{4}{9}q_2^{-4/3} < 0$, for $q_1; q_2 > 0$; both have the same sign as $(-1)^{1} = -1$.
The second principal minor is $\tfrac{16}{81}q_1^{-5/3}q_2^{-4/3} > 0$, for $q_1; q_2 > 0$; which has the same sign as $(-1)^{2} = +1$.
Therefore, the function is a concave function.

Conclusion
We can conclude that the local maximum point is in fact the absolute maximum. The optimal solution is as follows:
The company must purchase $q_1 = \left(\tfrac{2}{3}\right)^{3/2} \approx 0{,}54$ and $q_2 = \left(\tfrac{8}{9}\right)^{3} \approx 0{,}7$ units of input 1 and input 2, respectively.
The associated maximum profit is
\[ f(0{,}54; 0{,}7) = 2(0{,}54)^{1/3} + 2(0{,}7)^{2/3} - 0{,}54 - 1{,}5(0{,}7) = \text{R}1{,}62. \]
(What a small profit!!!)
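As a numerical cross-check of this solution, the sketch below evaluates the first-order partial derivatives and the profit at the unrounded stationary point. It is an illustration only; the function names are assumptions, and floating-point arithmetic is used, so the derivatives come out as very small numbers rather than exactly zero.

```python
def profit(q1, q2):
    """Profit f(q1; q2) = 2 q1^(1/3) + 2 q2^(2/3) - q1 - 1.5 q2, in rand."""
    return 2 * q1 ** (1 / 3) + 2 * q2 ** (2 / 3) - q1 - 1.5 * q2


def gradient(q1, q2):
    """First-order partial derivatives of the profit function."""
    return (2 / 3 * q1 ** (-2 / 3) - 1,
            4 / 3 * q2 ** (-1 / 3) - 1.5)


# Stationary point derived above.
q1 = (2 / 3) ** 1.5
q2 = (8 / 9) ** 3

d1, d2 = gradient(q1, q2)
best = profit(q1, q2)
print(round(q1, 4), round(q2, 4), round(best, 4))
```

The unrounded inputs give a profit that rounds to the same R1,62 obtained above with $q_1 \approx 0{,}54$ and $q_2 \approx 0{,}7$.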

Chapter 18

Method of steepest ascent

Contents
18.1 Introduction
18.2 The method of steepest ascent
18.3 The method of steepest descent
18.4 Exercises
18.5 Solutions to exercises

Sections from prescribed book, Winston
Chapter 2, Section 2.1
Chapter 11, Section 11.7

Learning objectives
After completing this study unit you should be able to
• understand the concepts of the gradient vector of a function and the direction of steepest ascent
• determine the gradient vector of a function
• solve an unconstrained NLP of the form
\[ \text{Maximise (or minimise) } z = f(x_1; x_2; \ldots; x_n) \text{ with } (x_1; x_2; \ldots; x_n) \in \mathbb{R}^{n}, \]
using the method of steepest ascent/descent.

18.1 Introduction

As was the case for functions of one variable, it may also be difficult to solve NLP models in several variables by means of differential calculus. In such cases alternative solution techniques must be used. Three such techniques are the method of steepest ascent/descent, which will be studied in this study unit; Lagrange multipliers (Study unit 19) and Kuhn-Tucker conditions (Study unit 24).

18.2 The method of steepest ascent

Consider the following unconstrained NLP:
\[ \text{Maximise } z = f(x_1; x_2; \ldots; x_n) \text{ with } (x_1; x_2; \ldots; x_n) \in \mathbb{R}^{n}. \]
Suppose we find that it is difficult or impossible to solve this NLP by means of differential calculus. We now turn to the alternative solution technique: the method of steepest ascent.

You must study the method of steepest ascent as given in Winston, Section 11.7. Extract only the information that will enable you to understand how this method works. To understand this section, you need the concept of a vector. To refresh your memory on this, refer back to Winston, Section 2.1.

NOTATION: We use bold letters to indicate vectors, for example, $\mathbf{v}_0$. When writing, you must denote a vector by underlining the symbol, for example, $\underline{v}_0$.

The essence of the method of steepest ascent is as follows:
Starting at an arbitrarily chosen point, $\mathbf{v}_0$, construct a sequence of points, $\mathbf{v}_1, \mathbf{v}_2, \ldots$, such that, in general, each point is closer to the solution than the previous point. Continue with the construction of points until one of the points equals the solution or until the required level of accuracy is achieved.

To calculate the next point, say $\mathbf{v}_{k+1}$, from the known previous point $\mathbf{v}_k$, we proceed as follows:
If we move from $\mathbf{v}_k$ in the direction of $\nabla f(\mathbf{v}_k)$, the function value of f increases as quickly as possible. All points on this line of direction can be expressed in the form
\[ \mathbf{v}_k + t\nabla f(\mathbf{v}_k), \quad t \ge 0. \]
We want to move in this direction until we reach the point $\mathbf{v}_{k+1}$ where $f(\mathbf{v}_k + t\nabla f(\mathbf{v}_k))$ is a maximum. This is the point
\[ \mathbf{v}_{k+1} = \mathbf{v}_k + t_k\nabla f(\mathbf{v}_k), \]
where $t_k$ is the optimal solution to
\[ \text{Maximise } f(\mathbf{v}_k + t\nabla f(\mathbf{v}_k)) \text{ with } t \ge 0. \]
This is a problem in the single variable t, which can be solved by means of differential calculus.

Example 18.1
Use two iterations of the method of steepest ascent to approximate the solution to
\[ \text{Maximise } f(x_1; x_2) = x_1x_2 + 3x_2 - x_1^{2} - x_2^{2}. \]
Start at point $\mathbf{v}_0 = (0; 0)$.

Solution
The partial derivatives are
\[ \frac{\partial f}{\partial x_1} = x_2 - 2x_1; \qquad \frac{\partial f}{\partial x_2} = x_1 + 3 - 2x_2. \]
Then
\[ \nabla f(x_1; x_2) = (x_2 - 2x_1;\ x_1 + 3 - 2x_2). \]

Iteration 1
The starting point is $\mathbf{v}_0 = (x_1; x_2) = (0; 0)$. Then
\[ \nabla f(\mathbf{v}_0) = \nabla f(0; 0) = (0 - 2(0);\ 0 + 3 - 2(0)) = (0; 3). \]
There exists a $t_0 \ge 0$ such that the new point is
\[ \mathbf{v}_1 = \mathbf{v}_0 + t_0\nabla f(\mathbf{v}_0) = (0; 0) + t_0(0; 3) = (0; 3t_0). \]
Now calculate the following:
\[ f(\mathbf{v}_1) = f(0; 3t_0) = 0 + 3(3t_0) - 0 - (3t_0)^{2} = 9t_0 - 9t_0^{2}. \]
Let $g(t_0) = 9t_0 - 9t_0^{2}$. This function must be maximised.
The stationary point is determined from $g'(t_0) = 0$. Then
\[ g'(t_0) = 9 - 18t_0 = 0 \ \Rightarrow\ t_0 = 0{,}5. \]
The second derivative is
\[ g''(t_0) = -18 < 0 \ \Rightarrow\ t_0 = 0{,}5 \text{ is a local maximum.} \]
Now the next point will be
\[ \mathbf{v}_1 = (0; 3t_0) = (0; 3(0{,}5)) = (0; 1{,}5). \]
At the maximum point, all partial derivatives must be zero and so $\nabla f = \mathbf{0}$. Now since
\[ \nabla f(\mathbf{v}_1) = \nabla f(0; 1{,}5) = (1{,}5 - 2(0);\ 0 + 3 - 2(1{,}5)) = (1{,}5; 0) \ne (0; 0), \]
it follows that $\mathbf{v}_1 = (0; 1{,}5)$ is not the optimal solution and we must carry on with the method.

Iteration 2
The starting point is $\mathbf{v}_1 = (0; 1{,}5)$. Then
\[ \nabla f(\mathbf{v}_1) = \nabla f(0; 1{,}5) = (1{,}5; 0). \]
There exists a $t_1 \ge 0$ such that the new point is
\[ \mathbf{v}_2 = \mathbf{v}_1 + t_1\nabla f(\mathbf{v}_1) = (0; 1{,}5) + t_1(1{,}5; 0) = (1{,}5t_1; 1{,}5). \]
Now calculate the following:
\[ f(\mathbf{v}_2) = f(1{,}5t_1; 1{,}5) = (1{,}5t_1)(1{,}5) + 3(1{,}5) - (1{,}5t_1)^{2} - (1{,}5)^{2} \]
\[ = 2{,}25t_1 + 4{,}5 - 2{,}25t_1^{2} - 2{,}25 = 2{,}25t_1 - 2{,}25t_1^{2} + 2{,}25. \]
Let $h(t_1) = 2{,}25t_1 - 2{,}25t_1^{2} + 2{,}25$. This function must be maximised.
The stationary point is determined from $h'(t_1) = 0$. Then
\[ h'(t_1) = 2{,}25 - 4{,}5t_1 = 0 \ \Rightarrow\ t_1 = 0{,}5. \]
The second derivative is
\[ h''(t_1) = -4{,}5 < 0 \ \Rightarrow\ t_1 = 0{,}5 \text{ is a local maximum.} \]
Now the next point will be
\[ \mathbf{v}_2 = (1{,}5t_1; 1{,}5) = (1{,}5(0{,}5); 1{,}5) = (0{,}75; 1{,}5). \]
Now since
\[ \nabla f(\mathbf{v}_2) = \nabla f(0{,}75; 1{,}5) = (1{,}5 - 2(0{,}75);\ 0{,}75 + 3 - 2(1{,}5)) = (0; 0{,}75) \ne (0; 0), \]
it follows that the optimal solution has not been reached after this second iteration.

The approximate solution after the second iteration is at point $(x_1; x_2) = \mathbf{v}_2 = (0{,}75; 1{,}5)$ and the corresponding approximation of the maximum function value is
\[ f(x_1; x_2) = f(0{,}75; 1{,}5) = (0{,}75)(1{,}5) + 3(1{,}5) - (0{,}75)^{2} - (1{,}5)^{2} = 2{,}8125. \]
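The two iterations of Example 18.1 can be reproduced in a few lines. The sketch below is specific to this quadratic objective and rests on an assumption that is not part of the method as described above: because the Hessian $H$ of $f$ is constant, the one-variable problem in $t$ has the closed-form solution $t = -(\mathbf{g}\cdot\mathbf{g})/(\mathbf{g}^{T}H\mathbf{g})$ with $\mathbf{g} = \nabla f(\mathbf{v}_k)$. For a general function the line search would have to be solved separately at each step.

```python
def f(v):
    """Objective of Example 18.1."""
    x1, x2 = v
    return x1 * x2 + 3 * x2 - x1**2 - x2**2


def grad(v):
    """Gradient vector of f."""
    x1, x2 = v
    return (x2 - 2 * x1, x1 + 3 - 2 * x2)


# Constant Hessian of f (second-order partial derivatives).
H = ((-2, 1), (1, -2))


def ascent_step(v):
    """One steepest-ascent step with an exact line search (quadratic f only)."""
    g = grad(v)
    gg = g[0] ** 2 + g[1] ** 2
    if gg == 0:
        return v  # gradient is zero: v is already a stationary point
    gHg = sum(g[i] * H[i][j] * g[j] for i in range(2) for j in range(2))
    t = -gg / gHg  # maximiser of f(v + t g) over t
    return (v[0] + t * g[0], v[1] + t * g[1])


v0 = (0.0, 0.0)
v1 = ascent_step(v0)
v2 = ascent_step(v1)
print(v1, v2, f(v2))

# Continuing the iterations moves the points towards the stationary point.
v = v2
for _ in range(30):
    v = ascent_step(v)
print(v, f(v))
```

Setting $\nabla f = \mathbf{0}$ directly gives the stationary point $(1; 2)$ with $f(1; 2) = 3$, which is where the further iterations settle.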

18.3 The method of steepest descent

The method of steepest descent is used to solve unconstrained minimisation NLPs in several variables. Here we move in the direction of $-\nabla f(\mathbf{v}_k)$, that is, against the gradient, so that the function value decreases as quickly as possible. We search for a point
\[ \mathbf{v}_{k+1} = \mathbf{v}_k - t_k\nabla f(\mathbf{v}_k), \]
where $t_k$ is the optimal solution to the following one-dimensional problem:
\[ \text{Minimise } f(\mathbf{v}_k - t\nabla f(\mathbf{v}_k)) \text{ with } t \ge 0. \]

REMARK: The method of steepest ascent/descent may be modified to solve NLPs with linear constraints. This method is known as the method of feasible directions, but does NOT form part of the study material for this module.
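As an illustration of steepest descent, consider again the relay station function of Example 17.7, whose gradient is $(6x - 40;\ 6y - 40)$. The sketch below is not taken from the prescribed book: it steps against the gradient from the arbitrarily chosen starting point $(0; 0)$, and it uses the fact that for this particular function the one-variable minimisation in $t$ gives $t = \tfrac{1}{6}$ at every point, because the derivative of $f(\mathbf{v} - t\mathbf{g})$ with respect to $t$ is $-(\mathbf{g}\cdot\mathbf{g})(1 - 6t)$.

```python
from fractions import Fraction


def grad(v):
    """Gradient of the sum-of-squares function of Example 17.7."""
    x, y = v
    return (6 * x - 40, 6 * y - 40)


def descent_step(v):
    """One steepest-descent step: move against the gradient."""
    g = grad(v)
    if g == (0, 0):
        return v  # already at a stationary point
    t = Fraction(1, 6)  # exact line-search step for this function only
    return (v[0] - t * g[0], v[1] - t * g[1])


v0 = (0, 0)
v1 = descent_step(v0)
print(v1, grad(v1))
```

A single step reaches the stationary point $(\tfrac{20}{3}; \tfrac{20}{3})$ here because the Hessian is a multiple of the identity matrix; in general several iterations are needed, as in Example 18.1.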

18.4 Exercises

1. Use one iteration of the method of steepest ascent to approximate the solution to
\[ \text{Maximise } z = 4x_1 + 6x_2 - 2x_1^{2} - 2x_1x_2 - 2x_2^{2}. \]
Start at point $\mathbf{v}_0 = (1; 1)$.

18.5 Solutions to exercises

1. Set $z = f(x_1; x_2) = 4x_1 + 6x_2 - 2x_1^{2} - 2x_1x_2 - 2x_2^{2}$. The derivatives are
\[ \frac{\partial f}{\partial x_1} = 4 - 4x_1 - 2x_2; \qquad \frac{\partial f}{\partial x_2} = 6 - 2x_1 - 4x_2. \]
Then
\[ \nabla f(x_1; x_2) = (4 - 4x_1 - 2x_2;\ 6 - 2x_1 - 4x_2). \]

Iteration 1
The starting point is $\mathbf{v}_0 = (x_1; x_2) = (1; 1)$. Then
\[ \nabla f(\mathbf{v}_0) = \nabla f(1; 1) = (4 - 4 - 2;\ 6 - 2 - 4) = (-2; 0). \]
There exists a $t_0 \ge 0$ such that the new point is
\[ \mathbf{v}_1 = \mathbf{v}_0 + t_0\nabla f(\mathbf{v}_0) = (1; 1) + t_0(-2; 0) = (1 - 2t_0; 1). \]
Now calculate the following:
\[ f(\mathbf{v}_1) = f(1 - 2t_0; 1) = 4(1 - 2t_0) + 6(1) - 2(1 - 2t_0)^{2} - 2(1 - 2t_0)(1) - 2(1)^{2} = 4 + 4t_0 - 8t_0^{2}. \]
Let $g(t_0) = 4 + 4t_0 - 8t_0^{2}$. This function must be maximised.
The stationary point is determined from $g'(t_0) = 0$. Then
\[ g'(t_0) = 4 - 16t_0 = 0 \ \Rightarrow\ t_0 = 0{,}25. \]
The second derivative is
\[ g''(t_0) = -16 < 0 \ \Rightarrow\ t_0 = 0{,}25 \text{ is a local maximum.} \]
Now the next point will be
\[ \mathbf{v}_1 = (1 - 2t_0; 1) = (1 - 2(0{,}25); 1) = (0{,}5; 1). \]
Now since
\[ \nabla f(\mathbf{v}_1) = \nabla f(0{,}5; 1) = (4 - 4(0{,}5) - 2;\ 6 - 2(0{,}5) - 4) = (0; 1) \ne (0; 0), \]
it follows that the optimal solution has not been reached.

The approximate solution after the first iteration is at point $(x_1; x_2) = \mathbf{v}_1 = (0{,}5; 1)$ and the corresponding approximation of the maximum function value is
\[ z = f(0{,}5; 1) = 4(0{,}5) + 6(1) - 2(0{,}5)^{2} - 2(0{,}5)(1) - 2(1)^{2} = 4{,}5. \]

Chapter 19

Lagrange multipliers

Contents
19.1 The method of Lagrange multipliers
19.2 Verification
19.3 Exercises
19.4 Solutions to exercises

Sections from prescribed book, Winston Chapter 11, Section 11.8

Learning objectives

After completing this study unit you should be able to
• give the Lagrangian for an NLP with equality constraints
• calculate a solution to an NLP with equality constraints using Lagrange multipliers
• verify whether the solution obtained is the optimal solution to the NLP.

19.1 The method of Lagrange multipliers

Consider the following NLP:

\[
\begin{array}{ll}
\text{Maximise (or minimise)} & z = f(x_1; x_2; \ldots; x_n) \\
\text{subject to} & g_1(x_1; x_2; \ldots; x_n) = b_1 \\
 & g_2(x_1; x_2; \ldots; x_n) = b_2 \\
 & \qquad \vdots \\
 & g_m(x_1; x_2; \ldots; x_n) = b_m.
\end{array}
\]

Lagrange multipliers may be used to solve such NLP models with equality constraints. Winston, Section 11.8 gives a description of this method, as well as examples. Refer to Tutorial Letter 101 for the exact page references.

The essence of the method of Lagrange multipliers is as follows: An NLP of the form given above, referred to as the equality NLP, is rewritten as a model consisting of a single unconstrained function

\[ \text{Maximise (or minimise)}\quad L(x_1; x_2; \ldots; x_n; \lambda_1; \lambda_2; \ldots; \lambda_m) = f(x_1; x_2; \ldots; x_n) + \sum_{i=1}^{m} \lambda_i \bigl(b_i - g_i(x_1; x_2; \ldots; x_n)\bigr), \]

referred to as the Lagrangian NLP. The latter model, which is an easier model, is then solved. The solution to the Lagrangian NLP is often also the optimal solution to the equality NLP. Students may be asked to verify whether the solution to the Lagrangian NLP is the optimal solution to the equality NLP, and this is explained in Section 19.2.

Example 19.1

Use the method of Lagrange multipliers to solve the following NLP:

Minimise $f(x; y) = 2x^2 + 3y^2 + x - 9y + 16$
subject to $x + y = 5$.

Solution

The Lagrangian is

\[ L(x; y; \lambda) = 2x^2 + 3y^2 + x - 9y + 16 + \lambda(5 - x - y). \]

To determine the stationary points of $L$, set

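\[ \frac{\partial L}{\partial x} = \frac{\partial L}{\partial y} = \frac{\partial L}{\partial \lambda} = 0. \]

Then

\[
\begin{aligned}
\frac{\partial L}{\partial x} &= 4x + 1 - \lambda = 0 &&\Rightarrow\ x = \frac{\lambda - 1}{4} &&(1) \\
\frac{\partial L}{\partial y} &= 6y - 9 - \lambda = 0 &&\Rightarrow\ y = \frac{\lambda + 9}{6} &&(2) \\
\frac{\partial L}{\partial \lambda} &= 5 - x - y = 0 &&\Rightarrow\ x + y = 5. &&(3)
\end{aligned}
\]

Substitute (1) and (2) into (3), then

\[
\begin{aligned}
\frac{\lambda - 1}{4} + \frac{\lambda + 9}{6} &= 5 \\
3\lambda - 3 + 2\lambda + 18 &= 60 \\
5\lambda &= 45 \\
\lambda &= 9.
\end{aligned}
\]

Now $x = \frac{9 - 1}{4} = 2$ and $y = \frac{9 + 9}{6} = 3$.

The point $(x; y; \lambda) = (2; 3; 9)$ is a stationary point of $L$. We assume here that $(x; y) = (2; 3)$ is the optimal solution to the given NLP. The minimum function value is

\[ f(2; 3) = 2(2)^2 + 3(3)^2 + 2 - 9(3) + 16 = 26. \]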
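Because the objective function in Example 19.1 is quadratic and the constraint is linear, the three stationarity conditions form a linear system in $x$, $y$ and $\lambda$. The sketch below solves that system with exact fractions as a cross-check of the hand calculation. The helper `solve_linear` is a hypothetical name written for this illustration (a plain Gauss-Jordan elimination), and it assumes that the system has a unique solution.

```python
from fractions import Fraction

def solve_linear(a, b):
    """Solve the square system a * x = b by Gauss-Jordan elimination.
    Assumes a unique solution exists (raises StopIteration otherwise)."""
    n = len(b)
    m = [[Fraction(v) for v in row] + [Fraction(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[pivot] = m[pivot], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [row[-1] for row in m]

# Stationarity conditions of L(x; y; lambda), unknowns ordered (x, y, lambda):
#   4x       - lambda = -1
#        6y  - lambda =  9
#    x +  y           =  5
x, y, lam = solve_linear([[4, 0, -1], [0, 6, -1], [1, 1, 0]], [-1, 9, 5])

def f(x, y):
    return 2 * x ** 2 + 3 * y ** 2 + x - 9 * y + 16

print(x, y, lam, f(x, y))   # prints: 2 3 9 26
```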

19.2 Verification

It is important to verify whether the solution to the Lagrangian NLP is the optimal solution to the equality NLP. Two theorems for this purpose are given in Winston, Section 11.8. We illustrate the use of these theorems by verifying the solution to Example 19.1.

We must determine whether $(x; y) = (2; 3)$ is the optimal solution. To do this, we must firstly determine the convexity/concavity of the objective function $f(x; y) = 2x^2 + 3y^2 + x - 9y + 16$. The second-order partial derivatives are

\[ \frac{\partial^2 f}{\partial x^2} = 4; \qquad \frac{\partial^2 f}{\partial y\,\partial x} = 0; \qquad \frac{\partial^2 f}{\partial y^2} = 6; \qquad \frac{\partial^2 f}{\partial x\,\partial y} = 0. \]

The Hessian matrix is

\[ H(x; y) = \begin{bmatrix} 4 & 0 \\ 0 & 6 \end{bmatrix}. \]

The first principal minors are 4 and 6; both positive. The second principal minor is 24, also positive. All the principal minors are > 0. Therefore, the function $f(x; y)$ is a convex function. Also, the constraint $x + y = 5$ is linear. We can therefore conclude that the point $(x; y) = (2; 3)$, with $f(x; y) = 26$, is the optimal solution.

In some cases we cannot conclude that the solution to the Lagrangian NLP is the optimal solution to the equality NLP. In these cases, the equality NLP must be solved using alternative methods.
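The principal-minor test used in this verification can be sketched as a small routine. This is an illustration under stated assumptions: the names `det`, `principal_minors` and `classify` are hypothetical, and the routine only applies to a Hessian with constant entries (as for a quadratic function). When the entries depend on the variables, the sign conditions must hold at every point of interest and have to be argued by hand.

```python
from itertools import combinations

def det(m):
    """Determinant by cofactor expansion along the first row (small matrices only)."""
    if len(m) == 1:
        return m[0][0]
    total = 0
    for j, entry in enumerate(m[0]):
        minor = [row[:j] + row[j + 1:] for row in m[1:]]
        total += (-1) ** j * entry * det(minor)
    return total

def principal_minors(h):
    """Return {k: list of all k-th principal minors} of the square matrix h."""
    n = len(h)
    return {k: [det([[h[i][j] for j in idx] for i in idx])
                for idx in combinations(range(n), k)]
            for k in range(1, n + 1)}

def classify(h):
    minors = principal_minors(h)
    # Convex: all principal minors are nonnegative.
    convex = all(v >= 0 for vals in minors.values() for v in vals)
    # Concave: every nonzero k-th principal minor has the sign of (-1)^k.
    concave = all(v == 0 or (v > 0) == (k % 2 == 0)
                  for k, vals in minors.items() for v in vals)
    if convex and concave:
        return "convex and concave"
    if convex:
        return "convex"
    if concave:
        return "concave"
    return "neither"

print(principal_minors([[4, 0], [0, 6]]))   # prints: {1: [4, 6], 2: [24]}
print(classify([[4, 0], [0, 6]]))           # prints: convex
```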


19.3 Exercises

1. Suppose it costs R2 to purchase an hour of labour and R1 to purchase a unit of capital. If $L$ hours of labour and $K$ units of capital are purchased, then $L^{2/3}K^{1/3}$ machines can be produced. What is the maximum number of machines that can be produced if R10 is available for the purchase of labour and capital? (This problem is from Winston, Section 11.8.)

2. Suppose the problem in Exercise 1 is altered so that we must now find the minimum cost of producing six machines.
(a) Formulate an NLP model for this problem. Do NOT solve it.
(b) Suppose the solution $(L; K) = (6; 6)$ is found by means of the method of Lagrange multipliers. (You may check this solution if you like!) Is this the optimal solution to the NLP model formulated in (a)?

19.4 Solutions to exercises

1. The decision variables are

$L$ = the number of labour hours purchased,
$K$ = the number of capital units purchased.

To determine the maximum number of machines that can be produced if R10 is available to purchase both labour and capital, the following NLP model must be solved:

Maximise $f(L; K) = L^{2/3}K^{1/3}$
subject to $2L + K = 10$
and $L; K \ge 0$.

Since $L$ is one of the decision variables, we use $F$ to denote the Lagrangian. Then

\[ F(L; K; \lambda) = L^{2/3}K^{1/3} + \lambda(10 - 2L - K). \]

The stationary points of $F$ are determined as follows:

\[
\begin{aligned}
\frac{\partial F}{\partial L} &= \tfrac{2}{3}L^{-1/3}K^{1/3} - 2\lambda = 0 &&\Rightarrow\ \lambda = \tfrac{1}{3}L^{-1/3}K^{1/3} &&(1) \\
\frac{\partial F}{\partial K} &= \tfrac{1}{3}L^{2/3}K^{-2/3} - \lambda = 0 &&\Rightarrow\ \lambda = \tfrac{1}{3}L^{2/3}K^{-2/3} &&(2) \\
\frac{\partial F}{\partial \lambda} &= 10 - 2L - K = 0 &&\Rightarrow\ 2L + K = 10. &&(3)
\end{aligned}
\]

Equate (1) and (2). Then

\[
\begin{aligned}
\tfrac{1}{3}L^{-1/3}K^{1/3} &= \tfrac{1}{3}L^{2/3}K^{-2/3} \\
\frac{K^{1/3}}{L^{1/3}} &= \frac{L^{2/3}}{K^{2/3}} \\
L &= K. \qquad (4)
\end{aligned}
\]

Substitute (4) into (3). Then

\[
\begin{aligned}
2L + K &= 10 \\
2L + L &= 10 \\
L &= \tfrac{10}{3}.
\end{aligned}
\]

Substitute (4) into (1). Then

\[ \lambda = \tfrac{1}{3}L^{-1/3}K^{1/3} = \tfrac{1}{3}L^{-1/3}L^{1/3} = \tfrac{1}{3}. \]

Now $(L; K; \lambda) = (\tfrac{10}{3}; \tfrac{10}{3}; \tfrac{1}{3})$ is a stationary point of $F$. We must determine whether $(L; K) = (\tfrac{10}{3}; \tfrac{10}{3})$ is the optimal solution to the NLP.

To do this we must firstly determine the convexity/concavity of the objective function $f(L; K) = L^{2/3}K^{1/3}$. The second-order partial derivatives are

\[
\begin{aligned}
\frac{\partial^2 f}{\partial L^2} &= \frac{\partial}{\partial L}\Bigl(\tfrac{2}{3}L^{-1/3}K^{1/3}\Bigr) = -\tfrac{2}{9}L^{-4/3}K^{1/3} \\
\frac{\partial^2 f}{\partial K\,\partial L} &= \frac{\partial}{\partial K}\Bigl(\tfrac{2}{3}L^{-1/3}K^{1/3}\Bigr) = \tfrac{2}{9}L^{-1/3}K^{-2/3} \\
\frac{\partial^2 f}{\partial K^2} &= \frac{\partial}{\partial K}\Bigl(\tfrac{1}{3}L^{2/3}K^{-2/3}\Bigr) = -\tfrac{2}{9}L^{2/3}K^{-5/3} \\
\frac{\partial^2 f}{\partial L\,\partial K} &= \frac{\partial}{\partial L}\Bigl(\tfrac{1}{3}L^{2/3}K^{-2/3}\Bigr) = \tfrac{2}{9}L^{-1/3}K^{-2/3}.
\end{aligned}
\]

The Hessian matrix is

\[ H(L; K) = \begin{bmatrix} -\tfrac{2}{9}L^{-4/3}K^{1/3} & \tfrac{2}{9}L^{-1/3}K^{-2/3} \\ \tfrac{2}{9}L^{-1/3}K^{-2/3} & -\tfrac{2}{9}L^{2/3}K^{-5/3} \end{bmatrix}. \]

The first principal minors are $-\tfrac{2}{9}L^{-4/3}K^{1/3}$ and $-\tfrac{2}{9}L^{2/3}K^{-5/3}$. Since $L; K > 0$, both first principal minors are negative, and have the same sign as $(-1)^1 = -1$. The second principal minor is 0. Therefore, all nonzero principal minors have the same sign as $(-1)^k$ (here $k = 1$) and it follows that $f(L; K)$ is concave. Also, the constraint $2L + K = 10$ is linear. Therefore, the point $(L; K) = (\tfrac{10}{3}; \tfrac{10}{3})$ is an optimal solution to the original NLP.

The maximum number of machines will be produced by purchasing $\tfrac{10}{3}$ labour hours and $\tfrac{10}{3}$ capital units, and therefore spending $2 \times \tfrac{10}{3} \approx$ R6,67 to purchase labour hours and $1 \times \tfrac{10}{3} \approx$ R3,33 to purchase capital units. The maximum number of machines produced will be

\[ f(\tfrac{10}{3}; \tfrac{10}{3}) = (\tfrac{10}{3})^{2/3}(\tfrac{10}{3})^{1/3} = \tfrac{10}{3} \approx 3{,}33. \]

(Interpret this as three machines being completed, as well as a third of the fourth machine.)

2. (a) The decision variables are as defined in Exercise 1. The NLP model is

Minimise $f(L; K) = 2L + K$
subject to $L^{2/3}K^{1/3} = 6$
and $L; K \ge 0$.

(b) Since the objective function, $f(L; K) = 2L + K$, is linear, it is a convex function. However, the constraint $L^{2/3}K^{1/3} = 6$ is not linear. Therefore, we cannot deduce that the point $(L; K) = (6; 6)$ is an optimal solution to the NLP. We may say that a solution, not necessarily optimal, is as follows: Purchase six hours of labour and six capital units and then the cost of producing six machines will be $2 \times 6 + 1 \times 6 =$ R18.
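The stationary points quoted in both solutions can be checked numerically. The sketch below substitutes each point into the corresponding stationarity conditions and reports the residuals, which should be zero up to rounding. The multiplier name `mu` for Exercise 2 and the helper names are assumptions made for this illustration. Note that such a check only confirms that a point is stationary; it says nothing about optimality, which is exactly the distinction drawn in Exercise 2(b).

```python
def machines(L, K):
    return L ** (2 / 3) * K ** (1 / 3)

def grad_machines(L, K):
    # Partial derivatives of L^(2/3) K^(1/3) with respect to L and K
    return ((2 / 3) * L ** (-1 / 3) * K ** (1 / 3),
            (1 / 3) * L ** (2 / 3) * K ** (-2 / 3))

# Exercise 1: F = L^(2/3) K^(1/3) + lam * (10 - 2L - K)
L1 = K1 = 10 / 3
lam1 = 1 / 3
dL, dK = grad_machines(L1, K1)
ex1_residuals = (dL - 2 * lam1, dK - lam1, 10 - 2 * L1 - K1)

# Exercise 2: Lagrangian 2L + K + mu * (6 - L^(2/3) K^(1/3)), with mu hypothetical
L2 = K2 = 6.0
gL, gK = grad_machines(L2, K2)
mu = 2 / gL                       # from the condition 2 - mu * gL = 0
ex2_residuals = (2 - mu * gL, 1 - mu * gK, 6 - machines(L2, K2))
cost = 2 * L2 + 1 * K2

print(ex1_residuals, machines(L1, K1))
print(ex2_residuals, cost)
```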


Chapter 20 Kuhn-Tucker conditions

Contents
20.1 Kuhn-Tucker conditions
20.2 Exercises
20.3 Solutions to exercises

Sections from prescribed book, Winston Chapter 11, Section 11.9

Learning objectives

After completing this study unit you should be able to
• rewrite ≥ and = constraints as ≤ constraints
• give the Kuhn-Tucker necessary conditions for an NLP
• verify whether a point, say $(\bar{x}_1; \bar{x}_2; \ldots; \bar{x}_n)$, satisfies the Kuhn-Tucker necessary conditions
• calculate a point satisfying the Kuhn-Tucker necessary conditions (consider all allowable possibilities of the multipliers $\lambda_i$)
• verify whether a point, say $(\bar{x}_1; \bar{x}_2; \ldots; \bar{x}_n)$, satisfies the Kuhn-Tucker sufficient conditions
• solve an NLP using the Kuhn-Tucker conditions.

20.1 Kuhn-Tucker conditions

Consider the following NLP:

\[
\begin{array}{ll}
\text{Maximise (or minimise)} & z = f(x_1; x_2; \ldots; x_n) \\
\text{subject to} & g_1(x_1; x_2; \ldots; x_n) \le b_1 \\
 & g_2(x_1; x_2; \ldots; x_n) \le b_2 \\
 & \qquad \vdots \\
 & g_m(x_1; x_2; \ldots; x_n) \le b_m.
\end{array}
\]

Kuhn-Tucker conditions may be used to solve such NLP models with less-than-or-equal-to constraints (≤ constraints). Constraints of the types ≥ or = must be rewritten as ≤ constraints before Kuhn-Tucker conditions can be applied to the NLP model. This means that a constraint of the form

\[ h(x_1; x_2; \ldots; x_n) \ge b \]

must be rewritten as

\[ -h(x_1; x_2; \ldots; x_n) \le -b. \]

And a constraint of the form

\[ h(x_1; x_2; \ldots; x_n) = b \]

must be rewritten as

\[
\begin{aligned}
h(x_1; x_2; \ldots; x_n) &\le b, \\
-h(x_1; x_2; \ldots; x_n) &\le -b.
\end{aligned}
\]

The nonnegativity restrictions on variables must also be rewritten as ≤ constraints. For example, $x_i \ge 0$ must be rewritten as $-x_i \le 0$. These are then included as constraints in the NLP model.

Kuhn-Tucker conditions are grouped as
• Kuhn-Tucker necessary conditions
• Kuhn-Tucker sufficient conditions.

This is discussed in Winston, Section 11.9 and you must study the relevant parts. Refer to Tutorial Letter 101 for the exact page references.

The Kuhn-Tucker necessary conditions are given in two theorems, and we conclude the following: If a point, say $\bar{x} = (\bar{x}_1; \bar{x}_2; \ldots; \bar{x}_n)$, satisfies the necessary conditions, it implies that $\bar{x}$ may be the optimal solution to the NLP. (The possibility must be examined.)

The Kuhn-Tucker sufficient conditions are also given in two theorems, and we conclude the following: If a point, say $\bar{x} = (\bar{x}_1; \bar{x}_2; \ldots; \bar{x}_n)$, satisfies the sufficient conditions, it implies that $\bar{x}$ is definitely the optimal solution to the NLP.

We now illustrate the use of Kuhn-Tucker conditions by means of an example.
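For reference, the necessary conditions applied in this study unit can be summarised in the notation of this section, with every constraint (including the nonnegativity restrictions) already written in ≤ form. This is a summary sketch only, consistent with the worked example and exercise in this unit; Winston, Section 11.9 remains the authoritative statement, including the regularity assumptions on the constraints.

```latex
% Maximisation problem with constraints g_i(x) <= b_i, i = 1, ..., m.
% A feasible point \bar{x} satisfies the necessary conditions if there exist
% multipliers \bar{\lambda}_1, ..., \bar{\lambda}_m such that
\begin{align*}
\frac{\partial f(\bar{x})}{\partial x_j}
  - \sum_{i=1}^{m} \bar{\lambda}_i \frac{\partial g_i(\bar{x})}{\partial x_j} &= 0
  && (j = 1; 2; \ldots; n) \\
\bar{\lambda}_i \bigl[ b_i - g_i(\bar{x}) \bigr] &= 0
  && (i = 1; 2; \ldots; m) \\
\bar{\lambda}_i &\ge 0
  && (i = 1; 2; \ldots; m).
\end{align*}
% Minimisation problem: only the first condition changes, to
\begin{align*}
\frac{\partial f(\bar{x})}{\partial x_j}
  + \sum_{i=1}^{m} \bar{\lambda}_i \frac{\partial g_i(\bar{x})}{\partial x_j} &= 0
  && (j = 1; 2; \ldots; n).
\end{align*}
```

The second condition is the complementary slackness condition: a multiplier can be positive only when its constraint is binding. The sufficient conditions add requirements on the shape of the functions: for a maximisation problem $f$ must be concave, for a minimisation problem $f$ must be convex, and in both cases every $g_i$ must be convex.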

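Example 20.1

Consider the following NLP model:

Maximise $z = 24x_1 - x_1^2 + 10x_2 - x_2^2$
subject to $x_1 \le 8$
$x_2 \le 7$
and $x_1; x_2 \ge 0$.

Use Kuhn-Tucker conditions to find the optimal solution.

Solution

Set $f(x_1; x_2) = z = 24x_1 - x_1^2 + 10x_2 - x_2^2$. All constraints must be ≤ constraints. The model can be rewritten as

\[
\begin{array}{lrcl}
\text{Maximise} & f(x_1; x_2) &=& 24x_1 - x_1^2 + 10x_2 - x_2^2 \\
\text{subject to} & g_1(x_1; x_2) = x_1 &\le& 8 \\
 & g_2(x_1; x_2) = x_2 &\le& 7 \\
 & g_3(x_1; x_2) = -x_1 &\le& 0 \\
 & g_4(x_1; x_2) = -x_2 &\le& 0.
\end{array}
\]

The first-order partial derivatives are

\[
\begin{array}{ll}
\dfrac{\partial f}{\partial x_1} = 24 - 2x_1; & \dfrac{\partial f}{\partial x_2} = 10 - 2x_2; \\[2ex]
\dfrac{\partial g_1}{\partial x_1} = 1; & \dfrac{\partial g_1}{\partial x_2} = 0; \\[2ex]
\dfrac{\partial g_2}{\partial x_1} = 0; & \dfrac{\partial g_2}{\partial x_2} = 1; \\[2ex]
\dfrac{\partial g_3}{\partial x_1} = -1; & \dfrac{\partial g_3}{\partial x_2} = 0; \\[2ex]
\dfrac{\partial g_4}{\partial x_1} = 0; & \dfrac{\partial g_4}{\partial x_2} = -1.
\end{array}
\]

The Kuhn-Tucker necessary conditions are

\[
\begin{aligned}
24 - 2x_1 - [\lambda_1(1) + \lambda_2(0) + \lambda_3(-1) + \lambda_4(0)] &= 0 \\
10 - 2x_2 - [\lambda_1(0) + \lambda_2(1) + \lambda_3(0) + \lambda_4(-1)] &= 0 \\
\lambda_1(8 - x_1) &= 0 \\
\lambda_2(7 - x_2) &= 0 \\
\lambda_3(0 + x_1) &= 0 \\
\lambda_4(0 + x_2) &= 0 \\
\lambda_1; \lambda_2; \lambda_3; \lambda_4 &\ge 0.
\end{aligned}
\]

These are simplified and rewritten as

\[
\begin{aligned}
24 - 2x_1 - \lambda_1 + \lambda_3 &= 0 &&(1) \\
10 - 2x_2 - \lambda_2 + \lambda_4 &= 0 &&(2) \\
\lambda_1(8 - x_1) &= 0 &&(3) \\
\lambda_2(7 - x_2) &= 0 &&(4) \\
\lambda_3 x_1 &= 0 &&(5) \\
\lambda_4 x_2 &= 0 &&(6) \\
\lambda_1; \lambda_2; \lambda_3; \lambda_4 &\ge 0.
\end{aligned}
\]

Since we are interested in a nontrivial solution ($x_1 \ne 0$ and $x_2 \ne 0$), the last two equations imply that $\lambda_3 = \lambda_4 = 0$. Now the system consists of four equations in four unknowns.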

Consider the following cases:

Case 1: Let $\lambda_1 = \lambda_2 = 0$, then

(1) ⇒ $24 - 2x_1 = 0$ ⇒ $x_1 = 12$.

This violates the constraint $g_1(x_1; x_2) = x_1 \le 8$. Therefore, this case is not acceptable.

Case 2: Let $\lambda_1 = 0$; $\lambda_2 > 0$, then

(1) ⇒ $x_1 = 12$.

As in case 1 above, this case is not acceptable.

Case 3: Let $\lambda_1 > 0$; $\lambda_2 = 0$, then

(3) ⇒ $8 - x_1 = 0$ ⇒ $x_1 = 8$
(1) ⇒ $24 - 2(8) - \lambda_1 = 0$ ⇒ $\lambda_1 = 8$
(2) ⇒ $10 - 2x_2 = 0$ ⇒ $x_2 = 5$.

Case 4: Let $\lambda_1 > 0$; $\lambda_2 > 0$, then

(3) ⇒ $x_1 = 8$
(4) ⇒ $x_2 = 7$
(1) ⇒ $24 - 2(8) - \lambda_1 = 0$ ⇒ $\lambda_1 = 8$
(2) ⇒ $10 - 2(7) - \lambda_2 = 0$ ⇒ $\lambda_2 = -4$.

The multiplier $\lambda_2 = -4$ violates the condition $\lambda_2 \ge 0$. Therefore, this case is not acceptable.

We have found nonnegative multipliers, $\lambda_1 = 8$; $\lambda_2 = \lambda_3 = \lambda_4 = 0$, for which the Kuhn-Tucker necessary conditions hold at the point $(x_1; x_2) = (8; 5)$. We must now check whether this point is an optimal solution to the NLP model. The second-order partial derivatives are

\[ \frac{\partial^2 f}{\partial x_1^2} = -2; \qquad \frac{\partial^2 f}{\partial x_2\,\partial x_1} = 0; \qquad \frac{\partial^2 f}{\partial x_2^2} = -2; \qquad \frac{\partial^2 f}{\partial x_1\,\partial x_2} = 0. \]

The Hessian matrix is

\[ H(x_1; x_2) = \begin{bmatrix} -2 & 0 \\ 0 & -2 \end{bmatrix}. \]

The first principal minors are $-2$ and $-2$; both with the same sign as $(-1)^1 = -1$. The second principal minor is $(-2)(-2) = 4 > 0$, which has the same sign as $(-1)^2 = +1$. Therefore, the function $f(x_1; x_2)$ is a concave function.

All four of the constraints are linear. Therefore, the constraints $g_i(x_1; x_2)$, $i = 1; 2; 3; 4$, are convex functions. This means that the Kuhn-Tucker sufficient conditions are met. We can conclude that $(x_1; x_2) = (8; 5)$ is an optimal solution to the given NLP model. The maximum function value is

\[ f(8; 5) = 24(8) - 8^2 + 10(5) - 5^2 = 153. \]
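The four cases considered in Example 20.1 can be enumerated mechanically. The sketch below is specific to this example and is not a general Kuhn-Tucker solver: the function name `solve_case` is hypothetical, and it takes $\lambda_3 = \lambda_4 = 0$ as in the solution. For each sign pattern of $(\lambda_1; \lambda_2)$ it solves conditions (1) to (4) and then rejects any case that is infeasible or has a negative multiplier.

```python
from itertools import product

def solve_case(lam1_positive, lam2_positive):
    """Solve conditions (1)-(4) of Example 20.1 for one sign pattern,
    with lambda3 = lambda4 = 0."""
    # A positive multiplier forces its constraint to be binding, by (3) or (4);
    # a zero multiplier leaves x determined by stationarity, (1) or (2).
    x1 = 8 if lam1_positive else 24 / 2
    x2 = 7 if lam2_positive else 10 / 2
    lam1 = 24 - 2 * x1 if lam1_positive else 0   # from (1)
    lam2 = 10 - 2 * x2 if lam2_positive else 0   # from (2)
    feasible = 0 <= x1 <= 8 and 0 <= x2 <= 7
    multipliers_ok = lam1 >= 0 and lam2 >= 0
    return (x1, x2), (lam1, lam2), feasible and multipliers_ok

results = {case: solve_case(*case) for case in product([False, True], repeat=2)}
accepted = [(x, lam) for x, lam, ok in results.values() if ok]

def f(x1, x2):
    return 24 * x1 - x1 ** 2 + 10 * x2 - x2 ** 2

for x, lam in accepted:
    print(x, lam, f(*x))
```

Only the pattern $\lambda_1 > 0$; $\lambda_2 = 0$ survives, which corresponds to Case 3.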

You should also study the examples given in Winston, Section 11.9.

20.2 Exercises

1. Consider the following NLP model:

Minimise $Z = (x_1 - 1)^2 + (x_2 - 1)^2$
subject to $x_1 - 2x_2 \ge -2$
$x_1 + 2x_2 \le 10$
$x_1 - x_2 \le 4$
$x_1 \ge 2$
$x_2 \ge 0$.

The optimal solution to this model is at the point $(x_1; x_2) = (2; 1)$.
(a) Graph the feasible area and indicate the optimal point clearly on the graph. Which constraints are binding and which are nonbinding?
(b) Use the Kuhn-Tucker conditions to verify that the solution $(x_1; x_2) = (2; 1)$ is indeed the optimal solution.

20.3 Solutions to exercises

1. (a) The graphical representation of the feasible area is given in Figure 20.1. Only the fourth constraint, $x_1 \ge 2$, is binding. The other constraints are nonbinding.

[Figure 20.1: Study Unit 20, Exercise 1 – Feasible area. Horizontal axis $x_1$, vertical axis $x_2$; boundary lines $x_1 - 2x_2 = -2$, $x_1 + 2x_2 = 10$, $x_1 - x_2 = 4$ and $x_1 = 2$; the optimal point $(2; 1)$ is marked.]
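The claim about binding and nonbinding constraints can also be checked without the graph by computing the slack of each constraint at the point $(2; 1)$. In the sketch below the constraints are written in ≤ form, with the labels `g1` to `g5` chosen for this illustration; a slack of zero means that the constraint is binding.

```python
# Each constraint in <= form: (coefficients of (x1, x2), right-hand side)
constraints = {
    "g1: -x1 + 2x2 <= 2": ((-1, 2), 2),
    "g2: x1 + 2x2 <= 10": ((1, 2), 10),
    "g3: x1 - x2 <= 4": ((1, -1), 4),
    "g4: -x1 <= -2": ((-1, 0), -2),
    "g5: -x2 <= 0": ((0, -1), 0),
}
point = (2, 1)

# Slack = right-hand side minus left-hand side; negative slack would mean infeasible
slack = {name: rhs - (a1 * point[0] + a2 * point[1])
         for name, ((a1, a2), rhs) in constraints.items()}
binding = [name for name, s in slack.items() if s == 0]

print(slack)
print(binding)   # prints: ['g4: -x1 <= -2']
```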

(b) All the constraints must be ≤ constraints. The NLP model can then be rewritten as

\[
\begin{array}{lrcl}
\text{Minimise} & Z = f(x_1; x_2) &=& (x_1 - 1)^2 + (x_2 - 1)^2 \\
\text{subject to} & g_1(x_1; x_2) = -x_1 + 2x_2 &\le& 2 \\
 & g_2(x_1; x_2) = x_1 + 2x_2 &\le& 10 \\
 & g_3(x_1; x_2) = x_1 - x_2 &\le& 4 \\
 & g_4(x_1; x_2) = -x_1 &\le& -2 \\
 & g_5(x_1; x_2) = -x_2 &\le& 0.
\end{array}
\]

The first-order partial derivatives are

\[
\begin{array}{ll}
\dfrac{\partial f}{\partial x_1} = 2x_1 - 2; & \dfrac{\partial f}{\partial x_2} = 2x_2 - 2; \\[2ex]
\dfrac{\partial g_1}{\partial x_1} = -1; & \dfrac{\partial g_1}{\partial x_2} = 2; \\[2ex]
\dfrac{\partial g_2}{\partial x_1} = 1; & \dfrac{\partial g_2}{\partial x_2} = 2; \\[2ex]
\dfrac{\partial g_3}{\partial x_1} = 1; & \dfrac{\partial g_3}{\partial x_2} = -1; \\[2ex]
\dfrac{\partial g_4}{\partial x_1} = -1; & \dfrac{\partial g_4}{\partial x_2} = 0; \\[2ex]
\dfrac{\partial g_5}{\partial x_1} = 0; & \dfrac{\partial g_5}{\partial x_2} = -1.
\end{array}
\]

The Kuhn-Tucker necessary conditions are

\[
\begin{aligned}
2x_1 - 2 - \lambda_1 + \lambda_2 + \lambda_3 - \lambda_4 + 0\lambda_5 &= 0 &&(1) \\
2x_2 - 2 + 2\lambda_1 + 2\lambda_2 - \lambda_3 + 0\lambda_4 - \lambda_5 &= 0 &&(2) \\
\lambda_1(2 + x_1 - 2x_2) &= 0 &&(3) \\
\lambda_2(10 - x_1 - 2x_2) &= 0 &&(4) \\
\lambda_3(4 - x_1 + x_2) &= 0 &&(5) \\
\lambda_4(-2 + x_1) &= 0 &&(6) \\
\lambda_5(0 + x_2) &= 0 &&(7) \\
\lambda_1; \lambda_2; \lambda_3; \lambda_4; \lambda_5 &\ge 0.
\end{aligned}
\]

Now substitute the point $(2; 1)$ into the Kuhn-Tucker necessary conditions. Then we obtain the following:

Equation (7): $\lambda_5(1) = 0$ ⇒ $\lambda_5 = 0$.
Equation (6): $\lambda_4(-2 + 2) = 0$ holds for any $\lambda_4$, so $\lambda_4$ may be positive. (This is possible because the fourth constraint is binding.)
Equation (5): $\lambda_3(4 - 2 + 1) = 0$ ⇒ $\lambda_3 = 0$.
Equation (4): $\lambda_2(10 - 2 - 2) = 0$ ⇒ $\lambda_2 = 0$.
Equation (3): $\lambda_1(2 + 2 - 2) = 0$ ⇒ $\lambda_1 = 0$.
Equation (2): $2 - 2 = 0$ ⇒ True.
Equation (1): $2(2) - 2 - \lambda_4 = 0$ ⇒ $\lambda_4 = 2$.

Therefore, there are nonnegative multipliers $\lambda_1 = \lambda_2 = \lambda_3 = \lambda_5 = 0$ and $\lambda_4 = 2$ satisfying the Kuhn-Tucker necessary conditions at the point $(x_1; x_2) = (2; 1)$.

Now we must verify whether the Kuhn-Tucker sufficient conditions are satisfied. The second-order partial derivatives of $f(x_1; x_2)$ are

\[ \frac{\partial^2 f}{\partial x_1^2} = 2; \qquad \frac{\partial^2 f}{\partial x_2\,\partial x_1} = 0; \qquad \frac{\partial^2 f}{\partial x_2^2} = 2; \qquad \frac{\partial^2 f}{\partial x_1\,\partial x_2} = 0. \]

The Hessian matrix is

\[ H(x_1; x_2) = \begin{bmatrix} 2 & 0 \\ 0 & 2 \end{bmatrix}. \]

The first principal minors are 2 and 2; both positive. The second principal minor is 4, also positive. Therefore, the function $f(x_1; x_2)$ is a convex function.

All five of the constraints are linear. Therefore, the constraints $g_i(x_1; x_2)$, $i = 1; 2; 3; 4; 5$, are convex functions. This means that the Kuhn-Tucker sufficient conditions have been met. We can conclude that $(x_1; x_2) = (2; 1)$ is the optimal solution to the given NLP model.
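The substitution into the Kuhn-Tucker necessary conditions can be repeated in code as a check. The sketch below is written for this exercise only: the function name `kt_check_min` is hypothetical, the constraints are assumed to be linear and in ≤ form, and the stationarity condition is the one for a minimisation problem (gradient of $f$ plus the multiplier-weighted constraint gradients).

```python
def grad_f(x1, x2):
    return (2 * x1 - 2, 2 * x2 - 2)

# Gradients of g1..g5 (constant, since the constraints are linear) and right-hand sides
grad_g = [(-1, 2), (1, 2), (1, -1), (-1, 0), (0, -1)]
b = [2, 10, 4, -2, 0]

def kt_check_min(x, lam):
    """Return (stationarity residuals, complementary slackness products, ok)
    for the minimisation problem of this exercise."""
    g_vals = [a1 * x[0] + a2 * x[1] for a1, a2 in grad_g]
    df = grad_f(*x)
    stationarity = tuple(df[j] + sum(l * a[j] for l, a in zip(lam, grad_g))
                         for j in range(2))
    slackness = tuple(l * (bi - gi) for l, bi, gi in zip(lam, b, g_vals))
    ok = (all(r == 0 for r in stationarity)
          and all(s == 0 for s in slackness)
          and all(l >= 0 for l in lam)
          and all(gi <= bi for gi, bi in zip(g_vals, b)))
    return stationarity, slackness, ok

print(kt_check_min((2, 1), (0, 0, 0, 2, 0)))
# prints: ((0, 0), (0, 0, 0, 0, 0), True)
```

With all multipliers set to zero the first stationarity residual is 2 rather than 0, which shows why $\lambda_4 = 2$ is needed at this point.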
